<a href="https://colab.research.google.com/github/kangmg/compchem_with_colab/blob/main/growing_string_method.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
#@title Install conda

!rm -rf /content/sample_data
!rm -rf /content/condacolab_install.log

try:
  import condacolab
  condacolab.check()
except Exception:
  %pip install condacolab
  import condacolab
  condacolab.install()

✨🍰✨ Everything looks OK!


In [None]:
#@title import modules

import subprocess
import shutil
import os

In [None]:
#@title UDFs

def check_xtb():
  """check if xtb is installed
  """
  try:
    return subprocess.run(['xtb', '--version']).returncode == 0
  except FileNotFoundError:
    return False

In [None]:
#@title install xtb
from IPython.display import clear_output

print("Checking xTB installation . . . \n")
if check_xtb():
  print('\033[34mxTB is normally installed\033[0m\n')
  !xtb --version
else:
  print('\033[31m[WARNING] xTB is not installed \033[0m\n')
  print("Installing xTB . . . \n")
  !conda install -y -c conda-forge xtb
  clear_output()
  print("Installation Done!")

Checking xTB installation . . . 

xTB is normally installed

      -----------------------------------------------------------      
     |                           x T B                           |     
     |                         S. Grimme                         |     
     |          Mulliken Center for Theoretical Chemistry        |     
     |                    University of Bonn                     |     
      -----------------------------------------------------------      

   * xtb version 6.6.1 (8d0f1dd) compiled by 'conda@1efc2f54142f' on 2023-08-01

normal termination of xtb


In [None]:
#@title install gsm-xtb
#@markdown tmp directory : ./tmp
!wget -q https://github.com/grimme-lab/molecularGSM/releases/download/rev1/gsm.tar.xz -O ./gsm.tar.xz
!tar -vxf ./gsm.tar.xz
!rm -rf gsm.tar.xz
!chmod +x ./GSM/gsm.orca
!chmod +x ./GSM/tm2orca.py

os.environ['gsm.orca'] = os.path.join(os.getcwd(), 'GSM/gsm.orca')
os.environ['tm2orca'] = os.path.join(os.getcwd(), 'GSM/tm2orca.py')

os.makedirs('tmp', exist_ok=True)
clear_output()
print("Installation Done!")


Installation Done!


In [None]:
#@title <font size=5 color=skyblue>inpfileq & ograd writer</font>

#@markdown <font color=skyblue size=5>inpfileq params : </font>
#@markdown <font size=3 color=pink>
#@markdown - SM_TYPE: Toggles between GSM and SSM. Use the indicated 3 letter combinations to utilize each of the chosen features. <br><br>
#@markdown - MAX_OPT_ITERS: Maximum iterations for each step of the GSM. <br><br>
#@markdown - STEP_OPT_ITERS: Maximum optimizing iterations for each step of the SSM. If the starting structure is not properly optimized, jobs can fail when the limit set by this variable is exceeded. <br><br>
#@markdown - CONV_TOL: Controls the optimization threshold for nodes. Smaller threshold increases the run time, but can improve TS finding in difficult systems. <br><br>
#@markdown - ADD_NODE_TOL: Tolerance variable for the addition of next node in GSM. Higher numbers will afford fast growth, but decrease accuracy of reaction path identification. <br><br>
#@markdown - SCALING: For opt steps. This feature controls how the step size is adjusted based on the topography of the previous optimization step. Since the step size is the product of dqmax and the scaling variable, increasing this variable will increase the step size. For more rapid adjustment based on the topography of the reaction path, increase this variable. For less automatic adjustment of the step size, decrease this variable. For the most part, the default setting is fine. <br><br>
#@markdown - SSM_DQMAX: Controls the spacing between the nodes in SSM. In cases where the RP struggles to converge or the optimization of RP fails, decreasing this value might help. <br><br>
#@markdown - GROWTH_DIRECTION: GSM specific toggle which enables user control of growth direction. Typically the default (0) is preferred. However, this is a good toggle for debugging more difficult cases. For new users think of the location of the TS; if the TS favors the product then growth from the product with low tolerance convergence could provide better TS identification. <br><br>
#@markdown - INT_THRESH: Detection threshold for an intermediate during string growth. GSM will not consider structures that have an energy below this threshold as a TS along the RP. <br><br>
#@markdown - INITIAL_OPT: Starting structure optimization performed on the first structure of the input file. <br><br>
#@markdown - FINAL_OPT: Max number of optimization steps performed on the last node of an SSM run. <br><br>
#@markdown - TS_FINAL_TYPE: Determines whether rotations are treated as grounds for terminating the GSM or SSM run. Typically we are searching for a change in bond connectivity, so this value is set to 1 for delta bond. <br><br>
#@markdown - NNODES: Max number of nodes used for GSM. Set this number high (30) for SE-GSM: the convergence criterion typically results in fewer nodes being needed, and too small a value will result in job failure. For DE-GSM a typical choice is an odd number ranging from 9-15, with higher values being used for identifying multiple TSs along the path. <br><br>
#@markdown </font>

def write_ograd(charge:int=0)->None:
  pwd = os.getcwd()
  OGRAD_head = '''#!/bin/bash

if [ -z $2 ]
then
  echo " need two arguments! "
  exit
fi

ofile=orcain$1.in
ofileout=orcain$1.out
molfile=structure$1
ncpu=$2
basename="${ofile%.*}"'''
  OGRAD_xtb_setting = f'''########## XTB settings: #################
cd scratch
wc -l < $molfile > $ofile.xyz
echo "Dummy for XTB/TM calculation" >> $ofile.xyz
cat $molfile >> $ofile.xyz

xtb $ofile.xyz --grad --chrg {charge} > $ofile.xtbout

python3 {pwd}/tm2orca.py $basename
rm xtbrestart
cd ..'''
  OGRAD = OGRAD_head + "\n\n" + OGRAD_xtb_setting

  with open('ograd', 'w') as f:
    f.write(OGRAD)

def write_inpfileq(tmp_dir:str='tmp', job_name:str='SE_GSM',
                   sm_type:str='SSM', nnodes:int=30,
                   bond_breaking:bool=False, initial_opt:int=30,
                   step_opt_iters:int=30, final_opt:int=150,
                   max_opt_iters:int=160, ssm_dqmax:float=0.8,
                   bond_fragments:int=1, min_spacing:float=5.0)->None:
  tmp_path = os.path.join(os.getcwd(), tmp_dir)
  INPFILEQ = f"""# FSM/GSM/SSM inpfileq

QCSCRATCH={tmp_path}
JOBNAME={job_name}

------------- QCHEM Scratch Info ------------------------
$QCSCRATCH/    # path for scratch dir. end with '/'
$JOBNAME       # name of run
---------------------------------------------------------

------------ String Info --------------------------------
SM_TYPE                 {sm_type}    # SSM, FSM or GSM
RESTART                 0      # read restart.xyz
MAX_OPT_ITERS           {max_opt_iters}    # maximum iterations
STEP_OPT_ITERS          {step_opt_iters}     # for FSM/SSM
CONV_TOL                0.0005 # perp grad
ADD_NODE_TOL	        0.1    # for GSM
SCALING	                1.0    # for opt steps
SSM_DQMAX               {ssm_dqmax}    # add step size
GROWTH_DIRECTION        0      # normal/react/prod: 0/1/2
INT_THRESH              2.0    # intermediate detection
MIN_SPACING             {min_spacing}    # node spacing SSM
BOND_FRAGMENTS          {bond_fragments}      # make IC's for fragments
INITIAL_OPT             {initial_opt}      # opt steps first node
FINAL_OPT               {final_opt}    # opt steps last SSM node
PRODUCT_LIMIT           100.0  # kcal/mol
TS_FINAL_TYPE           {int(bond_breaking)}      # 0=no bond breaking, 1=breaking of bond
NNODES		        {nnodes}     # including endpoints
---------------------------------------------------------"""
  with open('inpfileq', 'w') as f:
    f.write(INPFILEQ)
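
To confirm that the keyword arguments reached the file as intended, the generated inpfileq can be read back into a dictionary. The sketch below is a minimal check, not part of GSM itself. `parse_inpfileq` is a hypothetical helper name, and it assumes every setting sits on its own line as `KEY VALUE  # comment`, as in the template above.

```python
def parse_inpfileq(text: str) -> dict:
    """Collect KEY VALUE pairs from inpfileq-style text (hypothetical helper)."""
    params = {}
    for line in text.splitlines():
        # Drop the trailing '# ...' comment, then split on any whitespace (spaces or tabs).
        parts = line.split('#', 1)[0].split()
        # Settings are exactly two tokens with an upper-case key; rulers and
        # 'QCSCRATCH=...' style lines do not match and are skipped.
        if len(parts) == 2 and parts[0].isupper():
            params[parts[0]] = parts[1]
    return params


sample = (
    "------------ String Info ------------\n"
    "SM_TYPE                 SSM    # SSM, FSM or GSM\n"
    "NNODES\t\t        30     # including endpoints\n"
)
print(parse_inpfileq(sample))
```

After calling `write_inpfileq(...)`, the same function can be applied to `open('inpfileq').read()` to inspect what the run will actually use. All values come back as strings.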


In [None]:
def run_ssm(xyz_path:str, charge:int=0, **inpfileq_kwargs):
  """run single ended string method
  """
  # default values
  tmp_dir = inpfileq_kwargs.get('tmp_dir', 'tmp')
  sm_type = inpfileq_kwargs.get('sm_type', 'SSM')
  nnodes = inpfileq_kwargs.get('nnodes', 30)

  inpfileq_kwargs = {
      **inpfileq_kwargs,
      'sm_type': sm_type,
      'nnodes': nnodes,
      'tmp_dir': tmp_dir
  }

  # scratch
  os.makedirs('scratch', exist_ok=True)

  # tmp directory
  os.makedirs(tmp_dir, exist_ok=True)

  # copy xyz file
  if not os.path.isfile(xyz_path): raise FileNotFoundError(f'{xyz_path} not found')
  xyz_copy_path = os.path.join(os.getcwd(), 'scratch', 'initial0000.xyz')
  shutil.copy(xyz_path, xyz_copy_path)

  # write gsm inp files
  write_ograd(charge=charge)
  write_inpfileq(**inpfileq_kwargs)

  # permission
  os.chmod('./ograd', 0o755)

  # copy gsm.orca & tm2orca
  shutil.copy2(os.environ['gsm.orca'], os.path.join(os.getcwd(), 'gsm.orca'))
  shutil.copy2(os.environ['tm2orca'], os.path.join(os.getcwd(), 'tm2orca.py'))


  # run gsm
  #subprocess.run(args=["./gsm.orca"])

  # remove tmp files
  #os.remove('./inpfileq')
  #os.remove('./ograd')
  #os.remove(xyz_copy_path)
  #os.remove('tm2orca.py')
  #os.removedirs('scratch')





In [None]:
def run_dsm(reac_xyz_path:str, prod_xyz_path:str, charge:int=0, **inpfileq_kwargs):
  """run double ended string method
  """
  # default values
  tmp_dir = inpfileq_kwargs.get('tmp_dir', 'tmp')
  sm_type = inpfileq_kwargs.get('sm_type', 'GSM')
  nnodes = inpfileq_kwargs.get('nnodes', 15)

  inpfileq_kwargs = {
      **inpfileq_kwargs,
      'sm_type': sm_type,
      'nnodes': nnodes,
      'tmp_dir': tmp_dir
  }

  # scratch
  os.makedirs('scratch', exist_ok=True)

  # tmp directory
  os.makedirs(tmp_dir, exist_ok=True)

  # copy xyz file
  if not os.path.isfile(reac_xyz_path): raise FileNotFoundError(f'{reac_xyz_path} not found')
  else:
    with open(reac_xyz_path, 'r') as reac: reac_xyz = reac.read().strip()
  if not os.path.isfile(prod_xyz_path): raise FileNotFoundError(f'{prod_xyz_path} not found')
  else:
    with open(prod_xyz_path, 'r') as prod: prod_xyz = prod.read().strip()

  # write merged structures
  merged_xyz = reac_xyz + '\n' + prod_xyz
  xyz_copy_path = os.path.join(os.getcwd(), 'scratch', 'initial0000.xyz')
  with open(xyz_copy_path, 'w') as f: f.write(merged_xyz)

  # write gsm inp files
  write_ograd(charge=charge)
  write_inpfileq(**inpfileq_kwargs)

  # permission
  os.chmod('./ograd', 0o755)

  # copy gsm.orca & tm2orca
  shutil.copy2(os.environ['gsm.orca'], os.path.join(os.getcwd(), 'gsm.orca'))
  shutil.copy2(os.environ['tm2orca'], os.path.join(os.getcwd(), 'tm2orca.py'))

  # run gsm
  #subprocess.run(args=["./gsm.orca"])

  # remove tmp files
  #os.remove('./inpfileq')
  #os.remove('./ograd')
  #os.remove(xyz_copy_path)
  #os.remove('tm2orca.py')
  #os.removedirs('scratch')
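
A double-ended run connects the two structures atom by atom, so the reactant and product files should list the same atoms in the same order. The sketch below is a minimal pre-flight check under that assumption. `read_xyz_symbols` and `same_atom_order` are hypothetical helper names and are not part of GSM or of `run_dsm`.

```python
def read_xyz_symbols(xyz_text: str) -> list:
    """Return the element symbols of a single-structure XYZ string, in file order."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0].split()[0])      # line 1: atom count
    atom_lines = lines[2:2 + natoms]       # line 2 is the comment line (may be empty)
    if len(atom_lines) != natoms:
        raise ValueError(f"expected {natoms} atom lines, found {len(atom_lines)}")
    return [ln.split()[0] for ln in atom_lines]


def same_atom_order(reac_text: str, prod_text: str) -> bool:
    """True when both structures list identical elements in identical order."""
    return read_xyz_symbols(reac_text) == read_xyz_symbols(prod_text)


reac = "3\n\nO 0.0 0.0 0.0\nH 0.0 0.0 1.0\nH 0.0 1.0 0.0\n"
prod = "3\nproduct\nO 0.0 0.0 0.0\nH 0.0 0.0 1.1\nH 0.0 1.1 0.0\n"
swapped = "3\n\nH 0.0 0.0 1.0\nO 0.0 0.0 0.0\nH 0.0 1.0 0.0\n"
print(same_atom_order(reac, prod), same_atom_order(reac, swapped))
```

Matching symbols do not prove that the atoms are mapped correctly (two hydrogens could still be swapped), so this only catches the obvious mismatches.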

In [None]:
def clear_tmp():
  !rm -rf scratch/*
  !rm -rf ograd
  !rm -rf inpfileq
  !rm -rf gsm.orca
  !rm -rf tm2orca.py
  !rm -rf stringfile.xyz0000
  !rm -rf stringfile.xyz0000fr

clear_tmp()

In [None]:
%%writefile tmp.xyz
 6
 0.000000
  C -1.277168 0.545365 -0.000063
  Br 0.648058 0.543727 0.000199
  H -1.652166 0.593222 1.017641
  H -1.652215 -0.359651 -0.467952
  H -1.651698 1.403205 -0.550042
  Cl -4.402752 0.572053 0.000227

Writing tmp.xyz


In [None]:
%%writefile scratch/initial0000.xyz
 6
 0.000000
  C -1.277168 0.545365 -0.000063
  Br 0.648058 0.543727 0.000199
  H -1.652166 0.593222 1.017641
  H -1.652215 -0.359651 -0.467952
  H -1.651698 1.403205 -0.550042
  Cl -4.402752 0.572053 0.000227

Writing scratch/initial0000.xyz


In [None]:
%%writefile scratch/initial0001.xyz
 6
 0.000000
  C -1.294919 0.542959 0.001299
  Br 0.694616 0.543808 -0.000438
  H -1.636980 0.598072 1.030309
  H -1.637355 -0.369230 -0.477817
  H -1.633737 1.408958 -0.553873
  Cl -4.352783 0.570770 0.000338

Writing scratch/initial0001.xyz


In [None]:
run_ssm("tmp.xyz", charge=-1, min_spacing=10)

In [None]:
%%writefile ISOMERS0000
NEW
ADD 3 4

Overwriting ISOMERS0000
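
The ISOMERS file read by the SSM run lists the driving coordinates: a `NEW` line followed by one `ADD i j` line per bond to form, as in the cell above. The sketch below builds that text from Python and checks the indices first. `format_isomers` is a hypothetical helper, the indices are assumed to be 1-based positions in the XYZ file, and only the `ADD` keyword seen in this notebook is emitted.

```python
def format_isomers(add_bonds, natoms: int) -> str:
    """Build ISOMERS-style text with one ADD line per atom pair (hypothetical helper)."""
    lines = ["NEW"]
    for i, j in add_bonds:
        # Assumption: atoms are numbered from 1 in the order they appear in the XYZ file.
        if i == j or not (1 <= i <= natoms and 1 <= j <= natoms):
            raise ValueError(f"invalid atom pair: {i} {j}")
        lines.append(f"ADD {i} {j}")
    return "\n".join(lines) + "\n"


isomers_text = format_isomers([(3, 4)], natoms=6)
print(isomers_text, end="")
```

Writing `isomers_text` to `ISOMERS0000` would reproduce the file created by the `%%writefile` cell.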


In [None]:
!./gsm.orca

 Number of QC processors: 1 
***** Starting Initialization *****
 runend 1
  -structure filename from input: scratch/initial0000.xyz 
Initializing Tolerances and Parameters... 
  -Opening inpfileq 
  -reading file... 
  -using SSM 
  -RESTART: 0
  -MAX_OPT_ITERS: 160
  -STEP_OPT_ITERS: 30
  -CONV_TOL = 0.0005
  -ADD_NODE_TOL = 0.1
  -SCALING = 1
  -SSM_DQMAX: 0.8
  -SSM_DQMIN: 0.2
  -GROWTH_DIRECTION = 0
  -INT_THRESH: 2
  -SSM_MIN_SPACING: 10
  -BOND_FRAGMENTS = 1
  -INITIAL_OPT: 30
  -FINAL_OPT: 150
  -PRODUCT_LIMIT: 100
  -TS_FINAL_TYPE: 0
  -NNODES = 30
 Done reading inpfileq 

 using ISOMERS file: ISOMERS0000 
 reading isomers 
 adding bond: 3 4 
 found 1 isomer

Reading and initializing string coordinates 
  -Opening structure file 
  -reading file... 
  -The number of atoms is: 6
  -Reading the atomic names...  -Reading coordinates...Opening xyz file 
Finished reading information from structure file

****************************************
**************************************

In [None]:
run_dsm("wittig_reac.xyz", "wittig_prod.xyz", charge=0, bond_breaking=True)

In [None]:
!./gsm.orca

 Number of QC processors: 1 
***** Starting Initialization *****
 runend 1
  -structure filename from input: scratch/initial0000.xyz 
Initializing Tolerances and Parameters... 
  -Opening inpfileq 
  -reading file... 
  -using GSM 
  -RESTART: 0
  -MAX_OPT_ITERS: 160
  -STEP_OPT_ITERS: 30
  -CONV_TOL = 0.0005
  -ADD_NODE_TOL = 0.1
  -SCALING = 1
  -SSM_DQMAX: 0.8
  -SSM_DQMIN: 0.2
  -GROWTH_DIRECTION = 0
  -INT_THRESH: 2
  -SSM_MIN_SPACING: 5
  -BOND_FRAGMENTS = 1
  -INITIAL_OPT: 30
  -FINAL_OPT: 150
  -PRODUCT_LIMIT: 100
  -TS_FINAL_TYPE: 1
  -NNODES = 15
 Done reading inpfileq 

 reading isomers 
 couldn't find ISOMERS file: scratch/ISOMERS0000 
Reading and initializing string coordinates 
  -Opening structure file 
  -reading file... 
  -The number of atoms is: 47
  -Reading the atomic names...  -Reading coordinates...Opening xyz file 
Finished reading information from structure file

****************************************
****************************************
****** Starting I

In [None]:
!cat /content/scratch/tsq0000.xyz

 47
-65.351081 -65.369232 -65.397096
C -1.580725 -1.898172 0.870406
C -1.469690 -1.554977 2.248111
O -0.256726 -0.113095 2.079463
P -0.105047 -0.203298 0.569176
H -0.785238 -2.144821 2.849741
C -1.346225 0.604674 -0.493771
C -1.421612 0.473436 -1.872080
C -2.194019 1.486951 0.163941
C -2.349118 1.214744 -2.581955
H -0.752609 -0.192604 -2.395359
C -3.130788 2.213606 -0.546097
H -2.096051 1.602984 1.234268
C -3.210288 2.076674 -1.922566
H -2.398340 1.121642 -3.657708
H -3.792873 2.893543 -0.029851
H -3.934007 2.650178 -2.482598
C 1.104366 1.204046 0.348682
C 1.623191 1.447593 -0.918541
C 1.481083 2.029964 1.397656
C 2.508986 2.487007 -1.129603
H 1.336797 0.808320 -1.742995
C 2.367013 3.073559 1.187351
H 1.080809 1.847814 2.383319
C 2.882701 3.303758 -0.074632
H 2.908664 2.662745 -2.118443
H 2.655394 3.707374 2.013534
H 3.574773 4.117017 -0.238791
C 1.025221 -1.457587 -0.135158
C 2.265701 -1.536040 0.480486
C 0.723929 -2.296334 -1.196112
C 3.210478 -2.439695 0.029899
H 2.475611 -0.881352 

In [None]:
#@title write wittig_reac.xyz
%%writefile wittig_reac.xyz
47

C       -1.4508270       -1.7370320        1.0369570
C       -1.2372390       -1.2421820        2.4697700
O       -0.3027010       -0.1547390        2.1731120
P       -0.1751230       -0.2888580        0.4773510
H       -0.6927520       -1.9737940        3.0798710
C       -1.3216410        0.5900230       -0.7103560
C       -1.4162490        0.3165370       -2.0824370
C       -2.1648680        1.5701020       -0.1596710
C       -2.3333230        1.0073740       -2.8836110
H       -0.7716640       -0.4229150       -2.5454140
C       -3.0866580        2.2523190       -0.9577610
H       -2.1024980        1.8116030        0.8985790
C       -3.1754490        1.9722170       -2.3250430
H       -2.3846140        0.7878770       -3.9469000
H       -3.7311840        3.0043420       -0.5102950
H       -3.8893410        2.5036630       -2.9481970
C        1.0600580        1.2029250        0.5121770
C        1.6266130        1.6113580       -0.7100600
C        1.4148930        1.9305880        1.6607610
C        2.5120890        2.6885460       -0.7875790
H        1.3818690        1.0792590       -1.6258090
C        2.2988390        3.0169230        1.5906600
H        1.0021430        1.6478010        2.6210960
C        2.8532160        3.4003100        0.3681730
H        2.9337920        2.9718110       -1.7487890
H        2.5527370        3.5592900        2.4983350
H        3.5402680        4.2407060        0.3136830
C        1.0153250       -1.4656380       -0.3540580
C        2.3492280       -1.5209240        0.0791580
C        0.5758870       -2.4173590       -1.2909460
C        3.2242140       -2.4936620       -0.4159660
H        2.7137720       -0.8076090        0.8115430
C        1.4591580       -3.3631060       -1.8170260
H       -0.4621540       -2.4376430       -1.6077830
C        2.7864180       -3.4085850       -1.3762120
H        4.2483840       -2.5276580       -0.0542730
H        1.1046140       -4.0750430       -2.5575410
H        3.4687160       -4.1557560       -1.7720970
C       -2.4257960       -0.7194590        3.2604930
H       -2.0866850       -0.2808800        4.2054840
H       -2.9804000        0.0450970        2.7076300
H       -3.1132800       -1.5395290        3.4988330
C       -2.8715660       -1.8294980        0.4917160
H       -3.4493530       -2.5931700        1.0311440
H       -3.4212480       -0.8867130        0.5635620
H       -2.8620520       -2.1218650       -0.5645350
H       -0.9586330       -2.7036620        0.8938460

Writing wittig_reac.xyz


In [None]:
#@title write wittig_prod.xyz
%%writefile wittig_prod.xyz
47
0 1 0.000000
C       -2.4487840       -2.7510630        1.2791190
C       -2.0806820       -2.3828390        2.5155740
O       -0.2137890        0.0165600        1.8684090
P        0.2091950        0.1686110        0.4250860
H       -1.2541820       -2.9297310        2.9709780
C       -1.1735660        0.6763440       -0.6708360
C       -1.2000380        0.5016350       -2.0639250
C       -2.2026100        1.4096000       -0.0551780
C       -2.2435400        1.0409280       -2.8226710
H       -0.4091480       -0.0425780       -2.5689430
C       -3.2322600        1.9650740       -0.8179280
H       -2.1892040        1.5483800        1.0214790
C       -3.2567250        1.7788290       -2.2035650
H       -2.2571050        0.8906370       -3.8985870
H       -4.0151710        2.5385110       -0.3294280
H       -4.0593170        2.2071340       -2.7975340
C        1.4119670        1.5432970        0.2896850
C        1.9939480        1.9353040       -0.9243920
C        1.7243570        2.2362140        1.4667750
C        2.8816100        3.0115930       -0.9577380
H        1.7575220        1.4049480       -1.8430470
C        2.6141450        3.3133680        1.4318960
H        1.2674140        1.9252450        2.4010160
C        3.1917640        3.7018330        0.2203750
H        3.3306680        3.3130850       -1.8999960
H        2.8537290        3.8467850        2.3476880
H        3.8823950        4.5401570        0.1911460
C        1.0835120       -1.2897850       -0.2704650
C        2.4010430       -1.5000400        0.1771680
C        0.5155220       -2.2144660       -1.1608180
C        3.1422810       -2.5921230       -0.2752420
H        2.8567430       -0.8047990        0.8762890
C        1.2664570       -3.2991830       -1.6273530
H       -0.5084370       -2.0988200       -1.4985710
C        2.5810620       -3.4864990       -1.1923880
H        4.1585960       -2.7369790        0.0799610
H        0.8194210       -3.9979320       -2.3289390
H        3.1622880       -4.3278250       -1.5594100
C       -2.6544330       -1.2878120        3.3675540
H       -1.8643050       -0.5799040        3.6396080
H       -3.4493330       -0.7286710        2.8661000
H       -3.0684680       -1.6975280        4.2997420
C       -3.5332680       -2.1532100        0.4255040
H       -4.3173420       -2.8939510        0.2138960
H       -4.0111300       -1.2865960        0.8891710
H       -3.1373400       -1.8357660       -0.5476620
H       -1.9016070       -3.5754940        0.8200020

Writing wittig_prod.xyz


In [None]:
#@title write test xyz file
#@markdown cyclohexane.xyz

%%writefile cyclohexane.xyz
18

  C     -1.3763      0.1297     -0.4173
  C     -0.6538      1.2401      0.3544
  C      0.8504      1.2665      0.0365
  H     -1.1062      2.2207      0.1202
  H     -0.7993      1.0861      1.4411
  C      1.4054     -0.1227     -0.3217
  H      1.0456      1.9600     -0.8024
  H      1.3963      1.6791      0.9050
  C      0.6302     -1.2503      0.3714
  H      1.3611     -0.2678     -1.4182
  H      2.4758     -0.1792     -0.0510
  C     -0.8567     -1.2627     -0.0238
  H      1.0900     -2.2260      0.1305
  H      0.7183     -1.1325      1.4687
  H     -1.0218     -1.9616     -0.8644
  H     -1.4518     -1.6595      0.8199
  H     -2.4652      0.1900     -0.2391
  H     -1.2362      0.2856     -1.5043

Writing cyclohexane.xyz
