<a href="https://colab.research.google.com/github/valsson-group/UNT-Chem5660-Fall2023/blob/main/Python-JupyterNotebooks/Assignment4_Create_Reactant_from_InChI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Assignment 4: Example of How to Create an Initial Reactant from InChI Strings

This is an example of how to create an initial reactant geometry for Assignment 4 using InChI strings.



# Imports and Function Definitions

In [None]:
%%capture
!pip install rdkit
!pip install py3Dmol
!pip install ipywidgets

In [None]:
# RDKit imports:
from rdkit import Chem
from rdkit.Chem import (
    AllChem,
    rdCoordGen,
)
from rdkit.Chem.Draw import IPythonConsole

IPythonConsole.ipython_useSVG = True

import py3Dmol

In [None]:
def get_xyz(molecule, optimize=False):
    """Get xyz-coordinates for the molecule"""
    mol = Chem.Mol(molecule)
    mol = AllChem.AddHs(mol, addCoords=True)
    AllChem.EmbedMolecule(mol)
    if optimize:  # Optimize the molecules with the MM force field:
        AllChem.MMFFOptimizeMolecule(mol)
    xyz = []
    for lines in Chem.MolToXYZBlock(mol).split("\n")[2:]:
        strip = lines.strip()
        if strip:
            xyz.append(strip)
    xyz = "\n".join(xyz)
    return mol, xyz

def writeMoleculeToXYZfile(mol, fn_out, comment=""):
    """Write the coordinates of the molecule's conformer to an XYZ file"""
    conf = mol.GetConformer()
    numAtoms = conf.GetNumAtoms()
    # The with-block closes the file, so the data is on disk before later cells read it
    with open(fn_out, "w") as f:
        f.write("{0:3d}\n".format(numAtoms))
        f.write("  {0}\n".format(comment))
        for i in range(numAtoms):
            atom = mol.GetAtomWithIdx(i).GetSymbol()
            pos = conf.GetAtomPosition(i)
            f.write("  {0:<3s}{1:>20.10f}{2:>20.10f}{3:>20.10f}\n".format(atom, pos.x, pos.y, pos.z))
#--------------------------------------------

# Create Molecules from InChI Strings

### Define InChI Strings for the Molecules in the Reactant

You need to find the correct dienophile molecules for Assignment 4 yourself. One way to do that is to search for them on PubChem using the [Draw Structure feature](https://pubchem.ncbi.nlm.nih.gov/#draw=true).

You can also use SMILES strings if you prefer. In that case, use the `Chem.MolFromSmiles` command instead of `Chem.MolFromInchi`.
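
As a minimal sketch of the SMILES route: the two SMILES strings below are assumptions written for this example (they are not taken from the assignment), so check them against PubChem before relying on them. The variable names ending in `_smiles` are likewise only illustrative.

```python
from rdkit import Chem

# Assumed SMILES strings; verify them on PubChem before use
molecule_1_smiles = "C1C=CC=C1"  # cyclopentadiene
molecule_2_smiles = "C=C"        # ethylene

molecule_1_from_smiles = Chem.MolFromSmiles(molecule_1_smiles)
molecule_2_from_smiles = Chem.MolFromSmiles(molecule_2_smiles)

# MolFromSmiles returns None when the string cannot be parsed
for name, mol in [
    ("molecule_1", molecule_1_from_smiles),
    ("molecule_2", molecule_2_from_smiles),
]:
    if mol is None:
        raise ValueError(f"Could not parse the SMILES string for {name}")
    print(name, Chem.MolToSmiles(mol), mol.GetNumAtoms(), "heavy atoms")
```

The resulting molecule objects can be passed to `get_xyz` in the same way as the ones created from InChI strings.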



In [None]:
# Cyclopentadiene from
# https://pubchem.ncbi.nlm.nih.gov/compound/Cyclopentadiene
molecule_1_inchi = "InChI=1S/C5H6/c1-2-4-5-3-1/h1-4H,5H2"

# Dienophile molecule; here we only consider ethylene, from
# https://pubchem.ncbi.nlm.nih.gov/compound/6325
molecule_2_inchi = "InChI=1S/C2H4/c1-2/h1-2H2"



## Setup Molecules

In [None]:
molecule_1 = Chem.MolFromInchi(molecule_1_inchi)
molecule_1

In [None]:
molecule3d_1, xyz_1 = get_xyz(molecule_1, optimize=True)

print(xyz_1)
# We also write the XYZ coordinates to file
writeMoleculeToXYZfile(molecule3d_1,"reactant_molecule_1.xyz",comment="")

In [None]:
view = py3Dmol.view(
    data=Chem.MolToMolBlock(molecule3d_1),
    style={"stick": {}, "sphere": {"scale": 0.3}},
    width=600,
    height=600,
)
view.zoomTo()



In [None]:
molecule_2 = Chem.MolFromInchi(molecule_2_inchi)
molecule_2

In [None]:
molecule3d_2, xyz_2 = get_xyz(molecule_2, optimize=True)

print(xyz_2)
# We also write the XYZ coordinates to file
writeMoleculeToXYZfile(molecule3d_2,"reactant_molecule_2.xyz",comment="")

In [None]:
view = py3Dmol.view(
    data=Chem.MolToMolBlock(molecule3d_2),
    style={"stick": {}, "sphere": {"scale": 0.3}},
    width=600,
    height=600,
)
view.zoomTo()


## Orienting the Molecules

Now we have each molecule in an XYZ file that we can work with. We will use the `orient.py` script from https://github.com/smparker/orient-molecule/tree/master to manipulate the molecules.



In [None]:
!rm -f ./orient.py*
!wget https://raw.githubusercontent.com/smparker/orient-molecule/master/orient.py
!chmod a+x ./orient.py
!ls

In [None]:
!./orient.py -h

Here we first use the `-p` flag to align each molecule so that it lies in the xy plane. We then use the `-tz` flag to translate the molecules along the z-direction, one by -2 Angstrom and the other by +2 Angstrom, so that they are separated by roughly 4 Angstrom. Both operations can be stacked in a single call to the `orient.py` script (but the order matters). We then redirect the output to new files.


In [None]:
%%bash

./orient.py -p 1 2 3 4 5 -tz -2.0 reactant_molecule_1.xyz > reactant_molecule_1_align_to_z.xyz
echo "reactant_molecule_1_align_to_z.xyz:"
cat reactant_molecule_1_align_to_z.xyz
echo ""

./orient.py -p 1 2 3 4 -tz 2.0 reactant_molecule_2.xyz > reactant_molecule_2_align_to_z.xyz
echo "reactant_molecule_2_align_to_z.xyz:"
cat reactant_molecule_2_align_to_z.xyz
echo ""
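
To make the translation step more concrete, here is a rough sketch in plain Python of what shifting a molecule along z does to an XYZ block. The helper `translate_z` is hypothetical and is not part of `orient.py`, and the coordinates in the example are made up for illustration only.

```python
def translate_z(xyz_text, dz):
    """Shift every atom of an XYZ-format string by dz along z (hypothetical helper)."""
    lines = xyz_text.splitlines()
    # The first two lines are the atom count and the comment line
    header, atoms = lines[:2], lines[2:]
    shifted = []
    for line in atoms:
        if not line.strip():
            continue  # skip blank lines
        symbol, x, y, z = line.split()[:4]
        shifted.append(
            f"  {symbol:<3s}{float(x):>20.10f}{float(y):>20.10f}{float(z) + dz:>20.10f}"
        )
    return "\n".join(header + shifted) + "\n"


# Illustrative input only: two carbon atoms with made-up coordinates
example_xyz = """2
two carbon atoms (illustrative coordinates)
C 0.0 0.0 0.0
C 1.33 0.0 0.0
"""

moved_xyz = translate_z(example_xyz, 2.0)
print(moved_xyz)
```

Only the z column changes; the atom count, the comment line, and the x and y coordinates are left as they were.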


Then we can combine the two XYZ files into a single file using bash commands.

In [None]:
%%bash
cat reactant_molecule_1_align_to_z.xyz | sed '1,2d' > reactant_1.tmp.xyz
cat reactant_molecule_2_align_to_z.xyz | sed '1,2d' > reactant_2.tmp.xyz
cat reactant_1.tmp.xyz reactant_2.tmp.xyz > reactant.tmp.xyz
NumAtoms=`cat reactant.tmp.xyz | wc -l`
rm -f reactant.xyz
echo "${NumAtoms}" >> reactant.xyz
echo " " >> reactant.xyz
cat reactant.tmp.xyz >> reactant.xyz
rm -f *.tmp.xyz

echo "reactant.xyz:"
cat reactant.xyz
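
If you prefer to stay in Python, the same merge can be sketched without bash. The helper `combine_xyz` below is a hypothetical function written for this example; it assumes that each input file has the usual two header lines (atom count and comment) followed by one line per atom.

```python
from pathlib import Path


def combine_xyz(paths, out_path, comment=""):
    """Merge the atom lines of several XYZ files into one XYZ file (hypothetical helper)."""
    atom_lines = []
    for path in paths:
        lines = Path(path).read_text().splitlines()
        # Drop the atom-count and comment lines, keep the non-empty atom lines
        atom_lines.extend(line for line in lines[2:] if line.strip())
    with open(out_path, "w") as f:
        f.write(f"{len(atom_lines)}\n")  # new total number of atoms
        f.write(f"{comment}\n")
        f.write("\n".join(atom_lines) + "\n")
    return len(atom_lines)


inputs = [
    "reactant_molecule_1_align_to_z.xyz",
    "reactant_molecule_2_align_to_z.xyz",
]
# Only run the merge when the aligned files from the previous step exist
if all(Path(p).exists() for p in inputs):
    num_atoms = combine_xyz(inputs, "reactant.xyz")
    print(f"Wrote reactant.xyz with {num_atoms} atoms")
else:
    print("Aligned XYZ files not found; run the orient.py step first")
```

Counting the atom lines in Python plays the same role as the `wc -l` call in the bash version.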

To create the product, it is better to use Avogadro or a similar molecular editor, as we need to create new bonds.