# Building the system

This section focuses on the construction of an initial molecular structure, which can be done in a number of ways:

- Downloading a structure from the [Protein Data Bank](https://www.rcsb.org/)

- Using a graphical molecular editor, such as [Avogadro](https://avogadro.cc/)

- Specifying the structure using the simplified molecular-input line-entry system ([SMILES](https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system)) line notation, and employing a program which can build the molecule from this string

In this section we will focus on the last option, using the [RDKit](https://www.rdkit.org/) toolkit to build the structures and py3Dmol to view them. A full description of how a molecule is written using SMILES can be found in, *e.g.*, the [wiki](https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system) entry; here we will illustrate its use for a few example molecules. Make sure to visualize the resulting geometries and ensure they correspond to what you want to study.

In [2]:
import py3Dmol as p3d
from rdkit import Chem
from rdkit.Chem import AllChem

## Caffeine

In [62]:
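def smilestoxyz(smiles, opt=True):
    """Build a 3D structure from a SMILES string and return it as an XYZ block."""
    # parse the SMILES string (hydrogens are implicit) and add explicit hydrogens
    mol_bare = Chem.MolFromSmiles(smiles)
    mol_full = Chem.AddHs(mol_bare)
    # generate initial 3D coordinates
    AllChem.EmbedMolecule(mol_full)
    # optionally relax the geometry with the UFF force field
    if opt:
        AllChem.UFFOptimizeMolecule(mol_full)
    return Chem.MolToXYZBlock(mol_full)

caffeine_xyz = smilestoxyz('CN1C=NC2=C1C(=O)N(C(=O)N2C)C')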

In [63]:
print(caffeine_xyz)

24

C      3.167624   -1.053198    0.019519
N      2.156340   -0.002008   -0.023714
C      2.370872    1.335406   -0.081622
N      1.222124    2.060433   -0.105889
C      0.266645    1.109907   -0.058955
C      0.818551   -0.123678   -0.008062
C      0.013649   -1.253585    0.053021
O      0.527792   -2.405433    0.077818
N     -1.351527   -1.079127    0.085240
C     -1.893878    0.193835    0.009189
O     -3.148892    0.342268    0.013177
N     -1.080179    1.303809   -0.072191
C     -1.646655    2.658812   -0.106520
C     -2.240156   -2.252542    0.129834
H      3.047699   -1.724659   -0.856381
H      3.057982   -1.640212    0.955403
H      4.186998   -0.612624   -0.007007
H      3.355008    1.785474   -0.106389
H     -0.922286    3.392198   -0.520493
H     -1.927356    2.970379    0.921522
H     -2.546909    2.688787   -0.756890
H     -2.512421   -2.552212   -0.903996
H     -3.166296   -2.031104    0.702077
H     -1.754726   -3.110927    0.641309
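
One quick check on an XYZ block like the one above, alongside visual inspection, is to count the atoms of each element and compare against the expected molecular formula (C8H10N4O2 for caffeine). The sketch below assumes the usual XYZ layout (atom count, comment line, then one `element x y z` line per atom); `count_elements` is a hypothetical helper name, not part of RDKit or py3Dmol.

```python
from collections import Counter

def count_elements(xyz_block):
    """Count atoms per element in an XYZ-format string."""
    lines = xyz_block.strip().splitlines()
    n_atoms = int(lines[0])
    # skip the atom-count line and the comment line
    atom_lines = [ln for ln in lines[2:] if ln.strip()]
    if len(atom_lines) != n_atoms:
        raise ValueError("atom count does not match header")
    return Counter(ln.split()[0] for ln in atom_lines)

# small hand-written example (water, illustrative coordinates only)
water_xyz = """3

O      0.000000    0.000000    0.000000
H      0.960000    0.000000    0.000000
H     -0.240000    0.930000    0.000000
"""
print(count_elements(water_xyz))
```

For the caffeine structure, `count_elements(caffeine_xyz)` should report 8 C, 10 H, 4 N and 2 O, matching the 24 atoms listed above.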



In [64]:
viewer = p3d.view(width=400, height=300)
viewer.addModel(caffeine_xyz, 'xyz')
viewer.setViewStyle({"style": "outline", "color": "black", "width": 0.1})
viewer.setStyle({"stick": {}})
viewer.show()

## Linear molecules

The basic principle of SMILES is to...

Bonds:
- ethane
- ethene
- ethyne
- propane

Substitutions:
- 1-vinylfluoride
- 2-vinylfluoride
- 1,1-difluoroethene
- 1,2-difluoroethene

### Bond type

In [50]:
ethane_xyz = smilestoxyz('CC')
ethene_xyz = smilestoxyz('C=C')
ethyne_xyz = smilestoxyz('C#C')

viewer = p3d.view(viewergrid=(1, 3), width=800, height=300, linked=False)
viewer.addModel(ethane_xyz, 'xyz', viewer=(0, 0))
viewer.addModel(ethene_xyz, 'xyz', viewer=(0, 1))
viewer.addModel(ethyne_xyz, 'xyz', viewer=(0, 2))
viewer.setViewStyle({"style": "outline", "color": "black", "width": 0.1})
viewer.setStyle({"stick": {}})
viewer.show()
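
The bond order shows up directly in the geometry: the C-C distance shortens on going from a single to a double to a triple bond. A minimal sketch for measuring a distance from an XYZ block follows. `bond_length` is a hypothetical helper, and it assumes (as in the caffeine output earlier) that the heavy atoms come first in the block, so that the two carbons have indices 0 and 1.

```python
import math

def bond_length(xyz_block, i, j):
    """Distance between atoms i and j (0-based) of an XYZ-format string."""
    # drop the atom-count and comment lines, keep the coordinate lines
    atom_lines = [ln.split() for ln in xyz_block.strip().splitlines()[2:] if ln.strip()]
    xi = [float(v) for v in atom_lines[i][1:4]]
    xj = [float(v) for v in atom_lines[j][1:4]]
    return math.dist(xi, xj)

# hand-written two-atom example, for illustration only
example_xyz = """2

C      0.000000    0.000000    0.000000
C      1.500000    0.000000    0.000000
"""
print(f"{bond_length(example_xyz, 0, 1):.3f}")
```

With the cell above executed, `bond_length(ethane_xyz, 0, 1)`, `bond_length(ethene_xyz, 0, 1)` and `bond_length(ethyne_xyz, 0, 1)` can be compared directly.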

### Substitution

In [67]:
# the same molecule (fluoroethane) written from either end of the chain
fluoro1_xyz = smilestoxyz('FCC')
fluoro2_xyz = smilestoxyz('CCF')
# 1,2-difluoroethene (no E/Z stereochemistry specified)
difluoro12_xyz = smilestoxyz('C(=CF)F')
# 1,1-difluoroethene: both fluorines on the same carbon
difluoro11_xyz = smilestoxyz('C=C(F)(F)')

viewer = p3d.view(viewergrid=(2, 2), width=600, height=500, linked=False)
viewer.addModel(fluoro1_xyz, 'xyz', viewer=(0, 0))
viewer.addModel(fluoro2_xyz, 'xyz', viewer=(0, 1))
viewer.addModel(difluoro12_xyz, 'xyz', viewer=(1, 0))
viewer.addModel(difluoro11_xyz, 'xyz', viewer=(1, 1))
viewer.setViewStyle({"style": "outline", "color": "black", "width": 0.1})
viewer.setStyle({"stick": {}})
viewer.show()

### Optimization

In [70]:
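# note: this SMILES string has eight ring carbons, i.e. cyclooctane
cyclooctane_xyz = smilestoxyz('C1CCCCCCC1')
cyclooctane_start_xyz = smilestoxyz('C1CCCCCCC1', opt=False)

viewer = p3d.view(viewergrid=(1, 2), width=600, height=300, linked=False)
# left: UFF-optimized geometry, right: initial embedded geometry
viewer.addModel(cyclooctane_xyz, 'xyz', viewer=(0, 0))
viewer.addModel(cyclooctane_start_xyz, 'xyz', viewer=(0, 1))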
viewer.setViewStyle({"style": "outline", "color": "black", "width": 0.1})
viewer.setStyle({"stick": {}})
viewer.show()

In [10]:
# assumption: m3 is the ring molecule from the previous cell (it is not defined elsewhere in this section)
m3 = Chem.AddHs(Chem.MolFromSmiles('C1CCCCCCC1'))
AllChem.EmbedMolecule(m3)
AllChem.UFFOptimizeMolecule(m3)
mol_xyz = Chem.MolToXYZBlock(m3)
print(mol_xyz)

viewer = p3d.view(width=400, height=300)
# black outline for nicer-looking figures
viewer.setViewStyle({"style": "outline", "color": "black", "width": 0.1})
viewer.addModel(mol_xyz, 'xyz')
# visualize with the stick option - can also consider spheres and more
viewer.setStyle({"stick": {}})
# rotate for a better initial view
viewer.rotate(-90, "y")
viewer.show()

## Branched molecules

For branching, it...

- octane isomers
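
In SMILES, a branch is written in parentheses directly after the atom it is attached to. As an illustration, a few of the octane isomers (all C8H18) could be written as below; the selection of isomers is only an example. The carbon count is a simple sanity check that works here only because every atom in these strings is a carbon.

```python
# a few octane isomers; parentheses mark branches off the main chain
octane_isomers = {
    "n-octane": "CCCCCCCC",
    "2-methylheptane": "CC(C)CCCCC",
    "3-methylheptane": "CCC(C)CCCC",
    "2,2,4-trimethylpentane": "CC(C)(C)CC(C)C",
}

for name, smiles in octane_isomers.items():
    n_carbon = smiles.count("C")
    print(f"{name:24s} {smiles:16s} {n_carbon} carbons")
```

Each string can be passed to the `smilestoxyz` function defined earlier in this section to build and view the structure.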

## Ring systems

Ring systems are constructed

- cyclohexane
- benzene
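
Rings are written in SMILES by breaking one ring bond and labeling the two atoms it connected with the same digit. Lowercase atom symbols denote aromatic atoms. The sketch below lists which atoms are joined by ring-closure digits; `ring_closures` is a hypothetical helper that only handles single-letter atom symbols and single-digit labels, which is enough for the two examples here.

```python
def ring_closures(smiles):
    """Return (atom_i, atom_j) index pairs joined by ring-closure digits."""
    open_rings = {}   # digit -> index of the atom that opened the ring
    pairs = []
    atom = -1         # index of the most recent atom
    for ch in smiles:
        if ch.isalpha():
            atom += 1
        elif ch.isdigit():
            if ch in open_rings:
                # second occurrence of the digit closes the ring
                pairs.append((open_rings.pop(ch), atom))
            else:
                open_rings[ch] = atom
    return pairs

ring_smiles = {
    "cyclohexane": "C1CCCCC1",
    "benzene": "c1ccccc1",
}
for name, smiles in ring_smiles.items():
    print(name, smiles, ring_closures(smiles))
```

In both strings the first and sixth atoms carry the label `1`, so they are bonded to each other and close a six-membered ring.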

Ring and substituted branch?

## More complex systems

Guanine (see the [wiki](https://en.wikipedia.org/wiki/Guanine) entry) in its two tautomeric forms:

- keto

- enol
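
The two tautomers differ only in where one hydrogen sits and which bonds are double, so they are written as two different SMILES strings with the same molecular formula (C5H5N5O). The strings below are assumptions written for this sketch and should be compared against a reference structure before use. The count again relies on every heavy atom having a single-letter uppercase symbol.

```python
# assumed SMILES strings for the two guanine tautomers (verify before use)
guanine_tautomers = {
    "keto": "C1=NC2=C(N1)C(=O)NC(=N2)N",
    "enol": "C1=NC2=C(N1)C(O)=NC(=N2)N",
}

for name, smiles in guanine_tautomers.items():
    # count heavy atoms per element as a formula check
    heavy = {el: smiles.count(el) for el in "CNO"}
    print(name, smiles, heavy)
```

Both strings can be passed to `smilestoxyz` and viewed side by side, in the same way as the earlier examples.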