# Make LaTeX tables

This notebook generates the LaTeX tables for the supporting information.

In [13]:
import pickle

In [14]:
def process_smirks(smirks):
    """Wrap a SMIRKS string in \\texttt, escaping '~' and '#' for LaTeX."""
    new_smirks = smirks.replace('~', '$\\sim$')
    new_smirks = new_smirks.replace('#', '\\#')
    return '\\texttt{%s}' % new_smirks
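
`process_smirks` only rewrites `~` and `#`. SMIRKS strings can also contain other characters that LaTeX treats specially, such as `$` in recursive patterns, `&`, or `%`. The sketch below is a hypothetical variant, `escape_smirks`, that is not part of this notebook. It escapes one character at a time so that the `$` signs introduced for `$\sim$` are never escaped a second time. Whether these extra characters actually occur in the ChemPer output used here is an assumption to check against the data.

```python
# Hypothetical helper, not part of the original notebook.
# Maps each LaTeX-special character to a text-mode replacement.
_LATEX_ESCAPES = {
    '~': '$\\sim$',
    '#': '\\#',
    '$': '\\$',
    '&': '\\&',
    '%': '\\%',
    '_': '\\_',
    '^': '\\textasciicircum{}',
}


def escape_smirks(smirks):
    """Escape LaTeX-special characters and wrap the result in \\texttt."""
    # Work character by character so replacements are never re-escaped.
    escaped = ''.join(_LATEX_ESCAPES.get(ch, ch) for ch in smirks)
    return '\\texttt{%s}' % escaped


print(escape_smirks('[#6:1]~[#8:2]'))
# prints \texttt{[\#6:1]$\sim$[\#8:2]}
```

For patterns that contain only `~` and `#` as special characters, this gives the same result as `process_smirks`.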

In [15]:
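# Load the reduced SMIRKS dictionary; the file is closed once loading finishes.
with open('./mol_files/reduced_smirks_dict_5k.p', 'rb') as f:
    d = pickle.load(f)
lines = []  # accumulates the LaTeX source for every table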

## Longest SMIRKS

Sorting by longest SMIRKS was the ordering that worked for the most fragment types, so those are the patterns shown in the supporting information PDF. Each parameter type is handled separately because each stores a different set of parameters in its label.

**Bond**

The label for bonds has the form `zz_[k]\t[length]`, where k is the force constant in the AMBER convention (technically k/2) and length is in Å.
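
As a minimal sketch of how such a label breaks apart, the snippet below parses one bond label and converts the AMBER-style constant to the conventional harmonic one. In the AMBER convention the bond energy is written k(r - r0)^2, so the constant in E = (1/2) k_h (r - r0)^2 is k_h = 2k. The function name `parse_bond_label` and the example values are illustrative assumptions, not taken from the pickle file.

```python
def parse_bond_label(label):
    """Split a 'zz_[k]\\t[length]' label into (k, length) as floats."""
    k_amber, length = label.replace('zz_', '').split()
    return float(k_amber), float(length)


# Illustrative label only; real labels come from the loaded dictionary.
k_amber, length = parse_bond_label('zz_317.0\t1.522')

# AMBER stores k such that E = k * (r - r0)**2, i.e. half the
# conventional harmonic force constant.
k_harmonic = 2.0 * k_amber
print(k_amber, length, k_harmonic)
```

The tables in this notebook print the label values unchanged, so the reported k stays in the AMBER form.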


In [16]:
lines.append('\\begin{longtable}{>{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.5\\textwidth}} \n')
lines.append('\\hline \n')
lines.append('\\multicolumn{3}{c}{Bond Parameters} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\textbf{$k$} & \\textbf{$l$} & \\textbf{\\texttt{SMIRKS}} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\endhead')

for param, smirks in d['bond']['big_smirks']['output_5k']:
    p = param.replace('zz_', '')
    k, l = p.split()
    lines.append('%s & %s & %s \\\\ \n' % (k, l, process_smirks(smirks)))

lines.append('\\hline')
lines.append('\\caption{These are the bond parameters from the reference force field with the associated \\texttt{SMIRKS} patterns created with ChemPer} \n')
lines.append('\\label{tab:protein_bond}\n')
lines.append('\\end{longtable}\n\n\n')



**Angle**

The label for angles has the form `zz_[k]\t[angle]`, where k is the force constant in the AMBER convention (technically k/2) and angle is in degrees.

In [17]:
lines.append('\\begin{longtable}{>{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.5\\textwidth}} \n')
lines.append('\\hline \n')
lines.append('\\multicolumn{3}{c}{Angle Parameters} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\textbf{$k$} & \\textbf{$\\theta$} & \\textbf{\\texttt{SMIRKS}} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\endhead')

for param, smirks in d['angle']['big_smirks']['output_5k']:
    p = param.replace('zz_', '')
    k, theta = p.split()
    lines.append('%s & %s & %s \\\\ \n' % (k, theta, process_smirks(smirks)))

lines.append('\\hline')
lines.append('\\caption{These are the angle parameters from the reference force field with the associated \\texttt{SMIRKS} patterns created with ChemPer} \n')
lines.append('\\label{tab:protein_angle}\n')
lines.append('\\end{longtable}\n\n\n')

**Improper Torsions**

The label for improper torsions has the form `zz_[u]\t[w]\t[n]`. The fields are reordered to n, w, u, with the decimal places dropped from n and w.
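
The reordering can be sketched as below. Dropping the decimals with `split('.')[0]` truncates the string, so a value such as `179.9` would become `179`; the second function shows a rounding alternative. Both function names and the example labels are hypothetical, and only the first three fields of the label are assumed to matter.

```python
def format_improper(label):
    """Return (n, w, u) strings, truncating n and w at the decimal point."""
    u, w, n = label.replace('zz_', '').split()[:3]
    return n.split('.')[0], w.split('.')[0], u


def format_improper_rounded(label):
    """Same as format_improper, but round n and w to the nearest integer."""
    u, w, n = label.replace('zz_', '').split()[:3]
    return '%d' % round(float(n)), '%d' % round(float(w)), u


# Illustrative labels only.
print(format_improper('zz_1.1\t180.0\t2.0'))          # ('2', '180', '1.1')
print(format_improper('zz_1.1\t179.9\t2.0'))          # ('2', '179', '1.1')
print(format_improper_rounded('zz_1.1\t179.9\t2.0'))  # ('2', '180', '1.1')
```

When every n and w in the data ends in `.0`, the two approaches give the same table.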

In [18]:
lines.append('\\begin{longtable}{>{\\baselineskip=10pt}p{.05\\textwidth} >{\\baselineskip=10pt}p{.07\\textwidth} >{\\baselineskip=10pt}p{.12\\textwidth} >{\\baselineskip=10pt}p{.72\\textwidth}} \n')
lines.append('\\hline \n')
lines.append('\\multicolumn{4}{c}{Improper Torsion Parameters} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\textbf{$n$} & \\textbf{$\\omega$} & \\textbf{$u$} & \\textbf{\\texttt{SMIRKS}} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\endhead')

for param, smirks in d['improper_torsion']['big_smirks']['output_5k']:
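    p = param.replace('zz_', '')
    # The unpacking expects a fourth field, which the table does not use.
    u, w, n, _ = p.split()
    # Drop the decimal part so n and w print as integers.
    w = w.split('.')[0]
    n = n.split('.')[0]

    lines.append('%s & %s & %s & %s \\\\ \n' % (n, w, u, process_smirks(smirks)))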

lines.append('\\hline')
lines.append('\\caption{These are the improper torsion parameters from the reference force field with the associated \\texttt{SMIRKS} patterns created with ChemPer} \n')
lines.append('\\label{tab:protein_improper}\n')
lines.append('\\end{longtable}\n\n\n')

**Proper Torsion**

Proper torsions can have multiple torsion terms assigned to them, so their labels repeat the three fields once per term:
`zz_[u]\t[w]\t[n]\t[u2]\t[w2]\t[n2]...`
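
A minimal sketch of splitting such a label into one tuple per term is shown below. The function name `split_torsion_terms` and the example label are illustrative assumptions; the explicit length check is an addition that turns a malformed label into a clear error.

```python
def split_torsion_terms(label):
    """Split a proper torsion label into a list of (n, w, u) string tuples."""
    fields = label.replace('zz_', '').split('\t')
    if len(fields) % 3 != 0:
        raise ValueError('expected a multiple of three fields, got %i' % len(fields))
    terms = []
    # Walk the fields three at a time: u, w, n for each torsion term.
    for idx in range(0, len(fields), 3):
        u, w, n = fields[idx:idx + 3]
        terms.append((n.split('.')[0], w.split('.')[0], u))
    return terms


# Illustrative two-term label only.
terms = split_torsion_terms('zz_0.2\t180.0\t2.0\t0.25\t0.0\t3.0')
print(terms)  # [('2', '180', '0.2'), ('3', '0', '0.25')]
```

Each tuple then becomes one table row, with the SMIRKS pattern printed only on the first row of a multi-term torsion.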

In [19]:
lines.append('\\begin{longtable}{>{\\baselineskip=10pt}p{.05\\textwidth} >{\\baselineskip=10pt}p{.07\\textwidth} >{\\baselineskip=10pt}p{.12\\textwidth} >{\\baselineskip=10pt}p{.72\\textwidth}} \n')
lines.append('\\hline \n')
lines.append('\\multicolumn{4}{c}{Proper Torsion Parameters} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\textbf{$n$} & \\textbf{$\\omega$} & \\textbf{$u$} & \\textbf{\\texttt{SMIRKS}} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\endhead')

for param, smirks in d['proper_torsion']['big_smirks']['output_5k']:
    p = param.replace('zz_', '').split('\t')
    # Collect one (n, w, u) tuple per torsion term; each term spans three fields.
    torsions = []
    for idx in range(0, len(p), 3):
        u = p[idx]
        w = p[idx+1].split('.')[0]
        n = p[idx+2].split('.')[0]
        torsions.append((n, w, u))
    if len(torsions) == 1:
        n, w, u = torsions[0]
        lines.append('%s & %s & %s & %s \\\\ \n' % (n, w, u, process_smirks(smirks)))
    else:
        # Several terms share one SMIRKS: span it across their rows with \multirow.
        ts = len(torsions)
        n, w, u = torsions[0]
        lines.append('%s & %s & %s & \\multirow[t]{%i}{*}{%s} \\\\ \n' % (n, w, u, ts, process_smirks(smirks)))
        for n, w, u in torsions[1:]:
            lines.append('%s & %s & %s & \\\\ \n' % (n, w, u))
        
lines.append('\\hline')
lines.append('\\caption{These are the proper torsion parameters from the reference force field with the associated \\texttt{SMIRKS} patterns created with ChemPer} \n')
lines.append('\\label{tab:protein_proper}\n')
lines.append('\\end{longtable}\n\n\n')

**Lennard-Jones**

The label for Lennard-Jones parameters has the form `zz_[epsilon]\t[rmin_half]`.

In [20]:
# table header
lines.append('\\begin{longtable}{>{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.5\\textwidth}} \n')
lines.append('\\hline \n')
lines.append('\\multicolumn{3}{c}{Lennard-Jones Parameters} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\textbf{$\\epsilon$} & \\textbf{$r_{min}/2$} & \\textbf{\\texttt{SMIRKS}} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\endhead')

for param, smirks in d['lj']['big_smirks']['output_5k']:
    p = param.replace('zz_', '')
    e,r = p.split()
    lines.append('%s & %s & %s \\\\ \n' % (e, r, process_smirks(smirks)))
    
lines.append('\\hline')
lines.append('\\caption{These are the Lennard-Jones parameters from the reference force field with the associated \\texttt{SMIRKS} patterns created with ChemPer} \n')
lines.append('\\label{tab:protein_lj}\n')
lines.append('\\end{longtable}\n\n\n')



**Charges** 

The charges did not pass when sorted by longest SMIRKS, so here we use the one sorting where they did pass, `biggest_size`.

```python
lines.append('\\begin{longtable}{>{\\baselineskip=10pt}p{.2\\textwidth} >{\\baselineskip=10pt}p{.75\\textwidth}} \n')
lines.append('\\hline \n')
lines.append('\\textbf{$q$} & \\textbf{\\texttt{SMIRKS}} \\\\ \n')
lines.append('\\hline \n')
lines.append('\\endhead')

for param, smirks in d['charge']['biggest_size']['output_5k']:
    q = param.replace('zz_','').split('\t')[0]
    lines.append('%s & %s \\\\ \n' % (q, process_smirks(smirks)))
    
lines.append('\\hline')
lines.append('\\caption{These are the charge parameters from the reference force field with the associated \\texttt{SMIRKS} patterns created with ChemPer} \n')
lines.append('\\label{tab:protein_charge}\n')
lines.append('\\end{longtable}\n\n\n')
```

### Save lines to tex file

In [21]:
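# The generated file expects the including document to load the
# longtable, array (for the >{...} column specs) and multirow packages.
with open('protein_smirks.tex', 'w') as f:
    f.writelines(lines)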
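
Every table in this notebook repeats the same skeleton: a `longtable` opening, an optional title row, a bold header row, the data rows, then caption and label. As a hedged sketch of how that repetition could be factored out, the hypothetical helper `longtable` below builds the lines for one table. Its name and arguments are assumptions for illustration and are not used elsewhere in this notebook.

```python
def longtable(colspec, header_cells, rows, caption, label, title=None):
    """Return the LaTeX source lines of one longtable as a list of strings."""
    out = ['\\begin{longtable}{%s} \n' % colspec, '\\hline \n']
    if title is not None:
        # Title row spanning every column.
        out.append('\\multicolumn{%i}{c}{%s} \\\\ \n' % (len(header_cells), title))
        out.append('\\hline \n')
    out.append(' & '.join('\\textbf{%s}' % c for c in header_cells) + ' \\\\ \n')
    out.append('\\hline \n')
    out.append('\\endhead \n')
    # One line per data row; each row is a sequence of already formatted cells.
    out.extend(' & '.join(row) + ' \\\\ \n' for row in rows)
    out.append('\\hline \n')
    out.append('\\caption{%s} \n' % caption)
    out.append('\\label{%s}\n' % label)
    out.append('\\end{longtable}\n\n\n')
    return out


# Illustrative call with made-up data.
table = longtable(
    'p{.2\\textwidth} p{.7\\textwidth}',
    ['$q$', '\\texttt{SMIRKS}'],
    [('0.1', '\\texttt{[\\#1:1]}')],
    'Example caption',
    'tab:example',
)
print(''.join(table))
```

With a helper like this, each parameter type would only need to supply its column spec, header cells, and formatted rows.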