# E. coli Credentialing samples, run RP ESI+


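Demo of the annoTree functions in mass2chem. Each node is written as m/z@retention time, followed by its relationship to the parent node. Example of a tree: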

```
95.0492@30.5
├── 101.0693@30.8 13C/12C*6
└── 136.0758@30.5 Acetonitrile
    └── 142.0958@30.8 13C/12C*6
```
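The edges in this example tree can be checked by hand. The sketch below only redoes the arithmetic with the mass differences that appear later in this notebook (1.003355 per 13C label, 41.026549 for acetonitrile); `ppm_error` is a hypothetical helper written for this illustration, not part of mass2chem.

```python
# Mass differences (Da) taken from the search patterns used in this notebook.
C13_C12 = 1.003355        # 13C minus 12C
ACETONITRILE = 41.026549  # acetonitrile adduct shift

# m/z values of the four nodes in the example tree.
root = 95.0492
labeled_root = 101.0693
acn_adduct = 136.0758
labeled_acn_adduct = 142.0958

def ppm_error(observed, expected):
    """Relative deviation of an observed m/z from the expected m/z, in ppm."""
    return (observed - expected) / expected * 1e6

checks = {
    "13C/12C*6 on root": ppm_error(labeled_root, root + 6 * C13_C12),
    "Acetonitrile on root": ppm_error(acn_adduct, root + ACETONITRILE),
    "13C/12C*6 on adduct": ppm_error(labeled_acn_adduct, acn_adduct + 6 * C13_C12),
}
for edge, err in checks.items():
    print(f"{edge}: {err:+.2f} ppm")
```

All three edges agree with the expected mass difference to within about 1 ppm.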

In [1]:
!pip install --upgrade mass2chem

Requirement already up-to-date: mass2chem in /opt/conda/lib/python3.7/site-packages (0.4.1)


In [2]:
from mass2chem.io import read_features
from mass2chem.annotree import *

In [3]:
f = 'export/full_Feature_table.tsv'
flist = read_features(f, id_col=0, mz_col=1, rtime_col=2, intensity_cols=(11, 17))
print(len(flist), flist[5])

table headers ordered:  mz rtime
Read 3602 feature lines
3602 {'id': 'F6', 'mz': 115.988, 'rtime': 19.63, 'intensities': [506535.0, 510552.0, 451142.0, 549764.0, 442653.0, 549216.0], 'representative_intensity': 501643.6666666667}
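For the record printed above, `representative_intensity` matches the arithmetic mean of the six intensity columns. The check below uses only the printed numbers; whether mass2chem always uses the mean is an assumption not verified here.

```python
from statistics import mean

# Intensities and reported value copied from the printed record for feature F6.
intensities = [506535.0, 510552.0, 451142.0, 549764.0, 442653.0, 549216.0]
reported = 501643.6666666667

computed = mean(intensities)
# True when the reported value agrees with the mean of the six intensities.
print(computed, abs(computed - reported) < 1e-6)
```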


In [4]:
# Build the initial trees from isotopic patterns.
# search_patterns: must list every possible number of 13C labels.
#            If *6 exists, there is no guarantee that *4 exists, so we cannot rely on pairwise steps to connect all pairs.
#            The third item, (0, 0.8) here, is an option to constrain ratios; it is not used in this function.

n2tree = construct_isotopic_trees(
            flist, 
            search_patterns=[
                   (1.003355, '13C/12C', (0, 0.8)),
                   (2.00671, '13C/12C*2', (0, 0.8)),
                   (3.010065, '13C/12C*3', (0, 0.8)),
                   (4.01342, '13C/12C*4', (0, 0.8)),
                   (5.016775, '13C/12C*5', (0, 0.8)),
                   (6.02013, '13C/12C*6', (0, 0.8)),
                  ],)

Found 719 isotopic pairs, 455 trees and 264 in branches.
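The six search patterns are multiples of the 13C/12C mass difference, so the list can also be generated rather than typed. This is a minimal sketch; `make_13c_patterns` is a hypothetical helper for illustration and not a mass2chem function. It reproduces the same tuples as the literal list in the cell above.

```python
C13_C12 = 1.003355  # 13C minus 12C, in Da
MAX_LABELS = 6      # glucose has six carbons

def make_13c_patterns(max_labels=MAX_LABELS, ratio_range=(0, 0.8)):
    """Return (mass shift, label, ratio range) tuples for 1..max_labels 13C labels."""
    patterns = []
    for n in range(1, max_labels + 1):
        label = "13C/12C" if n == 1 else f"13C/12C*{n}"
        # Round to 6 decimals to match the precision of the hand-typed values.
        patterns.append((round(n * C13_C12, 6), label, ratio_range))
    return patterns

patterns = make_13c_patterns()
for p in patterns:
    print(p)
```

The result could be passed as `search_patterns=patterns` in place of the literal list.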


In [5]:
# Merge adducts 
gt2 = merge_trees_by_insrc_modifications(
            set(n2tree.values()), 
            flist,
            search_patterns=[                  
                    (1.0078, 'H'),
                    (21.9820, 'Na/H'), # Na replacing H
                    (10.991, 'Na/H, double charged'),
                    (18.0106, '+H2O'), 
                    (18.033823, '+NH4'),
                    (37.9559, '39K/H'),
                    (39.9540, '41K/H'),
                    (41.026549, 'Acetonitrile'),], 
)

Merging adducts on 409 trees...
Unresolved multiple relationships:  {'F2755', 'F590', 'F834', 'F835', 'F323', 'F685', 'F3097', 'F943', 'F641', 'F1629', 'F2317', 'F1123', 'F1579', 'F873'}
Got 312 merged trees.
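Most of the mass shifts in the adduct list can be derived from monoisotopic atomic masses. The sketch below does this for a subset of them as a sanity check; the atomic masses are rounded values entered by hand for this illustration, `+NH4` is left out, and halving the shift for the doubly charged case is an assumption about how that entry was obtained.

```python
# Monoisotopic masses in Da, rounded; entered by hand for this sketch.
MASS = {"H": 1.007825, "C": 12.0, "N": 14.003074, "O": 15.994915,
        "Na": 22.989769, "39K": 38.963706, "41K": 40.961825}

derived = {
    "H": MASS["H"],
    "Na/H": MASS["Na"] - MASS["H"],                        # Na replacing H
    "Na/H, double charged": (MASS["Na"] - MASS["H"]) / 2,  # shift in m/z at z = 2
    "+H2O": 2 * MASS["H"] + MASS["O"],
    "39K/H": MASS["39K"] - MASS["H"],
    "41K/H": MASS["41K"] - MASS["H"],
    "Acetonitrile": 2 * MASS["C"] + 3 * MASS["H"] + MASS["N"],  # C2H3N
}

# Values used in search_patterns in the cell above, for comparison.
used = {"H": 1.0078, "Na/H": 21.9820, "Na/H, double charged": 10.991,
        "+H2O": 18.0106, "39K/H": 37.9559, "41K/H": 39.9540,
        "Acetonitrile": 41.026549}

for label, value in derived.items():
    print(f"{label:22s} derived {value:.4f}  used {used[label]:.4f}")
```

Each derived shift agrees with the value used in the notebook to within 0.001 Da.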


In [6]:
num_labels = find_trees_by_datatag_list(gt2)
for x in num_labels:
    print(len(x))

312
31
52
51
73
57


In [7]:
# Show first 15 trees in the *6 labeled group
for t in num_labels[5][:15]:
    t.show()

223.0248@21.5 anchor
└── 229.045@22.0 13C/12C*6

322.9932@43.4 anchor
└── 329.0134@44.9 13C/12C*6

158.0814@24.6 anchor
└── 164.1014@24.1 13C/12C*6

237.0906@25.3 anchor
└── 243.1108@24.1 13C/12C*6

285.0993@30.3 anchor
└── 291.1191@29.9 13C/12C*6

243.0266@34.5 anchor
└── 249.0468@34.5 13C/12C*6

188.1759@20.6 anchor
└── 194.196@20.8 13C/12C*6

232.1931@30.5 anchor
└── 238.2134@30.5 13C/12C*6

263.0529@33.3 anchor
└── 269.0732@33.8 13C/12C*6

131.118@25.1 anchor
└── 137.1381@24.8 13C/12C*6

132.115@25.1 anchor
└── 138.1354@25.1 13C/12C*6

145.0497@68.9 anchor
└── 151.0702@68.9 13C/12C*6

126.0916@30.3 anchor
└── 129.1023@30.5 13C/12C*3
    └── 135.1225@30.5 13C/12C*6

236.9053@20.8 anchor
└── 242.9257@21.8 13C/12C*6

182.0813@30.5 anchor
└── 185.0922@29.6 13C/12C*3
    ├── 190.1081@30.8 13C/12C*5
    └── 191.1114@30.5 13C/12C*6



In [8]:
# These are more likely naturally occurring 13C
for t in num_labels[0][:5]:
    t.show()

632.2232@30.8 anchor
└── 633.2265@31.0 13C/12C

159.9899@25.9 anchor
└── 164.0039@25.9 13C/12C*4

162.022@30.5 anchor
└── 167.0389@30.3 13C/12C*5

344.0767@31.2 anchor
└── 347.0852@32.8 13C/12C*3

164.9948@37.8 anchor
└── 169.0083@37.1 13C/12C*4



In [9]:
# Export to JSON

export_json_trees(gt2, outfile="test.json")

## Conclusion

We used the annoTree functions to find all isotopologues in this 13C-glucose tracing dataset and grouped them with their adducts.

Each tree is considered an empirical compound, as defined in https://github.com/shuzhao-li/metDataModel
