# Pre-Annotation and Khipu

Pre-Annotation assigns features to ion relations (such as adducts) and isotopologues. The result is a set of EmpiricalCompounds: collections of features likely to represent the same chemical entity. Pre-Annotation is not unique to Khipu, but its empirical compounds are. These computable data structures underpin annotation in the PCPFM.
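To make the idea concrete, the following is a minimal sketch of the kind of matching that pre-annotation relies on: two features are linked when they co-elute and their m/z values differ by a known mass difference, here the 13C/12C difference of about 1.003355 Da. This is not Khipu's implementation. The helper names (`within_ppm`, `find_pairs`) and the toy feature dictionaries are hypothetical and exist only for illustration.

```python
from itertools import combinations

# Mass difference between 13C and 12C, in Da.
C13_C12_DELTA = 1.003355


def within_ppm(observed, expected, ppm):
    """Return True if the observed m/z lies within `ppm` of the expected m/z."""
    return abs(observed - expected) <= expected * ppm * 1e-6


def find_pairs(features, delta, ppm=5, rt_tolerance=2):
    """Return (lighter_id, heavier_id) pairs that co-elute and differ by `delta` in m/z.

    Hypothetical helper for illustration only; it compares every pair of features.
    """
    pairs = []
    ordered = sorted(features, key=lambda f: f["mz"])
    for lighter, heavier in combinations(ordered, 2):
        # Features from the same compound should elute together.
        if abs(lighter["rtime"] - heavier["rtime"]) > rt_tolerance:
            continue
        if within_ppm(heavier["mz"], lighter["mz"] + delta, ppm):
            pairs.append((lighter["id"], heavier["id"]))
    return pairs


# Toy features (made-up values, not taken from the E. coli dataset).
toy_features = [
    {"id": "F1", "mz": 180.06339, "rtime": 30.0},
    {"id": "F2", "mz": 181.06674, "rtime": 30.5},  # F1 plus one 13C, co-eluting
    {"id": "F3", "mz": 181.06674, "rtime": 95.0},  # same m/z, but elutes much later
]

print(find_pairs(toy_features, C13_C12_DELTA))
```

Only F1 and F2 are paired, because F3 falls outside the retention time window. Khipu applies the same principle across many isotope and adduct patterns at once and then organizes the linked features into a tree per compound.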

First, we will install Khipu and demonstrate standalone usage before working with the pipeline or applying the software to isotope-labelled data.

In [None]:
!pip3 install --upgrade khipu-metabolomics

In [None]:
from khipu.extended import *

In [None]:
for x in (adduct_search_patterns, isotope_search_patterns, extended_adducts):
    print(x, '\n')

In [None]:
# 12C data

# read the feature table, closing the file handle once the text is loaded
with open("../Datasets/ecoli_pos.tsv") as f:
    peaklist_12C = read_features_from_text(f.read(), id_col=0, mz_col=1, rtime_col=2,
                                           intensity_cols=(3, 6), delimiter="\t")

In [None]:
# filter low intensity peaks

peaklist_12C = [p for p in peaklist_12C if p['representative_intensity'] > 1]
print(len(peaklist_12C))

In [None]:
# 13C data

# same feature table, but using the intensity columns of the 13C samples
with open("../Datasets/ecoli_pos.tsv") as f:
    peaklist_13C = read_features_from_text(f.read(), id_col=0, mz_col=1, rtime_col=2,
                                           intensity_cols=(6, 9), delimiter="\t")

In [None]:
# filter low intensity peaks

peaklist_13C = [p for p in peaklist_13C if p['representative_intensity'] > 1]
print(len(peaklist_13C))

In [None]:
# let's perform pre-annotation

subnetworks, peak_dict, edge_dict = peaks_to_networks(peaklist_12C,
                    isotope_search_patterns,
                    adduct_search_patterns,
                    mz_tolerance_ppm=5,
                    rt_tolerance=2)

WV = Weavor(peak_dict, isotope_search_patterns=isotope_search_patterns, 
                adduct_search_patterns=adduct_search_patterns, 
                mz_tolerance_ppm=5, 
                mode='pos')

khipu_list = graphs_to_khipu_list(
        subnetworks, WV, mz_tolerance_ppm=5,)

print(len(subnetworks), len(khipu_list))

list_assigned_peaks = []
for KP in khipu_list:
    list_assigned_peaks += list(KP.feature_map.keys())
    
print(len(list_assigned_peaks))
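The count printed above can be turned into a quick sanity check. The sketch below assumes that the assigned entries are feature ids and that the full set of feature ids is available as a list; the helper `summarize_assignment` and the toy ids are hypothetical and not part of Khipu.

```python
from collections import Counter


def summarize_assignment(assigned_ids, all_ids):
    """Summarize how many features were assigned, and flag duplicates.

    Hypothetical helper for illustration; both arguments are lists of feature ids.
    """
    counts = Counter(assigned_ids)
    # A feature listed more than once was placed in more than one group.
    duplicated = sorted(i for i, n in counts.items() if n > 1)
    unassigned = sorted(set(all_ids) - set(counts))
    fraction = len(counts) / len(all_ids) if all_ids else 0.0
    return {
        "fraction_assigned": fraction,
        "duplicated": duplicated,
        "unassigned": unassigned,
    }


# Toy ids (made up); with real data one might pass list_assigned_peaks
# and list(peak_dict.keys()) instead, assuming both hold feature ids.
toy_all = ["F1", "F2", "F3", "F4"]
toy_assigned = ["F1", "F2", "F2"]

print(summarize_assignment(toy_assigned, toy_all))
```

A low assigned fraction is not necessarily a problem: features without a detectable isotope or adduct partner are expected to remain unassigned at this stage.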



In [None]:
# import json explicitly rather than relying on the star import above
import json

print(json.dumps(peak_dict, indent=4))

In [None]:
from khipu.extended import *


ext_khipu_list, all_assigned_peaks = extend_khipu_list(khipu_list,
                    list(peak_dict.values()),
                    extended_adducts,
                    mz_tolerance_ppm=5,
                    rt_tolerance=2)

print(len(ext_khipu_list), len(all_assigned_peaks))
