# pyHDX basics

In [1]:
from pyhdx import PeptideMasterTable, read_dynamx
import os

In [2]:
data_dir = '../tests/test_data'
filename = 'ecSecB_apo.csv'
fpath = os.path.join(data_dir, filename)

The ``read_dynamx`` function reads the file and returns a ``numpy`` structured array in which each entry corresponds to
one peptide; this example file contains 567 peptides.

In [3]:
data = read_dynamx(fpath)
len(data)

567
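As a brief aside on working with structured arrays, the sketch below builds a small one by hand and shows how to list its fields, read one column, and select rows. The field names and values here are hypothetical and chosen only for illustration; the array returned by ``read_dynamx`` has its own fields, which can be listed with ``data.dtype.names``.

```python
import numpy as np

# Hypothetical field names and values, for illustration only; they are not
# taken from the DynamX file used in this notebook.
dtype = [("start", int), ("end", int), ("sequence", "U20"), ("exposure", float)]
peptides = np.array(
    [
        (1, 5, "MSEQN", 0.0),
        (1, 5, "MSEQN", 0.5),
        (3, 9, "EQNNTEM", 0.5),
    ],
    dtype=dtype,
)

print(len(peptides))         # number of entries
print(peptides.dtype.names)  # names of the available fields
print(peptides["sequence"])  # one field across all entries

# Boolean masks select entries, just like for regular numpy arrays
half_minute = peptides[peptides["exposure"] == 0.5]
print(len(half_minute))
```

The same indexing patterns should apply to the ``data`` array loaded above.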

This array is loaded into the ``PeptideMasterTable`` class, which is the main data entry class. The ``drop_first``
argument sets the number of N-terminal residues to remove, and with ``ignore_prolines`` proline residues, which do not
have exchanging amide hydrogens, can be ignored.


In [4]:
master_table = PeptideMasterTable(data, drop_first=1, ignore_prolines=True)
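To make the effect of these two options concrete, here is a minimal sketch that counts the residues of a single peptide that remain after trimming. The function ``exchanging_residues`` and the example sequence are hypothetical and are not part of pyHDX; how ``PeptideMasterTable`` applies these options internally may differ.

```python
def exchanging_residues(sequence, drop_first=1, ignore_prolines=True):
    """Count residues left after trimming (hypothetical helper, not pyHDX API)."""
    # Remove the requested number of N-terminal residues
    trimmed = sequence[drop_first:]
    # Prolines have no exchanging amide hydrogen, so optionally skip them
    if ignore_prolines:
        trimmed = trimmed.replace("P", "")
    return len(trimmed)


# Made-up five-residue peptide containing one proline
print(exchanging_residues("MSPEQ"))                         # 3
print(exchanging_residues("MSPEQ", ignore_prolines=False))  # 4
print(exchanging_residues("MSPEQ", drop_first=2))           # 2
```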

This master table allows us to control how the deuterium uptake content is determined. The method ``set_control`` can be
used to choose which set of peptides is used as the fully deuterated (FD) control. This adds a new field called 'uptake'
which is the normalized (to 100%) deuterium uptake of each peptide. 

In [5]:
master_table.set_control(('Full deuteration control', 0.167))
master_table.data['uptake'][:50]

array([ 0.      ,  0.      ,  5.0734  ,  2.486444,  2.857141,  3.145738,
        3.785886,  4.08295 ,  4.790625,  0.      ,  0.      ,  3.642506,
        1.651437,  1.860919,  2.107151,  2.698036,  2.874801,  3.449561,
        0.      ,  0.      ,  5.264543,  1.839924,  2.508343,  2.969332,
        3.399092,  3.485568,  4.318144,  0.      ,  0.      ,  6.3179  ,
        2.532099,  3.306167,  3.996718,  4.38941 ,  4.379495,  5.283969,
        0.      ,  0.      ,  6.812215,  3.11985 ,  3.874881,  4.342807,
        4.854057,  4.835639,  5.780219,  0.      ,  0.      , 10.8151  ,
        5.432395,  6.1318  ])
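For orientation, one common way to express uptake relative to a fully deuterated control is to divide the measured uptake of each peptide by the uptake of the same peptide in the FD sample. The sketch below shows that calculation under this assumption; the function ``relative_uptake`` and the numbers are hypothetical, and the exact calculation performed by ``set_control`` may differ.

```python
import numpy as np


def relative_uptake(uptake, fd_uptake, zero_uptake=0.0):
    """Uptake as a percentage of the FD control (hypothetical helper)."""
    uptake = np.asarray(uptake, dtype=float)
    fd_uptake = np.asarray(fd_uptake, dtype=float)
    # Scale so that the zero reference maps to 0 % and the FD control to 100 %
    return 100.0 * (uptake - zero_uptake) / (fd_uptake - zero_uptake)


# Made-up uptake values for one peptide at three timepoints
percent = relative_uptake([0.0, 2.5, 5.0], [5.0, 5.0, 5.0])
print(percent)
```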

Next we'll split the data and group them by state. This returns a dictionary where the keys are the state names and
the values are all peptides for a given state. The peptides for each state are grouped by their exposure time, forming
a ``KineticsSeries`` object.

In [6]:
states = master_table.groupby_state()
print(states.keys())

dict_keys(['Full deuteration control', 'SecB WT apo'])


In [9]:
series = states['SecB WT apo']
type(series), len(series), series.times

(pyhdx.pyhdx.KineticsSeries,
 7,
 array([  0.      ,   0.167   ,   0.5     ,   1.      ,   5.      ,
         10.      , 100.000008]))
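The two-level grouping described above (first by state, then by exposure time) can be sketched with plain ``numpy`` on a small structured array. The field names ``state``, ``exposure`` and ``sequence`` and all values are assumptions made for this illustration; this is not how ``groupby_state`` is implemented.

```python
import numpy as np

# Hypothetical fields and values, for illustration only
dtype = [("state", "U30"), ("exposure", float), ("sequence", "U20")]
table = np.array(
    [
        ("FD control", 0.167, "MSEQN"),
        ("apo", 0.0, "MSEQN"),
        ("apo", 0.5, "MSEQN"),
        ("apo", 0.5, "EQNNTEM"),
    ],
    dtype=dtype,
)

# First level: one sub-array per state
by_state = {
    str(state): table[table["state"] == state]
    for state in np.unique(table["state"])
}

# Second level: within one state, one sub-array per exposure time
apo = by_state["apo"]
by_time = {
    float(t): apo[apo["exposure"] == t]
    for t in np.unique(apo["exposure"])
}

print(sorted(by_state))  # ['FD control', 'apo']
print(sorted(by_time))   # [0.0, 0.5]
```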

Iterating over a ``KineticsSeries`` object yields ``PeptideMeasurements`` objects, each with its own attributes
describing the topology of the coverage. When all ``PeptideMeasurements`` in the series have identical coverage, the
series is said to be uniform, which can be checked with the ``uniform`` property. A series can be made uniform with the
``make_uniform`` method, which removes peptides that are not found at all timepoints. A ``KineticsSeries`` is required
to be uniform before it can be fitted.

In [None]:
print(series.uniform)
series.make_uniform()  # This series already is uniform
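The idea behind uniformity can be sketched with plain Python: represent each timepoint as a list of peptides and keep only the peptides present at every timepoint. The functions and the peptide tuples below are hypothetical stand-ins used for illustration; they do not reproduce the pyHDX implementation.

```python
def is_uniform(timepoints):
    """True if every timepoint contains the same set of peptides."""
    reference = set(timepoints[0])
    return all(set(peptides) == reference for peptides in timepoints[1:])


def make_uniform(timepoints):
    """Keep only the peptides that occur at all timepoints."""
    common = set(timepoints[0]).intersection(*timepoints[1:])
    # Preserve the original order of peptides within each timepoint
    return [[p for p in peptides if p in common] for peptides in timepoints]


# Made-up peptides as (start, end, sequence) tuples
t0 = [(1, 5, "MSEQN"), (3, 9, "EQNNTEM")]
t1 = [(1, 5, "MSEQN")]

print(is_uniform([t0, t1]))    # False
trimmed = make_uniform([t0, t1])
print(trimmed)                 # [[(1, 5, 'MSEQN')], [(1, 5, 'MSEQN')]]
print(is_uniform(trimmed))     # True
```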
