In [173]:
import sbmlcore, pandas, numpy, pytest
%load_ext autoreload
%autoreload 2



## Overview of feature addition methods

First, create a mutation dataframe and import it into sbmlcore with `sbmlcore.FeatureDataset`.

There are then three ways to add features to an existing mutation dataframe:
1. `features.add_feature([list of classes])` - this returns a `sbmlcore.FeaturesDataFrame.FeatureDataset` object.
2. `mutation_dataframe + feature` - this returns a `sbmlcore.FeaturesDataFrame.FeatureDataset` object.
3. Use the internal `._add_feature(args)` method of each sbmlcore class - in this case the mutation dataframe being added to must
   already be a `pandas.core.frame.DataFrame`, and the method returns a `pandas.core.frame.DataFrame`.

### Create a mutation dataframe

In [174]:
a = {'segid': ['A', 'A', 'A', 'A'], 'mutation': ['M1D','R2K', 'A3V', 'A3F']}
df1 = pandas.DataFrame.from_dict(a)
df1

Unnamed: 0,segid,mutation
0,A,M1D
1,A,R2K
2,A,A3V
3,A,A3F
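
The mutation strings in this table appear to follow the pattern reference amino acid, position, alternate amino acid, so `M1D` would read as methionine at position 1 replaced by aspartate. That reading is an assumption drawn from the examples, and `parse_mutation` below is a hypothetical helper, not part of sbmlcore. A minimal sketch of splitting such a string:

```python
import re

# One-letter amino acid code, integer position, one-letter amino acid code.
# This pattern is inferred from the examples above, not taken from sbmlcore.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(mutation):
    """Split a string such as 'M1D' into (reference, position, alternate)."""
    match = MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation string: {mutation!r}")
    ref, position, alt = match.groups()
    return ref, int(position), alt


parsed = [parse_mutation(m) for m in ["M1D", "R2K", "A3V", "A3F"]]
```

Note that `A3V` and `A3F` share a reference residue and position and differ only in the alternate residue.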


### Import mutation dataframe into sbmlcore

In [175]:
features = sbmlcore.FeatureDataset(df1, species='M. tuberculosis', gene='pncA', protein='pncA')
features

species:          M. tuberculosis
gene name:        pncA
protein name:     pncA
number of rows:   4

  segid mutation
0     A      M1D
1     A      R2K
2     A      A3V

### Adding features method 1: `.add_feature(list)`

You can add multiple features at once:

In [176]:
a = sbmlcore.AminoAcidHydropathyChangeKyteDoolittle()
b = sbmlcore.AminoAcidHydropathyChangeWimleyWhite()
features.add_feature([a, b])
features  # still a sbmlcore.FeaturesDataFrame.FeatureDataset object

species:          M. tuberculosis
gene name:        pncA
protein name:     pncA
number of rows:   4

  segid mutation  d_hydropathy_KD  d_hydropathy_WW
0     A      M1D             -5.4             2.85
1     A      R2K              0.6             0.81
2     A      A3V              2.4            -0.86

Or add one feature at a time, in which case you don't have to wrap it in a list:

In [177]:
c = sbmlcore.AminoAcidVolumeChange()
features.add_feature(c)
features  # still a sbmlcore.FeaturesDataFrame.FeatureDataset object

species:          M. tuberculosis
gene name:        pncA
protein name:     pncA
number of rows:   4

  segid mutation  d_hydropathy_KD  d_hydropathy_WW  d_volume
0     A      M1D             -5.4             2.85     -51.8
1     A      R2K              0.6             0.81      -4.8
2     A      A3V              2.4            -0.86      51.4

You can get the underlying `pandas.core.frame.DataFrame` of a `sbmlcore.FeaturesDataFrame.FeatureDataset` object via its `.df` attribute:

In [178]:
features.df

Unnamed: 0,segid,mutation,d_hydropathy_KD,d_hydropathy_WW,d_volume
0,A,M1D,-5.4,2.85,-51.8
1,A,R2K,0.6,0.81,-4.8
2,A,A3V,2.4,-0.86,51.4
3,A,A3F,1.0,-0.91,101.3
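
The `d_hydropathy_KD` values above are consistent with subtracting the Kyte-Doolittle hydropathy of the reference amino acid from that of the alternate one. The sketch below reproduces the column by hand under that assumption. The scale is restricted to the residues used in this example, and `hydropathy_change` is a hypothetical helper rather than sbmlcore's implementation.

```python
# Kyte-Doolittle hydropathy values, only for the residues in the example table.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "D": -3.5, "F": 2.8,
    "K": -3.9, "M": 1.9, "V": 4.2,
}


def hydropathy_change(mutation):
    """Alternate minus reference hydropathy for a string such as 'M1D'."""
    ref, alt = mutation[0], mutation[-1]
    # Rounded to hide floating-point noise from the subtraction.
    return round(KYTE_DOOLITTLE[alt] - KYTE_DOOLITTLE[ref], 2)


changes = {m: hydropathy_change(m) for m in ["M1D", "R2K", "A3V", "A3F"]}
```

For `M1D` this gives -3.5 - 1.9 = -5.4, matching the first row of the table.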


### Adding features method 2: addition overloading

In [179]:
d = sbmlcore.FreeSASA('tests/3pl1.pdb')
features = features + d
features  # a sbmlcore.FeaturesDataFrame.FeatureDataset object

species:          M. tuberculosis
gene name:        pncA
protein name:     pncA
number of rows:   4

  segid mutation  d_hydropathy_KD  d_hydropathy_WW  d_volume       SASA
0     A      M1D             -5.4             2.85     -51.8  96.204428
1     A      R2K              0.6             0.81      -4.8  57.391769
2     A      A3V              2.4            -0.86      51.4   0.000000

Or you can use addition overloading and convert to a `pandas.core.frame.DataFrame` in the same expression:

In [180]:
e = sbmlcore.AminoAcidMWChange()
features1 = (features + e).df #using overloading AND converting to pandas.core.frame.DataFrame
features1

Unnamed: 0,segid,mutation,d_hydropathy_KD,d_hydropathy_WW,d_volume,SASA,d_MW
0,A,M1D,-5.4,2.85,-51.8,96.204428,-16.1
1,A,R2K,0.6,0.81,-4.8,57.391769,-28.0
2,A,A3V,2.4,-0.86,51.4,0.0,28.0
3,A,A3F,1.0,-0.91,101.3,0.0,76.1
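
Expressions such as `features + e` work because Python lets a class define `__add__`. The sketch below shows one way this kind of overloading could be wired up, with `__add__` delegating to `add_feature`. `TinyDataset` and `PositionFeature` are hypothetical stand-ins that use a list of dicts as the table, and sbmlcore's actual implementation may differ.

```python
class PositionFeature:
    """Hypothetical feature: adds the residue position parsed from 'M1D'."""

    def _add_feature(self, rows):
        # Works on the plain table (a list of dicts) and returns a new one.
        return [{**row, "position": int(row["mutation"][1:-1])} for row in rows]


class TinyDataset:
    """Hypothetical wrapper around a table, loosely mirroring FeatureDataset."""

    def __init__(self, rows):
        self.df = list(rows)  # the underlying plain table

    def add_feature(self, features):
        # Accept either a single feature or a list of features.
        if not isinstance(features, list):
            features = [features]
        for feature in features:
            self.df = feature._add_feature(self.df)
        return self

    def __add__(self, feature):
        # `dataset + feature` delegates to add_feature, so both routes agree.
        return self.add_feature(feature)


dataset = TinyDataset([
    {"segid": "A", "mutation": "M1D"},
    {"segid": "A", "mutation": "A3F"},
])
table = (dataset + PositionFeature()).df
```

Because `__add__` returns the wrapper rather than the bare table, the result can be chained straight into `.df`, as in the cell above.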


### Adding features method 3: using the internal `._add_feature(args)` of each sbmlcore class
N.B. This only works if the dataframe to be added to is a `pandas.core.frame.DataFrame`!

In [181]:
f = sbmlcore.AminoAcidPiChange()
features1 = f._add_feature(features1)
features1 #N.B. Returns a pandas.core.frame.DataFrame

Unnamed: 0,segid,mutation,d_hydropathy_KD,d_hydropathy_WW,d_volume,SASA,d_MW,d_Pi
0,A,M1D,-5.4,2.85,-51.8,96.204428,-16.1,-2.97
1,A,R2K,0.6,0.81,-4.8,57.391769,-28.0,-1.02
2,A,A3V,2.4,-0.86,51.4,0.0,28.0,-0.04
3,A,A3F,1.0,-0.91,101.3,0.0,76.1,-0.52
