# Examples of using the FreeSASA and SNAP2 classes in sbmlcore #

Overall workflow for FreeSASA:
1. Provide an initial mutation dataframe
2. Load the FreeSASA class, specifying the relevant pdb file and, if required, a dictionary of offsets that brings the resids in the pdb file in line with those in the mutation dataframe. Offsets must be a dictionary of the form {segid: int}, and are the same as those used for the StructuralFeatures class.
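
To make the role of the offsets dictionary concrete, here is a minimal sketch of how a {segid: int} mapping could translate a resid in the mutation dataframe into a resid in the pdb file. The helper `to_pdb_resid` is hypothetical and is not part of sbmlcore. The sign convention (offset added to the mutation resid) is an assumption, so check it against your own pdb file before relying on it.

```python
import re

# Hypothetical helper, NOT part of sbmlcore: illustrates what an offsets
# dictionary of the form {segid: int} represents.
# ASSUMPTION: pdb_resid = mutation_resid + offsets[segid]. The sign
# convention used inside sbmlcore may differ.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def to_pdb_resid(segid, mutation, offsets):
    """Return the pdb resid for a mutation such as 'S450F' on a given segid."""
    match = MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"Cannot parse mutation {mutation!r}")
    resid = int(match.group(2))
    # Chains missing from the dictionary are treated as needing no offset.
    return resid + offsets.get(segid, 0)

offsets = {'A': 0, 'C': -6}
print(to_pdb_resid('A', 'I3D', offsets))
print(to_pdb_resid('C', 'S450F', offsets))
```

Under the assumed convention, a chain with an offset of 0 keeps its numbering, while a chain with an offset of -6 has every resid shifted down by six.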

Overall workflow for SNAP2:
1. Provide an initial mutation dataframe
2. Load the SNAP2 class, specifying a .csv file and a mandatory dictionary of offsets (in the same form as for FreeSASA and the other classes). N.B. All .csv files must have a column called segid. This can be added using csv_segid_concat.ipynb, which adds a segid to the .csv of each single chain and then concatenates the single-chain .csvs.
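
The segid step described above can be sketched as follows. This is not the code in csv_segid_concat.ipynb. It is an illustration using only the standard library `csv` module, with made-up data, and the column names `Variant` and `Score` are placeholders rather than the exact SNAP2 headers.

```python
import csv
import io

def add_segid_and_concat(chain_csvs):
    """Tag every row of each single-chain CSV with its segid, then combine.

    chain_csvs maps a segid (e.g. 'A') to the CSV text for that chain.
    """
    combined = []
    for segid, text in chain_csvs.items():
        for row in csv.DictReader(io.StringIO(text)):
            row['segid'] = segid
            combined.append(row)
    return combined

# Made-up single-chain tables; column names are illustrative only.
chain_a = "Variant,Score\nM1A,10\nM1C,-5\n"
chain_b = "Variant,Score\nR1A,42\n"
rows = add_segid_and_concat({'A': chain_a, 'B': chain_b})

# Write the combined table, with the new segid column first.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=['segid', 'Variant', 'Score'])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

In practice the single-chain tables would be read from files rather than strings, and the combined table written to the single .csv that is passed to the SNAP2 class.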

In [1]:
import sbmlcore, pandas, numpy, pytest
%load_ext autoreload
%autoreload 2

## Example 1: PncA ##
This example requires no offsets.

First, load the FreeSASA class, specifying the correct pdb file and no offsets.

In [41]:
file = sbmlcore.FreeSASA('tests/3pl1.pdb')


In [42]:
b = {'segid': ['A', 'A', 'A'], 'mutation': ['M1D','R2K', 'A3V']}
df = pandas.DataFrame(b)
df

Unnamed: 0,segid,mutation
0,A,M1D
1,A,R2K
2,A,A3V


Now calculate the solvent accessible surface area for each residue and attach it to the mutation dataframe.

In [43]:
sasa_df = file.add_feature(df)
sasa_df

Unnamed: 0,segid,mutation,SASA
0,A,M1D,96.204428
1,A,R2K,57.391769
2,A,A3V,0.0


Now add the effects on protein function predicted by SNAP2.

In [44]:
a = sbmlcore.SNAP2('tests/3pl1-complete.csv', offsets = {'A':0})
sasa_df = a.add_feature(sasa_df)
sasa_df

Unnamed: 0,segid,mutation,SASA,Predicted Effect,Score,Expected Accuracy
0,A,M1D,96.204428,effect,74,85%
1,A,R2K,57.391769,neutral,-88,93%
2,A,A3V,0.0,effect,26,63%


## Example 2: RNAP ##
This example requires offsets to bring the resids in the mutation dataframe in line with those in the pdb file.
A 'pdb_resid' column is given so that you can check that you have specified the offsets correctly: the
entries in this column should be the same as the resids in the pdb file. As a further check, if you also use the StructuralFeatures class,
the offsets should be the same for both classes.

In [33]:
file = sbmlcore.FreeSASA('tests/5uh6.pdb', offsets = {'A': 0, 'B': 0, 'C': -6})
b = {'segid': ['A', 'A', 'A', 'B', 'C', 'C'], 'mutation': ['I3D','S4K', 'Q5V', 'R6D', 'S450F', 'D435F']}
df = pandas.DataFrame(b)
df

Unnamed: 0,segid,mutation
0,A,I3D
1,A,S4K
2,A,Q5V
3,B,R6D
4,C,S450F
5,C,D435F


In [34]:
sasa_df = file.add_feature(df)
sasa_df

Unnamed: 0,segid,mutation,SASA
0,A,I3D,50.869281
1,A,S4K,61.119937
2,A,Q5V,123.631715
3,B,R6D,112.768052
4,C,S450F,8.75706
5,C,D435F,15.85735


In [35]:
a = sbmlcore.AminoAcidVolumeChange()
sasa_df = a.add_feature(sasa_df)
sasa_df

Unnamed: 0,segid,mutation,SASA,d_volume
0,A,I3D,50.869281,-55.6
1,A,S4K,61.119937,79.6
2,A,Q5V,123.631715,-3.8
3,B,R6D,112.768052,-62.3
4,C,S450F,8.75706,100.9
5,C,D435F,15.85735,78.8


In [36]:
c = sbmlcore.SNAP2("tests/5uh6-complete.csv", offsets = {'A': 0, 'B': 0, 'C': -6, 'D':0, 'E':0, 'F':0})
sasa_df = c.add_feature(sasa_df)
sasa_df

Unnamed: 0,segid,mutation,SASA,d_volume,Predicted Effect,Score,Expected Accuracy
0,A,I3D,50.869281,-55.6,neutral,-73,87%
1,A,S4K,61.119937,79.6,neutral,-84,93%
2,A,Q5V,123.631715,-3.8,neutral,-29,61%
3,B,R6D,112.768052,-62.3,effect,81,91%
4,C,S450F,8.75706,100.9,effect,91,95%
5,C,D435F,15.85735,78.8,effect,96,95%
