# Classifying Vascular Networks
Jocelyn Shen

###### In this notebook, I use PCA together with SVMs and logistic regression to classify vascular networks.

<img src="http://faculty.biomath.ucla.edu/vsavage/Assets/Graphics/dicom_small_9_gd_crop_alpha.png" width="250">

### Animal Networks vs. Plant Networks

The data come from the AngiCAML version of the Angicart software. Each file is labeled as an animal or a plant network:

1. hht_master is data from 18 human head and torso images (ANIMAL)

2. mouselung_master is data from a mouse lung (ANIMAL)

3. pinon_master is data from a pinon (pine) tree (PLANT)

4. ponderosa_master is data from 5 ponderosa pine saplings (PLANT)

5. root_master is data from a large collection of clumps of tree roots (PLANT)

6. treetips_master is data from a collection of 50 cm long tree tip samples (PLANT)

    a) 30 samples: 5 individuals from each of 6 different species

    b) The "class" column distinguishes angiosperms (AS, flowering plants) from gymnosperms (GS, non-flowering plants).
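
The ANIMAL/PLANT annotations in the list above are what the classifiers will eventually predict. A minimal sketch of that mapping follows; the names `KINGDOM` and `label_for` are hypothetical, and the 1/0 encoding is an arbitrary choice rather than anything fixed by the data:

```python
# Hypothetical lookup: dataset name -> kingdom, copied from the list above.
KINGDOM = {
    "hht_master": "ANIMAL",
    "mouselung_master": "ANIMAL",
    "pinon_master": "PLANT",
    "ponderosa_master": "PLANT",
    "root_master": "PLANT",
    "treetips_master": "PLANT",
}


def label_for(dataset_name: str) -> int:
    """Return 1 for an animal network and 0 for a plant network (arbitrary encoding)."""
    return 1 if KINGDOM[dataset_name] == "ANIMAL" else 0
```

With this encoding, two of the six datasets are animal networks and four are plant networks, so the classes are unbalanced at the dataset level.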

#### STEP 1: Load the Data

In [2]:
import pandas as pd

In [3]:
hht = pd.read_csv("hht_master.csv")
ml = pd.read_csv("mouselung_master.csv")
pinon = pd.read_csv("pinon_master.csv")
ponderosa = pd.read_csv("ponderosa_master.csv")
root = pd.read_csv("root_master.csv")
treetips = pd.read_csv("treetips_master.csv")
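
The six reads above could also be written as a loop, which makes a missing file easier to spot. This is only a sketch: the helper `load_masters`, the `STEMS` list, and the assumption that all files sit in one directory are mine, not part of the original notebook.

```python
from pathlib import Path

import pandas as pd

# File stems taken from the read_csv calls above.
STEMS = ["hht", "mouselung", "pinon", "ponderosa", "root", "treetips"]


def load_masters(data_dir="."):
    """Read every <stem>_master.csv found in data_dir into a dict keyed by stem."""
    frames = {}
    for stem in STEMS:
        path = Path(data_dir) / f"{stem}_master.csv"
        if not path.exists():
            # Fail early with the exact path instead of a generic pandas error.
            raise FileNotFoundError(f"expected {path}")
        frames[stem] = pd.read_csv(path)
    return frames
```

Calling `load_masters()` in the notebook directory would then give `frames["hht"]`, `frames["mouselung"]`, and so on.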

#### STEP 2: Clean the Datasets

##### Human Head and Torso Data

In [4]:
hht.head()

Unnamed: 0.1,Unnamed: 0,nodeid,parent,indv,generation,n,beta.ave,beta.diff,gamma.ave,gamma.diff
0,1,"(113, 303, 19)-(116, 317, 20)",0,hht01,1.0,2.0,0.87839,-0.052772,0.967488,0.18146
1,2,"(107, 293, 13)-(113, 303, 19)","(113, 303, 19)-(116, 317, 20)",hht01,0.3,2.0,0.321974,-0.025686,0.733451,0.56894
2,3,"(105, 293, 13)-(107, 293, 13)","(107, 293, 13)-(113, 303, 19)",hht01,0.3,2.0,0.704715,0.296466,3.513122,3.013122
3,4,"(104, 292, 13)-(105, 293, 13)","(105, 293, 13)-(107, 293, 13)",hht01,0.2,2.0,1.827236,0.938097,6.745904,5.480993
4,5,"(103, 291, 12)-(104, 292, 13)","(104, 292, 13)-(105, 293, 13)",hht01,0.2,2.0,3.323552,0.572827,12.394017,5.719845
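
In the rows above, each `nodeid` is a string holding two integer triples, and a child's second triple matches the first triple of its `parent`. The sketch below parses that string; the function name `parse_nodeid` is hypothetical, and reading the triples as segment endpoints is my assumption from the preview, not something the data documents.

```python
import re

# Assumed format: "(a, b, c)-(d, e, f)", as in the hht preview above.
_NODEID = re.compile(
    r"\(\s*(\d+),\s*(\d+),\s*(\d+)\s*\)-\(\s*(\d+),\s*(\d+),\s*(\d+)\s*\)"
)


def parse_nodeid(nodeid: str):
    """Split a nodeid string into two integer 3-tuples."""
    match = _NODEID.fullmatch(nodeid.strip())
    if match is None:
        raise ValueError(f"unexpected nodeid format: {nodeid!r}")
    values = [int(v) for v in match.groups()]
    return tuple(values[:3]), tuple(values[3:])
```

Note that the first row has `parent` equal to `0`, which this parser rejects, so root rows would need separate handling.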


In [None]:
hht

##### Mouse Lung Data

In [5]:
ml.head()

Unnamed: 0.1,Unnamed: 0,nodeid,parent,generation,n,beta.ave,beta.diff,gamma.ave,gamma.diff
0,1,"(156, 101, 104)-(156, 109, 101)",0,0.9,2.0,0.776606,-0.090943,4.343893,1.837972
1,2,"(156, 109, 101)-(159, 131, 98)","(156, 101, 104)-(156, 109, 101)",1.0,2.0,0.551383,-0.008586,0.328757,0.026243
2,3,"(159, 131, 98)-(162, 136, 101)","(156, 109, 101)-(159, 131, 98)",1.0,2.0,1.022952,0.384252,1.305386,0.666746
3,4,"(162, 136, 101)-(164, 137, 112)","(159, 131, 98)-(162, 136, 101)",0.9,2.0,0.485138,0.357218,0.379264,0.296903
4,5,"(164, 137, 112)-(167, 141, 121)","(162, 136, 101)-(164, 137, 112)",0.9,2.0,0.255049,0.103189,0.164871,0.043065


##### Pinon Data (Pine Tree Data)

In [6]:
pinon.head()

Unnamed: 0.1,Unnamed: 0,nodeid,parent,generation,n,beta.ave,beta.diff,gamma.ave,gamma.diff
0,1,2.0,0.0,1.0,4.0,0.523997,,1.24537,
1,2,3.0,2.0,0.8,3.0,0.444882,,1.505051,
2,3,4.0,3.0,0.0,,,,,
3,4,5.0,3.0,0.7,4.0,0.722336,,1.115854,
4,5,6.2,3.0,0.0,,,,,


##### Ponderosa Data

In [7]:
ponderosa.head()

Unnamed: 0.1,Unnamed: 0,nodeid,parent,indv,generation,n,beta.ave,beta.diff,gamma.ave,gamma.diff
0,1,1,0,pond03_edited,1.0,2.0,0.576667,-0.39,1.014706,0.573529
1,2,2,1,pond03_edited,1.0,3.0,0.444828,,3.288889,
2,3,3,1,pond03_edited,0.2,2.0,0.544643,-0.133929,0.188889,0.055556
3,4,4,2,pond03_edited,0.9,3.0,0.525641,,1.45,
4,5,5,2,pond03_edited,0.0,,,,,


##### Root Data

In [8]:
root.head()

Unnamed: 0.1,Unnamed: 0,nodeid,parent,generation,n,beta.ave,beta.diff,gamma.ave,gamma.diff
0,1,1,0,1,2.0,0.287234,0.053191,0.264423,0.1875
1,2,2,1,0,,,,,
2,3,3,1,0,,,,,
3,4,1,0,1,2.0,0.277778,0.166667,0.134199,0.024892
4,5,2,1,0,,,,,


##### Treetips Data

In [9]:
treetips.head()

Unnamed: 0.1,Unnamed: 0,class,spcs,id,nodeid,parent,generation,n,beta.ave,beta.diff,gamma.ave,gamma.diff
0,1,GS,dougfir,df737g4,96,,1.0,2.0,1.113079,-0.717984,1.941558,0.344156
1,2,GS,dougfir,df737g4,108,96.0,0.75,2.0,0.658621,-0.231034,0.681818,0.170455
2,3,GS,dougfir,df737g4,120,108.0,0.75,3.0,0.405685,,1.840741,
3,4,GS,dougfir,df737g4,110,108.0,0.25,2.0,0.435484,-0.145161,0.176667,0.116667
4,5,GS,dougfir,df737g4,122,110.0,0.0,,,,,
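
The previews above share two issues: leftover `Unnamed: 0` and `Unnamed: 0.1` columns, which pandas creates when a saved index is read back in as data, and rows where `n` and every beta/gamma value are missing. One plausible cleaning step is sketched below. The helper name `drop_index_and_empty_rows` is hypothetical, and dropping on `n` is my assumption about which rows are unusable, not necessarily the rule applied later in this notebook.

```python
import pandas as pd


def drop_index_and_empty_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop leftover index columns and rows that have no value for `n`."""
    # Keep every column whose name does not start with "Unnamed".
    keep = [c for c in df.columns if not str(c).startswith("Unnamed")]
    # Rows with missing `n` also lack all beta/gamma values in the previews.
    return df[keep].dropna(subset=["n"]).reset_index(drop=True)


# Toy frame copied from three treetips rows above (subset of columns).
demo = pd.DataFrame({
    "Unnamed: 0.1": [2, 3, 4],
    "Unnamed: 0": [3, 4, 5],
    "nodeid": [120, 110, 122],
    "generation": [0.75, 0.25, 0.0],
    "n": [3.0, 2.0, None],
    "beta.ave": [0.405685, 0.435484, None],
    "beta.diff": [None, -0.145161, None],
})
clean = drop_index_and_empty_rows(demo)
```

The row for node 120 survives even though `beta.diff` is missing, because only `n` is used as the filter; whether such partially filled rows should be kept depends on which features the classifier uses.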
