<h1><center>Predicting Pancreatic Cancer Tumor Response to Gemcitabine with Neural Network Optimization</center></h1>


<img src="http://kluwerpatentblog.com/wp-content/uploads/sites/52/2012/02/gemcitabine.gif" style="height:250px">

## Import the necessary libraries

In [86]:
from __future__ import absolute_import, division, print_function, unicode_literals
import numpy as np
import pandas as pd
import scipy
import sklearn
from sklearn.preprocessing import StandardScaler

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

from sklearn.preprocessing import OneHotEncoder    
from sklearn.metrics import accuracy_score
from tensorflow.python.client import device_lib



## Import the data and transpose

Data were retrieved from [The Cancer Genome Atlas](https://portal.gdc.cancer.gov/projects/TCGA-PAAD).

- Transcriptome (FPKM) data were used.
- Subjects were selected if they had been treated with gemcitabine.
- Patients with stable disease or clinically progressive disease were classified as Non Responders.
- Patients with a complete or partial response were classified as Responders.
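
The labeling rule in the list above can be sketched as a small lookup. The response strings, the class names, and the helper `label_response` are assumptions made for illustration; the spellings in the actual TCGA clinical table may differ.

```python
# Hypothetical response strings; the actual TCGA clinical table may spell them differently.
RESPONSE_TO_CLASS = {
    "Complete Response": "Responder",
    "Partial Response": "Responder",
    "Stable Disease": "Non Responder",
    "Clinical Progressive Disease": "Non Responder",
}


def label_response(response):
    """Map a clinical response string to the binary class used as the target."""
    key = response.strip()
    if key not in RESPONSE_TO_CLASS:
        raise ValueError("Unrecognized response: {!r}".format(response))
    return RESPONSE_TO_CLASS[key]


labels = [label_response(r) for r in ["Partial Response", "Stable Disease"]]
print(labels)
```

Raising on an unknown string, rather than silently assigning a class, keeps subjects with missing or unexpected response values out of the training set.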

In [87]:
src = r"C:\Users\coder\Desktop\Cancer Data\Panc Responders NonResponders Gemcitabine.csv"
dst = r"C:\Users\coder\Desktop\Cancer Data\Panc Responders NonResponders Gemcitabine T.csv"
pd.read_csv(src, header=0, low_memory=False).T.to_csv(dst, header=False, index=False)  # write the transposed table
df = pd.read_csv(dst, low_memory=False)  # re-read so the first row becomes the column names
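
To make the transpose-and-reload step easier to follow, here is a minimal sketch on a tiny in-memory table. The column names, gene names, and values are made up; only the pandas calls mirror the cell above.

```python
import io

import pandas as pd

# Made-up input: one row per feature, one column per sample.
raw = io.StringIO(
    "feature,sample_a,sample_b\n"
    "Target,Responder,Non Responder\n"
    "GENE_1,1.5,0.0\n"
    "GENE_2,3.0,2.5\n"
)
wide = pd.read_csv(raw, header=0)

# Transpose and write without the header and index, so the first
# written row holds the feature names.
buffer = io.StringIO()
wide.T.to_csv(buffer, header=False, index=False)
buffer.seek(0)

# Re-reading turns that first row into column names and re-infers numeric dtypes.
df_demo = pd.read_csv(buffer)
print(df_demo)
```

After the round trip there is one row per sample and one column per feature. The sample identifiers are dropped because they ended up in the index of the transposed frame.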

## Transform the categorical variables into integer codes and separate the target from the data frame

In [88]:
# Preview the data, encode the categorical columns as integer codes, and pop the target

print(df.head())
df['Target'] = pd.Categorical(df['Target'])
df['Target'] = df.Target.cat.codes
df['Cancer_Type'] = pd.Categorical(df['Cancer_Type'])
df['Cancer_Type'] = df.Cancer_Type.cat.codes
target = df.pop("Target")
X = df.values  # all remaining columns (Cancer_Type code plus gene expression) as a NumPy array



      Target       Cancer_Type  ENSG00000242268.2  ENSG00000270112.3  \
0  Responder  Ductal Neoplasm         1528.121743         106.015872   
1  Responder  Ductal Neoplasm          354.349507         245.835597   
2  Responder  Ductal Neoplasm         2324.176086           0.000000   
3  Responder  Ductal Neoplasm            0.000000          89.124163   
4  Responder  Ductal Neoplasm          339.897606          47.161872   

   ENSG00000167578.15  ENSG00000273842.1  ENSG00000078237.5  \
0         62143.74692                  0        94660.89880   
1         50547.68254                  0        80811.14588   
2        187042.45300                  0        57179.88270   
3        196458.72150                  0        75731.48091   
4         79563.83910                  0       118148.99240   

   ENSG00000146083.10  ENSG00000225275.4  ENSG00000158486.12  ...  \
0         409162.6952                  0         3191.077740  ...   
1         230163.1523                  0         1
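
The integer codes produced by `pd.Categorical` follow the sorted order of the labels, which determines which class becomes 0 and which becomes 1. The sketch below illustrates this. The label "Non Responder" is an assumed spelling, since only "Responder" is visible in the preview above.

```python
import pandas as pd

# Hypothetical labels for illustration only.
demo = pd.DataFrame({"Target": ["Responder", "Non Responder", "Responder"]})
demo["Target"] = pd.Categorical(demo["Target"])

categories = list(demo["Target"].cat.categories)  # sorted unique labels
codes = demo["Target"].cat.codes.tolist()         # position of each label in that list

print(dict(enumerate(categories)))
print(codes)
```

Keeping the `categories` list around makes it possible to translate predicted class indices back into readable labels later.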

## Normalize the data 

In [89]:
# Standardize each feature to zero mean and unit variance

sc = StandardScaler()
X = sc.fit_transform(X)
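
For reference, the transformation that `StandardScaler` applies can be sketched in plain NumPy. The helper `standardize` and the toy matrix are illustrative only. The second column is constant on purpose, to mirror genes whose expression is 0 in every sample.

```python
import numpy as np


def standardize(X):
    """Column-wise z-score using the population standard deviation (ddof=0)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # constant columns are left at 0 instead of dividing by zero
    return (X - mean) / std


demo = np.array([[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]])
Z = standardize(demo)
print(Z)
```

Constant columns carry no information after scaling, so the feature selection step has nothing to rank them on.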


## Use SVM-RFE to select the top 100 Features


In [90]:
from sklearn.svm import LinearSVC
from sklearn.feature_selection import RFE

svm = LinearSVC()
rfe = RFE(svm, n_features_to_select=100, step=0.5)
rfe = rfe.fit(X, target)
selected_features = rfe.transform(X)
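
The following sketch runs the same selection on a small synthetic problem, so the shapes and the `support_` mask are easy to inspect. The data, the variable names, and the `max_iter` value are assumptions for illustration. `n_features_to_select` is passed by keyword, which recent scikit-learn releases require.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

# Synthetic data: 40 samples, 20 features, only the first two drive the label.
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(40, 20))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

# Recursive feature elimination ranked by the linear SVM coefficients.
rfe_demo = RFE(LinearSVC(max_iter=10000), n_features_to_select=5, step=0.5)
rfe_demo.fit(X_demo, y_demo)

kept = np.flatnonzero(rfe_demo.support_)  # column indices that survived elimination
X_small = rfe_demo.transform(X_demo)      # same rows, only the kept columns
print(kept, X_small.shape)
```

In the notebook, the analogous `rfe.support_` mask could be used to index `df.columns` and recover the names of the selected genes.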




## One-hot encode the target into a 2D matrix for the network

In [93]:
ohe = OneHotEncoder(categories = 'auto')
Y = target.values.reshape(-1,1)
Y = ohe.fit_transform(Y).toarray()
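
As a small illustration of what the encoder produces for a two-class target, the same matrix can be built directly with NumPy. The codes below are made up.

```python
import numpy as np

codes = np.array([1, 0, 1, 1])  # hypothetical integer class codes
one_hot = np.eye(2)[codes]      # row i of the identity matrix encodes class i
recovered = one_hot.argmax(axis=1)  # argmax inverts the encoding

print(one_hot)
print(recovered)
```

Each row has a single 1 in the column of its class, which is the shape a 2-node output layer expects.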

## Make the Neural Network 

The network consists of:
- A 100-node first layer that takes the 100 selected features, with Xavier (Glorot normal) initialized weights
- A 32-node hidden layer
- A final 2-node output layer
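
To give a sense of the model's size relative to the data, this sketch counts the trainable parameters implied by the layer widths, assuming the 100 selected features feed a `Dense` layer of 100 units as in `makeModel`. It also computes the standard deviation that Glorot normal initialization targets for that first layer. The helper `dense_params` is illustrative.

```python
import math

layer_sizes = [100, 100, 32, 2]  # input features, then units per Dense layer


def dense_params(fan_in, fan_out):
    """Weights plus biases of one fully connected layer."""
    return fan_in * fan_out + fan_out


params = [dense_params(i, o) for i, o in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(params)

# Glorot (Xavier) normal initialization uses stddev = sqrt(2 / (fan_in + fan_out)).
glorot_std = math.sqrt(2.0 / (100 + 100))

print(params, total, glorot_std)
```

The total is in the thousands of parameters, far more than the number of patients, so overfitting is a real concern for a model of this shape.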

In [94]:
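def makeModel():
    # 100 selected features -> Dense(100) -> Dense(32) -> Dense(2)
    model = Sequential([
        layers.Dense(100, input_shape=(100,), kernel_initializer="glorot_normal", activation='relu'),
        layers.Dense(32, activation='relu'),
        # softmax (rather than relu) so the two outputs are class probabilities for the cross-entropy loss
        layers.Dense(2, activation='softmax')
    ])

    return model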



## Initialize an instance of the model and record the output

In [96]:
neuralNetwork = makeModel()

neuralNetwork.compile(loss = 'binary_crossentropy', optimizer='adam', metrics=['accuracy'])

history = neuralNetwork.fit(selected_features, Y, batch_size = 10 ,  epochs=25, verbose = 1, validation_split = 0.1)


Train on 45 samples, validate on 6 samples
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
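
The sample counts in the log follow from `validation_split` and `batch_size`. The arithmetic can be sketched as below, where the total of 51 samples is inferred from the 45 + 6 reported above. Keras takes the validation fraction from the end of the arrays, before any shuffling.

```python
import math

n_samples = 51          # 45 training + 6 validation samples, as reported in the log
validation_split = 0.1
batch_size = 10

n_train = int(math.floor(n_samples * (1.0 - validation_split)))  # samples used for training
n_val = n_samples - n_train                                      # last samples held out
steps_per_epoch = int(math.ceil(n_train / batch_size))           # last batch is partial

print(n_train, n_val, steps_per_epoch)
```

With only six validation samples, each one shifts the validation accuracy by roughly 17 percentage points, so per-epoch validation metrics should be read with caution.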


## Resources 

- https://www.nature.com/articles/s41598-018-34753-5