# Pulmonary Fibrosis Progression (II)

## Training on Tabular Data
In this section we train the model on the pre-processed tabular data that we generated in the first notebook.

As a reminder, the pp folder now contains 3 files:
* pp_train.csv, which contains the data ready to be used to train our model
* pp_test.csv, which contains the data for the inference exercise
* pp_results.csv, which is the template for the final csv generated as part of this exercise, with the predictions per patient and week

In the following cells we create a PyTorch estimator and train it with our training script. Note that the relevant source code is attached to this notebook in the `source` folder:
* train.py - script that trains our NN
* model.py - definition of the PyTorch NN
* predict.py - script that calls the trained model to run inference on an input dataset
* preprocess.py - helper function that preprocesses the tabular data to prepare it for the training exercise



In [54]:
# sagemaker
import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch
from sagemaker.pytorch import PyTorchModel
import pandas as pd
import os

import matplotlib.pyplot as plt

%matplotlib inline

#plotly
import plotly.express as px
import chart_studio.plotly as py
import plotly.graph_objs as go
from plotly.offline import iplot
import plotly.figure_factory as ff
import cufflinks
cufflinks.go_offline()
cufflinks.set_config_file(world_readable=True, theme='pearl')


## Upload train and test data to S3

In [13]:
# SageMaker session and role
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

# default S3 bucket
bucket = sagemaker_session.default_bucket()
# specify where to upload in S3
prefix = 'sagemaker/pulmonar-fibrosis'

# upload to S3
input_data = sagemaker_session.upload_data(path='data/pp/', bucket=bucket, key_prefix=prefix)
print(input_data)

s3://sagemaker-eu-west-1-666856156774/sagemaker/pulmonar-fibrosis
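
`upload_data` returns the S3 URI printed above. If the bucket and key prefix are needed separately later on (for example, to list the uploaded objects with boto3), the URI can be split with the standard library. This is a minimal sketch; `split_s3_uri` is a hypothetical helper and is not part of the notebook's `source` folder.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError('not an S3 URI: {}'.format(uri))
    # the bucket is the network location; the key prefix is the path without its leading slash
    return parsed.netloc, parsed.path.lstrip('/')


# the URI printed by the upload cell above
example_bucket, example_prefix = split_s3_uri(
    's3://sagemaker-eu-west-1-666856156774/sagemaker/pulmonar-fibrosis')
print(example_bucket, example_prefix)
```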


## Model and Training
In the next two cells we create the model and train it using a PyTorch estimator.

In [22]:
# instantiate a pytorch estimator
estimator = PyTorch(entry_point='train.py',
                    source_dir='source',
                    train_instance_type='ml.c4.xlarge',
                    train_instance_count=1,
                    framework_version='1.5.0', 
                    role=role,
                    sagemaker_session=sagemaker_session,
                    hyperparameters = {
                        'epochs':150,
                        'batch-size': 32,
                        'seed': 1,
                        'lr':3e-3,
                        'in_tabular_features':10,
                        'quantiles': '0.2, 0.5, 0.8'
                    })
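
SageMaker passes the entries of `hyperparameters` to the entry point as command-line arguments (`--epochs 150`, `--batch-size 32`, and so on). The actual `train.py` is not reproduced in this notebook, so the following is only a sketch of how such a script might read them with `argparse`. The function names and the default values are assumptions; note that `quantiles` arrives as a single comma-separated string and has to be split by the script.

```python
import argparse


def parse_quantiles(text):
    """Turn a string such as '0.2, 0.5, 0.8' into a list of floats."""
    return [float(q) for q in text.split(',')]


def parse_hyperparameters(argv=None):
    """Parse the hyperparameters set on the estimator (defaults are illustrative)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch-size', type=int, default=32)
    parser.add_argument('--seed', type=int, default=1)
    parser.add_argument('--lr', type=float, default=1e-3)
    parser.add_argument('--in_tabular_features', type=int, default=10)
    parser.add_argument('--quantiles', type=parse_quantiles, default=[0.2, 0.5, 0.8])
    return parser.parse_args(argv)


# simulate the arguments that would be passed for the estimator defined above
example_args = parse_hyperparameters([
    '--epochs', '150', '--batch-size', '32', '--seed', '1',
    '--lr', '0.003', '--in_tabular_features', '10',
    '--quantiles', '0.2, 0.5, 0.8',
])
print(example_args)
```

`argparse` converts `--batch-size` into the attribute `batch_size`, which is why the hyphenated key can still be read as `example_args.batch_size`.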


In [23]:
%%time 
# train the estimator on pre processed data
estimator.fit({'train': input_data})

'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.
's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


2020-08-14 12:04:12 Starting - Starting the training job...
2020-08-14 12:04:15 Starting - Launching requested ML instances......
2020-08-14 12:05:38 Starting - Preparing the instances for training......
2020-08-14 12:06:34 Downloading - Downloading input data......
2020-08-14 12:07:38 Training - Training image download completed. Training in progress..
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-08-14 12:07:39,387 sagemaker-containers INFO     Imported framework sagemaker_pytorch_container.training
2020-08-14 12:07:39,389 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-08-14 12:07:39,400 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2020-08-14 12:07:42,431 sagemaker_pytorch_container.training INFO     Invoking user training script.
2020-08-14 12:07:42,843 sagemaker-containers INFO

[TRAIN] Epoch #48 Iteration #30 quantile loss: 149.85911560058594
Epoch #48 Training loss : 177.2497 Validation LLL : -6.7054 Time taken : 196.7456340789795 milliseconds
[TRAIN] Epoch #49 Iteration #10 quantile loss: 163.07662963867188
[TRAIN] Epoch #49 Iteration #20 quantile loss: 173.9950714111328
[TRAIN] Epoch #49 Iteration #30 quantile loss: 179.8210906982422
Epoch #49 Training loss : 176.1395 Validation LLL : -6.7037 Time taken : 249.1471767425537 milliseconds
[TRAIN] Epoch #50 Iteration #10 quantile loss: 152.31082153320312
[TRAIN] Epoch #50 Iteration #20 quantile loss: 187.67050170898438
[TRAIN] Epoch #50 Iteration #30 quantile loss: 178.36965942382812
Epoch #50 Training loss : 175.9386 Validation LLL : -6.7057 Time taken : 187.21723556518555 milliseconds
[TRAIN] Epoch #51 Iteration #10 quantile loss: 158.19302368164062
[TRAIN] Epoch #51 Iteration #20 quantile loss: 165.965988

Epoch #100 Training loss : 172.1973 Validation LLL : -6.6737 Time taken : 189.99934196472168 milliseconds
[TRAIN] Epoch #101 Iteration #10 quantile loss: 212.36302185058594
[TRAIN] Epoch #101 Iteration #20 quantile loss: 179.95787048339844
[TRAIN] Epoch #101 Iteration #30 quantile loss: 125.59121704101562
Epoch #101 Training loss : 171.5115 Validation LLL : -6.6757 Time taken : 199.19967651367188 milliseconds
[TRAIN] Epoch #102 Iteration #10 quantile loss: 168.08279418945312
[TRAIN] Epoch #102 Iteration #20 quantile loss: 121.24410247802734
[TRAIN] Epoch #102 Iteration #30 quantile loss: 121.13101959228516
Epoch #102 Training loss : 171.7479 Validation LLL : -6.6732 Time taken : 210.6606960296631 milliseconds
[TRAIN] Epoch #103 Iteration #10 quantile loss: 166.66998291015625
[TRAIN] Epoch #103 Iteration #20 quantile loss: 201.6164093017578
[TRAIN] Epoch #103 Iteration #30 quantile lo


2020-08-14 12:08:40 Uploading - Uploading generated training model
2020-08-14 12:08:40 Completed - Training job completed
Training seconds: 126
Billable seconds: 126
CPU times: user 780 ms, sys: 35.9 ms, total: 816 ms
Wall time: 4min 42s
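
The `[TRAIN]` lines in the log above report a quantile loss for the three quantiles set in the hyperparameters. The loss code in `train.py` is not shown here, so the following is only a plain-Python sketch of the standard pinball (quantile) loss, assuming that this is what is being logged. `pinball_loss` and `quantile_loss` are illustrative names, and summing over quantiles and averaging over samples is an assumption rather than the script's confirmed behaviour.

```python
import math


def pinball_loss(target, prediction, quantile):
    """Pinball loss for one prediction of one quantile.

    Under-prediction is weighted by `quantile`, over-prediction by `1 - quantile`.
    """
    error = target - prediction
    return max(quantile * error, (quantile - 1) * error)


def quantile_loss(targets, predictions, quantiles):
    """Sum the pinball loss over quantiles and average it over samples.

    `predictions` holds one list per sample, with one value per quantile.
    """
    total = 0.0
    for target, sample_predictions in zip(targets, predictions):
        for quantile, prediction in zip(quantiles, sample_predictions):
            total += pinball_loss(target, prediction, quantile)
    return total / len(targets)


# hypothetical FVC targets and (0.2, 0.5, 0.8) quantile predictions, for illustration only
example_loss = quantile_loss(
    targets=[2800.0, 3100.0],
    predictions=[[2650.0, 2790.0, 2930.0], [2950.0, 3120.0, 3260.0]],
    quantiles=[0.2, 0.5, 0.8],
)
print(example_loss)
```

With this loss the 0.2 quantile is penalised more for over-predicting and the 0.8 quantile more for under-predicting, which is what pushes the three outputs apart into a prediction interval.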


## Deploying our model
Below we deploy our trained model to a SageMaker endpoint.

In [35]:
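# wrap the trained model artifacts; predict.py handles the inference requests
# note: the estimator above was trained with framework_version='1.5.0', while this model uses '1.0'
model = PyTorchModel(model_data=estimator.model_data, role=role, entry_point='predict.py', source_dir='source',
                     framework_version='1.0')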

# deploy and create a predictor
predictor = model.deploy(instance_type='ml.t2.medium', initial_instance_count=1)

Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


---------------!

## Inference
Below we test our deployed model by sending the test data to the SageMaker endpoint, which serves requests through the `predict.py` script in our `source` folder.

In [38]:
data_dir = "data/pp/"
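# load the pre-processed test data (the file has no header row)
input_data = pd.read_csv(filepath_or_buffer=os.path.join(data_dir, "pp_test.csv"), header=None, names=None)
# drop the first column and cast the remaining features to float32
input_data = input_data.drop([0], axis=1).values.astype('float32')
# send the features to the endpoint and collect the predictions
predictions_df = pd.DataFrame(predictor.predict(input_data))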

## Results and submission.csv
In the following cells we analyse the inference results and create the output csv expected by the Kaggle competition.

In [55]:
# build the submission dataframe from the results template
results_df = pd.read_csv(filepath_or_buffer=os.path.join(data_dir, "results.csv"), header=0, names=None)
# prediction columns 0, 1, 2 are assumed to follow the order of the `quantiles` hyperparameter (0.2, 0.5, 0.8)
results_df['FVC'] = predictions_df[1]  # middle quantile as the point prediction
results_df['Confidence'] = predictions_df[2] - predictions_df[0]  # spread between upper and lower quantiles
results_df.Weeks = results_df.Weeks.astype('str')
results_df['Patient_Week'] = results_df['Patient'] + '_' + results_df['Weeks']
results_df = results_df.drop(columns=['Patient', 'Weeks'])
# reorder the columns so that Patient_Week comes first
results_df_cols = list(results_df.columns)
results_df_cols[0], results_df_cols[1], results_df_cols[2] = results_df_cols[2], results_df_cols[0], results_df_cols[1]
results_df = results_df[results_df_cols]
results_df.to_csv('data/submission.csv', index=False)
results_df.head()

| | Patient_Week | FVC | Confidence |
|---|---|---|---|
| 0 | ID00419637202311204720264_6 | 2811.186768 | 278.26123 |
| 1 | ID00421637202311550012437_15 | 2857.008301 | 282.773926 |
| 2 | ID00422637202311677017371_6 | 2236.281738 | 221.555176 |
| 3 | ID00423637202312137826377_17 | 3121.286133 | 308.846436 |
| 4 | ID00426637202313170790466_0 | 2786.121338 | 275.634766 |
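
Each row pairs an FVC prediction with a Confidence value. Assuming the "Validation LLL" reported in the training log is the Laplace Log Likelihood used to score the Kaggle competition, the sketch below shows how a single row would be scored: Confidence plays the role of the standard deviation and is clipped below at 70, and the absolute error is clipped above at 1000. The function name and the true FVC value in the example are hypothetical.

```python
import math


def laplace_log_likelihood(fvc_true, fvc_pred, confidence):
    """Laplace Log Likelihood for one prediction (higher, i.e. closer to zero, is better)."""
    sigma_clipped = max(confidence, 70.0)            # very small confidences are clipped to 70
    delta = min(abs(fvc_true - fvc_pred), 1000.0)    # very large errors are clipped to 1000
    return -math.sqrt(2) * delta / sigma_clipped - math.log(math.sqrt(2) * sigma_clipped)


# first row of the table above, scored against a hypothetical true FVC of 2900
example_score = laplace_log_likelihood(fvc_true=2900.0, fvc_pred=2811.186768, confidence=278.26123)
print(example_score)
```

The metric rewards honest uncertainty: a narrow Confidence only helps when the FVC prediction is close to the truth, because the error term is divided by it.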


In [64]:
def plot_results(results_df):
    plot_results_df = results_df.copy()
    # remove duplicate rows, which appear because both test.csv and sample_submission.csv were used
    plot_results_df = plot_results_df.drop_duplicates()
    plot_results_df['Weeks'] = plot_results_df['Patient_Week'].apply(lambda x: int(x.split('_')[-1]))
    plot_results_df['Patient'] = plot_results_df['Patient_Week'].apply(lambda x: str(x.split('_')[0]))
    fig1 = px.scatter(plot_results_df, x="Weeks", y="FVC", color='Patient', title='Evolution of FVC predictions per week and Patient')
    fig1.show()
    fig2 = px.scatter(plot_results_df, x="Confidence", y="FVC", color='Patient', title='Confidence of our FVC predictions per Patient')
    fig2.show()

plot_results(results_df)

In [65]:
from botocore.exceptions import ClientError

# Accepts a predictor as input and deletes its endpoint by name
def delete_endpoint(predictor):
    try:
        boto3.client('sagemaker').delete_endpoint(EndpointName=predictor.endpoint)
        print('Deleted {}'.format(predictor.endpoint))
    except ClientError:
        # boto3 raises ClientError when, for example, the endpoint no longer exists
        print('Already deleted: {}'.format(predictor.endpoint))

In [66]:
delete_endpoint(predictor)

Deleted sagemaker-pytorch-2020-08-14-12-37-15-084
