Import some basic libraries.
* pandas - provides data frames
* boto3 - AWS SDK for Python, used here to read from S3
* matplotlib.pyplot - plotting support

Use the %matplotlib inline magic to display graphics inline instead of in a pop-up window.


In [61]:
import pandas as pd # pandas is a dataframe library
import boto3
import matplotlib.pyplot as plt      # matplotlib.pyplot plots data

%matplotlib inline

# Using your trained Model

## Load trained model from file

In [62]:
import joblib 
import pickle

# Read from local folder
# lr_cv_model = joblib.load("./model/diabetes-model.pkl")

s3_model_bucket = 'demo-predict-diabetes'
model_file = 'model/diabetes-model.pkl'

s3_client = boto3.client('s3')
response = s3_client.get_object(Bucket=s3_model_bucket, Key=model_file)
lr_cv_model = pickle.loads(response['Body'].read())
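
The S3 load above reads the whole object body into memory and unpickles it. The same pattern can be tried without AWS access, as in this minimal sketch. The helper name `load_pickled_model` and the dictionary standing in for a model are hypothetical, and `io.BytesIO` only imitates the `read()` method of `response['Body']`.

```python
import io
import pickle


def load_pickled_model(body):
    """Unpickle an object from any file-like object that has a read() method."""
    return pickle.loads(body.read())


# Hypothetical stand-in for a trained model; a real model object would go here.
stand_in_model = {"name": "placeholder-model", "coef": [0.1, 0.2]}

# io.BytesIO imitates the streaming body returned by s3_client.get_object.
fake_body = io.BytesIO(pickle.dumps(stand_in_model))

loaded = load_pickled_model(fake_body)
print(loaded == stand_in_model)
```

Unpickling can execute arbitrary code, so this approach is only appropriate for buckets and files you trust.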


## Test Prediction on data

Once the model is loaded we can use it to predict on some data.  In this case the data file contains a few rows from the original Pima CSV file.


In [63]:
# Get data from a local data file
# df_predict = pd.read_csv("./local-test/feature-data/data.csv")

feature_data_bucket = 'demo-predict-diabetes-feature-data'
feature_file_key = 'data/feature-data.csv'

s3uri = 's3://{}/{}'.format(feature_data_bucket, feature_file_key)
df_predict = pd.read_csv(s3uri)

print(df_predict.shape)

(5, 10)


In [64]:
df_predict

Unnamed: 0,num_preg,glucose_conc,diastolic_bp,thickness,insulin,bmi,diab_pred,age,skin,diabetes
0,1,89,66,23,94,28.1,0.167,21,0.9062,False
1,2,197,70,45,543,30.5,0.158,53,1.773,True
2,7,100,0,0,0,30.0,0.484,32,0.0,True
3,1,103,30,38,83,43.3,0.183,33,1.4972,False
4,1,93,70,31,0,30.4,0.315,23,1.2214,False


The data has 0 values in places where they should not appear, for example diastolic_bp, thickness, and insulin in row 2.

Just as with the training and test datasets, we will use imputation to fix this.

In [67]:
# Impute all 0 readings with the column mean
from sklearn.impute import SimpleImputer
fill_0 = SimpleImputer(missing_values=0, strategy="mean")
X_predict = fill_0.fit_transform(df_predict)
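
Imputing the whole frame also passes the index column, `skin`, and the `diabetes` label through the imputer, and those columns contain zeros too (`False` compares equal to 0), so they may be altered as well. The sketch below restricts imputation to feature columns only. The `feature_cols` list is an assumption about which columns the model was trained on; check it against the training notebook before relying on it.

```python
import pandas as pd
from sklearn.impute import SimpleImputer

# Assumption: the model was trained on these eight columns only.
feature_cols = ["num_preg", "glucose_conc", "diastolic_bp", "thickness",
                "insulin", "bmi", "diab_pred", "age"]

# The five rows shown earlier, rebuilt in code so the sketch runs on its own.
df = pd.DataFrame({
    "num_preg": [1, 2, 7, 1, 1],
    "glucose_conc": [89, 197, 100, 103, 93],
    "diastolic_bp": [66, 70, 0, 30, 70],
    "thickness": [23, 45, 0, 38, 31],
    "insulin": [94, 543, 0, 83, 0],
    "bmi": [28.1, 30.5, 30.0, 43.3, 30.4],
    "diab_pred": [0.167, 0.158, 0.484, 0.183, 0.315],
    "age": [21, 53, 32, 33, 23],
    "skin": [0.9062, 1.773, 0.0, 1.4972, 1.2214],
    "diabetes": [False, True, True, False, False],
})

# Replace each 0 with the mean of the non-zero values in the same column.
fill_0 = SimpleImputer(missing_values=0, strategy="mean")
X = fill_0.fit_transform(df[feature_cols])

print(X.shape)
print(X[2])  # row 2 had zeros for diastolic_bp, thickness, and insulin
```

Note that `fit_transform` computes the means from the prediction rows themselves, as in the cell above, not from the training data.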

At this point our data is ready to be used for prediction.  

## Predict diabetes with the prediction data. Returns 1 if True, 0 if False

In [68]:
lr_cv_model.predict(X_predict)

array([0, 1, 0, 0, 0])
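
Because the sample file still carries the known `diabetes` labels, the predictions can be checked against them. This small sketch copies the predicted array and the label column from the outputs above into plain lists; the variable names are illustrative only.

```python
# Predictions printed by lr_cv_model.predict above.
predicted = [0, 1, 0, 0, 0]

# The 'diabetes' column from the displayed data frame.
actual = [False, True, True, False, False]

# Compare row by row; int(True) == 1 and int(False) == 0.
matches = [int(a) == p for a, p in zip(actual, predicted)]
accuracy = sum(matches) / len(matches)

print(matches)
print(accuracy)
```

On these five rows the model agrees with the labels for every row except row 2. Five rows are far too few to say anything about the model's general accuracy.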