## Create Model

In [20]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression

# create df
dataset = pd.read_csv('sales.csv')

# fill null values (nothing is dropped): 0 for rate, column mean for first-month sales
dataset['rate'] = dataset['rate'].fillna(0)

dataset['sales_in_first_month'] = dataset['sales_in_first_month'].fillna(dataset['sales_in_first_month'].mean())

# features and target
#target = 'sales_in_third_month'
#features = ['Psales_in_second_month']

# X matrix, y vector
#X = train[features]
# take a copy so that later assignments to X['rate'] do not write into a slice of dataset
X = dataset.iloc[:, :3].copy()
#y = train[target]
y = dataset.iloc[:, -1]


In [21]:
dataset.head()

Unnamed: 0,rate,sales_in_first_month,sales_in_second_month,sales_in_third_month
0,0,2,500,300
1,0,4,300,650
2,four,600,200,400
3,nine,450,320,650
4,seven,600,250,350


In [22]:
X.head()

Unnamed: 0,rate,sales_in_first_month,sales_in_second_month
0,0,2,500
1,0,4,300
2,four,600,200
3,nine,450,320
4,seven,600,250


In [23]:
y.tail()

1    650
2    400
3    650
4    350
5    700
Name: sales_in_third_month, dtype: int64

The code above creates a pandas DataFrame from the CSV data, fills the missing values (0 for `rate`, the column mean for `sales_in_first_month`), and splits the data into a feature matrix `X` (the first three columns) and a target vector `y` (the last column, `sales_in_third_month`). The cells below convert the `rate` words to integers, then fit and score a linear regression model.
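As a minimal sketch of the mean fill and the positional split described above, the snippet below runs the same two steps on a tiny in-memory table. The CSV text and the `toy` names are made up for illustration and are not the contents of `sales.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for sales.csv (illustrative values only).
csv_text = """rate,sales_in_first_month,sales_in_second_month,sales_in_third_month
four,2,500,300
nine,,200,400
seven,450,320,650
"""

toy = pd.read_csv(io.StringIO(csv_text))

# Replace the missing first-month value with the mean of the known values.
first_month_mean = toy['sales_in_first_month'].mean()
toy['sales_in_first_month'] = toy['sales_in_first_month'].fillna(first_month_mean)

# Positional split: first three columns are features, last column is the target.
toy_X = toy.iloc[:, :3]
toy_y = toy.iloc[:, -1]

print(toy_X)
print(toy_y)
```

Assigning the result of `fillna` back to the column avoids relying on `inplace=True` for a single column, which newer pandas versions discourage.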

In [24]:
def convert_to_int(word):
    word_dict = {'one':1, 'two':2, 'three':3, 'four':4, 'five':5, 'six':6, 'seven':7, 'eight':8,
                'nine':9, 'ten':10, 'eleven':11, 'twelve':12, 'zero':0, 0: 0}
    return word_dict[word]

In [25]:
# map each rate word (or the 0 used for missing values) to its integer
X['rate'] = X['rate'].apply(convert_to_int)

# X.tail()
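`convert_to_int` raises a bare `KeyError` for any word outside its dictionary, including capitalised or padded input such as `' Four '`. A possible variation, sketched here under the hypothetical name `convert_to_int_safe`, normalises the input first and reports unknown words with a clearer error. It is one option, not part of the original notebook.

```python
WORD_TO_INT = {
    'zero': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6,
    'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10, 'eleven': 11, 'twelve': 12,
}


def convert_to_int_safe(value):
    """Convert a rate word such as 'four' to an integer.

    Integers (for example the 0 used to fill missing values) pass through unchanged.
    """
    if isinstance(value, int):
        return value
    key = str(value).strip().lower()
    if key not in WORD_TO_INT:
        raise ValueError(f'unrecognised rate value: {value!r}')
    return WORD_TO_INT[key]


print(convert_to_int_safe(' Four '))
print(convert_to_int_safe(0))
```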

In [26]:
# model 
regressor = LinearRegression()
regressor.fit(X, y)
regressor.score(X,y)

0.6948637514051955

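This creates a model with **R² ≈ 0.69** on the training data, meaning it explains roughly 69% of the variance in third-month sales. `score` returns the coefficient of determination for a regressor, not a classification accuracy, and because it is computed on the same rows used for fitting it is likely optimistic. The model can then be pickled.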
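For a regressor, `score` reports the coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below computes that quantity by hand in plain Python. The function name `r_squared` and the numbers are illustrative and are not taken from the notebook's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [300, 650, 400, 650]

# Perfect predictions give R² = 1.
print(r_squared(y_true, y_true))

# Always predicting the mean of y_true gives R² = 0.
print(r_squared(y_true, [500, 500, 500, 500]))
```

An R² of 0 corresponds to a model no better than predicting the mean, so a value near 0.69 sits between that baseline and a perfect fit.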

## Pickle Model

In [27]:
import pickle
# use a context manager so the file is flushed and closed after writing
with open('model.pkl', 'wb') as f:
    pickle.dump(regressor, f)

The pickle file can be found inside the same directory as the Jupyter notebook.
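The save-and-reload cycle can be checked without a trained model. The sketch below pickles a plain dictionary, used here as a hypothetical stand-in for the regressor, into a temporary directory and reads it back.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted regressor (illustrative values only).
stand_in_model = {'coef': [1.5, 0.2, 0.4], 'intercept': 10.0}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.pkl')

    # Write the object; the with-block closes the file even if dump() fails.
    with open(path, 'wb') as f:
        pickle.dump(stand_in_model, f)

    # Read it back into a new, equal object.
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == stand_in_model)
```

Unpickling executes code embedded in the file, so only load pickle files from sources you trust.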

In [28]:
# reload the pickled model; the context manager closes the file afterwards
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)
# print predictions with rate = 4, sales_in_first_month = 300, sales_in_second_month = 500
print(model.predict([[4, 300, 500]]))

[143.3072588]


The predicted `sales_in_third_month` for this input is about $143.3.

## Test Flask in Production
### Test the Deployed Model & Generate Prediction

In [29]:
import requests
import json

In [8]:
#heroku_url = 'https://predictivapp.herokuapp.com/'
#data = {"rate": 5, "sales_in_first_month": 200, "sales_in_second_month": 400}
#response = requests.post(heroku_url, json.dumps(data))
#print(response.json())

Import `requests` and `json` in your Jupyter notebook, then create a variable to store the app URL. For the Heroku deployment, you can find the URL by clicking “open app” in the top right corner of the app page on Heroku; the cell below uses the local Flask server instead. Then create some sample data and convert it to JSON.

In [31]:
# local url
url = 'http://localhost:5000'


# test data
data = {  'rate':5
             , 'sales_in_first_month':200
             , 'sales_in_second_month':400}

data = json.dumps(data)
data

'{"rate": 5, "sales_in_first_month": 200, "sales_in_second_month": 400}'

Send the request and check the response code. A status code of 200 means the request succeeded.

In [26]:
#send_request = requests.post(url, data)
# get prediction
#print(send_request.json())

In [58]:
#send_request = requests.post(url, data)
#print(send_request)

<Response [200]>


Finally, look at the model’s prediction:

In [9]:
#print(send_request.json())

{'results': {'results': 1}}


The response is a JSON object whose `results` field carries the model's prediction for the sample data. More importantly, a well-formed JSON response with status 200 shows that the API works.
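To try the same request and response cycle without a running Flask app, the round trip can be sketched with the standard library alone. Everything here is an assumption made for illustration: the stub server `StubPredictHandler` stands in for the real app, `urllib` replaces `requests`, and the "prediction" is a placeholder sum rather than the output of the pickled model.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen


class StubPredictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the Flask endpoint; it does not load model.pkl."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length))
        # Placeholder result: a simple sum, NOT a real model prediction.
        placeholder = payload['sales_in_first_month'] + payload['sales_in_second_month']
        body = json.dumps({'results': placeholder}).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass


# Port 0 lets the operating system pick a free port.
server = HTTPServer(('127.0.0.1', 0), StubPredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
stub_url = f'http://127.0.0.1:{server.server_port}'

sample = {'rate': 5, 'sales_in_first_month': 200, 'sales_in_second_month': 400}
request = Request(
    stub_url,
    data=json.dumps(sample).encode('utf-8'),  # supplying data makes this a POST
    headers={'Content-Type': 'application/json'},
)

with urlopen(request) as response:
    status = response.status
    result = json.loads(response.read())

server.shutdown()
server.server_close()

print(status, result)
```

Checking the status code before reading the JSON body mirrors the two steps above: first confirm the request succeeded, then inspect the `results` field.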

## Test App in Heroku

In [37]:
## heroku url
heroku_url = 'https://ml-predict-sales.herokuapp.com/'


# test data
data = {  'rate':5
             , 'sales_in_first_month':200
             , 'sales_in_second_month':400}

data = json.dumps(data)
data

'{"rate": 5, "sales_in_first_month": 200, "sales_in_second_month": 400}'

In [36]:
# check response code
#send_request = requests.post(heroku_url, data)
#print(send_request)

In [35]:
# get prediction
#print(send_request.json())