# EZ-Pickle for the Flask Lesson
___

The `pickle` library lets us serialize Python objects. Serializing saves an object, exactly as it sits in our code, to an actual file that we can load up later (or even send to someone else). Pickling can be applied in many ways; here we use it to save a trained model for use later on in another script.

Since everything in Python is an object, (almost) anything can be serialized.
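To make the "(almost)" concrete, here is a minimal sketch that round-trips a plain dictionary through `pickle.dumps` / `pickle.loads` (the in-memory versions of dump and load) and then tries two objects that pickle typically refuses: a lambda and a generator. The `can_pickle` helper and the `record` dictionary are made up for this illustration.

```python
import pickle


def can_pickle(obj):
    """Hypothetical helper: True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# an ordinary object made of built-in types (illustrative data only)
record = {"name": "ames", "cols": ["Overall Qual", "Full Bath"], "n": 4}

payload = pickle.dumps(record)    # serialize to a bytes object in memory
restored = pickle.loads(payload)  # rebuild an equal object from those bytes

print(restored == record)                # the round trip preserves the contents
print(can_pickle(lambda x: x + 1))       # lambdas cannot be looked up by name
print(can_pickle(x for x in range(3)))   # generators hold live execution state
```

Objects built from standard types, and most scikit-learn estimators, fall on the "can be pickled" side of that line.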

**Imports**

Pickle is part of Python's standard library! Simply `import pickle` and you are good to go.

In [1]:
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

**Read in data and fit a model**

We are fitting a linear regression model on the Ames housing data (`./data/ames.csv`). This model will power a web form in the Flask demo.

In [2]:
# read in the data
df = pd.read_csv('./data/ames.csv')

# pick some columns, drop the nulls.
good_cols = ['Overall Qual', 'Full Bath', 'Garage Area', 'Lot Area']
df.dropna(subset=good_cols, inplace=True)

# set up feature matrix and target vector
X = df[good_cols]
y = df['SalePrice']

# instantiate the model
model_to_be_pickled = LinearRegression()

# fit the model
model_to_be_pickled.fit(X,y)

# print out the score and coefficients
print(f'The model explains {100*model_to_be_pickled.score(X,y):.2f}% of the variance' + '\n-----\n' + 'Coefficients:')
print(dict(zip(list(X.columns), np.round(model_to_be_pickled.coef_, 4))))


**Pickling**  
Everything above this was just 'normal' modeling. Now we will actually save the model to a file with the `.p` extension, using two functions:
- `open(filename, mode)`: opens a file on our computer for reading or writing; the mode string (such as `'wb'` or `'rb'`) controls which.
- `pickle.dump(object, file)`: serializes an object to an open file. 
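Why the binary mode matters: `pickle.dump` writes bytes, so the file has to be opened with `'wb'` rather than `'w'`. The sketch below shows both cases using a stand-in dictionary instead of the model, and a temporary directory so nothing is left on disk. The `settings` object and the `settings.p` file name are made up for this illustration.

```python
import os
import pickle
import tempfile

# stand-in object (illustrative data, not the real model)
settings = {"features": ["Overall Qual", "Full Bath"], "version": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "settings.p")  # hypothetical file name

    # 'wb' = write binary: pickle hands the file bytes, and the file accepts them
    with open(path, "wb") as f:
        pickle.dump(settings, f)
    size_in_bytes = os.path.getsize(path)

    # 'w' = write text: the file only accepts str, so pickle.dump fails
    try:
        with open(path, "w") as f:
            pickle.dump(settings, f)
        text_mode_worked = True
    except TypeError:
        text_mode_worked = False

print(size_in_bytes > 0)   # the binary write produced a non-empty file
print(text_mode_worked)    # the text-mode write was rejected
```

Using `with open(...)` also closes the file as soon as the block ends, which guarantees the pickled bytes are flushed to disk.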

In [3]:
# put the two functions above together, using 'write binary' mode
# the with-block closes the file for us, so the pickle is fully written to disk
with open('model.p', 'wb') as f:
    pickle.dump(model_to_be_pickled, f)

**Check our work**

Let's read in our model and check the score/coefficients.
- `pickle.load(file)`: de-serializes the stored object back into a variable
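As a small, dependency-free sketch of what "getting the object back" means, the example below dumps a dictionary of made-up linear-model parameters to a temporary file, loads it again, and confirms that the loaded copy is equal to the original but is a separate object. The `predict` helper, the parameter values, and the `params.p` file name are all hypothetical stand-ins for the real model.

```python
import os
import pickle
import tempfile


def predict(params, row):
    """Hypothetical stand-in for model.predict: intercept + sum(coef * value)."""
    return params["intercept"] + sum(
        params["coefs"][name] * value for name, value in row.items()
    )


# made-up parameters and one made-up input row (not the fitted Ames values)
params = {"intercept": 10.0, "coefs": {"Overall Qual": 2.0, "Full Bath": 3.0}}
row = {"Overall Qual": 7, "Full Bath": 2}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "params.p")  # hypothetical file name

    with open(path, "wb") as f:   # save
        pickle.dump(params, f)

    with open(path, "rb") as f:   # load
        restored = pickle.load(f)

print(restored == params)   # same contents
print(restored is params)   # but a new, independent object
print(predict(restored, row) == predict(params, row))  # same predictions
```

One caution that applies to any pickle file: `pickle.load` can execute arbitrary code while rebuilding an object, so only load pickles that come from a source you trust.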

In [4]:
# use the above function with open() in 'read binary' mode to get our model back
with open('model.p', 'rb') as f:
    model_that_was_pickled = pickle.load(f)

print(f'The model explains {100*model_that_was_pickled.score(X,y):.2f}% of the variance' + '\n-----\n' + 'Coefficients:')
print(dict(zip(list(X.columns), np.round(model_that_was_pickled.coef_, 4))))

The model explains 73.43% of the variance
-----
Coefficients:
{'Overall Qual': 32968.7974, 'Full Bath': 17719.2556, 'Garage Area': 82.6078, 'Lot Area': 1.8704}


Nice