# Machine Learning With Python: Save And Load Trained Model
---





In [None]:
!pip -q install word2number

In [None]:
import pandas as pd
import numpy as np
from sklearn import linear_model
import matplotlib.pyplot as plt

In [None]:
df = pd.DataFrame({
    'area': [2600, 3000, 3200, 3600, 4000, 4100],
    'bedrooms': [3.0, 4.0, None , 3.0, 5.0, 6.0],
    'age': [20, 15, 18, 30, 8, 8],
    'price': [550000, 565000, 610000, 595000, 760000, 810000]
})

df

## Data Preprocessing: Fill NA values with median value of a column

In [None]:
df.bedrooms.median()

In [None]:
df.bedrooms = df.bedrooms.fillna(df.bedrooms.median())
df

In [None]:
plt.xlabel('area')
plt.ylabel('price')
plt.scatter(df.area, df.price)
plt.show()

In [None]:
plt.xlabel('bedrooms')
plt.ylabel('price')
plt.scatter(df.bedrooms, df.price)
plt.show()

In [None]:
plt.xlabel('age')
plt.ylabel('price')
plt.scatter(df.age, df.price)
plt.show()

In [None]:
reg = linear_model.LinearRegression()
reg.fit(df[['area', 'bedrooms', 'age']], df.price)

In [None]:
reg.coef_

In [None]:
reg.intercept_

**Find the price of a home with 3000 sq ft area, 3 bedrooms, 40 years old**

In [None]:
reg.predict([[3000, 3, 40]])

In [None]:
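# Reproduce the prediction by hand: sum(coefficient * feature) + intercept,
# using the values printed by reg.coef_ and reg.intercept_
3000 * 112.06244194 + 3 * 23388.88007794 + 40 * -3231.71790863 + 221323.00186540396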
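The hand calculation above depends on coefficients copied from earlier output, so it goes stale if the model is refit. A minimal sketch of the same check, reading the values from the fitted model instead; the `features`, `home`, `manual_price`, and `predicted_price` names are introduced here for illustration only:

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Same toy data as above, with the missing bedroom count filled by the median.
df = pd.DataFrame({
    'area': [2600, 3000, 3200, 3600, 4000, 4100],
    'bedrooms': [3.0, 4.0, None, 3.0, 5.0, 6.0],
    'age': [20, 15, 18, 30, 8, 8],
    'price': [550000, 565000, 610000, 595000, 760000, 810000],
})
df['bedrooms'] = df['bedrooms'].fillna(df['bedrooms'].median())

features = ['area', 'bedrooms', 'age']
reg = linear_model.LinearRegression()
reg.fit(df[features], df['price'])

# A linear model predicts intercept + sum(coefficient * feature value).
home = np.array([3000, 3, 40])
manual_price = float(np.dot(reg.coef_, home) + reg.intercept_)

# Passing a DataFrame with the training column names avoids the
# feature-name warning that a bare list of lists can trigger.
home_df = pd.DataFrame([home], columns=features)
predicted_price = float(reg.predict(home_df)[0])

print(manual_price, predicted_price)
```

The two printed numbers should agree up to floating-point rounding.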

**Find the price of a home with 2500 sq ft area, 4 bedrooms, 5 years old**

In [None]:
reg.predict([[2500, 4, 5]])

# Save Model To a File Using Python Pickle


In [None]:
import pickle

In [None]:
with open('model_pickle', 'wb') as f:
    pickle.dump(reg, f)

## Load Saved Model

In [None]:
with open('model_pickle', 'rb') as f:
    model = pickle.load(f)

In [None]:
model.coef_

In [None]:
model.intercept_

In [None]:
model.predict([[2500, 4, 5]])
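Comparing the loaded model's output by eye works for one prediction. A programmatic round-trip check is a bit more robust. The sketch below uses a small synthetic dataset (hypothetical values, not the housing data) and a temporary directory so that it runs on its own. Note that `pickle.load` can execute arbitrary code, so it should only be used on files from a trusted source.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn import linear_model

# Small synthetic training set (hypothetical values, for illustration only).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
y = np.array([5.0, 4.0, 13.0, 10.0])
original = linear_model.LinearRegression().fit(X, y)

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model_pickle')
    with open(path, 'wb') as f:
        pickle.dump(original, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model should carry the same learned parameters...
same_coef = np.allclose(original.coef_, restored.coef_)
same_intercept = np.isclose(original.intercept_, restored.intercept_)
# ...and therefore give the same predictions.
same_predictions = np.allclose(original.predict(X), restored.predict(X))

print(same_coef, same_intercept, same_predictions)
```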

## Save Trained Model Using joblib

In [None]:
!pip -q install joblib

In [None]:
import joblib

In [None]:
joblib.dump(reg, 'model_joblib')

## Load Saved Model

In [None]:
mj = joblib.load('model_joblib')

In [None]:
mj.coef_

In [None]:
mj.intercept_

In [None]:
mj.predict([[2500, 4, 5]])
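`joblib.dump` also accepts a `compress` argument, which can help when a model holds large NumPy arrays. A minimal sketch, assuming a compression level of 3 and a small synthetic dataset (both chosen here for illustration, not taken from the notebook):

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn import linear_model

# Small synthetic training set (hypothetical values, for illustration only).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
y = np.array([5.0, 4.0, 13.0, 10.0])
model = linear_model.LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model_joblib')
    # compress takes an integer from 0 (off) to 9 (smallest file, slowest).
    written_files = joblib.dump(model, path, compress=3)
    restored = joblib.load(path)

# The reloaded model should reproduce the original predictions.
predictions_match = np.allclose(model.predict(X), restored.predict(X))
print(written_files, predictions_match)
```

As with pickle, `joblib.load` should only be used on files from a trusted source.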

# Exercise

In the exercise folder (at the same level as this notebook on GitHub) there is a file, `hiring.csv`. It contains hiring statistics for a firm: each candidate's experience, written test score, and personal interview score. Based on these 3 factors, HR decides the salary. Given this data, build a machine learning model for the HR department that can help them decide salaries for future candidates. Use it to predict salaries for the following candidates:

**2 yr experience, 9 test score, 6 interview score**

**12 yr experience, 10 test score, 10 interview score**



In [None]:
df_ex = pd.read_csv('/content/hiring.csv')
df_ex

## Data Preprocessing

In [None]:
df_ex.experience = df_ex.experience.fillna('zero')
df_ex

In [None]:
from word2number import w2n

df_ex.experience = df_ex.experience.apply(w2n.word_to_num)
df_ex
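If installing `word2number` is not an option, the same conversion can be approximated with an explicit lookup table. This is a swapped-in alternative, not what the cells above do. The sample values and the `WORD_TO_NUMBER` table are hypothetical, and the table covers only the words listed in it:

```python
import pandas as pd

# Hypothetical sample: spelled-out numbers with some missing entries.
experience = pd.Series([None, 'five', 'two', 'eleven', None])

# Explicit lookup table; any word not listed here would map to NaN.
WORD_TO_NUMBER = {
    'zero': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4,
    'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9,
    'ten': 10, 'eleven': 11, 'twelve': 12,
}

# Treat missing experience as zero, then translate words to integers.
converted = experience.fillna('zero').map(WORD_TO_NUMBER)
print(converted.tolist())
```

A lookup table is easy to audit but fails silently on unseen words, so checking `converted.isna().any()` before training is a sensible guard.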

In [None]:
median_test_score = df_ex['test_score(out of 10)'].median()
median_test_score

In [None]:
df_ex['test_score(out of 10)'] = df_ex['test_score(out of 10)'].fillna(median_test_score)
df_ex

In [None]:
plt.xlabel('experience')
plt.ylabel('salary')
plt.scatter(df_ex.experience, df_ex['salary($)'])
plt.show()

In [None]:
plt.xlabel('test_score(out of 10)')
plt.ylabel('salary')
plt.scatter(df_ex['test_score(out of 10)'], df_ex['salary($)'])
plt.show()

In [None]:
plt.xlabel('interview_score(out of 10)')
plt.ylabel('salary')
plt.scatter(df_ex['interview_score(out of 10)'], df_ex['salary($)'])
plt.show()

In [None]:
reg = linear_model.LinearRegression()
reg.fit(df_ex[['experience', 'test_score(out of 10)', 'interview_score(out of 10)']], df_ex['salary($)'])

In [None]:
# 2 yr experience, 9 test score, 6 interview score

reg.predict([[2, 9, 6]])

In [None]:
# 12 yr experience, 10 test score, 10 interview score

reg.predict([[12, 10, 10]])