# Exercise: Using a trained model on new data

In Unit 3, we created a basic model that let us find the relationship between a person's shoe size and their height. We showed how this model could then be used to make a prediction about a new, previously unseen person.

It's common to build, train, then use a model while we are just learning about machine learning, but in the real world we don't want to train the model _every time_ we want to make a prediction.

Consider our shoe-store scenario:
* We want to train the model just once, then load that model onto the server that runs our online store. 
* Although the model is _trained_ on a dataset we downloaded from the internet, we actually want to _use_ it to estimate the heights of our customers who are not in this dataset! 

How can we do this?

Here we will:

1. Create a basic model
2. Save it to disk
3. Load it from disk
4. Use it to make predictions about customers who were not in the training dataset

## Load the dataset

Let's begin by opening the dataset from a file.

In [1]:
import pandas

# Load a file containing people's shoe sizes (EU sizes)
# and heights (in cm)
data = pandas.read_csv('Data/shoe-size-height.csv')

# Convert EU shoe sizes to the USA shoe sizes
# that we sell in our store
data["shoe_size_usa"] = data.shoe_size_eu - 33

# Print the first few rows
data.head()


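| Unnamed: 0 | shoe_size_eu | height | sex | age_years | shoe_size_usa |
|---|---|---|---|---|---|
| 0 | 39 | 173 | male | 60 | 6 |
| 1 | 38 | 173 | male | 48 | 5 |
| 2 | 37 | 157 | female | 43 | 4 |
| 3 | 39 | 175 | male | 51 | 6 |
| 4 | 38 | 170 | male | 39 | 5 |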


## Create and train a model

As we have done before, we will create a simple Linear Regression model and train it on our dataset.

In [2]:
import statsmodels.formula.api as smf

# Fit a simple model that finds a linear relationship
# between shoe size and height, which we can use later
# to predict someone's height based on their shoe size
model = smf.ols(formula = "height ~ shoe_size_usa", data = data).fit()

print("Model trained!")

Model trained!
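
For intuition about what fitting a linear model estimates, here is a minimal sketch of a least-squares line fit in plain Python. It uses only the five rows shown in the `data.head()` output, so its numbers will not match the parameters of the model trained on the full dataset. The names `sample` and `fit_line` are hypothetical and exist only for this illustration; this is not how `statsmodels` is implemented internally.

```python
# Five sample rows copied from the data.head() output above:
# (shoe_size_usa, height in cm)
sample = [(6, 173), (5, 173), (4, 157), (6, 175), (5, 170)]

def fit_line(points):
    """Return (intercept, slope) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Slope = (sum of x-y co-deviations) / (sum of squared x deviations)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = s_xy / s_xx

    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(sample)
print(f"Intercept: {intercept:.3f}, slope: {slope:.3f}")
```

The slope says how many centimetres of height the line adds per shoe size, and the intercept is the height the line gives at shoe size zero.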


## Saving and loading a model

Our model is ready to use, but we don't need it yet. Let's save it to disk.

In [3]:
import joblib

model_filename = './height_shoes_model.pkl'
joblib.dump(model, model_filename)

print("Model saved!")

Model saved!


Loading our model is just as easy:

In [4]:
model_loaded = joblib.load(model_filename)

print("We have loaded a model with the following parameters:")
print(model_loaded.params)

We have loaded a model with the following parameters:
Intercept        151.227143
shoe_size_usa      2.927229
dtype: float64
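
Saving and loading is a serialize-then-deserialize round trip. The sketch below shows that pattern with the standard library's `pickle` module and a stand-in dictionary holding the two parameters printed above. It is an illustration under those assumptions, not the real fitted model, and the file name `stand_in_model.pkl` is hypothetical.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: just the two fitted parameters
# printed above (a real fitted model object holds much more)
stand_in_model = {"Intercept": 151.227143, "shoe_size_usa": 2.927229}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "stand_in_model.pkl")

    # Save: serialize the object to a file
    with open(path, "wb") as f:
        pickle.dump(stand_in_model, f)

    # Load: rebuild an equal (but separate) object from the file
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == stand_in_model)
```

Only load pickled files that come from a source you trust, because unpickling can execute arbitrary code.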


## Putting it together

On our website, we will want to take the shoe size of our customer, then calculate their height using the model that we've already trained.

Let's put everything together in a function that loads the model from disk, then uses it to predict our customer's height.

In [5]:
# Let's write a function that loads and uses our model
def load_model_and_predict(customer_shoe_size_usa):
    '''
    This function loads a pretrained model and takes the
    shoe size of a customer. It uses the model to predict
    how tall the person is.

    customer_shoe_size_usa: The shoe size, in USA units 
    '''

    # Load the model from file and print basic information about it
    loaded_model = joblib.load(model_filename)

    print("We have loaded a model with the following parameters:")
    print(loaded_model.params)

    # Prepare data for the model
    inputs = {"shoe_size_usa":[customer_shoe_size_usa]} 

    # Use the model to make a prediction
    predicted_height = loaded_model.predict(inputs)[0]

    return predicted_height

# Practice using our model
predicted_height = load_model_and_predict(4.5)

print("Predicted height (cm):", predicted_height)

We have loaded a model with the following parameters:
Intercept        151.227143
shoe_size_usa      2.927229
dtype: float64
Predicted height (cm): 164.3996724579892
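
The prediction can be checked by hand, because a simple linear model computes `height = Intercept + shoe_size_usa * slope`. The sketch below hard-codes the two parameters printed above. Since those printed values are rounded to six decimal places, the result agrees with the model's output only to a few decimal places. `predict_height` is a hypothetical helper used for illustration.

```python
# Parameters as printed above (rounded to six decimal places)
intercept = 151.227143
slope = 2.927229

def predict_height(shoe_size_usa):
    """Apply the fitted line: height = intercept + slope * shoe size."""
    return intercept + slope * shoe_size_usa

estimate = predict_height(4.5)
print(f"Hand-calculated height (cm): {estimate:.2f}")
```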


## Real world use 

We've done it: we can predict someone's height based on their shoe size. Our last step is to use this to warn people if their jeans might be the wrong size.

As an example, we'll make a function that accepts the customer's shoe size and the inseam length of the jeans they selected, and returns a message for the customer. We would integrate this function into our online store.

In [6]:
def check_size_of_jeans_and_shoes(customer_shoe_size_usa, jeans_inseam_length):
    '''
    Calculates whether the customer has chosen a pair of jeans that have
    a sensible inseam (leg) length. This works by estimating their height 
    from their shoe size, then looking up the best jeans inseam length.

    This returns a message for the customer that should be shown before
    they complete their payment 

    customer_shoe_size_usa: The customer shoe size, in USA units
    jeans_inseam_length: The leg length of the jeans selected, in inches
    '''

    # Estimate the customer's height, in cm
    estimated_height = load_model_and_predict(customer_shoe_size_usa)

    # Find the optimum jeans inseam (leg) length for a person of this height
    if estimated_height <= 166:
        ideal_jeans_inseam_length = 28
    elif estimated_height <= 171:
        ideal_jeans_inseam_length = 30
    elif estimated_height <= 182:
        ideal_jeans_inseam_length = 32
    elif estimated_height <= 193:
        ideal_jeans_inseam_length = 34
    elif estimated_height <= 198:
        ideal_jeans_inseam_length = 36
    else:
        ideal_jeans_inseam_length = 38

    # Check if the jeans selected are appropriate
    if jeans_inseam_length < ideal_jeans_inseam_length:
        # Selected jeans might be too short
        return "The jeans you have selected might be too short for someone of "\
               "your height. We recommend jeans with an inseam length of "\
               f"{ideal_jeans_inseam_length} inches"

    if jeans_inseam_length > ideal_jeans_inseam_length:
        # Selected jeans might be too long
        return "The jeans you have selected might be too long for someone of "\
               "your height. We recommend jeans with an inseam length of "\
               f"{ideal_jeans_inseam_length} inches"
    
    # The selected jeans are probably OK
    return "Great choice! We think these jeans will fit you well"


# Practice using our new warning system
check_size_of_jeans_and_shoes(customer_shoe_size_usa=4.5, jeans_inseam_length=30)

We have loaded a model with the following parameters:
Intercept        151.227143
shoe_size_usa      2.927229
dtype: float64


'The jeans you have selected might be too long for someone of your height. We recommend jeans with an inseam length of 28 inches'


Change `customer_shoe_size_usa` and `jeans_inseam_length` in the example above and re-run to see this in action.
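
The height-to-inseam lookup can also be written as a table rather than an `if`/`elif` chain. This is one possible sketch using the standard library's `bisect` module, with the thresholds copied from the function above. The names `HEIGHT_LIMITS_CM`, `INSEAM_LENGTHS_IN`, and `ideal_inseam_for_height` are hypothetical.

```python
from bisect import bisect_left

# Upper height limits (cm) and inseam lengths (inches), copied from
# the if/elif chain above. The last inseam covers any taller height.
HEIGHT_LIMITS_CM = [166, 171, 182, 193, 198]
INSEAM_LENGTHS_IN = [28, 30, 32, 34, 36, 38]

def ideal_inseam_for_height(height_cm):
    """Return the inseam for the first limit that height_cm does not exceed."""
    # bisect_left counts how many limits are strictly below height_cm,
    # which matches the "<=" comparisons in the original chain
    index = bisect_left(HEIGHT_LIMITS_CM, height_cm)
    return INSEAM_LENGTHS_IN[index]

print(ideal_inseam_for_height(164.4))  # 28
print(ideal_inseam_for_height(166))    # 28
print(ideal_inseam_for_height(200))    # 38
```

Keeping the thresholds in lists makes it easier to add or adjust sizes without touching the lookup logic.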

## Summary

Well done! We've put together a system that can predict if customers are buying jeans that may not fit them, based solely on their shoe size. 

In this exercise, we practiced:

1. Creating basic models
2. Training, then saving them to disk
3. Loading them from disk
4. Making predictions with them on new data