# Exercise: Training and Running Your First Model

We've learned that models are computer code that converts information into a prediction or a decision. Here we will train a model to guess a comfortable boot size for a dog, based on the size of the harness that fits them.

In the examples below, there is no need to edit any code. Try to read it, understand it, then press the run button to run it. As always with these notebooks, it is vitally important that these code blocks are run in the correct order, and nothing is missed.

## Preparing data

The first thing we do with a model is load data. We will cover this in more detail in a later exercise. For now, we will just write our data directly in our code. Review and run the code below to get started.


In [1]:
import pandas

# Make a dictionary of data for boot sizes
# and harness size in cm
data = {
    'boot_size' : [ 39, 38, 37, 39, 38, 35, 37, 36, 35, 40, 
                    40, 36, 38, 39, 42, 42, 36, 36, 35, 41, 
                    42, 38, 37, 35, 40, 36, 35, 39, 41, 37, 
                    35, 41, 39, 41, 42, 42, 36, 37, 37, 39,
                    42, 35, 36, 41, 41, 41, 39, 39, 35, 39
 ],
    'harness_size': [ 58, 58, 52, 58, 57, 52, 55, 53, 49, 54,
                59, 56, 53, 58, 57, 58, 56, 51, 50, 59,
                59, 59, 55, 50, 55, 52, 53, 54, 61, 56,
                55, 60, 57, 56, 61, 58, 53, 57, 57, 55,
                60, 51, 52, 56, 55, 57, 58, 57, 51, 59
                ]
}

# Convert it into a table using pandas
dataset = pandas.DataFrame(data)

# Print the data
# In normal Python we would write
# print(dataset)
# but in Jupyter notebooks, if we simply write the name
# of the variable, it is printed nicely
dataset

| | boot_size | harness_size |
|---|---|---|
| 0 | 39 | 58 |
| 1 | 38 | 58 |
| 2 | 37 | 52 |
| 3 | 39 | 58 |
| 4 | 38 | 57 |
| 5 | 35 | 52 |
| 6 | 37 | 55 |
| 7 | 36 | 53 |
| 8 | 35 | 49 |
| 9 | 40 | 54 |


As you can see, we have the sizes of boots and harnesses for 50 avalanche dogs (only the first 10 rows are shown above).

## Training our model

We will now _train_ (_fit_) our model to predict dogs' boot size based on their harness size. We're just getting started, so we will start with a very simple model - fitting a straight line to the data. 

The code below:
* creates a simple linear regression model
* fits this to the data you have now seen
* graphs the result.

In [2]:
# Load some libraries to do the hard work for us
import graphing 
import statsmodels.formula.api as smf

# First, we define our formula using a special syntax
# This says that boot_size is explained by harness_size
formula = "boot_size ~ harness_size"

# Perform linear regression (a kind of learning) to fit a 
# line to our data. This method does the hard work for
# us. We will look at how this method works in a later unit.
model = smf.ols(formula = formula, data = dataset).fit()

# Show a graph of the result
# Don't worry about how this works for now
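# The fitted parameters are looked up by name: "harness_size" is the
# slope of the line and "Intercept" is where it crosses the y axis
graphing.scatter_2D(dataset,    label_x="harness_size", 
                                label_y="boot_size",
                                trendline=lambda x: model.params["harness_size"] * x + model.params["Intercept"]
                                )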


The graph above shows our original data as circles, with a red line through it. The red line shows our _model_, which lets us predict a dog's boot size from their harness size.

For example, although we have no dog with a harness size of 52.5, we can predict that they would have a boot size of about 36.5, by looking at the red line.
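The red line is fully described by two numbers: a slope and an intercept. As a rough illustration of what fitting a straight line involves, the sketch below computes both with the textbook least-squares formulas in plain Python, without statsmodels. The helper name `fit_line` is hypothetical and not part of the exercise; the data is the same 50 measurements used above, so the estimate should agree closely with what the red line suggests.

```python
# Same 50 measurements as in the dataset above
boot_size = [39, 38, 37, 39, 38, 35, 37, 36, 35, 40,
             40, 36, 38, 39, 42, 42, 36, 36, 35, 41,
             42, 38, 37, 35, 40, 36, 35, 39, 41, 37,
             35, 41, 39, 41, 42, 42, 36, 37, 37, 39,
             42, 35, 36, 41, 41, 41, 39, 39, 35, 39]
harness_size = [58, 58, 52, 58, 57, 52, 55, 53, 49, 54,
                59, 56, 53, 58, 57, 58, 56, 51, 50, 59,
                59, 59, 55, 50, 55, 52, 53, 54, 61, 56,
                55, 60, 57, 56, 61, 58, 53, 57, 57, 55,
                60, 51, 52, 56, 55, 57, 58, 57, 51, 59]

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper written for illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How x and y vary together, and how x varies on its own
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum_xy / sum_xx
    # The least-squares line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(harness_size, boot_size)

# Reading a value off the line is just: slope * x + intercept
predicted = slope * 52.5 + intercept
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"Predicted boot size for a 52.5 cm harness: {predicted:.2f}")
```

In practice a library does this work for us; the point here is only that the trained model boils down to those two numbers.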

We don't have to do this by eye though. We can use the model in our program to predict any boot size we like. Run the code below to see how we can use our model now that it is trained.

In [3]:
# harness_size states the size of the harness we are interested in
harness_size = { 'harness_size' : [52.5] }

# Use the model to predict what size of boots the dog will fit
approximate_boot_size = model.predict(harness_size)

# Print the result
print("Estimated approximate_boot_size:")
print(approximate_boot_size[0])

Estimated approximate_boot_size:
36.48019419144181


If you would like, change the value of `52.5` in `harness_size` to a new value and run the block above to see the model in action.
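To try several values in one go, `predict` can be given a list of harness sizes instead of a single one. The sketch below is a self-contained illustration under one loud assumption: to stay short, it refits the model on only the first ten dogs from the dataset, so its numbers will differ somewhat from those of the full 50-dog model. The names `sample`, `sample_model`, and `sizes_to_try` are made up for this example.

```python
import pandas
import statsmodels.formula.api as smf

# Assumption: only the first ten dogs from the dataset, to keep this short
sample = pandas.DataFrame({
    'boot_size':    [39, 38, 37, 39, 38, 35, 37, 36, 35, 40],
    'harness_size': [58, 58, 52, 58, 57, 52, 55, 53, 49, 54],
})

# Fit the same kind of straight-line model as before
sample_model = smf.ols(formula="boot_size ~ harness_size", data=sample).fit()

# Several harness sizes at once (values chosen purely for illustration)
sizes_to_try = {'harness_size': [50, 52.5, 55, 60]}
predictions = sample_model.predict(sizes_to_try)

# One prediction comes back for each harness size we asked about
for size, boots in zip(sizes_to_try['harness_size'], predictions):
    print(f"Harness {size} cm -> boot size {boots:.1f}")
```

Because the fitted line slopes upward, larger harness sizes give larger predicted boot sizes.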

## Summary
Well done! You've trained your first model. We've demonstrated some topics here without detailed explanation, just to get your feet wet. Later units explain many of these topics in more detail.