# Find the best model

## Setup

In [1]:
import pandas as pd
import altair as alt
import warnings

warnings.simplefilter(action='ignore', category=FutureWarning)
alt.data_transformers.disable_max_rows()

DataTransformerRegistry.enable('default')

## Data

In [2]:
# let's change some values in ads
df = pd.DataFrame(
    {'sales': [2500, 4500, 6500, 8500, 10500, 12500, 14500, 16500, 18500, 20500],
     'ads': [900, 1400, 3600, 3800, 6200, 5200, 6800, 8300, 9800, 10100]}
)

## Analysis

In [3]:
alt.Chart(df).mark_point().encode(
    x='ads',
    y='sales',
    tooltip=['ads', 'sales']
).interactive()

*Is there a "perfect" relationship between the two variables (like in our last example)?*

## Model

Remember our simple model from before. Instead of `number_0` and `number_1`, we now use the following names:

- `number_0` = `intercept`

- `number_1` = `slope` (also called coefficient)
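
With the new names, the model can be written as a small function. This is only a minimal sketch: the function name `predict` and the numbers used here (an intercept of 100 and a slope of 3) are made up for illustration and are not the values used in the exercise.

```python
def predict(x, intercept, slope):
    """Return the model's prediction: intercept + slope * x."""
    return intercept + slope * x


# Hypothetical values, chosen only to illustrate the formula
example_intercept = 100
example_slope = 3

print(predict(10, example_intercept, example_slope))  # 100 + 3 * 10 = 130
print(predict(20, example_intercept, example_slope))  # 100 + 3 * 20 = 160
```

The same arithmetic works on a whole pandas column at once, because multiplying or adding a number to a column applies the operation to every row.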


Use the following values to make a prediction:

- intercept = 500
- slope = 2

Hint:

---


```python
intercept = ___ 
slope = ___

df['___'] = ___ + ___ * df['___'] 
```

---

- Make a prediction for `sales` by using the variable `ads`
- Save the new column as `sales_prediction`

In [4]:
### BEGIN SOLUTION
intercept = 500 
slope = 2

df['sales_prediction'] = intercept + slope * df['ads'] 
### END SOLUTION

In [5]:
df.head()

|   | sales | ads | sales_prediction |
|---|-------|-----|------------------|
| 0 | 2500  | 900 | 2300 |
| 1 | 4500  | 1400 | 3300 |
| 2 | 6500  | 3600 | 7700 |
| 3 | 8500  | 3800 | 8100 |
| 4 | 10500 | 6200 | 12900 |


In [6]:
# Check your code
assert df.loc[0, 'sales_prediction'] == 2300 

In [7]:
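# Scatter plot of the observed data
chart = alt.Chart(df).mark_point().encode(
    x='ads',
    y='sales'
)

# Line showing the model's predictions
line = alt.Chart(df).mark_line().encode(
    alt.X('ads', axis=alt.Axis(title='Ads (in $)')),
    alt.Y('sales_prediction', axis=alt.Axis(title='Sales (in units)')),
    color=alt.value('#0001F5')
)

# Layer the prediction line on top of the observed points
chart + line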

*Can you guess why the two numbers are called intercept and slope?* Write down your answer below.