# Linear Regression

### Two-Dimensional Problem

In this example, we'll be working with data on the average life expectancy at birth and the average BMI for males across the world. The data comes from Gapminder.

The data is in the file "bmi_and_life_expectancy.csv". It includes three columns, containing the following data:

- Country: The country the person was born in.
- Life expectancy: The average life expectancy at birth for a person in that country.
- BMI: The mean BMI of males in that country.
    
#### You'll need to complete each of the following steps:

1. Load the data

The data is in the file called "bmi_and_life_expectancy.csv".
Use pandas read_csv to load the data into a dataframe.
Assign the dataframe to the variable bmi_life_data.

2. Build a linear regression model

Create a regression model using scikit-learn's LinearRegression and assign it to bmi_life_model.
Fit the model to the data.

3. Predict using the model

Predict using a BMI of 21.07931 and assign it to the variable laos_life_exp.

#### Note: 

Here, BMI is the predictor, also known as an independent variable. A predictor is a variable you're looking at in order to make predictions about other variables, while the values you are trying to predict are known as dependent variables. In this case, life expectancy is the dependent variable.
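
To make the roles of predictor and dependent variable concrete, here is a minimal sketch of a least-squares line fit with a single predictor, written in plain Python. The BMI and life-expectancy numbers are invented for illustration and are not taken from the Gapminder file, and the helper name `fit_line` is hypothetical.

```python
# Minimal sketch: ordinary least squares with one predictor, no libraries.
# The data below is invented for illustration (it lies exactly on a line).

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

bmi = [20.0, 22.0, 24.0, 26.0]               # predictor (independent variable)
life_expectancy = [60.0, 64.0, 68.0, 72.0]   # dependent variable

slope, intercept = fit_line(bmi, life_expectancy)

# Predicting means evaluating the fitted line at a new predictor value.
predicted = slope * 21.0 + intercept
print(slope, intercept, predicted)
```

The fitted slope says how much the dependent variable changes per unit of the predictor, which is the quantity a linear regression model estimates from the data.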

In [8]:
# Imports
import pandas as pd
from sklearn.linear_model import LinearRegression

# Load the data
bmi_life_data = pd.read_csv('bmi_and_life_expectancy.csv')
X = bmi_life_data[['BMI']]
y = bmi_life_data[['Life expectancy']]

# Create and fit the linear regression model
bmi_life_model = LinearRegression()
bmi_life_model.fit(X, y)

# Predict life expectancy for a BMI value of 21.07931
# predict expects a 2D array: one row per sample, one column per feature
laos_life_exp = bmi_life_model.predict([[21.07931]])

# Print prediction
print("Laos life expectancy in relation to their BMI is:", laos_life_exp)

Laos life expectancy in relation to their BMI is: [[60.31564716]]
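
The nested brackets in the output come from the shapes involved: the target was passed as a one-column dataframe, so the prediction is a 2D array. The sketch below illustrates this and shows how the fitted slope and intercept reproduce the prediction. It uses a small invented dataframe as a stand-in for the CSV (the rows are hypothetical, not Gapminder data), and the names `toy_data` and `toy_model` are illustrative only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the CSV: four invented rows, not Gapminder data.
toy_data = pd.DataFrame({
    "BMI": [20.0, 22.0, 24.0, 26.0],
    "Life expectancy": [60.0, 64.0, 68.0, 72.0],
})

toy_model = LinearRegression()
toy_model.fit(toy_data[["BMI"]], toy_data[["Life expectancy"]])

# predict expects 2D input: one row per sample, one column per feature.
# A dataframe with the same column name keeps feature names consistent.
new_point = pd.DataFrame({"BMI": [21.0]})
toy_prediction = toy_model.predict(new_point)

# Because y was passed as a one-column dataframe, coef_ has shape (1, 1)
# and intercept_ has shape (1,).
slope = toy_model.coef_[0][0]
intercept = toy_model.intercept_[0]

# The prediction is the fitted line evaluated at the new BMI value.
manual = intercept + slope * 21.0
print(toy_prediction, manual)
```

Passing the target as a single column (for example `toy_data["Life expectancy"]`) would instead give a 1D prediction; either form works, as long as the features stay 2D.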


### N-Dimensional Problem

In this example, we'll be using the Boston house-prices dataset. The dataset consists of 13 features for each of 506 houses, along with the median home value in thousands of US dollars. You'll fit a model on the 13 features to predict the value of the houses.

You'll need to complete each of the following steps:

1. Build a linear regression model

Create a regression model using scikit-learn's LinearRegression and assign it to boston_housing_model.
Fit the model to the data.

2. Predict using the model

Predict the value of sample_house.

In [2]:
# Imports
from sklearn.linear_model import LinearRegression
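# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell requires an older scikit-learn version.
from sklearn.datasets import load_boston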

# Load the data from the Boston house-prices dataset
boston_data = load_boston()
X = boston_data['data']
y = boston_data['target']

# Create and fit the linear regression model
boston_housing_model = LinearRegression()
boston_housing_model.fit(X, y)

# Make a prediction using the model
sample_house = [[2.29690000e-01, 0.00000000e+00, 1.05900000e+01, 0.00000000e+00, 4.89000000e-01,
                6.32600000e+00, 5.25000000e+01, 4.35490000e+00, 4.00000000e+00, 2.77000000e+02,
                1.86000000e+01, 3.94870000e+02, 1.09700000e+01]]

# Predict housing price for the sample_house
prediction = boston_housing_model.predict(sample_house)

# Print prediction
print(prediction)

[23.68420569]
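
With several features, the model still predicts with a single linear formula: the intercept plus the sum of each coefficient times its feature. The sketch below checks this on synthetic data, which is an assumption made for illustration so the example does not depend on the Boston dataset; the weights, sample values, and variable names are all invented.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration, not the Boston dataset):
# 50 samples, 3 features, generated from known weights with no noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
true_weights = np.array([1.5, -2.0, 0.5])
true_intercept = 4.0
y_demo = X_demo @ true_weights + true_intercept

demo_model = LinearRegression()
demo_model.fit(X_demo, y_demo)

# One sample with 3 features, shaped (1, 3) as predict requires.
sample = np.array([[0.2, -1.0, 3.0]])
sklearn_prediction = demo_model.predict(sample)

# The same prediction computed by hand: intercept + sum of coefficient * feature.
manual_prediction = demo_model.intercept_ + sample @ demo_model.coef_

print(demo_model.coef_, demo_model.intercept_)
print(sklearn_prediction, manual_prediction)
```

Because the synthetic targets contain no noise, the fitted coefficients should match the generating weights up to floating-point error; on real data such as house prices they are only estimates.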
