# (Linear) Regression
The goal of Linear Regression is to **model the relationship between one or more features** (explanatory variables x1, x2, ..., xn) **and a continuous-valued response** (target variable y).
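
As a sketch, the relationship being modeled can be written as a weighted sum of the features. The weights $w_1, \dots, w_n$ and the intercept $b$ are notation introduced here for illustration; they are not named in the original text.

$$\hat{y} = b + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$

Fitting the model means choosing $b$ and the $w_i$ so that the predictions $\hat{y}$ are as close as possible to the observed values of y.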

In [6]:
import pandas as pd
from sklearn.model_selection import train_test_split

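# Load the tab-separated dataset
df = pd.read_csv("sample_dataset.txt", sep='\t')
X = df.iloc[:, :-1]  # features: every column except the last
y = df.iloc[:, -1]   # target: the last column

# Hold out 30% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=42)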

#### Evaluation Metrics
There are *three* common evaluation metrics for regression problems:
- **Mean Absolute Error (MAE)**: The *mean of the absolute value* of the errors
- **Mean Squared Error (MSE)**: The *mean* of the *squared* errors
- **Root Mean Squared Error (RMSE)**: The *square root* of the *mean* of the *squared* errors

All three measure prediction error, so we want each of them to be as small as possible.
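
The three metrics can be computed by hand or with scikit-learn, as in the minimal sketch below. The arrays `y_true` and `y_hat` are hypothetical toy values chosen for illustration; they are not taken from the sample dataset.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical toy values, chosen only for illustration
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_hat = np.array([2.0, 5.0, 4.0, 8.0])

errors = y_true - y_hat        # per-sample errors: [1, 0, -2, -1]

mae = np.mean(np.abs(errors))  # mean of the absolute errors
mse = np.mean(errors ** 2)     # mean of the squared errors
rmse = np.sqrt(mse)            # square root of the MSE

# The scikit-learn helpers should agree with the manual computation
sk_mae = mean_absolute_error(y_true, y_hat)
sk_mse = mean_squared_error(y_true, y_hat)

print("MAE: ", mae, sk_mae)
print("MSE: ", mse, sk_mse)
print("RMSE:", rmse, np.sqrt(sk_mse))
```

In the notebook itself, the same calls would presumably take `y_test` and `y_pred` in place of the toy arrays. RMSE is often the easiest to interpret because it is expressed in the same units as y.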

### Creating the Model
Ordinary least-squares Linear Regression has essentially no hyperparameters to tune, so the model can be created with its default settings.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Create & Fit the model
model = LinearRegression()
model.fit(x_train, y_train)

# Predict on the test set and plot the predictions against the true values
y_pred = model.predict(x_test)
plt.plot(y_test, y_pred, '.')
plt.xlabel("True value")
plt.ylabel("Predicted value")

# Plot the identity line; a perfect prediction would fall on this line
line = np.linspace(0, 5, 5)  # range covered by the line (does not overwrite y)
plt.plot(line, line)
plt.show()

# NOTE: This output is a poor example because the underlying dataset is a classification problem, but the modeling steps are the same.

#### Getting Coefficients and Intercepts
In scikit-learn, these are stored as attributes of the fitted model object.

In [9]:
print("Coefficients: ", model.coef_)
print("y-Intercept: ", model.intercept_)

Coefficients:  [ 0.017626   -0.00820165  0.05652635  0.01638335  0.07202078 -0.04161304
 -0.02456968]
y-Intercept:  6.086426231090267
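
To show how these two attributes relate to the predictions, here is a minimal sketch on synthetic data. The names `X_demo`, `y_demo`, and `demo_model`, and the generating rule `y = 2*x0 - 3*x1 + 5`, are assumptions made for illustration and have nothing to do with the sample dataset above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data generated as y = 2*x0 - 3*x1 + 5
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1] + 5.0

demo_model = LinearRegression().fit(X_demo, y_demo)

# A prediction is the dot product of the features with coef_, plus intercept_
manual_pred = X_demo @ demo_model.coef_ + demo_model.intercept_

print("Coefficients: ", demo_model.coef_)
print("y-Intercept: ", demo_model.intercept_)
print("Matches predict():", np.allclose(manual_pred, demo_model.predict(X_demo)))
```

Because the synthetic data contains no noise, the fitted coefficients and intercept should recover the generating rule up to floating-point error. On real data they are only the best least-squares estimates.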
