# Simple Linear Regression

## Intuition behind Simple Linear Regression

Linear Regression is useful when we want to predict a continuous numerical value from data in which the target varies approximately linearly with the features. Depending on the number of features, Linear Regression can be:

1.   Simple Linear Regression - only a single feature
2.   Multiple Linear Regression - multiple features


Simple Linear Regression finds the best-fitting straight line relating a single feature to the target variable.
![Simple Linear Regression Equation](Simple-Linear-Regression-Intuition-01.PNG)

Let's say we have a dataset containing the salaries of employees with different levels of experience. Here,


*   Independent Variable/Feature = Experience
*   Dependent/Target Variable = Salary

We want to build a model that can predict an employee's salary from their experience. Since we have only one feature (Experience) and Salary is a continuous variable, this is a Simple Linear Regression problem.

The way this scenario fits into Simple Linear Regression can be visualized as follows:
![Simple Linear Regression for Salary Prediction: Problem](Simple-Linear-Regression-Intuition-02.PNG)

To find the line that best fits the training data, we compute, for each candidate line, the sum of the squared differences between the observed values and the values the line predicts. The line with the smallest sum of squares is chosen as the best-fit line.
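
The least-squares idea can be sketched in a few lines of NumPy. The data below is made up for illustration, and the names `experience`, `salary`, and `sum_of_squares` are assumptions, not part of the original notebook. For a single feature, the line with the smallest sum of squares can be computed directly from the means of the data, so no search over candidate lines is needed.

```python
import numpy as np

# Hypothetical toy data (made-up values, not taken from Salary_Data.csv).
experience = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = np.array([40000.0, 45000.0, 52000.0, 58000.0, 65000.0])


def sum_of_squares(b0, b1, x, y):
    """Sum of squared deviations between observed y and the line b0 + b1 * x."""
    residuals = y - (b0 + b1 * x)
    return float(np.sum(residuals ** 2))


# Closed-form least-squares estimates for one feature:
# slope = covariance(x, y) / variance(x), intercept = mean(y) - slope * mean(x)
x_mean, y_mean = experience.mean(), salary.mean()
b1 = np.sum((experience - x_mean) * (salary - y_mean)) / np.sum((experience - x_mean) ** 2)
b0 = y_mean - b1 * x_mean

best = sum_of_squares(b0, b1, experience, salary)
# A line with a slightly different slope has a larger sum of squares.
other = sum_of_squares(b0, b1 + 500.0, experience, salary)

print(f"intercept={b0:.1f}, slope={b1:.1f}")
print(f"sum of squares: best={best:.1f}, other={other:.1f}")
```

Changing the slope or intercept in either direction increases the sum of squares, which is what makes the fitted line the "best fit" in this sense.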

![Simple Linear Regression for Salary Prediction: Solution](Simple-Linear-Regression-Intuition-03.PNG)

## Importing the libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Importing the dataset

In [2]:
dataset = pd.read_csv('Salary_Data.csv')
# Features: every column except the last
X = dataset.iloc[:, :-1].values
# Target: the last column
y = dataset.iloc[:, -1].values
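
The two `iloc` slices produce arrays of different shapes, which matters because scikit-learn estimators expect a 2-D feature matrix and a 1-D target. The sketch below uses a small in-memory DataFrame as a stand-in for `Salary_Data.csv`; its column names and values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for Salary_Data.csv (assumed column names, made-up values).
dataset = pd.DataFrame({
    "YearsExperience": [1.1, 2.0, 3.2, 4.5],
    "Salary": [39000.0, 43500.0, 54500.0, 61000.0],
})

X = dataset.iloc[:, :-1].values  # slice of columns -> 2-D array, one column per feature
y = dataset.iloc[:, -1].values   # single column -> 1-D array

print("X shape:", X.shape)
print("y shape:", y.shape)
```

Selecting with `:-1` rather than a single column index keeps `X` two-dimensional even when there is only one feature.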

## Splitting the dataset into the Training set and Test set

In [3]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
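
As a rough illustration of what the split does, the sketch below runs `train_test_split` on ten made-up samples. With `test_size=0.2`, 20% of the rows are held out for testing, and fixing `random_state` makes the shuffle reproducible. The data here is hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with one feature each, on the line y = 2x + 1.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
print("training rows:", len(X_train), "test rows:", len(X_test))

# Repeating the call with the same random_state gives the same split.
X_train2, X_test2, y_train2, y_test2 = train_test_split(X, y, test_size=0.2, random_state=0)
print("same test rows:", np.array_equal(X_test, X_test2))
```

Features and targets are shuffled together, so each row of `X_train` still lines up with the matching entry of `y_train`.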

## Training the Simple Linear Regression model on the Training set

In [4]:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
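
After fitting, the slope and intercept of the line are available as the `coef_` and `intercept_` attributes of the model. The sketch below fits noise-free, made-up data lying exactly on a known line, so the recovered parameters can be checked by eye. The data and the names `slope` and `intercept` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data on the line salary = 30000 + 5000 * experience.
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = 30000.0 + 5000.0 * X_train.ravel()

regressor = LinearRegression()
regressor.fit(X_train, y_train)

slope = regressor.coef_[0]        # one coefficient per feature
intercept = regressor.intercept_  # predicted value when the feature is 0

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

On real data with noise, the fitted values will only approximate any underlying relationship.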

## Predicting the Test set results

In [5]:
y_pred = regressor.predict(X_test)
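
One simple way to inspect the predictions is to print them next to the actual test values. The sketch below does this on synthetic data, since `Salary_Data.csv` is not reproduced here. The data-generating line, the noise level, and the `mean_abs_error` summary are all assumptions added for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: a noisy line standing in for the salary dataset.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(30, 1))
y = 30000.0 + 9000.0 * X.ravel() + rng.normal(0.0, 3000.0, size=30)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Show each actual test value next to the model's prediction.
for actual, predicted in zip(y_test, y_pred):
    print(f"actual={actual:10.0f}  predicted={predicted:10.0f}")

# Average absolute gap between predictions and actual values.
mean_abs_error = float(np.mean(np.abs(y_test - y_pred)))
print(f"mean absolute error: {mean_abs_error:.0f}")
```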

## Visualising the Training set results
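
A minimal sketch of such a plot, assuming synthetic data in place of `Salary_Data.csv`: the training points are drawn as a scatter plot and the fitted line is drawn over them. The colours, title, and axis labels are illustrative choices, not taken from the original notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: a noisy line standing in for the salary dataset.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(30, 1))
y = 30000.0 + 9000.0 * X.ravel() + rng.normal(0.0, 3000.0, size=30)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
regressor = LinearRegression().fit(X_train, y_train)

fig, ax = plt.subplots()
# Observed training points.
ax.scatter(X_train.ravel(), y_train, color="red", label="Training data")
# Fitted line, evaluated at the training inputs sorted left to right.
order = np.argsort(X_train.ravel())
ax.plot(X_train.ravel()[order], regressor.predict(X_train[order]),
        color="blue", label="Fitted line")
ax.set_title("Salary vs Experience (Training set)")
ax.set_xlabel("Years of Experience")
ax.set_ylabel("Salary")
ax.legend()
```

In a notebook, drop the `matplotlib.use("Agg")` line and the figure renders inline; in a script, `fig.savefig(...)` writes it to a file.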

## Visualising the Test set results
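
A minimal sketch of the test-set plot, again assuming synthetic data in place of `Salary_Data.csv`: the held-out points are plotted against the line learned from the training set, which shows how well the model generalises to data it has not seen. The line is still computed from the training inputs because the model does not change between the two plots. Colours and labels are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: a noisy line standing in for the salary dataset.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(30, 1))
y = 30000.0 + 9000.0 * X.ravel() + rng.normal(0.0, 3000.0, size=30)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
regressor = LinearRegression().fit(X_train, y_train)

fig, ax = plt.subplots()
# Held-out test points that the model did not see during training.
ax.scatter(X_test.ravel(), y_test, color="red", label="Test data")
# The same fitted line as before, evaluated at the sorted training inputs.
order = np.argsort(X_train.ravel())
ax.plot(X_train.ravel()[order], regressor.predict(X_train[order]),
        color="blue", label="Line fitted on training set")
ax.set_title("Salary vs Experience (Test set)")
ax.set_xlabel("Years of Experience")
ax.set_ylabel("Salary")
ax.legend()
```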