# Intro to Simple Linear Regression

#### Experimenting with simple linear regression models to fit the data

Links:
- [Linear Regression Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html)

Import the necessary modules, load the dataset into Python, and preview it:

In [None]:
from dotenv import load_dotenv
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from datetime import datetime
from sklearn.linear_model import LinearRegression

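# Read DATASET_PATH from a local .env file (or the process environment)
load_dotenv()
DATASET_PATH = os.environ.get("DATASET_PATH")

# os.path.join works whether or not DATASET_PATH ends with a path separator
df = pd.read_excel(os.path.join(DATASET_PATH, "Conversion by Day.xlsx"))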

states = sorted(set(df["STATE_CODE"]))
df.head()

Fit a simple linear regression model to the NSW data and print its parameters:

In [None]:
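# Convert the quote date to an integer ordinal (LinearRegression needs numeric features, not datetimes).
# .copy() avoids pandas' SettingWithCopyWarning when the new column is added below.
df_nsw = df.loc[df.STATE_CODE == "NSW", ["QUOTE_DATE", "Net Closing Rate"]].copy()
df_nsw["QUOTE_DATE_INT"] = df_nsw["QUOTE_DATE"].map(datetime.toordinal)

# Select columns by name so x and y keep numeric dtypes (to_numpy() on the mixed frame yields an object array)
x = df_nsw[["QUOTE_DATE_INT"]].to_numpy()  # date as ordinal, shape (n, 1)
y = df_nsw["Net Closing Rate"].to_numpy()  # net closing rate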

model = LinearRegression().fit(x, y)

print(f"R^2 = {model.score(x, y)}")
print(f"Equation: y = {model.intercept_} + ({model.coef_[0]})x")
df_nsw.plot(x="QUOTE_DATE", y="Net Closing Rate");
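
Because `toordinal` counts days, the fitted coefficient is the change in rate per day, and the intercept is the extrapolated value at ordinal 0, which has no practical meaning on its own. The sketch below illustrates this on synthetic data. The dates and rates are made up for illustration and are not taken from the dataset, and the names `slope_per_day` and `predicted` are hypothetical.

```python
from datetime import date, timedelta

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for a daily rate series (hypothetical values, not from the dataset):
# the rate starts at 0.30 and rises by exactly 0.001 per day for 30 days.
start = date(2023, 1, 1)
dates = [start + timedelta(days=i) for i in range(30)]
rates = np.array([0.30 + 0.001 * i for i in range(30)])

# Same encoding as the cell above: each date becomes its ordinal day count.
x = np.array([d.toordinal() for d in dates]).reshape(-1, 1)

model = LinearRegression().fit(x, rates)

# The coefficient is the change in rate per one-day step.
slope_per_day = model.coef_[0]

# Predict the rate for the day after the last observation.
next_day = np.array([[(start + timedelta(days=30)).toordinal()]])
predicted = model.predict(next_day)[0]

print(f"Slope per day: {slope_per_day:.6f}")
print(f"Predicted rate for day 31: {predicted:.4f}")
```

To predict for a real date, encode it the same way (`toordinal`) before calling `model.predict`.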