## Linear regression

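<p>Linear regression is a basic machine learning algorithm for predicting a continuous outcome. It models the relationship between a dependent variable (the value you want to predict) and one or more independent variables (the input features) by fitting a straight line to the data (or, with several features, a plane or hyperplane). The line is chosen to fit the data points as closely as possible, typically by minimising the sum of the squared differences between the observed values and the values predicted by the line. For a single feature, the equation of the line is often written as:</p>

$$y = mx + b$$

<p>Here $m$ is the slope (how much $y$ changes per unit change in $x$) and $b$ is the intercept (the predicted value when $x = 0$). With several features, each feature gets its own coefficient.</p>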
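As a small illustration of the least-squares idea, the sketch below fits a line to a handful of synthetic points with `np.polyfit`. The data values are made up for the example (they lie exactly on y = 2x + 1), so the fit should recover a slope close to 2 and an intercept close to 1.

```python
import numpy as np

# Synthetic, noise-free points on the line y = 2x + 1 (illustrative values only)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0

# With deg=1, np.polyfit returns the least-squares slope and intercept
m, b = np.polyfit(x_demo, y_demo, deg=1)
print(f"slope m = {m:.3f}, intercept b = {b:.3f}")
```

With noisy data the recovered slope and intercept would only approximate the underlying values.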

In [None]:
# Library Imports
import pandas as pd
import numpy as np
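import sklearn.model_selection  # import the submodule explicitly so sklearn.model_selection.train_test_split resolves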
import matplotlib.pyplot as pyplot
import pickle
from sklearn import linear_model
from sklearn.utils import shuffle
from matplotlib import style


In [None]:
# Read the CSV using pandas
data = pd.read_csv("forestfires.csv")
print(data.head())


In [None]:
# Keep only the columns used for modelling
data = data[["X", "DMC", "DC", "ISI", "FFMC", "temp", "area", "wind", "rain"]]
print(data.head())



In [None]:
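# Column to predict: FFMC (Fine Fuel Moisture Code)
predict = "FFMC"

# Features: every remaining column; target: the FFMC column
x = np.array(data.drop(predict, axis=1))
y = np.array(data[predict])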


In [None]:
# Split the data set into train and test sets
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.1)


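<p>This code splits the dataset into training and testing sets. With <code>test_size=0.1</code>, 10% of the rows are held out for testing and the remaining 90% are used to train the model. The split is random, so results vary between runs unless a <code>random_state</code> is set.</p>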
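To make the split concrete, here is a minimal sketch on toy data (the arrays `features` and `target` are hypothetical and are not the forest fires data). It passes `random_state=0`, an optional argument not used in the notebook, so that the shuffle is reproducible.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 rows with 3 features each (hypothetical values)
features = np.arange(60).reshape(20, 3)
target = np.arange(20)

# test_size=0.1 holds out 10% of the rows; random_state fixes the shuffle
f_train, f_test, t_train, t_test = train_test_split(
    features, target, test_size=0.1, random_state=0
)

# 18 rows go to training and 2 rows to testing
print(f_train.shape, f_test.shape)
```

Rows of the feature array and the target array are shuffled together, so each feature row stays paired with its own target value.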

In [None]:
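# Create the model and fit it to the training data
linear = linear_model.LinearRegression()
linear.fit(x_train, y_train)

# For regressors, score() returns R^2 (coefficient of determination), not classification accuracy
acc = linear.score(x_test, y_test)
print(acc)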


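<p>This code snippet initialises a linear regression model and fits it to the training data, finding the coefficients and intercept that minimise the squared error (the "line of best fit"). It then evaluates the fitted model on the test set. For a regression model, <code>score</code> returns the coefficient of determination, R², which measures how much of the variance in the target the model explains: 1.0 is a perfect fit, and values near 0 mean the model does little better than predicting the mean.</p>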
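The sketch below illustrates what the fitted model stores and what `score` computes, using synthetic data (the arrays `X_demo` and `y_demo` and the true coefficients 3, -2 and 5 are assumptions made for this example). It recomputes the predictions from `coef_` and `intercept_`, and recomputes R² by hand as 1 - SS_res / SS_tot.

```python
import numpy as np
from sklearn import linear_model

# Synthetic data: y = 3*x1 - 2*x2 + 5 plus a little noise (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 5.0 + rng.normal(scale=0.1, size=100)

model = linear_model.LinearRegression().fit(X_demo, y_demo)

# A prediction is the dot product of the features with coef_, plus intercept_
pred = model.predict(X_demo)
manual_pred = X_demo @ model.coef_ + model.intercept_

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_demo - pred) ** 2)
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(r2_manual, model.score(X_demo, y_demo))
```

The two printed values should agree, and `manual_pred` should match `pred` up to floating-point error.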

In [None]:
with open("forestfires.pickle", "wb") as f:
    pickle.dump(linear, f)


In [None]:
# Load the pickle file (the with-block closes the file afterwards)
with open("forestfires.pickle", "rb") as pickle_in:
    linear = pickle.load(pickle_in)


In [None]:
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)


In [None]:
predictions = linear.predict(x_test)

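# Print each prediction next to its input features and the actual value
# (use i as the index so the feature array x is not overwritten)
for i in range(len(predictions)):
    print(predictions[i], x_test[i], y_test[i])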


In [None]:
style.use("ggplot")

# Scatter plot of predicted vs. actual FFMC values on the test set
pyplot.scatter(predictions, y_test)
pyplot.xlabel("Predicted FFMC")
pyplot.ylabel("Actual FFMC")
pyplot.show()
