# Group 1 Advanced Python Group Project Explanation Notebook
This document explains, step by step, how we developed the first part of the Group Project: Flask. Each section pairs a short explanation with the corresponding code.

### Developing the Machine Learning Model

After having found the dataset, the first step was to import all the libraries used for the model. 

In [1]:
#From model.py
import os
import pickle

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

- Since we are building a machine learning model, we first loaded the data and dropped the columns that add no value to the model, in this case the Order_ID.
- To make sure that we do not train on incomplete data, we also dropped all rows with missing values.
- Because the dataset contains categorical variables and the aim was to use a linear regression, we applied one-hot encoding ("dummies") so that every feature is numeric and the Linear Regression can work with the dataset.
- Next, we split the data into the features X and the target Y.
- Lastly, we used scikit-learn to split the data into training (80%) and testing (20%) data.



In [2]:
#From model.py

data = pd.read_csv("data/Food_Delivery_times.csv")

data.drop(columns=["Order_ID"], inplace=True)
data.dropna(inplace=True)

#We have some categorical variables - for linear regression we need to convert them to numerical variables
data = pd.get_dummies(data)

X = data.drop(columns=["Delivery_Time_min"])
Y = data["Delivery_Time_min"]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.20, random_state=42)
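
To make the effect of the one-hot encoding step more concrete, here is a minimal sketch on a toy table. The column names and values are made up for illustration and are not taken from the dataset. The second half shows one possible way to encode a single new row so that it ends up with the same columns as the training data, which is something a prediction route would probably need.

```python
import pandas as pd

# Toy frame; column names and values are illustrative assumptions.
toy = pd.DataFrame({
    "Distance_km": [2.5, 7.1, 4.0],
    "Weather": ["Clear", "Rainy", "Clear"],
})

# Numeric columns pass through unchanged; each text column becomes
# one 0/1 indicator column per category.
encoded = pd.get_dummies(toy, dtype=int)
print(encoded.columns.tolist())
# ['Distance_km', 'Weather_Clear', 'Weather_Rainy']

# A single new row only contains one category, so get_dummies alone
# would produce fewer columns. Reindexing against the training columns
# fills the missing indicators with 0.
new_order = pd.DataFrame({"Distance_km": [3.2], "Weather": ["Rainy"]})
new_encoded = pd.get_dummies(new_order, dtype=int).reindex(
    columns=encoded.columns, fill_value=0
)
print(new_encoded.iloc[0].to_dict())
```

The model can only predict on input that has the same columns, in the same order, as the data it was trained on, which is why the reindex step is included here.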

- Next, we created a Linear Regression model and fit it to our X and Y training data.
- To see what happened "under the hood", we examined the coefficient of each feature to better understand which variables were key for the prediction.
- We printed each feature name together with its coefficient.

In [None]:
#From model.py

model = LinearRegression()

model.fit(X_train, Y_train)

#During development we wanted to see the coefficients of the different features
coefficients = model.coef_
feature_names = X.columns

for feature, coef in zip(feature_names, coefficients):
    print(f"{feature}: {coef}")
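
When there are many features, a sorted view can be easier to read than a plain loop. The following is a small sketch, not part of model.py, that ranks coefficients by absolute size. It uses synthetic data with a known relationship (y = 3*a + 0.5*b + 10), so the names X_demo, y_demo and demo_model are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data with a known relationship: y = 3*a + 0.5*b + 10
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({"a": rng.normal(size=50), "b": rng.normal(size=50)})
y_demo = 3 * X_demo["a"] + 0.5 * X_demo["b"] + 10

demo_model = LinearRegression().fit(X_demo, y_demo)

# Pair each coefficient with its feature name, then sort by magnitude
coef_table = pd.Series(demo_model.coef_, index=X_demo.columns)
ranked = coef_table.sort_values(key=lambda s: s.abs(), ascending=False)

print(ranked)
print(f"Intercept: {demo_model.intercept_:.2f}")
```

Keep in mind that a raw coefficient is expressed per unit of its feature, so comparing a distance in kilometers with a 0/1 dummy column by coefficient size alone should be done with some care.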

- Next, we had the model predict Y from our X test data.
- We then measured the performance using mean squared error and R^2.
- During development, the R^2 of the model was approx. 0.83, which we considered an acceptable score.
- Lastly, we saved the model to a pickle file that we could load in our Flask backend.
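
For reference, both metrics can be reproduced by hand. The sketch below uses four made-up values (not results from our model) and checks the manual calculation against scikit-learn: mean squared error is the average of the squared residuals, and R^2 is one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up true and predicted delivery times, for illustration only
y_true = np.array([30.0, 45.0, 60.0, 25.0])
y_hat = np.array([32.0, 41.0, 58.0, 29.0])

residuals = y_true - y_hat

# Mean squared error: average of the squared residuals
mse_manual = np.mean(residuals ** 2)

# R^2: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(mse_manual, mean_squared_error(y_true, y_hat))
print(r2_manual, r2_score(y_true, y_hat))
```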

In [None]:
#Predicting the test set results
Y_pred = model.predict(X_test)

mse = mean_squared_error(Y_test, Y_pred)
r2 = r2_score(Y_test, Y_pred)

print(f"Mean Squared Error: {mse}")
print(f"R2 Score: {r2}") #During testing this was approx. 0.83, which we considered acceptable

#Saving the model to a pickle file; the with-block closes the file after writing
with open("food_delivery_model.pkl", "wb") as f:
    pickle.dump(model, f)
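
As a quick sanity check, a saved model can be loaded again and compared with the original. This is a minimal sketch of such a round trip. It trains a tiny stand-in model and writes to a temporary directory, so the names demo_model and demo_model.pkl are hypothetical and the real food_delivery_model.pkl is not touched.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny stand-in model: y = 2*x + 10
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([12.0, 14.0, 16.0, 18.0])
demo_model = LinearRegression().fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")

    # Save the fitted model
    with open(path, "wb") as f:
        pickle.dump(demo_model, f)

    # Load it back, as a backend would do at startup
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should give the same predictions as the original
same = np.allclose(restored.predict(X_demo), demo_model.predict(X_demo))
print(f"Predictions match after reload: {same}")
```

Pickle files should only be loaded from sources you trust, and it is generally safest to load a model with the same scikit-learn version that was used to save it.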

### Developing the Backend Routes & Templates for the Home Route and Predict Route

First, we needed to import all of the libraries required for a functioning backend.