# **Edmonton Food Drive 2024 - Feature Engineering**

## **Introduction**
The Edmonton Food Drive is a collaborative initiative designed to tackle food insecurity in Edmonton. This drive mobilizes volunteers and resources to collect and distribute food to those in need. By leveraging machine learning, the project aims to enhance operational efficiency and maximize the impact of each donation. The integration of real-time data and optimized routing will contribute to a more streamlined and effective food distribution process.

**Team Name:** Team 404

**Team Members:**
*   Catrina Llamas
*   Roe Joshua Alincastre
*   Kendrick Moreno

#### **Task 1: Imports and data loading**
Now let's import the libraries that we are going to use throughout this notebook.

In [1]:
import os
import yaml
import numpy as np
import pandas as pd
import sys
import warnings
warnings.filterwarnings("ignore")

sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..', 'src')))
from utils.open_config import load_config
from preprocess import FeatureEngineer

##### **Load the dataset**
We are going to load the cleaned dataset.

In [2]:
params, project_root = load_config()
base_path = project_root
df_efd_cleaned = pd.read_csv(os.path.join(base_path, params["files"]["cleaned_data"]))
df_efd_cleaned.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1045 entries, 0 to 1044
Data columns (total 18 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Unnamed: 0.1             1045 non-null   int64  
 1   Unnamed: 0               1045 non-null   int64  
 2   Drop Off Location        1045 non-null   object 
 3   Stake                    1045 non-null   object 
 4   Route Information        1045 non-null   object 
 5   Time Spent               1045 non-null   int64  
 6   Total Adult Volunteers   1045 non-null   float64
 7   Total Youth Volunteers   1045 non-null   float64
 8   Number of Doors          1045 non-null   float64
 9   Number of Donation Bags  1045 non-null   float64
 10  Number of Routes         1045 non-null   float64
 11  Ward                     1045 non-null   object 
 12  Year                     1045 non-null   int64  
 13  Latitude                 1041 non-null   float64
 14  Longitude               

#### **Task 2: Feature Engineering**

Now let's introduce three new fields:
*   Total Volunteers
*   Donation Bags per Door
*   Donation Bags per Route
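The source of `FeatureEngineer` is not shown in this notebook, so the following is only a sketch of how the three fields could be derived with pandas. The function names `add_engineered_features` and `_safe_ratio` are hypothetical, and the formulas are inferred from the rows displayed in this notebook (for example 7 + 7 = 14 volunteers, 599 / 78 ≈ 7.679487 bags per door, 51 / 2 = 25.5 bags per route). Reporting 0.0 when the denominator is zero is also an assumption, based on the row with 0 doors and 0 bags.

```python
import numpy as np
import pandas as pd


def _safe_ratio(numerator: pd.Series, denominator: pd.Series) -> pd.Series:
    """Divide two columns, returning 0.0 where the denominator is zero."""
    # Zero denominators become NaN so the division yields NaN instead of inf,
    # and those NaN results are then reported as 0.0 (assumed convention).
    return numerator.div(denominator.replace(0, np.nan)).fillna(0.0)


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the three derived columns added."""
    out = df.copy()
    out["Total Volunteers"] = (
        out["Total Adult Volunteers"] + out["Total Youth Volunteers"]
    )
    out["Donation Bags per Door"] = _safe_ratio(
        out["Number of Donation Bags"], out["Number of Doors"]
    )
    out["Donation Bags per Route"] = _safe_ratio(
        out["Number of Donation Bags"], out["Number of Routes"]
    )
    return out


# Three rows copied from the notebook output, used only to illustrate the call.
sample = pd.DataFrame(
    {
        "Total Adult Volunteers": [7.0, 0.0, 2.0],
        "Total Youth Volunteers": [7.0, 0.0, 4.0],
        "Number of Doors": [78.0, 0.0, 195.0],
        "Number of Donation Bags": [599.0, 0.0, 51.0],
        "Number of Routes": [1.0, 1.0, 2.0],
    }
)
result = add_engineered_features(sample)
print(result[["Total Volunteers", "Donation Bags per Door", "Donation Bags per Route"]])
```

Working on a copy leaves the input frame unchanged, which makes the step safe to re-run in a notebook cell.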

In [3]:
feature_engineer = FeatureEngineer(df_efd_cleaned)
df_efd_cleaned = feature_engineer.feature_engineering()
df_efd_cleaned.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1045 entries, 0 to 1044
Data columns (total 18 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Unnamed: 0.1             1045 non-null   int64  
 1   Unnamed: 0               1045 non-null   int64  
 2   Drop Off Location        1045 non-null   object 
 3   Stake                    1045 non-null   object 
 4   Route Information        1045 non-null   object 
 5   Time Spent               1045 non-null   int64  
 6   Total Adult Volunteers   1045 non-null   float64
 7   Total Youth Volunteers   1045 non-null   float64
 8   Number of Doors          1045 non-null   float64
 9   Number of Donation Bags  1045 non-null   float64
 10  Number of Routes         1045 non-null   float64
 11  Ward                     1045 non-null   object 
 12  Year                     1045 non-null   int64  
 13  Latitude                 1041 non-null   float64
 14  Longitude               

In [4]:
df_efd_cleaned

| Unnamed: 0.2 | Unnamed: 0.1 | Unnamed: 0 | Drop Off Location | Stake | Route Information | Time Spent | Total Adult Volunteers | Total Youth Volunteers | Number of Doors | Number of Donation Bags | Number of Routes | Ward | Year | Latitude | Longitude | Total Volunteers | Donation Bags per Door | Donation Bags per Route |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | Bearspaw Chapel | Riverbend Stake | Route 676 | 15 | 7.0 | 7.0 | 78.0 | 599.0 | 1.0 | Woodbend Ward | 2024 | 53.474700 | -113.639900 | 14.0 | 7.679487 | 599.0 |
| 1 | 1 | 1 | Bearspaw Chapel | Gateway Stake | Route 0 | 15 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | Lee Ridge Ward | 2024 | 53.469593 | -113.444355 | 0.0 | 0.000000 | 0.0 |
| 2 | 2 | 2 | Londonberry Chapel | Bonnie Doon Stake | Route Unknown | 15 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | Clareview Ward | 2024 | 53.595400 | -113.415300 | 1.0 | 1.000000 | 1.0 |
| 3 | 3 | 3 | Gateway Stake Centre | Gateway Stake | Route 50 | 15 | 2.0 | 2.0 | 20.0 | 20.0 | 1.0 | Lee Ridge Ward | 2024 | 53.469593 | -113.444355 | 4.0 | 1.000000 | 20.0 |
| 4 | 4 | 4 | Bonnie Doon Stake Centre | Bonnie Doon Stake | Route 98 | 15 | 2.0 | 2.0 | 20.0 | 15.0 | 1.0 | Forest Heights Ward | 2024 | 53.544537 | -113.451250 | 4.0 | 0.750000 | 15.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1040 | 1040 | 1050 | Onoway | Edmonton North Stake | Route Unknown | 45 | 2.0 | 0.0 | 195.0 | 10.0 | 2.0 | Onoway Ward | 2023 | 53.703000 | -114.197300 | 2.0 | 0.051282 | 5.0 |
| 1041 | 1041 | 1051 | North Stake Centre | Edmonton North Stake | Route Unknown | 45 | 2.0 | 0.0 | 150.0 | 20.0 | 2.0 | Namao Ward | 2023 | 53.724200 | -113.479900 | 2.0 | 0.133333 | 10.0 |
| 1042 | 1042 | 1053 | Parkland (Spruce Grove/Stony Plain) | Edmonton North Stake | Route Unknown | 120 | 2.0 | 4.0 | 195.0 | 51.0 | 2.0 | Stony Plain Ward | 2023 | 53.528600 | -114.010300 | 6.0 | 0.261538 | 25.5 |
| 1043 | 1043 | 1054 | North Stake Centre | Edmonton North Stake | Route Unknown | 150 | 2.0 | 0.0 | 600.0 | 78.0 | 3.0 | Griesbach Ward | 2023 | 53.606808 | -113.504229 | 2.0 | 0.130000 | 26.0 |


Now, let's save the updated DataFrame to a CSV file.

In [5]:
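# index=False skips the row index, which would otherwise come back as an extra "Unnamed: 0" column on the next load
df_efd_cleaned.to_csv(os.path.join(base_path, params["files"]["cleaned_data"]), index=False)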
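The `Unnamed: 0`-style columns in the earlier outputs are what pandas produces when a CSV is written with its row index and later read back without `index_col`. Here is a minimal sketch of that behaviour, using in-memory buffers and a made-up two-row frame rather than the project data:

```python
import io

import pandas as pd

# Illustrative frame only; the values are not taken from the saved file.
df = pd.DataFrame(
    {
        "Stake": ["Gateway Stake", "Bonnie Doon Stake"],
        "Number of Donation Bags": [20.0, 15.0],
    }
)

# Default index=True writes the row index as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False writes only the real columns.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_clean = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))  # ['Unnamed: 0', 'Stake', 'Number of Donation Bags']
print(list(reloaded_clean.columns))       # ['Stake', 'Number of Donation Bags']
```

Each save-and-reload cycle with the default setting adds one more such column, which may explain why `Unnamed: 0`, `Unnamed: 0.1`, and `Unnamed: 0.2` all appear above.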