<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# DSI 37 Project 4

<a id='part_i'></a>
[Part II](Part_2-EDA_and_Feature_Engineering.ipynb#part_ii) <br>
[Part III](Part_3-Modelling.ipynb#part_iii)

# Part I: Cleaning


## Contents

[1. Intro](#intro)<br>
[2. Glossary](#glossary)<br>
[3. Imports](#imports)<br>
[4. Code](#code)<br>
[5. Citations](#citations)<br>


<a id='intro'></a>

## 1. Intro

## Problem Statement

As newly appointed members of the Disease And Treatment Agency's division of Societal Cures In Epidemiology and New Creative Engineering (DATA-SCIENCE), we are tasked with developing an efficient plan for deploying pesticides in response to endemic West Nile Virus in the city. The Department of Public Health has established a surveillance and control system, which creates an opportunity to derive insights from the data it collects on mosquito populations. The aim is to allocate resources strategically and minimize the cost of pesticide use while protecting public health and safety. Our data analysis and modeling will inform an effective pesticide deployment plan to combat West Nile Virus in the Windy City.

## Objectives

* The primary objective entails constructing a robust predictive model to facilitate informed decision-making by the city of Chicago regarding the strategic allocation of pesticide spraying for mosquito control.

* Another component of the project involves conducting a comprehensive cost-benefit analysis. This analysis encompasses projecting direct and indirect costs associated with pesticide coverage and evaluating the corresponding benefits yielded by pesticide application.

## Description of this codebook

This is part 1 of our overall code for this project. This part concerns the methods used to clean the data.

<a id='glossary'></a>

## 2. Glossary

### West Nile Virus:
West Nile virus is transmitted primarily by infected mosquitoes. About 20% of infected people develop symptoms, which range from fever to severe neurological illness. 
In 2002, the first human cases were reported in Chicago, leading to the establishment of a surveillance program. Mosquito traps are tested regularly from late spring through fall, and the results guide pesticide spraying for mosquito control. 
This Kaggle competition seeks to predict West Nile virus presence by utilizing weather, location, testing, and spraying data. Efficient outbreak predictions aid resource allocation. 

<b> Primary Data </b>:

The training dataset comprises data from the years 2007, 2009, 2011, and 2013, while the test dataset comprises data from the years 2008, 2010, 2012, and 2014.
To facilitate data organization, mosquito count records exceeding 50 are split into separate entries, ensuring that the number of mosquitoes does not exceed this limit.
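
Because of this 50-mosquito cap, one trap inspection can appear as several rows. A minimal sketch of how such rows could be recombined is shown here, assuming that `Date`, `Trap` and `Species` together identify one inspection; the grouping keys and the toy data are illustrative assumptions, not a step taken in this notebook.

```python
import pandas as pd

# Toy rows mimicking train_df: a single trap/species/date with 112 mosquitoes
# is stored as three rows of at most 50 each (hypothetical values).
toy_train = pd.DataFrame({
    "Date": ["2007-08-01"] * 4,
    "Trap": ["T002", "T002", "T002", "T007"],
    "Species": ["CULEX PIPIENS"] * 4,
    "NumMosquitos": [50, 50, 12, 7],
    "WnvPresent": [0, 1, 0, 0],
})

# Sum the counts and flag the inspection as positive if any split row was positive.
recombined = (
    toy_train
    .groupby(["Date", "Trap", "Species"], as_index=False)
    .agg(NumMosquitos=("NumMosquitos", "sum"), WnvPresent=("WnvPresent", "max"))
)
print(recombined)
```

Taking the maximum of `WnvPresent` treats an inspection as positive when at least one of its split rows tested positive, which is one reasonable reading of the data rather than a documented rule.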

<b> Weather Data </b>:

The weather dataset from NOAA contains weather conditions recorded between 2007 and 2014, specifically during the months of the tests. It is believed that hot and dry conditions are more conducive to West Nile virus than cold and wet conditions. 
The weather data is available for two stations: 
1. CHICAGO O'HARE INTERNATIONAL AIRPORT (Latitude: 41.995, Longitude: -87.933, Elevation: 662 ft. above sea level)
2. CHICAGO MIDWAY INTL ARPT (Latitude: 41.786, Longitude: -87.752, Elevation: 612 ft. above sea level)

<b> Spray Data </b>:

The City of Chicago conducts mosquito spraying efforts, and GIS data for their spraying activities in 2011 and 2013 is provided. Spraying can reduce mosquito populations and potentially eliminate the presence of West Nile virus.

<b> Map Data </b>:

The map files, mapdata_copyright_openstreetmap_contributors.rds and mapdata_copyright_openstreetmap_contributors.txt, are sourced from OpenStreetMap and are primarily intended for use in visualizations.

Acknowledgements:
This competition is sponsored by the Robert Wood Johnson Foundation. Data is provided by the Chicago Department of Public Health.
https://www.kaggle.com/competitions/predict-west-nile-virus

### Data Dictionary
The data dictionary for the four datasets utilized in this project is provided below for reference.

`train_df`

Period: 2007, 2009, 2011, and 2013

|Feature|Type|Description|
|:---|:---:|:---|
|<b>Date</b>|*object*|Date that the WNV test is performed|
|<b>Address</b>|*object*|Approximate address of the trap location; this is sent to the GeoCoder|
|<b>Species</b>|*object*|The species of mosquitos|
|<b>Block</b>| *int64*|Block number of address|
|<b>Street</b>|*object*|Street name|
|<b>Trap</b>|*object*|Id of the trap|
|<b>AddressNumberAndStreet</b>|*object*|Approximate address returned from GeoCoder|
|<b>Latitude</b>|*float64*|Latitude returned from GeoCoder|
|<b>Longitude</b>|*float64*|Longitude returned from GeoCoder|
|<b>AddressAccuracy</b>|*int64*|Accuracy returned from GeoCoder|
|<b>NumMosquitos</b>|*int64*|Number of mosquitoes caught in this trap|
|<b>WnvPresent</b>|*int64*|Whether West Nile Virus was present in these mosquitos. 1 means WNV is present, and 0 means not present. |

<br>

`test_df`

Period: 2008, 2010, 2012, and 2014

|Feature|Type|Description|
|:---|:---:|:---|
|<b>Id</b>|*int64*|The id of the record|
|<b>Date</b>|*object*|Date that the WNV test is performed|
|<b>Address</b>|*object*|Approximate address of the trap location; this is sent to the GeoCoder|
|<b>Species</b>|*object*|The species of mosquitos|
|<b>Block</b>| *int64*|Block number of address|
|<b>Street</b>|*object*|Street name|
|<b>Trap</b>|*object*|Id of the trap|
|<b>AddressNumberAndStreet</b>|*object*|Approximate address returned from GeoCoder|
|<b>Latitude</b>|*float64*|Latitude returned from GeoCoder|
|<b>Longitude</b>|*float64*|Longitude returned from GeoCoder|
|<b>AddressAccuracy</b>|*int64*|Accuracy returned from GeoCoder|

<br>

`weather_df`

Period: 2007, 2008, 2009, 2010, 2011, 2012, 2013, and 2014

|Feature|Type|Description|
|:---|:---:|:---|
|<b>Date</b>|*object*|Date of record|
|<b>Station</b>|*int64*|Station number, either 1 or 2|
|<b>Tmax</b>|*int64*|Maximum temperature in Degrees Fahrenheit|
|<b>Tmin</b>|*int64*|Minimum temperature in Degrees Fahrenheit|
|<b>Tavg</b>|*object*|Average temperature in Degrees Fahrenheit|
|<b>Depart</b>| *object*|Temperature departure from normal in Degrees Fahrenheit|
|<b>DewPoint</b>|*int64*|Average Dew Point in Degrees Fahrenheit|
|<b>WetBulb</b>|*object*|Average Wet Bulb in Degrees Fahrenheit|
|<b>Heat</b>|*object*|Absolute temperature difference of Tavg from base temperature of 65 Degrees Fahrenheit if Tavg < 65|
|<b>Cool</b>|*object*|Absolute temperature difference of Tavg from base temperature of 65 Degrees Fahrenheit if Tavg > 65|
|<b>Sunrise</b>|*object*|Time of Sunrise (Calculated, not observed)|
|<b>Sunset</b>|*object*|Time of Sunset (Calculated, not observed)|
|<b>CodeSum</b>|*object*|Weather Phenomena, refer to CodeSum Legend below|
|<b>Depth</b>|*object*|Snow / ice in inches|
|<b>Water1</b>|*object*|Water equivalent of Depth|
|<b>SnowFall</b>| *object*|Snowfall in inches and tenths|
|<b>PrecipTotal</b>|*object*|Rainfall and melted snow in inches and hundredths|
|<b>StnPressure</b>|*object*|Average station pressure in inches of Hg|
|<b>SeaLevel</b>|*object*|Average sea level pressure in inches of Hg|
|<b>ResultSpeed</b>|*float64*|Resultant wind speed in miles per hour|
|<b>ResultDir</b>|*int64*|Resultant wind direction in Degrees|
|<b>AvgSpeed</b>|*object*|Average wind speed in miles per hour|

<br>
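
The `Heat` and `Cool` definitions in the table above can be illustrated with a short sketch. It assumes `Tavg` has already been converted to integers; the variable names are hypothetical and the notebook itself does not recompute these columns.

```python
import pandas as pd

BASE_TEMP_F = 65  # base temperature from the data dictionary

# Example average temperatures in degrees Fahrenheit
tavg = pd.Series([67, 51, 56, 65], name="Tavg")

# Heat: degrees below the base temperature; zero when Tavg >= 65
heat = (BASE_TEMP_F - tavg).clip(lower=0)
# Cool: degrees above the base temperature; zero when Tavg <= 65
cool = (tavg - BASE_TEMP_F).clip(lower=0)

degree_days = pd.DataFrame({"Tavg": tavg, "Heat": heat, "Cool": cool})
print(degree_days)
```

The first three values match the Station 1 rows in the `weather_df` preview later in this notebook (for example, `Tavg` 51 gives `Heat` 14 and `Cool` 0).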

<b>CodeSum Legend</b>

|code| explanation| 
|:-|:-|
|+FC| TORNADO/WATERSPOUT|
|FC | FUNNEL CLOUD|
|TS | THUNDERSTORM|
|GR | HAIL|
|RA | RAIN|
|DZ | DRIZZLE|
|SN | SNOW|
|SG | SNOW GRAINS|
|GS | SMALL HAIL &/OR SNOW PELLETS|
|PL | ICE PELLETS|
|IC | ICE CRYSTALS|
|FG+ | HEAVY FOG (FG & LE.25 MILES VISIBILITY)|
|FG | FOG|
|BR | MIST|
|UP | UNKNOWN PRECIPITATION|
|HZ | HAZE|
|FU | SMOKE|
|VA | VOLCANIC ASH|
|DU | WIDESPREAD DUST|
|DS | DUSTSTORM|
|PO | SAND/DUST WHIRLS|
|SA | SAND|
|SS | SANDSTORM|
|PY | SPRAY|
|SQ | SQUALL|
|DR | LOW DRIFTING|
|SH | SHOWER|
|FZ | FREEZING|
|MI | SHALLOW|
|PR | PARTIAL|
|BC | PATCHES|
|BL | BLOWING|
|VC | VICINITY|
|- | LIGHT|
|+ | HEAVY|
|"NO SIGN" | MODERATE|
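
A `CodeSum` entry can hold several codes at once, such as `BR HZ`. The sketch below shows one way to turn such entries into indicator columns, assuming the codes within an entry are separated by spaces; the sample values are illustrative and this step is not part of the cleaning in this notebook.

```python
import pandas as pd

# Example CodeSum values: empty, one code, and two codes
code_sum = pd.Series(["", "BR", "BR HZ", "TS RA"], name="CodeSum")

# List of codes per row
tokens = code_sum.str.split()

# One 0/1 column per code that appears anywhere in the series
indicators = code_sum.str.get_dummies(sep=" ")
print(indicators)
```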

`spray_df`

Period: 2011 and 2013

|Feature|Type|Description|
|:---|:---:|:---|
|<b>Date</b>|*object*|Date of the spray|
|<b>Time</b>|*object*|Time of the spray|
|<b>Latitude</b>|*float64*|Latitude returned from GeoCoder|
|<b>Longitude</b>|*float64*|Longitude returned from GeoCoder|

<br>

<a id='imports'></a>

## 3. Imports (Libraries)

In [2]:
# for cleaning

import pandas as pd
import numpy as np

<a id='code'></a>

## 4. Code

### 1. Load train and test datasets

The train.csv and test.csv files were downloaded from https://www.kaggle.com/competitions/predict-west-nile-virus

#### Import Datasets

In [3]:
train_df = pd.read_csv('../data/input/train.csv')
test_df = pd.read_csv('../data/input/test.csv')
weather_df = pd.read_csv('../data/input/weather.csv')
spray_df = pd.read_csv('../data/input/spray.csv')

#### View full dataframe information

In [4]:
pd.set_option('display.max_colwidth',None)
pd.set_option('display.max_columns',100)

#### View Dataframes

In [5]:
list_df = [train_df,test_df,weather_df,spray_df]
list_df_name = ['train_df','test_df','weather_df','spray_df']

In [6]:
for name, df in zip(list_df_name, list_df):
    print()
    print(name)
    display(df.head())


train_df


Unnamed: 0,Date,Address,Species,Block,Street,Trap,AddressNumberAndStreet,Latitude,Longitude,AddressAccuracy,NumMosquitos,WnvPresent
0,2007-05-29,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX PIPIENS/RESTUANS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9,1,0
1,2007-05-29,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX RESTUANS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9,1,0
2,2007-05-29,"6200 North Mandell Avenue, Chicago, IL 60646, USA",CULEX RESTUANS,62,N MANDELL AVE,T007,"6200 N MANDELL AVE, Chicago, IL",41.994991,-87.769279,9,1,0
3,2007-05-29,"7900 West Foster Avenue, Chicago, IL 60656, USA",CULEX PIPIENS/RESTUANS,79,W FOSTER AVE,T015,"7900 W FOSTER AVE, Chicago, IL",41.974089,-87.824812,8,1,0
4,2007-05-29,"7900 West Foster Avenue, Chicago, IL 60656, USA",CULEX RESTUANS,79,W FOSTER AVE,T015,"7900 W FOSTER AVE, Chicago, IL",41.974089,-87.824812,8,4,0



test_df


Unnamed: 0,Id,Date,Address,Species,Block,Street,Trap,AddressNumberAndStreet,Latitude,Longitude,AddressAccuracy
0,1,2008-06-11,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX PIPIENS/RESTUANS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9
1,2,2008-06-11,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX RESTUANS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9
2,3,2008-06-11,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX PIPIENS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9
3,4,2008-06-11,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX SALINARIUS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9
4,5,2008-06-11,"4100 North Oak Park Avenue, Chicago, IL 60634, USA",CULEX TERRITANS,41,N OAK PARK AVE,T002,"4100 N OAK PARK AVE, Chicago, IL",41.95469,-87.800991,9



weather_df


Unnamed: 0,Station,Date,Tmax,Tmin,Tavg,Depart,DewPoint,WetBulb,Heat,Cool,Sunrise,Sunset,CodeSum,Depth,Water1,SnowFall,PrecipTotal,StnPressure,SeaLevel,ResultSpeed,ResultDir,AvgSpeed
0,1,2007-05-01,83,50,67,14,51,56,0,2,0448,1849,,0,M,0.0,0.0,29.1,29.82,1.7,27,9.2
1,2,2007-05-01,84,52,68,M,51,57,0,3,-,-,,M,M,M,0.0,29.18,29.82,2.7,25,9.6
2,1,2007-05-02,59,42,51,-3,42,47,14,0,0447,1850,BR,0,M,0.0,0.0,29.38,30.09,13.0,4,13.4
3,2,2007-05-02,60,43,52,M,42,47,13,0,-,-,BR HZ,M,M,M,0.0,29.44,30.08,13.3,2,13.4
4,1,2007-05-03,66,46,56,2,40,48,9,0,0446,1851,,0,M,0.0,0.0,29.39,30.12,11.7,7,11.9



spray_df


Unnamed: 0,Date,Time,Latitude,Longitude
0,2011-08-29,6:56:58 PM,42.391623,-88.089163
1,2011-08-29,6:57:08 PM,42.391348,-88.089163
2,2011-08-29,6:57:18 PM,42.391022,-88.089157
3,2011-08-29,6:57:28 PM,42.390637,-88.089158
4,2011-08-29,6:57:38 PM,42.39041,-88.088858


In [11]:
spray_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14835 entries, 0 to 14834
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       14835 non-null  object 
 1   Time       14251 non-null  object 
 2   Latitude   14835 non-null  float64
 3   Longitude  14835 non-null  float64
dtypes: float64(2), object(2)
memory usage: 463.7+ KB


#### Check null values

In [9]:
for name, df in zip(list_df_name, list_df):
    print()
    print(name)
    print(df.isnull().sum())


train_df
Date                      0
Address                   0
Species                   0
Block                     0
Street                    0
Trap                      0
AddressNumberAndStreet    0
Latitude                  0
Longitude                 0
AddressAccuracy           0
NumMosquitos              0
WnvPresent                0
dtype: int64

test_df
Id                        0
Date                      0
Address                   0
Species                   0
Block                     0
Street                    0
Trap                      0
AddressNumberAndStreet    0
Latitude                  0
Longitude                 0
AddressAccuracy           0
dtype: int64

weather_df
Station        0
Date           0
Tmax           0
Tmin           0
Tavg           0
Depart         0
DewPoint       0
WetBulb        0
Heat           0
Cool           0
Sunrise        0
Sunset         0
CodeSum        0
Depth          0
Water1         0
SnowFall       0
PrecipTotal    0
StnPressur

The `Time` column of `spray_df` has 584 null values (14,251 non-null out of 14,835 rows). Because this is a small fraction of `spray_df`, the next step is either to drop the rows with null values or to discard the column entirely if it proves uninformative for later analysis.
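
The two options can be sketched on a toy frame as follows. The rows are made up for illustration, and which option is used is decided later in the project, not here.

```python
import numpy as np
import pandas as pd

# Toy spray records with missing Time values (hypothetical rows)
toy_spray = pd.DataFrame({
    "Date": ["2011-08-29", "2011-08-29", "2011-09-07"],
    "Time": ["6:56:58 PM", np.nan, np.nan],
    "Latitude": [42.391623, 42.391348, 41.980000],
    "Longitude": [-88.089163, -88.089163, -87.790000],
})

# Share of rows with no Time value
missing_share = toy_spray["Time"].isna().mean()

# Option A: keep the column and drop the incomplete rows
spray_rows_dropped = toy_spray.dropna(subset=["Time"])

# Option B: keep every row and drop the column
spray_col_dropped = toy_spray.drop(columns=["Time"])

print(f"{missing_share:.1%} of rows have no Time value")
```

For the real `spray_df`, 584 of 14,835 rows is roughly 3.9%, so option A loses little data, while option B keeps every spray location at the cost of the timestamp.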

#### Check for missing values 'M', trace values 'T' and '-' in `weather_df`

In [9]:
# convert weather_df 'Date' to datetime
weather_df['Date'] = pd.to_datetime(weather_df['Date'])

In [10]:
# set index as 'Date'
weather_df = weather_df.set_index('Date')
# ensure data is sorted by 'Date'
weather_df = weather_df.sort_index()

In [11]:
weather_df[weather_df=='M'].count()

Station           0
Tmax              0
Tmin              0
Tavg             11
Depart         1472
DewPoint          0
WetBulb           4
Heat             11
Cool             11
Sunrise           0
Sunset            0
CodeSum           0
Depth          1472
Water1         2944
SnowFall       1472
PrecipTotal       2
StnPressure       4
SeaLevel          9
ResultSpeed       0
ResultDir         0
AvgSpeed          3
dtype: int64

In [12]:
weather_df[weather_df=='-'].count()

Station           0
Tmax              0
Tmin              0
Tavg              0
Depart            0
DewPoint          0
WetBulb           0
Heat              0
Cool              0
Sunrise        1472
Sunset         1472
CodeSum           0
Depth             0
Water1            0
SnowFall          0
PrecipTotal       0
StnPressure       0
SeaLevel          0
ResultSpeed       0
ResultDir         0
AvgSpeed          0
dtype: int64
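
The separate counts above can also be gathered into a single table. This is a minimal sketch on a toy frame; the sentinel list is taken from the checks in this section and the data is made up.

```python
import pandas as pd

# Toy weather columns containing the placeholder strings used in weather.csv
toy_weather = pd.DataFrame({
    "Tavg": ["67", "M", "51"],
    "Sunrise": ["0448", "-", "0447"],
    "PrecipTotal": ["0.00", "  T", "M"],
})

sentinels = ["M", "-", "  T"]  # missing, not reported, trace

# One column per sentinel, one row per weather column
summary = pd.DataFrame({s: (toy_weather == s).sum() for s in sentinels})
print(summary)
```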

#### Fill rows containing 'M' (missing) and '-' with data found from other station

In [13]:
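# change 'M' and '-' values to NaN
# note: Heat, Cool, Water1, SeaLevel and AvgSpeed also contain 'M' (see counts above)
# but are not in list_m, so their 'M' values are left unchanged here
list_m = ['Tavg','PrecipTotal','WetBulb','Depart','Depth', 'SnowFall','StnPressure']
list_missing = ['Sunrise','Sunset']
for i in list_m:
    weather_df[i] = weather_df[i].replace('M', np.nan)
for j in list_missing:
    weather_df[j] = weather_df[j].replace('-', np.nan)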

In [14]:
weather_df.isnull().sum()

Station           0
Tmax              0
Tmin              0
Tavg             11
Depart         1472
DewPoint          0
WetBulb           4
Heat              0
Cool              0
Sunrise        1472
Sunset         1472
CodeSum           0
Depth          1472
Water1            0
SnowFall       1472
PrecipTotal       2
StnPressure       4
SeaLevel          0
ResultSpeed       0
ResultDir         0
AvgSpeed          0
dtype: int64

In [15]:
# try to fill missing values from the other station
# (assumes each date has a Station 1 row immediately followed by its Station 2 row)

# reset index
weather_df = weather_df.reset_index()

for column in list_m + list_missing:
    for index, row in weather_df.iterrows():
        if pd.isna(row[column]) and row['Station'] == 1:
            weather_df.at[index, column] = weather_df.at[index + 1, column]
        elif pd.isna(row[column]) and row['Station'] == 2:
            weather_df.at[index, column] = weather_df.at[index - 1, column]

# set index as 'Date'
weather_df = weather_df.set_index('Date')
# ensure data is sorted by 'Date'
weather_df = weather_df.sort_index()

In [16]:
weather_df.isnull().sum()

Station        0
Tmax           0
Tmin           0
Tavg           0
Depart         0
DewPoint       0
WetBulb        0
Heat           0
Cool           0
Sunrise        0
Sunset         0
CodeSum        0
Depth          0
Water1         0
SnowFall       0
PrecipTotal    0
StnPressure    2
SeaLevel       0
ResultSpeed    0
ResultDir      0
AvgSpeed       0
dtype: int64
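
The same cross-station fill can be expressed without row-by-row iteration. The sketch below groups rows by `Date` and fills each gap from the other row in the group; it is an alternative under the assumption that each date has one row per station, not the method used above. The helper name `fill_from_other_station` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

def fill_from_other_station(df, columns):
    """Fill NaN in `columns` with the value recorded on the same Date by the other station."""
    out = df.copy()
    out[columns] = out.groupby("Date")[columns].transform(lambda s: s.ffill().bfill())
    return out

toy = pd.DataFrame({
    "Date": ["2007-05-01", "2007-05-01", "2007-05-02", "2007-05-02"],
    "Station": [1, 2, 1, 2],
    "Tavg": [67.0, np.nan, np.nan, 52.0],
    "StnPressure": [29.10, 29.18, np.nan, np.nan],
})

filled = fill_from_other_station(toy, ["Tavg", "StnPressure"])
print(filled)
```

In the toy data, `StnPressure` is missing at both stations on the second date, so it stays missing after the fill. A gap of that kind would explain why some `StnPressure` values remain null above, although the notebook does not confirm the cause.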

#### Fill rows that are still missing using forward fill

In [17]:
# forward fill using ffill
weather_df = weather_df.ffill(axis=0)

In [18]:
weather_df.isnull().sum()

Station        0
Tmax           0
Tmin           0
Tavg           0
Depart         0
DewPoint       0
WetBulb        0
Heat           0
Cool           0
Sunrise        0
Sunset         0
CodeSum        0
Depth          0
Water1         0
SnowFall       0
PrecipTotal    0
StnPressure    0
SeaLevel       0
ResultSpeed    0
ResultDir      0
AvgSpeed       0
dtype: int64
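
A plain forward fill takes the value from the previous row, which may belong to the other station. If carrying values within the same station is preferred, a grouped forward fill is one option. The sketch below compares the two on toy data; it illustrates an alternative and is not the approach used in this notebook.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Date": pd.to_datetime(["2007-05-01", "2007-05-01", "2007-05-02", "2007-05-02"]),
    "Station": [1, 2, 1, 2],
    "StnPressure": [29.10, 29.18, np.nan, np.nan],
}).sort_values(["Date", "Station"])

# Plain forward fill: each gap takes the value of the row directly above it
plain = toy["StnPressure"].ffill()

# Grouped forward fill: each gap takes the most recent value from the same station
per_station = toy.groupby("Station")["StnPressure"].ffill()

print(pd.DataFrame({"plain": plain, "per_station": per_station}))
```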

#### Convert '  T' (trace) to a numerical value

In [19]:
weather_df[weather_df=='  T'].count()

Station          0
Tmax             0
Tmin             0
Tavg             0
Depart           0
DewPoint         0
WetBulb          0
Heat             0
Cool             0
Sunrise          0
Sunset           0
CodeSum          0
Depth            0
Water1           0
SnowFall        24
PrecipTotal    318
StnPressure      0
SeaLevel         0
ResultSpeed      0
ResultDir        0
AvgSpeed         0
dtype: int64

In [20]:
weather_df['PrecipTotal'].unique()

array(['0.00', '  T', '0.13', '0.02', '0.38', '0.60', '0.14', '0.07',
       '0.11', '0.09', '1.01', '0.28', '0.04', '0.08', '0.01', '0.53',
       '0.19', '0.21', '0.32', '0.39', '0.31', '0.42', '0.27', '0.16',
       '0.58', '0.93', '0.05', '0.34', '0.15', '0.35', '0.40', '0.66',
       '0.30', '0.24', '0.43', '1.55', '0.92', '0.89', '0.17', '0.03',
       '1.43', '0.97', '0.26', '1.31', '0.06', '0.46', '0.29', '0.23',
       '0.41', '0.45', '0.83', '1.33', '0.91', '0.48', '0.37', '0.88',
       '2.35', '1.96', '0.20', '0.25', '0.18', '0.67', '0.36', '0.33',
       '1.28', '0.74', '0.76', '0.71', '0.95', '1.46', '0.12', '0.52',
       '0.64', '0.22', '1.24', '0.72', '0.73', '0.65', '1.61', '1.22',
       '0.50', '1.05', '2.43', '0.59', '2.90', '2.68', '1.23', '0.62',
       '6.64', '3.07', '1.44', '1.75', '0.82', '0.80', '0.86', '0.63',
       '0.55', '1.03', '0.70', '1.73', '1.38', '0.44', '1.14', '1.07',
       '3.97', '0.87', '0.78', '1.12', '0.68', '0.10', '0.61', '0.54',
       

#### Change '  T' to float

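According to the Northeast Regional Climate Center, a trace is less than 0.01" of precipitation or less than 0.1" of snow. `'  T'` is therefore replaced with half of each threshold: 0.005 for `PrecipTotal` and 0.05 for `SnowFall`.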

In [21]:
# replace '  T' with 0.005 for 'PrecipTotal'
weather_df['PrecipTotal'] = weather_df['PrecipTotal'].replace('  T','0.005')
# replace '  T' with 0.05 for 'SnowFall'
weather_df['SnowFall'] = weather_df['SnowFall'].replace('  T','0.05') 
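
The replacement above stores the trace value as a string, so the column keeps the *object* dtype until it is written to CSV. If a numeric column is needed directly, the conversion could look like the sketch below. It assumes every remaining value parses as a number, and the constant name `TRACE_PRECIP_IN` is hypothetical.

```python
import pandas as pd

TRACE_PRECIP_IN = 0.005  # stand-in for a trace of precipitation, in inches

# Example PrecipTotal values as they appear in weather.csv
precip = pd.Series(["0.00", "  T", "0.13"], name="PrecipTotal")

# Replace the trace marker, then convert the whole column to float
precip_numeric = pd.to_numeric(precip.replace("  T", str(TRACE_PRECIP_IN)))
print(precip_numeric)
```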

In [22]:
# reset index
weather_df = weather_df.reset_index()

In [23]:
# set 'Date' back to object
weather_df['Date'] = weather_df['Date'].dt.strftime('%Y-%m-%d')

#### Save cleaned `weather_df` as csv file

In [24]:
weather_df.to_csv('../data/input/cleaned_weather.csv',index=False)

<a id='citations'></a>

## 5. Citations

Drakou, K., Nikolaou, T., Vasquez, M. I., Petrić, D., Michaelakis, A., Kapranas, A., Papatheodoulou, A., & Koliou, M. (2020). The Effect of Weather Variables on Mosquito Activity: A Snapshot of the Main Point of Entry of Cyprus. International Journal of Environmental Research and Public Health, 17(4), 1403. Retrieved July 4, 2023, from https://doi.org/10.3390/ijerph17041403

Lebl, K., Brugger, K., & Rubel, F. (2013). Predicting Culex pipiens/restuans population dynamics by interval lagged weather data. Parasites & Vectors, 6(1). Retrieved July 4, 2023, from https://doi.org/10.1186/1756-3305-6-129

Thornton, S. (2021). Chicago Turns to Predictive Analytics to Map West Nile Threat. GovTech. Retrieved July 4, 2023, from https://www.govtech.com/analytics/chicago-turns-to-predictive-analytics-to-map-west-nile-threat.html

What Does It All Mean? (2020, February 24). Northeast Regional Climate Center. Retrieved July 4, 2023, from https://www.nrcc.cornell.edu/services/blog/2020/02/24/index.html

## <b> End of Part I</b> <br>

The next codebook will cover EDA and feature engineering. <br>

[Part II](Part_2-EDA_and_Feature_Engineering.ipynb#part_ii) <br>
[Part III](Part_3-Modelling.ipynb#part_iii)