# Predicting Forest Fires in Algeria

## Introduction

Climate change has increased the risk and extent of forest fires in many parts of the world. Forest fires have devastating effects, including the destruction of wildlife habitat and animal life, the emission of toxic gases into the atmosphere, damage to infrastructure, and the potential loss of human lives. Given these consequences, an early warning system that helps government agencies forecast forest fires is important for protecting both people and the environment.

<p float="left">
  <img src = "https://www.lifeinsuranceinternational.com/wp-content/uploads/sites/8/2019/02/shutterstock_710588224.jpg" width = "400"/>
  <img src = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQkiv3dvAtycEW-ZvEomKQXvL38bNuSKx1sOQ&usqp=CAU" width = "350" height = 265/>
  <img src = "https://i.natgeofe.com/n/77462492-ea41-41fe-9c07-296dc330181f/80133.jpg" width = "355"/>
</p>

With that motivation in mind, this project aims to develop a k-nearest neighbors machine learning model that predicts whether or not a forest fire will occur based on different weather metrics. Our dataset is obtained from <https://archive.ics.uci.edu/ml/datasets/Algerian+Forest+Fires+Dataset++#>. It combines observations from two regions of Algeria: the Bejaia region in the northeast and the Sidi Bel-Abbes region in the northwest.

The dataset contains 14 columns:
* Day (day)
* Month (month): June to September
* Year (year): 2012
* Temperature: maximum temperature at noon, in degrees Celsius (range: 22 - 40)
* Relative humidity (RH): relative humidity in % (range: 21 - 90)
* Wind speed (Ws): wind speed in km/h (range: 6 - 29)
* Rain amount (Rain): total rain in a day, in millimeters (mm) (range: 0 - 16.8)
* Fine Fuel Moisture Code (FFMC): index from the FWI system (range: 28.6 - 92.5)
* Duff Moisture Code (DMC): index from the FWI system (range: 1.1 - 65.9)
* Drought Code (DC): index from the FWI system (range: 7 - 220.4)
* Initial Spread Index (ISI): index from the FWI system (range: 0 - 18.5)
* Buildup Index (BUI): index from the FWI system (range: 1.1 - 68)
* Fire Weather Index (FWI): index from the FWI system (range: 0 - 31.1)
* Classes: fire/not fire
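
The documented ranges above can be checked against the data once it is loaded. The following is a minimal sketch of one way to do that, not part of the original analysis: it uses only the six rows that appear in the printed output further down (and only five of the columns), and the names `sample_rows`, `documented`, and `range_check` are hypothetical.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Six rows copied from the notebook's printed output (first and last three rows);
# only a subset of the columns is kept to stay short.
sample_rows <- tribble(
  ~Temperature, ~RH, ~Ws, ~Rain, ~FFMC,
  29, 57, 18,  0.0, 65.7,
  29, 61, 13,  1.3, 64.4,
  26, 82, 22, 13.1, 47.1,
  27, 87, 29,  0.5, 45.9,
  24, 54, 18,  0.1, 79.7,
  24, 64, 15,  0.2, 67.3
)

# Ranges as stated in the column list above.
documented <- tribble(
  ~column,       ~doc_min, ~doc_max,
  "Temperature", 22,       40,
  "RH",          21,       90,
  "Ws",          6,        29,
  "Rain",        0,        16.8,
  "FFMC",        28.6,     92.5
)

# Compute the observed minimum and maximum of every column,
# reshape to one row per column, and compare with the documented range.
range_check <- sample_rows %>%
  summarise(across(everything(), list(lo = min, hi = max))) %>%
  pivot_longer(everything(),
               names_to = c("column", ".value"),
               names_sep = "_") %>%
  left_join(documented, by = "column") %>%
  mutate(within_range = lo >= doc_min & hi <= doc_max)

range_check
```

Run on the full data frame instead of `sample_rows`, the same pipeline would flag any column whose values fall outside the range listed above.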

In [30]:
library(tidyverse)
library(repr)
library(readxl)
library(RColorBrewer)
library(forcats)
library(tidymodels)
options(repr.matrix.max.rows = 6) #limits output of dataframes to 6 rows

In [31]:
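# Both regions are stored in the same CSV file, one table after the other
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/00547/Algerian_forest_fires_dataset_UPDATE.csv"
# Skip the line above the header and read the 122 Bejaia observations
Bej_data <- read_csv(url, skip = 1, n_max = 122)
# Skip ahead to the second header; drop incomplete rows, then convert the columns read as text
SB_data <- read_csv(url, skip = 126) %>% 
    na.omit() %>% 
    mutate(DC = as.numeric(DC),
           FWI = as.numeric(FWI))

# Stack the two regions and fix the column types
forest_fires <- rbind(Bej_data, SB_data) %>% 
    mutate(day = as.numeric(day),
           month = as.numeric(month),
           Classes = as_factor(Classes))
# Label each row with its region (121 Sidi-Bel Abbes rows remain after na.omit())
forest_fires <- cbind(region = c(rep("Bejaia", 122), rep("Sidi-Bel Abbes", 121)), forest_fires)

forest_fires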

Parsed with column specification:
cols(
  day = col_character(),
  month = col_character(),
  year = col_double(),
  Temperature = col_double(),
  RH = col_double(),
  Ws = col_double(),
  Rain = col_double(),
  FFMC = col_double(),
  DMC = col_double(),
  DC = col_double(),
  ISI = col_double(),
  BUI = col_double(),
  FWI = col_double(),
  Classes = col_character()
)

Parsed with column specification:
cols(
  day = col_character(),
  month = col_character(),
  year = col_double(),
  Temperature = col_double(),
  RH = col_double(),
  Ws = col_double(),
  Rain = col_double(),
  FFMC = col_double(),
  DMC = col_double(),
  DC = col_character(),
  ISI = col_double(),
  BUI = col_double(),
  FWI = col_character(),
  Classes = col_character()
)

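| region | day | month | year | Temperature | RH | Ws | Rain | FFMC | DMC | DC | ISI | BUI | FWI | Classes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<fct>` |
| Bejaia | 1 | 6 | 2012 | 29 | 57 | 18 | 0.0 | 65.7 | 3.4 | 7.6 | 1.3 | 3.4 | 0.5 | not fire |
| Bejaia | 2 | 6 | 2012 | 29 | 61 | 13 | 1.3 | 64.4 | 4.1 | 7.6 | 1.0 | 3.9 | 0.4 | not fire |
| Bejaia | 3 | 6 | 2012 | 26 | 82 | 22 | 13.1 | 47.1 | 2.5 | 7.1 | 0.3 | 2.7 | 0.1 | not fire |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| Sidi-Bel Abbes | 28 | 9 | 2012 | 27 | 87 | 29 | 0.5 | 45.9 | 3.5 | 7.9 | 0.4 | 3.4 | 0.2 | not fire |
| Sidi-Bel Abbes | 29 | 9 | 2012 | 24 | 54 | 18 | 0.1 | 79.7 | 4.3 | 15.2 | 1.7 | 5.1 | 0.7 | not fire |
| Sidi-Bel Abbes | 30 | 9 | 2012 | 24 | 64 | 15 | 0.2 | 67.3 | 3.8 | 16.5 | 1.2 | 4.8 | 0.5 | not fire |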
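
The loading code attaches the region label by repeating each name a fixed number of times, which only works while the row counts stay at exactly 122 and 121. As a hedged alternative, the sketch below lets `bind_rows()` derive the label from the names of its inputs, so no counts need to be hard-coded. The small tables `bejaia_rows` and `sidi_rows` are hypothetical stand-ins built from rows shown in the output above, with most columns omitted.

```r
library(dplyr)
library(tibble)

# Stand-ins for the two per-region tables (a few columns of the rows printed above).
bejaia_rows <- tribble(
  ~day, ~month, ~Temperature, ~Classes,
  1,    6,      29,           "not fire",
  2,    6,      29,           "not fire",
  3,    6,      26,           "not fire"
)
sidi_rows <- tribble(
  ~day, ~month, ~Temperature, ~Classes,
  28,   9,      27,           "not fire",
  29,   9,      24,           "not fire",
  30,   9,      24,           "not fire"
)

# The argument names become the values of the new `region` column,
# which bind_rows() places first.
labelled <- bind_rows(
  "Bejaia" = bejaia_rows,
  "Sidi-Bel Abbes" = sidi_rows,
  .id = "region"
)

# Row counts per region, as a quick sanity check.
count(labelled, region)
```

With this approach the labels stay correct even if a different number of rows is dropped by `na.omit()`.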
