# Algerian Forest Fires Dataset

### Variable Information
1. Date: (DD/MM/YYYY) day, month (June to September), year (2012)
Weather data observations
2. Temp: temperature at noon (maximum temperature) in degrees Celsius: 22 to 42
3. RH: relative humidity in %: 21 to 90
4. Ws: wind speed in km/h: 6 to 29
5. Rain: total for the day in mm: 0 to 16.8
FWI Components
6. Fine Fuel Moisture Code (FFMC) index from the FWI system: 28.6 to 92.5
7. Duff Moisture Code (DMC) index from the FWI system: 1.1 to 65.9
8. Drought Code (DC) index from the FWI system: 7 to 220.4
9. Initial Spread Index (ISI) index from the FWI system: 0 to 18.5
10. Buildup Index (BUI) index from the FWI system: 1.1 to 68
11. Fire Weather Index (FWI): 0 to 31.1
12. Classes: two classes, namely "Fire" and "not Fire"

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

In [None]:
dataset=pd.read_csv('Algerian_forest_fires_dataset_UPDATE.csv', header=1)

In [None]:
dataset.head()

In [None]:
dataset.info()

## Data Cleaning

In [None]:
dataset[dataset.isnull().any(axis=1)]

The file contains two regional datasets stacked one after the other, with the second one starting at row index 122. A new `Region` column can be added to tell them apart:
1. "Bejaia Region dataset"
2. "Sidi-Bel Abbes Region dataset"
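As a minimal sketch of the labelling step described above, the toy frame below stands in for the raw data; `toy`, `split`, and the temperature values are hypothetical and only illustrate the idea. Note that `.loc` slices by label and includes both endpoints, so the first slice stops one row before the boundary.

```python
import pandas as pd

# Toy frame standing in for the raw data (values are made up).
toy = pd.DataFrame({"Temperature": [29, 31, 30, 33, 35, 34]})
split = 3  # hypothetical boundary; the notebook uses 122

# .loc slicing is inclusive on both ends, so stop at split - 1.
toy.loc[: split - 1, "Region"] = 0
toy.loc[split:, "Region"] = 1

# The column is created as float (it briefly holds NaN), so cast it back.
toy["Region"] = toy["Region"].astype(int)

print(toy)
```

Writing both slices without an overlap keeps the intent explicit, even though a later assignment would overwrite the shared row anyway.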

Add the new `Region` column

In [None]:
# .loc slicing is inclusive, so rows 0-121 are Bejaia (0) and rows 122+ are Sidi-Bel Abbes (1)
dataset.loc[:121, "Region"] = 0
dataset.loc[122:, "Region"] = 1
df = dataset

In [None]:
df.head()

In [None]:
df.info()

In [None]:
df[['Region']]=df[['Region']].astype(int)

In [None]:
df.head()

In [None]:
df.tail()

In [None]:
df.isnull().sum()

In [None]:
df = df.dropna().reset_index(drop=True)

In [None]:
df.isnull().sum()

In [None]:
df.iloc[[122]]

In [None]:
# Drop row 122, which at this point should be the repeated header row of the second region
df = df.drop(122).reset_index(drop=True)

In [None]:
df.iloc[[122]]

In [None]:
df.columns

In [None]:
# Strip stray whitespace from the column names
df.columns = df.columns.str.strip()
df.columns

In [None]:
df.info()

Convert the required columns to integer datatype

In [None]:
df[['day', 'month', 'year', 'Temperature', 'RH', 'Ws']] = df[['day', 'month', 'year', 'Temperature', 'RH', 'Ws']].astype(int)

In [None]:
df.info()

In [None]:
df.head()

Convert the remaining object columns to float datatype
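The same conversion can be sketched on a small stand-in frame. This is only an illustration under stated assumptions: `toy` and its values are made up, and selecting columns with `is_numeric_dtype` is an alternative to comparing the dtype with `'O'`, not the notebook's own approach.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Toy stand-in: numeric values stored as strings, plus a text label column.
toy = pd.DataFrame({
    "Rain": ["0.0", "1.3", "13.1"],
    "FFMC": ["65.7", "64.4", "47.1"],
    "RH": [57, 61, 82],
    "Classes": ["not fire", "not fire", "not fire"],
})

# Columns that are not yet numeric, excluding the label column.
to_convert = [
    col for col in toy.columns
    if not is_numeric_dtype(toy[col]) and col != "Classes"
]

# astype(float) raises ValueError if any value cannot be parsed as a number.
toy[to_convert] = toy[to_convert].astype(float)

print(toy.dtypes)
```

Because `astype(float)` fails loudly on unparseable values, a stray header row or text fragment left in the data would surface here rather than silently becoming a wrong number.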

In [None]:
objects = [feature for feature in df.columns if df[feature].dtypes == 'O']

In [None]:
objects

In [None]:
for i in objects:
    if i != 'Classes':
        df[i] = df[i].astype(float)

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
df.head()

In [None]:
df.to_csv('Algerian_forest_fires_cleaned_dataset.csv', index=False)

## Exploratory data analysis

In [None]:
df_copy = df.drop(['day', 'month', 'year'], axis=1)

In [None]:
df_copy.head()

In [None]:
df_copy['Classes'].value_counts()

In [None]:
df_copy['Classes']=np.where(df_copy['Classes'].str.contains('not fire'), 0,1)

In [None]:
df_copy['Classes'].value_counts()

In [None]:
percentage = df_copy['Classes'].value_counts(normalize=True)*100
percentage

In [None]:
classlabel = ['fire', 'not fire']
plt.figure(figsize=(12,7))
plt.pie(percentage, labels=classlabel, autopct='%1.1f%%')
plt.title("Pie Chart for classes")
plt.show()

In [None]:
df_copy.corr()

In [None]:
sns.heatmap(df_copy.corr(), annot=True)

In [None]:
sns.boxplot(df_copy['FWI'], color='green')

In [None]:
# Monthly fire analysis for the Sidi-Bel Abbes region (Region == 1)
dftemp = df.loc[df['Region'] == 1]
plt.subplots(figsize=(13, 6))
sns.set_style('whitegrid')
# Plot the filtered frame, not the full dataset, so the chart matches its title
sns.countplot(x='month', hue='Classes', data=dftemp)
plt.ylabel('Number of Fires', weight='bold')
plt.xlabel('Months', weight='bold')
plt.title("Fire Analysis of Sidi-Bel Abbes Region", weight="bold")
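
The monthly counts plotted above can also be tabulated directly. The sketch below uses a made-up `toy` frame, and it assumes the raw `Classes` labels may carry padding whitespace (the `str.contains('not fire')` match used earlier hints at this), so the labels are stripped before counting.

```python
import pandas as pd

# Toy stand-in for the cleaned data; the values are made up for illustration.
toy = pd.DataFrame({
    "month": [6, 6, 7, 7, 7, 8],
    "Region": [1, 1, 1, 1, 0, 1],
    "Classes": ["fire   ", "not fire ", "fire", "fire ", "fire", "not fire"],
})

# Keep one region and normalise the labels before counting.
region = toy.loc[toy["Region"] == 1].copy()
region["Classes"] = region["Classes"].str.strip()

# Rows are months, columns are class labels, cells are day counts.
counts = pd.crosstab(region["month"], region["Classes"])

print(counts)
```

Without the `str.strip()` step, labels that differ only in trailing spaces would be counted as separate classes, both in this table and in the `hue` of a count plot.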