# ***California Fire Detection***




---


In [1]:
!pip install plotly
!pip install geopandas==0.3.0
!pip install pyshp==1.2.10
!pip install shapely==1.6.3
!pip install geopandas --upgrade
!pip install plotly-geo

Collecting geopandas==0.3.0
  Downloading geopandas-0.3.0-py2.py3-none-any.whl (888 kB)
Collecting pyproj
  Downloading pyproj-3.2.1-cp37-cp37m-manylinux2010_x86_64.whl (6.3 MB)
Collecting fiona
  Downloading Fiona-1.8.20-cp37-cp37m-manylinux1_x86_64.whl (15.4 MB)
Collecting munch
  Downloading munch-2.5.0-py2.py3-none-any.whl (10 kB)
Collecting cligj>=0.5
  Downloading cligj-0.7.2-py3-none-any.whl (7.1 kB)
Collecting click-plugins>=1.0
  Downloading click_plugins-1.1.1-py2.py3-none-any.whl (7.5 kB)
Installing collected packages: munch, cligj, click-plugins, pyproj, fiona, geopandas
Successfully installed click-plugins-1.1.1 cligj-0.7.2 fiona-1.8.20 geopandas-0.3.0 munch-2.5.0 pyproj-3.2.1
Collecting pyshp==1.2.10
  Downloading pyshp-1.2.10.tar.gz (176 kB)

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.impute import SimpleImputer

import folium
import math

import plotly.figure_factory as ff
import plotly.express as px

California has some of the deadliest and most destructive wildfire seasons. The dataset lists the wildfires that occurred in California between 2013 and 2019. For each fire it records the location (county name, latitude, and longitude) and when the fire started.

This data helps generate insights into which locations in California are under fire threat, when wildfires usually occur, and how frequent and devastating they are.

## **Import Data**

In [3]:
dataset = pd.read_csv('California_Fire_Cleaned.csv')
dataset.head(15)

Unnamed: 0.1,Unnamed: 0,AcresBurned,AdminUnit,ArchiveYear,CalFireIncident,Counties,CountyIds,Extinguished,Fatalities,Featured,Final,Injuries,Latitude,Location,Longitude,MajorIncident,Name,PercentContained,Started,Status,Active Time,FIPS
0,0,257314.0,Stanislaus National Forest/Yosemite National Park,2013,True,Tuolumne,55,2013-09-06 18:30:00,,False,True,,37.857,3 miles east of Groveland along Hwy 120,-120.086,False,Rim Fire,100.0,2013-08-17 15:25:00,Finalized,20 days 03:05:00,6109.0
1,1,30274.0,USFS Angeles National Forest/Los Angeles Count...,2013,True,Los Angeles,19,2013-06-08 18:30:00,,False,True,,34.585595,Angeles National Forest,-118.423176,False,Powerhouse Fire,100.0,2013-05-30 15:28:00,Finalized,9 days 03:02:00,6037.0
2,2,27531.0,CAL FIRE Riverside Unit / San Bernardino Natio...,2013,True,Riverside,33,2013-07-30 18:00:00,,False,True,,33.7095,Hwy 243 & Hwy 74 near Mountain Center,-116.72885,False,Mountain Fire,100.0,2013-07-15 13:43:00,Finalized,15 days 04:17:00,6065.0
3,3,27440.0,Tahoe National Forest,2013,False,Placer,31,2013-08-30 08:00:00,,False,True,,39.12,"Deadwood Ridge, northeast of Foresthill",-120.65,False,American Fire,100.0,2013-08-10 16:30:00,Finalized,19 days 15:30:00,6061.0
4,4,24251.0,Ventura County Fire/CAL FIRE,2013,True,Ventura,56,2013-05-11 06:30:00,,False,True,10.0,0.0,Southbound Highway 101 at Camarillo Springs Ro...,0.0,True,Springs Fire,100.0,2013-05-02 07:01:00,Finalized,8 days 23:29:00,6111.0
5,5,22992.0,Sierra National Forest,2013,False,Fresno,10,2013-09-24 20:15:00,,False,True,,37.279,Seven miles north of Big Creek,-119.318,False,Aspen Fire,100.0,2013-07-22 22:15:00,Finalized,63 days 22:00:00,6019.0
6,6,20292.0,CAL FIRE Riverside Unit / San Bernardino Natio...,2013,True,Riverside,33,2013-08-12 18:00:00,,False,True,26.0,33.86157,"Poppet Flats Rd near Hwy 243, south of Banning",-116.90427,True,Silver Fire,100.0,2013-08-07 14:05:00,Finalized,5 days 03:55:00,6065.0
7,7,14754.0,Klamath National Forest,2013,False,Siskiyou,47,2013-08-31 06:45:00,,False,True,,41.32,"North Fork of the Salmon River, West of Sawyer...",-123.176,False,Salmon River Complex,100.0,2013-07-31 22:00:00,Finalized,30 days 08:45:00,6093.0
8,8,12503.0,Six Rivers National Forest,2013,False,Humboldt,12,2013-08-12 12:00:00,,False,True,,41.035,Tish Tang Ridge east of Hoopa Valley Reservation,-123.488,False,Corral Complex,100.0,2013-08-10 11:40:00,Finalized,2 days 00:20:00,6023.0
9,9,11429.0,CAL FIRE Tehama-Glenn Unit,2013,True,Tehama,52,2013-08-29 16:45:00,,False,True,5.0,40.04263,"Near Deer Creek, 12 miles east of Los Molinos.",-121.85397,True,Deer Fire,100.0,2013-08-23 14:15:00,Finalized,6 days 02:30:00,6103.0


In [4]:
dataset.columns

Index(['Unnamed: 0', 'AcresBurned', 'AdminUnit', 'ArchiveYear',
       'CalFireIncident', 'Counties', 'CountyIds', 'Extinguished',
       'Fatalities', 'Featured', 'Final', 'Injuries', 'Latitude', 'Location',
       'Longitude', 'MajorIncident', 'Name', 'PercentContained', 'Started',
       'Status', 'Active Time', 'FIPS'],
      dtype='object')
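
The `Unnamed: 0` column is typically what pandas produces when a CSV was written together with its index and then read back without `index_col`. The following is a minimal sketch under the assumption that this is what happened here; the inline CSV text is a made-up stand-in, not the real file.

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for California_Fire_Cleaned.csv
csv_text = ",AcresBurned,Counties\n0,257314.0,Tuolumne\n1,30274.0,Los Angeles\n"

# Default read: the unnamed first column shows up as 'Unnamed: 0'
raw = pd.read_csv(io.StringIO(csv_text))

# Treat the first column as the index instead of a data column
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(clean.columns))
```

Passing `index_col=0` to the original `pd.read_csv` call should have the same effect, provided the first column really is a saved index.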

In [5]:
dataset.dtypes

Unnamed: 0            int64
AcresBurned         float64
AdminUnit            object
ArchiveYear           int64
CalFireIncident        bool
Counties             object
CountyIds            object
Extinguished         object
Fatalities          float64
Featured               bool
Final                  bool
Injuries            float64
Latitude            float64
Location             object
Longitude           float64
MajorIncident          bool
Name                 object
PercentContained    float64
Started              object
Status               object
Active Time          object
FIPS                float64
dtype: object
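
The `Started`, `Extinguished`, and `Active Time` columns are stored as `object` (strings), so date arithmetic will not work on them directly. A small sketch of one way to convert them, using the first two rows shown earlier as sample values; `sample` and `recomputed` are names introduced here for illustration.

```python
import pandas as pd

# Values copied from the first two rows of the head() output
sample = pd.DataFrame({
    "Started": ["2013-08-17 15:25:00", "2013-05-30 15:28:00"],
    "Extinguished": ["2013-09-06 18:30:00", "2013-06-08 18:30:00"],
    "Active Time": ["20 days 03:05:00", "9 days 03:02:00"],
})

# errors="coerce" turns unparseable entries into NaT instead of raising
sample["Started"] = pd.to_datetime(sample["Started"], errors="coerce")
sample["Extinguished"] = pd.to_datetime(sample["Extinguished"], errors="coerce")
sample["Active Time"] = pd.to_timedelta(sample["Active Time"], errors="coerce")

# Cross-check: Active Time should equal Extinguished minus Started
recomputed = sample["Extinguished"] - sample["Started"]
print(sample.dtypes)
print((recomputed == sample["Active Time"]).all())
```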

# ***Analysis***

In [14]:
# Named fire_map to avoid shadowing the built-in map()
fire_map = folium.Map(location=[37.160317, -120.621407], tiles="cartodb positron", zoom_start=7, height="75%", width="75%")
for i in range(len(dataset)):
    folium.Circle(
        location=[dataset.loc[i, 'Latitude'], dataset.loc[i, 'Longitude']],
        # Radius in meters of a circle whose area equals the burned area (1 acre = 4046.86 m^2)
        radius=math.sqrt(float(dataset.loc[i, 'AcresBurned']) * 4046.86 / math.pi),
        tooltip=str(dataset.loc[i, 'Name']) + ', ' + str(dataset.loc[i, 'ArchiveYear']),
        color="#e08a3f",
        fill=True,
        fill_color="#e08a3f",
    ).add_to(fire_map)

title_html = '''
             <h3 align="center" style="font-size:25px"><b>Geospatial distribution of wildfires in California</b></h3>
             '''
fire_map.get_root().html.add_child(folium.Element(title_html))

fire_map
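
Some rows carry placeholder coordinates (the Springs Fire row above has latitude and longitude of 0.0), which would put a circle far outside California. The sketch below shows the radius calculation as a helper and a rough bounding-box filter that could be applied before plotting. The helper names and the bounding-box limits are assumptions made for illustration, not part of the original notebook.

```python
import math

SQ_METERS_PER_ACRE = 4046.86  # same conversion factor as the map cell

def acres_to_radius_m(acres):
    """Radius (m) of a circle whose area equals the burned area."""
    return math.sqrt(acres * SQ_METERS_PER_ACRE / math.pi)

def in_california(lat, lon):
    """Rough bounding-box check; the limits are approximate assumptions."""
    return 32.0 <= lat <= 42.5 and -125.0 <= lon <= -114.0

# (name, latitude, longitude, acres) taken from the rows shown earlier
fires = [
    ("Rim Fire", 37.857, -120.086, 257314.0),
    ("Springs Fire", 0.0, 0.0, 24251.0),  # placeholder coordinates
]

for name, lat, lon, acres in fires:
    if in_california(lat, lon) and not math.isnan(acres):
        print(name, round(acres_to_radius_m(acres)), "m")
    else:
        print(name, "skipped")
```

A bounding box is coarse (it also admits parts of Nevada and the Pacific), but it is enough to catch zeroed-out or sign-flipped coordinates.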

In [19]:
v = dataset[['Counties','FIPS']].value_counts().to_frame()
v = v.reset_index()
v.columns =  ['County', 'FIPS', 'TotalFires']
v

Unnamed: 0,County,FIPS,TotalFires
0,Riverside,6065.0,146
1,San Diego,6073.0,89
2,Butte,6007.0,66
3,Shasta,6089.0,64
4,San Luis Obispo,6079.0,64
5,Kern,6029.0,62
6,Fresno,6019.0,57
7,Siskiyou,6093.0,57
8,San Bernardino,6071.0,53
9,Tehama,6103.0,51


In [20]:
v = v.dropna(how='any',axis=0) 

values = v['TotalFires'].tolist()
fips = v['FIPS'].tolist()

endpts = list(np.mgrid[min(values):max(values):4j])
colorscale = ["#002047","#2b82ed","#0057a3","#6190c9","#c2ddff"]
fig = ff.create_choropleth(
    fips=fips, values=values, scope=['California'], show_state_data=True,
    colorscale=colorscale, binning_endpoints=endpts, round_legend_values=False,
    plot_bgcolor='rgb(229,229,229)',
    paper_bgcolor='rgb(229,229,229)',
    legend_title='Total Fires by County',
    county_outline={'color': 'rgb(255,255,255)', 'width': 0.5},
    exponent_format=True,
)
fig.layout.template = None
fig.show()
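
The `np.mgrid[min:max:4j]` expression in the cell above can look cryptic: a complex step means "this many points, endpoints included", so it yields four evenly spaced bin edges, which split the values into five bins to go with the five colors. The sketch below illustrates this with the top ten counts from the table (so the real minimum may be lower), and also shows the conventional zero-padded 5-digit form of county FIPS codes. The variable names `same_as_linspace` and `fips_str` are introduced here for illustration.

```python
import numpy as np

# TotalFires for the ten counties shown in the table above
values = [146, 89, 66, 64, 64, 62, 57, 57, 53, 51]
# First three FIPS values from the same table
fips = [6065.0, 6073.0, 6007.0]

# A complex step in np.mgrid means "number of points, endpoints included"
endpts = list(np.mgrid[min(values):max(values):4j])
same_as_linspace = np.allclose(endpts, np.linspace(min(values), max(values), 4))

# County FIPS codes are 5 digits: 2-digit state code + 3-digit county code
fips_str = [f"{int(code):05d}" for code in fips]

print(endpts)
print(same_as_linspace)
print(fips_str)
```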



1.   *Riverside had the most wildfires.*
2.   *Imperial County had no recorded wildfires.*

In [21]:
dataset[dataset["Counties"] == "Imperial"]  # Empty result: no wildfires are recorded for Imperial County

Unnamed: 0.1,Unnamed: 0,AcresBurned,AdminUnit,ArchiveYear,CalFireIncident,Counties,CountyIds,Extinguished,Fatalities,Featured,Final,Injuries,Latitude,Location,Longitude,MajorIncident,Name,PercentContained,Started,Status,Active Time,FIPS


In [23]:
dataset['Month'] = pd.to_datetime(dataset['Started']).dt.month  # Month number (1-12) in which each fire started
new_Data = dataset.groupby(['ArchiveYear', 'Month'])['Counties'].count().reset_index()
new_Data.rename(columns = {"Counties" : "Wild Fire Count"}, inplace = True)
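
One thing to keep in mind with this grouping: `['Counties'].count()` skips rows where `Counties` is missing, and months with no fires simply do not appear in the result. The sketch below, on made-up toy dates, shows an alternative that counts rows with `size()` and fills absent months with zero. The names `toy`, `counts`, `full_index`, and `monthly` are hypothetical.

```python
import pandas as pd

# Made-up start dates for illustration only
toy = pd.DataFrame({
    "ArchiveYear": [2013, 2013, 2013, 2014],
    "Started": ["2013-07-01", "2013-07-15", "2013-09-02", "2014-08-10"],
})
toy["Month"] = pd.to_datetime(toy["Started"]).dt.month

# size() counts rows per group regardless of missing values in other columns
counts = toy.groupby(["ArchiveYear", "Month"]).size()

# Reindex onto every (year, month) pair so absent months become 0
full_index = pd.MultiIndex.from_product(
    [sorted(toy["ArchiveYear"].unique()), range(1, 13)],
    names=["ArchiveYear", "Month"],
)
monthly = counts.reindex(full_index, fill_value=0).reset_index(name="Wild Fire Count")
print(monthly[monthly["Wild Fire Count"] > 0])
```

With the zero-filled frame, a line chart shows a drop to zero in quiet months rather than a straight segment between two active ones.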

In [24]:
fig = px.line(new_Data, x="Month", y="Wild Fire Count", color="ArchiveYear", title='Wildfire count by month for each year')
fig.show()

More wildfires seem to occur during the summer. This makes sense: temperatures are high and conditions are drier during these months, which makes wildfires more likely.
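
The seasonal pattern can also be checked numerically rather than read off the chart. Here is a hedged sketch on made-up month numbers standing in for `dataset['Month']`; treating June through September as "summer" is an assumption, and `share_by_month` and `summer_share` are illustrative names.

```python
import pandas as pd

# Made-up month numbers standing in for dataset['Month']
months = pd.Series([1, 6, 7, 7, 8, 8, 9, 11])

# Share of fires per calendar month, ordered by month
share_by_month = months.value_counts(normalize=True).sort_index()

# "Summer" is taken here as June through September (an assumption)
summer_share = share_by_month.loc[share_by_month.index.isin([6, 7, 8, 9])].sum()

print(share_by_month)
print(f"Summer share: {summer_share:.0%}")
```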

# **Our Model**

## **Checking Model** 

## **Improved Model** 