# Group Project

## CSE 6242, Data and Visual Analytics

### Georgia Tech, Fall 2020

#### Participants: Sean Webber, Edward Gramza, Ramin Melikov

Our brainstorming session happened to coincide with the wildfires that
engulfed the Western US. Wildfires became the center of our discussions and,
eventually, the subject of our analysis. We wanted to see whether techniques
from this class, or from other classes in the OMSA program, could help predict
wildfires or identify the variables that contribute to them. The next step was
to find the data sources needed to tackle this problem, as well as research
that would guide our project. During initial research, we discovered the
`1.88 Million US Wildfires` dataset, as well as a few others. It is our main
dataset for this project.

The `1.88 Million US Wildfires` dataset is a curated dataset hosted on
Kaggle and is available [here](https://www.kaggle.com/rtatman/188-million-us-wildfires).

In [1]:
import sqlite3
import pandas as pd
from pandas_profiling import ProfileReport

We load the data from the SQLite file into a `pandas` DataFrame for our analysis.

In [2]:
con = sqlite3.connect('./data/FPA_FOD_20170508.sqlite')
df = pd.read_sql_query("select * from fires", con)
con.close()
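
Reading every column of all 1.88 million rows is not always necessary. The sketch below shows one way to pull only selected columns, optionally restricted to one state, and to close the connection automatically. It is an illustration, not the loading code used in this project: the helper `load_fires` is hypothetical, the `STATE` and `FIRE_SIZE` column names are assumptions (they do not appear in the truncated `df.info()` output), and the in-memory table with made-up rows stands in for the real SQLite file.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def load_fires(con, columns, state=None):
    """Read selected columns of the `fires` table into a DataFrame."""
    # Column names cannot be bound as SQL parameters, so they are quoted
    # and interpolated; only pass trusted names here.
    column_list = ", ".join(f'"{c}"' for c in columns)
    query = f"select {column_list} from fires"
    params = ()
    if state is not None:
        # The state value is bound as a parameter rather than interpolated.
        query += " where STATE = ?"
        params = (state,)
    return pd.read_sql_query(query, con, params=params)


# Toy in-memory stand-in for ./data/FPA_FOD_20170508.sqlite (made-up rows).
# STATE and FIRE_SIZE are assumed column names.
with closing(sqlite3.connect(":memory:")) as con:
    con.execute("create table fires (FIRE_YEAR integer, STATE text, FIRE_SIZE real)")
    con.executemany(
        "insert into fires values (?, ?, ?)",
        [(1992, "CA", 0.1), (2005, "GA", 2.5), (2015, "CA", 40.0)],
    )
    sample = load_fires(con, ["FIRE_YEAR", "STATE"], state="CA")

print(sample)
```

Filtering in SQL keeps memory use down because rows outside the state of interest never reach `pandas`.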

Let's look at the metadata.

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1880465 entries, 0 to 1880464
Data columns (total 39 columns):
 #   Column                      Dtype  
---  ------                      -----  
 0   OBJECTID                    int64  
 1   FOD_ID                      int64  
 2   FPA_ID                      object 
 3   SOURCE_SYSTEM_TYPE          object 
 4   SOURCE_SYSTEM               object 
 5   NWCG_REPORTING_AGENCY       object 
 6   NWCG_REPORTING_UNIT_ID      object 
 7   NWCG_REPORTING_UNIT_NAME    object 
 8   SOURCE_REPORTING_UNIT       object 
 9   SOURCE_REPORTING_UNIT_NAME  object 
 10  LOCAL_FIRE_REPORT_ID        object 
 11  LOCAL_INCIDENT_ID           object 
 12  FIRE_CODE                   object 
 13  FIRE_NAME                   object 
 14  ICS_209_INCIDENT_NUMBER     object 
 15  ICS_209_NAME                object 
 16  MTBS_ID                     object 
 17  MTBS_FIRE_NAME              object 
 18  COMPLEX_NAME                object 
 19  FIRE_YEAR            

We also wanted to take a closer look at the metadata using the
`pandas_profiling` package. The code below produced an `HTML`
file that has all of the metadata about this dataset.

````python
profile = ProfileReport(df, title='Pandas Profiling Report', explorative=True)
profile.to_file("data_profile.html")
````
The metadata `HTML` file can be accessed [here](data_profile.html).
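
For a quick look without generating the full report, a few per-column statistics can be computed with `pandas` alone. The sketch below is a lightweight alternative, not a replacement for the profiling report: the helper `summarize_columns` is hypothetical, and the small frame uses made-up values for two columns that do appear in the dataset (`FIRE_YEAR` and `FIRE_NAME`).

```python
import pandas as pd


def summarize_columns(frame):
    """Return dtype, share of missing values, and distinct count per column."""
    return pd.DataFrame(
        {
            "dtype": frame.dtypes.astype(str),
            "missing_share": frame.isna().mean(),
            # nunique() ignores missing values by default.
            "n_unique": frame.nunique(),
        }
    )


# Made-up rows standing in for the full `df`.
toy_fires = pd.DataFrame(
    {
        "FIRE_YEAR": [1992, 1992, 2015, 2015],
        "FIRE_NAME": ["PIGEON", None, None, "POWER"],
    }
)
summary = summarize_columns(toy_fires)
print(summary)
```

Calling `summarize_columns(df)` on the full DataFrame would give one row per column, which is often enough to spot columns that are mostly empty.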

---

During our research, we also found that drought is strongly related to
wildfires (Addington et al., 2015). The same paper reports that municipalities
that manage prescribed fires well experience fewer wildfires.

To that end, the government maintains a resource called the [US Drought Monitor](https://www.drought.gov/),
which shows the current drought conditions.

![](current_conditions.png)

Upon reviewing the images for different periods, it became obvious that the Western
part of the US is the primary drought region.

The code below embeds a Tableau workbook that we created. It shows a map of the total
number of wildfires per state between 1992 and 2015.

In [1]:
%%html

<div class='tableauPlaceholder' id='viz1605982185029' style='position: relative'>
	<noscript>
		<a href='#'> <img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CS&#47;CSE6242GroupProject&#47;NumberofWildfires&#47;1_rss.png' style='border: none' /> </a>
	</noscript>
	<object class='tableauViz' style='display:none;'>
		<param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' />
		<param name='embed_code_version' value='3' />
		<param name='site_root' value='' />
		<param name='name' value='CSE6242GroupProject&#47;NumberofWildfires' />
		<param name='tabs' value='no' />
		<param name='toolbar' value='yes' />
		<param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CS&#47;CSE6242GroupProject&#47;NumberofWildfires&#47;1.png' />
		<param name='animate_transition' value='yes' />
		<param name='display_static_image' value='yes' />
		<param name='display_spinner' value='yes' />
		<param name='display_overlay' value='yes' />
		<param name='display_count' value='yes' />
		<param name='language' value='en' />
		<param name='filter' value='publish=yes' /> </object>
</div>
<script type='text/javascript'>
var divElement = document.getElementById('viz1605982185029');
var vizElement = divElement.getElementsByTagName('object')[0];
if(divElement.offsetWidth > 800) {
	vizElement.style.width = '1000px';
	vizElement.style.height = '827px';
} else if(divElement.offsetWidth > 500) {
	vizElement.style.width = '1000px';
	vizElement.style.height = '827px';
} else {
	vizElement.style.width = '100%';
	vizElement.style.height = '727px';
}
var scriptElement = document.createElement('script');
scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';
vizElement.parentNode.insertBefore(scriptElement, vizElement);
</script>

Looking at the map, we can see that most of the wildfires occurred in the Southern
part of the US, and that California has the highest number of wildfires of any
state. We are therefore reducing the scope of our project to the state of California.
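
The per-state totals shown on the map can also be checked directly in `pandas`. The following is a minimal sketch under stated assumptions: `STATE` is an assumed column name holding two-letter state codes, and the rows are made up, so the counts here say nothing about the real data.

```python
import pandas as pd

# Made-up rows standing in for the full `df`; STATE is an assumed column name.
toy_states = pd.DataFrame(
    {
        "FIRE_YEAR": [1992, 1998, 2003, 2007, 2011, 2015],
        "STATE": ["CA", "GA", "CA", "TX", "CA", "GA"],
    }
)

# Number of fire records per state, largest first.
fires_per_state = toy_states["STATE"].value_counts()

# Narrow the scope to the state with the most records.
top_state = fires_per_state.idxmax()
df_top = toy_states[toy_states["STATE"] == top_state].reset_index(drop=True)

print(fires_per_state)
print(df_top)
```

On the real data, replacing `toy_states` with `df` would produce the ranking behind the map, and the filtered frame would be the starting point for the California-only analysis.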

In [9]:
%%html

<div class='tableauPlaceholder' id='viz1606012369370' style='position: relative'>
	<noscript>
		<a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;Ca&#47;CaliforniaWildfiresperCounty&#47;CaliforniaTotalNumberofFires&#47;1_rss.png' style='border: none' /></a>
	</noscript>
	<object class='tableauViz' style='display:none;'>
		<param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' />
		<param name='embed_code_version' value='3' />
		<param name='site_root' value='' />
		<param name='name' value='CaliforniaWildfiresperCounty&#47;CaliforniaTotalNumberofFires' />
		<param name='tabs' value='no' />
		<param name='toolbar' value='yes' />
		<param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;Ca&#47;CaliforniaWildfiresperCounty&#47;CaliforniaTotalNumberofFires&#47;1.png' />
		<param name='animate_transition' value='yes' />
		<param name='display_static_image' value='yes' />
		<param name='display_spinner' value='yes' />
		<param name='display_overlay' value='yes' />
		<param name='display_count' value='yes' />
		<param name='language' value='en' />
		<param name='filter' value='publish=yes' />
	</object>
</div>
<script type='text/javascript'>
var divElement = document.getElementById('viz1606012369370');
var vizElement = divElement.getElementsByTagName('object')[0];
if(divElement.offsetWidth > 800) {
	vizElement.style.width = '700px';
	vizElement.style.height = '827px';
} else if(divElement.offsetWidth > 500) {
	vizElement.style.width = '700px';
	vizElement.style.height = '827px';
} else {
	vizElement.style.width = '100%';
	vizElement.style.height = '727px';
}
var scriptElement = document.createElement('script');
scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';
vizElement.parentNode.insertBefore(scriptElement, vizElement);
</script>

Looking at the output above, we can see that the bubbles tend to get larger toward
the Southern part of California.
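
The visual impression that counts grow toward the south can be checked numerically by comparing each county's fire count with the mean latitude of its fires. The sketch below assumes the column names `FIPS_NAME` (county name) and `LATITUDE`, neither of which appears in the truncated `df.info()` output, and uses made-up rows, so its result only illustrates the calculation.

```python
import pandas as pd

# Made-up California rows; FIPS_NAME and LATITUDE are assumed column names.
ca_toy = pd.DataFrame(
    {
        "FIPS_NAME": ["Riverside", "Riverside", "Riverside", "Kern", "Kern", "Siskiyou"],
        "LATITUDE": [33.7, 33.9, 33.8, 35.3, 35.4, 41.6],
    }
)

# One row per county: number of fire records and mean latitude of those fires,
# ordered from south (low latitude) to north.
per_county = (
    ca_toy.groupby("FIPS_NAME")
    .agg(n_fires=("LATITUDE", "count"), mean_lat=("LATITUDE", "mean"))
    .sort_values("mean_lat")
)

# A negative value means counties farther south tend to have more fires.
lat_vs_count = per_county["mean_lat"].corr(per_county["n_fires"])

print(per_county)
print(lat_vs_count)
```

A correlation over counties is a coarse check: it ignores county area and population, both of which could explain part of any pattern.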

Let's take a look at the causes of wildfires for California.

**Note: We had to use OpenRefine to clean and reduce the data.**

In [7]:
%%html

<div class='tableauPlaceholder' id='viz1606011758444' style='position: relative'>
	<noscript>
		<a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;Ca&#47;CausesofWildfiresinCalifornia&#47;CausesofWildfiresoverTimeinCalifornia&#47;1_rss.png' style='border: none' /></a>
	</noscript>
	<object class='tableauViz' style='display:none;'>
		<param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' />
		<param name='embed_code_version' value='3' />
		<param name='site_root' value='' />
		<param name='name' value='CausesofWildfiresinCalifornia&#47;CausesofWildfiresoverTimeinCalifornia' />
		<param name='tabs' value='no' />
		<param name='toolbar' value='yes' />
		<param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;Ca&#47;CausesofWildfiresinCalifornia&#47;CausesofWildfiresoverTimeinCalifornia&#47;1.png' />
		<param name='animate_transition' value='yes' />
		<param name='display_static_image' value='yes' />
		<param name='display_spinner' value='yes' />
		<param name='display_overlay' value='yes' />
		<param name='display_count' value='yes' />
		<param name='language' value='en' />
		<param name='filter' value='publish=yes' />
	</object>
</div>
<script type='text/javascript'>
var divElement = document.getElementById('viz1606011758444');
var vizElement = divElement.getElementsByTagName('object')[0];
if(divElement.offsetWidth > 800) {
	vizElement.style.width = '800px';
	vizElement.style.height = '931px';
} else if(divElement.offsetWidth > 500) {
	vizElement.style.width = '800px';
	vizElement.style.height = '931px';
} else {
	vizElement.style.width = '100%';
	vizElement.style.height = '3527px';
}
var scriptElement = document.createElement('script');
scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';
vizElement.parentNode.insertBefore(scriptElement, vizElement);
</script>
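
A table of causes over time, similar in spirit to the chart above, can be built in `pandas` with a cross-tabulation. This is a sketch under stated assumptions rather than the workflow used here (the data for the chart was cleaned in OpenRefine): `STAT_CAUSE_DESCR` is an assumed column name for the cause description, and the rows and cause labels are made up.

```python
import pandas as pd

# Made-up California rows; STAT_CAUSE_DESCR is an assumed column name.
ca_causes = pd.DataFrame(
    {
        "FIRE_YEAR": [1992, 1992, 1992, 2015, 2015],
        "STAT_CAUSE_DESCR": [
            "Lightning",
            "Arson",
            "Lightning",
            "Arson",
            "Debris Burning",
        ],
    }
)

# Rows are years, columns are causes, cells are the number of fire records.
causes_by_year = pd.crosstab(ca_causes["FIRE_YEAR"], ca_causes["STAT_CAUSE_DESCR"])

print(causes_by_year)
```

Year and cause combinations with no records appear as zeros, which keeps the table rectangular and ready for plotting as one line per cause.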