# Extreme Weather Event Prediction

## Exploratory Data Analysis on Wildfire Dataset

### Introduction
In this project, we aim to predict extreme weather events, with a focus on wildfires. Wildfires are natural disasters that cause extensive damage to the environment, property, and human life. Accurate wildfire prediction can support planning and risk mitigation.

### Dataset Sources
For our analysis, we use a publicly available wildfire dataset. The same dataset can be downloaded from either of the two sources below:

1. **[National Interagency Fire Occurrence - Sixth Edition (1992-2020) on Data.gov](https://catalog.data.gov/dataset/national-interagency-fire-occurrence-sixth-edition-1992-2020-feature-layer)**
   - This dataset contains wildfire occurrence data from various agencies across the United States, covering the years 1992 to 2020.


2. **[Kaggle - US Wildfire Records (6th Edition)](https://www.kaggle.com/datasets/behroozsohrabi/us-wildfire-records-6th-edition)**
   - The same dataset as above, available for easy access and use on Kaggle.

### Instructions to Save the Dataset
Once you have downloaded the dataset from either source, save it in your working directory as `data.csv`. The file name matters because the code below reads the dataset from `data.csv`.
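
Before running the full load, it can help to confirm that the file is where the notebook expects it. The sketch below is one possible check, not part of the original workflow: `peek_columns` is a hypothetical helper name, and the demo writes a two-column throwaway CSV so the example runs without the real download.

```python
import os
import tempfile
from pathlib import Path

import pandas as pd


def peek_columns(path="data.csv"):
    """Return the column names of a CSV without loading any data rows."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected the dataset at {csv_path.resolve()}")
    # nrows=0 makes pandas parse only the header line
    return pd.read_csv(csv_path, nrows=0).columns.tolist()


# Demo on a throwaway file so the sketch runs without the real download
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "data.csv")
    with open(demo_path, "w") as handle:
        handle.write("FOD_ID,FIRE_SIZE\n1,0.1\n")
    demo_columns = peek_columns(demo_path)

print(demo_columns)
```

With the real dataset in place, calling `peek_columns()` with no argument should list the column names of `data.csv`.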

In [44]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pytrends.request import TrendReq

### Data Loading and Overview
- Load the dataset into a Pandas DataFrame.
- Display the first few rows.
- Print a summary of the dataset.

In [45]:
df = pd.read_csv("data.csv")
pd.set_option('display.max_columns', None)
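
The file holds roughly 2.3 million rows, and some columns mix numeric-looking and text values (for example, `COUNTY` appears both as a number and as `Madera` in the output further down). A possible way to keep such columns consistent is to pass explicit options to `pd.read_csv`. The sketch below is an assumption about what might help here, not the loading code this notebook actually ran. It uses a small in-memory stand-in for `data.csv`, and the choice of columns is purely illustrative.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for data.csv (values adapted from rows shown in this notebook)
sample_csv = io.StringIO(
    "FOD_ID,LOCAL_INCIDENT_ID,FIRE_SIZE,STATE,COUNTY\n"
    "3,021,0.10,CA,17\n"
    "400732979,14707,100.00,CA,Madera\n"
)

sample = pd.read_csv(
    sample_csv,
    # Read only the columns needed for the analysis (illustrative selection)
    usecols=["FOD_ID", "LOCAL_INCIDENT_ID", "FIRE_SIZE", "STATE", "COUNTY"],
    # Force identifier-like columns to text so "021" keeps its leading zero
    dtype={"LOCAL_INCIDENT_ID": str, "COUNTY": str},
    # Infer dtypes from the whole file rather than chunk by chunk
    low_memory=False,
)
print(sample.dtypes)
```

For the real file, the same keyword arguments could be passed to `pd.read_csv("data.csv", ...)`.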



In [46]:
df

Unnamed: 0,OBJECTID,Shape,FOD_ID,FPA_ID,SOURCE_SYSTEM_TYPE,SOURCE_SYSTEM,NWCG_REPORTING_AGENCY,NWCG_REPORTING_UNIT_ID,NWCG_REPORTING_UNIT_NAME,SOURCE_REPORTING_UNIT,SOURCE_REPORTING_UNIT_NAME,LOCAL_FIRE_REPORT_ID,LOCAL_INCIDENT_ID,FIRE_CODE,FIRE_NAME,ICS_209_PLUS_INCIDENT_JOIN_ID,ICS_209_PLUS_COMPLEX_JOIN_ID,MTBS_ID,MTBS_FIRE_NAME,COMPLEX_NAME,FIRE_YEAR,DISCOVERY_DATE,DISCOVERY_DOY,DISCOVERY_TIME,NWCG_CAUSE_CLASSIFICATION,NWCG_GENERAL_CAUSE,NWCG_CAUSE_AGE_CATEGORY,CONT_DATE,CONT_DOY,CONT_TIME,FIRE_SIZE,FIRE_SIZE_CLASS,LATITUDE,LONGITUDE,OWNER_DESCR,STATE,COUNTY,FIPS_CODE,FIPS_NAME
0,1,b'\x00\x01\xad\x10\x00\x00\xc8\xce\n[_@^\xc0\x...,1,FS-1418826,FED,FS-FIRESTAT,FS,USCAPNF,Plumas National Forest,511,Plumas National Forest,1,PNF-47,BJ8K,FOUNTAIN,,,,,,2005,2/2/2005,33,1300.0,Human,Power generation/transmission/distribution,,2/2/2005,33.0,1730.0,0.10,A,40.036944,-121.005833,USFS,CA,63.0,6063.0,Plumas County
1,2,b'\x00\x01\xad\x10\x00\x00\xc8\xe594\xe2\x19^\...,2,FS-1418827,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,13,13,AAC0,PIGEON,,,,,,2004,5/12/2004,133,845.0,Natural,Natural,,5/12/2004,133.0,1530.0,0.25,A,38.933056,-120.404444,USFS,CA,61.0,6061.0,Placer County
2,3,b'\x00\x01\xad\x10\x00\x00x{\xac \x13/^\xc0@\x...,3,FS-1418835,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,27,021,A32W,SLACK,,,,,,2004,5/31/2004,152,1921.0,Human,Debris and open burning,,5/31/2004,152.0,2024.0,0.10,A,38.984167,-120.735556,STATE OR PRIVATE,CA,17.0,6017.0,El Dorado County
3,4,b'\x00\x01\xad\x10\x00\x00\xc8\x13u\xd7s\xfa]\...,4,FS-1418845,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,43,6,,DEER,,,,,,2004,6/28/2004,180,1600.0,Natural,Natural,,7/3/2004,185.0,1400.0,0.10,A,38.559167,-119.913333,USFS,CA,3.0,6003.0,Alpine County
4,5,b'\x00\x01\xad\x10\x00\x00\xd0\x11y\xf8\xb6\xf...,5,FS-1418847,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,44,7,,STEVENOT,,,,,,2004,6/28/2004,180,1600.0,Natural,Natural,,7/3/2004,185.0,1200.0,0.10,A,38.559167,-119.933056,USFS,CA,3.0,6003.0,Alpine County
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2303561,2303562,b'\x00\x01\xad\x10\x00\x00\xcc\x9a\xedDh=[\xc0...,400732978,ICS209_2020_11710294,INTERAGCY,IA-ICS209,BLM,USCOGRD,Grand Junction Field Office,COGRD,Grand Junction Field Office,,105,,JONES,2020_11710294_JONES,,,,,2020,6/5/2020,157,1741.0,Natural,Natural,,,,,1.00,B,39.037890,-108.959500,MISSING/NOT SPECIFIED,CO,,,
2303562,2303563,b'\x00\x01\xad\x10\x00\x00\xe8\x11\xda\xda1\xe...,400732979,ICS209_2020_11781527,INTERAGCY,IA-ICS209,ST/C&L,USCAMMU,Merced-Mariposa Unit,CAMMU,Merced-Mariposa Unit,,14707,,POWER,2020_11781527_POWER,,,,,2020,7/11/2020,193,1958.0,Missing data/not specified/undetermined,Missing data/not specified/undetermined,,,,,100.00,D,37.148611,-119.503056,Private,CA,Madera,6039.0,Madera County
2303563,2303564,b'\x00\x01\xad\x10\x00\x00P\xf6\xa7\x9eV\x9c\\...,400732980,ICS209_2020_11815219,INTERAGCY,IA-ICS209,FS,USMTBRF,Bitterroot National Forest,MTBRF,Bitterroot National Forest,,20179,,12 MILE,2020_11815219_12 MILE,,,,,2020,8/27/2020,240,1911.0,Natural,Natural,,,,,50.00,C,46.151370,-114.442800,MISSING/NOT SPECIFIED,MT,,,
2303564,2303565,b'\x00\x01\xad\x10\x00\x00\\\x87\xc8\xbbS\x07^...,400732982,ICS209_2020_11831809,INTERAGCY,IA-ICS209,FWS,USWAMCR,Mid Columbia National Wildlife Refuge Complex,WAMCR,Mid Columbia National Wildlife Refuge Complex,,508,,TAYLOR POND,2020_11831809_TAYLOR POND,,WA4667012011520200817,TAYLOR POND,,2020,8/17/2020,230,755.0,Natural,Natural,,8/20/2020,233.0,1900.0,24892.00,G,46.670340,-120.114500,UNDEFINED FEDERAL,WA,Yakima,53077.0,Yakima County


In [47]:
df.columns

Index(['OBJECTID', 'Shape', 'FOD_ID', 'FPA_ID', 'SOURCE_SYSTEM_TYPE',
       'SOURCE_SYSTEM', 'NWCG_REPORTING_AGENCY', 'NWCG_REPORTING_UNIT_ID',
       'NWCG_REPORTING_UNIT_NAME', 'SOURCE_REPORTING_UNIT',
       'SOURCE_REPORTING_UNIT_NAME', 'LOCAL_FIRE_REPORT_ID',
       'LOCAL_INCIDENT_ID', 'FIRE_CODE', 'FIRE_NAME',
       'ICS_209_PLUS_INCIDENT_JOIN_ID', 'ICS_209_PLUS_COMPLEX_JOIN_ID',
       'MTBS_ID', 'MTBS_FIRE_NAME', 'COMPLEX_NAME', 'FIRE_YEAR',
       'DISCOVERY_DATE', 'DISCOVERY_DOY', 'DISCOVERY_TIME',
       'NWCG_CAUSE_CLASSIFICATION', 'NWCG_GENERAL_CAUSE',
       'NWCG_CAUSE_AGE_CATEGORY', 'CONT_DATE', 'CONT_DOY', 'CONT_TIME',
       'FIRE_SIZE', 'FIRE_SIZE_CLASS', 'LATITUDE', 'LONGITUDE', 'OWNER_DESCR',
       'STATE', 'COUNTY', 'FIPS_CODE', 'FIPS_NAME'],
      dtype='object')

In [48]:
df.head()

Unnamed: 0,OBJECTID,Shape,FOD_ID,FPA_ID,SOURCE_SYSTEM_TYPE,SOURCE_SYSTEM,NWCG_REPORTING_AGENCY,NWCG_REPORTING_UNIT_ID,NWCG_REPORTING_UNIT_NAME,SOURCE_REPORTING_UNIT,SOURCE_REPORTING_UNIT_NAME,LOCAL_FIRE_REPORT_ID,LOCAL_INCIDENT_ID,FIRE_CODE,FIRE_NAME,ICS_209_PLUS_INCIDENT_JOIN_ID,ICS_209_PLUS_COMPLEX_JOIN_ID,MTBS_ID,MTBS_FIRE_NAME,COMPLEX_NAME,FIRE_YEAR,DISCOVERY_DATE,DISCOVERY_DOY,DISCOVERY_TIME,NWCG_CAUSE_CLASSIFICATION,NWCG_GENERAL_CAUSE,NWCG_CAUSE_AGE_CATEGORY,CONT_DATE,CONT_DOY,CONT_TIME,FIRE_SIZE,FIRE_SIZE_CLASS,LATITUDE,LONGITUDE,OWNER_DESCR,STATE,COUNTY,FIPS_CODE,FIPS_NAME
0,1,b'\x00\x01\xad\x10\x00\x00\xc8\xce\n[_@^\xc0\x...,1,FS-1418826,FED,FS-FIRESTAT,FS,USCAPNF,Plumas National Forest,511,Plumas National Forest,1,PNF-47,BJ8K,FOUNTAIN,,,,,,2005,2/2/2005,33,1300.0,Human,Power generation/transmission/distribution,,2/2/2005,33.0,1730.0,0.1,A,40.036944,-121.005833,USFS,CA,63.0,6063.0,Plumas County
1,2,b'\x00\x01\xad\x10\x00\x00\xc8\xe594\xe2\x19^\...,2,FS-1418827,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,13,13,AAC0,PIGEON,,,,,,2004,5/12/2004,133,845.0,Natural,Natural,,5/12/2004,133.0,1530.0,0.25,A,38.933056,-120.404444,USFS,CA,61.0,6061.0,Placer County
2,3,b'\x00\x01\xad\x10\x00\x00x{\xac \x13/^\xc0@\x...,3,FS-1418835,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,27,021,A32W,SLACK,,,,,,2004,5/31/2004,152,1921.0,Human,Debris and open burning,,5/31/2004,152.0,2024.0,0.1,A,38.984167,-120.735556,STATE OR PRIVATE,CA,17.0,6017.0,El Dorado County
3,4,b'\x00\x01\xad\x10\x00\x00\xc8\x13u\xd7s\xfa]\...,4,FS-1418845,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,43,6,,DEER,,,,,,2004,6/28/2004,180,1600.0,Natural,Natural,,7/3/2004,185.0,1400.0,0.1,A,38.559167,-119.913333,USFS,CA,3.0,6003.0,Alpine County
4,5,b'\x00\x01\xad\x10\x00\x00\xd0\x11y\xf8\xb6\xf...,5,FS-1418847,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,44,7,,STEVENOT,,,,,,2004,6/28/2004,180,1600.0,Natural,Natural,,7/3/2004,185.0,1200.0,0.1,A,38.559167,-119.933056,USFS,CA,3.0,6003.0,Alpine County


In [49]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2303566 entries, 0 to 2303565
Data columns (total 39 columns):
 #   Column                         Dtype  
---  ------                         -----  
 0   OBJECTID                       int64  
 1   Shape                          object 
 2   FOD_ID                         int64  
 3   FPA_ID                         object 
 4   SOURCE_SYSTEM_TYPE             object 
 5   SOURCE_SYSTEM                  object 
 6   NWCG_REPORTING_AGENCY          object 
 7   NWCG_REPORTING_UNIT_ID         object 
 8   NWCG_REPORTING_UNIT_NAME       object 
 9   SOURCE_REPORTING_UNIT          object 
 10  SOURCE_REPORTING_UNIT_NAME     object 
 11  LOCAL_FIRE_REPORT_ID           object 
 12  LOCAL_INCIDENT_ID              object 
 13  FIRE_CODE                      object 
 14  FIRE_NAME                      object 
 15  ICS_209_PLUS_INCIDENT_JOIN_ID  object 
 16  ICS_209_PLUS_COMPLEX_JOIN_ID   object 
 17  MTBS_ID                        object 
 18  MT

### Column Descriptions

- **FOD_ID**: Unique numeric record identifier.
- **FPA_ID**: Unique identifier that contains information necessary to track back to the original record in the source dataset.
- **SOURCE_SYSTEM_TYPE**: Type of source database or system that the record was drawn from (FED = federal, NONFED = nonfederal, or INTERAGCY = interagency).
- **SOURCE_SYSTEM**: Name or other identifier for the source database or system that the record was drawn from.
- **NWCG_REPORTING_AGENCY**: Active National Wildfire Coordinating Group (NWCG) Unit Identifier for the agency preparing the fire report (BIA = Bureau of Indian Affairs, BLM = Bureau of Land Management, BOR = Bureau of Reclamation, DOD = Department of Defense, DOE = Department of Energy, FS = Forest Service, FWS = Fish and Wildlife Service, IA = Interagency Organization, NPS = National Park Service, ST/C&L = State, County, or Local Organization, and TRIBE = Tribal Organization).
- **NWCG_REPORTING_UNIT_ID**: Active NWCG Unit Identifier for the unit preparing the fire report.
- **NWCG_REPORTING_UNIT_NAME**: Active NWCG Unit Name for the unit preparing the fire report.
- **SOURCE_REPORTING_UNIT**: Code for the agency unit preparing the fire report, based on code/name in the source dataset.
- **SOURCE_REPORTING_UNIT_NAME**: Name of the reporting agency unit preparing the fire report, based on code/name in the source dataset.
- **LOCAL_FIRE_REPORT_ID**: Number or code that uniquely identifies an incident report for a particular reporting unit and a particular calendar year.
- **LOCAL_INCIDENT_ID**: Number or code that uniquely identifies an incident for a particular local fire management organization within a particular calendar year.
- **FIRE_CODE**: Code used within the interagency wildland fire community to track and compile cost information for emergency fire suppression (https://www.firecode.gov/).
- **FIRE_NAME**: Name of the incident, from the fire report (primary) or ICS-209 report (secondary).
- **ICS_209_PLUS_INCIDENT_JOIN_ID**: Primary identifier needed to join into operational situation reporting data for the incident in the ICS-209-PLUS dataset.
- **ICS_209_PLUS_COMPLEX_JOIN_ID**: If part of a complex, secondary identifier potentially needed to join to operational situation reporting data for the incident in the ICS-209-PLUS dataset.
- **MTBS_ID**: Incident identifier, from the MTBS perimeter dataset.
- **MTBS_FIRE_NAME**: Name of the incident, from the MTBS perimeter dataset.
- **COMPLEX_NAME**: Name of the complex under which the fire was ultimately managed, when discernible.
- **FIRE_YEAR**: Calendar year in which the fire was discovered or confirmed to exist.
- **DISCOVERY_DATE**: Date on which the fire was discovered or confirmed to exist.
- **DISCOVERY_DOY**: Day of year on which the fire was discovered or confirmed to exist.
- **DISCOVERY_TIME**: Time of day that the fire was discovered or confirmed to exist.
- **NWCG_CAUSE_CLASSIFICATION**: Broad classification of the reason the fire occurred (Human, Natural, Missing data/not specified/undetermined).
- **NWCG_GENERAL_CAUSE**: Event or circumstance that started a fire or set the stage for its occurrence (Arson/incendiarism, Debris and open burning, Equipment and vehicle use, Firearms and explosives use, Fireworks, Misuse of fire by a minor, Natural, Power generation/transmission/distribution, Railroad operations and maintenance, Recreation and ceremony, Smoking, Other causes, Missing data/not specified/undetermined).
- **NWCG_CAUSE_AGE_CATEGORY**: If cause attributed to children (ages 0-12) or adolescents (13-17), the value for this data element is set to Minor; otherwise null.
- **CONT_DATE**: Date on which the fire was declared contained or otherwise controlled (mm/dd/yyyy where mm=month, dd=day, and yyyy=year).
- **CONT_DOY**: Day of year on which the fire was declared contained or otherwise controlled.
- **CONT_TIME**: Time of day that the fire was declared contained or otherwise controlled (hhmm where hh=hour, mm=minutes).
- **FIRE_SIZE**: The estimate of acres within the final perimeter of the fire.
- **FIRE_SIZE_CLASS**: Code for fire size based on the number of acres within the final fire perimeter (A=greater than 0 but less than or equal to 0.25 acres, B=0.26-9.9 acres, C=10.0-99.9 acres, D=100-299 acres, E=300 to 999 acres, F=1000 to 4999 acres, and G=5000+ acres).
- **LATITUDE**: Latitude (NAD83) for point location of the fire (decimal degrees).
- **LONGITUDE**: Longitude (NAD83) for point location of the fire (decimal degrees).
- **OWNER_DESCR**: Name of primary owner or entity responsible for managing the land at the point of origin of the fire at the time of the incident.
- **STATE**: Two-letter alphabetic code for the state in which the fire burned (or originated), based on the nominal designation in the fire report (not from a spatial overlay).
- **COUNTY**: County, or equivalent, in which the fire burned (or originated), based on nominal designation in the fire report (not from a spatial overlay).
- **FIPS_CODE**: Five-digit code from the Federal Information Processing Standards (FIPS) publication 6-4 for representation of counties and equivalent entities, based on the nominal designation in the fire report (not from a spatial overlay).
- **FIPS_NAME**: County name from the FIPS publication 6-4 for representation of counties and equivalent entities, based on the nominal designation in the fire report (not from a spatial overlay).
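
The `FIRE_SIZE_CLASS` thresholds listed above can be checked against `FIRE_SIZE` directly. The following is a minimal sketch under two assumptions that the column description does not settle: sizes strictly between 0.25 and 0.26 acres are treated as class B, and `FIRE_SIZE` contains no missing values (the summary statistics in this notebook report a full count). `fire_size_class` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def fire_size_class(acres: pd.Series) -> pd.Series:
    """Map fire size in acres to the A-G size class codes."""
    values = acres.to_numpy(dtype=float)
    # Lower edges of classes C..G; class B covers everything below 10 acres
    edges = np.array([10.0, 100.0, 300.0, 1000.0, 5000.0])
    labels = np.array(list("BCDEFG"))
    # side="right" sends a value equal to an edge into the higher class (100 -> D)
    codes = labels[np.searchsorted(edges, values, side="right")]
    # Class A is the only class closed at its upper edge (<= 0.25 acres)
    codes = np.where(values <= 0.25, "A", codes)
    return pd.Series(codes, index=acres.index, name="FIRE_SIZE_CLASS")


# Sizes taken from rows displayed in this notebook
sizes = pd.Series([0.1, 0.25, 1.0, 50.0, 100.0, 24892.0])
print(fire_size_class(sizes).tolist())
```

On the loaded data, comparing `fire_size_class(df["FIRE_SIZE"])` with `df["FIRE_SIZE_CLASS"]` would be one way to spot records whose class code disagrees with the recorded size.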

In [50]:
df.describe()

Unnamed: 0,OBJECTID,FOD_ID,FIRE_YEAR,DISCOVERY_DOY,DISCOVERY_TIME,CONT_DOY,CONT_TIME,FIRE_SIZE,LATITUDE,LONGITUDE,FIPS_CODE
count,2303566.0,2303566.0,2303566.0,2303566.0,1514471.0,1408753.0,1312686.0,2303566.0,2303566.0,2303566.0,1637787.0
mean,1151784.0,118510000.0,2006.167,165.9714,1445.252,170.7579,1523.731,78.16088,36.96623,-96.35792,27413.64
std,664982.4,162156400.0,8.044361,89.75278,425.3662,86.26373,446.0993,2630.832,6.00826,16.6436,16944.84
min,1.0,1.0,1992.0,1.0,0.0,1.0,0.0,1e-05,17.93972,-178.8026,1001.0
25%,575892.2,622549.2,2000.0,91.0,1234.0,99.0,1303.0,0.1,33.0139,-111.0361,12101.0
50%,1151784.0,1403630.0,2006.0,166.0,1455.0,176.0,1554.0,0.8,35.7225,-93.47009,28085.0
75%,1727675.0,300007100.0,2013.0,231.0,1711.0,232.0,1810.0,3.0,40.89029,-82.51,45001.0
max,2303566.0,400733000.0,2020.0,366.0,2359.0,366.0,2359.0,662700.0,70.3306,-65.25694,72147.0


### Data Cleaning and Processing
- Handle missing values.
  - Calculate the percentage of missing values for each column.
  - Drop columns in which more than 50% of the values are missing.
- Remove duplicate rows.
- Ensure data consistency by converting columns to appropriate data types.

In [51]:
# Total number of missing values in each column
total_missing_values = df.isna().sum()
total_missing_values

OBJECTID                               0
Shape                                  0
FOD_ID                                 0
FPA_ID                                 0
SOURCE_SYSTEM_TYPE                     0
SOURCE_SYSTEM                          0
NWCG_REPORTING_AGENCY                  0
NWCG_REPORTING_UNIT_ID                 0
NWCG_REPORTING_UNIT_NAME               0
SOURCE_REPORTING_UNIT                  0
SOURCE_REPORTING_UNIT_NAME             0
LOCAL_FIRE_REPORT_ID             1825891
LOCAL_INCIDENT_ID                 744411
FIRE_CODE                        1906254
FIRE_NAME                         995415
ICS_209_PLUS_INCIDENT_JOIN_ID    2270072
ICS_209_PLUS_COMPLEX_JOIN_ID     2298627
MTBS_ID                          2289696
MTBS_FIRE_NAME                   2289696
COMPLEX_NAME                     2297619
FIRE_YEAR                              0
DISCOVERY_DATE                         0
DISCOVERY_DOY                          0
DISCOVERY_TIME                    789095
NWCG_CAUSE_CLASS

In [52]:
missing_percent = total_missing_values.sort_values(ascending=False) * 100 / len(df)
missing_percent

ICS_209_PLUS_COMPLEX_JOIN_ID     99.785593
COMPLEX_NAME                     99.741835
MTBS_FIRE_NAME                   99.397890
MTBS_ID                          99.397890
ICS_209_PLUS_INCIDENT_JOIN_ID    98.545993
NWCG_CAUSE_AGE_CATEGORY          96.721301
FIRE_CODE                        82.752307
LOCAL_FIRE_REPORT_ID             79.263672
FIRE_NAME                        43.211916
CONT_TIME                        43.015047
CONT_DOY                         38.844687
CONT_DATE                        38.844687
DISCOVERY_TIME                   34.255368
LOCAL_INCIDENT_ID                32.315592
FIPS_NAME                        28.902146
FIPS_CODE                        28.902102
COUNTY                           28.902102
SOURCE_REPORTING_UNIT_NAME        0.000000
SOURCE_REPORTING_UNIT             0.000000
STATE                             0.000000
OWNER_DESCR                       0.000000
LONGITUDE                         0.000000
LATITUDE                          0.000000
FIRE_SIZE_C

In [53]:
# Drop columns with more than 50% missing values
threshold = 50
columns_to_drop = missing_percent[missing_percent > threshold].index
df = df.drop(columns=columns_to_drop)
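
To see how the 50% threshold behaves, the same rule can be tried on a toy frame. This is an illustrative sketch only: the values are made up for the example (`EXAMPLE COMPLEX` is a placeholder), and `isna().mean()` is used as a shortcut that should give the same percentages as dividing the missing counts by `len(df)`.

```python
import pandas as pd

toy = pd.DataFrame({
    "FIRE_SIZE": [0.1, 0.25, 1.0, 50.0],                    # 0% missing
    "FIRE_NAME": ["FOUNTAIN", None, "SLACK", "DEER"],       # 25% missing
    "COMPLEX_NAME": [None, None, None, "EXAMPLE COMPLEX"],  # 75% missing
})

# The mean of a boolean mask is the fraction of True values, i.e. the share missing
missing_pct = toy.isna().mean() * 100

# Keep columns at or below the threshold; this mirrors dropping those above it
threshold = 50
kept = toy.loc[:, missing_pct <= threshold]
print(kept.columns.tolist())
```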

In [54]:
# Remove rows with remaining missing values (if any)
df = df.dropna()
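
Calling `dropna()` with no arguments removes every row that has a missing value in any remaining column. The later `df.info()` output shows 528,166 rows left out of 2,303,566, so roughly 77% of the records are discarded at this step. If that is more aggressive than intended, one alternative is to require completeness only in the columns the analysis needs. The sketch below illustrates the difference on a toy frame. The `required` list is a hypothetical choice, not one made in this notebook.

```python
import pandas as pd

toy = pd.DataFrame({
    "FIRE_SIZE": [0.1, 1.0, 100.0],
    "STATE": ["CA", "CO", "CA"],
    "CONT_DATE": ["2/2/2005", None, None],
    "FIPS_NAME": ["Plumas County", None, "Madera County"],
})

# Hypothetical set of columns that must be present for the analysis
required = ["FIRE_SIZE", "STATE"]

all_complete = toy.dropna()                      # rows complete in every column
required_complete = toy.dropna(subset=required)  # rows complete in required columns only
print(len(all_complete), len(required_complete))
```

Rows kept by the `subset` variant can still contain missing values in other columns, so later steps would need to handle those explicitly.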

In [55]:
# Check for duplicate rows
df.duplicated().sum()

0
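
`df.duplicated()` compares entire rows, so two records that differ in any single column are not counted as duplicates. Since `FOD_ID` is described above as a unique record identifier, a complementary check is to look for repeated keys. The sketch below shows the difference on made-up rows and is offered as an optional extra check, not as something this notebook ran.

```python
import pandas as pd

toy = pd.DataFrame({
    "FOD_ID": [1, 2, 2],
    # The trailing space in the last name hides the repeat from a full-row comparison
    "FIRE_NAME": ["FOUNTAIN", "PIGEON", "PIGEON "],
})

full_row_dupes = int(toy.duplicated().sum())
key_dupes = int(toy.duplicated(subset=["FOD_ID"]).sum())
print(full_row_dupes, key_dupes)
```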

In [56]:
# Remove duplicate rows (if any)
df = df.drop_duplicates()

In [57]:
# Convert DISCOVERY_DATE and CONT_DATE to datetime format
df['DISCOVERY_DATE'] = pd.to_datetime(df['DISCOVERY_DATE'], errors='coerce')
df['CONT_DATE'] = pd.to_datetime(df['CONT_DATE'], errors='coerce')
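
The date columns hold only the calendar day, while `DISCOVERY_TIME` and `CONT_TIME` store the time of day as `hhmm` numbers (for example `845.0` for 08:45). A possible next step, sketched here on toy rows copied from the output above, is to merge each pair into a full timestamp and derive the time to containment. `combine_date_time` and `duration_hours` are hypothetical names introduced for this example, and the explicit `%m/%d/%Y` format assumes the raw month/day/year strings seen in the CSV.

```python
import pandas as pd

toy = pd.DataFrame({
    "DISCOVERY_DATE": ["6/28/2004", "5/12/2004"],
    "DISCOVERY_TIME": [1600.0, 845.0],
    "CONT_DATE": ["7/3/2004", "5/12/2004"],
    "CONT_TIME": [1400.0, 1530.0],
})


def combine_date_time(dates: pd.Series, hhmm: pd.Series) -> pd.Series:
    """Merge an m/d/yyyy date column with an hhmm time column into timestamps."""
    parsed = pd.to_datetime(dates, format="%m/%d/%Y", errors="coerce")
    hhmm = pd.to_numeric(hhmm, errors="coerce")
    hours = pd.to_timedelta(hhmm // 100, unit="h")    # 845 -> 8 hours
    minutes = pd.to_timedelta(hhmm % 100, unit="min")  # 845 -> 45 minutes
    # Missing dates or times propagate as NaT
    return parsed + hours + minutes


discovered = combine_date_time(toy["DISCOVERY_DATE"], toy["DISCOVERY_TIME"])
contained = combine_date_time(toy["CONT_DATE"], toy["CONT_TIME"])

# Hours between discovery and containment
duration_hours = (contained - discovered).dt.total_seconds() / 3600
print(duration_hours.tolist())
```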

In [58]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 528166 entries, 0 to 2303565
Data columns (total 31 columns):
 #   Column                      Non-Null Count   Dtype         
---  ------                      --------------   -----         
 0   OBJECTID                    528166 non-null  int64         
 1   Shape                       528166 non-null  object        
 2   FOD_ID                      528166 non-null  int64         
 3   FPA_ID                      528166 non-null  object        
 4   SOURCE_SYSTEM_TYPE          528166 non-null  object        
 5   SOURCE_SYSTEM               528166 non-null  object        
 6   NWCG_REPORTING_AGENCY       528166 non-null  object        
 7   NWCG_REPORTING_UNIT_ID      528166 non-null  object        
 8   NWCG_REPORTING_UNIT_NAME    528166 non-null  object        
 9   SOURCE_REPORTING_UNIT       528166 non-null  object        
 10  SOURCE_REPORTING_UNIT_NAME  528166 non-null  object        
 11  LOCAL_INCIDENT_ID           528166 non

In [59]:
df.head()

Unnamed: 0,OBJECTID,Shape,FOD_ID,FPA_ID,SOURCE_SYSTEM_TYPE,SOURCE_SYSTEM,NWCG_REPORTING_AGENCY,NWCG_REPORTING_UNIT_ID,NWCG_REPORTING_UNIT_NAME,SOURCE_REPORTING_UNIT,SOURCE_REPORTING_UNIT_NAME,LOCAL_INCIDENT_ID,FIRE_NAME,FIRE_YEAR,DISCOVERY_DATE,DISCOVERY_DOY,DISCOVERY_TIME,NWCG_CAUSE_CLASSIFICATION,NWCG_GENERAL_CAUSE,CONT_DATE,CONT_DOY,CONT_TIME,FIRE_SIZE,FIRE_SIZE_CLASS,LATITUDE,LONGITUDE,OWNER_DESCR,STATE,COUNTY,FIPS_CODE,FIPS_NAME
0,1,b'\x00\x01\xad\x10\x00\x00\xc8\xce\n[_@^\xc0\x...,1,FS-1418826,FED,FS-FIRESTAT,FS,USCAPNF,Plumas National Forest,511,Plumas National Forest,PNF-47,FOUNTAIN,2005,2005-02-02,33,1300.0,Human,Power generation/transmission/distribution,2005-02-02,33.0,1730.0,0.1,A,40.036944,-121.005833,USFS,CA,63.0,6063.0,Plumas County
1,2,b'\x00\x01\xad\x10\x00\x00\xc8\xe594\xe2\x19^\...,2,FS-1418827,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,13,PIGEON,2004,2004-05-12,133,845.0,Natural,Natural,2004-05-12,133.0,1530.0,0.25,A,38.933056,-120.404444,USFS,CA,61.0,6061.0,Placer County
2,3,b'\x00\x01\xad\x10\x00\x00x{\xac \x13/^\xc0@\x...,3,FS-1418835,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,021,SLACK,2004,2004-05-31,152,1921.0,Human,Debris and open burning,2004-05-31,152.0,2024.0,0.1,A,38.984167,-120.735556,STATE OR PRIVATE,CA,17.0,6017.0,El Dorado County
3,4,b'\x00\x01\xad\x10\x00\x00\xc8\x13u\xd7s\xfa]\...,4,FS-1418845,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,6,DEER,2004,2004-06-28,180,1600.0,Natural,Natural,2004-07-03,185.0,1400.0,0.1,A,38.559167,-119.913333,USFS,CA,3.0,6003.0,Alpine County
4,5,b'\x00\x01\xad\x10\x00\x00\xd0\x11y\xf8\xb6\xf...,5,FS-1418847,FED,FS-FIRESTAT,FS,USCAENF,Eldorado National Forest,503,Eldorado National Forest,7,STEVENOT,2004,2004-06-28,180,1600.0,Natural,Natural,2004-07-03,185.0,1200.0,0.1,A,38.559167,-119.933056,USFS,CA,3.0,6003.0,Alpine County
