In [None]:
import pandas as pd 
import geopandas as gpd

# NIFC Datasets

## Annual summaries 

INCLUDES ALASKA (and probably Hawaii), not just CONUS

2018: 58,083 wildfires for 8,767,492 acres, of which 1167 were large

2019: 50,477 wildfires for 4,664,364 acres, of which 806 were large

2020: 58,950 fires reported, 10,122,336 acres, 999 large fires 

2021: 58,985 wildfires burned 7,125,643 acres nationally. Of these, 943 were large fires

2022: 68,988 wildfires burned 7,577,183 acres, of which 1289 were large

## Historic Perimeters Combined 2000-2018 GeoMAC

Only goes through 2019 (yes, although the title says it ends in 2018). Katrina recommends this over InterAgencyFirePerimeter, but skipping for now due to the date range.

## WFIGS Interagency Fire Perimeters 

Does not seem to contain more than one perimeter per fire, per visual inspection on NIFC ArcGIS site. 

25,375 records. 

Newer/authoritative source going forward? 

Why is there such a difference between the 2021 counts in InterAgencyFirePerimeterHistory (4743) and WFIGS (2123)?

Contains 2022-2024, plus some 2020-2021 records:
```
poly_CreateDate
2023.0    5367
2024.0    5322
2022.0    2279
2021.0    2123
2020.0      24
Name: count, dtype: int64
```

In [None]:
wfigs_path = "~/feds-benchmarking/data/WFIGS_Interagency_Perimeters_1406150765430228857.geojson"

# Keep the path in its own variable so the cell can be re-run safely
wfigs = gpd.read_file(wfigs_path)

In [None]:
wfigs['poly_CreateDate'] = pd.to_datetime(wfigs['poly_CreateDate'])
# Years print as floats (2023.0) because rows with a missing date become NaT
wfigs['poly_CreateDate'].dt.year.value_counts()

## InterAgencyFirePerimeterHistory_All_Years_View

There is some overlap with WFIGS, according to the online viewer, but not a ton. 

Missing 2023 (this is probably in WFIGS, but InterAgency has 4743 records for 2021 and WFIGS has 2123, so the two don't seem comparable). 

```
FIRE_YEAR
2021    4743
2020    4237
2018    2888
2019    2551
Name: count, dtype: int64
```
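
One way to line the two perimeter sources up is to put their per-year counts side by side. The sketch below hard-codes the counts printed in this notebook rather than reading the files, and the names `inter_counts`, `wfigs_counts`, and `compare_counts` are illustrative only.

```python
# Per-year record counts copied from the value_counts() outputs in this notebook.
inter_counts = {2018: 2888, 2019: 2551, 2020: 4237, 2021: 4743}
wfigs_counts = {2020: 24, 2021: 2123, 2022: 2279, 2023: 5367, 2024: 5322}


def compare_counts(a, b):
    """Return {year: (count_a, count_b)} over the union of years (None = year absent)."""
    return {year: (a.get(year), b.get(year)) for year in sorted(set(a) | set(b))}


table = compare_counts(inter_counts, wfigs_counts)
for year, (n_inter, n_wfigs) in table.items():
    print(year, n_inter, n_wfigs)

# Only years present in both sources can be compared directly.
overlap = [
    year
    for year, (n_inter, n_wfigs) in table.items()
    if n_inter is not None and n_wfigs is not None
]
print("overlapping years:", overlap)
```

Only the overlapping years say anything about whether the two sources count fires the same way.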

In [None]:
# Note: the downloaded file is definitely wrong; it is cut off at 2000 records when there should be 14,419 just for 2018-22
# inter = "~/feds-benchmarking/data/InterAgencyFirePerimeterHistory_All_Years_View.geojson" 

inter_path = "~/feds-benchmarking/data/InterAgencyFirePerimeterHistory_All_Years_View.geojson"

# Keep the path in its own variable so the cell can be re-run safely
inter = gpd.read_file(inter_path, engine='pyogrio')
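
A cutoff at exactly 2000 records looks like a per-request record cap. Assuming the file was exported from an ArcGIS feature service (an assumption; the notebook does not say how it was downloaded), the usual workaround is to request the data in pages. The sketch below shows only the paging loop: `fetch_all` and `fetch_page` are hypothetical names, and `fake_fetch` stands in for a real request that would pass `resultOffset` and `resultRecordCount` to the service's query endpoint.

```python
def fetch_all(fetch_page, page_size=2000):
    """Collect features page by page until an empty page comes back.

    fetch_page(offset, page_size) is a hypothetical callable returning a list
    of features; against a real service it would issue one paged query.
    """
    features = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        features.extend(page)
        offset += len(page)
    return features


# Stand-in for a real service: 14,419 fake records served in pages.
fake_records = list(range(14419))


def fake_fetch(offset, page_size):
    return fake_records[offset:offset + page_size]


all_features = fetch_all(fake_fetch)
print(len(all_features))
```

Comparing `len(all_features)` against the expected 14,419 is then a quick check that nothing was truncated.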

In [None]:
len(inter)

In [None]:
inter.FIRE_YEAR.value_counts()

# MTBS Dataset

31248 records, 1984-2022. 

Generally seems to have one perimeter per fire (the final perimeter). MTBS burned area seems to run higher than the burned area reported by NIFC (not saying that NIFC is correct, just that MTBS is higher). 

Average of 922.5 fires/year between 2018 and 2022. 
During this period, MTBS had on average 1.54% of the number of fires reported by NIFC and accounted for 90.78% of the burned area reported by NIFC. (THIS IS WRONG: NIFC contains Alaska while MTBS doesn't, so MTBS exceeds NIFC by more than shown.) 

#### 2022
For 2022, for example, sum of burned area in the dataset = 2,293,483 acres. 
[NIFC annual report for 2022](https://www.nifc.gov/sites/default/files/NICC/2-Predictive%20Services/Intelligence/Annual%20Reports/2022/annual_report.2.pdf) says that "In 2022, there were 68,988 wildfires that burned 7,577,183 acres". 1289 large fires in NIFC 

MTBS has 830 large fires (>1000 acres in the West, >500 in the East) that burned 2,293,483 acres. 
That is 1.2% of reported fires in the US and accounts for 30.3% of burned area. 

**AH, you know what it is: Alaska is included in the NIFC summary stats but not the MTBS data.** 3 million acres burned in AK in 2022 
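
To see how much the Alaska mismatch matters, the 2022 share can be recomputed with Alaska removed from the NIFC total. This is a rough sketch: the 3,000,000-acre Alaska figure is the rounded number from the note above, not an exact value, and the variable names are illustrative.

```python
nifc_acres_2022 = 7_577_183    # NIFC national total, includes Alaska
mtbs_acres_2022 = 2_293_483    # sum of burned area in the MTBS dataset
alaska_acres_2022 = 3_000_000  # rough figure from the note above (assumption)

nifc_without_ak = nifc_acres_2022 - alaska_acres_2022

share_raw = mtbs_acres_2022 / nifc_acres_2022
share_adjusted = mtbs_acres_2022 / nifc_without_ak

print(f"raw: {share_raw:.1%}, Alaska removed: {share_adjusted:.1%}")
```

Even with Alaska removed, 2022 stays well below the other years, so Alaska may not be the whole explanation.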

#### 2021
Same analysis for 2021, because I have 2021 FEDS done but not 2022 yet. 

[NIFC 2021 report](https://www.nifc.gov/sites/default/files/NICC/2-Predictive%20Services/Intelligence/Annual%20Reports/2021/annual_report_0.pdf) says that 58,985 wildfires burned 7,125,643 acres nationally. Of these, 943 were large fires 

MTBS has 1022 large fires burning 8,256,844 acres? 1.7% of fires accounting for 115.9% (?) of burned area. 

#### 2020 
NIFC: 58,950 fires reported, 10,122,336 acres burned (this was a very bad year, for context).
MTBS: 815 fires, 10,345,285 acres burned 
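1.4% of fires. NIFC's burned area is 97.8% of the MTBS burned area; expressed the same way as the other years, MTBS is 102.2% of NIFC's burned area. 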

Note: NIFC defines large fires as "100 acres in timber fuel types, 300 acres in grass and brush fuel types, or are otherwise managed by a Type 1 or 2 Incident Management Team or NIMO" and recorded 999 in 2020. 
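
Because the two "large fire" definitions differ, the large-fire counts are not directly comparable. A minimal sketch of both rules as stated in these notes follows; the function names, the `"west"`/`"east"` and `"timber"`/`"grass_brush"` labels, and the inclusive NIFC thresholds are assumptions made for illustration.

```python
MTBS_THRESHOLDS = {"west": 1000, "east": 500}          # acres, per the notes above
NIFC_THRESHOLDS = {"timber": 100, "grass_brush": 300}  # acres, per the NIFC quote


def is_large_mtbs(acres, region):
    """MTBS rule as written in these notes: >1000 acres West, >500 acres East."""
    return acres > MTBS_THRESHOLDS[region]


def is_large_nifc(acres, fuel_type, imt_managed=False):
    """NIFC rule: size threshold by fuel type, or managed by a Type 1/2 IMT or NIMO.

    Treating the thresholds as inclusive (>=) is an assumption.
    """
    return imt_managed or acres >= NIFC_THRESHOLDS[fuel_type]


# A 400-acre timber fire in the West is large for NIFC but not for MTBS.
example = (is_large_nifc(400, "timber"), is_large_mtbs(400, "west"))
print(example)
```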

#### 2019 
NIFC: 50,477 wildfires for 4,664,364 acres, of which 806 were large fires
MTBS: 814 large fires for 5,017,810 acres
1.6% of fires, 107.6% of reported burned area

#### 2018 
NIFC: 58,083 wildfires for 8,767,492 acres, of which 1167 were large fires. 
MTBS: 1131 large fires, 8,966,731 acres
1.9% of fires, 102.3% of burned area
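
The per-year percentages above can be reproduced in one pass. The sketch below hard-codes the NIFC and MTBS figures from these notes (it does not read the datasets), and `stats` and `ratios` are illustrative names.

```python
# (nifc_fires, nifc_acres, mtbs_fires, mtbs_acres), copied from the notes above.
stats = {
    2018: (58_083, 8_767_492, 1131, 8_966_731),
    2019: (50_477, 4_664_364, 814, 5_017_810),
    2020: (58_950, 10_122_336, 815, 10_345_285),
    2021: (58_985, 7_125_643, 1022, 8_256_844),
    2022: (68_988, 7_577_183, 830, 2_293_483),
}

ratios = {}
for year, (nifc_fires, nifc_acres, mtbs_fires, mtbs_acres) in stats.items():
    fire_pct = 100 * mtbs_fires / nifc_fires  # MTBS fires as % of NIFC fires
    area_pct = 100 * mtbs_acres / nifc_acres  # MTBS acres as % of NIFC acres
    ratios[year] = (fire_pct, area_pct)
    print(f"{year}: {fire_pct:.1f}% of fires, {area_pct:.1f}% of burned area")

# Mean of the yearly area percentages (not weighted by acreage).
mean_area_pct = sum(area for _, area in ratios.values()) / len(ratios)
print(f"mean: {mean_area_pct:.2f}% of burned area")
```

Every ratio here is MTBS divided by NIFC, so values above 100% mean MTBS reports more burned area than NIFC for that year.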

In [None]:
8966731 / 8767492  # 2018: MTBS acres / NIFC acres

## Stats

In [None]:
mtbs_path = "~/feds-benchmarking/data/mtbs_perimeter_data/mtbs_perims_DD.shp" 

# Keep the path in its own variable so the cell can be re-run safely
mtbs = gpd.read_file(mtbs_path)

In [None]:
mtbs.columns

In [None]:
mtbs['Ig_Date']

In [None]:
# Count MTBS fires per ignition year
years = pd.to_datetime(mtbs['Ig_Date']).dt.year.value_counts().sort_index().to_dict()

years

In [None]:
# Mean number of MTBS fires per year over 2018-2022
sum(years[year] for year in range(2018, 2023)) / 5

In [None]:
mtbs['Ig_Date'] = pd.to_datetime(mtbs['Ig_Date'])

In [None]:
mtbs = mtbs[mtbs['Ig_Date'].dt.year >= 2018]

In [None]:
mtbs.columns

In [None]:
area = mtbs[mtbs['Ig_Date'].dt.year == 2018]['BurnBndAc'].sum()
area

In [None]:
# Confirm that BurnBndAc is in acres by checking one fire against the web viewer
mtbs[mtbs['Incid_Name'].str.contains("DEAKLE", na=False)]