# Smoke Estimate Calculation

This script processes the fire data set generated by the first script (Data collection code) to calculate annual smoke estimates for Farmington, New Mexico. Code authored by Adithyaa Vaasen, UW MS DS, as part of the DATA 512 course project.

### Imports

In [1]:
import warnings
warnings.filterwarnings("ignore")
import pandas as pd

### Read the output of Code 1 (Processed Fire data)

In [2]:
df = pd.read_csv('C:/Users/adith/Documents/data-512-common-analysis/intermediate/finalfiresdata.csv')

### Smoke Estimate Calculation

While researching fires (https://www.epa.gov/), I found that many factors influence fire behavior and smoke production.

Size of the Fire (GISAcres):
- Larger fires tend to produce more smoke.
- GISAcres can serve as a proxy for the potential amount of smoke produced.

Type of Vegetation Burned:
- Different types of vegetation produce different amounts of smoke.
- If data on the type of vegetation is available, it could be incorporated into the model; otherwise, an average can be assumed.

Fire Intensity:
- Intense fires typically consume more biomass and can generate more smoke.
- This information might not be directly available, but fire intensity might be inferred from GISAcres or from specific fire behavior models.

Combustion Efficiency:
- Smoldering fires tend to produce more smoke than flaming fires.
- This is more difficult to estimate without detailed data, but assumptions can be made based on fire type or anecdotal evidence from fire reports.

Weather Conditions:
- Wind speed and direction can affect the dispersion and dilution of smoke.
- Weather conditions at the time of each fire might be used to adjust the estimated smoke production.

Distance to City (shortest_dist):
- The farther away a fire is, the more its smoke will disperse and dilute before reaching the city.
- A decay function could be applied to account for smoke dilution over distance.
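
To make the distance-decay idea concrete, here is a minimal sketch of an inverse-distance weight. The helper name `distance_weight` and the sample distances are illustrative assumptions rather than part of the analysis itself, and distances are assumed to be strictly positive.

```python
def distance_weight(dist, alpha=1.0):
    """Hypothetical helper: inverse-distance weight 1 / dist**alpha."""
    if dist <= 0:
        raise ValueError("dist must be positive")
    return 1.0 / dist ** alpha

# Illustrative distances, in the same units as shortest_dist
for d in (10, 100, 1000):
    print(d, distance_weight(d, alpha=1), distance_weight(d, alpha=2))
```

A larger exponent makes the weight fall off faster, so distant fires contribute less to the estimate.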

With the data in hand, I wanted to start with something simple.

- Weighted Smoke Impact Estimation:
    For each fire event, we consider two factors: the size of the fire (measured in GISAcres) and its proximity to Farmington, NM (measured by shortest_dist). The smoke produced is assumed to be directly proportional to the size of the fire, while the smoke impact experienced by the city is assumed to be inversely proportional to the distance from the city.

- Impact Decay with Distance:
    As smoke travels, it disperses and dilutes, so the impact of a fire on air quality should diminish with distance. This can be represented by a decay function. For simplicity, we can use an inverse distance weighting (IDW), which assumes that the impact decreases inversely with distance.

- Annual Aggregation Method:
    Summing the impacts could overstate the impact for years with many fires, while taking the maximum could understate the cumulative impact of multiple fires. The average smoke impact over the year is chosen instead. This avoids the need to determine the exact overlap of smoke in time while still giving a sense of the overall yearly impact.

- Formula:
    We can define the smoke impact estimate for each fire as follows:
    $$\text{Smoke Estimate per Fire} = \frac{\text{GISAcres}}{\text{shortest\_dist}^{\alpha}}$$
    where α is the decay exponent determining how rapidly the smoke impact falls off with distance. For simplicity, we can start with α = 1, which is a simple inverse relationship.

- Annual Smoke Impact:
    The annual smoke impact for Farmington can then be calculated as:
    $$\text{Annual Smoke Estimate} = \operatorname{Average}(\text{Smoke Estimate per Fire})$$

For each year, we would compute this value for all fires within the specified distance.

Using the above methodology, we will compile annual smoke estimates. The average is chosen because it provides a sense of the overall exposure without disproportionately weighting either large numbers of smaller fires or a few large fires.
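
As a quick illustration of how the choice of aggregation changes the annual figure, the sketch below applies the per-fire formula to a toy table and compares the mean with the sum and the maximum. The values are made up for illustration and do not come from the real fire data set; only the column names match the data used here.

```python
import pandas as pd

# Toy data (made-up values, not from the real fire data set)
toy = pd.DataFrame({
    "FireYear": [2000, 2000, 2001],
    "GISAcres": [100.0, 400.0, 50.0],
    "shortest_dist": [10.0, 200.0, 25.0],
})

alpha = 1  # decay exponent
toy["SmokeEstimate"] = toy["GISAcres"] / toy["shortest_dist"] ** alpha

# Compare the three aggregation choices: mean, sum, and max per year
summary = toy.groupby("FireYear")["SmokeEstimate"].agg(["mean", "sum", "max"])
print(summary)
```

In this toy case the year 2000 has one nearby fire and one larger but distant fire, so the sum (12.0) and the maximum (10.0) both sit well above the mean (6.0).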


In [3]:
# Replace NaNs with 0s
df = df.fillna(0)

# Keep wildfires only; prescribed fires account for very few records
# .copy() avoids pandas' SettingWithCopyWarning when adding a column below
df_wildfires = df[df['FireType'] == 'Wildfire'].copy()

# Calculate smoke impact per fire using inverse distance weighting
alpha = 1  # Decay exponent; to be experimented with going forward
df_wildfires['SmokeEstimate'] = df_wildfires['GISAcres'] / df_wildfires['shortest_dist'] ** alpha

# Calculate annual average smoke impact
annual_smoke_impact = df_wildfires.groupby('FireYear')['SmokeEstimate'].mean().reset_index()

print(annual_smoke_impact)

# Save the annual estimates as a .csv file
annual_smoke_impact.to_csv('C:/Users/adith/Documents/data-512-common-analysis/intermediate/annual_smoke_estimate.csv', index=False)

    FireYear  SmokeEstimate
0       1963       1.584395
1       1964       2.890730
2       1965       0.908946
3       1966       3.315327
4       1967       1.757997
5       1968       1.306879
6       1969       1.134026
7       1970       2.955043
8       1971       3.570148
9       1972       1.498691
10      1973       2.388213
11      1974       1.511972
12      1975       1.478003
13      1976       1.810838
14      1977       2.208096
15      1978       0.768359
16      1979       2.167136
17      1980       1.959832
18      1981       3.032177
19      1982       1.645685
20      1983       2.945636
21      1984       2.763742
22      1985       4.434214
23      1986       3.299114
24      1987       2.871356
25      1988       6.275216
26      1989       3.054647
27      1990       2.881087
28      1991       1.802162
29      1992       2.728625
30      1993       4.789386
31      1994       5.083488
32      1995       3.767417
33      1996       7.451795
34      1997       1