## Run the fire expansion and merging algorithm

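`Fire_Forward` reads in the preprocessed data created in the `Ingest` notebook and runs the fire expansion and merging algorithm over the requested time range. It returns the `allfires` object, the `allpixels` dataframe, and a geodataframe (`gdf`) describing each fire at each timestep.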

In [1]:
import datetime
import pandas as pd
import geopandas as gpd

import FireMain, FireTime, FireObj, FireConsts, FireVector
from utils import timed

region = ["CONUS",]  # note you don't need the shape in here, just the name
tst = [2023, 8, 28, 'AM']
ted = [2023, 9, 6, 'AM']
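
The `tst` and `ted` values are `[year, month, day, 'AM' | 'PM']` lists. As a rough sketch of how such a range could be stepped through, the hypothetical `half_day_steps` below assumes each step spans twelve hours, with `AM` at 00:00 and `PM` at 12:00, and that both endpoints are included. It is a stand-in for illustration only and is not the actual `FireTime.t_generator`, which may behave differently.

```python
import datetime


def half_day_steps(tst, ted):
    """Yield [year, month, day, 'AM'|'PM'] lists from tst through ted inclusive.

    Hypothetical helper: assumes 'AM' maps to 00:00 and 'PM' to 12:00.
    """
    def to_dt(t):
        return datetime.datetime(t[0], t[1], t[2], 0 if t[3] == "AM" else 12)

    cur, end = to_dt(tst), to_dt(ted)
    while cur <= end:
        yield [cur.year, cur.month, cur.day, "AM" if cur.hour == 0 else "PM"]
        cur += datetime.timedelta(hours=12)


steps = list(half_day_steps([2023, 8, 28, "AM"], [2023, 9, 6, "AM"]))
```

Under these assumptions the range used in this notebook covers 19 half-day steps.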

In [2]:
allfires, allpixels, gdf = FireMain.Fire_Forward(tst=tst, ted=ted, restart=False, region=region)

2024-01-31 09:10:37,155 - FireLog - INFO - func:read_preprocessed took: 5.89 ms
2024-01-31 09:10:37,163 - FireLog - INFO - func:read_preprocessed took: 6.67 ms
2024-01-31 09:10:37,171 - FireLog - INFO - func:read_preprocessed took: 7.75 ms
2024-01-31 09:10:37,177 - FireLog - INFO - func:read_preprocessed took: 5.43 ms
2024-01-31 09:10:37,182 - FireLog - INFO - func:read_preprocessed took: 4.55 ms
2024-01-31 09:10:37,188 - FireLog - INFO - func:read_preprocessed took: 5.35 ms
2024-01-31 09:10:37,194 - FireLog - INFO - func:read_preprocessed took: 5.93 ms
2024-01-31 09:10:37,199 - FireLog - INFO - func:read_preprocessed took: 3.62 ms
2024-01-31 09:10:37,204 - FireLog - INFO - func:read_preprocessed took: 4.31 ms
2024-01-31 09:10:37,208 - FireLog - INFO - func:read_preprocessed took: 3.71 ms
2024-01-31 09:10:37,214 - FireLog - INFO - func:read_preprocessed took: 5.91 ms
2024-01-31 09:10:37,218 - FireLog - INFO - func:read_preprocessed took: 3.54 ms
2024-01-31 09:10:37,224 - FireLog - INFO

## Serialize to disk

- `allpixels` -> one file for each timestep `t` (one row for each pixel).
- `allfires` -> one geoparquet file holding all information about each fire at each time (one row for each burning fire at each `t`). This file can be used to rehydrate the `allfires` object and pick back up running the algorithm, or it can be used to write all the snapshot and largefire timeseries files.
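
The cells in this notebook build the same file names inline several times. One possible way to keep the write and read paths in sync is to factor the naming into small helpers. The names `t_stem`, `pixels_path`, and `allfires_path` below are hypothetical and not part of the existing modules; the format strings simply mirror the ones used in this notebook.

```python
def t_stem(t):
    """Format a [year, month, day, 'AM'|'PM'] step as 'YYYYMMDD_AM' or 'YYYYMMDD_PM'."""
    return f"{t[0]}{t[1]:02}{t[2]:02}_{t[3]}"


def pixels_path(region_name, t, root="out"):
    """Path of the per-timestep pixel file (hypothetical helper)."""
    return f"{root}/{region_name}/{t_stem(t)}.txt"


def allfires_path(region_name, t, root="out"):
    """Path of the allfires geoparquet file (hypothetical helper)."""
    return f"{root}/{region_name}/allfires_{t_stem(t)}.parq"


example_pixels = pixels_path("CONUS", [2023, 8, 28, "AM"])
example_allfires = allfires_path("CONUS", [2023, 9, 6, "AM"])
```

With helpers like these, a change to the naming scheme only has to be made in one place.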

In [3]:
%%time
for t in FireTime.t_generator(tst, ted):
    pixels = allpixels[allpixels["t"] == FireTime.t2dt(t)]
    filepath = f"out/{region[0]}/{t[0]}{t[1]:02}{t[2]:02}_{t[3]}.txt"
    pixels.to_csv(filepath)

CPU times: user 129 ms, sys: 4.02 ms, total: 133 ms
Wall time: 143 ms


In [4]:
%%time
t = ted
gdf.to_parquet(f"out/{region[0]}/allfires_{t[0]}{t[1]:02}{t[2]:02}_{t[3]}.parq")

CPU times: user 113 ms, sys: 20 ms, total: 133 ms
Wall time: 131 ms


## Read from disk

In [5]:
%%time
allpixels = pd.concat([
    pd.read_csv(f"out/{region[0]}/{t[0]}{t[1]:02}{t[2]:02}_{t[3]}.txt", index_col="uuid", parse_dates=["t"])
    for t in FireTime.t_generator(tst, ted)
])

CPU times: user 72.4 ms, sys: 7.92 ms, total: 80.4 ms
Wall time: 78.4 ms


In [6]:
%%time
t = ted  # read the file written for the final timestep; avoids relying on `t` from an earlier cell
gdf = gpd.read_parquet(f"out/{region[0]}/allfires_{t[0]}{t[1]:02}{t[2]:02}_{t[3]}.parq")

CPU times: user 102 ms, sys: 43.8 ms, total: 146 ms
Wall time: 140 ms


## Rehydrate the latest allfires

In [7]:
@timed
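def rehydrate_allfires(t, allpixels, gdf, notdead=True):
    """Rebuild the Allfires object for timestep t from allpixels and gdf."""
    dt = FireTime.t2dt(t)

    # Keep only fires that had started by t
    gdf_ = gdf[gdf.t_st <= dt]
    if notdead:
        # Drop fires whose last activity is more than limoffdays before t
        dt_dead = dt - datetime.timedelta(days=FireConsts.limoffdays)
        gdf_ = gdf_[gdf_.t_ed >= dt_dead]

    a = FireObj.Allfires(t)
    for fid, gdf_fid in gdf_.groupby(level=0):
        f = FireObj.Fire(fid, t, allpixels)
        dt_st = gdf_fid.t_st.min()
        dt_ed = gdf_fid.t_ed.max()

        f.t_st = FireTime.dt2t(dt_st)
        f.t_ed = FireTime.dt2t(dt_ed)

        # Copy geometry and type attributes from the row at the fire's latest end time
        gdf_fid_t = gdf_fid.loc[(fid, dt_ed)]
        for k, v in gdf_fid_t.items():
            if k in ["hull", "ftype", "fline"]:
                setattr(f, k, v)

        # Sort the fire into the new or expanded list
        if f.isignition:
            a.fids_new.append(fid)
        else:
            a.fids_expanded.append(fid)
        a.fires[fid] = f
    return a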

For instance, let's rehydrate just the last `allfires` object.

This should be equivalent to the allfires object that we generated at the top of this notebook.

In [8]:
a = rehydrate_allfires(ted, allpixels, gdf)
a

2024-01-31 09:12:48,604 - FireLog - INFO - func:rehydrate_allfires took: 1.77 sec


<Allfires at t=[2023, 9, 6, 'AM'] with n_fires=4030>

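What does it look like to rehydrate the object for all timesteps?

Note that rehydration takes longer as more data accumulates. The time per step should level off eventually as long as `notdead=True`, because fires that have been inactive for longer than `FireConsts.limoffdays` days are then excluded.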

In [9]:
%%time
for t in FireTime.t_generator(tst, ted):
    a = rehydrate_allfires(t, allpixels, gdf)

2024-01-31 09:12:51,583 - FireLog - INFO - func:rehydrate_allfires took: 184.15 ms
2024-01-31 09:12:51,927 - FireLog - INFO - func:rehydrate_allfires took: 342.98 ms
2024-01-31 09:12:52,421 - FireLog - INFO - func:rehydrate_allfires took: 493.79 ms
2024-01-31 09:12:53,004 - FireLog - INFO - func:rehydrate_allfires took: 582.32 ms
2024-01-31 09:12:53,661 - FireLog - INFO - func:rehydrate_allfires took: 655.69 ms
2024-01-31 09:12:54,438 - FireLog - INFO - func:rehydrate_allfires took: 776.19 ms
2024-01-31 09:12:55,279 - FireLog - INFO - func:rehydrate_allfires took: 840.69 ms
2024-01-31 09:12:56,288 - FireLog - INFO - func:rehydrate_allfires took: 1.01 sec
2024-01-31 09:12:57,338 - FireLog - INFO - func:rehydrate_allfires took: 1.05 sec
2024-01-31 09:12:58,476 - FireLog - INFO - func:rehydrate_allfires took: 1.14 sec
2024-01-31 09:12:59,727 - FireLog - INFO - func:rehydrate_allfires took: 1.25 sec
2024-01-31 09:13:01,028 - FireLog - INFO - func:rehydrate_allfires took: 1.30 sec
2024-01-3

CPU times: user 21 s, sys: 28.3 ms, total: 21 s
Wall time: 21 s
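
The expectation that rehydration time levels off rests on the dead-fire window: only fires that started on or before `t` and were last active within `limoffdays` of `t` are kept. Here is a minimal sketch of that selection on plain Python records rather than a geodataframe. The function name `select_not_dead`, the toy records, and the value `LIMOFFDAYS = 5` are assumptions for illustration; the real threshold lives in `FireConsts.limoffdays`.

```python
import datetime

LIMOFFDAYS = 5  # hypothetical value; the real one is FireConsts.limoffdays


def select_not_dead(records, dt, limoffdays=LIMOFFDAYS):
    """Keep records that started by dt and were last active within limoffdays of dt."""
    dt_dead = dt - datetime.timedelta(days=limoffdays)
    return [r for r in records if r["t_st"] <= dt and r["t_ed"] >= dt_dead]


records = [
    # last active long before dt, so it counts as dead
    {"fid": 1, "t_st": datetime.datetime(2023, 8, 28), "t_ed": datetime.datetime(2023, 8, 29)},
    # active within the window
    {"fid": 2, "t_st": datetime.datetime(2023, 9, 1), "t_ed": datetime.datetime(2023, 9, 5)},
    # starts after dt, so it does not exist yet
    {"fid": 3, "t_st": datetime.datetime(2023, 9, 7), "t_ed": datetime.datetime(2023, 9, 7)},
]

dt = datetime.datetime(2023, 9, 6)
kept = [r["fid"] for r in select_not_dead(records, dt)]
```

Because the window has a fixed length, the number of fires that pass the filter depends on recent activity rather than on the total length of the run.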


## TO DO:

- [ ] figure out if the merge behavior is correct
- [ ] what to do with invalid fires?
- [ ] do we really need allpixels when rehydrating?
- [ ] should the allfires object hold on to the gdf and update it itself?
- [ ] do the snapshot files have rows merged based on the fid?
- [ ] figure out how to write the desired output files solely from the allfires geodataframe and the allpixels dataframe. I think it's important not to reinstantiate the objects for this.