# Pre-processing: Flood Events

**Tasks**
* Extract all major flood dates for every country in the provided food crisis dataset.
* Filter the food crisis data to the month before and the month after each identified flood event.
* Run time: < 1 minute

**Sources**
* Flood Event Dataset: https://global-flood-database.cloudtostreet.ai/

In [1]:
import pandas as pd
import os
import timeit

os.getcwd()
start = timeit.default_timer()


**1. Read Data**
* Data is read in CSV format. Shapefile formats are not recommended due to computational costs.

In [2]:
# Read Food Security data
FC = pd.read_csv('Data/FC_data.csv')


In [3]:
# Read Flood Event data
FE = pd.read_csv('Data/global_flooding_events.csv')


**2. Clean Flooding Event Data**
* Events need to fall within a time frame that is compatible with both the food crisis data (2010-2019) and the SAR data (2015-Present).
* Data will be cleaned to include only countries from the provided food crisis dataset (in this case: Burkina Faso, Mali, Niger).

In [4]:
# CLEAN FE DATA
# Remove FE events from before 2015
FE = FE[FE['year'] >= 2015]

# Remove duplicates
FE = FE.drop_duplicates(subset=['country', 'year', 'month'])
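
The cleaning step above can be sketched on a small table. This is a minimal illustration on made-up rows, assuming the same `country`, `year`, `month` and `day` columns as the flood event data; `toy_fe` is a hypothetical name. It shows that `drop_duplicates` keeps only the first event per country and month, so a second flood in the same month is discarded.

```python
import pandas as pd

# Toy stand-in for the flood event table (values are made up for illustration).
toy_fe = pd.DataFrame({
    'country': ['Mali', 'Mali', 'Niger', 'Niger'],
    'year':    [2012,   2016,   2016,    2016],
    'month':   [8,      7,      9,       9],
    'day':     [14,     20,     3,       11],
})

# Keep only events from 2015 onward with a single boolean mask.
toy_fe = toy_fe[toy_fe['year'] >= 2015]

# Keep the first event per country and month; the Niger event on day 11 is dropped.
toy_fe = toy_fe.drop_duplicates(subset=['country', 'year', 'month'])

print(toy_fe)
```

If several floods in one month should be kept as separate events, `day` would need to be added to `subset`.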

**3. Create Timestamps**
* For both food crises data and flooding event data

In [5]:
# Create Timestamps (MONTHLY)
# FC
FC['date'] = FC['year'].astype(str) + '-' + FC['month'].astype(str)
FC['datetime'] = pd.to_datetime(FC['date'])
FC = FC.drop(['date'], axis=1)
# FE
FE['date'] = FE['year'].astype(str) + '-' + FE['month'].astype(str) + '-' + FE['day'].astype(str)
FE['datetime'] = pd.to_datetime(FE['date'])
FE = FE.drop(['date'], axis=1)
# Round each event date down to the first day of its month
FE['datetime_round'] = FE['datetime'].dt.to_period('M').dt.to_timestamp()
FE = FE.reset_index(drop=True)
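
The month rounding used for the flood events can be checked on a couple of dates. A minimal sketch on made-up values, with `toy` as a hypothetical name; it builds the date directly from the `year`, `month` and `day` columns instead of concatenating strings, which `pd.to_datetime` supports when the columns carry those names.

```python
import pandas as pd

toy = pd.DataFrame({'year': [2016, 2017], 'month': [7, 12], 'day': [20, 31]})

# pd.to_datetime assembles a date from columns named year, month and day.
toy['datetime'] = pd.to_datetime(toy[['year', 'month', 'day']])

# Round each date down to the first day of its month, so it can be
# compared against the monthly food crisis timestamps.
toy['datetime_round'] = toy['datetime'].dt.to_period('M').dt.to_timestamp()

print(toy[['datetime', 'datetime_round']])
```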


**4. Filter Flooding Event Data**
* To only include flooding events for relevant countries

In [6]:
# Extract Flood events for relevant countries (from FC dataset)
country_list = set(FC['country'].tolist()) # Extract list of FEWS NET countries
# Keep only events whose country appears in the FC data
FE2 = FE[FE['country'].isin(country_list)]
FE2 = FE2.sort_values(by=['country', 'year', 'month', 'day'])
FE2 = FE2.drop(['area', 'exposed'], axis=1)
country_list


{'Burkina Faso', 'Mali', 'Niger'}
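
Because this filter depends on exact string matches between the two `country` columns, a quick check can show whether any food crisis country ends up with no flood events. The sketch below uses made-up rows; `toy_fc`, `toy_fe`, `missing` and `counts` are hypothetical names.

```python
import pandas as pd

toy_fc = pd.DataFrame({'country': ['Burkina Faso', 'Mali', 'Niger', 'Mali']})
toy_fe = pd.DataFrame({'country': ['Mali', 'Chad', 'Niger', 'Mali']})

fc_countries = set(toy_fc['country'])
fe_countries = set(toy_fe['country'])

# Countries in the food crisis data that have no flood event at all.
missing = sorted(fc_countries - fe_countries)

# Number of flood events per retained country.
counts = toy_fe[toy_fe['country'].isin(fc_countries)]['country'].value_counts()

print('No flood events for:', missing)
print(counts)
```

A country listed in `missing` may genuinely have had no recorded floods, or its name may be spelled differently in the two sources.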

**5. Filter Food Crisis Data**
* To only include data from the month before and the month after each identified flooding event

In [7]:
# Filter FC data to only include the month before/after each event
pieces = []
for _, event in FE2.iterrows():
    before = event['datetime_round'] - pd.DateOffset(months=1)
    after = event['datetime_round'] + pd.DateOffset(months=1)
    # FC rows for the event's country that fall exactly one month before or after
    mask = (FC['country'] == event['country']) & FC['datetime'].isin([before, after])
    match = FC[mask].copy()
    # Label each matched row and record the (unrounded) flood date
    match['state'] = match['datetime'].map({before: 'Before', after: 'After'})
    match['flood_date'] = event['datetime']
    pieces.append(match)
FC2 = pd.concat(pieces)
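
The matching rule in this step is an exact month match: a food crisis record counts as 'Before' if it falls one month before the rounded flood date, and 'After' if it falls one month after. The flood month itself is excluded. The rule can be sketched on toy data as follows; this version uses a merge rather than the notebook's loop, and all values and the names `toy_fc`, `toy_fe`, `windows` and `matched` are made up for illustration.

```python
import pandas as pd

# Monthly food crisis records for one country (made-up values).
toy_fc = pd.DataFrame({
    'country': ['Mali'] * 4,
    'datetime': pd.to_datetime(['2016-06-01', '2016-07-01', '2016-08-01', '2016-09-01']),
    'fews_ipc': [1, 2, 2, 3],
})

# One flood event, with its date already rounded to the first of the month.
toy_fe = pd.DataFrame({
    'country': ['Mali'],
    'flood_date': pd.to_datetime(['2016-07-20']),
    'datetime_round': pd.to_datetime(['2016-07-01']),
})

# Build one lookup row per event and state, holding the month to match.
windows = []
for state, months in [('Before', -1), ('After', 1)]:
    w = toy_fe[['country', 'flood_date']].copy()
    w['datetime'] = toy_fe['datetime_round'] + pd.DateOffset(months=months)
    w['state'] = state
    windows.append(w)
windows = pd.concat(windows, ignore_index=True)

# The inner merge keeps only food crisis rows that fall in a Before/After month.
matched = toy_fc.merge(windows, on=['country', 'datetime'])

print(matched[['datetime', 'state', 'fews_ipc', 'flood_date']])
```

Here the June record is labelled 'Before' and the August record 'After', while the July and September records are dropped.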


**6. Clean Up New Data**


In [8]:
# Clean up Food Crisis Data Before/After
FC2 = FC2.reset_index()
FC2 = FC2.drop(['index'], axis=1)
FC2['datetime'] = FC2['datetime'].astype(str)
FC2['flood_date'] = FC2['flood_date'].astype(str)
FC2 = FC2[['country', 'admin_name', 'centx', 'centy', 'state', 'datetime', 'flood_date', 'fews_ipc','fews_ha','ndvi_mean',
           'rain_mean', 'et_mean', 'acled_count', 'acled_fatalities', 'p_staple_food', 'area', 'cropland_pct', 'pop',
           'ruggedness_mean', 'pasture_pct', 'spacelag', 'timelag1', 'timelag2', 'geometry']]

# Clean up Flood Event Data - For GEE
FE3 = FC2[['country', 'flood_date', 'datetime', 'state']]
FE3 = FE3.drop_duplicates()
FE3 = FE3.reset_index()
FE3 = FE3.drop(['index'], axis=1)

**7. Save Data**

In [9]:
# Save data
FC2.to_csv('Data/FC_before_after_data.csv', index = False)
FE3.to_csv('Data/FE_before_after_dates.csv', index = False)

stop = timeit.default_timer()
print('Running Time: ', stop - start, 'seconds')

Running Time:  33.4431081 seconds
