### Application: Develop a process to compute potential insurance payouts informed by conflict return period thresholds.

#### Contextual information

<span style="color:red">Guidance provided by Håvard</span>

1. Start with a dataset that has one row for each grid cell year.
2. For each country and year, identify the cells that qualify for each threshold and record the corresponding payout rate in a column for each cell year.
3. Multiply the payout rate by the cell's population in a new column to get the population-weighted payout.
4. Sum the population-weighted payout for each country year.
5. Divide the summed population-weighted payout for each country year by the country’s total population.
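
The five steps above can be sketched in pandas as follows. This is a minimal illustration, not the project's implementation: the column names (`country`, `population`, `payout_rate`) and all values are hypothetical, the payout rate from step 2 is assumed to be assigned already, and the country's total population is assumed to equal the sum of its cell populations.

```python
import pandas as pd

# Step 1: hypothetical cell-year table (names and values are illustrative only).
cells = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "year": [2000, 2000, 2000, 2000, 2000],
    "pg_id": [1, 2, 3, 4, 5],
    "population": [100.0, 300.0, 600.0, 500.0, 500.0],
    # Step 2: payout rate per cell year, assumed to be assigned already.
    "payout_rate": [1.0, 0.5, 0.0, 0.0, 0.25],
})

# Step 3: population-weighted payout per cell year.
cells["weighted_payout"] = cells["payout_rate"] * cells["population"]

# Step 4: sum the weighted payout (and the population) per country year.
national = (
    cells.groupby(["country", "year"])
    .agg(
        weighted_payout=("weighted_payout", "sum"),
        total_population=("population", "sum"),
    )
    .reset_index()
)

# Step 5: divide by the country's total population.
national["national_payout_rate"] = (
    national["weighted_payout"] / national["total_population"]
)
print(national)
```

If the national population comes from a separate source rather than the sum of the cells, the denominator in step 5 would be merged in by country and year instead of being computed in step 4.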

<span style="color:red">Sentence structure example provided by Jerry</span>

To make this clearer, assume two grid cells each have a 100 percent payout rate and each holds 1 percent of the country's population. The national payout rate is then 100 × 0.01 + 100 × 0.01 = 2 percent. That 2 percent value represents the payout rate for the nation, so a national protection level of $1M would pay 2 percent of $1M, or $20,000.
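
The arithmetic in this example can be checked with a few lines of Python. The variable names are illustrative, and the rates are written as fractions (1.0 for 100 percent) rather than percentages.

```python
# Two grid cells, each at a 100 percent payout rate.
payout_rates = [1.0, 1.0]
# Each cell holds 1 percent of the national population.
population_shares = [0.01, 0.01]

# National payout rate: sum of payout rate times population share.
national_rate = sum(r * s for r, s in zip(payout_rates, population_shares))

# Apply the national rate to a $1M protection level.
protection_level = 1_000_000
national_payout = national_rate * protection_level

print(f"{national_rate:.2%}")      # national payout rate as a percentage
print(f"${national_payout:,.0f}")  # payout in dollars
```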

<span style="color:red">Key difference to explore</span>

What effect does the order of operations have on the resulting payout metric if the population proportion is calculated in the third step rather than the fifth?
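
One way to explore this question is to compute the metric both ways on the same data and compare. The sketch below uses hypothetical column names and values. Because the total population is constant within a country year, the two orderings are algebraically identical and should differ only by floating-point rounding. A real difference would appear only if the denominator is not the sum of the cell populations, for example when the national total comes from another source or some cells are missing.

```python
import numpy as np
import pandas as pd

# Hypothetical cell-year data for one country year.
cells = pd.DataFrame({
    "country": ["A"] * 4,
    "year": [2000] * 4,
    "population": [120.0, 380.0, 250.0, 250.0],
    "payout_rate": [1.0, 0.5, 0.25, 0.0],
})
keys = ["country", "year"]

# Ordering 1: weight by population, sum, then divide by total population last.
weighted = cells.assign(w=cells["payout_rate"] * cells["population"])
divide_last = (
    weighted.groupby(keys)["w"].sum()
    / weighted.groupby(keys)["population"].sum()
)

# Ordering 2: convert population to a share of the country total first,
# weight the payout rate by that share, then sum.
total_pop = cells.groupby(keys)["population"].transform("sum")
shares = cells.assign(share=cells["population"] / total_pop)
shares["w"] = shares["payout_rate"] * shares["share"]
divide_first = shares.groupby(keys)["w"].sum()

# Compare the two results within floating-point tolerance.
print(np.allclose(divide_last, divide_first))
```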


### Locate necessary files:

The function used to generate the complete payout table requires 5 parameters. To keep the focus on the processes that generate the final payout value, the preceding steps that compute the necessary tables are omitted. If you are interested in exploring them, the Benz_Graphics branch is up to date and contains the .ipynb notebook used to generate all tables and infographics.

Base files relevant to this review:
1. /.../VIEWS_FAO_index/notebooks/methods/Proof_For_Summary_Table/Example_dataframe.csv
2. Annual intensity table -- designated in the code as 'y'
3. Original return period calculations -- designated in the code as 'z'
4. Amended table z with range values and symbology (color) instructions


The naming conventions are not arbitrary. Because each table is an iteration of an earlier version, abstract variable names distinguish the tables more effectively than variations on 'annual_something'.

For instance, 'insurance_table' could describe z, info_df, or the resulting payout table. Similarly, 'annual_table', the name given to table y in earlier iterations, is easily confused with the annual payout table.

All files and functions relevant to this discussion are contained in the Proof_For_Summary_Table folder.

#### Address the requirement for multiple input tables:

- `x` is the main DataFrame. It contains the most granular, disaggregated data, with fields such as [pg_id, year, fatalities_sum, pop_gpw_sum, percapita_100k].

- `y` contains summary statistics, including fields like 'max' and 'average'. This table was initially requested by Jerry on XX to complement specific analysis needs.

- `z` is the original, unrevised "insurance payout table." It communicates the floor thresholds associated with each return period and the respective payout rates. 

- `info_df` is used primarily for formatting graphics and aids in deriving a range that facilitates feature engineering.

These seemingly arbitrary names are intentionally chosen. Using easily distinguishable variable names facilitates the differentiation between tables that contain closely related information. For instance, the term "annual table" could apply to both **`y`** and the resulting payout table, which may lead to confusion. Similarly, a variable named "insurance" could be misinterpreted as referring to either **`info_df`** or **`z`**. Each table contains unique fields that need to remain distinguishable, yet all are collectively integrated in the final payout table through the `append_return_periods_to_annual_table` function.

Table `y` makes the least intuitive contribution to the final payout table. It is retained so that the comprehensive table can still be sorted by magnitude or intensity, which keeps the analysis flexible.

#### Load the CSV files

In [4]:
import os
import pandas as pd

# Set the path to the files
main_dir = os.getcwd()

# Build the file paths
example_dataframe_path = os.path.join(main_dir, 'Example_dataframe.csv')
example_return_period_ranges = os.path.join(main_dir, 'Example_return_period_ranges.csv')

# Load the files
x = pd.read_csv(example_dataframe_path, index_col=None)
y = pd.read_csv(example_return_period_ranges, index_col=None)
z = None  # TODO: load the original return period calculations table (assignment left incomplete)
filtered_info = pd.read_csv(example_return_period_ranges, index_col=None)

display(x.head(5))
display(y)

Unnamed: 0.1,Unnamed: 0,pg_id,year,fatalities_sum,pop_gpw_sum,percapita_100k
0,0,135077,1993,0.0,15539.84082,0.0
1,1,135077,1994,0.0,16002.742188,0.0
2,2,135077,1995,0.0,16465.642578,0.0
3,3,135077,1996,0.0,16904.835938,0.0
4,4,135077,1997,0.0,17344.029297,0.0


Unnamed: 0.1,Unnamed: 0,Return Period,Range,Label
0,0,0,0 - 0.0,Below 1 in 10 year
1,1,10,0.0 - 1.0,1 in 10 year
2,2,20,1.0 - 8.0,1 in 20 year
3,3,50,8.0 - 31.0,1 in 50 year
4,4,100,31.0 - 100000,1 in 100 year
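
The `Range` strings in the table above can be split into numeric bounds and used to classify values into return period classes. The sketch below rebuilds the displayed rows by hand and is not taken from `append_return_periods_to_annual_table`. The `lower`, `upper`, and `values` names are hypothetical, and the boundary handling (intervals open on the left, closed on the right) is an assumption, since the table does not say which side is inclusive.

```python
import numpy as np
import pandas as pd

# Rows copied from the return period table displayed above.
ranges = pd.DataFrame({
    "Return Period": [0, 10, 20, 50, 100],
    "Range": ["0 - 0.0", "0.0 - 1.0", "1.0 - 8.0", "8.0 - 31.0", "31.0 - 100000"],
    "Label": [
        "Below 1 in 10 year", "1 in 10 year", "1 in 20 year",
        "1 in 50 year", "1 in 100 year",
    ],
})

# Split each "lower - upper" string into two numeric columns.
bounds = ranges["Range"].str.split(" - ", expand=True).astype(float)
ranges["lower"], ranges["upper"] = bounds[0], bounds[1]

# Assumption: intervals are (lower, upper], so a value of exactly 0
# falls in the "Below 1 in 10 year" class.
edges = [-np.inf] + ranges["upper"].tolist()

# Hypothetical per-capita values to classify.
values = pd.Series([0.0, 0.4, 5.0, 12.0, 250.0])
labels = pd.cut(values, bins=edges, labels=ranges["Label"].tolist())
print(labels.tolist())
```

Values above the last upper bound (100000) fall outside every bin and come back as NaN, which makes out-of-range inputs easy to spot.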


In [5]:
from generate_payout_table import append_return_periods_to_annual_table

In [None]:


table = append_return_periods_to_annual_table(x, y, z, filtered_info, value_field, population_field)

