# OnStove notebook

This is the OnStove notebook. It lets users run through the analysis with example data, and can therefore complement the publication and the Read the Docs documentation.

The notebook is divided into four major parts:
* **Data processing** - In this part, different geospatial datasets are read and processed for use in the analysis. The datasets from this step are saved on the user's computer. For future runs on the same area of interest, this step can consequently be skipped unless datasets are switched.
* **Calibration** - In this part the area of interest is calibrated. Raster cells are classified as either urban or rural, the electrification rate in different cells is determined, and the rates of different cooking fuels across settlements are calibrated. The calibrated data is saved in a .pkl file.
* **Model run** - The net-benefit of different stoves is determined in different parts of the study area. Summaries of the results documenting the benefits and costs of each stove type across the entire study area are produced. The results are saved as a .pkl file.
* **Visualization** - Visualizing and saving different maps related to the results.

Each part of the notebook is divided into several different cells and each cell is described more in depth.

In [None]:
import os, sys, requests, zipfile
import geopandas as gpd
sys.path.append("..")

In [None]:
%load_ext autoreload

In [None]:
%autoreload 2
from onstove.onstove import OnStove, DataProcessor
from onstove.layer import RasterLayer, VectorLayer
from onstove.raster import interpolate
import time

# Downloading example data for Ghana from the Mendeley database

**Downloading and saving the techno-economic specification file**

In [None]:
tech_specs = r""
# Request the techno-economic file from the URL defined above
response = requests.get(tech_specs)
with open("tech_specs.csv", "wb") as f:
    f.write(response.content)

**Downloading and saving the socio-economic specification file**

In [None]:
soc_specs = r""
# Request the socio-economic file from the URL defined above
response = requests.get(soc_specs)
with open("soc_specs.csv", "wb") as f:
    f.write(response.content)

**Downloading, saving and unzipping the GIS data**
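
The save-and-unzip pattern used in this section can be sketched as a small reusable helper. This is only an illustration: `save_and_extract` is a hypothetical name, not part of OnStove, and the archive is built in memory so the sketch runs without a network connection. With a real download, `content` would be `response.content`, ideally after calling `response.raise_for_status()`.

```python
import io
import os
import tempfile
import zipfile


def save_and_extract(content, zip_path, target_dir):
    """Write downloaded bytes to `zip_path`, extract into `target_dir`, return member names."""
    # The context managers close the files even if an error occurs.
    with open(zip_path, "wb") as f:
        f.write(content)
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)
        return zip_ref.namelist()


# Stand-in for `response.content`: a tiny archive built in memory (no network needed).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Ghana/readme.txt", "example")

with tempfile.TemporaryDirectory() as tmp:
    names = save_and_extract(buffer.getvalue(),
                             os.path.join(tmp, "gis_data.zip"),
                             os.path.join(tmp, "gis_data"))
    extracted = os.path.exists(os.path.join(tmp, "gis_data", "Ghana", "readme.txt"))

print(names, extracted)
```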

In [None]:
gis_data = r""
response = requests.get(gis_data)
open("gis_data.zip", "wb").write(response.content)

with zipfile.ZipFile("gis_data.zip","r") as zip_ref:
    zip_ref.extractall("gis_data")

# 1. Data processing

## 1.1. Create a data processor

This cell creates your data processor. The data processor sets the base for your model. OnStove is a raster-based model, so the resolution is important. We specify the resolution together with the coordinate system when creating the data processor. This ensures that all rasters are resampled to the correct resolution and that all datasets (vectors and rasters) are reprojected to the target coordinate system.

In this example we use the pseudo-mercator coordinate system (EPSG:3857) and a spatial resolution of 1 sq. km. 
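
To give a feel for why the resolution matters, the sketch below counts the raster cells needed to cover an area at a given cell size. The helper `grid_shape` and the bounding box are hypothetical (they are not OnStove functions or the real extent of Ghana); the point is only that a ten times finer cell size means a hundred times more cells to process.

```python
import math


def grid_shape(bounds, cell_size):
    """Return (rows, cols) of a raster covering `bounds` at the given cell size."""
    min_x, min_y, max_x, max_y = bounds
    cell_w, cell_h = cell_size
    cols = math.ceil((max_x - min_x) / cell_w)
    rows = math.ceil((max_y - min_y) / cell_h)
    return rows, cols


# Hypothetical bounding box in projected metres (illustrative values only).
bounds = (0, 0, 500_000, 700_000)

rows, cols = grid_shape(bounds, (1000, 1000))
rows_fine, cols_fine = grid_shape(bounds, (100, 100))

print(rows * cols)            # cells at 1 km resolution
print(rows_fine * cols_fine)  # cells at 100 m resolution
```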

**Note:** The user only needs to run this section of the code once, unless the geospatial datasets change between runs.

In [None]:
start = time.time()

country = OnStove()
data = DataProcessor(project_crs=3857, cell_size=(1000, 1000))
output_directory = '../example/results'
data.output_directory = output_directory

## 1.2. Read the socio-economic data

In [None]:
path = os.path.join('..', 'example','soc_specs.csv')
country.read_scenario_data(path, delimiter=',')

## 1.3. Add a mask and base 

The mask layer dictates what falls within your area of interest and what is excluded from your analysis. For the mask layer we use the administrative boundaries of the country.

For this we use a function called *add_mask_layer*. *add_mask_layer* takes four inputs:
1. `category` - referencing the category of the layer
2. `name` - referencing the name of the layer
3. `layer_path` - from where to read the data
4. `postgres` - boolean indicating whether the data is saved on disk or in a PostgreSQL database. Default is `False`, meaning the dataset is saved on disk.

In [None]:
adm_path = r"../example/Ghana/Administrative/Country_boundaries/Country_boundaries.geojson"
data.add_mask_layer(category='Administrative', name='Country_boundaries', layer_path=adm_path)

A raster base layer is needed to make every output match its grid and extent. For this, two additional options need to be passed to the `add_layer` method:
* `base_layer`: if `True` the added layer will be considered as the base layer. 
* `resample`: this is the resampling method to be used when resampling this layer to the desired `cell_size` if a `cell_size` is provided.

In [None]:
data.add_layer(category='Base', name='Base', layer_path=r"../example/Ghana/Forest/Forest.tif",
               layer_type='raster', base_layer=True, resample='nearest')

## 1.4. Add GIS layers

Similarly, we can add data layers using the `add_layer` method. A layer `name`, a `layer_path` and a `postgres` connection also need to be provided (the `postgres` connection defaults to `False`). In addition, the following arguments can be passed:
* `category`: this is used to group all datasets into a category in the final output, e.g. `demand` or `supply`.
* `layer_type`: this argument is required and has two possible options, `raster` or `vector`; pass the one that matches the dataset you are adding.
* `resample`: this defines what resampling method to use when changing the resolution of the raster. The change of resolution happens when the layer gets aligned with the base layer.
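
The choice of `resample` method matters because each method treats the original cell values differently. The sketch below is a simplified, pure-Python illustration (not the OnStove or GDAL implementation): it merges each 2x2 block of a small grid into one coarser cell. Here `nearest` is approximated by taking the top-left cell of each block, which is an assumption made for brevity. All values are hypothetical.

```python
from statistics import mean


def resample_2x2(grid, method):
    """Aggregate each 2x2 block of `grid` into one coarser cell."""
    reducers = {
        "sum": sum,                          # keeps totals, e.g. population counts
        "average": mean,                     # suits continuous values, e.g. temperature
        "nearest": lambda block: block[0],   # keeps original values, e.g. class codes
    }
    reduce_block = reducers[method]
    out = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[0]), 2):
            block = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(reduce_block(block))
        out.append(row)
    return out


# Hypothetical population counts per cell.
population = [[10, 20, 0, 0],
              [30, 40, 0, 4],
              [1, 1, 5, 5],
              [1, 1, 5, 5]]

summed = resample_2x2(population, "sum")
averaged = resample_2x2(population, "average")
nearest = resample_2x2(population, "nearest")

total_before = sum(map(sum, population))
total_after_sum = sum(map(sum, summed))
total_after_avg = sum(map(sum, averaged))
print(total_before, total_after_sum, total_after_avg)
```

Only `sum` preserves the total number of people, which is why the population layer in the cells below is added with `resample='sum'`.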

In the cells below the following datasets are read:

* **Population raster** - a raster layer describing the spatial distribution of people across the study area
* **Urban-rural split raster** - a raster layer describing which areas can be considered urban and rural respectively
* **Forest raster** - a raster layer describing where forest is available and where it is not. This is used in order to estimate how far people have to travel in order to collect biomass.
* **Friction raster** - a raster layer describing the walking-only friction across the study area (the time it takes to travel across different cells of the study area by foot). This is used in order to determine the collection time of manure (for biogas) and biomass. 
* **Medium voltage line vector file** - a line vector layer showing the availability of medium voltage lines. This is used in order to estimate which settlements are electrified and which ones are not. Users can also use either transformers or high-voltage lines if available. 
* **Nighttime lights raster** - a raster showing the intensity of anthropogenic light sources. This is used as a proxy for determining who may have electricity and who does not.
* **Traveltime raster** - a map showing the time it takes to travel to the closest city with motorized transport. This is used in order to estimate transportation cost of LPG. A user can also provide LPG suppliers (as a point layer) and a motorized friction map to determine the travel time instead of a traveltime map.
* **Livestock rasters** - raster layers showing the headcounts of different livestock (buffaloes, cattle, poultry, goats, pigs and sheep). This is used in order to assess the availability of manure in different cells of the study area.
* **Temperature raster** - raster layer describing the temperature across the study area. This is used in order to assess the possibility of using biogas. [If the average temperature falls below 10 degrees Celsius, the conversion of small-scale biogas digesters drops significantly, making them unviable](https://www.sciencedirect.com/science/article/pii/S2213138821003118).
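
As an illustration of how the temperature layer can be used, the sketch below flags cells where biogas remains an option under the 10 degree threshold mentioned above. It is a minimal sketch with hypothetical temperature values and a hypothetical helper name; it does not claim to reproduce how OnStove applies the threshold internally.

```python
BIOGAS_MIN_TEMPERATURE_C = 10  # threshold quoted in the text above


def biogas_viable(temperatures):
    """Return a grid of booleans: True where the average temperature is at least 10 °C."""
    return [[t >= BIOGAS_MIN_TEMPERATURE_C for t in row] for row in temperatures]


# Hypothetical average temperatures (°C) for a 2x3 block of cells.
temperatures = [[26.5, 9.8, 10.0],
                [4.2, 18.3, 27.1]]

viable = biogas_viable(temperatures)
print(viable)
```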

### 1.4.1. Demographics

In [None]:
pop_path = r"../example/Ghana/Population/Population.tif"
data.add_layer(category='Demographics', name='Population', 
               layer_path=pop_path, layer_type='raster', resample='sum')

urban_path = r"../example/Ghana/Urban/Urban.tif"
data.add_layer(category='Demographics', name='Urban_rural_divide', 
               layer_path=urban_path, layer_type='raster', resample='nearest')

### 1.4.2. Biomass

In [None]:
forest_path = r"../example/Ghana/Forest/Forest.tif"
data.add_layer(category='Biomass', name='Forest', 
               layer_path=forest_path, layer_type='raster', resample='average')

friction_path = r"../example/Ghana/Friction/Friction.tif"
data.add_layer(category='Biomass', name='Friction', layer_path=friction_path, 
               layer_type='raster', resample='average')

### 1.4.3. Electricity

In [None]:
mv_path = r"../example/Ghana/MV lines/MV_lines.geojson"
data.add_layer(category='Electricity', name='MV_lines', 
               layer_path=mv_path, layer_type='vector')

ntl_path = r"../example/Ghana/Night time lights/Night_time_lights.tif"
data.add_layer(category='Electricity', name='Night_time_lights', 
               layer_path=ntl_path, layer_type='raster', resample='average')

### 1.4.4. LPG

In [None]:
lpg_path = r"../example/Ghana/Traveltime/Traveltime.tif"
data.add_layer(category='LPG', name='LPG Traveltime', 
               layer_path=lpg_path, layer_type='raster', resample='average')

### 1.4.5. Biogas

In [None]:
buffaloes = r"../example/Ghana/Livestock/buffaloes/buffaloes.tif"
cattles = r"../example/Ghana/Livestock/cattles/cattles.tif"
poultry = r"../example/Ghana/Livestock/poultry/poultry.tif"
goats = r"../example/Ghana/Livestock/goats/goats.tif"
pigs = r"../example/Ghana/Livestock/pigs/pigs.tif"
sheeps = r"../example/Ghana/Livestock/sheeps/sheeps.tif"

for key, path in {'buffaloes': buffaloes,
                  'cattles': cattles,
                  'poultry': poultry,
                  'goats': goats,
                  'pigs': pigs,
                  'sheeps': sheeps}.items():
    data.add_layer(category='Biogas/Livestock', name=key, layer_path=path,
                   layer_type='raster', resample='nearest', rescale=True)

In [None]:
temperature = r"../example/Ghana/Temperature/Temperature.tif"
data.add_layer(category='Biogas', name='Temperature', layer_path=temperature,
               layer_type='raster', resample='average')
data.layers['Biogas']['Temperature'].save(f'{data.output_directory}/Biogas/Temperature')

## 1.5. Mask, reproject and align all required layers

The cell below masks all of the rasters that have been read and categorizes them into different groups. Each dataset that is clipped here is saved in its respective subfolder (e.g. Demographics) under the `output_directory` specified in [cell 1.1](http://localhost:8888/notebooks/example/OnStove_notebook.ipynb#1.1.-Create-a-data-processor).

In [None]:
data.mask_layers(datasets={'Demographics': ['Population', 'Urban_rural_divide'],
                           'Biomass': ['Forest', 'Friction'],
                           'Electricity': ['Night_time_lights'],
                           'LPG': ['LPG Traveltime'],
                           'Biogas': ['Temperature']})

Next, all raster datasets are aligned with the base layer selected in [cell 1.3](http://localhost:8888/notebooks/example/OnStove_notebook.ipynb#1.3.-Add-a-mask-and-base). This function also ensures that the coordinate system and resolution of all rasters match those specified by the user in [cell 1.1](http://localhost:8888/notebooks/example/OnStove_notebook.ipynb#1.1.-Create-a-data-processor).

In [None]:
data.align_layers(datasets='all')

Lastly, the vector files (apart from the mask layer) are reprojected and clipped. 

In [None]:
data.reproject_layers(datasets={'Electricity': ['MV_lines']})

In [None]:
end = time.time()

diff = end - start
print(f'Execution time: {int(diff // 60)} min {int(diff % 60)} sec')

# 2. Calibration

The calibration step does two things: 1) it adds the datasets processed in the previous step to a settlement file that will be used in [step 3](http://localhost:8888/notebooks/example/OnStove_notebook.ipynb#3.-Model-run) for determining the net-benefit in different settlements, and 2) it calibrates the file with regard to total population, urban-rural split and electrification rate.

**Note:** Similar to the data processing step, this step is only needed once unless you change anything in the inputs.

## 2.1. Read the model data

The cell below reads the socio-economic specification file. This file is needed as it contains the electrification and urban rates and the actual population in the study area (GIS population datasets often have slightly outdated population values).
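
One simple way to reconcile an outdated population raster with an official total is to scale every cell by the same factor. The sketch below illustrates that idea with hypothetical numbers and a hypothetical helper; it is an assumption about the general approach, not a description of what `calibrate_current_pop` does internally.

```python
def calibrate_population(cell_population, official_total):
    """Scale cell populations so that they add up to `official_total`."""
    raster_total = sum(cell_population)
    factor = official_total / raster_total
    return [p * factor for p in cell_population]


# Hypothetical values: the raster sums to 1000 people, the official total is 1250.
cells = [100.0, 300.0, 600.0]
calibrated = calibrate_population(cells, 1250.0)

print(calibrated, sum(calibrated))
```

Uniform scaling keeps the spatial distribution of people unchanged while matching the reported total.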

In [None]:
path = os.path.join('..', 'example', 'soc_specs.csv')
country.read_scenario_data(path)

## 2.2. Add a country mask layer

In [None]:
path = os.path.join(output_directory,'Administrative','Country_boundaries', 'Country_boundaries.geojson')
mask_layer = VectorLayer('admin', 'adm_0', layer_path=path)
country.mask_layer = mask_layer

## 2.3. Add a population base layer

In [None]:
path = os.path.join(output_directory,'Demographics','Population', 'Population.tif')
country.add_layer(category='Demographics', name='Population', layer_path=path, layer_type='raster', base_layer=True)
country.population_to_dataframe()

## 2.4. Calibrate population and urban/rural split

In [None]:
country.calibrate_current_pop()

ghs_path = os.path.join(output_directory, 'Demographics', 'Urban_rural_divide', 'Urban_rural_divide.tif')
country.calibrate_urban_current_and_future_GHS(ghs_path)

## 2.5. Add wealth index GIS data

In [None]:
wealth_index = r"../example/Ghana/Relative wealth index/GHA_relative_wealth_index.csv"
country.extract_wealth_index(wealth_index, file_type="csv")

## 2.6. Calculate value of time 

In [None]:
country.get_value_of_time()

## 2.7. Read electricity network GIS layers

In [None]:
path = os.path.join(output_directory, 'Electricity', 'MV_lines', 'MV_lines.geojson')
mv_lines = VectorLayer('Electricity', 'MV_lines', layer_path=path)

## 2.8. Calculate distance to electricity infrastructure 

In [None]:
country.distance_to_electricity(mv_lines=mv_lines)

## 2.9. Add nighttime lights data

In [None]:
path = os.path.join(output_directory, 'Electricity', 'Night_time_lights', 'Night_time_lights.tif')
ntl = RasterLayer('Electricity', 'Night_time_lights', layer_path=path)

country.raster_to_dataframe(ntl.layer, name='Night_lights', method='read')

## 2.10. Calibrate current electrified population

In [None]:
country.current_elec()
country.final_elec()

print('Calibrated grid electrified population fraction:', country.gdf['Elec_pop_calib'].sum() / country.gdf['Calibrated_pop'].sum())

## 2.11. Read the cooking technologies data

In [None]:
path = os.path.join('..', 'example', 'tech_specs.csv')
country.read_tech_data(path, delimiter=',')

## 2.12. Calculating grid added capacity cost

In [None]:
country.techs['Electricity'].get_capacity_cost(country)

## 2.13. Reading GIS data for LPG supply
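
The cell in this step multiplies the travel-time values by `2 / 60`. Assuming the raster stores one-way travel time in minutes (an assumption, as the unit is not stated in this notebook), this turns each value into a round trip expressed in hours. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def round_trip_hours(one_way_minutes):
    """Convert a one-way travel time in minutes to a round trip in hours."""
    return one_way_minutes * 2 / 60


# A hypothetical cell 90 minutes away from the nearest city.
hours = round_trip_hours(90)
print(hours)  # 3.0
```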

In [None]:
lpg = RasterLayer('LPG', 'LPG Traveltime', 
                  os.path.join(output_directory, 'LPG', 'LPG Traveltime', 'LPG Traveltime.tif'))

country.techs['LPG'].travel_time = country.raster_to_dataframe(lpg.layer,
                                                           nodata=lpg.meta['nodata'],
                                                           fill_nodata='interpolate', method='read') * 2 / 60

## 2.14. Adding GIS data for Biogas

In [None]:
amin = gpd.read_file(os.path.join(output_directory, 'Administrative', 'Country_boundaries', 'Country_boundaries.geojson'))
buffaloes = os.path.join(output_directory, 'Biogas', 'Livestock', 'buffaloes', 'buffaloes.tif')
cattles = os.path.join(output_directory, 'Biogas', 'Livestock', 'cattles', 'cattles.tif')
poultry = os.path.join(output_directory, 'Biogas', 'Livestock', 'poultry', 'poultry.tif')
goats = os.path.join(output_directory, 'Biogas', 'Livestock', 'goats', 'goats.tif')
pigs = os.path.join(output_directory, 'Biogas', 'Livestock', 'pigs', 'pigs.tif')
sheeps = os.path.join(output_directory, 'Biogas', 'Livestock', 'sheeps', 'sheeps.tif')

country.techs['Biogas'].temperature = os.path.join(output_directory, 'Biogas', 'Temperature', 'Temperature.tif')
country.techs['Biogas'].recalibrate_livestock(country, buffaloes, cattles, poultry, goats, pigs, sheeps)
country.techs['Biogas'].friction_path = os.path.join(output_directory, 'Biomass', 'Friction', 'Friction.tif')

## 2.15. Adding GIS data for Biomass

In [None]:
#country.techs['Biomass Forced Draft'].friction_path = os.path.join(output_directory, 'Biomass', 'Friction', 'Friction.tif')
#country.techs['Biomass Forced Draft'].forest_path = os.path.join(output_directory, 'Biomass', 'Forest', 'Forest.tif')
#country.techs['Biomass Forced Draft'].forest_condition = lambda x: x > 30

country.techs['Collected_Improved_Biomass'].friction_path = os.path.join(output_directory, 'Biomass', 'Friction', 'Friction.tif')
country.techs['Collected_Improved_Biomass'].forest_path = os.path.join(output_directory, 'Biomass', 'Forest', 'Forest.tif')
country.techs['Collected_Improved_Biomass'].forest_condition = lambda x: x > 30

country.techs['Collected_Traditional_Biomass'].friction_path = os.path.join(output_directory, 'Biomass', 'Friction', 'Friction.tif')
country.techs['Collected_Traditional_Biomass'].forest_path = os.path.join(output_directory, 'Biomass', 'Forest', 'Forest.tif')
country.techs['Collected_Traditional_Biomass'].forest_condition = lambda x: x > 30

## 2.16. Saving the prepared model inputs

In [None]:
country.output_directory = r""

country.to_pickle("model_inputs.pkl")

end = time.time()

diff = end - start
print(f'Execution time: {int(diff // 60)} min {int(diff % 60)} sec')

# 3. Model run

In [None]:
start = time.time()
country = OnStove.read_model("model_inputs.pkl")

In [None]:
path = os.path.join('..', 'example', 'soc_specs.csv')
country.read_scenario_data(path)

## 3.1. Calculating benefits and costs of each technology and getting the max benefit technology for each cell
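
Conceptually, this step compares the net-benefit (benefits minus costs) of every technology in a cell and keeps the one with the highest value. The sketch below shows that selection for a single cell. The helper and all numbers are hypothetical; the actual calculation in `country.run` involves many more terms.

```python
def max_benefit_tech(benefits, costs):
    """Return the technology with the highest net-benefit and that net-benefit."""
    net = {tech: benefits[tech] - costs[tech] for tech in benefits}
    best = max(net, key=net.get)
    return best, net[best]


# Hypothetical benefits and costs for one cell (same monetary unit).
benefits = {'LPG': 120.0, 'Electricity': 150.0, 'Biogas': 90.0}
costs = {'LPG': 60.0, 'Electricity': 110.0, 'Biogas': 50.0}

best_tech, best_value = max_benefit_tech(benefits, costs)
print(best_tech, best_value)  # LPG 60.0
```

Note that Electricity has the highest benefits in this example, but LPG wins once costs are subtracted.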

In [None]:
names = ['Electricity','Traditional_Charcoal' ,'Charcoal ICS' ,'LPG', 'Biogas', 'Collected_Traditional_Biomass', 'Collected_Improved_Biomass']
#names = ['Electricity','Traditional_Charcoal' ,'Charcoal ICS' ,'LPG', 'Biogas', 'Collected_Traditional_Biomass', 'Collected_Improved_Biomass', 'Biomass Forced Draft', 'Pellets Forced Draft']
country.run(technologies=names) 

## 3.2. Printing the results

In [None]:
country.summary()

## 3.3. Saving the results

In [None]:
country.to_pickle("results.pkl")

# 4. Visualization

## 4.1. Reading the results

In [None]:
country = OnStove.read_model("results.pkl")

## 4.2. Setting the color palette and label names
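
The two dictionaries defined in this step work together: `labels` renames a technology for display, and `cmap` assigns a color to the displayed name. The sketch below illustrates that two-step lookup with a subset of the entries. The helper `display_color` and the grey fallback color are hypothetical, and the exact way OnStove applies these dictionaries may differ.

```python
labels = {'Collected Traditional Biomass': 'Traditional biomass',
          'Collected Improved Biomass': 'ICS'}
cmap = {'ICS': '#57365A', 'LPG': '#6987B7', 'Traditional biomass': '#673139'}


def display_color(raw_name, labels, cmap, default='#808080'):
    """Map a raw technology name to its display label and color."""
    label = labels.get(raw_name, raw_name)  # unchanged if no label is defined
    return label, cmap.get(label, default)  # fallback color is an assumption


renamed = display_color('Collected Improved Biomass', labels, cmap)
unchanged = display_color('LPG', labels, cmap)
print(renamed, unchanged)
```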

In [None]:
cmap = {"ICS": '#57365A', "LPG": '#6987B7', "Traditional biomass": '#673139', "Charcoal": '#B6195E',
        "Biogas": '#3BE2C5', "Biogas and ICS": "#F6029E",
        "Biogas and LPG": "#C021C0", "Biogas and Traditional biomass": "#266AA6",
        "Biogas and Charcoal": "#3B05DF", "Biogas and Electricity": "#484673",
        "Electricity": '#D0DF53', "Electricity and ICS": "#4D7126",
        "Electricity and LPG": "#004D40", "Electricity and Traditional biomass": "#FFC107",
        "Electricity and Charcoal": "#1E88E5", "Electricity and Biogas": "#484673"}

labels = {"Biogas and Electricity": "Electricity and Biogas",
          'Collected Traditional Biomass': 'Traditional biomass',
          'Collected Improved Biomass': 'ICS',
          'Biogas and Collected Improved Biomass': 'Biogas and ICS'}

## 4.3. Printing a map of the stoves with the highest net-benefit in each settlement

In [None]:
country.plot('max_benefit_tech', cmap=cmap, legend_position=(0.9, 0.7),
           title='Maximum net-benefit cooking technology', stats_fontsize=9, 
           labels=labels, legend=True, legend_title='Maximum benefit\ncooking technology',dpi=300, 
           rasterized=True, stats=True, stats_position=(0.85, 0.76), save_style=False)

## 4.4. Printing population split in the study area

In [None]:
country.plot_split(cmap=cmap, labels=labels, save=False)

## 4.5. Printing the costs and benefits for each stove category selected

In [None]:
country.plot_costs_benefits(labels=labels, save=False, height=1.5, width=2)

## 4.6. Printing the distribution of net-benefit per household for each stove selected

In [None]:
country.plot_benefit_distribution(type='box', groupby='None', cmap=cmap, labels=labels, save=False, height=1.5, width=3.5)

## 4.7. Printing the maximum net-benefit across the study area

In [None]:
country.plot('maximum_net_benefit', cmap='magma', metric='per_household')

## 4.8. Printing the total costs across the study area
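
The total cost used in this step is the investment cost minus the salvage value, plus the fuel and operation and maintenance (O&M) costs. A worked example of that sum for a single settlement, using hypothetical values:

```python
# Hypothetical cost components for one settlement (same monetary unit).
settlement = {'investment_costs': 50.0,
              'salvage_value': 5.0,
              'fuel_costs': 30.0,
              'om_costs': 10.0}

# The salvage value is subtracted because it is recovered at the end of the stove's life.
total_costs = (settlement['investment_costs'] - settlement['salvage_value']
               + settlement['fuel_costs'] + settlement['om_costs'])
print(total_costs)  # 85.0
```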

In [None]:
country.gdf['total_costs'] = country.gdf['investment_costs'] - country.gdf['salvage_value'] + country.gdf['fuel_costs'] + country.gdf['om_costs']
country.plot('total_costs', cmap='magma', metric='per_household')

## 4.9. Saving selected results

In [None]:
country.to_raster('max_benefit_tech', cmap=cmap, labels=labels)
country.to_raster('maximum_net_benefit', metric='per_household')
country.to_raster('maximum_net_benefit', metric='total')
country.to_raster('net_benefit_LPG', metric='per_household')
country.to_raster('net_benefit_Biogas', metric='per_household')
country.to_raster('net_benefit_Collected_Improved_Biomass', metric='per_household')
country.to_raster('total_costs', metric='per_household')
country.to_raster('investment_costs', metric='total')
country.to_raster('deaths_avoided', metric='per_100k')
country.to_raster('time_saved', metric='per_household')
country.to_raster('reduced_emissions', metric='total')
country.to_raster('health_costs_avoided', metric='total')
country.to_raster('Households', metric='sum')