In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# import sys
# sys.path.insert(0, "..")
from gpbp.layers import AdmArea
from gpbp import visualisation

from optimization import jg_opt
from functools import partial

import warnings
warnings.filterwarnings(action='ignore')

## Defining the Administrative Area

In [None]:
adm_area = AdmArea(country="Timor-Leste", level=1)

In [None]:
adm_area.country

In [None]:
adm_area.country_gdf

In [None]:
adm_area.adm_name

- `AdmArea` stands for Administrative Area
- `__init__`:
  - `country` is resolved to the country's ISO codes using `pycountry`. If the name is not found immediately, the assignment tries to find a similarly named country.
  - `level` should probably be explained in the package docs
  - Gets the country geometry (and that of every admin area down to the specified `level`) from the GADM dataset using `gadm`
- [backend] `self.geometry` and `self.adm_name` are populated only if `level == 0`. Populating attributes conditionally doesn't look like good practice. I guess it's an exception made for country-level analyses.
- [backend] Code has some print statements but they are not being shown. Convert to logging?
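
A minimal sketch of what the switch from `print` to `logging` could look like. The logger name `gpbp` and the helper `load_geometry` are hypothetical and not taken from the package; only the standard-library `logging` calls are real.

```python
import logging

# Hypothetical module-level logger; the name "gpbp" is an assumption.
logger = logging.getLogger("gpbp")


def load_geometry(country: str, level: int) -> str:
    """Hypothetical stand-in for a step that currently uses print()."""
    logger.info("Retrieving boundaries for %s at level %d", country, level)
    return f"{country}-L{level}"


# In a notebook, the user opts in to seeing the messages.
logging.basicConfig(level=logging.INFO, format="%(name)s %(levelname)s: %(message)s")
result = load_geometry("Timor-Leste", 1)
```

With this pattern the library stays silent by default, and the notebook decides how verbose it should be.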

In [None]:
adm_area.get_adm_area("Baucau")

- `self.get_adm_area` simply extracts the geometry of a specific admin area
- However, it special-cases `level == 0`, as mentioned above. I'm not sure this is necessary.

- [frontend] We could add some quick plotting tools for the user to check their work as they develop. E.g. plot the loaded maps etc.

## Retrieving Facility and Population data

In [None]:
adm_area.get_facilities(method="osm", tags={"building":"hospital"})
visualisation.plot_facilities(adm_area.fac_gdf)

- `self.get_facilities` uses `osmnx` to get the requested facilities. `osmnx` is a Python package for querying OpenStreetMap (OSM), and it can also do some network analysis.
  - The method is designed to support several sources, but only OSM (via `osmnx`) is supported so far
  - I think [this example notebook from osmnx](https://github.com/gboeing/osmnx-examples/blob/main/notebooks/16-download-osm-geospatial-features.ipynb) explains what we use the package for
  - The query to `osmnx` returns a complex GeoPandas df. `osm_facilities` parses the data into the OSM IDs of the facilities and the coordinates of the facilities' centroids.
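- The example query doesn't seem to be complete. I can find other healthcare facilities in e.g. Baucau which were not returned by the query, likely because they are tagged differently in OSM (e.g. `amenity=hospital` or `amenity=clinic` rather than `building=hospital`)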
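
The parsing step described above (facility IDs plus centroid coordinates) can be pictured with a small standard-library sketch. This is not the package's `osm_facilities` code: the input records, the field names and the helper `polygon_centroid` are made up for illustration, and in practice the `osmnx` result is a GeoDataFrame whose centroids GeoPandas can compute directly.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def polygon_centroid(ring: List[Point]) -> Point:
    """Area-weighted centroid of a simple, non-degenerate polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical raw features: an OSM id and a building footprint each.
raw_features = [
    {"osmid": 101, "ring": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]},
    {"osmid": 102, "ring": [(10.0, 10.0), (14.0, 10.0), (14.0, 12.0), (10.0, 12.0)]},
]

# Reduce each footprint to one row: id plus centroid coordinates.
facilities = []
for feature in raw_features:
    lon, lat = polygon_centroid(feature["ring"])
    facilities.append({"ID": feature["osmid"], "longitude": lon, "latitude": lat})
```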

In [None]:
adm_area.get_population(method="world_pop")
visualisation.plot_population_heatmap(adm_area.pop_df)

- `self.get_population` is very similar in design to `self.get_facilities`
  - It supports two sources: worldpop.org and Facebook
  - For worldpop.org it's a simple GET request followed by parsing the resulting JSON
  - [backend] During the data parsing there are some functions that might be better suited as methods of `AdmArea` (e.g. `get_admarea_mask`)
  - For Facebook it's a file download. If the file already exists, the download is skipped. The data parsing is also different.
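
The skip-if-already-downloaded behaviour can be sketched as follows. This is an illustration under assumptions, not the package's code: `fetch_if_missing` and `fake_download` are hypothetical names, and the downloader is passed in as a callable so the sketch runs without network access.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def fetch_if_missing(url: str, dest: Path, download: Callable[[str], bytes]) -> bool:
    """Download `url` to `dest` unless the file already exists.

    Returns True when a download happened, False when the cached file was reused.
    """
    if dest.exists():
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(download(url))
    return True


calls: List[str] = []


def fake_download(url: str) -> bytes:
    """Hypothetical downloader that records each call instead of hitting the network."""
    calls.append(url)
    return b"lat,lon,population\n"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "population.csv"
    first = fetch_if_missing("https://example.org/population.csv", target, fake_download)
    second = fetch_if_missing("https://example.org/population.csv", target, fake_download)
```

The second call returns without touching the downloader because the file is already on disk.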

## Computing potential locations for facilities

In [None]:
adm_area.compute_potential_fac(spacing=0.05)
visualisation.plot_facilities(adm_area.pot_fac_gdf)

- `compute_potential_fac` simply defines a grid over the polygon of the chosen administrative area. However, I can imagine a need for different strategies to define potential facility locations.
- Some class attribute names are cryptic, e.g. `pot_fac_gdf` stands for "potential facilities gdf".
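
As a rough illustration of the grid idea, here is a standard-library sketch that lays a regular grid over a polygon's bounding box and keeps the points that fall inside. It is an assumption about the general technique, not the package's implementation: the function names are hypothetical, `spacing` is assumed to be in the same units as the coordinates, and the choice of cell centres as candidates is mine.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray casting: toggle on every edge crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal line through pt
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def grid_candidates(ring: List[Point], spacing: float) -> List[Point]:
    """Cell centres of a regular grid over the bounding box that lie inside the polygon."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    min_x, min_y = min(xs), min(ys)
    n_cols = math.ceil((max(xs) - min_x) / spacing)
    n_rows = math.ceil((max(ys) - min_y) / spacing)
    candidates = []
    for i in range(n_cols):
        for j in range(n_rows):
            pt = (min_x + (i + 0.5) * spacing, min_y + (j + 0.5) * spacing)
            if point_in_polygon(pt, ring):
                candidates.append(pt)
    return candidates


# Toy L-shaped "administrative area".
l_shape = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0), (1.0, 2.0), (0.0, 2.0)]
candidates = grid_candidates(l_shape, spacing=0.5)
```

Alternative strategies (e.g. candidates only along roads or near settlements) would replace `grid_candidates` while keeping the same output shape.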

## Retrieving the road network

In [None]:
adm_area.get_road_network("driving")

- `get_road_network` takes the `network_type` as an argument, but each network type has an extra parameter hardcoded in it: `default_speed`. We could make this an argument as well.
  - Actually, looking at it in more detail, this `default_speed` is just an imputation value for when `osmnx` cannot find the average speed of a certain edge in the network. So perhaps it makes sense not to let the user tune this.
- This method uses `osmnx` to get a graph of the given network from OpenStreetMap.
- The `osmnx` functions expect a specific projection, but they seem to handle the conversion themselves, so we shouldn't need to check projections.
- After getting the road network, we get the average travel speed on each edge (imputing a hardcoded value when it is missing).
- Finally, given the edge speeds, `osmnx` computes the travel time on each edge.
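
The last two steps (impute missing speeds, then derive travel times) boil down to simple arithmetic. The sketch below mirrors the idea only and is not the `osmnx` implementation: the edge dictionaries, the field names and the fallback value of 50 km/h are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical fallback speed in km/h; the package hardcodes its own value per network type.
DEFAULT_SPEED_KPH = 50.0


def add_travel_times(
    edges: List[Dict[str, Optional[float]]],
    default_speed_kph: float = DEFAULT_SPEED_KPH,
) -> List[Dict[str, float]]:
    """Return copies of the edges with an imputed speed and a travel time in seconds."""
    enriched = []
    for edge in edges:
        speed = edge.get("speed_kph")
        if speed is None:  # impute only when the speed is unknown
            speed = default_speed_kph
        # length in metres, speed in km/h -> seconds
        travel_time_s = edge["length_m"] * 3.6 / speed
        enriched.append(
            {"length_m": edge["length_m"], "speed_kph": speed, "travel_time_s": travel_time_s}
        )
    return enriched


edges = [
    {"length_m": 1000.0, "speed_kph": 36.0},  # speed known
    {"length_m": 500.0, "speed_kph": None},   # speed missing, gets the default
]
timed_edges = add_travel_times(edges)
```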

## Prepare optimization data

In [None]:
MAPBOX_API_TOKEN = None # fill out with your own access token for mapbox strategy
if MAPBOX_API_TOKEN is None:
    raise ValueError("Please fill out the MAPBOX_API_TOKEN with your own access token.")
DISTANCE_TYPE = "length"
pop_count, current, potential = adm_area.prepare_optimization_data(
    DISTANCE_TYPE, [2000, 5000, 10000], "driving", "mapbox", population_resolution=3, mapbox_access_token=MAPBOX_API_TOKEN)

In [None]:
pop_count.shape

In [None]:
current['length']

- This method is the longest.
- We have to pass the mode of transport, even though it was already defined in `get_road_network`.
- The population is aggregated onto a lower-resolution grid: the user sets the number of decimal digits (`population_resolution`) that latitude and longitude are rounded to.
- The existing facilities and the potential locations for new facilities are merged into the same gdf; I'm not sure why. This gdf is then passed into two calls of the same function, `population_served`, where it is split again into existing and potential facilities. The splitting is done by index, which is not very elegant. We could replace it with a new column that labels the type of facility (potential or existing).
- `population_served` is poorly documented and also quite long
  - In a first step, isopolygons are computed according to the chosen strategy.
  - The OSM strategy is computed locally with `networkx` and `osmnx`.
  - In a second step, the population and isopolygons are joined, but I haven't fully understood what happens to them.
  - This happens for every distance passed to `prepare_optimization_data`.
- The outputs are:
  - the population per point of the low-resolution grid
  - `current` and `potential`: TODO, I didn't get exactly what these are
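
To make the resolution step concrete, here is a small standard-library sketch of coarsening a population grid by rounding coordinates and summing the counts that land on the same point. It assumes that rounding plus summing is what happens, which I have not verified in the package; `aggregate_population` and the sample points are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_population(
    points: List[Tuple[float, float, float]], resolution: int
) -> Dict[Tuple[float, float], float]:
    """Round (lon, lat) to `resolution` decimals and sum the population per rounded point."""
    totals: Dict[Tuple[float, float], float] = defaultdict(float)
    for lon, lat, population in points:
        key = (round(lon, resolution), round(lat, resolution))
        totals[key] += population
    return dict(totals)


# Hypothetical fine-grained population points: (longitude, latitude, population).
points = [
    (126.4561, -8.4712, 10.0),
    (126.4563, -8.4714, 5.0),   # rounds to the same grid point as the first
    (126.4601, -8.4712, 2.0),
]
pop_by_cell = aggregate_population(points, resolution=3)
```

Three decimal digits of a degree correspond to roughly 100 m near the equator, so a lower `resolution` shrinks the optimization problem at the cost of spatial detail.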

## Optimize

In [None]:
CBC_SOLVER_PATH = None # fill in the path to the cbc executable
BUDGET = [5, 20, 50] # budget for the optimization in terms of how many locations can be built
cbc_optimize = partial(jg_opt.OpenOptimize, solver_path=CBC_SOLVER_PATH)
jg_opt.Solve(pop_count, current, potential, DISTANCE_TYPE, BUDGET, optimize=cbc_optimize, type='ID')