# GIS-Extraction Notebook for GEP-OnSSET

This notebook runs the GEP-OnSSET GIS extraction in bulk.

There are two options for how to choose the layers: 
 * Choose Option A to browse and select the layer each time.
 * Choose Option B to enter the path to each layer, which can be faster if you run the tool multiple times.

For Option B, please provide your layers in the form of **"../path/filename.ext"** and run all cells at once. Please make sure the layers are in the proper format and contain the right attributes where needed.

### Useful hints and common error messages
* Make sure that all input layers are using EPSG:4326 as the coordinate system
* Make sure that the target "crs" is in a coordinate system using meters as the unit
* It is often useful to clip all the input layers to the country boundaries in order to reduce processing times
* Make sure that each dataset actually has some data within the country boundaries
* Some of the datasets require the user to choose values from a dropdown list below
* For hydro points and mini-grids, the vector layers need some specific column names to work
* If a dataset still does not work, try opening it in QGIS, running the *Fix geometries* tool and saving the result as a new layer.
* If things do not work, it may be useful to go to the very top of this Jupyter Notebook and start again from cell 1
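
The column-name hint above can be checked before the extraction is started. The snippet below is a minimal sketch, assuming the column names of the layer have already been read into a list (for example with `list(gdf.columns)` in geopandas). `missing_columns` is a hypothetical helper and is not part of funcs.ipynb; the required names are the mini-grid columns mentioned later in this notebook.

```python
# Hypothetical helper, not part of funcs.ipynb.
# Column names the mini-grid layer is expected to contain.
MG_REQUIRED_COLUMNS = ["name", "MV_network", "MG_type"]


def missing_columns(available, required):
    """Return the required column names that are absent from `available`."""
    available = set(available)
    return [col for col in required if col not in available]


# Example with made-up column names standing in for a real layer.
example_columns = ["name", "MG_type", "geometry"]
print(missing_columns(example_columns, MG_REQUIRED_COLUMNS))
```

An empty list means that all required columns are present.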

## Importing necessary packages (Mandatory)

The required packages are imported from funcs.ipynb.

In [None]:
%run funcs.ipynb
import traceback
import time
#import warnings

#warnings.filterwarnings("ignore")

# Step 1: Indicate the layers and parameters to be used

## Run either Option A to select all the layers from the file explorer, or Option B to enter all the paths instead

## Option A

First, define the coordinate system (crs). 
Then select the correct layer each time. If there is a layer you do not have or do not wish to use, press **Cancel**.

In [None]:
crs = 'EPSG:3395'

In [None]:
# Make sure this matches the population column in the clusters file
x = 'Population'

# If you use the hydro layer, make sure the power column and unit match the below
hydro_power_column = "PowerMW"
hydro_power_unit = 'MW'

# Select the Admin 1 column if you are using Admin 1 boundaries
admin_1_name = "NAME_1"
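
The power unit set above tells the extraction how to read the values in the hydro power column. As an illustration of why the unit must match the data, here is a minimal sketch that normalises values to kW. The helper `to_kilowatts`, the unit table and the choice of kW as the target are assumptions made for this example; the conversion actually applied inside `processing_hydro` is not shown in this notebook and may differ.

```python
# Illustrative only: the conversion used inside processing_hydro may differ.
# Multipliers that turn a value in the given unit into kW.
KW_PER_UNIT = {"W": 0.001, "kW": 1.0, "MW": 1000.0}


def to_kilowatts(value, unit):
    """Convert a power value given in W, kW or MW to kW."""
    if unit not in KW_PER_UNIT:
        raise ValueError(f"Unsupported power unit: {unit!r}")
    return value * KW_PER_UNIT[unit]


# The same physical capacity, stated in two different units.
print(to_kilowatts(2.5, "MW"))
print(to_kilowatts(2500000, "W"))
```

Declaring `'MW'` for a column that actually holds watts would overstate every site by a factor of one million, which is why the column and unit should be checked against the layer.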

In [None]:
messagebox.showinfo('OnSSET extraction', 'Output folder')
workspace = filedialog.askdirectory()

messagebox.showinfo('OnSSET', 'Select the admin boundaries')
admin = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the clusters')
clusters = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))


## Raster layers
messagebox.showinfo('OnSSET', 'Select the Solar GHI layer')
ghi_layer = filedialog.askopenfilename(filetypes = (("rasters","*.tif"),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Travel Time layer')
travel_layer = filedialog.askopenfilename(filetypes = (("rasters","*.tif"),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Wind layer')
wind_layer = filedialog.askopenfilename(filetypes = (("rasters","*.tif"),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Night Lights layer')
ntl_layer = filedialog.askopenfilename(filetypes = (("rasters","*.tif"),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Custom Demand layer')
custDem_layer = filedialog.askopenfilename(filetypes = (("rasters","*.tif"),("all files","*.*")))


## Vector Layers

messagebox.showinfo('OnSSET', 'Select the Substations layer')
sub_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Existing HV layer')
HVexist_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Planned HV layer')
HVplan_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Existing MV layer')
MVexist_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Planned MV layer')
MVplan_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Roads layer')
road_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Distribution Transformer layer')
trx_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

## If you use this, make sure the hydro layer is a point layer (not a multipoint) and check the power column and unit defined above
messagebox.showinfo('OnSSET', 'Select the Hydro layer')
hydro_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

## If you use this make sure your mini grid layer has these columns 'name', "MV_network", "MG_type"
messagebox.showinfo('OnSSET', 'Select the Mini Grid layer')
exist_MG_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))

messagebox.showinfo('OnSSET', 'Select the Admin 1 layer')
adm1_layer = filedialog.askopenfilename(filetypes = (("vector",["*.shp", "*.gpkg", "*.geojson"]),("all files","*.*")))



## Option B

First, define the coordinate system (crs). 
Then enter the path to each layer. If there is a layer you do not have or do not wish to use, leave it empty (**''**).

In [None]:
#crs = 'EPSG:32737'
#workspace = r'C:\Users\andre\Documents\TrainingMaterial\ExtractionTest\Test2'
#admin = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\administrative\gadm36_SLE_0.geojson"
#clusters = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\clusters\Clusters_SL.gpkg"

# Make sure this matches the population column in the clusters file
#x = "Population"

## Raster layers
#ghi_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\GHI\SL_GHI.tif"
#travel_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\travel_time\2015_accessibility_to_cities_v1.0_SL.tif"
#wind_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\wind\WindSpeed.tif"
#ntl_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\ntl\NightLights.tif"
#custDem_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\customDemand\CREDIT.tif"

## Vector Layers
#sub_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\power_infrastructure\Transformer_SL.gpkg"
#HVexist_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\power_infrastructure\HV_existing_SL.gpkg"
#HVplan_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\power_infrastructure\HV_proposed_SL.gpkg"
#MVexist_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\power_infrastructure\MV_existing_GridFinder.gpkg"
#MVplan_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\power_infrastructure\MV_existing_GridFinder.gpkg"
#road_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\roads\sierra-leone-highway-latest.gpkg"
#trx_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\power_infrastructure\Transformer_SL.gpkg"

## If you use this make sure the hydro layer is a point layer (not a multipoint) and has a column "power" in "Watts"
#hydro_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\hydropower\hydro_potential_SL_points.gpkg"

# If you use the hydro layer, make sure the power column and unit match the below
#hydro_power_column = "power"
#hydro_power_unit = 'W'

## If you use this make sure your mini grid layer has these columns 'name', "MV_network", "MG_type"
#exist_MG_layer = r""

## Admin 1 layer and the column that holds the Admin 1 name
#adm1_layer = r"C:\Users\andre\Documents\TrainingMaterial\Scripts\Input_file_extraction\Input\administrative\SL_admin1.gpkg"
#admin_1_name = 'NAME_1'
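
With Option B, a mistyped path only shows up when the corresponding cell fails in Step 2. A minimal sketch of an up-front check is given below, using only the Python standard library. `check_layer_paths` is a hypothetical helper (not part of funcs.ipynb), and the dictionary contents are placeholder values rather than real files.

```python
import os


def check_layer_paths(layers):
    """Classify each layer path as 'skipped' (empty), 'ok' or 'missing'."""
    status = {}
    for name, path in layers.items():
        if not path:
            # An empty string means the layer is intentionally left out.
            status[name] = "skipped"
        elif os.path.isfile(path):
            status[name] = "ok"
        else:
            status[name] = "missing"
    return status


# Placeholder values; replace them with your own variables.
layers = {
    "exist_MG_layer": "",
    "ghi_layer": "no_such_folder/ghi.tif",
}
for name, state in check_layer_paths(layers).items():
    print(f"{name}: {state}")
```

Layers reported as `missing` are worth correcting before Step 2 is run, while `skipped` layers are simply not extracted.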

# Step 2: Process layers

## Import admin

In [None]:
admin = gpd.read_file(admin)

## Import clusters

In [None]:
clusters = gpd.read_file(clusters)
clusters = gpd.clip(clusters, admin)

## Extract Global Horizontal Irradiation (GHI) from Raster layer

In [None]:
try:
    clusters = zonal_stat(ghi_layer, clusters, 'mean', 'GHI')
    # Print a timestamp once the step has finished
    print(time.ctime())
except rasterio.RasterioIOError as e:
    print('Could not process solar GHI, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Travel Time from Raster layer

In [None]:
#clusters = processing_raster("traveltime","mean",clusters)
try:
    clusters = zonal_stat(travel_layer, clusters, 'mean', 'TravelTime')
    # Print a timestamp once the step has finished
    print(time.ctime())
except rasterio.RasterioIOError as e:
    print('Could not process Travel Time, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Wind Velocity from Raster layer

In [None]:
#clusters = processing_raster("wind","mean",clusters)
try:
    clusters = zonal_stat(wind_layer, clusters, 'mean', 'WindVel')
    # Print a timestamp once the step has finished
    print(time.ctime())
except rasterio.RasterioIOError as e:
    print('Could not process Wind velocity, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Night Lights from Raster layer

In [None]:
#clusters = processing_raster("ntl","max",clusters)
try:
    clusters = zonal_stat(ntl_layer, clusters, 'max', 'NightLight')
    # Print a timestamp once the step has finished
    print(time.ctime())
except rasterio.RasterioIOError as e:
    print('Could not process Night Lights, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Custom Demand from Raster layer

In [None]:
try:
    clusters = zonal_stat(custDem_layer, clusters, 'mean', 'CustomDemand')
    # Print a timestamp once the step has finished
    print(time.ctime())
except rasterio.RasterioIOError as e:
    print('Could not process Custom Demand, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Preparing to run the vector data

In [None]:
clusters = preparing_for_vectors(workspace, clusters, crs)

## Extract Distance from Substations (Vector point layer)

In [None]:
try:
    clusters = processing_points_bulk(sub_layer, "Substation", admin, crs, workspace, clusters, mg_filter=False)
except fiona.errors.DriverError as e:
    print('Could not process Substations, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Existing high voltage lines (Vector line layer)

In [None]:
try:
    clusters = processing_lines_bulk(HVexist_layer, "Existing_HV", admin, crs, workspace, clusters)
except fiona.errors.DriverError as e:
    print('Could not process Existing HV, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Planned high voltage lines (Vector line layer)

In [None]:
try:
    clusters = processing_lines_bulk(HVplan_layer, "Planned_HV", admin, crs, workspace, clusters)
except fiona.errors.DriverError as e:
    print('Could not process Planned HV, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Existing medium voltage lines (Vector line layer) 

In [None]:
try:
    clusters = processing_lines_bulk(MVexist_layer, "Existing_MV", admin, crs, workspace, clusters)
except fiona.errors.DriverError as e:
    print('Could not process Existing MV, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Planned medium voltage lines (Vector line layer)

In [None]:
try:
    clusters = processing_lines_bulk(MVplan_layer, "Planned_MV", admin, crs, workspace, clusters)
except fiona.errors.DriverError as e:
    print('Could not process Planned MV, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Roads (Vector line layer)

In [None]:
try:
    clusters = processing_lines_bulk(road_layer, "Road", admin, crs, workspace, clusters)
except fiona.errors.DriverError as e:
    print('Could not process Roads, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Transformers (Vector point layer)

In [None]:
try:
    clusters = processing_points_bulk(trx_layer, "Transformer", admin, crs, workspace, clusters, mg_filter=False)
except fiona.errors.DriverError as e:
    print('Could not process Distribution transformers, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from hydro points (Vector point layer)

In [None]:
try:
    hydro = gpd.read_file(hydro_layer)
    clusters = processing_hydro(admin, crs, workspace, clusters, hydro, hydro_power_column, hydro_power_unit)
except fiona.errors.DriverError as e:
    print('Could not process Hydro, or layer was not selected')
    print(e)
except Exception:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Extract Distance from Existing ESP (mini-grid) data (Vector point layer)

In [None]:
try:
    clusters = processing_points_bulk(exist_MG_layer, "MG", admin, crs, workspace, clusters, mg_filter=True)
except fiona.errors.DriverError as e:
    print('Could not process Mini-grids, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())

## Adding admin 1 name to clusters (Vector polygon layer)


In [None]:
try:
    clusters = get_admin1_name_bulk(clusters, adm1_layer, admin_1_name, crs)
except fiona.errors.DriverError as e:
    print('Could not process Admin_1, or layer was not selected')
    print(e)
except Exception as e:
    # Any other error: show the full traceback
    print(traceback.format_exc())
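
Every extraction cell in Step 2 follows the same pattern: try the step, print a short message if the layer could not be opened, and print the full traceback for anything else. That pattern could be collected in one place, as in the minimal sketch below. `run_step` is a hypothetical helper that is not part of funcs.ipynb. The `expected` argument defaults to `OSError` purely as a stand-in; in this notebook it would be the exception class each cell already catches, such as `rasterio.RasterioIOError` or `fiona.errors.DriverError`.

```python
import time
import traceback


def run_step(label, step, fallback, expected=(OSError,)):
    """Run one extraction step and return its result.

    If the step fails, report the problem and return `fallback`
    (typically the unchanged clusters) so later steps can still run.
    """
    try:
        result = step()
        print(f"{label} finished at {time.ctime()}")
        return result
    except expected as e:
        print(f"Could not process {label}, or layer was not selected")
        print(e)
    except Exception:
        # Any other error: show the full traceback
        print(traceback.format_exc())
    return fallback


# Stand-ins for a real layer and the clusters GeoDataFrame.
def failing_step():
    raise FileNotFoundError("layer.tif not found")


clusters = {"rows": 3}
clusters = run_step("Solar GHI", failing_step, clusters)
print(clusters)
```

In the notebook, the step would be wrapped in a `lambda`, for example `lambda: zonal_stat(ghi_layer, clusters, 'mean', 'GHI')`, with `clusters` passed as the fallback.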

## Conditioning & Export (Mandatory)

This is the final cell in the extraction and must always be run.

In [None]:
clusters = create_prio_columns(clusters)
clusters = conditioning(clusters, workspace, x)