# Hands-on

In this notebook, we extract the characteristics of a `.geojson` file obtained from DeepPVMapper. This file corresponds to the outputs of the example provided in the hands-on of DeepPVMapper's repository, accessible at this URL: [https://github.com/gabrielkasmi/dsfrance/blob/main/notebooks/hands-on.ipynb](https://github.com/gabrielkasmi/dsfrance/blob/main/notebooks/hands-on.ipynb).

We work with the raw polygon file `arrays_69.geojson`. You can either generate your own file by following the instructions on DeepPVMapper's repository, or use the file supplied in the folder `hands-on` on our Zenodo repository.

## Specifying the parameters

`PyPVRoof` accepts the necessary parameters in two ways: either through a `config.yml` file or as a dictionary passed as input. The parameter names are the same in both cases. In this notebook, we pass a dictionary. The parameters can be categorized as follows:

* General parameters:
    * `has-data` (boolean). Specifies whether auxiliary data is accessible.
    * `has-dem` (boolean). Specifies whether surface models (as `.geotiff`) are accessible.
    * `tilt-method` (str). Specifies the tilt estimation method. Can be `lut`, `theil-sen` or `constant`.
    * `azimuth-method` (str). Specifies the azimuth estimation method. Can be `bounding-box`, `theil-sen` or `constant`.
    * `regression-type` (str). Specifies the regression used to estimate the installed capacity, e.g. `constant` or `clustered`.
    * `output-name` (str). Specifies the name of the output file. 
    * `data-dir` (str). Specifies the directory of the source data.

If you do not wish to specify some parameters, you can pass an empty string `''` for these fields.

* Parameters related to a method:
    * If `tilt-method == 'constant'`:
        * `constant-tilt` (int or float). The value taken as a constant tilt.
    * If `tilt-method == "lut"`. This method requires auxiliary data, so `has-data` must be `True`.
        * `data-directory` (str). The directory where the auxiliary file is located.
        * `data-name` (str). The name of the auxiliary file (in the auxiliary directory).
        * `latitude_var` (str). The name of the field corresponding to the latitude in the raw auxiliary file.
        * `longitude_var` (str). The name of the field corresponding to the longitude in the raw auxiliary file.
        * `ic_var` (str). The name of the field corresponding to the installed capacity in the raw auxiliary file.
        * `surface_var` (str). The name of the field corresponding to the surface in the raw auxiliary file.
        * `tilt_var` (str). The name of the field corresponding to the tilt in the raw auxiliary file.
        * `lut-steps` (int). The number of steps used to generate the LUT. The LUT will be a `np.ndarray` of size `(lut-steps, lut-steps)`.
        * `regression-clusters` (int). The number of clusters for the regression.
    * If `tilt-method == "theil-sen"` or `azimuth-method == "theil-sen"`. These methods require a surface model, so `has-dem` must be `True`.
        * `conversion` (str or None). A comma-separated string giving the coordinate system of the polygons followed by that of the raster, e.g. `"epsg:4326,epsg:2154"`. If both coordinate systems are the same, leave it as `None`.
        * `M` and `N` (int or None). Number of subsamples and maximum subpopulation parameters for the Theil-Sen estimation. If left empty, default parameters are used.
        * `offset`. The offset between the polygon edges and the mask edges when computing the mask from the polygons. Also used to extract an array from the DEM raster.
    * If `azimuth-method == "bounding-box"`:
        * No additional parameter required.
    * If `regression-type == 'constant'`:
        * `default-coefficient` (int or float). Corresponds to the efficiency of the array, expressed in `kWp/m²`.
    * If `regression-type == 'clustered'`:
        * `regression-clusters` (int). The number of clusters for the regression.
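
The dependencies between parameters listed above can be checked before the extractor is created. The helper below is a minimal sketch and is not part of `PyPVRoof`: the function name `check_params` and its messages are hypothetical, and it only encodes the rules stated in the list.

```python
def check_params(params):
    """Return a list of problems found in a parameter dictionary.

    Hypothetical helper: it only encodes the rules stated in the list above.
    """
    problems = []
    tilt = params.get("tilt-method")
    azimuth = params.get("azimuth-method")
    regression = params.get("regression-type")

    # A constant tilt needs the value to use.
    if tilt == "constant" and params.get("constant-tilt") in (None, ""):
        problems.append("tilt-method 'constant' requires 'constant-tilt'")
    # The look-up table is built from auxiliary data.
    if tilt == "lut" and params.get("has-data") is not True:
        problems.append("tilt-method 'lut' requires 'has-data' to be True")
    # Theil-Sen works on a surface model.
    if "theil-sen" in (tilt, azimuth) and params.get("has-dem") is not True:
        problems.append("'theil-sen' requires 'has-dem' to be True")
    # Each regression type has its own mandatory parameter.
    if regression == "constant" and params.get("default-coefficient") in (None, ""):
        problems.append("regression-type 'constant' requires 'default-coefficient'")
    if regression == "clustered" and params.get("regression-clusters") in (None, ""):
        problems.append("regression-type 'clustered' requires 'regression-clusters'")
    return problems


example = {
    "has-data": True,
    "has-dem": False,
    "tilt-method": "lut",
    "azimuth-method": "theil-sen",  # inconsistent with has-dem = False
    "regression-type": "clustered",
    "regression-clusters": 4,
}
for problem in check_params(example):
    print(problem)
```

An empty list means that none of the rules above is violated; it does not guarantee that the extractor accepts the dictionary.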

In [None]:
# Library imports
import geojson
import pandas as pd
import os
import json
import matplotlib.pyplot as plt
from src import main

In [None]:
# Load the file from Zenodo or from your local directory (uncomment what's relevant for you)

# !wget 'https://zenodo.org/record/7586879/files/hands-on.zip?download=1' -O 'hands-on.zip'
# !unzip 'hands-on.zip' 
# !rm 'hands-on.zip' # delete the zip file.


# root directory (uncomment what is relevant for you)
# source_dir = "path/to/your/data"
source_dir = 'hands-on'

# file names
source_data_name = "bdappv-metadata.csv"
source_input_name = "arrays_69.geojson"
lookup_name = "lookup-table.json"

# load the files
dataset = pd.read_csv(os.path.join(source_dir, source_data_name))
# use context managers so that the files are closed after reading
with open(os.path.join(source_dir, source_input_name)) as f:
    arrays = geojson.load(f)
with open(os.path.join(source_dir, lookup_name)) as f:
    lookup = json.load(f)
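
The polygon file is a GeoJSON `FeatureCollection`, which is why individual arrays are accessed through the `features` key. The sketch below illustrates that structure on a made-up, single-polygon collection using only the standard library. The coordinates and the `id` property are hypothetical; the properties actually stored in `arrays_69.geojson` are not shown here.

```python
import json

# Hypothetical FeatureCollection with one rectangular polygon.
# GeoJSON positions are (longitude, latitude) pairs.
raw = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"id": 0},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[4.80, 45.70], [4.81, 45.70], [4.81, 45.71],
                         [4.80, 45.71], [4.80, 45.70]]]
      }
    }
  ]
}
"""

collection = json.loads(raw)
features = collection["features"]   # one entry per polygon
first = features[0]
ring = first["geometry"]["coordinates"][0]  # exterior ring of the polygon

print(len(features), first["geometry"]["type"], len(ring))
# A GeoJSON ring is closed: its first and last positions are identical.
print(ring[0] == ring[-1])
```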


## Set-up

We will extract the characteristics of the `.geojson` file and load them locally in this notebook. We choose the following methods:

* `tilt-method`: look-up table
* `regression-type`: clustered regression
* `azimuth-method`: bounding box

We will use the method `extract_all_characteristics` of the `MetadataExtraction` class. It needs the following inputs: an auxiliary table and the file containing the polygons, both loaded in the previous cell.


In [None]:
# define the dictionary of parameters

params = {
    # general parameters
    "has-data" : True,
    "has-dem"  : False,
    'tilt-method' : "lut",
    "azimuth-method" : 'bounding-box',
    'regression-type' : "linear",
    'output-name' : "arrays_characteristics",
    'data-dir' : source_dir,

    # parameters specific to our methods
    "data-directory" : source_dir,
    'data-name' : source_data_name,
    "latitude_var" : "latitude",
    "longitude_var" : "longitude",
    "ic_var" : "kWp",
    "tilt_var" : 'tilt',
    "surface_var" : "surface",
    "lut-steps" : 50,
    'regression-clusters' : 4,
    
}

# initialize the extractor object
extraction = main.MetadataExtraction(p = params, lut = lookup)

Our `extraction` object is initialized with our desired parameters. Alternatively, you can dump these parameters in a `config.yml` file and initialize as follows:
```python
    extraction = main.MetadataExtraction(cf = cf)
```
where `cf` is your configuration file. We now extract the metadata for all our polygons and store the results in the directory `source_dir`.


In [None]:
extraction.extract_all_characteristics(arrays)

In [None]:
# Load the results stored locally
out = pd.read_csv(os.path.join(source_dir, "{}.csv".format(params['output-name'])))
out.head()

And that's it! We can now plot the distributions of the estimated azimuths, installed capacities and tilts. Most arrays point south, which is reassuring. Besides, most installations have a small installed capacity (below 100 kWp), as our algorithm mostly targets distributed PV.

In [None]:
fig, ax = plt.subplots(1,3, figsize = (13,4))

ax[0].set_title('Azimuth [°]')
ax[0].hist(out["azimuth"], density = True)

ax[1].set_title('Installed capacity [kWp]')
ax[1].hist(out['installed_capacity'])
ax[1].set_yscale('log')

ax[2].set_title('Tilt [°]')
ax[2].hist(out['tilt'], density = True)

plt.show()
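
The visual impression given by the histograms can be backed by two numbers: the share of arrays facing roughly south and the share of installations below 100 kWp. The sketch below uses made-up values in place of `out["azimuth"]` and `out["installed_capacity"]`. The helper names are hypothetical, and the azimuth convention (here 180° for south) is an assumption that should be checked against your own outputs.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def share_facing(azimuths, direction, tolerance=45.0):
    """Fraction of azimuths within `tolerance` degrees of `direction`."""
    azimuths = list(azimuths)
    hits = sum(angular_distance(a, direction) <= tolerance for a in azimuths)
    return hits / len(azimuths)


def share_below(capacities, threshold):
    """Fraction of capacities strictly below `threshold`."""
    capacities = list(capacities)
    return sum(c < threshold for c in capacities) / len(capacities)


# Hypothetical values standing in for the columns of `out`.
azimuths = [170.0, 185.0, 90.0, 270.0, 200.0]
capacities = [3.0, 9.0, 36.0, 250.0]

SOUTH = 180.0  # assumption: south is encoded as 180 degrees

print(share_facing(azimuths, SOUTH))   # share of south-facing arrays
print(share_below(capacities, 100.0))  # share of installations below 100 kWp
```

Both helpers accept any iterable of numbers, so a column of the results table can be passed directly.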

## Comparison of methods

We now compare the tilt and azimuth estimated with the LUT and the bounding-box methods with those estimated with the Theil-Sen method. We do this for a single installation, as the raster is not available everywhere.

In [None]:
item = arrays['features'][1356] # specific array for which the DEM is available

params['azimuth-method'] = 'bounding-box'
params["tilt-method"] = "lut"

extraction = main.MetadataExtraction(p = params, lut = lookup)
tilt_1, azim_1 = extraction.compute_tilt(item), extraction.compute_azimuth(item)


params['azimuth-method'] = 'theil-sen'
params["tilt-method"] = "theil-sen"

# Parameters for Theil-sen estimation
params['offset'] = 25
params['raster-folder'] = '../hands-on'
params['conversion'] =  "epsg:4326,epsg:2154"

extraction = main.MetadataExtraction(p = params, lut = lookup)
tilt_2, azim_2 = extraction.compute_tilt(item), extraction.compute_azimuth(item)
 
print(''' Characteristics extraction comparison:
Tilt : {:0.2f} (LUT) / {:0.2f} (Theil-Sen)
Azimuth : {:0.2f} (BB) / {:0.2f} (Theil-Sen)
'''.format(tilt_1, tilt_2, azim_1, azim_2))

Below is a view of the installation (right) and the mask (left). We can see that Theil-Sen improves over the bounding box when the mask's boundaries are not parallel to the installation's edges. Mask and image are not at the same scale.


<table align="center"><tr>
<td> <img src="assets/mask.png" alt="Drawing" style="width: 200px;"/> </td>
<td> <img src="assets/img.png" alt="Drawing" style="width: 200px;"/> </td>
</tr></table>

<i> Image: Google Maps </i>
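
For readers unfamiliar with the Theil-Sen estimator used in the comparison above, here is a minimal sketch of its textbook form: the slope is the median of the slopes computed over all pairs of points, which makes it robust to outliers. This is not `PyPVRoof`'s implementation and it does not reproduce the `M` and `N` subsampling parameters. The height profile is made up, and converting the slope to a tilt with an arctangent is an assumption about how a height-over-distance slope maps to an angle.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Fit y = slope * x + intercept with the Theil-Sen estimator."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip pairs with equal x, whose slope is undefined
    ]
    slope = median(slopes)
    # The intercept is the median of the residuals y - slope * x.
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Hypothetical height profile across a roof: distance and height in metres.
# The last point is an outlier (e.g. a chimney in the surface model).
distances = [0.0, 1.0, 2.0, 3.0, 4.0]
heights = [3.0, 3.5, 4.0, 4.5, 9.0]

slope, intercept = theil_sen(distances, heights)
tilt = math.degrees(math.atan(abs(slope)))  # assumed slope-to-angle conversion

print("slope: {:0.2f}, intercept: {:0.2f}, tilt: {:0.2f}".format(slope, intercept, tilt))
```

In this example the outlier does not move the estimate: six of the ten pairwise slopes equal 0.5, so the median slope stays at 0.5, whereas an ordinary least-squares fit would be pulled upwards by the last point.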


