In [1]:
%load_ext watermark
import pandas as pd
import numpy as np
from typing import Type, Optional, Callable
from typing import List, Dict, Union, Tuple

from review_methods_tests import collect_vitals, find_missing, find_missing_loc_dates
from review_methods_tests import make_a_summary

import matplotlib.pyplot as plt
import matplotlib as mpl
import matplotlib.colors
from matplotlib.colors import LinearSegmentedColormap, ListedColormap

import setvariables as conf_
import reportclass as r_class

# Report class

The report class is used to generate descriptive statistics and identify objects of interest for a query defined by geographic, administrative and temporal bounds. The reference is the Swiss federal report published in 2022.

## Defining and collecting the data

The survey records are stored separately from the environmental and administrative data. Each survey is identified by `loc_date`, an id that combines the location and date of the survey. There can be up to 228 codes associated with one `loc_date`, and most of them will have a quantity of zero. The combination of `loc_date` and `code` should be unique.
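
The uniqueness requirement can be checked directly. The following is a minimal sketch, assuming a long-format dataframe with `loc_date` and `code` columns. The toy records are invented for illustration and are not taken from the survey data.

```python
import pandas as pd

# toy survey records, invented for illustration
records = pd.DataFrame({
    "loc_date": ["('grand-clos', '2020-05-07')"] * 3 + ["('les-glariers', '2020-12-01')"] * 2,
    "code": ["G27", "G30", "G27", "G27", "G30"],
    "quantity": [4, 0, 1, 2, 0],
})

# every row that shares its loc_date and code with another row
key = ["loc_date", "code"]
duplicates = records[records.duplicated(subset=key, keep=False)]

# True only if each (loc_date, code) pair appears once
is_unique = duplicates.empty
```

With `keep=False` both members of a duplicated pair are returned, which makes it easier to see which records need to be merged or corrected.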

### basic requirements

__Define the limits of the request:__
   * temporal
   * geographic (includes features and parent boundaries)
   * object types
   * level of aggregation

Define what codes are being used

The default setting is to combine all the fragmented plastics into one group (all sizes), and to do the same for fragmented expanded polystyrene and plastic bottle tops. This results in three codes that each represent very similar objects. This topic has been addressed many times. These groups register nontrivial quantities at most surveys. However, differentiating these objects into their respective subgroups, e.g. plastic caps for drinking vs. plastic caps for household cleaners, is not considered a priority by all groups that have collected data in the past.

* Gfrags
* Gfoams
* Gcaps
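
As a rough sketch of what combining codes involves: the child codes are replaced by their group code and the quantities are summed again per sample. The child codes are the ones listed in the labels of the most common objects table in this report. The function name `combine_codes` and the toy data are hypothetical, and this is not necessarily how the project does it.

```python
import pandas as pd

# child codes per group, as listed in the object labels of this report
code_groups = {
    "Gfrags": ["G75", "G78", "G79", "G80"],
    "Gfoams": ["G76", "G81", "G82", "G83"],
    "Gcaps": ["G21", "G22", "G23", "G24"],
}
# invert to a child code -> group code lookup
to_group = {child: parent for parent, children in code_groups.items() for child in children}

def combine_codes(df: pd.DataFrame) -> pd.DataFrame:
    """Replace child codes with their group code and re-sum quantity per sample."""
    out = df.copy()
    out["code"] = out["code"].replace(to_group)
    return out.groupby(["loc_date", "code"], as_index=False)["quantity"].sum()

# toy data, invented for illustration
toy = pd.DataFrame({
    "loc_date": ["s1", "s1", "s1", "s1"],
    "code": ["G78", "G79", "G21", "G27"],
    "quantity": [3, 2, 1, 5],
})
combined = combine_codes(toy)
```

Summing after the replacement keeps the combination of `loc_date` and `code` unique, which the rest of the report relies on.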

__Define the reporting language:__

The reporting language can be either French, German or English.

__Note:__ The reporting language is only applied at the moment of display. The column names, feature labels and other underlying identifying criteria for the data remain unchanged. The column name definitions and translations are in the _random variables_ section.
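
The idea of translating only at display time can be sketched as follows. The `lang_map` dictionary and the `for_display` function are hypothetical stand-ins. The real maps come from `r_class.language_maps()` and may be structured differently.

```python
import pandas as pd

# hypothetical language map, keyed by language then by column name
lang_map = {"fr": {"city": "Ville", "samples": "Échantillons"}}

def for_display(df: pd.DataFrame, language: str) -> pd.DataFrame:
    """Return a renamed copy; the working dataframe is left untouched."""
    return df.rename(columns=lang_map.get(language, {}))

work = pd.DataFrame({"city": [6, 4, 1], "samples": [7, 4, 11]})
shown = for_display(work, "fr")
```

Because `rename` returns a copy, any later grouping or filtering still uses the original column names.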

__Summary__
   
From the Annex in `testing_data_models` we identified the column combinations needed to slice the data depending on the report request. At the same time we identified the operations to be performed and when to perform them, as different columns are used to group the data. This was summarised as follows:

   * `df (pd.DataFrame)`: The input DataFrame containing data for analysis.
   * `cumulative_columns (List, optional)`: List of columns to be considered for cumulative values.
   * `boundary_labels (List, optional)`: List of labels for boundary summaries.
   * `object_labels (List, optional)`: List of labels for individual objects.
   * `object_columns (List, optional)`: List of columns identifying objects.
   * `unit_agg (dict, optional)`: Aggregation methods for unit summaries.
   * `unit_columns (List, optional)`: List of columns for unit summaries.
   * `agg_groups (dict, optional)`: Aggregation methods for boundary summaries.


### Work data

A report can be defined by providing the temporal and geographic bounds of interest. Below is the current method.

__code sample:__

```python
# starting data, can be MySQL or NoSQL calls
# the three methods accept Callables, as long
# as the output is a pd.DataFrame
c_l = r_class.language_maps()
surveys = r_class.collect_survey_data_for_report()
codes, beaches, land_cover, land_use, streets, river_intersect_lakes = r_class.collect_env_data_for_report()

survey_data = surveys.copy()
survey_data = survey_data.merge(beaches['canton'], left_on='slug', right_index=True, validate='many_to_one')

# temporal and geographic boundaries
# user defined input
boundaries = dict(canton='Valais', language='fr', start_date='2019-01-01', end_date='2022-01-01')

# the level and label of the report
# the language for display
# the data for the report and all other
# from the data range
top_label, language, w_df, w_di = r_class.report_data(boundaries, survey_data.copy(), beaches, codes)

# display the first rows of the working data
w_df.head().style.set_table_styles(conf_.table_css_styles)
```

Which produces the following untranslated output.

In [2]:
# starting data, can be MySQL or NoSQL calls
# the three methods accept Callables, as long
# as the output is a pd.DataFrame
c_l = r_class.language_maps()
surveys = r_class.collect_survey_data_for_report()
codes, beaches, land_cover, land_use, streets, river_intersect_lakes = r_class.collect_env_data_for_report()

survey_data = surveys.copy()
survey_data = survey_data.merge(beaches['canton'], left_on='slug', right_index=True, validate='many_to_one')

# temporal and geographic boundaries
# user defined input
boundaries = dict(canton='Valais', language='fr', start_date='2019-01-01', end_date='2022-01-01')

# the level and label of the report
# the language for display
# the data for the report and all other
# from the data range
top_label, language, w_df, w_di = r_class.report_data(boundaries, survey_data.copy(), beaches, codes)

# display the first rows of the working data
w_df.head().style.set_table_styles(conf_.table_css_styles)

Unnamed: 0,city,parent_boundary,loc_date,length,canton,date,feature_type,slug,feature_name,groupname,code,quantity,pcs_m
35698,Lens,les-alpes,"('clean-up-tour-crans-montana', '2021-06-12')",43,Valais,2021-06-12,p,crans-montana,alpes-valaisannes,agriculture,G13,0,0.0
35699,Lens,les-alpes,"('clean-up-tour-crans-montana', '2021-06-12')",43,Valais,2021-06-12,p,crans-montana,alpes-valaisannes,agriculture,G140,0,0.0
35700,Lens,les-alpes,"('clean-up-tour-crans-montana', '2021-06-12')",43,Valais,2021-06-12,p,crans-montana,alpes-valaisannes,agriculture,G161,0,0.0
35701,Lens,les-alpes,"('clean-up-tour-crans-montana', '2021-06-12')",43,Valais,2021-06-12,p,crans-montana,alpes-valaisannes,agriculture,G168,0,0.0
35702,Lens,les-alpes,"('clean-up-tour-crans-montana', '2021-06-12')",43,Valais,2021-06-12,p,crans-montana,alpes-valaisannes,agriculture,G170,0,0.0


## Reporting categories

The first variable of the input is used to define the hierarchy of the report. For administrative purposes a vertical approach that reflects areas of responsibility is important. For estimating values the geographic/topographic attributes are more important.

The survey data is labeled for these purposes. The columns `parent_boundary`, `feature_type` and `feature_name` are the topographic features. 

1. `parent_boundary`: the name of the river basin, catchment area, park, geographic region or other zone defined by Swiss geo admin.
2. `feature_type`: lake, river or park
3. `feature_name`: the name of the lake, river or park

The `geo_h` array sets the order for reporting. Reports for cantons can contain subreports for all the values in the array. By default the cantonal results reference the IQAASL report for threshold or prior results. Reports for cities contain only geographic categories, with reference to cantonal results.

__code sample:__


```python


geo_h = ['parent_boundary', 'feature_type',  'feature_name','canton', 'city']


def categorize_work_data(df, labels, columns_of_interest: List[str] = geo_h, sample_id: str = 'loc_date'):
       
    data = df[df[labels[0]] == labels[1]].copy()
    
    summaries = columns_of_interest
    print(summaries)
    
    # if city is selected the available boundaries
    # are geographic. A city is in only one canton
    # if canton is selected then city becomes a category
    # for which a report can be produced    
    if labels[0] == columns_of_interest[-1]:
        summaries = columns_of_interest[:-2]
    if labels[0] == columns_of_interest[-2]:
        summaries = [*columns_of_interest[:-2], columns_of_interest[-1]]
    
    new_columns = list(set([sample_id, *summaries]))
    d = data[new_columns].copy()
    res = {}
    for an_attribute in new_columns:
        datt = d[an_attribute].unique()
        res.update({an_attribute: datt})
    
    # use the sample_id argument so a non-default id column also works
    res['samples'] = res.pop(sample_id)
    
    return {labels[1]:res}

# this categorizes the survey data into search terms
# the available data or reporting categories are retrieved
# by getting the length of the array for each category
# if the category is not present then the data is not available
parent_categories = categorize_work_data(w_df, top_label)
p_vals = parent_categories[boundaries[top_label[0]]]

# the type and number of reports available
reporting_categories = {k:len(v) for k, v in p_vals.items()}
reporting_categories
```

Which gives the following result:

In [3]:
# this categorizes the survey data into search terms
# the available data or reporting categories are retrieved
# by getting the length of the array for each category
# if the category is not present then the data is not available
parent_categories = r_class.categorize_work_data(w_df, top_label)
p_vals = parent_categories[boundaries[top_label[0]]]

# the type and number of reports available
reporting_categories = {k:len(v) for k, v in p_vals.items()}
reporting_categories

{'city': 10,
 'parent_boundary': 2,
 'feature_type': 3,
 'feature_name': 3,
 'samples': 22}

The same operation can be performed at each level. The first call to `categorize_work_data` gives the structure of the report. For each key of the reporting categories there will be a set of descriptive statistics.

For example, a detailed report on all feature types within the canton includes the following summary data for each feature type.

__code sample:__

```python
# identify and count the results from parcs
parc_features = categorize_work_data(w_df[w_df.feature_type == p_vals['feature_type'][0]], top_label)

# count the contents in each attribute
{k:len(v) for k, v in parc_features[top_label[1]].items()}

# out =>

{'city': 6,
 'feature_type': 1,
 'parent_boundary': 1,
 'feature_name': 1,
 'samples': 7}

```

In this example there are 7 samples from 6 cities in the parcs feature_type. 

The summary of each label for each feature in the current data set can be obtained by providing the feature of interest to the groupby columns. By default the sample id: `loc_date` and the location name `slug` are required.
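
The grouping described above can be sketched as two steps: sum `pcs_m` per sample, then describe the sample totals per label. The function name `sample_totals` and the toy records are assumptions made for illustration and are not part of `reportclass`.

```python
import pandas as pd

# toy records, invented for illustration
df = pd.DataFrame({
    "loc_date": ["a", "a", "b", "b", "c", "c"],
    "slug": ["x", "x", "x", "x", "y", "y"],
    "feature_type": ["l", "l", "l", "l", "r", "r"],
    "pcs_m": [1.0, 2.0, 0.5, 0.5, 0.2, 0.0],
})

def sample_totals(data: pd.DataFrame, feature: str) -> pd.DataFrame:
    """Sum pcs_m per sample, keeping the feature label of each sample."""
    return data.groupby(["loc_date", "slug", feature], as_index=False)["pcs_m"].sum()

totals = sample_totals(df, "feature_type")

# descriptive statistics of the sample totals for each label
summary = totals.groupby("feature_type")["pcs_m"].describe()
```

Grouping on `loc_date` and `slug` first means the statistics describe samples rather than individual object records.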

__Essential:__ The boundaries variable defines the top level data structure. For example, `boundaries={'canton': 'Valais', 'language': 'fr', 'start_date': '2019-01-01', 'end_date': '2022-01-01'}` will produce a dataframe with records only from the canton of Valais within the dates defined. Tables and charts will be translated to French.

The report class uses the resulting data structure to define reports for the different features within the dataframe.
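
To make the effect of `boundaries` concrete, here is a minimal sketch of the kind of filter it implies. This is not the implementation of `r_class.report_data`. The function name `filter_by_boundaries` is hypothetical, and treating both dates as inclusive is an assumption.

```python
import pandas as pd

def filter_by_boundaries(df: pd.DataFrame, boundaries: dict) -> pd.DataFrame:
    """Keep the rows that match the geographic label and fall within the dates."""
    # the geographic key is the one that is not a language or date setting
    reserved = {"language", "start_date", "end_date"}
    level, label = next((k, v) for k, v in boundaries.items() if k not in reserved)
    dates = pd.to_datetime(df["date"])
    start = pd.Timestamp(boundaries["start_date"])
    end = pd.Timestamp(boundaries["end_date"])
    # assumption: both ends of the date range are inclusive
    in_range = dates.between(start, end)
    return df[(df[level] == label) & in_range]

# toy data, invented for illustration
toy = pd.DataFrame({
    "canton": ["Valais", "Valais", "Vaud"],
    "date": ["2016-09-21", "2020-05-07", "2020-05-07"],
    "quantity": [1, 2, 3],
})
boundaries = dict(canton="Valais", language="fr", start_date="2019-01-01", end_date="2022-01-01")
selected = filter_by_boundaries(toy, boundaries)
```

Only the second toy record is kept: the first is outside the date range and the third is from another canton.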

## The report class

The report class provides the set of arguments that define the structure of the report based on the user input. Those arguments are a property of the class: `ReportClass.features`. The `ReportClass` also identifies objects that meet the criteria defined by `mc_criteria_one` and `mc_criteria_two`.

To start a `ReportClass` call it with the dataframes of interest, the boundaries, the top level report, the language and the language map.

```python
a_report = r_class.ReportClass(w_df, w_di, boundaries, top_label, 'fr', c_l)
a_report.the_number_of_attributes_in_a_feature('feature_type')
```

### The number and types of features in a report

Once a `ReportClass` is initiated a summary of the attributes can be obtained:

In [24]:
top_label, language, w_df, w_di = r_class.report_data(boundaries, survey_data.copy(), beaches, codes)          
a_report = r_class.ReportClass(w_df, w_di, boundaries, top_label, 'fr', c_l)
r_class.translated_and_style_for_display(a_report.the_number_of_attributes_in_a_feature('feature_type'), a_report.lang_maps[a_report.language], a_report.language, gradient=False)

| | Ville | Région | Zone | Feature_Name | Échantillons |
|---|---|---|---|---|---|
| Parc | 6 | 1 | 1 | 1 | 7 |
| Rivière | 4 | 1 | 1 | 1 | 4 |
| Lac | 1 | 1 | 1 | 1 | 11 |


#### A top level description

The first output shows three feature types in the data: lakes, rivers and parks. One lake was sampled 11 times, one river 4 times and the park feature 7 times. The city counts per feature type are 6 for the parks, 4 for the river and 1 for the lake. These sum to 11, but there are only 10 distinct cities in the canton, so one city has samples in two feature types.

Recall that the geographic column names are `['parent_boundary', 'feature_type', 'feature_name']`. The survey results from each sector can be compared by selecting the column name of interest. Depending on the value of `boundaries`, not all column names may be available. The `available_features` method identifies exactly which features are available. Note that in the example below canton is not an option, because the boundaries were set for a canton.

```python
my_features = my_report_class.available_features()

print(my_features)

# => ['parent_boundary', 'feature_type', 'feature_name', 'city']
```

The `summarize_feature_labels` method in the `ReportClass` creates a summary of the sample totals for each label of the selected feature. Calling the `translated_and_style_for_display` method puts the table to html and applies language specific formatting using the `dataframe.style` method. The index and column names are translated using the language maps.

```python
feature_type_summary = my_report_class.summarize_feature_labels('feature_type')

translated_and_style_for_display(feature_type_summary, my_report_class.lang_maps[my_report_class.language], my_report_class.language, gradient=False)
```

Combined with the output from above, a description of the data and how it was collected can be constructed. The highlighted text can be called from the active variables.
> There were `13'782` objects identified in the period between `2019-01-01` and `2021-12-31` in the `canton` of `Valais`. In total, `22` samples were recorded, `11` on the `lake-shore`, `7` in `ski-areas` and `4` on `riverbanks`. The lake samples were recorded in `one` `city`, whereas the alpine and river samples were taken from `10` `different cities`. The `median` sample total of _pieces of trash per meter_ `pcs/m` is highest at the `lakeside`, followed by the `parcs` and `rivers`.

```{admonition} Ships search terms
The `ReportClass.features` property is a dictionary or .JSON file that contains the common labeled geographic names of the region in question. Structuring these into search terms is a way to integrate an LLM into the analysis.
```

In [25]:
r_class.translated_and_style_for_display(a_report.summarize_feature_labels('feature_type'), a_report.lang_maps[a_report.language], a_report.language, gradient=False)

| Pcs_M | L | P | R |
|---|---|---|---|
| 25% | 1475 | 136 | 16 |
| 50% | 1771 | 287 | 24 |
| 75% | 3267 | 479 | 48 |
| Échantillons | 11 | 7 | 4 |
| Max | 5273 | 41505 | 102 |
| Moyenne | 2364 | 6161 | 40 |
| Min | 258 | 106 | 9 |
| Écart-Type | 1691 | 15586 | 42 |
| Total | 7'560 | 6'144 | 78 |


#### By criteria

Objects can be selected by criteria. By default an object is selected if its quantity is in the top ten or its fail rate is >= 0.5. The criteria can be changed at any time, either with keywords when the class is called or by setting the class variables in the form `my_report_class.criteria_one = anewvalue`.

```python
objects_selected_by_criteria = my_report_class.most_common
translated_and_style_for_display(objects_selected_by_criteria, my_report_class.lang_maps[my_report_class.language], my_report_class.language, gradient=False)
``` 

Calling `my_report_class.most_common` will return a dataframe that has the test statistic and description of all objects that meet the criteria.
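
The default criteria can be illustrated with a small sketch. It assumes that the fail rate of a code is the share of samples in which at least one object of that code was found, and that the data has one row per `loc_date` and `code`. The function name `select_most_common` and the toy data are hypothetical, and the `ReportClass` implementation may differ.

```python
import pandas as pd

def select_most_common(df: pd.DataFrame, top_n: int = 10, min_fail_rate: float = 0.5) -> pd.DataFrame:
    """Select codes that are in the top_n by quantity or meet the fail rate."""
    n_samples = df["loc_date"].nunique()
    by_code = df.groupby("code").agg(
        quantity=("quantity", "sum"),
        # number of samples where the code was found at least once
        n_found=("quantity", lambda x: (x > 0).sum()),
    )
    # assumption: fail rate = samples with the code / all samples
    by_code["fail_rate"] = by_code["n_found"] / n_samples
    top = by_code["quantity"].nlargest(top_n).index
    keep = (by_code["fail_rate"] >= min_fail_rate) | by_code.index.isin(top)
    return by_code[keep].sort_values("quantity", ascending=False)

# toy data: four samples, three codes, invented for illustration
toy = pd.DataFrame({
    "loc_date": ["s1", "s2", "s3", "s4"] * 3,
    "code": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "quantity": [10, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
})
selected = select_most_common(toy, top_n=1)
```

In the toy data, code `A` is selected for its quantity and code `B` for its fail rate, while code `C` meets neither criterion.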

In [26]:
r_class.translated_and_style_for_display(a_report.most_common, a_report.lang_maps[a_report.language], a_report.language, gradient=False)

Unnamed: 0,Quantité,% Du Total,Pcs/M,Taux D'Échec
Brosse De Télésiège,5'181,38,0,9
"Fragments De Polystyrène Expansé: G76, G81, G82, G83",1'476,11,2,59
"Fragments De Plastique: G80, G79, G78, G75",1'299,9,23,91
"Couvercles En Plastique Bouteille: G21, G22, G23, G24",589,4,6,64
"Emballages De Bonbons, De Snacks",564,4,35,82
"Bâche, Feuille Plastique Industrielle",516,4,15,59
Mousse De Plastique Pour L'Isolation Thermique,453,3,3,55
Coton-Tige,453,3,7,50
Déchets De Construction En Plastique,307,2,15,59
Mégots Et Filtres À Cigarettes,221,2,9,73


#### Results by criteria and feature type

Once the objects of interest are identified (criteria), they can be compared across the different feature types and labels.

```python
t = a_cumulative_report(w_df[w_df.code.isin(a_report.most_common.index)], feature_name='feature_type', object_column='code')
translated_and_style_for_display(t, a_report.lang_maps[a_report.language], a_report.language, gradient=True)
``` 
For example, the most common objects are found at different densities depending on the feature type.

In [27]:
t = r_class.a_cumulative_report(w_df[w_df.code.isin(a_report.most_common.index)], feature_name='feature_type', object_column='code')
r_class.translated_and_style_for_display(t, a_report.lang_maps[a_report.language], a_report.language, gradient=True)

Unnamed: 0,Parc,Rivière,Lac,Cumulé
Emballage Fast Food,0,0,27,3
Médical Conteneurs/Tubes/ Emballages,0,0,15,2
"Bouchons De Bouteilles En Métal, Couvercles Et Tirettes",2,0,3,2
Tabac Emballages En Plastique,0,0,16,1
Mégots Et Filtres À Cigarettes,16,0,13,9
"Emballages De Bonbons, De Snacks",17,0,77,35
Bâtonnets De Sucette,0,0,28,1
Jouets Et Faveurs De Fête,0,0,14,3
"Gobelets, Couvercles, Mousse À Usage Unique Et Plastique Dur",5,0,21,7
Pailles Et Agitateurs,0,0,16,3


#### Alternate object groups

If the data has another column of labeled values for object identification, that column can be used to aggregate results for each sample id. Here we consider `groupname`. There is more than one object in a group, and each group represents a use case.

```python
t = a_cumulative_report(w_df, feature_name='feature_type', object_column='groupname')
translated_and_style_for_display(t, a_report.lang_maps[a_report.language], a_report.language, gradient=True)
``` 
For example, the different use cases are found at different densities depending on the feature type.

In [16]:
t = r_class.a_cumulative_report(w_df, feature_name='feature_type', object_column='groupname')
r_class.translated_and_style_for_display(t, a_report.lang_maps[a_report.language], a_report.language, gradient=True)

Unnamed: 0,Parc,Rivière,Lac,Cumulé
Agriculture,2,3,96,23
Nourriture Et Boissons,26,1,314,102
Infrastructures,58,6,572,216
Micro-Plastiques (< 5Mm),3,0,54,7
Emballage Non Alimentaire,12,1,61,21
Articles Personnels,7,1,34,14
Morceaux De Plastique,23,1,256,68
Loisirs,11,3,99,38
Tabac,18,0,28,20
Non Classé,2,0,6,2


#### By Survey area or `parent_boundary`

There are two parent boundaries in the Valais, the _Alpes and Jura_ and the _Rhône_ river basin.

In [17]:
t = r_class.a_cumulative_report(w_df[w_df.code.isin(a_report.most_common.index)], feature_name='parent_boundary', object_column='code')
r_class.translated_and_style_for_display(t,a_report.lang_maps[a_report.language], a_report.language, gradient=True)

Unnamed: 0,Alpes Et Jura,Rhône,Cumulé
Emballage Fast Food,0,18,3
Médical Conteneurs/Tubes/ Emballages,0,11,2
"Bouchons De Bouteilles En Métal, Couvercles Et Tirettes",2,2,2
Tabac Emballages En Plastique,0,13,1
Mégots Et Filtres À Cigarettes,16,6,9
"Emballages De Bonbons, De Snacks",17,50,35
Bâtonnets De Sucette,0,11,1
Jouets Et Faveurs De Fête,0,9,3
"Gobelets, Couvercles, Mousse À Usage Unique Et Plastique Dur",5,11,7
Pailles Et Agitateurs,0,11,3


#### By feature name:

There are three different features with samples: the Alpes, the Rhône river and Lake Geneva.

In [18]:
t = r_class.a_cumulative_report(w_df[w_df.code.isin(a_report.most_common.index)], feature_name='feature_name', object_column='code')
r_class.translated_and_style_for_display(t, a_report.lang_maps[a_report.language], a_report.language, gradient=True)

Unnamed: 0,Alpes-Valaisannes,Rhône,Lac-Leman,Cumulé
Emballage Fast Food,0,0,27,3
Médical Conteneurs/Tubes/ Emballages,0,0,15,2
"Bouchons De Bouteilles En Métal, Couvercles Et Tirettes",2,0,3,2
Tabac Emballages En Plastique,0,0,16,1
Mégots Et Filtres À Cigarettes,16,0,13,9
"Emballages De Bonbons, De Snacks",17,0,77,35
Bâtonnets De Sucette,0,0,28,1
Jouets Et Faveurs De Fête,0,0,14,3
"Gobelets, Couvercles, Mousse À Usage Unique Et Plastique Dur",5,0,21,7
Pailles Et Agitateurs,0,0,16,3


#### By city:

In [19]:
t = r_class.a_cumulative_report(w_df[w_df.code.isin(a_report.most_common.index)], feature_name='city', object_column='code')
r_class.translated_and_style_for_display(t, a_report.lang_maps[a_report.language], a_report.language, gradient=True)

Unnamed: 0,Lens,Leuk,Nendaz,Riddes,Saint-Gingolph,Salgesch,Sion,Troistorrents,Val De Bagnes,Val-D'Illiez,Cumulé
Emballage Fast Food,0,0,0,0,27,0,0,0,0,0,3
Médical Conteneurs/Tubes/ Emballages,0,0,4,1,15,0,0,3,0,0,2
"Bouchons De Bouteilles En Métal, Couvercles Et Tirettes",2,0,5,1,3,0,0,4,14,0,2
Tabac Emballages En Plastique,0,0,18,1,16,0,0,0,0,0,1
Mégots Et Filtres À Cigarettes,16,0,155,4,13,0,4,24,152,0,9
"Emballages De Bonbons, De Snacks",9,0,118,6,77,0,2,17,24,6,35
Bâtonnets De Sucette,0,0,0,0,28,0,0,0,0,0,1
Jouets Et Faveurs De Fête,0,2,0,0,14,0,0,0,0,0,3
"Gobelets, Couvercles, Mousse À Usage Unique Et Plastique Dur",5,0,7,0,21,0,0,3,0,5,7
Pailles Et Agitateurs,2,0,0,0,16,0,0,1,0,0,3


## Testing

There are 318'478 rows in the survey data. We can test the sorting and grouping functions by running a report class on all possible combinations of the features of interest. The test should produce the set of arguments that define the survey locations and surveys that define the boundaries of a report.

```python
some_features = ['feature_type', 'parent_boundary', 'feature_name', 'canton', 'city']

def produce_reports_for_testing(survey_data, some_features):
    reports = {}
    for a_feature in some_features:
        labels = survey_data[a_feature].unique()
        label_reports = {}
        for label in labels:
            start_date = survey_data[survey_data[a_feature] == label]['date'].min()
            end_date = survey_data[survey_data[a_feature] == label]['date'].max()
            
            boundaries = {a_feature:label, 'language':'fr', 'start_date':start_date, 'end_date':end_date}
            top_label, language, w_df, w_di = report_data(boundaries, survey_data.copy(), beaches, codes)
            a_report = ReportClass(w_df, w_di, boundaries, top_label, 'fr', c_l)
            label_reports.update({label:a_report.features})
        reports.update({a_feature:label_reports})
    return reports
   
t = produce_reports_for_testing(survey_data, some_features)

t['canton']['Valais']
```

In [20]:
some_features = ['feature_type', 'parent_boundary', 'feature_name', 'canton', 'city']

def produce_reports_for_testing(survey_data, some_features):
    reports = {}
    for a_feature in some_features:
        labels = survey_data[a_feature].unique()
        label_reports = {}
        for label in labels:
            start_date = survey_data[survey_data[a_feature] == label]['date'].min()
            end_date = survey_data[survey_data[a_feature] == label]['date'].max()
            
            boundaries = {a_feature:label, 'language':'fr', 'start_date':start_date, 'end_date':end_date}
            top_label, language, w_df, w_di = r_class.report_data(boundaries, survey_data.copy(), beaches, codes)
            a_report = r_class.ReportClass(w_df, w_di, boundaries, top_label, 'fr', c_l)
            label_reports.update({label:a_report.features})
        reports.update({a_feature:label_reports})
    return reports
   
t = produce_reports_for_testing(survey_data, some_features)

### Retrieving properties from test

This should match `a_report.features` from the example:
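
One way to make this comparison explicit is to compare the two dictionaries key by key, ignoring the order of the values. The helper `features_match` is a hypothetical sketch and is not part of `reportclass`.

```python
import numpy as np

def features_match(a: dict, b: dict) -> bool:
    """True if both dicts have the same keys and the same set of values per key."""
    if a.keys() != b.keys():
        return False
    return all(set(a[k]) == set(b[k]) for k in a)

# toy feature dictionaries with the same labels in a different order
left = {"feature_type": np.array(["p", "r", "l"], dtype=object)}
right = {"feature_type": np.array(["l", "p", "r"], dtype=object)}
same = features_match(left, right)
```

Comparing sets avoids false mismatches caused by the order in which `unique()` returns the labels. Note that the reports only match if they were built with the same date range.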

In [21]:
t['canton']['Valais']

{'city': array(['Lens', 'Leuk', 'Nendaz', 'Riddes', 'Saint-Gingolph', 'Salgesch',
        'Sion', 'Troistorrents', 'Val de Bagnes', "Val-d'Illiez"],
       dtype=object),
 'parent_boundary': array(['les-alpes', 'rhone'], dtype=object),
 'feature_type': array(['p', 'r', 'l'], dtype=object),
 'feature_name': array(['alpes-valaisannes', 'rhone', 'lac-leman'], dtype=object),
 'samples': array(["('clean-up-tour-crans-montana', '2021-06-12')",
        "('leuk-mattenstrasse', '2021-02-14')",
        "('clean-up-tour-nendaz', '2021-07-04')",
        "('clean-up-tour-veysonnaz', '2021-07-03')",
        "('clean-up-tour-la-tzoumaz', '2021-05-22')",
        "('les-glariers', '2020-12-01')", "('grand-clos', '2016-09-21')",
        "('grand-clos', '2017-10-22')", "('grand-clos', '2020-05-07')",
        "('grand-clos', '2020-06-09')", "('grand-clos', '2020-07-07')",
        "('grand-clos', '2020-08-06')", "('grand-clos', '2020-09-08')",
        "('grand-clos', '2020-10-06')", "('grand-clos', '2020-1

The properties should contain the arguments for the cities in the example report.

In [22]:
t['city']['Saint-Gingolph']

{'feature_name': array(['lac-leman'], dtype=object),
 'feature_type': array(['l'], dtype=object),
 'parent_boundary': array(['rhone'], dtype=object),
 'samples': array(["('grand-clos', '2016-09-21')", "('grand-clos', '2017-10-22')",
        "('grand-clos', '2020-05-07')", "('grand-clos', '2020-06-09')",
        "('grand-clos', '2020-07-07')", "('grand-clos', '2020-08-06')",
        "('grand-clos', '2020-09-08')", "('grand-clos', '2020-10-06')",
        "('grand-clos', '2020-12-08')", "('grand-clos', '2021-01-07')",
        "('grand-clos', '2021-02-09')", "('grand-clos', '2021-03-09')",
        "('grand-clos', '2021-04-10')"], dtype=object)}

In [23]:
%watermark -a hammerdirt-analyst -co --iversions

Author: hammerdirt-analyst

conda environment: cantonal_report

numpy     : 1.25.2
matplotlib: 3.7.1
pandas    : 2.0.3

