# Build a table: region code -> region polygon

In this section we build a static reference of regions, used to plot a choropleth map of France.
Region paths are indexed by region code; the codes are taken from INSEE.
We use these files:
* region2019-csv: columns **Code région** (`reg`) and **Nom en clair (majuscules)** (`ncc`)
* regions-version-simplifiee.geojson: the file provided by Mohamed, containing the region paths

In [13]:
import json
import pandas as pd

In [10]:
# Download INSEE 2019 region data
!wget https://www.insee.fr/fr/statistiques/fichier/3720946/region2019-csv.zip
!unzip region2019-csv.zip
!ls

--2019-06-12 19:36:25--  https://www.insee.fr/fr/statistiques/fichier/3720946/region2019-csv.zip
Resolving www.insee.fr (www.insee.fr)... 194.254.37.163, 143.196.255.163
Connecting to www.insee.fr (www.insee.fr)|194.254.37.163|:443... connected.
HTTP request sent, awaiting response... 200 
Length: unspecified [application/zip]
Saving to: ‘region2019-csv.zip’

region2019-csv.zip      [ <=>                ]     735  --.-KB/s    in 0s      

2019-06-12 19:36:26 (18.0 MB/s) - ‘region2019-csv.zip’ saved [735]

Archive:  region2019-csv.zip
  inflating: region2019.csv          
Untitled.ipynb                     region2019.csv
region2019-csv.zip                 regions-version-simplifiee.geojson


In [25]:
# Load new region codes
region2019_df = pd.read_csv("./region2019.csv")
region2019_df[region2019_df["reg"] ==  27]

Unnamed: 0,reg,cheflieu,tncc,ncc,nccenr,libelle
7,27,21231,0,BOURGOGNE FRANCHE COMTE,Bourgogne-Franche-Comté,Bourgogne-Franche-Comté


In [57]:
v = region2019_df[region2019_df["reg"] == 27]["ncc"]
str(v)

'7    BOURGOGNE FRANCHE COMTE\nName: ncc, dtype: object'

In [123]:
with open("regions-version-simplifiee.geojson") as f:
    geodata = json.load(f)

In [60]:
regions_paths = {}
for feature in geodata["features"]:
    code = feature["properties"]["code"]
    old_name = feature["properties"]["nom"]
    # Look up the 2019 INSEE name for this region code
    name = region2019_df[region2019_df["reg"] == int(code)].iloc[0]["libelle"]
    regions_paths[code] = {
        "name": name,
        "old_name": old_name,
        "feature": feature,
    }
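
The loop above filters `region2019_df` once per feature. A minimal sketch of an alternative, assuming the same `reg` and `libelle` columns, builds a code-to-name dictionary once and reuses it. The names `toy_region_df`, `toy_features`, `name_by_code` and `toy_regions_paths` are hypothetical stand-ins: only the row for region 27 comes from the output shown earlier, while the `nom` value and the empty geometry are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for region2019.csv: only the columns used here.
toy_region_df = pd.DataFrame(
    {"reg": [27], "libelle": ["Bourgogne-Franche-Comté"]}
)

# Hypothetical stand-in for geodata["features"]; the geometry is left empty.
toy_features = [
    {
        "type": "Feature",
        "properties": {"code": "27", "nom": "Bourgogne-Franche-Comté"},
        "geometry": None,
    },
]

# Build the code -> name lookup once instead of filtering per feature.
name_by_code = toy_region_df.set_index("reg")["libelle"].to_dict()

toy_regions_paths = {}
for feature in toy_features:
    code = feature["properties"]["code"]  # string code in this toy feature
    toy_regions_paths[code] = {
        "name": name_by_code[int(code)],  # lookup keys are integers
        "old_name": feature["properties"]["nom"],
        "feature": feature,
    }

print(toy_regions_paths["27"]["name"])
```

A missing code raises a `KeyError` here, where the original loop would raise an `IndexError` on `.iloc[0]`. Either way, a region absent from the INSEE file is reported rather than silently skipped.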

In [None]:
with open("regions.json", "w") as f:
    json.dump(regions_paths, f)

In [71]:
with open("regions_codes.txt", "w") as f:
    f.write("code\n")
    for code in regions_paths.keys():
        f.write("%s\n" % code)

# Build a table region -> total consumption

We want a consumption table, ideally indexed by INSEE region code, so that the consumption of each region can be retrieved directly. The cells below store the `Conso moyenne Résidentiel (MWh)` value for each region and year.

In [75]:
regions_codes = pd.read_csv("regions_codes.txt")

In [78]:
df = pd.read_csv("consommation-electrique-par-secteur-dactivite-region.csv", delimiter=";")
df.columns

Index(['Année', 'Nom région', 'Code région', 'Nb sites Résidentiel',
       'Conso totale Résidentiel (MWh)', 'Conso moyenne Résidentiel (MWh)',
       'Nb sites Professionnel', 'Conso totale Professionnel (MWh)',
       'Conso moyenne Professionnel (MWh)', 'Nb sites Agriculture',
       'Conso totale Agriculture (MWh)', 'Nb sites Industrie',
       'Conso totale Industrie (MWh)', 'Nb sites Tertiaire',
       'Conso totale Tertiaire (MWh)', 'Nb sites Secteur non affecté',
       'Conso totale Secteur non affecté (MWh)', 'Nombre d'habitants',
       'Taux de logements collectifs', 'Taux de résidences principales',
       'Superficie des logements < 30 m2',
       'Superficie des logements 30 à 40 m2',
       'Superficie des logements 40 à 60 m2',
       'Superficie des logements 60 à 80 m2',
       'Superficie des logements 80 à 100 m2',
       'Superficie des logements > 100 m2',
       'Résidences principales avant 1919',
       'Résidences principales de 1919 à 1945',
       'Résiden

In [102]:
filtered_df = regions_codes.merge(df, right_on="Code région", left_on="code", how="inner")
filtered_df.shape, df.shape, regions_codes.shape

((84, 37), (84, 36), (13, 1))
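
The shapes above show that the inner merge kept all 84 rows of `df`. A sketch of one way to check explicitly for codes that have no match uses the `indicator` option of `merge`. The two small frames below (`toy_codes`, `toy_conso`) are hypothetical toy data, not values from the dataset.

```python
import pandas as pd

# Hypothetical toy data: code 99 has no matching region code.
toy_codes = pd.DataFrame({"code": [11, 27]})
toy_conso = pd.DataFrame({
    "Code région": [11, 27, 99],
    "Année": [2017, 2017, 2017],
})

# how="outer" keeps unmatched rows from both sides; indicator=True adds a
# "_merge" column with values "both", "left_only" or "right_only".
checked = toy_codes.merge(
    toy_conso,
    left_on="code",
    right_on="Code région",
    how="outer",
    indicator=True,
)

# Rows that exist on only one side of the merge.
unmatched = checked[checked["_merge"] != "both"]
print(unmatched[["code", "Code région", "_merge"]])
```

An empty `unmatched` frame would confirm that every consumption row has a region code in `regions_codes.txt` and that every region code appears in the consumption file.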

In [117]:
available_years = filtered_df["Année"].unique()
regions_consumption = {}

# Build {region code: {year: average residential consumption (MWh)}}
for code, region_groupdf in filtered_df.groupby("code"):
    regions_consumption[code] = {}
    for year, year_groupdf in region_groupdf.groupby("Année"):
        # Keep the first row for each (region, year) pair
        regions_consumption[code][year] = year_groupdf.iloc[0]["Conso moyenne Résidentiel (MWh)"]

In [119]:
with open("regions_consumption.json", "w") as f:
    json.dump(regions_consumption, f)
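
One detail to keep in mind when reading this file back: JSON object keys are always strings, so integer region codes and years written by `json.dump` come back as strings. A minimal sketch follows, using a hypothetical table `toy_consumption` whose value 4.5 is made up and not taken from the dataset.

```python
import json

# Hypothetical consumption table: {region code: {year: value}}.
toy_consumption = {27: {2017: 4.5}}

# Serialize and parse again, as writing then reading the file would do.
reloaded = json.loads(json.dumps(toy_consumption))

print(reloaded)                # keys are now strings
print(reloaded["27"]["2017"])  # look up with string keys

# Optionally convert the keys back to integers after loading.
restored = {
    int(code): {int(year): value for year, value in years.items()}
    for code, years in reloaded.items()
}
print(restored == toy_consumption)
```

String keys after loading are consistent with `regions.json`, whose keys are the GeoJSON codes (apparently strings, given the `int(code)` conversion used when building `regions_paths`).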