This notebook extracts and formats four predictor variables (number of working inhabitants, number of college graduates, number of youth (18-24) and number of immigrants) for each Parisian district (arrondissement) over the period 2006-2016, the last available year. These data come from Insee's IRIS database, which collects several hundred variables at the sub-city level.

We selected four variables that we believe have a strong (potentially causal) influence on the outcome of elections in each district of Paris. Our assumption may be wrong, but that will be easy to see once we put the data into the model -- either it won't run, or it will tell us that these variables are not correlated with the outcome. The model will use these predictors to try and predict election results in each district, but we'll do that in another notebook.

Let's start with some import statements and a handy function to extract predictors:

In [1]:
%load_ext lab_black
%load_ext watermark

import logging
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

from fbprophet import Prophet
from pathlib import Path
from typing import List

logging.getLogger().setLevel(logging.CRITICAL)

variables = {
    "activite_residents": {
        "C_ACTOCC1564": "actifs_occupes",
        "P_CHOM1564": "chomeurs",
        "P_RETR1564": "retired",
        "C_ACTOCC1564_CS3": "csp_plus",
        "C_ACTOCC1564_CS5": "csp_employed",
        "P_SAL15P_CDD": "cdd",
        "P_SAL15P_INTERIM": "interim",
        "P_SAL15P_EMPAID": "empaid",
    },
    "couples_familles_menages": {
        "P_POP2554": "pop2554",
        "P_POP5579": "pop5579",
        "C_FAMMONO": "fam_mono",
    },
    "diplomes_formation": {
        "P_NSCOL15P_SUP": "college_grad",
        "P_NSCOL15P_BAC": "non_college",
        "P_NSCOL15P_DIPLMIN": "min_diploma",
    },
    "logement": {"P_RP_100M2P": "big_house"},
    "population": {
        "P_POP1824": "youth",
        "P_POP_IMM": "immigration",
        "P_POP3044": "pop3044",
        "P_POP4559": "pop4559",
        "P_POP6074": "pop6074",
    },
}
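
The keys of `variables` are year-less column codes: when a file is read, the two-digit year is spliced in right after the one-letter prefix. Here is a minimal sketch of that convention, assuming the pattern used in the cells below holds for every code. The helper name `column_for_year` is hypothetical and only used for illustration.

```python
def column_for_year(var_code: str, year: str) -> str:
    """Insert the two-digit year after the one-letter prefix of a code."""
    return f"{var_code[:1]}{year}{var_code[1:]}"


# "P_POP1824" is the year-less key for the "youth" predictor
print(column_for_year("P_POP1824", "16"))
print(column_for_year("C_ACTOCC1564", "06"))
```

Keeping the year out of the dictionary keys means a single mapping can serve every yearly file.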

In [2]:
basepath = Path("../../../Downloads/db_iris_all/logement/")
files_in_path = basepath.glob("*.xls")
preds = pd.DataFrame()
for file in files_in_path:
    year = file.stem[-2:]
    var_cols = [
        f"{var_code[:1]}{year}{var_code[1:]}"
        for var_code in variables["logement"].keys()
    ]
    if year > "12":
        var_cols = [f"P{year}_RP_{v}" for v in ["100120M2", "120M2P"]]

    df = pd.read_excel(
        file,
        header=5,
        sheet_name="IRIS",
        usecols=["DEP", "LIBCOM"] + var_cols,
        dtype={"DEP": "category", "LIBCOM": "category"},
        nrows=40_500,
    )
    df = df[df.DEP == "75"].reset_index(drop=True).drop("DEP", axis=1)
    if year > "12":
        df[f"P{year}_RP_100M2P"] = df[var_cols].sum(1)
        df = df.drop(var_cols, axis=1)
    preds = pd.concat([preds, df], axis=1)
preds

Unnamed: 0,LIBCOM,P10_RP_100M2P,LIBCOM.1,P11_RP_100M2P,LIBCOM.2,P07_RP_100M2P,LIBCOM.3,P13_RP_100M2P,LIBCOM.4,P12_RP_100M2P,...,LIBCOM.5,P16_RP_100M2P,LIBCOM.6,P15_RP_100M2P,LIBCOM.7,P14_RP_100M2P,LIBCOM.8,P08_RP_100M2P,LIBCOM.9,P09_RP_100M2P
0,Paris 1er Arrondissement,78.253972,Paris 1er Arrondissement,69.380930,Paris 1er Arrondissement,57.160146,Paris 1er Arrondissement,76.350351,Paris 1er Arrondissement,76.584549,...,Paris 1er Arrondissement,55.936939,Paris 1er Arrondissement,66.812550,Paris 1er Arrondissement,72.611186,Paris 1er Arrondissement,62.451607,Paris 1er Arrondissement,75.925458
1,Paris 1er Arrondissement,23.583259,Paris 1er Arrondissement,16.312026,Paris 1er Arrondissement,26.156991,Paris 1er Arrondissement,22.119253,Paris 1er Arrondissement,15.641223,...,Paris 1er Arrondissement,12.274221,Paris 1er Arrondissement,11.701616,Paris 1er Arrondissement,14.069804,Paris 1er Arrondissement,23.160299,Paris 1er Arrondissement,26.279107
2,Paris 1er Arrondissement,86.463210,Paris 1er Arrondissement,75.739708,Paris 1er Arrondissement,66.777546,Paris 1er Arrondissement,66.312667,Paris 1er Arrondissement,67.156291,...,Paris 1er Arrondissement,47.913030,Paris 1er Arrondissement,32.115575,Paris 1er Arrondissement,39.057519,Paris 1er Arrondissement,53.863441,Paris 1er Arrondissement,81.877231
3,Paris 1er Arrondissement,2.817932,Paris 1er Arrondissement,2.776186,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,2.876313,Paris 1er Arrondissement,2.662020,...,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,2.855777
4,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,,...,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000,Paris 1er Arrondissement,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
987,Paris 20e Arrondissement,28.915448,Paris 20e Arrondissement,24.894402,Paris 20e Arrondissement,32.040505,Paris 20e Arrondissement,13.637888,Paris 20e Arrondissement,15.392645,...,Paris 20e Arrondissement,27.943755,Paris 20e Arrondissement,5.649548,Paris 20e Arrondissement,5.906094,Paris 20e Arrondissement,32.061406,Paris 20e Arrondissement,32.655114
988,Paris 20e Arrondissement,34.424891,Paris 20e Arrondissement,47.987801,Paris 20e Arrondissement,55.460034,Paris 20e Arrondissement,40.088309,Paris 20e Arrondissement,33.709357,...,Paris 20e Arrondissement,20.228391,Paris 20e Arrondissement,35.709361,Paris 20e Arrondissement,36.800613,Paris 20e Arrondissement,33.535707,Paris 20e Arrondissement,41.146490
989,Paris 20e Arrondissement,64.648922,Paris 20e Arrondissement,69.742467,Paris 20e Arrondissement,98.249795,Paris 20e Arrondissement,73.930025,Paris 20e Arrondissement,63.964689,...,Paris 20e Arrondissement,132.006923,Paris 20e Arrondissement,77.561534,Paris 20e Arrondissement,67.679495,Paris 20e Arrondissement,56.801907,Paris 20e Arrondissement,67.655650
990,Paris 20e Arrondissement,30.690409,Paris 20e Arrondissement,39.680238,Paris 20e Arrondissement,12.690371,Paris 20e Arrondissement,63.046776,Paris 20e Arrondissement,62.119843,...,Paris 20e Arrondissement,60.138407,Paris 20e Arrondissement,55.529384,Paris 20e Arrondissement,53.165509,Paris 20e Arrondissement,3.468633,Paris 20e Arrondissement,16.579049
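
From 2013 onwards, the single `RP_100M2P` column is replaced by two columns (`RP_100120M2` and `RP_120M2P`), which the cell above sums back into one. The sketch below reproduces that reconciliation on a toy frame. The numbers are made up, and only the column names come from the code above.

```python
import pandas as pd

year = "13"
split_cols = [f"P{year}_RP_100120M2", f"P{year}_RP_120M2P"]

# Toy data: two rows with hypothetical counts
toy = pd.DataFrame(
    {
        "LIBCOM": ["Paris 1er Arrondissement", "Paris 2e Arrondissement"],
        split_cols[0]: [10.0, 4.0],
        split_cols[1]: [5.5, 1.0],
    }
)

# Rebuild the pre-2013 column, then drop the two components
toy[f"P{year}_RP_100M2P"] = toy[split_cols].sum(axis=1)
toy = toy.drop(columns=split_cols)
print(toy)
```

Note that `sum(axis=1)` skips missing values by default, so a row with one missing component yields a partial sum rather than `NaN`.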


In [3]:
# drop duplicated column values:
preds = preds.T.drop_duplicates().T
# drop duplicated column names:
preds = preds.loc[:, ~preds.columns.duplicated()]
preds

Unnamed: 0,LIBCOM,P10_RP_100M2P,P11_RP_100M2P,P07_RP_100M2P,P13_RP_100M2P,P12_RP_100M2P,P06_RP_100M2P,P16_RP_100M2P,P15_RP_100M2P,P14_RP_100M2P,P08_RP_100M2P,P09_RP_100M2P
0,Paris 1er Arrondissement,78.254,69.3809,57.1601,76.3504,76.5845,63.1362,55.9369,66.8126,72.6112,62.4516,75.9255
1,Paris 1er Arrondissement,23.5833,16.312,26.157,22.1193,15.6412,28.4831,12.2742,11.7016,14.0698,23.1603,26.2791
2,Paris 1er Arrondissement,86.4632,75.7397,66.7775,66.3127,67.1563,46.814,47.913,32.1156,39.0575,53.8634,81.8772
3,Paris 1er Arrondissement,2.81793,2.77619,0,2.87631,2.66202,0,0,0,0,0,2.85578
4,Paris 1er Arrondissement,0,0,0,0,,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...
987,Paris 20e Arrondissement,28.9154,24.8944,32.0405,13.6379,15.3926,24.9946,27.9438,5.64955,5.90609,32.0614,32.6551
988,Paris 20e Arrondissement,34.4249,47.9878,55.46,40.0883,33.7094,46.125,20.2284,35.7094,36.8006,33.5357,41.1465
989,Paris 20e Arrondissement,64.6489,69.7425,98.2498,73.93,63.9647,87.3385,132.007,77.5615,67.6795,56.8019,67.6556
990,Paris 20e Arrondissement,30.6904,39.6802,12.6904,63.0468,62.1198,26.0653,60.1384,55.5294,53.1655,3.46863,16.579
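
The double transpose above compares column values, and transposing a frame with mixed types typically leaves every column with `object` dtype (which would explain the shorter number formatting in this output). If only the repeated `LIBCOM` columns need to go, deduplicating on column names alone keeps the float dtype. This is a sketch on toy data, and like the notebook it assumes the rows are in the same order in every yearly file.

```python
import pandas as pd

a = pd.DataFrame(
    {"LIBCOM": ["Paris 1er Arrondissement"] * 2, "P10_RP_100M2P": [78.25, 23.58]}
)
b = pd.DataFrame(
    {"LIBCOM": ["Paris 1er Arrondissement"] * 2, "P11_RP_100M2P": [69.38, 16.31]}
)

# Side-by-side concat repeats the LIBCOM column once per file
wide = pd.concat([a, b], axis=1)

# Keep only the first occurrence of each column name
deduped = wide.loc[:, ~wide.columns.duplicated()]
print(deduped.dtypes)
```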


In [4]:
# extract district number (raw string avoids an invalid "\d" escape):
preds["LIBCOM"] = preds.LIBCOM.str.extract(r"(\d+)", expand=False).astype(int)
preds = preds.rename(columns={"LIBCOM": "arrondissement"})
preds

Unnamed: 0,arrondissement,P10_RP_100M2P,P11_RP_100M2P,P07_RP_100M2P,P13_RP_100M2P,P12_RP_100M2P,P06_RP_100M2P,P16_RP_100M2P,P15_RP_100M2P,P14_RP_100M2P,P08_RP_100M2P,P09_RP_100M2P
0,1,78.254,69.3809,57.1601,76.3504,76.5845,63.1362,55.9369,66.8126,72.6112,62.4516,75.9255
1,1,23.5833,16.312,26.157,22.1193,15.6412,28.4831,12.2742,11.7016,14.0698,23.1603,26.2791
2,1,86.4632,75.7397,66.7775,66.3127,67.1563,46.814,47.913,32.1156,39.0575,53.8634,81.8772
3,1,2.81793,2.77619,0,2.87631,2.66202,0,0,0,0,0,2.85578
4,1,0,0,0,0,,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...
987,20,28.9154,24.8944,32.0405,13.6379,15.3926,24.9946,27.9438,5.64955,5.90609,32.0614,32.6551
988,20,34.4249,47.9878,55.46,40.0883,33.7094,46.125,20.2284,35.7094,36.8006,33.5357,41.1465
989,20,64.6489,69.7425,98.2498,73.93,63.9647,87.3385,132.007,77.5615,67.6795,56.8019,67.6556
990,20,30.6904,39.6802,12.6904,63.0468,62.1198,26.0653,60.1384,55.5294,53.1655,3.46863,16.579
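
The district number is simply the first run of digits in the commune label. A small standalone check of that extraction, using two labels taken from the table above:

```python
import pandas as pd

labels = pd.Series(["Paris 1er Arrondissement", "Paris 20e Arrondissement"])

# expand=False returns a Series instead of a one-column DataFrame
numbers = labels.str.extract(r"(\d+)", expand=False).astype(int)
print(numbers.tolist())
```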


In [5]:
# aggregate by district and prettify columns:
preds = preds.groupby("arrondissement").sum()

temp = []
for var_code in variables["logement"].keys():
    dat = preds.filter(like=var_code[2:])
    dat.columns = dat.columns.str[1:3].astype(int) + 2000
    dat.columns.name = "year"
    dat = dat.sort_index(axis=1)
    dat = dat.stack()
    dat.name = var_code[2:]
    temp.append(dat)

In [6]:
pd.concat(temp, axis=1)

Unnamed: 0_level_0,Unnamed: 1_level_0,RP_100M2P
arrondissement,year,Unnamed: 2_level_1
1,2006,1252.732849
1,2007,1286.164264
1,2008,1216.472899
1,2009,1302.325425
1,2010,1397.237101
...,...,...
20,2012,3789.992165
20,2013,3792.380980
20,2014,3850.537722
20,2015,3843.909770
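
The reshaping step turns one column per year into a long series indexed by (district, year). The sketch below replays it on a toy frame with two districts and two years. The values are rounded or invented, and the name `big_house` is borrowed from the `variables` mapping.

```python
import pandas as pd

wide = pd.DataFrame(
    {"P07_RP_100M2P": [1286.2, 3700.0], "P06_RP_100M2P": [1252.7, 3650.0]},
    index=pd.Index([1, 20], name="arrondissement"),
)

# Characters 1-2 of each column name hold the two-digit year
wide.columns = wide.columns.str[1:3].astype(int) + 2000
wide.columns.name = "year"

# Sort years chronologically, then pivot the columns into the index
long = wide.sort_index(axis=1).stack()
long.name = "big_house"
print(long)
```

Sorting the columns before stacking matters because the files are read in arbitrary order, as the column order in the earlier outputs shows.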


In [38]:
def extract_predictors(repo: str) -> pd.DataFrame:
    """
    Gets all files in the given repo, selects wanted predictor variables (and takes care
    of change of perimeter in 2013 for some of them), restricts to Paris, aggregates predictors
    by district, and then returns formatted time series.
    """
    basepath = Path(f"../../../Downloads/db_iris_all/{repo}/")
    files_in_path = basepath.glob("*.xls")
    print(f"Began extracting predictors from {repo} repo...")

    # load and concat files (heavy):
    preds = pd.DataFrame()
    for file in files_in_path:
        year = file.stem[-2:]
        var_cols = extract_vars(repo, year)
        df = pd.read_excel(
            file,
            header=5,
            sheet_name="IRIS",
            usecols=["DEP", "LIBCOM"] + var_cols,
            dtype={"DEP": "category", "LIBCOM": "category"},
            nrows=40_500,
        )
        df = df[df.DEP == "75"].reset_index(drop=True).drop("DEP", axis=1)
        # handle change of perimeter in data:
        if ((repo == "diplomes_formation") and (year <= "12")) or ((repo == "logement") and (year > "12")):
            df = reconcile_perimeter(repo, year, var_cols, df)
        preds = pd.concat([preds, df], axis=1)
        
    return agg_and_format(repo, preds)


def extract_vars(repo: str, year: str) -> List:
    """
    From the repo and year, make a list of appropriate variables to extract from the file.
    The perimeter of sampled data changed in 2013 for two categories of variables we're interested in
    (diplomes_formation and logement) -- the function handles that by selecting the right columns for 
    each year.
    """
    var_map = variables[repo]
    var_cols = [
        f"{var_code[:1]}{year}{var_code[1:]}" for var_code in var_map.keys()
    ]
    # handle change of perimeter in data:
    if (repo == "diplomes_formation") and (year <= "12"):
        var_cols = [
            f"P{year}_NSCOL15P_{v}"
            for v in ["DIPL0", "CEP", "BEPC", "BAC", "BACP2", "SUP"]
        ]
    if (repo == "logement") and (year > "12"):
        var_cols = [f"P{year}_RP_{v}" for v in ["100120M2", "120M2P"]]
    
    return var_cols


def reconcile_perimeter(repo: str, year: str, var_cols: List, df: pd.DataFrame) -> pd.DataFrame:
    """
    This function reconciles the change of perimeter that occurred in 2013 for variables in
    diplomes_formation and logement. It sums the appropriate columns to get the same
    perimeter for all years, and then drops the now-redundant columns from the dataframe.
    """
    if (repo == "diplomes_formation") and (year <= "12"):
        df[f"P{year}_NSCOL15P_DIPLMIN"] = df[
            [
                f"P{year}_NSCOL15P_DIPL0",
                f"P{year}_NSCOL15P_CEP",
                f"P{year}_NSCOL15P_BEPC",
            ]
        ].sum(1)
        df[f"P{year}_NSCOL15P_SUP"] = df[
            [f"P{year}_NSCOL15P_BACP2", f"P{year}_NSCOL15P_SUP"]
        ].sum(1)
        df = df.drop(
            [
                f"P{year}_NSCOL15P_DIPL0",
                f"P{year}_NSCOL15P_CEP",
                f"P{year}_NSCOL15P_BEPC",
                f"P{year}_NSCOL15P_BACP2",
            ],
            axis=1,
        )
    if (repo == "logement") and (year > "12"):
        df[f"P{year}_RP_100M2P"] = df[var_cols].sum(1)
        df = df.drop(var_cols, axis=1)
    
    return df


def agg_and_format(repo: str, df: pd.DataFrame) -> pd.DataFrame:
    """
    Takes the raw timeseries of predictors, aggregates them by district,
    prettifies columns and returns timeseries of all predictors in
    appropriate format.
    """
    # drop duplicated column values:
    df = df.T.drop_duplicates().T
    # drop duplicated column names:
    df = df.loc[:, ~df.columns.duplicated()]

    # extract district number (raw string avoids an invalid "\d" escape):
    df["LIBCOM"] = df.LIBCOM.str.extract(r"(\d+)", expand=False).astype(int)
    df = df.rename(columns={"LIBCOM": "arrondissement"})

    # aggregate by district and prettify columns:
    df = df.groupby("arrondissement").sum()
    temp = []
    for var_code, var_name in variables[repo].items():
        dat = df.filter(like=var_code[2:])
        dat.columns = dat.columns.str[1:3].astype(int) + 2000
        dat.columns.name = "year"
        dat = dat.sort_index(axis=1).stack()
        dat.name = var_name
        temp.append(dat)
    
    print("Finished extracting and aggregating predictors.\n")
    return pd.concat(temp, axis=1)

The raw Excel files where the data live are very heavy, so this function will take some time to run -- but it will be worth it. It loads the files containing each predictor for each year on record, restricts them to Paris, does some formatting and returns a dataframe with the proper time series. Let's run it and go get a cup of coffee:

In [5]:
%%time
predictors = []
for repo in variables.keys():
    predictors.append(extract_predictors(repo))

Began extracting actifs_occupes predictor from activite_residents repo...

Finished extracting and aggregating actifs_occupes predictor.

Began extracting chomeurs predictor from activite_residents repo...

Finished extracting and aggregating chomeurs predictor.

Began extracting retired predictor from activite_residents repo...

Finished extracting and aggregating retired predictor.

Began extracting csp_plus predictor from activite_residents repo...

Finished extracting and aggregating csp_plus predictor.

Began extracting csp_employed predictor from activite_residents repo...

Finished extracting and aggregating csp_employed predictor.

Began extracting cdd predictor from activite_residents repo...

Finished extracting and aggregating cdd predictor.

Began extracting interim predictor from activite_residents repo...

Finished extracting and aggregating interim predictor.

Began extracting empaid predictor from activite_residents repo...

Finished extracting and aggregating empaid pr

ValueError: Usecols do not match columns, columns expected but not found: ['P11_NSCOL15P_DIPLMIN']

In [4]:
predictors = pd.concat(predictors, axis=1)
predictors

Unnamed: 0_level_0,Unnamed: 1_level_0,actifs_occupes,college_grad,youth,immigration
arrondissement,year,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1,2006,9485.059228,6453.630949,1874.672223,3148.134542
1,2007,9546.148694,6731.105539,1866.646378,3227.921219
1,2008,9469.633224,6770.997547,1816.180756,3121.358408
1,2009,9665.691628,6994.804860,1842.097989,3121.406343
1,2010,9558.180760,7009.239728,1779.033595,3021.113668
...,...,...,...,...,...
20,2012,91753.677270,44244.946169,18234.641207,43045.904247
20,2013,90488.610079,63058.398038,18156.671990,42888.160363
20,2014,90469.181326,66282.764695,18133.119561,42123.803538
20,2015,90370.240523,68786.240273,17977.773858,41633.325845


Had a nice coffee? As you can see, we now have the predictors ready to match with past election results, and then to give to the model! Ready? Well, not completely... The data stop in 2016 but we will train our model on elections as recent as 2017, and we'll test it on 2019 European elections, so we need data for the period 2017-2019.

Unfortunately, this type of data generally takes two years to produce. This means 2019 data should be available around 2021 -- we can't wait that long! Facebook's Prophet library comes in very handy here and will allow us to make some reasonable extrapolations of the predictors' values. Ideally, we should think hard about whether Prophet's default settings suit our use case -- we could even check whether our predictors can themselves be predicted from other, available data.

Here, however, I'll do a quick and dirty extrapolation, sticking to Prophet's defaults. We'll see how the model handles that, and we can always do better afterwards if needed. Actually, I think it could be even more helpful to incorporate measurement error on predictors *into* the model, so that the Bayesian machinery takes it into account -- so let's not spend too much time here, at least for our first iteration.

Let's turn our `year` variable into a real datetime (New Year's Eve, i.e. December 31 of each year) and write our extrapolation function:

In [3]:
predictors = predictors.reset_index().set_index("arrondissement")
predictors["year"] = pd.to_datetime(predictors.year, format="%Y") + pd.DateOffset(
    months=11, days=30
)
predictors

Unnamed: 0_level_0,year,actifs_occupes,college_grad,youth,immigration
arrondissement,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1,2006-12-31,9485.059228,6453.630949,1874.672223,3148.134542
1,2007-12-31,9546.148694,6731.105539,1866.646378,3227.921219
1,2008-12-31,9469.633224,6770.997547,1816.180756,3121.358408
1,2009-12-31,9665.691628,6994.804860,1842.097989,3121.406343
1,2010-12-31,9558.180760,7009.239728,1779.033595,3021.113668
...,...,...,...,...,...
20,2012-12-31,91753.677270,44244.946169,18234.641207,43045.904247
20,2013-12-31,90488.610079,63058.398038,18156.671990,42888.160363
20,2014-12-31,90469.181326,66282.764695,18133.119561,42123.803538
20,2015-12-31,90370.240523,68786.240273,17977.773858,41633.325845
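
Parsing a bare year gives January 1, so the offset of 11 months and 30 days is what moves each date to December 31. A quick standalone check, including a leap year, with years chosen arbitrarily for illustration:

```python
import pandas as pd

years = pd.Series(["2006", "2008", "2016"])  # 2008 and 2016 are leap years

# "%Y" parses to January 1; +11 months gives December 1; +30 days gives December 31
year_end = pd.to_datetime(years, format="%Y") + pd.DateOffset(months=11, days=30)
print(year_end.dt.strftime("%Y-%m-%d").tolist())
```

The months are added before the days, so February's length never enters the calculation.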


In [86]:
def extrapol_pred(
    district: int, predictor: str, pred_df: pd.DataFrame, timeframe: int
) -> pd.DataFrame:
    """
    Quick and dirty extrapolation of predictor in the district, 
    for the number of years specified in timeframe variable.
    The function uses Facebook's Prophet default settings -- hence 'quick and dirty'.
    """
    df = pred_df.loc[district, ["year", predictor]].reset_index(drop=True)
    df.columns = ["ds", "y"]  # Prophet expects these column names

    m = Prophet()
    m.fit(df)
    future = m.make_future_dataframe(periods=timeframe, freq="Y")
    forecast = m.predict(future)

    forecast = forecast.iloc[-timeframe:][["ds", "yhat"]]
    forecast.columns = ["year", predictor]

    forecast.index = [district] * len(forecast)
    forecast.index.name = "arrondissement"
    forecast = forecast.reset_index().set_index(["arrondissement", "year"])
    return forecast
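
Since the series are short and we rely on Prophet's defaults, it can help to compare its output with a much simpler baseline. The sketch below is not Prophet: it fits a plain least-squares line with `numpy.polyfit` to the five `actifs_occupes` values for district 1 that are visible in the truncated table above, and extends it three years. It only illustrates the method, not a forecast for 2017-2019.

```python
import numpy as np

# District 1, actifs_occupes, 2006-2010 (the rows shown in the output above)
years = np.array([2006, 2007, 2008, 2009, 2010])
values = np.array(
    [9485.059228, 9546.148694, 9469.633224, 9665.691628, 9558.180760]
)

# Fit y = slope * (year - 2006) + intercept; centering keeps the fit well-conditioned
offset = years - years[0]
slope, intercept = np.polyfit(offset, values, deg=1)

# Extend the fitted line three years past the last observation
future_years = np.arange(years[-1] + 1, years[-1] + 4)
baseline = slope * (future_years - years[0]) + intercept
print(dict(zip(future_years.tolist(), baseline.round(1).tolist())))
```

If Prophet's extrapolation for a (district, predictor) pair is far from such a line, that pair is probably worth a closer look.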

Each pair (district, predictor) represents a time series that we extrapolate for the next three years (2017-2019). Then, we combine all that in a dataframe:

In [90]:
districts_dfs = []

for district in predictors.index.unique():
    extrapol = []
    for predictor in predictors.columns.difference(["year"]):
        extrapol.append(extrapol_pred(district, predictor, predictors, timeframe=3))
    
    print(f"Finished extrapolating all 4 predictors for district {district}\n")
    districts_dfs.append(pd.concat(extrapol, axis=1))

districts_dfs = pd.concat(districts_dfs)

Finished extrapolating all 4 predictors for district 1

Finished extrapolating all 4 predictors for district 2

Finished extrapolating all 4 predictors for district 3

Finished extrapolating all 4 predictors for district 4

Finished extrapolating all 4 predictors for district 5

Finished extrapolating all 4 predictors for district 6

Finished extrapolating all 4 predictors for district 7

Finished extrapolating all 4 predictors for district 8

Finished extrapolating all 4 predictors for district 9

Finished extrapolating all 4 predictors for district 10

Finished extrapolating all 4 predictors for district 11

Finished extrapolating all 4 predictors for district 12

Finished extrapolating all 4 predictors for district 13

Finished extrapolating all 4 predictors for district 14

Finished extrapolating all 4 predictors for district 15

Finished extrapolating all 4 predictors for district 16

Finished extrapolating all 4 predictors for district 17

Finished extrapolating all 4 predictors 
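
The loop above concatenates in two directions: predictors side by side within a district (`axis=1`), then districts stacked on top of each other. Here is a toy sketch of that assembly, with a hypothetical `toy_forecast` helper standing in for `extrapol_pred` and made-up values.

```python
import pandas as pd


def toy_forecast(district: int, predictor: str, values: list) -> pd.DataFrame:
    """Stand-in for extrapol_pred: one predictor column on an (arrondissement, year) index."""
    idx = pd.MultiIndex.from_product(
        [[district], pd.to_datetime(["2017-12-31", "2018-12-31"])],
        names=["arrondissement", "year"],
    )
    return pd.DataFrame({predictor: values}, index=idx)


per_district = []
for district in (1, 2):
    cols = [
        toy_forecast(district, "youth", [10.0, 11.0]),
        toy_forecast(district, "immigration", [20.0, 21.0]),
    ]
    # Same index in every frame, so axis=1 aligns the predictors row by row
    per_district.append(pd.concat(cols, axis=1))

# Default axis=0 stacks the districts
combined = pd.concat(per_district)
print(combined)
```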

The only thing left to do is to concatenate the extrapolations with the observed data:

In [114]:
predictors = pd.concat(
    [predictors.reset_index().set_index(["arrondissement", "year"]), districts_dfs],
    sort=True,
).sort_index()
predictors.to_csv("data/predictors_by_district.csv")
predictors

Unnamed: 0_level_0,Unnamed: 1_level_0,actifs_occupes,college_grad,immigration,youth
arrondissement,year,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1,2006-12-31,9485.059228,6453.630949,3148.134542,1874.672223
1,2007-12-31,9546.148694,6731.105539,3227.921219,1866.646378
1,2008-12-31,9469.633224,6770.997547,3121.358408,1816.180756
1,2009-12-31,9665.691628,6994.804860,3121.406343,1842.097989
1,2010-12-31,9558.180760,7009.239728,3021.113668,1779.033595
...,...,...,...,...,...
20,2015-12-31,90370.240523,68786.240273,41633.325845,17977.773858
20,2016-12-31,90874.227479,71851.173023,41180.244725,17537.759874
20,2017-12-31,90955.383757,73470.335532,40830.947447,17485.646561
20,2018-12-31,90935.955004,77881.217226,40333.333708,17380.539014


And now we're ready to match predictors against past election results, and to give data to the model! Let's do that in another notebook.

In [115]:
%watermark -a AlexAndorra -n -u -v -iv

logging 0.5.1.2
seaborn 0.9.0
numpy   1.17.3
pandas  0.25.3
AlexAndorra 
last updated: Fri Nov 22 2019 

CPython 3.7.5
IPython 7.9.0
