# Modeling a territory using PCA

In this notebook we model a territory, in this case all of Spain, using PCA on the November 2019 election.

In the previous notebook we modeled the province of Zaragoza from sections chosen within that province. Here we select the sections from a short list of municipalities, so we are being quite restrictive from the start. Some of these municipalities lie in autonomous communities (CCAA) with regionalist or nationalist parties, to try to capture the particularities of the vote in those territories.

Up to the point where PCA is applied, the method is exactly the same as in the linear regression notebook, so we will not comment on it in as much detail as we did there. We will also apply the resulting model to the equivalent sections in June 2016 to see how well it predicts the results of that election.

We start by loading the libraries and the November 2019 dataset.

In [146]:
import pandas as pd
import numpy as np
import random

In [147]:
strings = {'Sección' : 'str', 'cod_ccaa' : 'str', 'cod_prov' : 'str', 'cod_mun' : 'str', 'cod_sec' : 'str'}

In [148]:
df_eleccion_comp = pd.read_csv('/content/drive/MyDrive/Proyecto_KeepCoding - Propio/Data/Gen-19-Nov/gen_N19_unif_cols_prov.txt', dtype = strings)

In [149]:
df_eleccion_comp

Unnamed: 0,Sección,cod_ccaa,cod_prov,cod_mun,cod_sec,CCAA,Provincia,Municipio,Censo_Esc,Votos_Total,Participación,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros,...,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85-89,90-94,95-99,100 y más,Población Total,Hombres,Mujeres,% mayores 65 años,% 20-64 años,% menores 19 años,Afiliados SS Minicipio,% Afiliados SS autónomos,% Afiliados SS / Población,Paro Registrado Municipio,% Paro Hombres,% Paro mayores 45,% Paro s/ Afiliados SS Municipio,Renta persona 2017,Renta persona 2015,Renta hogar 2017,Renta hogar 2015,Renta Salarios 2018,Renta Salarios 2015,Renta Pensiones 2018,Renta Pensiones 2015,Renta Desempleo 2018,Renta Desempleo 2015,dict_res,dict_res_ord
0,022019111010400101001,01,04,04001,0400101001,Andalucía,Almería,Abla,1002,717,0.715569,7,710,3,707,193,310,47,30,0,122,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,2,...,73.0,80.0,89.0,81.0,94.0,87.0,91.0,77.0,72.0,42.0,67.0,56.0,19.0,4.0,0.0,1249.0,635.0,614.0,0.269816,0.590072,0.140112,304.0,0.223684,0.243395,140.0,0.421429,0.550000,0.315315,9159.0,8788.0,20172.0,19546.0,5574.0,4833.0,3286.0,3082.0,403.0,471.0,"{'PP': 193, 'PSOE': 310, 'Cs': 47, 'UP': 30, '...","[('PSOE', 310), ('PP', 193), ('VOX', 122), ('C..."
1,022019111010400201001,01,04,04002,0400201001,Andalucía,Almería,Abrucena,1013,711,0.701876,12,699,1,698,111,349,45,42,0,147,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,...,60.0,75.0,70.0,70.0,108.0,101.0,99.0,86.0,61.0,64.0,61.0,46.0,14.0,2.0,1.0,1202.0,637.0,565.0,0.278702,0.609817,0.111481,298.0,0.251678,0.247920,179.0,0.379888,0.625698,0.375262,8827.0,8107.0,17841.0,17115.0,4640.0,4048.0,3418.0,2770.0,568.0,620.0,"{'PP': 111, 'PSOE': 349, 'Cs': 45, 'UP': 42, '...","[('PSOE', 349), ('VOX', 147), ('PP', 111), ('C..."
2,022019111010400301001,01,04,04003,0400301001,Andalucía,Almería,Adra,667,484,0.725637,7,477,5,472,176,128,15,34,0,116,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,...,54.0,54.0,54.0,61.0,82.0,75.0,67.0,48.0,37.0,40.0,26.0,15.0,3.0,1.0,0.0,892.0,435.0,457.0,0.190583,0.643498,0.165919,7968.0,0.382530,8.932735,2525.0,0.432871,0.473663,0.240637,8965.0,8267.0,26498.0,24688.0,5121.0,4795.0,2499.0,2301.0,337.0,333.0,"{'PP': 176, 'PSOE': 128, 'Cs': 15, 'UP': 34, '...","[('PP', 176), ('PSOE', 128), ('VOX', 116), ('U..."
3,022019111010400301002,01,04,04003,0400301002,Andalucía,Almería,Adra,1306,909,0.696018,3,906,5,901,251,220,51,58,0,312,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,3,...,108.0,158.0,162.0,150.0,140.0,119.0,103.0,67.0,49.0,37.0,30.0,14.0,7.0,1.0,0.0,1752.0,865.0,887.0,0.117009,0.647260,0.235731,7968.0,0.382530,4.547945,2525.0,0.432871,0.473663,0.240637,8599.0,7941.0,25677.0,23400.0,5381.0,4837.0,1815.0,1724.0,343.0,464.0,"{'PP': 251, 'PSOE': 220, 'Cs': 51, 'UP': 58, '...","[('VOX', 312), ('PP', 251), ('PSOE', 220), ('U..."
4,022019111010400301003,01,04,04003,0400301003,Andalucía,Almería,Adra,1551,975,0.628627,12,963,9,954,292,202,73,52,0,327,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,3,...,189.0,178.0,215.0,227.0,164.0,110.0,96.0,61.0,58.0,41.0,40.0,27.0,4.0,4.0,0.0,2240.0,1094.0,1146.0,0.104911,0.647768,0.247321,7968.0,0.382530,3.557143,2525.0,0.432871,0.473663,0.240637,8076.0,7150.0,22051.0,19687.0,5224.0,4044.0,1170.0,1198.0,416.0,476.0,"{'PP': 292, 'PSOE': 202, 'Cs': 73, 'UP': 52, '...","[('VOX', 327), ('PP', 292), ('PSOE', 202), ('C..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
36297,022019111195200108011,19,52,52001,5200108011,Melilla,Melilla,Melilla,1638,1021,0.623321,3,1018,11,1007,303,140,30,28,0,158,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,348,...,181.0,185.0,171.0,164.0,165.0,180.0,155.0,97.0,38.0,34.0,19.0,16.0,4.0,3.0,0.0,2480.0,1244.0,1236.0,0.085081,0.623387,0.291532,23931.0,0.190548,9.649597,12737.0,0.366177,0.403627,0.347360,16433.0,15847.0,66352.0,62632.0,11378.0,11119.0,1508.0,1274.0,167.0,166.0,"{'PP': 303, 'PSOE': 140, 'Cs': 30, 'UP': 28, '...","[('Otros', 348), ('PP', 303), ('VOX', 158), ('..."
36298,022019111195200108012,19,52,52001,5200108012,Melilla,Melilla,Melilla,1676,1057,0.630668,9,1048,2,1046,463,205,36,35,0,210,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,97,...,160.0,175.0,184.0,162.0,188.0,162.0,147.0,106.0,99.0,67.0,49.0,38.0,8.0,2.0,0.0,2334.0,1173.0,1161.0,0.158098,0.612682,0.229220,23931.0,0.190548,10.253213,12737.0,0.366177,0.403627,0.347360,17350.0,16792.0,50730.0,50839.0,13272.0,13038.0,2763.0,2445.0,169.0,177.0,"{'PP': 463, 'PSOE': 205, 'Cs': 36, 'UP': 35, '...","[('PP', 463), ('VOX', 210), ('PSOE', 205), ('O..."
36299,022019111195200108013,19,52,52001,5200108013,Melilla,Melilla,Melilla,1132,638,0.563604,5,633,4,629,208,113,31,25,0,144,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,108,...,179.0,172.0,123.0,117.0,123.0,127.0,113.0,68.0,44.0,24.0,23.0,17.0,2.0,1.0,0.0,1828.0,976.0,852.0,0.097921,0.663567,0.238512,23931.0,0.190548,13.091357,12737.0,0.366177,0.403627,0.347360,12553.0,11823.0,37816.0,36729.0,10102.0,9640.0,1807.0,1615.0,234.0,252.0,"{'PP': 208, 'PSOE': 113, 'Cs': 31, 'UP': 25, '...","[('PP', 208), ('VOX', 144), ('PSOE', 113), ('O..."
36300,022019111195200108014,19,52,52001,5200108014,Melilla,Melilla,Melilla,899,527,0.586207,4,523,0,523,200,87,13,12,0,126,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,85,...,115.0,124.0,81.0,59.0,69.0,70.0,90.0,71.0,59.0,42.0,25.0,8.0,1.0,0.0,0.0,1298.0,634.0,664.0,0.158706,0.577042,0.264253,23931.0,0.190548,18.436826,12737.0,0.366177,0.403627,0.347360,8906.0,8937.0,29898.0,31384.0,5923.0,6061.0,2463.0,2136.0,244.0,284.0,"{'PP': 200, 'PSOE': 87, 'Cs': 13, 'UP': 12, 'I...","[('PP', 200), ('VOX', 126), ('PSOE', 87), ('Ot..."
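The `strings` dictionary passed as `dtype` keeps the code columns as text. A minimal sketch of why this matters, using a made-up three-column CSV rather than the real file: without the explicit dtype, pandas parses codes such as `04` as integers and the leading zero is lost.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the real election file (assumption).
csv_text = "cod_prov,cod_mun,Censo_Esc\n04,04001,1002\n52,52001,899\n"

# Default parsing: the code columns are inferred as integers.
as_numbers = pd.read_csv(io.StringIO(csv_text))

# Explicit dtype, as in the notebook: the codes stay as strings.
as_strings = pd.read_csv(
    io.StringIO(csv_text), dtype={"cod_prov": "str", "cod_mun": "str"}
)

print(as_numbers["cod_prov"].tolist())
print(as_strings["cod_prov"].tolist())
```

Keeping the codes as strings also makes it safe to compare or concatenate them later, for example when matching sections across elections.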


We select the sections to be modeled, which are those of all of Spain.

In [150]:
ccaa_mod = []

provincia_mod = []

municipio_mod = []

secciones_mod = df_eleccion_comp

In [151]:
if len(ccaa_mod) > 0:

  secciones_mod = secciones_mod.loc[secciones_mod['CCAA'].isin(ccaa_mod)]

if len(provincia_mod) > 0:

  secciones_mod = secciones_mod.loc[secciones_mod['Provincia'].isin(provincia_mod)]

if len(municipio_mod) > 0:

  secciones_mod = secciones_mod.loc[secciones_mod['Municipio'].isin(municipio_mod)]



In [152]:
secciones_mod

Unnamed: 0,Sección,cod_ccaa,cod_prov,cod_mun,cod_sec,CCAA,Provincia,Municipio,Censo_Esc,Votos_Total,Participación,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros,...,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85-89,90-94,95-99,100 y más,Población Total,Hombres,Mujeres,% mayores 65 años,% 20-64 años,% menores 19 años,Afiliados SS Minicipio,% Afiliados SS autónomos,% Afiliados SS / Población,Paro Registrado Municipio,% Paro Hombres,% Paro mayores 45,% Paro s/ Afiliados SS Municipio,Renta persona 2017,Renta persona 2015,Renta hogar 2017,Renta hogar 2015,Renta Salarios 2018,Renta Salarios 2015,Renta Pensiones 2018,Renta Pensiones 2015,Renta Desempleo 2018,Renta Desempleo 2015,dict_res,dict_res_ord
0,022019111010400101001,01,04,04001,0400101001,Andalucía,Almería,Abla,1002,717,0.715569,7,710,3,707,193,310,47,30,0,122,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,2,...,73.0,80.0,89.0,81.0,94.0,87.0,91.0,77.0,72.0,42.0,67.0,56.0,19.0,4.0,0.0,1249.0,635.0,614.0,0.269816,0.590072,0.140112,304.0,0.223684,0.243395,140.0,0.421429,0.550000,0.315315,9159.0,8788.0,20172.0,19546.0,5574.0,4833.0,3286.0,3082.0,403.0,471.0,"{'PP': 193, 'PSOE': 310, 'Cs': 47, 'UP': 30, '...","[('PSOE', 310), ('PP', 193), ('VOX', 122), ('C..."
1,022019111010400201001,01,04,04002,0400201001,Andalucía,Almería,Abrucena,1013,711,0.701876,12,699,1,698,111,349,45,42,0,147,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,...,60.0,75.0,70.0,70.0,108.0,101.0,99.0,86.0,61.0,64.0,61.0,46.0,14.0,2.0,1.0,1202.0,637.0,565.0,0.278702,0.609817,0.111481,298.0,0.251678,0.247920,179.0,0.379888,0.625698,0.375262,8827.0,8107.0,17841.0,17115.0,4640.0,4048.0,3418.0,2770.0,568.0,620.0,"{'PP': 111, 'PSOE': 349, 'Cs': 45, 'UP': 42, '...","[('PSOE', 349), ('VOX', 147), ('PP', 111), ('C..."
2,022019111010400301001,01,04,04003,0400301001,Andalucía,Almería,Adra,667,484,0.725637,7,477,5,472,176,128,15,34,0,116,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,...,54.0,54.0,54.0,61.0,82.0,75.0,67.0,48.0,37.0,40.0,26.0,15.0,3.0,1.0,0.0,892.0,435.0,457.0,0.190583,0.643498,0.165919,7968.0,0.382530,8.932735,2525.0,0.432871,0.473663,0.240637,8965.0,8267.0,26498.0,24688.0,5121.0,4795.0,2499.0,2301.0,337.0,333.0,"{'PP': 176, 'PSOE': 128, 'Cs': 15, 'UP': 34, '...","[('PP', 176), ('PSOE', 128), ('VOX', 116), ('U..."
3,022019111010400301002,01,04,04003,0400301002,Andalucía,Almería,Adra,1306,909,0.696018,3,906,5,901,251,220,51,58,0,312,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,3,...,108.0,158.0,162.0,150.0,140.0,119.0,103.0,67.0,49.0,37.0,30.0,14.0,7.0,1.0,0.0,1752.0,865.0,887.0,0.117009,0.647260,0.235731,7968.0,0.382530,4.547945,2525.0,0.432871,0.473663,0.240637,8599.0,7941.0,25677.0,23400.0,5381.0,4837.0,1815.0,1724.0,343.0,464.0,"{'PP': 251, 'PSOE': 220, 'Cs': 51, 'UP': 58, '...","[('VOX', 312), ('PP', 251), ('PSOE', 220), ('U..."
4,022019111010400301003,01,04,04003,0400301003,Andalucía,Almería,Adra,1551,975,0.628627,12,963,9,954,292,202,73,52,0,327,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,3,...,189.0,178.0,215.0,227.0,164.0,110.0,96.0,61.0,58.0,41.0,40.0,27.0,4.0,4.0,0.0,2240.0,1094.0,1146.0,0.104911,0.647768,0.247321,7968.0,0.382530,3.557143,2525.0,0.432871,0.473663,0.240637,8076.0,7150.0,22051.0,19687.0,5224.0,4044.0,1170.0,1198.0,416.0,476.0,"{'PP': 292, 'PSOE': 202, 'Cs': 73, 'UP': 52, '...","[('VOX', 327), ('PP', 292), ('PSOE', 202), ('C..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
36297,022019111195200108011,19,52,52001,5200108011,Melilla,Melilla,Melilla,1638,1021,0.623321,3,1018,11,1007,303,140,30,28,0,158,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,348,...,181.0,185.0,171.0,164.0,165.0,180.0,155.0,97.0,38.0,34.0,19.0,16.0,4.0,3.0,0.0,2480.0,1244.0,1236.0,0.085081,0.623387,0.291532,23931.0,0.190548,9.649597,12737.0,0.366177,0.403627,0.347360,16433.0,15847.0,66352.0,62632.0,11378.0,11119.0,1508.0,1274.0,167.0,166.0,"{'PP': 303, 'PSOE': 140, 'Cs': 30, 'UP': 28, '...","[('Otros', 348), ('PP', 303), ('VOX', 158), ('..."
36298,022019111195200108012,19,52,52001,5200108012,Melilla,Melilla,Melilla,1676,1057,0.630668,9,1048,2,1046,463,205,36,35,0,210,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,97,...,160.0,175.0,184.0,162.0,188.0,162.0,147.0,106.0,99.0,67.0,49.0,38.0,8.0,2.0,0.0,2334.0,1173.0,1161.0,0.158098,0.612682,0.229220,23931.0,0.190548,10.253213,12737.0,0.366177,0.403627,0.347360,17350.0,16792.0,50730.0,50839.0,13272.0,13038.0,2763.0,2445.0,169.0,177.0,"{'PP': 463, 'PSOE': 205, 'Cs': 36, 'UP': 35, '...","[('PP', 463), ('VOX', 210), ('PSOE', 205), ('O..."
36299,022019111195200108013,19,52,52001,5200108013,Melilla,Melilla,Melilla,1132,638,0.563604,5,633,4,629,208,113,31,25,0,144,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,108,...,179.0,172.0,123.0,117.0,123.0,127.0,113.0,68.0,44.0,24.0,23.0,17.0,2.0,1.0,0.0,1828.0,976.0,852.0,0.097921,0.663567,0.238512,23931.0,0.190548,13.091357,12737.0,0.366177,0.403627,0.347360,12553.0,11823.0,37816.0,36729.0,10102.0,9640.0,1807.0,1615.0,234.0,252.0,"{'PP': 208, 'PSOE': 113, 'Cs': 31, 'UP': 25, '...","[('PP', 208), ('VOX', 144), ('PSOE', 113), ('O..."
36300,022019111195200108014,19,52,52001,5200108014,Melilla,Melilla,Melilla,899,527,0.586207,4,523,0,523,200,87,13,12,0,126,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,85,...,115.0,124.0,81.0,59.0,69.0,70.0,90.0,71.0,59.0,42.0,25.0,8.0,1.0,0.0,0.0,1298.0,634.0,664.0,0.158706,0.577042,0.264253,23931.0,0.190548,18.436826,12737.0,0.366177,0.403627,0.347360,8906.0,8937.0,29898.0,31384.0,5923.0,6061.0,2463.0,2136.0,244.0,284.0,"{'PP': 200, 'PSOE': 87, 'Cs': 13, 'UP': 12, 'I...","[('PP', 200), ('VOX', 126), ('PSOE', 87), ('Ot..."


Next we sum the results of all the sections in Spain, normalize them, and create the column that will be the vector 'y' in the model.

In [153]:
secciones_mod_lista = list(secciones_mod['Sección']) 

In [10]:
cols_validas_mod = ['Censo_Esc', 'Votos_Total', 'Nulos', 'Votos_Válidos', 'Blanco', 'V_Cand', 'PP', 'PSOE', 'Cs', 'UP',
       'IU', 'VOX', 'UPyD', 'MP', 'CiU', 'ERC', 'JxC', 'CUP', 'DiL', 'PNV',
       'Bildu', 'Amaiur', 'CC', 'FA', 'TE', 'BNG', 'PRC', 'GBai', 'Compromis',
       'PACMA', 'Otros']

In [154]:
secciones_mod = secciones_mod[cols_validas_mod]

In [155]:
secciones_mod

Unnamed: 0,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros
0,1002,717,7,710,3,707,193,310,47,30,0,122,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,2
1,1013,711,12,699,1,698,111,349,45,42,0,147,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2
2,667,484,7,477,5,472,176,128,15,34,0,116,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0
3,1306,909,3,906,5,901,251,220,51,58,0,312,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,3
4,1551,975,12,963,9,954,292,202,73,52,0,327,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
36297,1638,1021,3,1018,11,1007,303,140,30,28,0,158,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,348
36298,1676,1057,9,1048,2,1046,463,205,36,35,0,210,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,97
36299,1132,638,5,633,4,629,208,113,31,25,0,144,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,108
36300,899,527,4,523,0,523,200,87,13,12,0,126,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,85


In [156]:
censo_mod = secciones_mod['Censo_Esc'].sum()

In [157]:
censo_mod

34871714

In [158]:
modelizacion = pd.DataFrame(secciones_mod.sum(), columns = ['Modelización'])

In [159]:
modelizacion['Modelización'] = modelizacion['Modelización'] / modelizacion['Modelización']['Censo_Esc']

This will be the model's y column once we drop the census row, which is always 1 by construction:

In [160]:
modelizacion

Unnamed: 0,Modelización
Censo_Esc,1.0
Votos_Total,0.698612
Nulos,0.007127
Votos_Válidos,0.691485
Blanco,0.006201
V_Cand,0.685284
PP,0.146826
PSOE,0.193633
Cs,0.046953
UP,0.088816


In [161]:
modelizacion = modelizacion.drop(['Censo_Esc']) 

In [162]:
modelizacion

Unnamed: 0,Modelización
Votos_Total,0.698612
Nulos,0.007127
Votos_Válidos,0.691485
Blanco,0.006201
V_Cand,0.685284
PP,0.146826
PSOE,0.193633
Cs,0.046953
UP,0.088816
IU,0.0


In [20]:
modelizacion.shape

(30, 1)
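The shape `(30, 1)` follows from the 31 columns in `cols_validas_mod` minus the dropped `Censo_Esc` row. The whole aggregation can be sketched on a toy table; the numbers below are invented and only the column names come from the notebook.

```python
import pandas as pd

# Toy stand-in for secciones_mod: three sections, counts only (made-up values).
toy = pd.DataFrame(
    {
        "Censo_Esc": [1000, 500, 500],
        "Votos_Total": [700, 300, 400],
        "PP": [200, 100, 100],
        "PSOE": [300, 100, 200],
    }
)

# Sum every column over all sections, then divide by the total census.
totals = toy.sum()
shares = totals / totals["Censo_Esc"]

# The census share is 1 by construction, so it carries no information.
y = shares.drop("Censo_Esc").to_frame("Modelización")

print(y)
```

Note that every share is expressed relative to the census, not to valid votes, so the party shares do not sum to 1; abstention accounts for the remainder.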

Now we define the municipalities from which we will select the sections. We choose a few small municipalities in provinces or autonomous communities (CCAA) where regionalist or nationalist parties stood.

In [163]:
ccaa_select = []

provincia_select = []

municipio_select = ['Reus', 'Eibar', 'Laredo', 'Teruel', 'Paterna', 'Telde', 'Calatayud', 'Lucena', 'Verín']

secciones_select = df_eleccion_comp

In [164]:
if len(ccaa_select) > 0:

  secciones_select = secciones_select.loc[secciones_select['CCAA'].isin(ccaa_select)]

if len(provincia_select) > 0:

  secciones_select = secciones_select.loc[secciones_select['Provincia'].isin(provincia_select)]

if len(municipio_select) > 0:

  secciones_select = secciones_select.loc[secciones_select['Municipio'].isin(municipio_select)]



In [165]:
secciones_select

Unnamed: 0,Sección,cod_ccaa,cod_prov,cod_mun,cod_sec,CCAA,Provincia,Municipio,Censo_Esc,Votos_Total,Participación,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros,...,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85-89,90-94,95-99,100 y más,Población Total,Hombres,Mujeres,% mayores 65 años,% 20-64 años,% menores 19 años,Afiliados SS Minicipio,% Afiliados SS autónomos,% Afiliados SS / Población,Paro Registrado Municipio,% Paro Hombres,% Paro mayores 45,% Paro s/ Afiliados SS Municipio,Renta persona 2017,Renta persona 2015,Renta hogar 2017,Renta hogar 2015,Renta Salarios 2018,Renta Salarios 2015,Renta Pensiones 2018,Renta Pensiones 2015,Renta Desempleo 2018,Renta Desempleo 2015,dict_res,dict_res_ord
1791,022019111011403801001,01,14,14038,1403801001,Andalucía,Córdoba,Lucena,913,699,0.765608,9,690,3,687,263,97,65,53,0,205,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,...,52.0,63.0,77.0,78.0,90.0,75.0,65.0,67.0,68.0,56.0,39.0,32.0,18.0,2.0,0.0,1132.0,552.0,580.0,0.249117,0.559187,0.191696,17548.0,0.177114,15.501767,6685.0,0.334480,0.511294,0.275863,11025.0,10319.0,29436.0,28262.0,6487.0,5455.0,3588.0,3238.0,225.0,238.0,"{'PP': 263, 'PSOE': 97, 'Cs': 65, 'UP': 53, 'I...","[('PP', 263), ('VOX', 205), ('PSOE', 97), ('Cs..."
1792,022019111011403801002,01,14,14038,1403801002,Andalucía,Córdoba,Lucena,1023,725,0.708700,15,710,13,697,157,171,79,46,0,237,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,...,94.0,100.0,84.0,88.0,100.0,97.0,59.0,32.0,55.0,50.0,38.0,29.0,11.0,0.0,1.0,1291.0,643.0,648.0,0.167312,0.619675,0.213013,17548.0,0.177114,13.592564,6685.0,0.334480,0.511294,0.275863,8575.0,7770.0,22534.0,20780.0,5473.0,4532.0,2351.0,2333.0,353.0,482.0,"{'PP': 157, 'PSOE': 171, 'Cs': 79, 'UP': 46, '...","[('VOX', 237), ('PSOE', 171), ('PP', 157), ('C..."
1793,022019111011403801003,01,14,14038,1403801003,Andalucía,Córdoba,Lucena,1697,1218,0.717737,19,1199,14,1185,259,357,124,126,0,299,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,10,...,118.0,168.0,202.0,167.0,163.0,133.0,111.0,83.0,92.0,77.0,75.0,50.0,26.0,3.0,0.0,2213.0,1062.0,1151.0,0.183461,0.600994,0.215545,17548.0,0.177114,7.929507,6685.0,0.334480,0.511294,0.275863,7565.0,7095.0,19944.0,18935.0,5166.0,3985.0,2004.0,1940.0,378.0,483.0,"{'PP': 259, 'PSOE': 357, 'Cs': 124, 'UP': 126,...","[('PSOE', 357), ('VOX', 299), ('PP', 259), ('U..."
1794,022019111011403801004,01,14,14038,1403801004,Andalucía,Córdoba,Lucena,1872,1366,0.729701,18,1348,17,1331,233,339,158,124,0,456,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,13,...,212.0,245.0,243.0,196.0,188.0,140.0,123.0,68.0,54.0,54.0,37.0,15.0,8.0,0.0,0.0,2539.0,1268.0,1271.0,0.092950,0.657345,0.249705,17548.0,0.177114,6.911382,6685.0,0.334480,0.511294,0.275863,7386.0,6836.0,20709.0,19069.0,6228.0,4956.0,1207.0,1194.0,434.0,483.0,"{'PP': 233, 'PSOE': 339, 'Cs': 158, 'UP': 124,...","[('VOX', 456), ('PSOE', 339), ('PP', 233), ('C..."
1795,022019111011403801005,01,14,14038,1403801005,Andalucía,Córdoba,Lucena,844,647,0.766588,13,634,4,630,217,102,54,67,0,179,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,2,...,68.0,81.0,57.0,65.0,72.0,87.0,65.0,74.0,63.0,53.0,35.0,27.0,8.0,1.0,0.0,1081.0,537.0,544.0,0.241443,0.598520,0.160037,17548.0,0.177114,16.233117,6685.0,0.334480,0.511294,0.275863,10891.0,7902.0,28191.0,20660.0,5770.0,4814.0,3693.0,3470.0,258.0,295.0,"{'PP': 217, 'PSOE': 102, 'Cs': 54, 'UP': 67, '...","[('PP', 217), ('VOX', 179), ('PSOE', 102), ('U..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35281,022019111174619001044,17,46,46190,4619001044,La Rioja,Valencia,Paterna,1178,930,0.789474,6,924,12,912,172,202,140,101,0,219,0,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,2,...,117.0,292.0,368.0,202.0,97.0,58.0,34.0,23.0,25.0,12.0,3.0,1.0,2.0,0.0,0.0,1993.0,1009.0,984.0,0.033116,0.635725,0.331159,52241.0,0.091978,26.212243,6314.0,0.392936,0.511245,0.107830,12210.0,12644.0,33791.0,33686.0,12562.0,12689.0,555.0,610.0,230.0,356.0,"{'PP': 172, 'PSOE': 202, 'Cs': 140, 'UP': 101,...","[('VOX', 219), ('PSOE', 202), ('PP', 172), ('C..."
35282,022019111174619001045,17,46,46190,4619001045,La Rioja,Valencia,Paterna,691,517,0.748191,1,516,6,510,101,101,71,78,0,79,0,65,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,11,...,110.0,222.0,229.0,100.0,51.0,20.0,14.0,7.0,4.0,2.0,1.0,1.0,0.0,0.0,0.0,1183.0,594.0,589.0,0.012680,0.683009,0.304311,52241.0,0.091978,44.159763,6314.0,0.392936,0.511245,0.107830,13371.0,11663.0,32873.0,27525.0,14561.0,11718.0,345.0,628.0,237.0,357.0,"{'PP': 101, 'PSOE': 101, 'Cs': 71, 'UP': 78, '...","[('PP', 101), ('PSOE', 101), ('VOX', 79), ('UP..."
35283,022019111174619001046,17,46,46190,4619001046,La Rioja,Valencia,Paterna,1194,789,0.660804,4,785,7,778,170,158,83,130,0,166,0,51,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,7,...,129.0,206.0,282.0,171.0,117.0,57.0,60.0,41.0,29.0,16.0,13.0,16.0,4.0,2.0,0.0,1914.0,982.0,932.0,0.063218,0.624347,0.312435,52241.0,0.091978,27.294148,6314.0,0.392936,0.511245,0.107830,10692.0,10129.0,29076.0,26104.0,10740.0,9220.0,718.0,983.0,247.0,424.0,"{'PP': 170, 'PSOE': 158, 'Cs': 83, 'UP': 130, ...","[('PP', 170), ('VOX', 166), ('PSOE', 158), ('U..."
35284,022019111174619001047,17,46,46190,4619001047,La Rioja,Valencia,Paterna,1260,965,0.765873,1,964,10,954,199,200,132,105,0,230,0,78,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,4,...,98.0,214.0,380.0,248.0,119.0,70.0,43.0,36.0,27.0,14.0,18.0,17.0,9.0,4.0,0.0,2090.0,1051.0,1039.0,0.059809,0.611005,0.329187,52241.0,0.091978,24.995694,6314.0,0.392936,0.511245,0.107830,13357.0,12549.0,36736.0,33819.0,13832.0,12580.0,580.0,798.0,274.0,263.0,"{'PP': 199, 'PSOE': 200, 'Cs': 132, 'UP': 105,...","[('VOX', 230), ('PSOE', 200), ('PP', 199), ('C..."


We check that we have the sections of all nine municipalities.

In [166]:
secciones_select['Municipio'].unique()

array(['Lucena', 'Teruel', 'Calatayud', 'Telde', 'Laredo', 'Reus',
       'Verín', 'Eibar', 'Paterna'], dtype=object)
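The three-level filter is written out twice in this notebook, once for the territory to model and once for the selection. One possible way to factor it out, together with an automatic version of the check above, is sketched here. `filter_sections` is a hypothetical helper, not part of the original notebook, and the toy data is invented.

```python
import pandas as pd


def filter_sections(df, ccaa=(), provincias=(), municipios=()):
    """Apply the CCAA / Provincia / Municipio filters in turn.

    An empty sequence means "do not filter on this level", as in the notebook.
    """
    levels = (("CCAA", ccaa), ("Provincia", provincias), ("Municipio", municipios))
    for column, wanted in levels:
        if len(wanted) > 0:
            df = df.loc[df[column].isin(wanted)]
    return df


# Toy data (made up) standing in for df_eleccion_comp.
toy = pd.DataFrame(
    {
        "CCAA": ["Aragón", "Aragón", "Cantabria", "Galicia"],
        "Provincia": ["Teruel", "Zaragoza", "Cantabria", "Ourense"],
        "Municipio": ["Teruel", "Calatayud", "Laredo", "Verín"],
    }
)

wanted = ["Teruel", "Laredo", "Sagunto"]
selected = filter_sections(toy, municipios=wanted)

# Names that were requested but matched nothing, e.g. because of a spelling mismatch.
missing = set(wanted) - set(selected["Municipio"].unique())
print(sorted(missing))
```

An empty `missing` set plays the same role as reading the `unique()` output by eye: every requested municipality contributed at least one section.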

We keep only the sections with more than 500 registered voters; in this case almost all of them exceed this threshold.

In [167]:
secciones_select = secciones_select.loc[secciones_select['Censo_Esc'] > 500]

In [168]:
secciones_select

Unnamed: 0,Sección,cod_ccaa,cod_prov,cod_mun,cod_sec,CCAA,Provincia,Municipio,Censo_Esc,Votos_Total,Participación,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros,...,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85-89,90-94,95-99,100 y más,Población Total,Hombres,Mujeres,% mayores 65 años,% 20-64 años,% menores 19 años,Afiliados SS Minicipio,% Afiliados SS autónomos,% Afiliados SS / Población,Paro Registrado Municipio,% Paro Hombres,% Paro mayores 45,% Paro s/ Afiliados SS Municipio,Renta persona 2017,Renta persona 2015,Renta hogar 2017,Renta hogar 2015,Renta Salarios 2018,Renta Salarios 2015,Renta Pensiones 2018,Renta Pensiones 2015,Renta Desempleo 2018,Renta Desempleo 2015,dict_res,dict_res_ord
1791,022019111011403801001,01,14,14038,1403801001,Andalucía,Córdoba,Lucena,913,699,0.765608,9,690,3,687,263,97,65,53,0,205,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,...,52.0,63.0,77.0,78.0,90.0,75.0,65.0,67.0,68.0,56.0,39.0,32.0,18.0,2.0,0.0,1132.0,552.0,580.0,0.249117,0.559187,0.191696,17548.0,0.177114,15.501767,6685.0,0.334480,0.511294,0.275863,11025.0,10319.0,29436.0,28262.0,6487.0,5455.0,3588.0,3238.0,225.0,238.0,"{'PP': 263, 'PSOE': 97, 'Cs': 65, 'UP': 53, 'I...","[('PP', 263), ('VOX', 205), ('PSOE', 97), ('Cs..."
1792,022019111011403801002,01,14,14038,1403801002,Andalucía,Córdoba,Lucena,1023,725,0.708700,15,710,13,697,157,171,79,46,0,237,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,...,94.0,100.0,84.0,88.0,100.0,97.0,59.0,32.0,55.0,50.0,38.0,29.0,11.0,0.0,1.0,1291.0,643.0,648.0,0.167312,0.619675,0.213013,17548.0,0.177114,13.592564,6685.0,0.334480,0.511294,0.275863,8575.0,7770.0,22534.0,20780.0,5473.0,4532.0,2351.0,2333.0,353.0,482.0,"{'PP': 157, 'PSOE': 171, 'Cs': 79, 'UP': 46, '...","[('VOX', 237), ('PSOE', 171), ('PP', 157), ('C..."
1793,022019111011403801003,01,14,14038,1403801003,Andalucía,Córdoba,Lucena,1697,1218,0.717737,19,1199,14,1185,259,357,124,126,0,299,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,10,...,118.0,168.0,202.0,167.0,163.0,133.0,111.0,83.0,92.0,77.0,75.0,50.0,26.0,3.0,0.0,2213.0,1062.0,1151.0,0.183461,0.600994,0.215545,17548.0,0.177114,7.929507,6685.0,0.334480,0.511294,0.275863,7565.0,7095.0,19944.0,18935.0,5166.0,3985.0,2004.0,1940.0,378.0,483.0,"{'PP': 259, 'PSOE': 357, 'Cs': 124, 'UP': 126,...","[('PSOE', 357), ('VOX', 299), ('PP', 259), ('U..."
1794,022019111011403801004,01,14,14038,1403801004,Andalucía,Córdoba,Lucena,1872,1366,0.729701,18,1348,17,1331,233,339,158,124,0,456,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,13,...,212.0,245.0,243.0,196.0,188.0,140.0,123.0,68.0,54.0,54.0,37.0,15.0,8.0,0.0,0.0,2539.0,1268.0,1271.0,0.092950,0.657345,0.249705,17548.0,0.177114,6.911382,6685.0,0.334480,0.511294,0.275863,7386.0,6836.0,20709.0,19069.0,6228.0,4956.0,1207.0,1194.0,434.0,483.0,"{'PP': 233, 'PSOE': 339, 'Cs': 158, 'UP': 124,...","[('VOX', 456), ('PSOE', 339), ('PP', 233), ('C..."
1795,022019111011403801005,01,14,14038,1403801005,Andalucía,Córdoba,Lucena,844,647,0.766588,13,634,4,630,217,102,54,67,0,179,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,2,...,68.0,81.0,57.0,65.0,72.0,87.0,65.0,74.0,63.0,53.0,35.0,27.0,8.0,1.0,0.0,1081.0,537.0,544.0,0.241443,0.598520,0.160037,17548.0,0.177114,16.233117,6685.0,0.334480,0.511294,0.275863,10891.0,7902.0,28191.0,20660.0,5770.0,4814.0,3693.0,3470.0,258.0,295.0,"{'PP': 217, 'PSOE': 102, 'Cs': 54, 'UP': 67, '...","[('PP', 217), ('VOX', 179), ('PSOE', 102), ('U..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35281,022019111174619001044,17,46,46190,4619001044,La Rioja,Valencia,Paterna,1178,930,0.789474,6,924,12,912,172,202,140,101,0,219,0,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,2,...,117.0,292.0,368.0,202.0,97.0,58.0,34.0,23.0,25.0,12.0,3.0,1.0,2.0,0.0,0.0,1993.0,1009.0,984.0,0.033116,0.635725,0.331159,52241.0,0.091978,26.212243,6314.0,0.392936,0.511245,0.107830,12210.0,12644.0,33791.0,33686.0,12562.0,12689.0,555.0,610.0,230.0,356.0,"{'PP': 172, 'PSOE': 202, 'Cs': 140, 'UP': 101,...","[('VOX', 219), ('PSOE', 202), ('PP', 172), ('C..."
35282,022019111174619001045,17,46,46190,4619001045,La Rioja,Valencia,Paterna,691,517,0.748191,1,516,6,510,101,101,71,78,0,79,0,65,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,11,...,110.0,222.0,229.0,100.0,51.0,20.0,14.0,7.0,4.0,2.0,1.0,1.0,0.0,0.0,0.0,1183.0,594.0,589.0,0.012680,0.683009,0.304311,52241.0,0.091978,44.159763,6314.0,0.392936,0.511245,0.107830,13371.0,11663.0,32873.0,27525.0,14561.0,11718.0,345.0,628.0,237.0,357.0,"{'PP': 101, 'PSOE': 101, 'Cs': 71, 'UP': 78, '...","[('PP', 101), ('PSOE', 101), ('VOX', 79), ('UP..."
35283,022019111174619001046,17,46,46190,4619001046,La Rioja,Valencia,Paterna,1194,789,0.660804,4,785,7,778,170,158,83,130,0,166,0,51,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,7,...,129.0,206.0,282.0,171.0,117.0,57.0,60.0,41.0,29.0,16.0,13.0,16.0,4.0,2.0,0.0,1914.0,982.0,932.0,0.063218,0.624347,0.312435,52241.0,0.091978,27.294148,6314.0,0.392936,0.511245,0.107830,10692.0,10129.0,29076.0,26104.0,10740.0,9220.0,718.0,983.0,247.0,424.0,"{'PP': 170, 'PSOE': 158, 'Cs': 83, 'UP': 130, ...","[('PP', 170), ('VOX', 166), ('PSOE', 158), ('U..."
35284,022019111174619001047,17,46,46190,4619001047,La Rioja,Valencia,Paterna,1260,965,0.765873,1,964,10,954,199,200,132,105,0,230,0,78,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,4,...,98.0,214.0,380.0,248.0,119.0,70.0,43.0,36.0,27.0,14.0,18.0,17.0,9.0,4.0,0.0,2090.0,1051.0,1039.0,0.059809,0.611005,0.329187,52241.0,0.091978,24.995694,6314.0,0.392936,0.511245,0.107830,13357.0,12549.0,36736.0,33819.0,13832.0,12580.0,580.0,798.0,274.0,263.0,"{'PP': 199, 'PSOE': 200, 'Cs': 132, 'UP': 105,...","[('VOX', 230), ('PSOE', 200), ('PP', 199), ('C..."
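To quantify how many sections the census threshold removes, the mask can be counted before it is applied. This is a small sketch on invented values; `MIN_CENSO` is a name introduced here for illustration.

```python
import pandas as pd

# Made-up sections; only the column names match the notebook.
toy = pd.DataFrame(
    {"Sección": ["a", "b", "c", "d"], "Censo_Esc": [913, 480, 500, 1260]}
)

MIN_CENSO = 500  # hypothetical constant holding the threshold used above

# Strict inequality, as in the notebook: a census of exactly 500 is dropped.
mask = toy["Censo_Esc"] > MIN_CENSO
kept = toy.loc[mask]
dropped = int((~mask).sum())

print(f"kept {len(kept)} of {len(toy)} sections, dropped {dropped}")
```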


We follow the same process as in the linear regression notebook. We first normalize all the sections, before determining which of them will be valid.

In [169]:
col_validas_select = ['Sección', 'Censo_Esc', 'Votos_Total', 'Nulos', 'Votos_Válidos', 'Blanco', 'V_Cand', 'PP', 'PSOE', 'Cs', 'UP',
       'IU', 'VOX', 'UPyD', 'MP', 'CiU', 'ERC', 'JxC', 'CUP', 'DiL', 'PNV',
       'Bildu', 'Amaiur', 'CC', 'FA', 'TE', 'BNG', 'PRC', 'GBai', 'Compromis',
       'PACMA', 'Otros']

In [170]:
secciones_select = secciones_select[col_validas_select]

In [171]:
secciones_select

Unnamed: 0,Sección,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros
1791,022019111011403801001,913,699,9,690,3,687,263,97,65,53,0,205,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1
1792,022019111011403801002,1023,725,15,710,13,697,157,171,79,46,0,237,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4
1793,022019111011403801003,1697,1218,19,1199,14,1185,259,357,124,126,0,299,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,10
1794,022019111011403801004,1872,1366,18,1348,17,1331,233,339,158,124,0,456,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,13
1795,022019111011403801005,844,647,13,634,4,630,217,102,54,67,0,179,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35281,022019111174619001044,1178,930,6,924,12,912,172,202,140,101,0,219,0,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,2
35282,022019111174619001045,691,517,1,516,6,510,101,101,71,78,0,79,0,65,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,11
35283,022019111174619001046,1194,789,4,785,7,778,170,158,83,130,0,166,0,51,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,7
35284,022019111174619001047,1260,965,1,964,10,954,199,200,132,105,0,230,0,78,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,4


In [172]:
secciones_select_norm = secciones_select.copy()

In [173]:
secciones_select_norm

Unnamed: 0,Sección,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros
1791,022019111011403801001,913,699,9,690,3,687,263,97,65,53,0,205,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1
1792,022019111011403801002,1023,725,15,710,13,697,157,171,79,46,0,237,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4
1793,022019111011403801003,1697,1218,19,1199,14,1185,259,357,124,126,0,299,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,10
1794,022019111011403801004,1872,1366,18,1348,17,1331,233,339,158,124,0,456,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,13
1795,022019111011403801005,844,647,13,634,4,630,217,102,54,67,0,179,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35281,022019111174619001044,1178,930,6,924,12,912,172,202,140,101,0,219,0,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,2
35282,022019111174619001045,691,517,1,516,6,510,101,101,71,78,0,79,0,65,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,11
35283,022019111174619001046,1194,789,4,785,7,778,170,158,83,130,0,166,0,51,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,7
35284,022019111174619001047,1260,965,1,964,10,954,199,200,132,105,0,230,0,78,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,4


In [174]:
set_cols = ['Sección', 'Censo_Esc']

In [175]:
for col in secciones_select_norm.columns:

  if col not in set_cols:
    
    secciones_select_norm[col] = secciones_select_norm[col] / secciones_select_norm['Censo_Esc']

secciones_select_norm = secciones_select_norm.set_index('Sección')
secciones_select_norm = secciones_select_norm.drop('Censo_Esc', axis = 1)

secciones_select_norm = secciones_select_norm.T
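
The column-by-column loop above can also be written as a single vectorized division. The following is a minimal sketch on a made-up two-section frame; the values and the names `toy`, `counts`, `shares` and `toy_norm` are illustrative assumptions, not part of the dataset.

```python
import pandas as pd

# Toy stand-in for secciones_select (hypothetical values, two sections only)
toy = pd.DataFrame({
    'Sección': ['A', 'B'],
    'Censo_Esc': [1000, 500],
    'Votos_Total': [700, 400],
    'PP': [200, 100],
})

# Divide every count column by the census of its own section (row-wise)
counts = toy.set_index('Sección')
shares = counts.drop(columns='Censo_Esc').div(counts['Censo_Esc'], axis=0)

# Transpose so that rows are variables and columns are sections
toy_norm = shares.T
print(toy_norm)
```

Using `div(..., axis=0)` aligns the census with the row index, so each section is divided by its own census without an explicit loop.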

In [176]:
secciones_select_norm

Sección,022019111011403801001,022019111011403801002,022019111011403801003,022019111011403801004,022019111011403801005,022019111011403801006,022019111011403802001,022019111011403802002,022019111011403802003,022019111011403802004,022019111011403802005,022019111011403802006,022019111011403802007,022019111011403802008,022019111011403802009,022019111011403802010,022019111011403803001,022019111011403803002,022019111011403803003,022019111011403803004,022019111011403803005,022019111011403803006,022019111011403803007,022019111011403803008,022019111011403804001,022019111011403805001,022019111011403805002,022019111024421601001,022019111024421601002,022019111024421602001,022019111024421602002,022019111024421602003,022019111024421602004,022019111024421602005,022019111024421602006,022019111024421602007,022019111024421603001,022019111024421603002,022019111024421603003,022019111024421603004,...,022019111174619001008,022019111174619001009,022019111174619001010,022019111174619001011,022019111174619001012,022019111174619001013,022019111174619001014,022019111174619001015,022019111174619001016,022019111174619001017,022019111174619001018,022019111174619001019,022019111174619001020,022019111174619001021,022019111174619001022,022019111174619001023,022019111174619001024,022019111174619001025,022019111174619001026,022019111174619001027,022019111174619001028,022019111174619001030,022019111174619001031,022019111174619001032,022019111174619001033,022019111174619001034,022019111174619001035,022019111174619001036,022019111174619001037,022019111174619001038,022019111174619001039,022019111174619001040,022019111174619001041,022019111174619001042,022019111174619001043,022019111174619001044,022019111174619001045,022019111174619001046,022019111174619001047,022019111174619001048
Votos_Total,0.765608,0.7087,0.717737,0.729701,0.766588,0.713405,0.697704,0.67842,0.70041,0.665417,0.746269,0.711618,0.659433,0.726496,0.721848,0.721251,0.714136,0.739054,0.725291,0.764157,0.711735,0.726077,0.753255,0.671593,0.630907,0.740924,0.690541,0.685422,0.700833,0.686798,0.765306,0.698686,0.740102,0.594123,0.755639,0.748603,0.73716,0.730196,0.760166,0.762027,...,0.711128,0.706651,0.733721,0.65375,0.725742,0.656692,0.63548,0.687005,0.640807,0.674275,0.630065,0.682953,0.660453,0.689715,0.679262,0.607973,0.682612,0.709276,0.549296,0.736752,0.784553,0.730212,0.803468,0.802704,0.338078,0.724638,0.785555,0.662346,0.805575,0.802002,0.304175,0.791696,0.77459,0.73357,0.784703,0.789474,0.748191,0.660804,0.765873,0.749107
Nulos,0.009858,0.014663,0.011196,0.009615,0.015403,0.01849,0.006378,0.012694,0.009362,0.02343,0.004975,0.011411,0.01085,0.013889,0.004812,0.017479,0.009459,0.016929,0.010174,0.003331,0.01148,0.010766,0.011719,0.00821,0.007024,0.024752,0.012162,0.004263,0.0075,0.007725,0.002268,0.003284,0.00203,0.000918,0.000627,0.002235,0.003021,0.001569,0.00332,0.002577,...,0.004573,0.007126,0.002326,0.00125,0.004812,0.00665,0.001215,0.003962,0.003879,0.009254,0.001307,0.004343,0.006658,0.00605,0.00703,0.001661,0.003037,0.002262,0.001408,0.001709,0.002033,0.003902,0.004817,0.006146,0.003559,0.004026,0.006701,0.005185,0.006272,0.007786,0.002982,0.002111,0.004098,0.006217,0.008499,0.005093,0.001447,0.00335,0.000794,0.001786
Votos_Válidos,0.75575,0.694037,0.706541,0.720085,0.751185,0.694915,0.691327,0.665726,0.691047,0.641987,0.741294,0.700207,0.648583,0.712607,0.717036,0.703772,0.704677,0.722125,0.715116,0.760826,0.700255,0.715311,0.741536,0.663383,0.623883,0.716172,0.678378,0.681159,0.693333,0.679073,0.763039,0.695402,0.738071,0.593205,0.755013,0.746369,0.734139,0.728627,0.756846,0.75945,...,0.706555,0.699525,0.731395,0.6525,0.72093,0.650042,0.634265,0.683043,0.636928,0.665022,0.628758,0.67861,0.653795,0.683665,0.672232,0.606312,0.679575,0.707014,0.547887,0.735043,0.78252,0.72631,0.798651,0.796558,0.33452,0.720612,0.778853,0.657161,0.799303,0.794216,0.301193,0.789585,0.770492,0.727353,0.776204,0.78438,0.746744,0.657454,0.765079,0.747321
Blanco,0.003286,0.012708,0.00825,0.009081,0.004739,0.004622,0.005102,0.004231,0.007607,0.003749,0.007463,0.011411,0.007233,0.0,0.021174,0.014719,0.007882,0.006421,0.001453,0.005996,0.01148,0.004785,0.00651,0.004926,0.005747,0.00495,0.006757,0.002558,0.001667,0.000702,0.001134,0.004105,0.004061,0.003673,0.002506,0.005587,0.003021,0.003137,0.004149,0.000859,...,0.004573,0.003563,0.005814,0.005,0.00401,0.007481,0.00243,0.009509,0.002327,0.006786,0.003922,0.004343,0.001332,0.00605,0.002636,0.006645,0.002278,0.00905,0.007042,0.020513,0.00813,0.009476,0.00578,0.004302,0.003559,0.002415,0.002234,0.004537,0.004878,0.003337,0.003976,0.008445,0.004918,0.002664,0.011331,0.010187,0.008683,0.005863,0.007937,0.008036
V_Cand,0.752464,0.681329,0.698291,0.711004,0.746445,0.690293,0.686224,0.661495,0.683441,0.638238,0.733831,0.688797,0.64135,0.712607,0.695861,0.689052,0.696795,0.715703,0.713663,0.75483,0.688776,0.710526,0.735026,0.658456,0.618135,0.711221,0.671622,0.678602,0.691667,0.678371,0.761905,0.691297,0.73401,0.589532,0.752506,0.740782,0.731118,0.72549,0.752697,0.758591,...,0.701982,0.695962,0.725581,0.6475,0.716921,0.64256,0.631835,0.673534,0.6346,0.658236,0.624837,0.674267,0.652463,0.677615,0.669596,0.599668,0.677297,0.697964,0.540845,0.71453,0.77439,0.716834,0.792871,0.792256,0.330961,0.718196,0.77662,0.652625,0.794425,0.790879,0.297217,0.78114,0.765574,0.724689,0.764873,0.774194,0.738061,0.651591,0.757143,0.739286
PP,0.288061,0.15347,0.152622,0.124466,0.257109,0.177196,0.193878,0.138223,0.147455,0.093721,0.197347,0.164938,0.08258,0.154915,0.154957,0.105796,0.116658,0.129013,0.133721,0.263158,0.128827,0.175837,0.175781,0.142857,0.095147,0.133663,0.085135,0.194373,0.161667,0.13132,0.184807,0.15353,0.116751,0.096419,0.114662,0.132961,0.098691,0.159216,0.242324,0.231959,...,0.133384,0.153207,0.109302,0.12875,0.121893,0.113051,0.064399,0.083201,0.058185,0.072178,0.120261,0.098806,0.101198,0.093345,0.102812,0.068106,0.125285,0.18552,0.098592,0.179487,0.193767,0.114827,0.175337,0.153042,0.030842,0.152174,0.218913,0.093325,0.242509,0.240267,0.030815,0.166784,0.17459,0.095915,0.19169,0.14601,0.146165,0.142379,0.157937,0.176786
PSOE,0.106243,0.167155,0.210371,0.18109,0.120853,0.181818,0.167092,0.165021,0.190755,0.221181,0.143449,0.158714,0.16094,0.163462,0.161694,0.183993,0.208092,0.214828,0.25436,0.131246,0.161352,0.159091,0.159505,0.27422,0.113027,0.313531,0.347297,0.103154,0.119167,0.089185,0.111111,0.133826,0.156345,0.135904,0.149749,0.132961,0.152064,0.131765,0.112863,0.09622,...,0.23247,0.211401,0.244186,0.2175,0.198877,0.210308,0.238153,0.213946,0.240497,0.241826,0.223529,0.281216,0.234354,0.25497,0.22935,0.262458,0.21716,0.196833,0.225352,0.198291,0.168022,0.177815,0.186898,0.17394,0.179122,0.171498,0.14073,0.165911,0.15331,0.171301,0.143141,0.147783,0.14918,0.219361,0.152975,0.171477,0.146165,0.132328,0.15873,0.1375
Cs,0.071194,0.077224,0.07307,0.084402,0.063981,0.046225,0.05102,0.060649,0.061439,0.051546,0.053897,0.076763,0.052441,0.064103,0.063523,0.064397,0.074619,0.064215,0.063953,0.067288,0.070791,0.063397,0.083333,0.046798,0.05364,0.052805,0.054054,0.024723,0.018333,0.025281,0.017007,0.030378,0.029442,0.019284,0.028195,0.042458,0.02719,0.025098,0.016598,0.033505,...,0.050305,0.053444,0.067442,0.0575,0.076183,0.049044,0.044957,0.075277,0.032583,0.05182,0.043137,0.046688,0.046605,0.05013,0.053603,0.021595,0.063022,0.057692,0.026761,0.071795,0.087398,0.090301,0.089595,0.073141,0.005931,0.068438,0.087863,0.061568,0.083624,0.095662,0.00497,0.076003,0.117213,0.076377,0.114259,0.118846,0.10275,0.069514,0.104762,0.098214
UP,0.05805,0.044966,0.074249,0.066239,0.079384,0.069337,0.053571,0.054302,0.063195,0.066542,0.072139,0.051867,0.072936,0.066239,0.060635,0.071757,0.069364,0.05721,0.043605,0.0493,0.079719,0.059809,0.069661,0.042693,0.066411,0.051155,0.108108,0.02728,0.0425,0.025281,0.027211,0.022989,0.036548,0.02663,0.036967,0.01676,0.02719,0.018824,0.014108,0.027491,...,0.096799,0.097387,0.112791,0.09125,0.106656,0.117207,0.120292,0.156101,0.099302,0.112893,0.095425,0.095548,0.106525,0.115817,0.102812,0.114618,0.105543,0.091629,0.077465,0.066667,0.116531,0.101449,0.084778,0.152428,0.035587,0.105475,0.064036,0.105638,0.088502,0.086763,0.042744,0.132301,0.081967,0.118117,0.067044,0.085739,0.11288,0.108878,0.083333,0.085714
IU,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


We now define two functions. The first drops the columns of parties that did not run in those provinces, normalizes the counts by the census, and transposes the result. The second uses the correlation matrix to drop the sections that are too similar to one another.

In [177]:
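def preparacion_sec(eleccion):
  # Returns a (variables x sections) frame of census-normalized counts.
  # Note: the column assignment below writes into the frame passed in, which
  # can trigger pandas' SettingWithCopyWarning when that frame is a slice;
  # pass a .copy() if the caller's data must stay untouched.

  set_cols = ['Sección', 'Censo_Esc']
  
  for col in eleccion.columns:

    # Drop parties with no votes in any of the selected sections
    if eleccion[col].sum() == 0:

      eleccion = eleccion.drop([col], axis = 1)

    # Express every remaining count as a fraction of the section's census
    elif col not in set_cols:

      eleccion[col] = eleccion[col] / eleccion['Censo_Esc']

  eleccion = eleccion.set_index('Sección')
  eleccion = eleccion.drop('Censo_Esc', axis = 1)

  # Transpose: rows become variables, columns become sections
  df_elec_transpose = eleccion.T

  # Shuffle the section order (unseeded, so it changes from run to run)
  lista_sec = list(df_elec_transpose.columns)
  random.shuffle(lista_sec)

  df_elec_transpose = df_elec_transpose[lista_sec]

  return df_elec_transpose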


In [178]:
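def secciones_corr(dummy, threshold = 0.995):

  # Work on the matrix passed in rather than on the global variable `m`
  m = dummy

  # Note: ind - 1 stops at the second-to-last section, so the last one is never tested
  for ind in range(2, m.shape[0]):
    s = m.iloc[0:ind, 0:ind]

    # Drop section ind-1 if it correlates above the threshold with any earlier section (kept or not)
    if (s.iloc[ind-1, 0:ind-1] > threshold).any():
      dummy = dummy.drop(m.columns[ind-1], axis = 0)
      dummy = dummy.drop(m.columns[ind-1], axis = 1)

  return dummy.columns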
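
To make the idea of correlation-based filtering concrete, here is a small sketch on synthetic data. It is a simplified variant, not the function above: the hypothetical helper `drop_correlated` compares each column only against the columns already kept, and it tests every column. The data and names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def drop_correlated(corr, threshold=0.995):
    """Greedy filter: keep a column unless it correlates above
    `threshold` with a column that was already kept."""
    kept = []
    for col in corr.columns:
        if all(corr.loc[col, k] <= threshold for k in kept):
            kept.append(col)
    return kept

rng = np.random.default_rng(0)
base = rng.normal(size=50)
toy = pd.DataFrame({
    's1': base,
    's2': base * 2.0 + 1.0,      # linear function of s1, correlation of 1
    's3': rng.normal(size=50),   # drawn independently of s1
})

kept = drop_correlated(toy.corr(), threshold=0.995)
print(kept)
```

Lowering the threshold removes more columns; a value close to 1, as used in this notebook, removes only near-duplicates.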


In [179]:
secc = preparacion_sec(secciones_select)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  del sys.path[0]


In [180]:
secc

Sección,022019111053502603009,022019111174619001001,022019111053502605003,022019111094312304005,022019111142003003001,022019111053502604002,022019111142003003004,022019111094312309002,022019111174619001044,022019111053502602003,022019111011403801003,022019111053502603008,022019111094312306008,022019111053502603011,022019111094312305007,022019111142003005004,022019111024421603006,022019111063903502001,022019111053502603010,022019111174619001012,022019111142003004002,022019111063903501002,022019111142003005003,022019111113208501001,022019111142003003003,022019111174619001007,022019111053502606015,022019111011403803002,022019111142003004006,022019111024421602005,022019111174619001025,022019111142003001005,022019111011403802006,022019111094312301002,022019111094312305003,022019111053502605007,022019111094312302009,022019111024421604001,022019111094312307005,022019111025006702005,...,022019111025006704001,022019111174619001034,022019111011403802004,022019111094312302010,022019111094312306013,022019111024421603010,022019111053502606013,022019111053502604003,022019111053502602008,022019111174619001045,022019111063903502002,022019111113208501010,022019111011403802009,022019111053502604001,022019111063903502004,022019111174619001028,022019111094312303011,022019111174619001031,022019111025006702001,022019111053502602005,022019111094312302013,022019111174619001021,022019111053502603016,022019111063903501001,022019111053502602004,022019111024421602007,022019111053502602010,022019111053502602007,022019111024421603011,022019111142003001004,022019111053502606008,022019111094312302006,022019111063903502003,022019111011403803003,022019111094312305005,022019111024421603009,022019111025006701005,022019111094312302007,022019111024421603005,022019111011403805001
Votos_Total,0.471947,0.713278,0.618966,0.72623,0.619231,0.592068,0.709779,0.51932,0.789474,0.598485,0.717737,0.583207,0.673404,0.494413,0.728893,0.677201,0.745137,0.74433,0.584983,0.725742,0.73022,0.693899,0.671902,0.609836,0.716198,0.761194,0.59306,0.739054,0.689295,0.594123,0.709276,0.684055,0.711618,0.724868,0.565008,0.670588,0.755862,0.724342,0.70558,0.755891,...,0.630673,0.724638,0.665417,0.718876,0.723982,0.827822,0.695431,0.592952,0.61994,0.748191,0.752395,0.652991,0.721848,0.599762,0.790776,0.784553,0.664612,0.803468,0.733143,0.605108,0.707595,0.689715,0.460425,0.753521,0.60824,0.748603,0.666667,0.560748,0.848875,0.653622,0.65107,0.704487,0.732489,0.725291,0.590504,0.812113,0.743415,0.731051,0.79703,0.740924
Nulos,0.006601,0.00235,0.006897,0.009836,0.003846,0.007082,0.001577,0.003091,0.005093,0.006629,0.011196,0.004539,0.003191,0.005587,0.00469,0.004515,0.006707,0.007216,0.005461,0.004812,0.003891,0.007239,0.00349,0.009836,0.001339,0.010333,0.007886,0.016929,0.001305,0.000918,0.002262,0.005906,0.011411,0.003023,0.0,0.004278,0.008276,0.003947,0.003671,0.01508,...,0.00626,0.004026,0.02343,0.003347,0.003017,0.00114,0.004351,0.014097,0.003748,0.001447,0.005158,0.006838,0.004812,0.008314,0.008999,0.002033,0.003083,0.004817,0.014245,0.005894,0.003797,0.00605,0.003861,0.01006,0.007491,0.002235,0.013035,0.009346,0.001608,0.009159,0.010695,0.005128,0.006751,0.010174,0.003956,0.000688,0.011895,0.00489,0.002829,0.024752
Votos_Válidos,0.465347,0.710928,0.612069,0.716393,0.615385,0.584986,0.708202,0.516229,0.78438,0.591856,0.706541,0.578669,0.670213,0.488827,0.724203,0.672686,0.738431,0.737113,0.579522,0.72093,0.726329,0.68666,0.668412,0.6,0.714859,0.750861,0.585174,0.722125,0.68799,0.593205,0.707014,0.67815,0.700207,0.721844,0.565008,0.66631,0.747586,0.720395,0.701909,0.740811,...,0.624413,0.720612,0.641987,0.715529,0.720965,0.826682,0.69108,0.578855,0.616192,0.746744,0.747237,0.646154,0.717036,0.591449,0.781777,0.78252,0.661529,0.798651,0.718898,0.599214,0.703797,0.683665,0.456564,0.743461,0.600749,0.746369,0.653631,0.551402,0.847267,0.644463,0.640374,0.699359,0.725738,0.715116,0.586548,0.811425,0.731521,0.726161,0.794201,0.716172
Blanco,0.005776,0.0,0.014368,0.003279,0.003077,0.003541,0.003155,0.007728,0.010187,0.002841,0.00825,0.003782,0.003191,0.00419,0.002814,0.004515,0.001341,0.0,0.008191,0.00401,0.003891,0.001034,0.002618,0.002459,0.008032,0.005741,0.006835,0.006421,0.002611,0.003673,0.00905,0.002953,0.011411,0.000756,0.006421,0.003209,0.001379,0.003947,0.005874,0.004713,...,0.004695,0.002415,0.003749,0.002677,0.004525,0.004561,0.007977,0.004405,0.004498,0.008683,0.000737,0.006838,0.021174,0.007126,0.004499,0.00813,0.003699,0.00578,0.008547,0.007859,0.002532,0.00605,0.007722,0.003018,0.003745,0.005587,0.005587,0.00647,0.003215,0.002498,0.008021,0.012179,0.006751,0.001453,0.0,0.002753,0.005098,0.005705,0.002829,0.00495
V_Cand,0.459571,0.710928,0.597701,0.713115,0.612308,0.581445,0.705047,0.508501,0.774194,0.589015,0.698291,0.574887,0.667021,0.484637,0.721388,0.668172,0.737089,0.737113,0.571331,0.716921,0.722438,0.685626,0.665794,0.597541,0.706827,0.745121,0.578339,0.715703,0.685379,0.589532,0.697964,0.675197,0.688797,0.721088,0.558587,0.663102,0.746207,0.716447,0.696035,0.736098,...,0.619718,0.718196,0.638238,0.712851,0.71644,0.822121,0.683104,0.574449,0.611694,0.738061,0.7465,0.639316,0.695861,0.584323,0.777278,0.77439,0.65783,0.792871,0.710351,0.591356,0.701266,0.677615,0.448842,0.740443,0.597004,0.740782,0.648045,0.544932,0.844051,0.641965,0.632353,0.687179,0.718987,0.713663,0.586548,0.808672,0.726423,0.720456,0.791372,0.711221
PP,0.082508,0.137485,0.108046,0.078689,0.036154,0.193343,0.045741,0.035549,0.14601,0.107955,0.152622,0.066566,0.051064,0.063547,0.052533,0.045147,0.176392,0.208247,0.112628,0.121893,0.040208,0.124095,0.045375,0.235246,0.021419,0.203215,0.095163,0.129013,0.032637,0.096419,0.18552,0.042323,0.164938,0.071807,0.05939,0.165775,0.074483,0.173026,0.048458,0.224317,...,0.208138,0.152174,0.093721,0.044846,0.078431,0.119156,0.171864,0.177093,0.151424,0.146165,0.249079,0.226496,0.154957,0.166865,0.24297,0.193767,0.064118,0.175337,0.209877,0.139489,0.068354,0.093345,0.044402,0.193159,0.119101,0.132961,0.173184,0.108555,0.136656,0.043297,0.127005,0.052564,0.142616,0.133721,0.053412,0.22574,0.218352,0.05542,0.201556,0.133663
PSOE,0.15429,0.206816,0.154598,0.167213,0.186923,0.106232,0.173502,0.111283,0.171477,0.163826,0.210371,0.189864,0.093617,0.166899,0.161351,0.1614,0.140845,0.147423,0.133106,0.198877,0.143969,0.170631,0.227749,0.178689,0.172691,0.205511,0.153523,0.214828,0.164491,0.135904,0.196833,0.267717,0.158714,0.103553,0.17817,0.145455,0.091034,0.124342,0.168135,0.193214,...,0.156495,0.171498,0.221181,0.157965,0.102564,0.118586,0.158811,0.147137,0.158921,0.146165,0.150332,0.196581,0.161694,0.107482,0.148481,0.168022,0.209001,0.186898,0.226971,0.154551,0.201266,0.25497,0.224903,0.176056,0.158052,0.132961,0.162011,0.161754,0.086817,0.193172,0.181818,0.152564,0.207595,0.25436,0.167161,0.099794,0.220051,0.138549,0.145686,0.313531
Cs,0.033828,0.068155,0.027011,0.045902,0.003077,0.033286,0.003155,0.047913,0.118846,0.034091,0.07307,0.041604,0.02766,0.025838,0.053471,0.005643,0.02951,0.035052,0.033447,0.076183,0.007782,0.025853,0.005236,0.013934,0.002677,0.059701,0.040484,0.064215,0.001305,0.019284,0.057692,0.004921,0.076763,0.031746,0.051364,0.053476,0.027586,0.031579,0.064611,0.065975,...,0.043818,0.068438,0.051546,0.046185,0.025641,0.027936,0.055838,0.029956,0.043478,0.10275,0.022845,0.035043,0.063523,0.024941,0.021372,0.087398,0.064735,0.089595,0.05793,0.031434,0.048101,0.05013,0.014479,0.019115,0.043446,0.042458,0.039106,0.030194,0.032154,0.003331,0.045455,0.050641,0.04135,0.063953,0.03363,0.035788,0.059473,0.04401,0.026167,0.052805
UP,0.065182,0.095182,0.11954,0.098361,0.117692,0.056657,0.108833,0.064915,0.085739,0.07197,0.074249,0.111195,0.062766,0.09148,0.115385,0.109481,0.023474,0.047423,0.099659,0.106656,0.111543,0.069286,0.096859,0.053279,0.111111,0.096441,0.10673,0.05721,0.0953,0.02663,0.091629,0.125,0.051867,0.07483,0.05939,0.104813,0.06069,0.028289,0.095448,0.06032,...,0.045383,0.105475,0.066542,0.077644,0.066365,0.027366,0.1124,0.070485,0.092204,0.11288,0.050847,0.047863,0.060635,0.086698,0.065242,0.116531,0.07275,0.084778,0.050332,0.091028,0.096203,0.115817,0.068533,0.054326,0.101873,0.01676,0.094972,0.108555,0.017685,0.109908,0.076203,0.084615,0.04557,0.043605,0.070227,0.017894,0.04503,0.092095,0.016973,0.051155
VOX,0.049505,0.144536,0.087356,0.057377,0.006154,0.094193,0.003155,0.044822,0.185908,0.100379,0.176193,0.083964,0.029787,0.073324,0.06848,0.009029,0.074447,0.072165,0.081911,0.137931,0.014267,0.068252,0.006981,0.052459,0.004016,0.115959,0.081493,0.235844,0.009138,0.078972,0.10181,0.005906,0.228216,0.036281,0.043339,0.088235,0.017931,0.132237,0.0837,0.151744,...,0.150235,0.128019,0.196813,0.056225,0.024133,0.072976,0.081943,0.068722,0.090705,0.114327,0.082535,0.046154,0.249278,0.096793,0.085489,0.142276,0.094328,0.181118,0.128205,0.094957,0.082278,0.119274,0.05888,0.064386,0.08015,0.073743,0.096834,0.070453,0.130225,0.014988,0.078877,0.060256,0.081013,0.197674,0.04451,0.086029,0.146134,0.048085,0.08133,0.150165


To see how effective PCA is, we chose a lenient threshold, so we end up with a fairly long list of mutually distinct sections: 66 in total.

In [182]:
m = secc.corr()
lista_sec = secciones_corr(m, 0.999)

In [183]:
lista_sec

Index(['022019111053502603009', '022019111174619001001',
       '022019111053502605003', '022019111094312304005',
       '022019111142003003001', '022019111053502604002',
       '022019111142003003004', '022019111174619001044',
       '022019111053502602003', '022019111011403801003',
       '022019111053502603008', '022019111094312306008',
       '022019111024421603006', '022019111063903502001',
       '022019111063903501002', '022019111142003005003',
       '022019111113208501001', '022019111142003003003',
       '022019111174619001007', '022019111011403803002',
       '022019111024421602005', '022019111142003001005',
       '022019111011403802006', '022019111094312301002',
       '022019111094312305003', '022019111094312302009',
       '022019111024421604001', '022019111025006702005',
       '022019111142003005002', '022019111174619001046',
       '022019111094312307001', '022019111011403801001',
       '022019111142003004004', '022019111053502606010',
       '022019111174619001019',

In [184]:
lista_sec.shape

(66,)

In [185]:
lista_sec = np.sort(lista_sec)

We now keep only the sections selected above.

In [186]:
secciones_select_norm = secciones_select_norm[lista_sec]

In [187]:
secciones_select_norm

Sección,022019111011403801001,022019111011403801003,022019111011403802006,022019111011403803002,022019111011403803003,022019111011403803008,022019111011403804001,022019111011403805001,022019111011403805002,022019111024421602005,022019111024421603001,022019111024421603004,022019111024421603006,022019111024421603007,022019111024421603008,022019111024421603010,022019111024421604001,022019111025006702005,022019111053502602003,022019111053502603005,022019111053502603006,022019111053502603008,022019111053502603009,022019111053502604002,022019111053502605003,022019111053502605004,022019111053502606010,022019111063903501002,022019111063903502001,022019111063903502002,022019111063903502003,022019111063903503002,022019111094312301002,022019111094312302001,022019111094312302004,022019111094312302009,022019111094312302015,022019111094312303010,022019111094312304005,022019111094312305003,022019111094312306008,022019111094312307001,022019111094312308001,022019111094312308007,022019111113208501001,022019111142003001005,022019111142003002001,022019111142003002003,022019111142003003001,022019111142003003003,022019111142003003004,022019111142003004004,022019111142003005002,022019111142003005003,022019111174619001001,022019111174619001007,022019111174619001014,022019111174619001015,022019111174619001019,022019111174619001023,022019111174619001033,022019111174619001038,022019111174619001039,022019111174619001043,022019111174619001044,022019111174619001046
Votos_Total,0.765608,0.717737,0.711618,0.739054,0.725291,0.671593,0.630907,0.740924,0.690541,0.594123,0.73716,0.762027,0.745137,0.714032,0.711167,0.827822,0.724342,0.755891,0.598485,0.63388,0.563659,0.583207,0.471947,0.592068,0.618966,0.480534,0.595944,0.693899,0.74433,0.752395,0.732489,0.685652,0.724868,0.656514,0.626415,0.755862,0.817204,0.64813,0.72623,0.565008,0.673404,0.409565,0.62,0.701431,0.609836,0.684055,0.659905,0.695983,0.619231,0.716198,0.709779,0.801541,0.71553,0.671902,0.713278,0.761194,0.63548,0.687005,0.682953,0.607973,0.338078,0.802002,0.304175,0.784703,0.789474,0.660804
Nulos,0.009858,0.011196,0.011411,0.016929,0.010174,0.00821,0.007024,0.024752,0.012162,0.000918,0.003021,0.002577,0.006707,0.003552,0.003119,0.00114,0.003947,0.01508,0.006629,0.009563,0.009889,0.004539,0.006601,0.007082,0.006897,0.011123,0.014561,0.007239,0.007216,0.005158,0.006751,0.003286,0.003023,0.001692,0.001887,0.008276,0.002688,0.005753,0.009836,0.0,0.003191,0.002609,0.005333,0.005538,0.009836,0.005906,0.0,0.004343,0.003846,0.001339,0.001577,0.003854,0.0,0.00349,0.00235,0.010333,0.001215,0.003962,0.004343,0.001661,0.003559,0.007786,0.002982,0.008499,0.005093,0.00335
Votos_Válidos,0.75575,0.706541,0.700207,0.722125,0.715116,0.663383,0.623883,0.716172,0.678378,0.593205,0.734139,0.75945,0.738431,0.71048,0.708047,0.826682,0.720395,0.740811,0.591856,0.624317,0.55377,0.578669,0.465347,0.584986,0.612069,0.46941,0.581383,0.68666,0.737113,0.747237,0.725738,0.682366,0.721844,0.654822,0.624528,0.747586,0.814516,0.642378,0.716393,0.565008,0.670213,0.406957,0.614667,0.695893,0.6,0.67815,0.659905,0.69164,0.615385,0.714859,0.708202,0.797688,0.71553,0.668412,0.710928,0.750861,0.634265,0.683043,0.67861,0.606312,0.33452,0.794216,0.301193,0.776204,0.78438,0.657454
Blanco,0.003286,0.00825,0.011411,0.006421,0.001453,0.004926,0.005747,0.00495,0.006757,0.003673,0.003021,0.000859,0.001341,0.0,0.003119,0.004561,0.003947,0.004713,0.002841,0.005464,0.008653,0.003782,0.005776,0.003541,0.014368,0.0,0.0052,0.001034,0.0,0.000737,0.006751,0.003286,0.000756,0.006768,0.004717,0.001379,0.004032,0.002876,0.003279,0.006421,0.003191,0.00087,0.004,0.005076,0.002459,0.002953,0.002387,0.0076,0.003077,0.008032,0.003155,0.007707,0.0,0.002618,0.0,0.005741,0.00243,0.009509,0.004343,0.006645,0.003559,0.003337,0.003976,0.011331,0.010187,0.005863
V_Cand,0.752464,0.698291,0.688797,0.715703,0.713663,0.658456,0.618135,0.711221,0.671622,0.589532,0.731118,0.758591,0.737089,0.71048,0.704928,0.822121,0.716447,0.736098,0.589015,0.618852,0.545117,0.574887,0.459571,0.581445,0.597701,0.46941,0.576183,0.685626,0.737113,0.7465,0.718987,0.67908,0.721088,0.648054,0.619811,0.746207,0.810484,0.639501,0.713115,0.558587,0.667021,0.406087,0.610667,0.690817,0.597541,0.675197,0.657518,0.684039,0.612308,0.706827,0.705047,0.789981,0.71553,0.665794,0.710928,0.745121,0.631835,0.673534,0.674267,0.599668,0.330961,0.790879,0.297217,0.764873,0.774194,0.651591
PP,0.288061,0.152622,0.164938,0.129013,0.133721,0.142857,0.095147,0.133663,0.085135,0.096419,0.098691,0.231959,0.176392,0.108348,0.158453,0.119156,0.173026,0.224317,0.107955,0.178962,0.128554,0.066566,0.082508,0.193343,0.108046,0.054505,0.161726,0.124095,0.208247,0.249079,0.142616,0.154436,0.071807,0.028765,0.041509,0.074483,0.072581,0.078619,0.078689,0.05939,0.051064,0.046957,0.066667,0.091371,0.235246,0.042323,0.015513,0.085776,0.036154,0.021419,0.045741,0.065511,0.058055,0.045375,0.137485,0.203215,0.064399,0.083201,0.098806,0.068106,0.030842,0.240267,0.030815,0.19169,0.14601,0.142379
PSOE,0.106243,0.210371,0.158714,0.214828,0.25436,0.27422,0.113027,0.313531,0.347297,0.135904,0.152064,0.09622,0.140845,0.085258,0.101061,0.118586,0.124342,0.193214,0.163826,0.124317,0.15204,0.189864,0.15429,0.106232,0.154598,0.215795,0.164327,0.170631,0.147423,0.150332,0.207595,0.144578,0.103553,0.108291,0.161321,0.091034,0.120968,0.194631,0.167213,0.17817,0.093617,0.117391,0.202667,0.117213,0.178689,0.267717,0.126492,0.34962,0.186923,0.172691,0.173502,0.152216,0.133527,0.227749,0.206816,0.205511,0.238153,0.213946,0.281216,0.262458,0.179122,0.171301,0.143141,0.152975,0.171477,0.132328
Cs,0.071194,0.07307,0.076763,0.064215,0.063953,0.046798,0.05364,0.052805,0.054054,0.019284,0.02719,0.033505,0.02951,0.021314,0.025577,0.027936,0.031579,0.065975,0.034091,0.034153,0.035847,0.041604,0.033828,0.033286,0.027011,0.010011,0.026521,0.025853,0.035052,0.022845,0.04135,0.030668,0.031746,0.018613,0.035849,0.027586,0.051075,0.044104,0.045902,0.051364,0.02766,0.031304,0.061333,0.078911,0.013934,0.004921,0.0,0.006515,0.003077,0.002677,0.003155,0.009634,0.004354,0.005236,0.068155,0.059701,0.044957,0.075277,0.046688,0.021595,0.005931,0.095662,0.00497,0.114259,0.118846,0.069514
UP,0.05805,0.074249,0.051867,0.05721,0.043605,0.042693,0.066411,0.051155,0.108108,0.02663,0.02719,0.027491,0.023474,0.027531,0.019339,0.027366,0.028289,0.06032,0.07197,0.064208,0.071693,0.111195,0.065182,0.056657,0.11954,0.084538,0.074883,0.069286,0.047423,0.050847,0.04557,0.089814,0.07483,0.120135,0.073585,0.06069,0.100806,0.074784,0.098361,0.05939,0.062766,0.034783,0.053333,0.092755,0.053279,0.125,0.127685,0.07709,0.117692,0.111111,0.108833,0.127168,0.063861,0.096859,0.095182,0.096441,0.120292,0.156101,0.095548,0.114618,0.035587,0.086763,0.042744,0.067044,0.085739,0.108878
IU,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [188]:
secciones_select_norm.shape

(30, 66)

Now we import the PCA class from scikit-learn.

In [189]:
from sklearn.decomposition import PCA

We define the matrix X and convert it to a NumPy array, as usual. Its rows are the variables (turnout and party shares) and its columns are the 66 selected sections.

In [190]:
X = secciones_select_norm.values

In [191]:
X

array([[0.76560789, 0.71773718, 0.71161826, ..., 0.78470255, 0.78947368,
        0.66080402],
       [0.00985761, 0.01119623, 0.01141079, ..., 0.00849858, 0.00509338,
        0.00335008],
       [0.75575027, 0.70654095, 0.70020747, ..., 0.77620397, 0.78438031,
        0.65745394],
       ...,
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.00328587, 0.00589275, 0.00414938, ..., 0.01038716, 0.01018676,
        0.01088777],
       [0.00109529, 0.00589275, 0.00414938, ..., 0.00566572, 0.00169779,
        0.00586265]])

In [192]:
X.shape

(30, 66)

We instantiate the PCA model with 4 components.

In [193]:
pca = PCA(n_components = 4)

We run the fit and the transform.

In [194]:
pca.fit(X)

PCA(copy=True, iterated_power='auto', n_components=4, random_state=None,
    svd_solver='auto', tol=0.0, whiten=False)

In [51]:
X_pca = pca.transform(X)

As expected, the first principal component accounts for far more of the variance than the others: about 96.7%, against roughly 1% or less for each of the next three.

In [195]:
print(pca.explained_variance_ratio_)

[0.96720314 0.01054905 0.00808517 0.00563125]
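
As a quick check of how much variance the four components retain together, the ratios printed above can be accumulated. This sketch simply copies the printed values into an array; `ratios` and `cumulative` are illustrative names.

```python
import numpy as np

# Explained-variance ratios as printed by pca.explained_variance_ratio_
ratios = np.array([0.96720314, 0.01054905, 0.00808517, 0.00563125])

# Running total: share of variance retained by the first k components
cumulative = np.cumsum(ratios)
print(cumulative.round(4))
```

The last entry is the share retained by all four components, a little over 99%.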


In [196]:
print(pca.singular_values_)

[8.820591   0.92118234 0.80646124 0.67304045]
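
The singular values and the explained-variance ratios carry the same information: in scikit-learn the variance of each component is the squared singular value divided by `n_samples - 1`, so the squared singular values are proportional to the ratios. The sketch below checks this proportionality with the values printed above; the array names are illustrative.

```python
import numpy as np

# Values copied from the two outputs above
singular_values = np.array([8.820591, 0.92118234, 0.80646124, 0.67304045])
ratios = np.array([0.96720314, 0.01054905, 0.00808517, 0.00563125])

# Normalize both by their first entry so the common constant cancels out
from_singular = singular_values ** 2 / singular_values[0] ** 2
from_ratios = ratios / ratios[0]

print(from_singular.round(6))
print(from_ratios.round(6))
```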


We define the target vector `y`.

In [197]:
y = modelizacion['Modelización'].values

In [198]:
y

array([6.98612262e-01, 7.12735256e-03, 6.91484910e-01, 6.20127247e-03,
       6.85283637e-01, 1.46825935e-01, 1.93632983e-01, 4.69532699e-02,
       8.88163685e-02, 0.00000000e+00, 1.04393406e-01, 0.00000000e+00,
       1.70579513e-02, 0.00000000e+00, 2.49424505e-02, 1.51066850e-02,
       7.02380158e-03, 0.00000000e+00, 1.08254501e-02, 7.93006618e-03,
       0.00000000e+00, 3.55525972e-03, 0.00000000e+00, 5.64956457e-04,
       3.42962781e-03, 1.96663691e-03, 3.61955251e-04, 0.00000000e+00,
       6.48608784e-03, 5.41074637e-03])

Now we load the linear regression class from scikit-learn...

In [199]:
from sklearn.linear_model import LinearRegression

...and run the fit, obtaining an R² of about 99.95% using only the first 4 components.

In [200]:
reg = LinearRegression(fit_intercept = True).fit(X_pca, y)

In [201]:
reg.score(X_pca, y)

0.9995173580246297
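
Because both steps are linear, a PCA projection followed by a linear regression is equivalent to assigning one weight to each section. The sketch below illustrates this on random data; `X_toy`, `y_toy`, `pca_toy`, `reg_toy` and `weights` are hypothetical names, and the data are not the electoral data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_toy = rng.random(size=(30, 66))        # rows: variables, columns: sections
y_toy = X_toy @ rng.random(66) / 33.0    # hypothetical target

pca_toy = PCA(n_components=4).fit(X_toy)
Z = pca_toy.transform(X_toy)
reg_toy = LinearRegression().fit(Z, y_toy)

# Fold the PCA projection and the regression coefficients into one weight per section
weights = pca_toy.components_.T @ reg_toy.coef_

# Same prediction, computed directly from the centered original columns
direct = (X_toy - pca_toy.mean_) @ weights + reg_toy.intercept_
print(np.allclose(direct, reg_toy.predict(Z)))
```

Read this way, the model estimates the target as a weighted combination of the selected sections, with the weights constrained to the subspace spanned by the retained components.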

We can now make this a bit more sophisticated by using a pipeline that chains a scaler, a PCA with 10 components, and a linear regression. We can also compute the R² coefficient.

In [202]:
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

In [203]:
from sklearn.metrics import mean_squared_error

In [204]:
from sklearn.metrics import r2_score

We define the pipeline.

In [205]:
pipe_modelado = make_pipeline(StandardScaler(), PCA(n_components = 10), LinearRegression(fit_intercept=True))


We run the fit.

In [206]:
pipe_modelado.fit(X=X, y=y)

Pipeline(memory=None,
         steps=[('standardscaler',
                 StandardScaler(copy=True, with_mean=True, with_std=True)),
                ('pca',
                 PCA(copy=True, iterated_power='auto', n_components=10,
                     random_state=None, svd_solver='auto', tol=0.0,
                     whiten=False)),
                ('linearregression',
                 LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
                                  normalize=False))],
         verbose=False)

We generate the predictions with the predict() method.

In [207]:
predicciones = pipe_modelado.predict(X=X)
predicciones = predicciones.flatten()

In [208]:
r2 = r2_score(
            y_true  = y,
            y_pred  = predicciones
           )

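The R2 coefficient turns out to be extremely high, 99.996%. Since it is computed on the same 30 observations used for the fit, it measures goodness of fit rather than out-of-sample accuracy.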

In [209]:
r2

0.9999623518769771
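
The scaler, PCA and regression steps can be sketched end to end on synthetic data. This is only an illustration under assumptions: the array `X_demo` mimics a matrix with 30 rows (result categories) and 66 columns (sections), which is what the shapes in this notebook suggest, and every name ending in `_demo` is hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Assumed layout: 30 rows (categories) x 66 columns (sections), values as shares.
base = rng.uniform(0.0, 0.7, size=(30, 1))
X_demo = np.clip(base + rng.normal(scale=0.03, size=(30, 66)), 0.0, None)
y_demo = X_demo.mean(axis=1)  # synthetic target: the average share across sections

pipe_demo = make_pipeline(StandardScaler(), PCA(n_components=10), LinearRegression())
pipe_demo.fit(X_demo, y_demo)
pred_demo = pipe_demo.predict(X_demo)

# Fraction of the scaled variance retained by the 10 components.
explained = pipe_demo.named_steps["pca"].explained_variance_ratio_.sum()
print(r2_score(y_demo, pred_demo), explained)
```

Reading `explained_variance_ratio_` from the fitted PCA step is one way to judge whether 10 components are more than the data needs.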

We can now compare the predictions with the actual data. First we undo the normalization by multiplying by the electoral census of Spain (censo_mod).

In [210]:
est = predicciones * censo_mod

In [211]:
df = pd.DataFrame(est, index = secciones_select_norm.index, columns = ['Estimación']).astype('int32')

In [212]:
df

Unnamed: 0,Estimación
Votos_Total,24359895
Nulos,253469
Votos_Válidos,24105725
Blanco,178452
V_Cand,23926573
PP,5072187
PSOE,6749963
Cs,1796615
UP,2986509
IU,-699


In [213]:
df1 = pd.DataFrame(secciones_mod.sum(), columns = ['Real']).drop('Censo_Esc')

The comparison with the actual data is quite satisfactory: among the rows shown, the differences are at most about 0.66 percentage points (Cs). The small negative estimate for IU arises because a linear model does not constrain its predictions to be non-negative. Keep in mind that we selected 66 sections, taken in turn from arbitrarily chosen municipalities.

In [214]:
df['Real'] = df1['Real']

In [216]:
df['pc Estimación'] = df['Estimación'] / df.loc['Votos_Válidos', 'Estimación'] * 100

In [217]:
df['pc Real'] = df['Real'] / df.loc['Votos_Válidos', 'Real'] * 100

In [218]:
df['dif. Real-Est.'] = df['pc Real'] - df['pc Estimación']

In [219]:
df

Unnamed: 0,Estimación,Real,pc Estimación,pc Real,dif. Real-Est.
Votos_Total,24359895,24361807,101.054397,101.030731,-0.023665
Nulos,253469,248543,1.051489,1.030731,-0.020757
Votos_Válidos,24105725,24113264,100.0,100.0,0.0
Blanco,178452,216249,0.740289,0.896805,0.156516
V_Cand,23926573,23897015,99.256807,99.103195,-0.153612
PP,5072187,5120072,21.041421,21.233426,0.192005
PSOE,6749963,6752314,28.001493,28.002489,0.000995
Cs,1796615,1637341,7.453064,6.790209,-0.662855
UP,2986509,3097179,12.38921,12.844296,0.455086
IU,-699,0,-0.0029,0.0,0.0029
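
The percentage columns above are shares over valid votes. A small sketch of that arithmetic, using three rows copied from the table (the name `df_demo` is hypothetical), selects the base row by label rather than by position:

```python
import pandas as pd

# Figures copied from the comparison table above (three rows only).
df_demo = pd.DataFrame(
    {
        "Estimación": [24105725, 5072187, 6749963],
        "Real": [24113264, 5120072, 6752314],
    },
    index=["Votos_Válidos", "PP", "PSOE"],
)

# Percentage over valid votes; the denominator is picked by its row label.
for col in ["Estimación", "Real"]:
    df_demo[f"pc {col}"] = df_demo[col] / df_demo.loc["Votos_Válidos", col] * 100

df_demo["dif. Real-Est."] = df_demo["pc Real"] - df_demo["pc Estimación"]
print(df_demo.round(3))
```

Selecting by label keeps the calculation correct even if the row order changes.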


## Modeling the 2016 elections

Now we will see how well or badly the pipeline model fits when applied to the June 2016 elections, using the sections equivalent to the 66 used to estimate the November 2019 result.

We start by loading the section-equivalence dataset.

In [220]:
sim_secciones = pd.read_csv('/content/drive/MyDrive/Proyecto_KeepCoding - Propio/Data/similitud_secciones_def_REF.csv', dtype = 'str')

Now we select the rows that match the 66 sections we found in the previous chapter...

In [221]:
sec_select_J16 = sim_secciones.loc[sim_secciones['cod_sec_ref'].isin(lista_sec)]

In [222]:
sec_select_J16.shape

(66, 12)

...and pick their equivalents in the 2016 elections, which are these 66:

In [223]:
list_sec_J16 = list(sec_select_J16['cercana J16_ref'])

In [224]:
list_sec_J16 = np.sort(list_sec_J16)

In [225]:
list_sec_J16.shape

(66,)

In [226]:
secciones_select_norm = secciones_select_norm[lista_sec]

In [227]:
secciones_select_norm

Sección,022019111011403801001,022019111011403801003,022019111011403802006,022019111011403803002,022019111011403803003,022019111011403803008,022019111011403804001,022019111011403805001,022019111011403805002,022019111024421602005,022019111024421603001,022019111024421603004,022019111024421603006,022019111024421603007,022019111024421603008,022019111024421603010,022019111024421604001,022019111025006702005,022019111053502602003,022019111053502603005,022019111053502603006,022019111053502603008,022019111053502603009,022019111053502604002,022019111053502605003,022019111053502605004,022019111053502606010,022019111063903501002,022019111063903502001,022019111063903502002,022019111063903502003,022019111063903503002,022019111094312301002,022019111094312302001,022019111094312302004,022019111094312302009,022019111094312302015,022019111094312303010,022019111094312304005,022019111094312305003,022019111094312306008,022019111094312307001,022019111094312308001,022019111094312308007,022019111113208501001,022019111142003001005,022019111142003002001,022019111142003002003,022019111142003003001,022019111142003003003,022019111142003003004,022019111142003004004,022019111142003005002,022019111142003005003,022019111174619001001,022019111174619001007,022019111174619001014,022019111174619001015,022019111174619001019,022019111174619001023,022019111174619001033,022019111174619001038,022019111174619001039,022019111174619001043,022019111174619001044,022019111174619001046
Votos_Total,0.765608,0.717737,0.711618,0.739054,0.725291,0.671593,0.630907,0.740924,0.690541,0.594123,0.73716,0.762027,0.745137,0.714032,0.711167,0.827822,0.724342,0.755891,0.598485,0.63388,0.563659,0.583207,0.471947,0.592068,0.618966,0.480534,0.595944,0.693899,0.74433,0.752395,0.732489,0.685652,0.724868,0.656514,0.626415,0.755862,0.817204,0.64813,0.72623,0.565008,0.673404,0.409565,0.62,0.701431,0.609836,0.684055,0.659905,0.695983,0.619231,0.716198,0.709779,0.801541,0.71553,0.671902,0.713278,0.761194,0.63548,0.687005,0.682953,0.607973,0.338078,0.802002,0.304175,0.784703,0.789474,0.660804
Nulos,0.009858,0.011196,0.011411,0.016929,0.010174,0.00821,0.007024,0.024752,0.012162,0.000918,0.003021,0.002577,0.006707,0.003552,0.003119,0.00114,0.003947,0.01508,0.006629,0.009563,0.009889,0.004539,0.006601,0.007082,0.006897,0.011123,0.014561,0.007239,0.007216,0.005158,0.006751,0.003286,0.003023,0.001692,0.001887,0.008276,0.002688,0.005753,0.009836,0.0,0.003191,0.002609,0.005333,0.005538,0.009836,0.005906,0.0,0.004343,0.003846,0.001339,0.001577,0.003854,0.0,0.00349,0.00235,0.010333,0.001215,0.003962,0.004343,0.001661,0.003559,0.007786,0.002982,0.008499,0.005093,0.00335
Votos_Válidos,0.75575,0.706541,0.700207,0.722125,0.715116,0.663383,0.623883,0.716172,0.678378,0.593205,0.734139,0.75945,0.738431,0.71048,0.708047,0.826682,0.720395,0.740811,0.591856,0.624317,0.55377,0.578669,0.465347,0.584986,0.612069,0.46941,0.581383,0.68666,0.737113,0.747237,0.725738,0.682366,0.721844,0.654822,0.624528,0.747586,0.814516,0.642378,0.716393,0.565008,0.670213,0.406957,0.614667,0.695893,0.6,0.67815,0.659905,0.69164,0.615385,0.714859,0.708202,0.797688,0.71553,0.668412,0.710928,0.750861,0.634265,0.683043,0.67861,0.606312,0.33452,0.794216,0.301193,0.776204,0.78438,0.657454
Blanco,0.003286,0.00825,0.011411,0.006421,0.001453,0.004926,0.005747,0.00495,0.006757,0.003673,0.003021,0.000859,0.001341,0.0,0.003119,0.004561,0.003947,0.004713,0.002841,0.005464,0.008653,0.003782,0.005776,0.003541,0.014368,0.0,0.0052,0.001034,0.0,0.000737,0.006751,0.003286,0.000756,0.006768,0.004717,0.001379,0.004032,0.002876,0.003279,0.006421,0.003191,0.00087,0.004,0.005076,0.002459,0.002953,0.002387,0.0076,0.003077,0.008032,0.003155,0.007707,0.0,0.002618,0.0,0.005741,0.00243,0.009509,0.004343,0.006645,0.003559,0.003337,0.003976,0.011331,0.010187,0.005863
V_Cand,0.752464,0.698291,0.688797,0.715703,0.713663,0.658456,0.618135,0.711221,0.671622,0.589532,0.731118,0.758591,0.737089,0.71048,0.704928,0.822121,0.716447,0.736098,0.589015,0.618852,0.545117,0.574887,0.459571,0.581445,0.597701,0.46941,0.576183,0.685626,0.737113,0.7465,0.718987,0.67908,0.721088,0.648054,0.619811,0.746207,0.810484,0.639501,0.713115,0.558587,0.667021,0.406087,0.610667,0.690817,0.597541,0.675197,0.657518,0.684039,0.612308,0.706827,0.705047,0.789981,0.71553,0.665794,0.710928,0.745121,0.631835,0.673534,0.674267,0.599668,0.330961,0.790879,0.297217,0.764873,0.774194,0.651591
PP,0.288061,0.152622,0.164938,0.129013,0.133721,0.142857,0.095147,0.133663,0.085135,0.096419,0.098691,0.231959,0.176392,0.108348,0.158453,0.119156,0.173026,0.224317,0.107955,0.178962,0.128554,0.066566,0.082508,0.193343,0.108046,0.054505,0.161726,0.124095,0.208247,0.249079,0.142616,0.154436,0.071807,0.028765,0.041509,0.074483,0.072581,0.078619,0.078689,0.05939,0.051064,0.046957,0.066667,0.091371,0.235246,0.042323,0.015513,0.085776,0.036154,0.021419,0.045741,0.065511,0.058055,0.045375,0.137485,0.203215,0.064399,0.083201,0.098806,0.068106,0.030842,0.240267,0.030815,0.19169,0.14601,0.142379
PSOE,0.106243,0.210371,0.158714,0.214828,0.25436,0.27422,0.113027,0.313531,0.347297,0.135904,0.152064,0.09622,0.140845,0.085258,0.101061,0.118586,0.124342,0.193214,0.163826,0.124317,0.15204,0.189864,0.15429,0.106232,0.154598,0.215795,0.164327,0.170631,0.147423,0.150332,0.207595,0.144578,0.103553,0.108291,0.161321,0.091034,0.120968,0.194631,0.167213,0.17817,0.093617,0.117391,0.202667,0.117213,0.178689,0.267717,0.126492,0.34962,0.186923,0.172691,0.173502,0.152216,0.133527,0.227749,0.206816,0.205511,0.238153,0.213946,0.281216,0.262458,0.179122,0.171301,0.143141,0.152975,0.171477,0.132328
Cs,0.071194,0.07307,0.076763,0.064215,0.063953,0.046798,0.05364,0.052805,0.054054,0.019284,0.02719,0.033505,0.02951,0.021314,0.025577,0.027936,0.031579,0.065975,0.034091,0.034153,0.035847,0.041604,0.033828,0.033286,0.027011,0.010011,0.026521,0.025853,0.035052,0.022845,0.04135,0.030668,0.031746,0.018613,0.035849,0.027586,0.051075,0.044104,0.045902,0.051364,0.02766,0.031304,0.061333,0.078911,0.013934,0.004921,0.0,0.006515,0.003077,0.002677,0.003155,0.009634,0.004354,0.005236,0.068155,0.059701,0.044957,0.075277,0.046688,0.021595,0.005931,0.095662,0.00497,0.114259,0.118846,0.069514
UP,0.05805,0.074249,0.051867,0.05721,0.043605,0.042693,0.066411,0.051155,0.108108,0.02663,0.02719,0.027491,0.023474,0.027531,0.019339,0.027366,0.028289,0.06032,0.07197,0.064208,0.071693,0.111195,0.065182,0.056657,0.11954,0.084538,0.074883,0.069286,0.047423,0.050847,0.04557,0.089814,0.07483,0.120135,0.073585,0.06069,0.100806,0.074784,0.098361,0.05939,0.062766,0.034783,0.053333,0.092755,0.053279,0.125,0.127685,0.07709,0.117692,0.111111,0.108833,0.127168,0.063861,0.096859,0.095182,0.096441,0.120292,0.156101,0.095548,0.114618,0.035587,0.086763,0.042744,0.067044,0.085739,0.108878
IU,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


We now load the results of the June 2016 elections.

In [228]:
df_eleccion_comp_J16 = pd.read_csv('/content/drive/MyDrive/Proyecto_KeepCoding - Propio/Data/Gen-16-Jun/gen_J16_unif_cols_prov.txt', dtype = strings)

We select the sections to model, which are naturally those of all of Spain.

In [229]:
secciones_mod = df_eleccion_comp_J16

if len(ccaa_mod) > 0:

  secciones_mod = secciones_mod.loc[secciones_mod['CCAA'].isin(ccaa_mod)]

if len(provincia_mod) > 0:

  secciones_mod = secciones_mod.loc[secciones_mod['Provincia'].isin(provincia_mod)]

if len(municipio_mod) > 0:

  secciones_mod = secciones_mod.loc[secciones_mod['Municipio'].isin(municipio_mod)]
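
The three optional filters above follow the same pattern, so they could be folded into one helper. The function `filter_territory` below is a hypothetical name introduced only for this sketch; the toy rows reuse municipalities and census figures from the table shown in this notebook.

```python
import pandas as pd

def filter_territory(df, ccaa=(), provincias=(), municipios=()):
    """Keep rows matching every non-empty filter; empty filters are ignored."""
    filters = (("CCAA", ccaa), ("Provincia", provincias), ("Municipio", municipios))
    for column, values in filters:
        if len(values) > 0:
            df = df.loc[df[column].isin(values)]
    return df

demo = pd.DataFrame({
    "CCAA": ["Andalucía", "Andalucía", "Melilla"],
    "Provincia": ["Almería", "Almería", "Melilla"],
    "Municipio": ["Abla", "Adra", "Melilla"],
    "Censo_Esc": [1062, 666, 1510],
})

whole = filter_territory(demo)                           # no filters: all rows
almeria = filter_territory(demo, provincias=["Almería"]) # one province only
print(len(whole), len(almeria))
```

With all three lists empty, as when modeling the whole country, the data passes through unchanged.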

In [230]:
secciones_mod

Unnamed: 0,Sección,cod_ccaa,cod_prov,cod_mun,cod_sec,CCAA,Provincia,Municipio,Censo_Esc,Votos_Total,Participación,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros,...,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85-89,90-94,95-99,100 y más,Población Total,Hombres,Mujeres,% mayores 65 años,% 20-64 años,% menores 19 años,Afiliados SS Minicipio,% Afiliados SS autónomos,% Afiliados SS / Población,Paro Registrado Municipio,% Paro Hombres,% Paro mayores 45,% Paro s/ Afiliados SS Municipio,Renta persona 2017,Renta persona 2015,Renta hogar 2017,Renta hogar 2015,Renta Salarios 2018,Renta Salarios 2015,Renta Pensiones 2018,Renta Pensiones 2015,Renta Desempleo 2018,Renta Desempleo 2015,dict_res,dict_res_ord
0,022016061010400101001,01,04,04001,0400101001,Andalucía,Almería,Abla,1062,823,0.774953,9,814,5,809,267,356,110,65,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,3,...,66.0,92.0,87.0,83.0,104.0,88.0,91.0,64.0,61.0,72.0,89.0,42.0,13.0,0.0,1.0,1294.0,664.0,630.0,0.264297,0.594281,0.141422,267.0,0.292135,0.206337,184.0,0.461957,0.548913,0.407982,9159.0,8788.0,20172.0,19546.0,5574.0,4833.0,3286.0,3082.0,403.0,471.0,"{'PP': 267, 'PSOE': 356, 'Cs': 110, 'UP': 65, ...","[('PSOE', 356), ('PP', 267), ('Cs', 110), ('UP..."
1,022016061010400201001,01,04,04002,0400201001,Andalucía,Almería,Abrucena,1040,748,0.719231,8,740,2,738,212,342,93,79,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,6,...,63.0,73.0,65.0,91.0,97.0,95.0,106.0,61.0,57.0,74.0,67.0,42.0,8.0,2.0,0.0,1208.0,615.0,593.0,0.257450,0.612583,0.129967,331.0,0.238671,0.274007,183.0,0.437158,0.573770,0.356031,8827.0,8107.0,17841.0,17115.0,4640.0,4048.0,3418.0,2770.0,568.0,620.0,"{'PP': 212, 'PSOE': 342, 'Cs': 93, 'UP': 79, '...","[('PSOE', 342), ('PP', 212), ('Cs', 93), ('UP'..."
2,022016061010400301001,01,04,04003,0400301001,Andalucía,Almería,Adra,666,487,0.731231,7,480,2,478,266,112,48,46,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,1,...,54.0,56.0,53.0,76.0,82.0,72.0,52.0,42.0,36.0,35.0,28.0,4.0,3.0,0.0,1.0,878.0,421.0,457.0,0.169704,0.636674,0.193622,7032.0,0.404863,8.009112,3663.0,0.395304,0.428610,0.342496,8965.0,8267.0,26498.0,24688.0,5121.0,4795.0,2499.0,2301.0,337.0,333.0,"{'PP': 266, 'PSOE': 112, 'Cs': 48, 'UP': 46, '...","[('PP', 266), ('PSOE', 112), ('Cs', 48), ('UP'..."
3,022016061010400301002,01,04,04003,0400301002,Andalucía,Almería,Adra,1264,867,0.685918,3,864,4,860,436,211,102,101,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,...,146.0,179.0,139.0,130.0,133.0,122.0,84.0,44.0,44.0,39.0,18.0,15.0,3.0,0.0,0.0,1693.0,839.0,854.0,0.096279,0.669226,0.234495,7032.0,0.404863,4.153574,3663.0,0.395304,0.428610,0.342496,8599.0,7941.0,25677.0,23400.0,5381.0,4837.0,1815.0,1724.0,343.0,464.0,"{'PP': 436, 'PSOE': 211, 'Cs': 102, 'UP': 101,...","[('PP', 436), ('PSOE', 211), ('Cs', 102), ('UP..."
4,022016061010400301003,01,04,04003,0400301003,Andalucía,Almería,Adra,1439,952,0.661571,9,943,10,933,512,214,111,85,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,3,...,173.0,209.0,234.0,166.0,142.0,91.0,74.0,60.0,49.0,46.0,48.0,13.0,8.0,1.0,0.0,2149.0,1037.0,1112.0,0.104700,0.645882,0.249418,7032.0,0.404863,3.272220,3663.0,0.395304,0.428610,0.342496,8076.0,7150.0,22051.0,19687.0,5224.0,4044.0,1170.0,1198.0,416.0,476.0,"{'PP': 512, 'PSOE': 214, 'Cs': 111, 'UP': 85, ...","[('PP', 512), ('PSOE', 214), ('Cs', 111), ('UP..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
36188,022016061195200108011,19,52,52001,5200108011,Melilla,Melilla,Melilla,1510,860,0.569536,12,848,6,842,401,172,158,98,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,...,173.0,165.0,158.0,162.0,150.0,181.0,120.0,68.0,40.0,35.0,21.0,15.0,6.0,1.0,0.0,2296.0,1141.0,1155.0,0.081010,0.633275,0.285714,20976.0,0.198322,9.135889,14748.0,0.409547,0.361676,0.412832,16433.0,15847.0,66352.0,62632.0,11378.0,11119.0,1508.0,1274.0,167.0,166.0,"{'PP': 401, 'PSOE': 172, 'Cs': 158, 'UP': 98, ...","[('PP', 401), ('PSOE', 172), ('Cs', 158), ('UP..."
36189,022016061195200108012,19,52,52001,5200108012,Melilla,Melilla,Melilla,1692,1109,0.655437,12,1097,14,1083,646,175,155,81,0,0,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,17,0,...,193.0,174.0,177.0,172.0,186.0,164.0,122.0,100.0,95.0,59.0,55.0,22.0,5.0,1.0,0.0,2334.0,1169.0,1165.0,0.144387,0.634533,0.221080,20976.0,0.198322,8.987147,14748.0,0.409547,0.361676,0.412832,17350.0,16792.0,50730.0,50839.0,13272.0,13038.0,2763.0,2445.0,169.0,177.0,"{'PP': 646, 'PSOE': 175, 'Cs': 155, 'UP': 81, ...","[('PP', 646), ('PSOE', 175), ('Cs', 155), ('UP..."
36190,022016061195200108013,19,52,52001,5200108013,Melilla,Melilla,Melilla,1167,627,0.537275,5,622,9,613,317,133,93,58,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,1,...,179.0,158.0,131.0,113.0,123.0,149.0,98.0,55.0,44.0,23.0,31.0,9.0,2.0,0.0,0.0,1810.0,984.0,826.0,0.090608,0.692265,0.217127,20976.0,0.198322,11.588950,14748.0,0.409547,0.361676,0.412832,12553.0,11823.0,37816.0,36729.0,10102.0,9640.0,1807.0,1615.0,234.0,252.0,"{'PP': 317, 'PSOE': 133, 'Cs': 93, 'UP': 58, '...","[('PP', 317), ('PSOE', 133), ('Cs', 93), ('UP'..."
36191,022016061195200108014,19,52,52001,5200108014,Melilla,Melilla,Melilla,947,486,0.513200,6,480,6,474,279,100,43,39,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,0,...,135.0,126.0,71.0,64.0,63.0,92.0,92.0,72.0,57.0,35.0,14.0,4.0,2.0,0.0,0.0,1339.0,640.0,699.0,0.137416,0.604929,0.257655,20976.0,0.198322,15.665422,14748.0,0.409547,0.361676,0.412832,8906.0,8937.0,29898.0,31384.0,5923.0,6061.0,2463.0,2136.0,244.0,284.0,"{'PP': 279, 'PSOE': 100, 'Cs': 43, 'UP': 39, '...","[('PP', 279), ('PSOE', 100), ('Cs', 43), ('UP'..."


In [231]:
censo_mod = secciones_mod['Censo_Esc'].sum()

In [232]:
censo_mod

34595051

We proceed in the same way: we sum the results, normalize them, and store them in a DataFrame.

In [233]:
secciones_mod = secciones_mod[cols_validas_mod]

In [234]:
modelizacion = pd.DataFrame(secciones_mod.sum(), columns = ['Modelización'])
modelizacion['Modelización'] = modelizacion['Modelización'] / modelizacion['Modelización']['Censo_Esc']
modelizacion = modelizacion.drop(['Censo_Esc']) 

In [235]:
modelizacion

Unnamed: 0,Modelización
Votos_Total,0.698307
Nulos,0.006491
Votos_Válidos,0.691816
Blanco,0.005161
V_Cand,0.686655
PP,0.228552
PSOE,0.156789
Cs,0.090286
UP,0.146014
IU,0.0


In [236]:
modelizacion.shape

(30, 1)
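
As a worked example of this normalization, the sketch below sums three sections from the June 2016 table shown earlier (only a few columns are kept) and divides every total by the summed census. The names `demo`, `totals` and `shares` are hypothetical.

```python
import pandas as pd

# Three sections copied from the June 2016 table (subset of columns).
demo = pd.DataFrame({
    "Censo_Esc": [1062, 1040, 666],
    "Votos_Total": [823, 748, 487],
    "PP": [267, 212, 266],
    "PSOE": [356, 342, 112],
})

totals = demo.sum()
# Each total becomes a share of the census; the census row itself is dropped.
shares = (totals / totals["Censo_Esc"]).drop("Censo_Esc")
print(shares)
```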

We check that the number of equivalent sections is the same, 66, that the model expects. This matters because in one of our tests it was not: two of the 2019 sections had the same 2016 section as their equivalent. In that case .isin() could not be used, because it returns each matching row only once, so we resorted to a for loop.

In [237]:
np.unique(list_sec_J16).shape

(66,)
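
The behavior that caused the problem can be reproduced with a toy table. In this sketch the section codes "A", "B" and "C" are made up, and `wanted` plays the role of a list of equivalents in which one 2016 section appears twice.

```python
import pandas as pd

demo = pd.DataFrame({"Sección": ["A", "B", "C"], "Censo_Esc": [884, 1743, 945]})

# Hypothetical case: two 2019 sections map to the same 2016 section "B".
wanted = ["A", "B", "B"]

# isin() is a membership test, so each matching row is returned once.
selected = demo.loc[demo["Sección"].isin(wanted)]

# A quick check for repeats before relying on isin().
has_repeats = len(set(wanted)) != len(wanted)
print(len(wanted), len(selected), has_repeats)
```

Three codes are requested but only two rows come back, which is why the count is compared against the number the model expects.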

In [238]:
secciones_select = df_eleccion_comp_J16.loc[df_eleccion_comp_J16['Sección'].isin(list_sec_J16)]

In [239]:
secciones_select = secciones_select[col_validas_select]

In [240]:
secciones_select

Unnamed: 0,Sección,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros
1765,022016061011403801001,884,689,7,682,2,680,426,79,104,67,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1
1767,022016061011403801003,1743,1180,9,1171,15,1156,450,377,170,144,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,5
1776,022016061011403802006,945,637,9,628,8,620,273,181,102,52,0,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,3
1782,022016061011403803002,1682,1177,15,1162,9,1153,410,410,215,95,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,6
1783,022016061011403803003,710,472,4,468,2,466,179,194,54,33,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35164,022016061174619001038,856,682,3,679,3,676,254,105,139,152,0,3,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,9
35165,022016061174619001039,983,365,5,360,3,357,63,115,26,139,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,5
35169,022016061174619001043,997,786,3,783,3,780,286,105,226,141,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,13
35170,022016061174619001044,1099,883,4,879,3,876,279,122,226,217,0,1,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,17,10


In [241]:
secciones_select.shape

(66, 32)

A brief aside to explain what we did when .isin() could not be used.

In [110]:
secciones_select.iloc[0, ]

Sección          022016061011403801001
Censo_Esc                          884
Votos_Total                        689
Nulos                                7
Votos_Válidos                      682
Blanco                               2
V_Cand                             680
PP                                 426
PSOE                                79
Cs                                 104
UP                                  67
IU                                   0
VOX                                  1
UPyD                                 1
MP                                   0
CiU                                  0
ERC                                  0
JxC                                  0
CUP                                  0
DiL                                  0
PNV                                  0
Bildu                                0
Amaiur                               0
CC                                   0
FA                                   0
TE                       

In that case we create a dummy DataFrame, dff, whose single initial row defines the columns.



In [118]:
dff = pd.DataFrame(secciones_select.iloc[0, ]).T

In [119]:
dff

Unnamed: 0,Sección,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros
1765,022016061011403801001,884,689,7,682,2,680,426,79,104,67,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1


...and apply a for loop to select the rows of the equivalent sections.

In [120]:
for sec in list_sec_J16:

  row = df_eleccion_comp_J16.loc[df_eleccion_comp_J16['Sección'] == sec]

  # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
  dff = pd.concat([dff, row])



In [121]:
dff

Unnamed: 0,Sección,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros,cod_ccaa,cod_prov,cod_mun,cod_sec,CCAA,Provincia,Municipio,Participación,...,30-34,35-39,40-44,45-49,50-54,55-59,60-64,65-69,70-74,75-79,80-84,85-89,90-94,95-99,100 y más,Población Total,Hombres,Mujeres,% mayores 65 años,% 20-64 años,% menores 19 años,Afiliados SS Minicipio,% Afiliados SS autónomos,% Afiliados SS / Población,Paro Registrado Municipio,% Paro Hombres,% Paro mayores 45,% Paro s/ Afiliados SS Municipio,Renta persona 2017,Renta persona 2015,Renta hogar 2017,Renta hogar 2015,Renta Salarios 2018,Renta Salarios 2015,Renta Pensiones 2018,Renta Pensiones 2015,Renta Desempleo 2018,Renta Desempleo 2015,dict_res,dict_res_ord
1765,022016061011403801001,884,689,7,682,2,680,426,79,104,67,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1765,022016061011403801001,884,689,7,682,2,680,426,79,104,67,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,01,14,14038,1403801001,Andalucía,Córdoba,Lucena,0.779412,...,49.0,72.0,81.0,78.0,86.0,70.0,63.0,67.0,57.0,42.0,45.0,35.0,14.0,5.0,0.0,1124.0,541.0,583.0,0.235765,0.577402,0.186833,14787.0,0.204031,13.155694,7858.0,0.377704,0.461441,0.347008,11025.0,10319.0,29436.0,28262.0,6487.0,5455.0,3588.0,3238.0,225.0,238.0,"{'PP': 426, 'PSOE': 79, 'Cs': 104, 'UP': 67, '...","[('PP', 426), ('Cs', 104), ('PSOE', 79), ('UP'..."
1767,022016061011403801003,1743,1180,9,1171,15,1156,450,377,170,144,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,5,01,14,14038,1403801003,Andalucía,Córdoba,Lucena,0.676994,...,144.0,206.0,174.0,172.0,152.0,123.0,93.0,89.0,79.0,84.0,80.0,46.0,28.0,1.0,0.0,2270.0,1093.0,1177.0,0.179295,0.596035,0.224670,14787.0,0.204031,6.514097,7858.0,0.377704,0.461441,0.347008,7565.0,7095.0,19944.0,18935.0,5166.0,3985.0,2004.0,1940.0,378.0,483.0,"{'PP': 450, 'PSOE': 377, 'Cs': 170, 'UP': 144,...","[('PP', 450), ('PSOE', 377), ('Cs', 170), ('UP..."
1770,022016061011403801006,673,448,0,448,1,447,190,121,77,51,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,4,01,14,14038,1403801006,Andalucía,Córdoba,Lucena,0.665676,...,59.0,79.0,72.0,58.0,58.0,39.0,38.0,33.0,42.0,38.0,33.0,21.0,10.0,4.0,0.0,875.0,432.0,443.0,0.206857,0.581714,0.211429,14787.0,0.204031,16.899429,7858.0,0.377704,0.461441,0.347008,7645.0,7696.0,19281.0,19901.0,5199.0,4306.0,2269.0,2247.0,371.0,392.0,"{'PP': 190, 'PSOE': 121, 'Cs': 77, 'UP': 51, '...","[('PP', 190), ('PSOE', 121), ('Cs', 77), ('UP'..."
1774,022016061011403802004,1050,678,12,666,5,661,200,252,103,96,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,2,01,14,14038,1403802004,Andalucía,Córdoba,Lucena,0.645714,...,71.0,129.0,164.0,165.0,115.0,78.0,41.0,36.0,48.0,30.0,23.0,14.0,2.0,1.0,0.0,1488.0,758.0,730.0,0.103495,0.620968,0.275538,14787.0,0.204031,9.937500,7858.0,0.377704,0.461441,0.347008,6266.0,5786.0,17705.0,16493.0,4644.0,3522.0,1420.0,1343.0,494.0,595.0,"{'PP': 200, 'PSOE': 252, 'Cs': 103, 'UP': 96, ...","[('PSOE', 252), ('PP', 200), ('Cs', 103), ('UP..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35161,022016061174619001035,1347,1101,0,1101,6,1095,501,133,237,196,0,3,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,11,17,46,46190,4619001035,La Rioja,Valencia,Paterna,0.817372,...,84.0,139.0,131.0,164.0,155.0,161.0,135.0,87.0,53.0,28.0,15.0,18.0,7.0,4.0,1.0,1824.0,945.0,879.0,0.116776,0.632675,0.250548,39890.0,0.111782,21.869518,8268.0,0.428882,0.450895,0.171685,18111.0,17313.0,54880.0,53765.0,15839.0,14414.0,3167.0,2473.0,198.0,256.0,"{'PP': 501, 'PSOE': 133, 'Cs': 237, 'UP': 196,...","[('PP', 501), ('Cs', 237), ('UP', 196), ('PSOE..."
35163,022016061174619001037,1458,1190,4,1186,8,1178,488,138,243,283,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,20,4,17,46,46190,4619001037,La Rioja,Valencia,Paterna,0.816187,...,96.0,131.0,143.0,109.0,141.0,196.0,162.0,125.0,78.0,28.0,25.0,17.0,10.0,2.0,0.0,1885.0,933.0,952.0,0.151194,0.647215,0.201592,39890.0,0.111782,21.161804,8268.0,0.428882,0.450895,0.171685,17278.0,17731.0,50974.0,53220.0,13897.0,13206.0,4196.0,3544.0,170.0,264.0,"{'PP': 488, 'PSOE': 138, 'Cs': 243, 'UP': 283,...","[('PP', 488), ('UP', 283), ('Cs', 243), ('PSOE..."
35165,022016061174619001039,983,365,5,360,3,357,63,115,26,139,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,5,17,46,46190,4619001039,La Rioja,Valencia,Paterna,0.371312,...,117.0,131.0,116.0,103.0,92.0,61.0,56.0,43.0,20.0,11.0,8.0,4.0,0.0,0.0,1.0,1630.0,784.0,846.0,0.053374,0.560123,0.386503,39890.0,0.111782,24.472393,8268.0,0.428882,0.450895,0.171685,3431.0,3281.0,12087.0,11388.0,2253.0,1349.0,809.0,750.0,385.0,399.0,"{'PP': 63, 'PSOE': 115, 'Cs': 26, 'UP': 139, '...","[('UP', 139), ('PSOE', 115), ('PP', 63), ('Cs'..."
35166,022016061174619001040,1030,796,3,793,6,787,268,117,146,229,0,1,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,9,17,46,46190,4619001040,La Rioja,Valencia,Paterna,0.772816,...,74.0,90.0,120.0,125.0,123.0,113.0,101.0,68.0,43.0,34.0,31.0,19.0,4.0,5.0,0.0,1353.0,688.0,665.0,0.150776,0.633407,0.215817,39890.0,0.111782,29.482631,8268.0,0.428882,0.450895,0.171685,14789.0,13849.0,39446.0,36522.0,12727.0,11327.0,2937.0,2601.0,187.0,246.0,"{'PP': 268, 'PSOE': 117, 'Cs': 146, 'UP': 229,...","[('PP', 268), ('UP', 229), ('Cs', 146), ('PSOE..."


...and we keep all rows except the first, which we used to create the dummy.

In [122]:
secciones_select = dff.iloc[1:,]
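
As a possible alternative to the dummy row and the loop, label-based selection with a list also repeats rows when a code appears more than once. This is a sketch on made-up codes, assuming the section code is unique in the source table; it is not the approach used in this notebook.

```python
import pandas as pd

demo = pd.DataFrame({"Sección": ["A", "B", "C"], "Censo_Esc": [884, 1743, 945]})
wanted = ["A", "B", "B"]  # hypothetical list with a repeated equivalent

# .loc with a list of labels follows the order of `wanted` and repeats rows as needed.
selected = demo.set_index("Sección").loc[wanted].reset_index()
print(selected)
```

Note that `.loc` raises a `KeyError` if any requested code is missing from the index, which makes a bad equivalence visible early.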

But in this case it is not necessary.

In [242]:
secciones_select

Unnamed: 0,Sección,Censo_Esc,Votos_Total,Nulos,Votos_Válidos,Blanco,V_Cand,PP,PSOE,Cs,UP,IU,VOX,UPyD,MP,CiU,ERC,JxC,CUP,DiL,PNV,Bildu,Amaiur,CC,FA,TE,BNG,PRC,GBai,Compromis,PACMA,Otros
1765,022016061011403801001,884,689,7,682,2,680,426,79,104,67,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1
1767,022016061011403801003,1743,1180,9,1171,15,1156,450,377,170,144,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,5
1776,022016061011403802006,945,637,9,628,8,620,273,181,102,52,0,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,3
1782,022016061011403803002,1682,1177,15,1162,9,1153,410,410,215,95,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,6
1783,022016061011403803003,710,472,4,468,2,466,179,194,54,33,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35164,022016061174619001038,856,682,3,679,3,676,254,105,139,152,0,3,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,9
35165,022016061174619001039,983,365,5,360,3,357,63,115,26,139,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,5
35169,022016061174619001043,997,786,3,783,3,780,286,105,226,141,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,13
35170,022016061174619001044,1099,883,4,879,3,876,279,122,226,217,0,1,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,17,10


In [243]:
secciones_select_norm = secciones_select.copy()

And now we simply normalize and transpose.

In [244]:
for col in secciones_select_norm.columns:

  if col not in set_cols:
    
    secciones_select_norm[col] = secciones_select_norm[col] / secciones_select_norm['Censo_Esc']

secciones_select_norm = secciones_select_norm.set_index('Sección')
secciones_select_norm = secciones_select_norm.drop('Censo_Esc', axis = 1)

secciones_select_norm = secciones_select_norm.T
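
The same normalization can be written without a column loop. The sketch below uses two sections copied from the table above and assumes that the columns excluded by `set_cols` are just the section code and the census; the names `demo` and `norm` are hypothetical.

```python
import pandas as pd

# Two sections copied from the June 2016 selection (subset of columns).
demo = pd.DataFrame({
    "Sección": ["022016061011403801001", "022016061011403801003"],
    "Censo_Esc": [884, 1743],
    "Votos_Total": [689, 1180],
    "PP": [426, 450],
})

norm = demo.set_index("Sección")
# Divide every column by the section's census (row-wise), drop the census, transpose.
norm = norm.div(norm["Censo_Esc"], axis=0).drop(columns="Censo_Esc").T
print(norm.round(6))
```

The result has one row per result category and one column per section, the layout the model expects.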

In [245]:
secciones_select_norm

Sección,022016061011403801001,022016061011403801003,022016061011403802006,022016061011403803002,022016061011403803003,022016061011403803008,022016061011403804001,022016061011403805001,022016061011403805002,022016061024421602005,022016061024421603001,022016061024421603004,022016061024421603006,022016061024421603007,022016061024421603008,022016061024421603010,022016061024421604001,022016061025006702005,022016061053502602003,022016061053502603005,022016061053502603006,022016061053502603008,022016061053502603009,022016061053502604002,022016061053502605003,022016061053502605004,022016061053502606010,022016061063903501002,022016061063903502001,022016061063903502003,022016061063903502005,022016061063903503002,022016061094312301002,022016061094312302001,022016061094312302004,022016061094312302009,022016061094312302015,022016061094312303010,022016061094312304005,022016061094312305003,022016061094312306008,022016061094312307001,022016061094312308001,022016061094312308007,022016061113208501001,022016061142003001005,022016061142003002001,022016061142003002003,022016061142003003001,022016061142003003003,022016061142003003004,022016061142003004004,022016061142003005002,022016061142003005003,022016061174619001001,022016061174619001007,022016061174619001014,022016061174619001015,022016061174619001019,022016061174619001023,022016061174619001033,022016061174619001038,022016061174619001039,022016061174619001043,022016061174619001044,022016061174619001046
Votos_Total,0.779412,0.676994,0.674074,0.699762,0.664789,0.651399,0.629556,0.746622,0.717105,0.613321,0.679719,0.762215,0.739779,0.661352,0.696521,0.78359,0.695652,0.752363,0.640118,0.694051,0.635152,0.630137,0.554688,0.638589,0.672225,0.539754,0.633117,0.689349,0.780039,0.759434,0.62565,0.706422,0.680219,0.590494,0.553571,0.70442,0.726902,0.573709,0.675941,0.559801,0.584416,0.401802,0.580282,0.637044,0.686579,0.651558,0.636577,0.688525,0.581624,0.678981,0.658647,0.742593,0.679767,0.673276,0.683041,0.780488,0.625882,0.724026,0.70282,0.635762,0.391254,0.796729,0.371312,0.788365,0.803458,0.689805
Nulos,0.007919,0.005164,0.009524,0.008918,0.005634,0.003393,0.004639,0.006757,0.013158,0.013876,0.007028,0.006515,0.004543,0.013194,0.010363,0.00826,0.012422,0.009452,0.007866,0.012748,0.010909,0.008371,0.007031,0.011519,0.010926,0.005599,0.005952,0.003945,0.017442,0.011792,0.005199,0.008519,0.007819,0.0,0.004699,0.005525,0.001359,0.005634,0.00982,0.008306,0.004329,0.000901,0.0,0.002996,0.015548,0.001889,0.001172,0.003074,0.004102,0.001274,0.004511,0.001852,0.005822,0.003448,0.002339,0.00813,0.007059,0.00487,0.017354,0.003311,0.002301,0.003505,0.005086,0.003009,0.00364,0.002169
Votos_Válidos,0.771493,0.67183,0.66455,0.690844,0.659155,0.648007,0.624917,0.739865,0.703947,0.599445,0.672691,0.7557,0.735237,0.648158,0.686158,0.77533,0.68323,0.742911,0.632252,0.681303,0.624242,0.621766,0.547656,0.62707,0.6613,0.534155,0.627165,0.685404,0.762597,0.747642,0.620451,0.697903,0.6724,0.590494,0.548872,0.698895,0.725543,0.568075,0.666121,0.551495,0.580087,0.400901,0.580282,0.634049,0.671031,0.649669,0.635404,0.685451,0.577523,0.677707,0.654135,0.740741,0.673945,0.669828,0.680702,0.772358,0.618824,0.719156,0.685466,0.63245,0.388953,0.793224,0.366226,0.785356,0.799818,0.687636
Blanco,0.002262,0.008606,0.008466,0.005351,0.002817,0.005089,0.005964,0.010135,0.001316,0.0037,0.006024,0.007329,0.007787,0.010995,0.005922,0.004956,0.005521,0.003781,0.001967,0.001416,0.006061,0.002283,0.003125,0.00216,0.00345,0.008959,0.004329,0.001972,0.010659,0.004717,0.008666,0.011796,0.002346,0.0,0.007519,0.006906,0.004076,0.003756,0.003273,0.001661,0.002165,0.001802,0.0,0.003994,0.009002,0.002833,0.003517,0.01127,0.003281,0.005096,0.001504,0.001852,0.002911,0.006897,0.004678,0.005807,0.002353,0.008117,0.001085,0.003311,0.005754,0.003505,0.003052,0.003009,0.00273,0.002169
V_Cand,0.769231,0.663224,0.656085,0.685493,0.656338,0.642918,0.618953,0.72973,0.702632,0.595745,0.666667,0.748371,0.72745,0.637163,0.680237,0.770374,0.677709,0.73913,0.630285,0.679887,0.618182,0.619482,0.544531,0.62491,0.657849,0.525196,0.622835,0.683432,0.751938,0.742925,0.611785,0.686107,0.670055,0.590494,0.541353,0.691989,0.721467,0.564319,0.662848,0.549834,0.577922,0.399099,0.580282,0.630055,0.662029,0.646837,0.631887,0.67418,0.574241,0.672611,0.652632,0.738889,0.671033,0.662931,0.676023,0.766551,0.616471,0.711039,0.684382,0.629139,0.383199,0.78972,0.363174,0.782347,0.797088,0.685466
PP,0.4819,0.258176,0.288889,0.243757,0.252113,0.227311,0.247184,0.211149,0.143421,0.207216,0.227912,0.43241,0.338741,0.212754,0.30644,0.265969,0.31608,0.294896,0.253687,0.259207,0.236364,0.15449,0.196875,0.327574,0.19724,0.164614,0.267316,0.220907,0.321705,0.238208,0.318891,0.25557,0.11337,0.078611,0.098684,0.116022,0.092391,0.131455,0.124386,0.104651,0.080087,0.110811,0.152113,0.136795,0.305237,0.076487,0.028136,0.094262,0.041017,0.056051,0.066165,0.092593,0.078603,0.075,0.231579,0.2741,0.131765,0.157468,0.173536,0.107616,0.079402,0.296729,0.06409,0.286861,0.253867,0.203905
PSOE,0.089367,0.216294,0.191534,0.243757,0.273239,0.286684,0.15507,0.381757,0.331579,0.160037,0.206827,0.119707,0.160934,0.140187,0.156181,0.1663,0.144928,0.173913,0.115044,0.134561,0.130909,0.163623,0.124219,0.102232,0.162737,0.152296,0.13961,0.197239,0.221899,0.262972,0.185442,0.153997,0.074277,0.078611,0.119361,0.062155,0.080163,0.133333,0.12766,0.171096,0.047619,0.128829,0.2,0.089865,0.164484,0.190746,0.104338,0.305328,0.140279,0.147771,0.126316,0.114815,0.084425,0.172414,0.132164,0.140534,0.171765,0.144481,0.22885,0.225166,0.120829,0.122664,0.116989,0.105316,0.11101,0.110629
Cs,0.117647,0.097533,0.107937,0.127824,0.076056,0.061917,0.117296,0.069257,0.040789,0.096207,0.098394,0.104235,0.110318,0.126993,0.131754,0.169053,0.100759,0.146503,0.100295,0.103399,0.049697,0.064688,0.050781,0.067675,0.094307,0.040314,0.061688,0.11144,0.094961,0.107311,0.091854,0.092398,0.053167,0.053016,0.033835,0.041436,0.08288,0.088263,0.109656,0.063123,0.047619,0.040541,0.102817,0.138792,0.03928,0.014164,0.010551,0.026639,0.012305,0.016561,0.004511,0.014815,0.016012,0.016379,0.116959,0.125436,0.056471,0.112825,0.069414,0.033113,0.025316,0.162383,0.02645,0.22668,0.205641,0.143167
UP,0.075792,0.082616,0.055026,0.05648,0.046479,0.055131,0.083499,0.050676,0.177632,0.119334,0.119478,0.080619,0.103829,0.144035,0.079941,0.154185,0.101449,0.10586,0.129794,0.144476,0.14303,0.180365,0.107031,0.091433,0.165037,0.119821,0.104437,0.140039,0.09593,0.115566,0.0,0.167104,0.110242,0.133455,0.100564,0.088398,0.142663,0.116432,0.145663,0.091362,0.097403,0.075676,0.069014,0.111333,0.124386,0.233239,0.223916,0.142418,0.209188,0.166879,0.193985,0.183333,0.139738,0.223276,0.167251,0.210221,0.225882,0.260552,0.190889,0.236755,0.132336,0.17757,0.141404,0.141424,0.197452,0.209328
IU,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [246]:
secciones_select_norm.shape

(30, 66)

We can now apply the model; to do so we define the feature matrix X and the target vector y.

In [247]:
secciones_select_norm['Modelización'] = modelizacion['Modelización']

In [248]:
X = secciones_select_norm.drop('Modelización', axis = 1).values
y = secciones_select_norm['Modelización'].values

We apply the already fitted pipeline to the 2016 data, without calling fit again.

In [249]:
predicciones = pipe_modelado.predict(X=X)
predicciones = predicciones.flatten()

In [250]:
r2 = r2_score(y_true=y, y_pred=predicciones)

The R² we obtain is very good: 99.97%.

In [251]:
r2

0.9997383554666492
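As a reminder of what this figure measures, the following is a minimal sketch of the coefficient of determination computed by hand. The helper `r2_manual` and the toy arrays are hypothetical and are not part of the notebook; they only illustrate the formula 1 - SS_res / SS_tot that `r2_score` implements.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Hypothetical toy shares (NOT the notebook's data): two large turnout-like
# values, two very small ones and one party-like share.
y_demo = np.array([0.70, 0.01, 0.69, 0.005, 0.33])
pred_demo = np.array([0.71, 0.01, 0.70, 0.004, 0.31])

r2_demo = r2_manual(y_demo, pred_demo)
print(round(r2_demo, 4))
```

When the target mixes large shares (such as turnout) with very small ones (such as blank votes), the total variance is large, so R² can be very close to 1 even if an individual party share is off by a point or two. That is why it helps to look at the per-party comparison as well.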

Now we show the results, starting by undoing the normalization: the predicted shares are multiplied by `censo_mod` to get back to vote counts.

In [252]:
est = predicciones * censo_mod

In [253]:
df = pd.DataFrame(est, index = secciones_select_norm.index, columns = ['Estimación']).astype('int32')

In [254]:
df

Unnamed: 0,Estimación
Votos_Total,24400768
Nulos,238687
Votos_Válidos,24161386
Blanco,162773
V_Cand,23997919
PP,7577739
PSOE,5585833
Cs,3176265
UP,5334831
IU,-694
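Because the model is linear, nothing stops it from predicting a slightly negative share for a party with no votes, which is where the negative IU count comes from. The sketch below uses made-up shares and a made-up census (`shares_demo`, `censo_demo`, both hypothetical) to show how `astype('int32')` truncates toward zero, and one possible alternative that rounds and clips at zero. The clipping is an optional post-processing step and not something the notebook does.

```python
import pandas as pd

# Hypothetical census and predicted shares, for illustration only.
censo_demo = 1000
shares_demo = pd.Series({'Votos_Total': 0.6999, 'PP': 0.2504, 'IU': -0.0017})

# What a plain integer cast does: truncation toward zero, negatives are kept.
counts_trunc = (shares_demo * censo_demo).astype('int32')

# Possible alternative: round to the nearest vote and forbid negative counts.
counts_round = (shares_demo * censo_demo).round().clip(lower=0).astype('int32')

print(pd.DataFrame({'truncated': counts_trunc, 'rounded_clipped': counts_round}))
```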


We add a column with the actual 2016 results.

In [255]:
df1 = pd.DataFrame(secciones_mod.sum(), columns = ['Real']).drop('Censo_Esc')

In [256]:
df['Real'] = df1['Real']

In [257]:
df

Unnamed: 0,Estimación,Real
Votos_Total,24400768,24157982
Nulos,238687,224564
Votos_Válidos,24161386,23933418
Blanco,162773,178559
V_Cand,23997919,23754859
PP,7577739,7906761
PSOE,5585833,5424130
Cs,3176265,3123436
UP,5334831,5051345
IU,-694,0


We add columns with the vote percentages, relative to valid votes, to improve interpretability.

In [258]:
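# Select the denominator by label rather than by position (row 2 is 'Votos_Válidos')
df['pc Estimación'] = df['Estimación'] / df.loc['Votos_Válidos', 'Estimación'] * 100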

In [259]:
# Same label-based denominator for the actual results
df['pc Real'] = df['Real'] / df.loc['Votos_Válidos', 'Real'] * 100

In [260]:
df['dif. Real-Est.'] = df['pc Real'] - df['pc Estimación']

In [261]:
df

Unnamed: 0,Estimación,Real,pc Estimación,pc Real,dif. Real-Est.
Votos_Total,24400768,24157982,100.990763,100.938286,-0.052476
Nulos,238687,224564,0.987886,0.938286,-0.0496
Votos_Válidos,24161386,23933418,100.0,100.0,0.0
Blanco,162773,178559,0.673691,0.746066,0.072375
V_Cand,23997919,23754859,99.323437,99.253934,-0.069503
PP,7577739,7906761,31.363015,33.036489,1.673474
PSOE,5585833,5424130,23.118843,22.663416,-0.455428
Cs,3176265,3123436,13.146038,13.050522,-0.095516
UP,5334831,5051345,22.079987,21.105824,-0.974164
IU,-694,0,-0.002872,0.0,0.002872
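To condense the table into a single figure, one option is the mean absolute difference in percentage points over the parties. The sketch below rebuilds a small frame from the counts shown above (only the four parties with votes plus `Votos_Válidos`); the names `df_demo`, `partidos` and `mae_pp` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# Counts copied from the table above (estimate vs. actual 2016 result).
df_demo = pd.DataFrame(
    {
        'Estimación': [24161386, 7577739, 5585833, 3176265, 5334831],
        'Real': [23933418, 7906761, 5424130, 3123436, 5051345],
    },
    index=['Votos_Válidos', 'PP', 'PSOE', 'Cs', 'UP'],
)

# Percentages over valid votes, selecting the denominator by label.
pc_est = df_demo['Estimación'] / df_demo.loc['Votos_Válidos', 'Estimación'] * 100
pc_real = df_demo['Real'] / df_demo.loc['Votos_Válidos', 'Real'] * 100
dif = pc_real - pc_est

partidos = ['PP', 'PSOE', 'Cs', 'UP']
mae_pp = dif[partidos].abs().mean()        # mean absolute gap, in points
peor_partido = dif[partidos].abs().idxmax()  # party with the largest gap

print(f"Mean absolute difference: {mae_pp:.2f} points; largest gap: {peor_partido}")
```

A signed difference keeps the direction of the error (positive means the model underestimated the party), while the mean of absolute values gives an overall size that does not cancel out between parties.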


The fit is not as good as for November 2019, as expected, but it is still quite satisfactory overall. The largest gap is for the PP, which the model underestimates by about 1.7 percentage points; UP is overestimated by about 1 point, and the remaining differences are below half a point.

We think this is a creditable result, given the almost minimalist way in which we chose the sources of the selected sections: a handful of municipalities picked more or less arbitrarily, mostly in provinces with nationalist parties.