

<center><img src="https://github.com/MagallanesTalks/OpenBigData_atPUCP/blob/main/logo.png?raw=true" width="1000"></center>


# **THE 2021 Presidential Elections in Perú**


Results from the first round of the 2021 presidential election are available at [INFOGOB](https://infogob.jne.gob.pe/BaseDatos). Let's keep the results at the _PROVINCIA_ level.

In [None]:
# !pip install unidecode

In [None]:
import pandas as pd # you may also need openpyxl
from unidecode import unidecode # helps get rid of some troublesome spanish elements

dataLink="https://github.com/MagallanesTalks/OpenBigData_atPUCP/raw/refs/heads/main/data/EG2021_V1.1_Resultados_Presidencial.xlsx"

vuelta1=pd.read_excel(dataLink,sheet_name='Nivel_Provincial')

# remove whitespace (raw string avoids an invalid-escape warning) and strip accents
vuelta1.columns=[unidecode(col) for col in vuelta1.columns.str.replace(r'\s','',regex=True)]

# checking the way the data was recognised by pandas
vuelta1.info()
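
The column clean-up above does two things: it removes whitespace and it replaces accented characters with plain ASCII ones. Below is a minimal sketch of the same idea using the standard library's `unicodedata` in place of `unidecode` (a swapped-in technique, not what this notebook uses). The function name `clean_name` and the sample headers are made up for illustration.

```python
import re
import unicodedata

def clean_name(name):
    """Hypothetical helper: drop whitespace, then strip accent marks."""
    no_space = re.sub(r"\s", "", name)
    # NFKD splits an accented letter into base letter + combining mark
    decomposed = unicodedata.normalize("NFKD", no_space)
    # keep only the base letters
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# illustrative headers, not read from the real spreadsheet
headers = ["Organización Política", "Tipo Organización Política", "Región"]
cleaned = [clean_name(h) for h in headers]
print(cleaned)
```

Names without spaces or accents can then be used with attribute access, as in `vuelta1.Votos`.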

In [None]:
# Distribution of missing values
vuelta1.isnull().sum()

In [None]:
## TipoOrganizacionPolitica?
vuelta1[vuelta1.TipoOrganizacionPolitica.isnull()]

In [None]:
## Votos
vuelta1[vuelta1.Votos.isnull()]

In [None]:
# replace by zero.
vuelta1['Votos']=vuelta1.loc[:,'Votos'].fillna(0)

# Information at the Province level

Let's work with the column **OrganizacionPolitica** organized by **Provincia**.

In [None]:
# this is a WIDE shape!
provincias=pd.pivot_table(vuelta1, values="Votos",
                          index=["Region", "Provincia"],
                          columns=["OrganizacionPolitica"])
provincias
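
To see what the reshaping does, here is a small sketch on made-up data: a long table (one row per province and party) becomes a wide one (one row per province, one column per party). The regions, provinces, parties and vote counts are invented. `aggfunc="sum"` is my assumption here; the call above relies on the default (the mean), which gives the same result as long as each province and party pair appears only once.

```python
import pandas as pd

# invented long-format data: one row per (province, party)
long_df = pd.DataFrame({
    "Region": ["A", "A", "A", "A"],
    "Provincia": ["P1", "P1", "P2", "P2"],
    "OrganizacionPolitica": ["X", "Y", "X", "Y"],
    "Votos": [30, 20, 10, 40],
})

# wide format: parties become columns
wide = pd.pivot_table(long_df, values="Votos",
                      index=["Region", "Provincia"],
                      columns="OrganizacionPolitica",
                      aggfunc="sum")

# the column holding the row maximum is the winner in each province
winners = wide.idxmax(axis=1).tolist()
print(wide)
print(winners)
```

The wide shape is what makes the row-wise operations used next (`idxmax`, `sum`, `max` with `axis=1`) straightforward.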

Let's compute some indicators from these data:

In [None]:
# who won?
who_won=provincias.iloc[:,:].idxmax(axis=1)
# Where did either of the top-2 parties win?
oneOf_top2_won=who_won.isin(['PARTIDO POLITICO NACIONAL PERU LIBRE','FUERZA POPULAR'])
# how many validVotes?
votesValid_sum=provincias.iloc[:,:-2].sum(axis=1)
# how many votes?
votesAll_sum=provincias.iloc[:,:].sum(axis=1)
# winner votes?
winner_votes=provincias.iloc[:,:-2].max(axis=1)
winner_majority=(winner_votes/votesValid_sum)>0.5
# where the most competition?
effectiveNum=1/provincias.iloc[:,:-2].div(provincias.iloc[:,:-2].sum(axis=1), axis=0).pow(2).sum(axis=1)

# new vars:
provincias['who_won']=who_won
provincias['oneOf_top2_won']=oneOf_top2_won
provincias['total_validvotes']=votesValid_sum
provincias['total_votes']=votesAll_sum
provincias['winner_votes']=winner_votes
provincias['winner_majority']=winner_majority
provincias['effectiveNum']=effectiveNum.astype(int)
provincias['elected_share']=100*provincias.loc[:,'PARTIDO POLITICO NACIONAL PERU LIBRE']/votesValid_sum
provincias['elected_majority']=provincias['elected_share']>50
provincias['runnerup_share']=100*provincias.loc[:,'FUERZA POPULAR']/votesValid_sum
provincias['runnerup_majority']=provincias['runnerup_share']>50
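
The `effectiveNum` indicator is the inverse of the sum of squared vote shares, usually called the effective number of parties. A minimal sketch in plain Python, with invented vote counts, shows how it behaves; `effective_number` is a hypothetical helper, not part of the notebook.

```python
def effective_number(votes):
    """Inverse of the sum of squared vote shares."""
    total = sum(votes)
    shares = [v / total for v in votes]
    return 1 / sum(s ** 2 for s in shares)

# four parties with identical support: as competitive as it gets
even = effective_number([100, 100, 100, 100])
# one dominant party: close to a single effective party
lopsided = effective_number([970, 10, 10, 10])
# casting to int truncates instead of rounding
truncated = int(effective_number([400, 300, 200, 100]))

print(even, round(lopsided, 4), truncated)
```

Higher values mean the vote is spread across more parties. Because `astype(int)` truncates, a province with a value of about 3.3 and one with about 3.9 both end up as 3.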

In [None]:
ProvRegionData=provincias.loc[:,'who_won'::].reset_index()
ProvRegionData

## <div class="alert alert-danger" role="alert">Merging data into map</div>

In [None]:
# shapefile
linkMap="https://github.com/MagallanesTalks/OpenBigData_atPUCP/raw/refs/heads/main/map/PROVINCIAS.shp"

# read the map
import geopandas as gpd
mapaProv=gpd.read_file(linkMap)
mapaProv.head()

This will be our **baseMap**:

In [None]:
baseMap=mapaProv.plot(color='white',edgecolor='grey', linewidth=0.1)
baseMap

Before merging, verify that both data sets share the same values in _PROVINCIA_:

In [None]:
NotInGeoDF=sorted(list(set(mapaProv.PROVINCIA)-set(ProvRegionData.Provincia)))
NotInDF=sorted(list(set(ProvRegionData.Provincia)-set(mapaProv.PROVINCIA)))
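# pair the sorted leftovers by position; this assumes both lists have the same
# length and that the i-th names refer to the same province (inspect changesMap!)
changesMap={geo:df for geo,df in zip(NotInGeoDF,NotInDF)}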

# CHANGES NEEDED
changesMap
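
Pairing the two sorted lists by position only works when the mismatched names sort in the same order on both sides. The sketch below uses invented spellings to show the set-difference step and adds a cross-check with the standard library's `difflib`; the names and the `suspicious` list are assumptions for illustration only.

```python
from difflib import get_close_matches

# invented province names: two of them are spelled differently in each source
map_names = {"LIMA", "CUSCO", "NASCA", "ANTONIO RAYMONDI"}
data_names = {"LIMA", "CUSCO", "NAZCA", "ANTONIO RAIMONDI"}

not_in_data = sorted(map_names - data_names)  # only in the map
not_in_map = sorted(data_names - map_names)   # only in the table

# positional pairing, as in the cell above
changes = dict(zip(not_in_data, not_in_map))

# cross-check: flag pairs where the most similar name is a different one
suspicious = [old for old, new in changes.items()
              if get_close_matches(old, not_in_map, n=1) != [new]]

print(changes)
print(suspicious)
```

An empty `suspicious` list suggests the positional pairing is plausible; any flagged name is worth fixing by hand before recoding.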

In [None]:
# RECODING
mapaProv.replace({'PROVINCIA':changesMap}, inplace=True)

# MERGING
mapaProvElect=mapaProv.merge(ProvRegionData, left_on='PROVINCIA', right_on='Provincia', how='inner')
mapaProvElect.drop(columns=['Region','Provincia'],inplace=True) # drop duplicate info

# result
mapaProvElect.head()

## Information to visualize (I)


### Where no party won:

In [None]:
mapaProvElect.who_won.value_counts()

In [None]:
baseMap=mapaProv.plot(color='white',edgecolor='grey', linewidth=0.1)
mapaProvElect[mapaProvElect.who_won=='VOTOS EN BLANCO'].plot(color='black',ax=baseMap)

### Where the future president won more than 50% of the valid votes

In [None]:
mapaProvElect.elected_majority.value_counts()

In [None]:
baseMap=mapaProv.plot(color='white',edgecolor='grey', linewidth=0.1)
mapaProvElect[mapaProvElect.elected_majority].plot(color='red',ax=baseMap)

### Where competition was low or high

In [None]:
mapaProvElect.effectiveNum.value_counts()

In [None]:
mapaProvElect.plot(column='effectiveNum', legend=True,cmap="Reds")

## Information to visualize (II)

Let's focus on the **runner-up's share**:

In [None]:
mapaProvElect.runnerup_share.describe()

Let's see how runnerup_share behaves across neighboring provinces:



In [None]:
# !pip install pysal

1. Compute the neighborhood

In [None]:
from libpysal.weights import Queen

w_queen = Queen.from_dataframe(mapaProvElect,use_index=False)
w_queen.transform = 'R'

2. Check if the variable **runnerup_share** shows spatial autocorrelation

In [None]:
from esda.moran import Moran

moranRunnerup2021 = Moran(mapaProvElect['runnerup_share'], w_queen)
moranRunnerup2021.I,moranRunnerup2021.p_sim
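
With row-standardized weights (the `'R'` transform), global Moran's I reduces to the sum of each unit's deviation times the average deviation of its neighbors, divided by the sum of squared deviations. The sketch below computes that by hand for four invented units arranged in a line. The neighbor dictionary, the values and the function `morans_i` are all hypothetical, and the permutation-based `p_sim` reported by `esda` is not reproduced.

```python
def morans_i(values, neighbors):
    """Moran's I for row-standardized weights (each neighbor weighs 1/k)."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    # spatial lag: average deviation among each unit's neighbors
    lag = [sum(z[j] for j in neighbors[i]) / len(neighbors[i])
           for i in range(len(values))]
    numerator = sum(zi * li for zi, li in zip(z, lag))
    denominator = sum(zi ** 2 for zi in z)
    return numerator / denominator

# four units in a row: 0 - 1 - 2 - 3
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1, 2, 3, 4], line)    # similar values sit together
alternating = morans_i([1, 4, 1, 4], line)  # neighbors are always opposite

print(clustered, alternating)
```

Positive values indicate that similar shares cluster in space, and negative values indicate a checkerboard-like pattern.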

3. Compute the local spatial autocorrelation:

In [None]:
from esda.moran import Moran_Local
LisaRunnerup = Moran_Local(y=mapaProvElect['runnerup_share'], w=w_queen,seed=1234)

4. Explore results of LISA:

In [None]:
from splot.esda import moran_scatterplot

fig, ax = moran_scatterplot(LisaRunnerup,p=0.05)
ax.set_xlabel('LisaRunnerup_std')
ax.set_ylabel('SpatialLag_LisaRunnerup_std');

5. Get labels for each province, in order to identify **spots & outliers**:

In [None]:
# quadrant: 1 HH,  2 LH,  3 LL,  4 HL
labels = [ '0 no_sig', '1 hotSpot', '2 coldOutlier', '3 coldSpot', '4 hotOutlier']

mapaProvElect['RUNNERUP_quadrant']=[l if p <0.05 else 0 for l,p in zip(LisaRunnerup.q,LisaRunnerup.p_sim)  ]
mapaProvElect['RUNNERUP_quadrant']=[labels[i] for i in mapaProvElect['RUNNERUP_quadrant']]
mapaProvElect.head()
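
The quadrant codes combine two signs: whether a province is above or below the mean, and whether its neighbors are on average above or below it. The sketch below reproduces that logic on invented numbers and then applies the same significance filter as the cell above. The helper names are hypothetical, and treating a value of exactly zero as "low" is my assumption.

```python
LABELS = ['0 no_sig', '1 hotSpot', '2 coldOutlier', '3 coldSpot', '4 hotOutlier']

def quadrant(z, lag):
    """1 HH, 2 LH, 3 LL, 4 HL, from the signs of the deviation and its lag."""
    high_value = z > 0
    high_neighbors = lag > 0
    if high_value and high_neighbors:
        return 1
    if not high_value and high_neighbors:
        return 2
    if not high_value and not high_neighbors:
        return 3
    return 4

def label(z, lag, p, alpha=0.05):
    """Keep the quadrant only when the pseudo p-value is below alpha."""
    return LABELS[quadrant(z, lag) if p < alpha else 0]

# invented (deviation, spatial lag, p-value) triples
examples = [(1.2, 0.8, 0.01), (-0.9, 0.7, 0.03),
            (-1.1, -0.6, 0.20), (0.5, -0.4, 0.04)]
result = [label(*e) for e in examples]
print(result)
```

The third example falls in the low-low quadrant but is reported as `0 no_sig` because its p-value is above the threshold.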

We have what is needed:

In [None]:
mapaProvElect.RUNNERUP_quadrant.value_counts()

Now the map:

In [None]:
import altair as alt # needed for the chart below

domain_ = labels
range_ = ['gainsboro', 'purple', 'lime','orange','orchid']

theMap_LISA=alt.Chart(mapaProvElect)
theMap_LISA_encodings=theMap_LISA.encode(
                                        alt.Color('RUNNERUP_quadrant',
                                                  scale=alt.Scale(domain=domain_,
                                                                  range=range_),
                                                  title = "RUNNERUP_quadrant",
                                                  legend=alt.Legend(orient='none',
                                                                    direction='horizontal',
                                                                    titleAnchor='middle',
                                                                    legendY=-40,
                                                                    legendX=150)),
                                        tooltip=['PROVINCIA']).properties(width=800,
                                                                          height=500)

theMap_LISA_encodings.mark_geoshape()

In [None]:
# You may save
# mapaProvElect.to_file("mapaProvElect_2021.geojson", driver='GeoJSON')