# PAM's Descriptive Analysis

This file contains all the code & analysis developed for the paper titled XXX.

In this section we analyse the PAM datasets.

In [90]:
# Packages
import os 
import io
import requests
import pandas as pd
import numpy as np
from langdetect import detect
import yaml
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
pd.set_option('display.max_colwidth', None) # print long strings in full (None replaces the deprecated -1)

In [4]:
# Read YAML configuration file
with open('config_file.yaml', 'r') as f:
    config = yaml.safe_load(f) # yaml.load() without a Loader is unsafe and rejected by recent PyYAML

### 1.1. Proposals Dataset

In [77]:
# Read dataset of proposals
df = pd.read_csv(os.path.join(config['ROOT_PATH'], 'data', 'proposals_all.csv'))
len(df)

10860

In [6]:
# Remove first column
df.drop(df.columns[[0]], axis=1, inplace=True)

In [7]:
# Analyse column types
df.dtypes

proposal          int64 
origin            object
scope             object
district          object
category          object
subcategory       object
author            object
author_name       object
created_at        object
votes             int64 
comments          int64 
url               object
status            object
title_es          object
title_ca          object
description_es    object
description_ca    object
dtype: object

In [8]:
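# Preview the column types with 'proposal' cast to object
# (astype returns a copy; the result is not assigned, so df itself is unchanged)
df.astype({'proposal': 'object'}).dtypes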

proposal          object
origin            object
scope             object
district          object
category          object
subcategory       object
author            object
author_name       object
created_at        object
votes             int64 
comments          int64 
url               object
status            object
title_es          object
title_ca          object
description_es    object
description_ca    object
dtype: object
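Because `astype` returns a new DataFrame, the cast above only affects the displayed preview. A minimal sketch of making type changes persistent follows, assuming a toy frame with illustrative values and assuming that `created_at` uses the month/day/year layout seen in the data; it is not part of the original analysis.

```python
import pandas as pd

# Toy frame reusing three column names from the proposals dataset (values are illustrative)
toy = pd.DataFrame({
    'proposal': [3591, 2747],
    'created_at': ['2/27/2016', '2/17/2016'],
    'votes': [16, 8],
})

# astype returns a copy, so assign the result back to keep the new type
toy = toy.astype({'proposal': 'object'})

# Parse month/day/year strings into timestamps (the format is an assumption)
toy['created_at'] = pd.to_datetime(toy['created_at'], format='%m/%d/%Y')

print(toy.dtypes)
```

Treating `proposal` as an object keeps the identifier out of numeric summaries such as `describe()`.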

### Analysis of votes, comments, districts and categories

In [9]:
# Summary of variables
summary_list = [df.describe()] + \
               [df.groupby([c])[df.columns[0]].count() \
                for c in df.columns if df[c].dtype == 'object']

for i in summary_list:
    print(i)
    print()

           proposal         votes      comments
count  10860.000000  10860.000000  10860.000000
mean   5513.313628   15.201842     1.675046    
std    3139.352666   43.845657     5.708035    
min    3.000000      0.000000      0.000000    
25%    2799.750000   1.000000      0.000000    
50%    5515.500000   5.000000      0.000000    
75%    8230.250000   15.000000     1.000000    
max    10946.000000  1720.000000   337.000000  

origin
citizen         662 
citizenship     2783
meeting         4498
official        1300
organization    1617
Name: proposal, dtype: int64

scope
city        4518
district    6342
Name: proposal, dtype: int64

district
Ciutat Vella                971
Eixample                    539
GrÃÂ cia                   142
GrÃ cia                     571
Horta - GuinardÃÂ³         116
Horta - GuinardÃ³           490
Les Corts                   585
None                        824
Nou Barris                  444
Sant Andreu                 516
Sant MartÃÂ­             

Proposals usually receive few votes (75% of them obtained at most 15 votes), although some were very popular (up to 1,720 votes). Comments are similarly skewed: 75% of proposals received at most one comment, while the most-commented proposal received 337.

The district with the most proposals was Ciutat Vella (971 proposals). On the other hand, Sarrià - Sant Gervasi was the least participative.

The most popular category was 'Bon viure'; the least popular was 'Justícia global'.
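The statements about quartiles and the most frequent district can be checked directly rather than read off the printed summary. The sketch below uses small made-up series (`votes` and `districts` are hypothetical stand-ins for the corresponding columns of `df`).

```python
import pandas as pd

# Made-up vote counts with one very popular proposal
votes = pd.Series([0, 1, 1, 5, 5, 15, 15, 1720], name='votes')

q75 = votes.quantile(0.75)                 # third quartile
share_at_most_q75 = (votes <= q75).mean()  # fraction of proposals at or below it
top = votes.max()                          # most-voted proposal

# Made-up district labels: value_counts sorts by frequency, idxmax gives the top label
districts = pd.Series(['Ciutat Vella', 'Ciutat Vella', 'Eixample'])
most_active = districts.value_counts().idxmax()

print(q75, share_at_most_q75, top, most_active)
```

A large gap between the third quartile and the maximum is what signals the heavy right tail described above.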

### Analysis of status, authors, and origin

In [10]:
# What do '0' and '1' mean in the status column?
df.loc[df['status'] == '0'] # inspection shows that status == '0' means 'rejected'

df['status'] = df['status'].map({'0': 'rejected', '1': 'accepted'}) # clean status

In [16]:
df['status'].value_counts()

accepted    1540
rejected    514 
Name: status, dtype: int64

In [21]:
df['author_name'].value_counts()

Ajuntament de Barcelona                                1305
None                                                   846 
suport decidim.HG i Les Corts                          496 
suport decidim.barcelona                               391 
Suport General - Decidim Barcelona                     306 
Suport Ciutat Vella                                    290 
GR_Raons Públiques                                     254 
suportdecidim.santandreu                               237 
Suport decidim Activitat Econòmica                     223 
suport.decidim1                                        202 
decidim sarrià-sant gervasi                            186 
eixampledelibera                                       179 
suport decidim.barcelona2                              163 
Dsoriano                                               126 
suport decidim.actes de ciutat                         112 
suport decidim.salut                                   93  
Solidaritat Catalana per la Independènc

In [50]:
df['origin'].value_counts()

meeting         4498
citizenship     2783
organization    1617
official        1300
citizen         662 
Name: origin, dtype: int64

1,540 proposals were accepted, while 514 were rejected.
The most prolific authors were the city council (about 1,300 proposals), civil servants, and Decidim meetings. Even so, a single citizen authored 126 proposals.
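The accepted and rejected counts are easier to compare as shares of the proposals that received a decision. The following sketch rebuilds a status column from the counts reported above; treating the remaining 8,806 rows as missing is an assumption based on the `map` call, which leaves any other value as NaN.

```python
import pandas as pd

# Rebuild a status column from the reported counts (10,860 rows in total)
status = pd.Series(['accepted'] * 1540 + ['rejected'] * 514 + [None] * 8806)

decided = status.notna().sum()                 # proposals with a recorded decision
shares = status.value_counts(normalize=True)   # missing values are excluded by default

print(decided)
print(shares.round(3))
```

Under these assumptions roughly three quarters of the decided proposals were accepted.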

### Accepted vs. rejected and (author, origin, district, category)

In [45]:
# Authors with most accepted proposals
df[['author_name', 'status']].pivot_table(index='author_name', columns='status', aggfunc=len, fill_value=0).sort_values(by='accepted', ascending = False)



author_name,accepted,rejected
,653,193
Ajuntament de Barcelona,244,10
AAVV l'Òstia,17,4
procomuns.net,16,4
suport decidim.HG i Les Corts,15,9
PAD Mòbil - Sant Martí,14,2
Associacio de Veïns i Veïnes Coll-Vallcarca,13,9
Treballadores i treballadors dels Districtes,12,2
AAVV Barceloneta,12,5
Salvador Pastor Blasco,12,12


In [47]:
# Authors with most rejected proposals
df[['author_name', 'status']].pivot_table(index='author_name', columns='status', aggfunc=len, fill_value=0).sort_values(by='rejected', ascending = False)


author_name,accepted,rejected
,653,193
Salvador Pastor Blasco,12,12
Ajuntament de Barcelona,244,10
suport decidim.HG i Les Corts,15,9
Associacio de Veïns i Veïnes Coll-Vallcarca,13,9
Sarrià-Sant Gervasi,6,6
Consell Barri Hostafrancs,8,5
AAVV Barceloneta,12,5
Solidaritat Catalana per la Independència,11,5
consellsdistrictesantmarti,7,5


In [49]:
# Origin with most rejected proposals
df[['origin', 'status']].pivot_table(index='origin', columns='status', aggfunc=len, fill_value=0).sort_values(by='rejected', ascending = False)


origin,accepted,rejected
citizen,422,240
meeting,652,192
organization,223,72
official,243,10


In [51]:
# District with most rejected proposals
df[['district', 'status']].pivot_table(index='district', columns='status', aggfunc=len, fill_value=0).sort_values(by='rejected', ascending = False)


district,accepted,rejected
,633,191
Sant Martí,123,60
Ciutat Vella,149,51
Sarrià - Sant Gervasi,39,45
Les Corts,66,43
Sants Montjuïc,81,39
Gràcia,108,34
Sant Andreu,76,31
Nou Barris,73,8
Eixample,82,6


In [52]:
# Category with most rejected proposals
df[['category', 'status']].pivot_table(index='category', columns='status', aggfunc=len, fill_value=0).sort_values(by='rejected', ascending = False)


category,accepted,rejected
Bon viure,699,210
Transició ecològica,500,206
Economia plural,213,51
Bon govern,121,42
Justícia global,7,5
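Sorting by the raw number of rejections favours the largest groups. A possible complement, sketched here with the category counts from the table above, is to rank by rejection rate instead; the `rejection_rate` column is an illustrative addition and not part of the original notebook.

```python
import pandas as pd

# Accepted/rejected counts per category, copied from the pivot table above
counts = pd.DataFrame(
    {'accepted': [699, 500, 213, 121, 7], 'rejected': [210, 206, 51, 42, 5]},
    index=['Bon viure', 'Transició ecològica', 'Economia plural',
           'Bon govern', 'Justícia global'],
)

# Share of decided proposals that were rejected in each category
counts['rejection_rate'] = counts['rejected'] / (counts['accepted'] + counts['rejected'])

ranked = counts.sort_values('rejection_rate', ascending=False)
print(ranked.round(3))
```

The same computation applies to the author, origin, and district tables, although rates based on very few proposals should be read with caution.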


### Bilingual analysis

In [56]:
# Read the original dataset (without translations)
url = 'https://raw.githubusercontent.com/elaragon/metadecidim/master/proposals.tsv'
s = requests.get(url).content
df_origin = pd.read_csv(io.StringIO(s.decode('utf-8')), sep = '\t')

In [99]:
# Create column with language of proposal
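def detect_lang(x):
    """Return the detected language code, or 'null' if detection fails."""
    try:
        return detect(x)
    except Exception: # e.g. empty or non-text summaries
        return 'null'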
    
    
df_origin['language'] = df_origin['summary'].apply(detect_lang)
df_origin['language'].value_counts()

ca      10014
es      713  
pt      41   
it      36   
null    19   
sv      12   
fr      9    
en      4    
ro      3    
de      2    
lv      1    
no      1    
sk      1    
tl      1    
id      1    
nl      1    
fi      1    
Name: language, dtype: int64

In [101]:
# Inspect proposals detected as neither Catalan nor Spanish
df_origin[['summary', 'language']].loc[~df_origin['language'].isin(['ca', 'es'])]
# On inspection, all other detected languages are actually Catalan, so relabel them (keeping 'null')
df_origin.loc[~df_origin['language'].isin(['ca', 'es', 'null']), 'language'] = 'ca'

In [102]:
df_origin['language'].value_counts()

ca      10128
es      713  
null    19   
Name: language, dtype: int64

Most proposals (10,128 of 10,860, about 93%) are written in Catalan; 713 are in Spanish and 19 could not be classified.
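The relabelling step can be illustrated on a small made-up series. This sketch uses `Series.where` with `isin`, which is one way to express "keep these labels, replace everything else"; the sample values are hypothetical.

```python
import pandas as pd

# Made-up detector output, including a few spurious language codes
lang = pd.Series(['ca', 'es', 'pt', 'it', 'null', 'sv', 'ca'])

# Keep trusted labels; everything else is assumed to be misdetected Catalan
keep = ['ca', 'es', 'null']
cleaned = lang.where(lang.isin(keep), 'ca')

print(cleaned.value_counts())
```

Keeping `'null'` separate preserves the rows where detection failed instead of silently counting them as Catalan.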

In [107]:
# Merge this new column with df
df_origin_aux = df_origin[['id', 'total_positive_comments', 'total_neutral_comments', 'total_negative_comments', 'rejected_message', 'language']]
df = pd.merge(df, df_origin_aux, left_on='proposal', right_on='id', how='left')

In [108]:
df.head()

Unnamed: 0.1,Unnamed: 0,proposal,origin,scope,district,category,subcategory,author,author_name,created_at,votes,comments,url,status,title_es,title_ca,description_es,description_ca,id,total_positive_comments,total_neutral_comments,total_negative_comments,rejected_message,language
0,0,3591,citizenship,district,Nou Barris,Economia plural,Un nou lideratge públic,5187,VILLARRASA,2/27/2016,16,0,https://decidim.barcelona.cat/proposals/reduccion-del-ibi-de-las-viviendas-de-torre-baro,rejected,Reducción del IBI de las viviendas de TORRE BARÓN,Reducció de l'IBI dels habitatges de TORRE BARÓ,"Revisar el alto valor que se paga del Impuesto de Bienes Inmuebles en Torre Baró, sobre todo aquellas fincas construidas a partir del año 2001.Revisión de la categoría de calles, del valor del suelo y de la construcción con el Objeto que el valor catastral se reduzca Hasta el punto que se corresponda con la realidad de las fincas y super entorno.","Revisar l'alt valor que es paga l'Impost de Béns Immobles a Torre Baró, sobretot aquelles finques construïdes a partir de l'any 2001.Revisión de la categoria de carrers, del valor del sòl i de la construcció amb l'objecte que el valor cadastral es redueixi fins al punt que es correspongui amb la realitat de les finques i el seu entorn.",3591,0,0,0,"No hi ha competències i l’Ajuntament no disposa de la capacitat d’influir. No obstant això, el govern municipal està treballant en la recerca de solucions alternatives per reduir l'import de l'IBI en barris com Torre Baró, que considerem injust.",es
1,1,2747,organization,district,Gràcia,Transició ecològica,Medi ambient i espai públic,6285,,2/17/2016,8,0,https://decidim.barcelona.cat/proposals/park-guell-prevencio-d-incendis-i-polissa-de-responsabilitat-civil-pels-visitant,rejected,"Park Güell, prevencion de incendios y póliza de responsabilidad civil por visitante","Park Güell, Prevenció d'incendis i polissa de Responsabilitat Civil pèls visitant",Como park público hay que tener la prevención de atención epsl incendios con bocas de incendio por los bomberos. Póliza por los usuarios de pago y servicio sanitario y de atención médica durante todas las visitas durante el año,Com a park público calç tenyir la prevenció d'atenció epsl incendis amb boques d'incendi és pèls bombers. Pòlissa pèls Visitants de Pagament i Servei sanitari i d'atencions mèdica durante Totes els visitis durante l'any,2747,0,0,0,"No podem comprometre’ns a desenvolupar aquesta proposta al nivell de concreció que es proposa, però serà tinguda en compte en el moment de la planificació corresponent.",ca
2,2,3158,citizenship,district,Horta - Guinardó,Bon viure,Educació i coneixement,512,Solidaritat Catalana per la Independència,2/19/2016,373,17,https://decidim.barcelona.cat/proposals/institut-al-barri-d-horta,accepted,Instituto en el barrio de Horta.,Institut al barri d'Horta.,"Inicio del procedimiento para la construcción de un Instituto de secundaria en Horta, tal como reclaman las asociaciones del barrio.","Inici del Procediment per a la construcció d'1 Institut de secundària a Horta, tal com reclamin els Associacions del barri.",3158,10,3,0,,ca
3,3,8968,citizenship,city,,Transició ecològica,Mobilitat sostenible,15512,Archie,4/6/2016,1,0,https://decidim.barcelona.cat/proposals/bus-facilitar-entrada-gent-gran,rejected,Bus: facilitar entrada ancianos,Bus: facilitar entrada gent gran,"Para facilitar la entrada el bus a las personas mayores o con dificultad de movilidad, cambiar la forma de validar el viaje. Actualmente las personas mayores cuando validan la tarjeta tienen muchas dificultades y pueden caer, crear una nueva manera como por ejemplo pasar tarjeta ante sensor.","Per facilitar l 'entrada el bus a la gent gran o amb dificultat de movilitat, canviar la forma de validar el viatge. Actualment les persones Grans Quan validin la targeta tinença Moltes Dificultats i podin Caure, crear una nova manera com a Per Exemple passar targeta Davant sensor.",8968,0,0,0,"No podem comprometre’ns a desenvolupar aquesta proposta al nivell de concreció que es proposa, però serà tinguda en compte en el moment de la planificació corresponent.",ca
4,4,6774,citizenship,district,Gràcia,Transició ecològica,Urbanisme per als barris,15153,Xavier Sisternas,3/24/2016,7,0,https://decidim.barcelona.cat/proposals/reformar-el-carrer-verdi-a-l-entorn-del-mercat-de-lesseps,accepted,Reformar la calle Verdi en el entorno del Mercado de Lesseps,Reformar el carrer Verdi a l'entorn del Mercat de Lesseps,"La calle Verdi, del mercado para arriba, actúa como ""calle mayor"". Todo el mundo sube o baja por Verdi, para ir de compras, para ir al metro ... Se una calle con mucha pendiente, poco amable. Pero mejoraría mucho con plataforma única o aceras anchas, sin coches aparcados, con las motos en la calzada, con algunos árboles y bancos. Y revitalizaría el comercio ...","El carrer Verdi, del mercat en amunt, actua com a ""carrer major"". Tothom licitació o baixa per Verdi, per anar a comprar, per anar al metro ... És un carrer amb Molta pendent, Poc amable. Però milloraria Molt amb plataforma única o voreres amples, sense cotxes Aparcats, amb els motos a la Calçada, amb uns quants arbres i Bancs. I revitalitzaria el comerç ...",6774,0,0,0,,ca
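A left merge keeps every row of `df` even when no matching `id` exists in the original dataset, so it can be worth checking how many rows actually matched. The sketch below uses two toy frames (the id 9999 is made up) together with the `validate` and `indicator` options of `pd.merge`.

```python
import pandas as pd

# Toy stand-ins for df and df_origin_aux
left = pd.DataFrame({'proposal': [3591, 2747, 9999], 'votes': [16, 8, 1]})
right = pd.DataFrame({'id': [3591, 2747], 'language': ['es', 'ca']})

merged = pd.merge(
    left, right,
    left_on='proposal', right_on='id', how='left',
    validate='many_to_one',  # raises if an id appears more than once on the right
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Rows of the left frame that found no partner on the right
unmatched = (merged['_merge'] == 'left_only').sum()
print(merged)
print(unmatched)
```

Unmatched rows end up with missing values in the merged columns, which matters when counting languages afterwards.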


In [None]:
# Count min, max and mean number of words per title_es, etc.
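One way the planned word-count summary could look is sketched below. The helper `word_count_summary` is hypothetical, words are assumed to be whitespace-separated tokens, and the median is added here as an extra statistic; the toy rows reuse two titles from the output above.

```python
import pandas as pd

def word_count_summary(frame, columns):
    """Return min, max, mean and median word counts for each text column."""
    stats = {}
    for col in columns:
        # Skip missing texts, then count whitespace-separated tokens
        n_words = frame[col].dropna().astype(str).str.split().str.len()
        stats[col] = n_words.agg(['min', 'max', 'mean', 'median'])
    return pd.DataFrame(stats)

# Toy frame with one missing row per column
toy = pd.DataFrame({
    'title_es': ['Instituto en el barrio de Horta.', 'Bus: facilitar entrada ancianos', None],
    'title_ca': ["Institut al barri d'Horta.", 'Bus: facilitar entrada gent gran', None],
})

summary = word_count_summary(toy, ['title_es', 'title_ca'])
print(summary)
```

The same helper could be applied to `description_es` and `description_ca` by passing those column names.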