# Chamber of Deputies (Câmara dos Deputados) sessions 🇧🇷

[Official documentation](http://www.camara.leg.br/internet/plenario/result/votacao/Layout_ArquivosTXT_presencas_vota%C3%A7%C3%A3o_exportados.pdf)

**File names**

```
HEaabbcddde000000.TXT presences - header
LPaabbcddde000000.TXT presences - details
HEaabbcdddeffffff.TXT votes - header
LVaabbcdddeffffff.TXT votes - details
```
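
The naming scheme above can be checked mechanically. The sketch below is only an illustration: the `describe_file` helper, the regular expression and the sample file names are assumptions built from the pattern, not part of the original notebook, and it assumes that a `000000` sequence always marks a presence file.

```python
import re

# Assumed pattern: file prefix, 9-character session code, 6-digit sequence.
FILE_NAME = re.compile(
    r'^(?P<prefix>HE|LP|LV)'
    r'(?P<session_code>(?:CD|CC|SF)\d{2}[OE]\d{3}[OE])'
    r'(?P<sequence>\d{6})\.TXT$'
)

def describe_file(name):
    """Split a file name into its parts, or return None if it does not match."""
    match = FILE_NAME.match(name)
    if match is None:
        return None
    # Assumption: an all-zero sequence means presences, anything else a vote.
    is_presence = match['sequence'] == '000000'
    return {
        'subject': 'presences' if is_presence else 'votes',
        'kind': 'header' if match['prefix'] == 'HE' else 'details',
        'session_code': match['session_code'],
        'vote_number': None if is_presence else match['sequence'],
    }

# Hypothetical file names that follow the documented pattern.
print(describe_file('HECD03O242E000000.TXT'))
print(describe_file('LVCD02O241E007188.TXT'))
```

Returning `None` for names that do not match makes it easy to skip unrelated files found in the same directory.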

## Headers

| Presence line  | Presence description                                    | Vote line      | Vote description                      |
|----------------|---------------------------------------------------------|----------------|---------------------------------------|
| **aabbcddde**  | as in the header naming scheme                          | **aabbcddde**  | as in the header naming scheme        |
| **000000**     |                                                         | **ffffff**     | sequential number of the vote         |
| **dd/mm/yyyy** | session end date                                        | **dd/mm/yyyy** | vote end date                         |
| **hh:mm:ss**   | session end time                                        | **hh:mm:ss**   | vote end time                         |
| **xxx**        | name of the 1st president of the session (40 positions) | **xxx**        | name of the 1st president of the vote |
| **000**        |                                                         | **zzz**        | total SIM (yes) votes                 |
| **000**        |                                                         | **vvv**        | total NÃO (no) votes                  |
| **000**        |                                                         | **www**        | total ABSTENÇÃO (abstention) votes    |
| **000**        |                                                         | **ggg**        | total OBSTRUÇÃO (obstruction) votes   |
| **000**        |                                                         | **hhh**        | total BRANCO (blank) votes            |
| **000**        |                                                         | **iii**        | total votes cast by the president     |
| **000**        |                                                         | **jjj**        | total number of voters                |
| **yyy**        | total present                                           | **kkk**        | name of the proposition               |

In [1]:
import glob
import numpy as np
import pandas as pd

paths = glob.glob('../data/sources/sessions/**/HE*')
# Column names, in the order the values appear in each header file,
# mapped to their target dtypes (np.str and np.int no longer exist in NumPy).
dtypes = {
    'header': str,
    'vote_number': str,
    'ending_date': str,
    'ending_time': str,
    'first_president': 'category',
    'yes_votes': int,
    'no_votes': int,
    'abstention_votes': int,
    'obstruction_votes': int,
    'blank_votes': int,
    'president_votes': int,
    'votes': int,
    'name': str,
}
# Collect one dict per file; DataFrame.append was removed in pandas 2.0.
rows = []
for path in paths:
    with open(path, encoding='iso-8859-1') as file:
        attributes = [attr.strip() for attr in file.read().split('\n')]
    row = dict(zip(dtypes.keys(), attributes[:13]))
    # The parent directory name identifies the legislative term.
    row['term'] = path.split('/')[-2]
    rows.append(row)
headers = pd.DataFrame(rows)

Set the correct type for each column.

In [2]:
for col, col_type in dtypes.items():
    headers[col] = headers[col].astype(col_type)

Create a column holding the date and time at which the session (or the vote, depending on the row) ended.

In [3]:
headers['ending_time'] = pd.to_datetime(
    headers['ending_date'] + ' ' + headers['ending_time'], dayfirst=True)
headers.drop('ending_date', axis=1, inplace=True)
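
Dates in these files are day-first (`dd/mm/yyyy`). As an alternative to `dayfirst=True`, the layout can be spelled out with an explicit `format`, which avoids any guessing for ambiguous dates such as `04/09/2017`. The snippet below is a small sketch on two sample values taken from the rows displayed in this notebook; the `sample` and `ending` names are illustrative only.

```python
import pandas as pd

# Two date/time pairs in the day-first layout used by the header files.
sample = pd.DataFrame({
    'ending_date': ['04/09/2017', '23/06/2015'],
    'ending_time': ['20:27:45', '20:43:45'],
})

# An explicit format removes any ambiguity between day and month.
ending = pd.to_datetime(
    sample['ending_date'] + ' ' + sample['ending_time'],
    format='%d/%m/%Y %H:%M:%S')

print(ending.dt.month.tolist())  # [9, 6]
```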

## Header code

**aabbcddde**

| Code   | Description |
|--------|-------------|
| aa     | CD (Chamber session), CC (Congress session, Chamber) or SF (Congress session, Senate) |
| bb     | number of the legislative session (2 positions) |
| c      | O (ordinary legislative session) or E (extraordinary legislative session) |
| ddd    | session number (3 positions) |
| e      | O (ordinary session) or E (extraordinary session) |
| ffffff | sequential number of the vote (6 positions) |

In [4]:
headers.head()

Unnamed: 0,abstention_votes,blank_votes,ending_time,first_president,header,name,no_votes,obstruction_votes,president_votes,term,vote_number,votes,yes_votes
0,0,0,2017-09-04 20:27:45,CARLOS MANATO,CD03O242E,345,0,0,0,2015,0,0,0
1,0,0,2015-06-23 20:43:45,DANIEL VILELA,CD01O165E,319,0,0,0,2015,0,0,0
2,3,0,2016-10-05 16:09:09,BETO MANSUR,CD02O241E,PL Nº 4567/2016 - REQUERIMENTO DE ADIAMENTO ...,262,62,1,2015,7188,335,7
3,0,0,2015-10-21 13:58:57,GILBERTO NASCIMENTO,CD01O318E,234,0,0,0,2015,0,0,0
4,4,0,2017-12-06 01:31:27,RODRIGO MAIA,CD03O378E,MPV Nº 795/2017 - DTQ 11 - PV - ART. 7º DO TE...,222,19,1,2015,8100,369,123


In [5]:
def parse_header(header, include_vote_number=False):
    attrs = pd.Series({
        'body': header[:2],
        'legislative_session_number': header[2:4],
        'legislative_schedule': header[4:5],
        'session_number': header[5:8],
        'schedule': header[8:9],
    })
    if include_vote_number:
        attrs['vote_number'] = header[9:]
    return attrs

def rename_header_categorical_variables(dataframe):
    dtypes = {
        'legislative_schedule': 'category',
        'schedule': 'category',
        'body': 'category',
    }
    for col, col_type in dtypes.items():
        dataframe[col] = dataframe[col].astype(col_type)

    categories = {
        'O': 'ordinary_session',
        'E': 'special_session',
    }

    # rename_categories no longer accepts inplace=True, so assign the result.
    for col in ['schedule', 'legislative_schedule']:
        dataframe[col] = dataframe[col].cat.rename_categories(categories)

    dataframe['body'] = dataframe['body'].cat.rename_categories({
        'CD': 'chamber_of_deputies',
        'CC': 'national_congress_chamber_of_deputies',
        'SF': 'national_congress_federal_senate',
    })
    
    return dataframe

def dataframe_with_head_variables(dataframe):
    new_cols = dataframe['header'].apply(parse_header)
    dataframe = pd.concat(
        [dataframe.drop('header', axis=1), new_cols], axis=1)
    dataframe = rename_header_categorical_variables(dataframe)
    
    return dataframe

In [6]:
headers = dataframe_with_head_variables(headers)

Because `headers` holds two distinct kinds of header, one for sessions and one for votes, we need to split them so that each can be handled appropriately.

In [7]:
import re

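def is_presence_row(row):
    # Presence headers carry the total of members present (digits) where
    # vote headers carry the proposition name.
    return bool(re.match(r'\d+', row['name']))

presence_rows = headers.apply(is_presence_row, axis=1)
# Copy the slices so later column assignments do not touch a view.
presence_headers = headers[presence_rows].copy()
vote_headers = headers[~presence_rows].copy()
del headers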

In [8]:
vote_headers.shape[0], presence_headers.shape[0]

(3618, 3983)

In [9]:
vote_specific_cols = [
    'abstention_votes',
    'blank_votes',
    'no_votes',
    'obstruction_votes',
    'president_votes',
    'vote_number',
    'votes',
    'yes_votes',
]
presence_headers = presence_headers \
    .drop(vote_specific_cols, axis=1) \
    .rename(columns={'name': 'congresspeople_present'})
presence_headers['congresspeople_present'] = \
    presence_headers['congresspeople_present'].astype(int)

In [10]:
vote_headers.head()

Unnamed: 0,abstention_votes,blank_votes,ending_time,first_president,name,no_votes,obstruction_votes,president_votes,term,vote_number,votes,yes_votes,body,legislative_schedule,legislative_session_number,schedule,session_number
2,3,0,2016-10-05 16:09:09,BETO MANSUR,PL Nº 4567/2016 - REQUERIMENTO DE ADIAMENTO ...,262,62,1,2015,7188,335,7,chamber_of_deputies,ordinary_session,2,special_session,241
4,4,0,2017-12-06 01:31:27,RODRIGO MAIA,MPV Nº 795/2017 - DTQ 11 - PV - ART. 7º DO TE...,222,19,1,2015,8100,369,123,chamber_of_deputies,ordinary_session,3,special_session,378
11,2,0,2015-06-10 18:29:51,EDUARDO CUNHA,PEC Nº 182/2007 - DTQ 27 - PMDB - PREFERÊNCIA...,268,0,1,2015,6368,385,114,chamber_of_deputies,ordinary_session,1,special_session,147
12,1,0,2017-05-16 15:25:20,JHC,MPV Nº 756/2016 - REQUERIMENTO DE RETIRADA DE...,255,61,1,2015,7538,336,18,chamber_of_deputies,ordinary_session,3,special_session,118
15,1,0,2016-12-14 22:51:22,BETO MANSUR,MPV Nº 744/2016 - DTQ 1: PT - EMENDA Nº 11,193,0,1,2015,7353,280,85,chamber_of_deputies,ordinary_session,2,special_session,334


In [11]:
presence_headers.head()

Unnamed: 0,ending_time,first_president,congresspeople_present,term,body,legislative_schedule,legislative_session_number,schedule,session_number
0,2017-09-04 20:27:45,CARLOS MANATO,345,2015,chamber_of_deputies,ordinary_session,3,special_session,242
1,2015-06-23 20:43:45,DANIEL VILELA,319,2015,chamber_of_deputies,ordinary_session,1,special_session,165
3,2015-10-21 13:58:57,GILBERTO NASCIMENTO,234,2015,chamber_of_deputies,ordinary_session,1,special_session,318
5,2015-09-22 02:19:44,RENAN CALHEIROS,72,2015,national_congress_federal_senate,ordinary_session,1,special_session,21
6,2017-09-26 20:34:57,DELEGADO EDSON MOREIRA,452,2015,chamber_of_deputies,ordinary_session,3,special_session,275


**Sanity checks**

The total number of votes in each vote is expected to equal the sum of all possible vote types.

In [12]:
vote_cols = [col for col in vote_headers.columns if col[-6:] == '_votes']
(vote_headers['votes'] == vote_headers[vote_cols].sum(axis=1)).value_counts()

True     3606
False      12
dtype: int64
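
The check above only counts how many rows fail. To see which rows disagree and by how much, one option is to keep the difference as a column and filter on it. The sketch below runs on a made-up two-row frame (the numbers are invented for illustration, not taken from the data), using the same column names as `vote_headers`.

```python
import pandas as pd

# Invented example: the first row is consistent, the second is off by one.
toy = pd.DataFrame({
    'yes_votes': [10, 20],
    'no_votes': [5, 7],
    'abstention_votes': [1, 0],
    'obstruction_votes': [0, 2],
    'blank_votes': [0, 0],
    'president_votes': [1, 1],
    'votes': [17, 31],
})

# Same column selection as the sanity check: every per-type counter.
vote_cols = [col for col in toy.columns if col.endswith('_votes')]

# Positive values mean the reported total exceeds the sum of the parts.
toy['difference'] = toy['votes'] - toy[vote_cols].sum(axis=1)
mismatches = toy[toy['difference'] != 0]
print(mismatches[['votes', 'difference']])
```

Applied to `vote_headers`, the same filter would list the rows flagged `False` above, which is a reasonable starting point for deciding whether they are data-entry errors or a vote type the layout does not describe.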

In [13]:
vote_headers.to_csv('../data/chamber_of_deputies_votes.csv', index=False)
presence_headers.to_csv('../data/chamber_of_deputies_presences.csv', index=False)

---

## Session details

**File contents**

```
aabbcddde 000000 xxxyyy zzzwwwfff ggg
```

| Line                 | Description                                              |
|----------------------|----------------------------------------------------------|
| **aabbcddde 000000** | as in the header naming scheme                           |
| **xxx**              | name of the member (40 positions)                        |
| **yyy**              | `Presente` (present) or `<------>` (absent), 8 positions |
| **zzz**              | party abbreviation (10 positions)                        |
| **www**              | name of the state (UF) (25 positions)                    |
| **fff**              | member code (3 positions)                                |
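
To make the fixed-width layout concrete, the sketch below builds two sample lines in memory and parses them with `pd.read_fwf`. It uses the column widths passed to `read_fwf` in this notebook (note the 9-character status column, one more than the 8 positions listed above). The `make_line` helper, the column names and the field values are illustrative assumptions, not real file contents.

```python
import io

import pandas as pd

# Widths as used by this notebook's read_fwf call.
WIDTHS = [16, 40, 9, 10, 25, 3]
NAMES = ['header', 'name', 'status', 'party', 'state', 'congressperson_id']

def make_line(*fields):
    # Pad each field with trailing spaces up to its fixed width.
    return ''.join(field.ljust(width) for field, width in zip(fields, WIDTHS))

# Illustrative lines; the second member's absence is invented.
sample = '\n'.join([
    make_line('CD01O257E 000000', 'ABEL MESQUITA JR.', 'Presente',
              'PDT', 'Roraima', '001'),
    make_line('CD01O257E 000000', 'EDIO LOPES', '<------>',
              'PMDB', 'Roraima', '002'),
])

parsed = pd.read_fwf(
    io.StringIO(sample), widths=WIDTHS, header=None, names=NAMES)
print(parsed[['name', 'status', 'party']])
```

`read_fwf` strips the padding around each field, so names and party abbreviations come back without trailing spaces.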

In [14]:
paths = glob.glob('../data/sources/sessions/**/LP*')
# Collect one frame per file and concatenate once at the end;
# DataFrame.append was removed in pandas 2.0.
subsets = []
for index, path in enumerate(paths):
    print('Reading {} out of {}'.format(index + 1, len(paths)))
    subset = pd.read_fwf(
        path, widths=[16, 40, 9, 10, 25, 3], header=None, encoding='iso-8859-1')
    subset['term'] = path.split('/')[-2]
    subsets.append(subset)
presences = pd.concat(subsets)

Reading 1 out of 3962
[...]

Set the correct type for each field.

In [15]:
dtypes = {
    'header': str,
    'name': 'category',
    'status': 'category',
    'party': 'category',
    'state': 'category',
    'congressperson_id': 'category',
}
presences.rename(columns={
    0: 'header',
    1: 'name',
    2: 'status',
    3: 'party',
    4: 'state',
    5: 'congressperson_id',
}, inplace=True)
for col, col_type in dtypes.items():
    presences[col] = presences[col].astype(col_type)

In [16]:
presences['congressperson_id'] = \
    presences['congressperson_id'].apply(lambda val: str(int(val)) if val else None)

Rename the categorical variables.

In [17]:
presences['status'] = presences['status'].cat.rename_categories({
    'Presente': 'present',
    '<------>': 'absent',
})

In [18]:
presences = dataframe_with_head_variables(presences)

In [19]:
presences.head()

Unnamed: 0,name,status,party,state,congressperson_id,term,body,legislative_schedule,legislative_session_number,schedule,session_number
0,ABEL MESQUITA JR.,present,PDT,Roraima,1,2015,chamber_of_deputies,ordinary_session,1,special_session,257
1,CARLOS ANDRADE,present,PHS,Roraima,3,2015,chamber_of_deputies,ordinary_session,1,special_session,257
2,EDIO LOPES,present,PMDB,Roraima,2,2015,chamber_of_deputies,ordinary_session,1,special_session,257
3,HIRAN GONÇALVES,present,PMN,Roraima,4,2015,chamber_of_deputies,ordinary_session,1,special_session,257
4,JHONATAN DE JESUS,present,PRB,Roraima,5,2015,chamber_of_deputies,ordinary_session,1,special_session,257


In [20]:
presences.to_csv('../data/chamber_of_deputies_presences_congresspeople.csv', index=False)

---

## Vote details

**File contents**

```
aabbcddde ffffff xxx yyy  www ggg hhh iii
```

| Vote line            | Vote description |
|----------------------|------------------|
| **aabbcddde ffffff** | as in the header naming scheme |
| **xxx**              | name of the member (40 positions) |
| **yyy**              | how the member voted (`Sim`, `Não`, `Abstenção`, `Obstrução`, `Branco`), 10 positions<br>Secret vote = `Presente`<br>Did not take part in the vote = `<------->` |
| **www**              | party abbreviation (10 positions) |
| **ggg**              | name of the state (UF) (25 positions) |
| **hhh**              | member code (3 positions) |

In [21]:
paths = glob.glob('../data/sources/sessions/**/LV*')
# Collect one frame per file and concatenate once at the end;
# DataFrame.append was removed in pandas 2.0.
subsets = []
for index, path in enumerate(paths):
    print('Reading {} out of {}'.format(index + 1, len(paths)))
    subset = pd.read_fwf(
        path, widths=[16, 40, 10, 10, 25, 3], header=None, encoding='iso-8859-1')
    subset['term'] = path.split('/')[-2]
    subsets.append(subset)
votes = pd.concat(subsets)

Reading 1 out of 3616
[...]
Reading 2710 out of 3616
Reading 2711 out of 3616
Reading 2712 out of 3616


Reading 3001 out of 3616
Reading 3002 out of 3616
Reading 3003 out of 3616
Reading 3004 out of 3616
Reading 3005 out of 3616
Reading 3006 out of 3616
Reading 3007 out of 3616
Reading 3008 out of 3616
Reading 3009 out of 3616
Reading 3010 out of 3616
Reading 3011 out of 3616
Reading 3012 out of 3616
Reading 3013 out of 3616
Reading 3014 out of 3616
Reading 3015 out of 3616
Reading 3016 out of 3616
Reading 3017 out of 3616
Reading 3018 out of 3616
Reading 3019 out of 3616
Reading 3020 out of 3616
Reading 3021 out of 3616
Reading 3022 out of 3616
Reading 3023 out of 3616
Reading 3024 out of 3616
Reading 3025 out of 3616
Reading 3026 out of 3616
Reading 3027 out of 3616
Reading 3028 out of 3616
Reading 3029 out of 3616
Reading 3030 out of 3616
Reading 3031 out of 3616
Reading 3032 out of 3616
Reading 3033 out of 3616
Reading 3034 out of 3616
Reading 3035 out of 3616
Reading 3036 out of 3616
Reading 3037 out of 3616
Reading 3038 out of 3616
Reading 3039 out of 3616
Reading 3040 out of 3616


Reading 3329 out of 3616
Reading 3330 out of 3616
Reading 3331 out of 3616
Reading 3332 out of 3616
Reading 3333 out of 3616
Reading 3334 out of 3616
Reading 3335 out of 3616
Reading 3336 out of 3616
Reading 3337 out of 3616
Reading 3338 out of 3616
Reading 3339 out of 3616
Reading 3340 out of 3616
Reading 3341 out of 3616
Reading 3342 out of 3616
Reading 3343 out of 3616
Reading 3344 out of 3616
Reading 3345 out of 3616
Reading 3346 out of 3616
Reading 3347 out of 3616
Reading 3348 out of 3616
Reading 3349 out of 3616
Reading 3350 out of 3616
Reading 3351 out of 3616
Reading 3352 out of 3616
Reading 3353 out of 3616
Reading 3354 out of 3616
Reading 3355 out of 3616
Reading 3356 out of 3616
Reading 3357 out of 3616
Reading 3358 out of 3616
Reading 3359 out of 3616
Reading 3360 out of 3616
Reading 3361 out of 3616
Reading 3362 out of 3616
Reading 3363 out of 3616
Reading 3364 out of 3616
Reading 3365 out of 3616
Reading 3366 out of 3616
Reading 3367 out of 3616
Reading 3368 out of 3616


In [22]:
dtypes = {
    'header': str,  # the np.str alias was removed in NumPy 1.24; the builtin str is equivalent
    'name': 'category',
    'vote': 'category',
    'party': 'category',
    'state': 'category',
    'congressperson_id': 'category',
}
votes.rename(columns={
    0: 'header',
    1: 'name',
    2: 'vote',
    3: 'party',
    4: 'state',
    5: 'congressperson_id',
}, inplace=True)
for col, col_type in dtypes.items():
    votes[col] = votes[col].astype(col_type)

# Normalize IDs to plain integer strings (e.g. '003' -> '3'); falsy values become None.
votes['congressperson_id'] = \
    votes['congressperson_id'].apply(lambda val: str(int(val)) if val else None)

In [23]:
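# Translate the vote labels to English. The inplace argument no longer exists
# in pandas 2.0+, so assign the returned Series back to the column.
votes['vote'] = votes['vote'].cat.rename_categories({
    'Sim': 'yes',
    'Não': 'no',
    'Abstenção': 'abstention',
    'Obstrução': 'obstruction',
    'Branco': 'blank',
})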
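
As a small, self-contained illustration of the relabeling in the cell above, the sketch below applies the same mapping to a toy Series. The sample values are made up for the example and are not taken from the data set. It relies on the value returned by `rename_categories` rather than on in-place modification.

```python
import pandas as pd

# Toy data for illustration only; not taken from the Chamber's files.
sample = pd.Series(['Sim', 'Não', 'Sim', 'Abstenção'], dtype='category')

labels = {
    'Sim': 'yes',
    'Não': 'no',
    'Abstenção': 'abstention',
    'Obstrução': 'obstruction',
    'Branco': 'blank',
}

# rename_categories returns a new Series; mapping keys that do not occur
# among the categories (here 'Obstrução' and 'Branco') are simply ignored.
translated = sample.cat.rename_categories(labels)
print(translated.tolist())
```

Passing the whole mapping is convenient here because a given file may not contain every kind of vote.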

In [24]:
votes = dataframe_with_head_variables(votes)

In [25]:
votes.head()

|   | name | vote | party | state | congressperson_id | term | body | legislative_schedule | legislative_session_number | schedule | session_number |
|---|------|------|-------|-------|-------------------|------|------|----------------------|----------------------------|----------|----------------|
| 0 | CARLOS ANDRADE | no | PHS | Roraima | 3 | 2015 | chamber_of_deputies | ordinary_session | 1 | ordinary_session | 70 |
| 1 | EDIO LOPES | no | PMDB | Roraima | 2 | 2015 | chamber_of_deputies | ordinary_session | 1 | ordinary_session | 70 |
| 2 | HIRAN GONÇALVES | no | PMN | Roraima | 4 | 2015 | chamber_of_deputies | ordinary_session | 1 | ordinary_session | 70 |
| 3 | JHONATAN DE JESUS | no | PRB | Roraima | 5 | 2015 | chamber_of_deputies | ordinary_session | 1 | ordinary_session | 70 |
| 4 | SHÉRIDAN | no | PSDB | Roraima | 8 | 2015 | chamber_of_deputies | ordinary_session | 1 | ordinary_session | 70 |


In [26]:
votes.to_csv('../data/chamber_of_deputies_votes_congresspeople.csv', index=False)
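
CSV does not store dtypes, so the categorical columns and the string IDs are inferred again when the file is read back, and `congressperson_id` would come back as an integer. A minimal sketch of restoring them on load, assuming the column names shown above and using an in-memory buffer with made-up rows in place of the real file:

```python
import io

import pandas as pd

# Made-up rows standing in for the exported CSV; names are placeholders.
csv_text = (
    "name,vote,party,state,congressperson_id\n"
    "DEPUTY A,no,PHS,Roraima,3\n"
    "DEPUTY B,yes,PMDB,Roraima,2\n"
)

# Declare the dtypes explicitly so IDs stay strings and labels stay categorical.
restored = pd.read_csv(
    io.StringIO(csv_text),
    dtype={
        'vote': 'category',
        'party': 'category',
        'state': 'category',
        'congressperson_id': str,
    },
)
print(restored.dtypes)
```

With the real file, the buffer would be replaced by the path passed to `to_csv` above.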

---