# Number of shrinks per city 
We extracted the data from the DATASUS TabNet interface, using the following parameters:

- CNES - RECURSOS HUMANOS - PROFISSIONAIS - INDIVÍDUOS - SEGUNDO CBO 2002 - BRASIL
- Period: 2007 to 2020

The data can be found at `CSV/TabNet/shrink_07_20.csv`.

In [1]:
import pandas as pd
root = "../"

Read the file with the shrink data

In [2]:
shrink_data = pd.read_csv(root + "CSV/TabNet/shrink_07_20.csv", sep=";")
shrink_data.head()

Unnamed: 0,Município,2007/Ago,2007/Set,2007/Out,2007/Nov,2007/Dez,2008/Jan,2008/Fev,2008/Mar,2008/Abr,...,2019/Set,2019/Out,2019/Nov,2019/Dez,2020/Jan,2020/Fev,2020/Mar,2020/Abr,2020/Mai,2020/Jun
0,110001 Alta Floresta D'Oeste,2,2,2,2,2,2,2,2,1,...,1,1,1,1,1,1,1,1,1,1
1,110037 Alto Alegre dos Parecis,-,-,-,-,-,-,-,-,-,...,1,1,1,1,1,1,1,1,1,1
2,110040 Alto Paraíso,1,1,1,1,1,1,1,1,1,...,2,2,2,2,2,2,2,2,1,2
3,110034 Alvorada D'Oeste,-,-,-,-,-,-,-,-,-,...,1,1,1,1,1,1,1,1,1,1
4,110002 Ariquemes,5,6,6,6,6,6,5,5,5,...,24,24,25,25,24,24,24,25,25,25


Create a column with the city code (`MUNCOD`), replace "-" values with 0, and convert every column to integers

In [3]:
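# The city code is the first space-separated token of the "Município" label
shrink_data['MUNCOD'] = shrink_data['Município'].str.split(' ', expand=True)[0]
shrink_data.drop('Município', axis=1, inplace=True)
# TabNet uses "-" for months with no registered professionals; treat it as 0
shrink_data = shrink_data.replace("-", 0)
shrink_data = shrink_data.astype(int)
shrink_data.head()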

Unnamed: 0,2007/Ago,2007/Set,2007/Out,2007/Nov,2007/Dez,2008/Jan,2008/Fev,2008/Mar,2008/Abr,2008/Mai,...,2019/Out,2019/Nov,2019/Dez,2020/Jan,2020/Fev,2020/Mar,2020/Abr,2020/Mai,2020/Jun,MUNCOD
0,2,2,2,2,2,2,2,2,1,1,...,1,1,1,1,1,1,1,1,1,110001
1,0,0,0,0,0,0,0,0,0,0,...,1,1,1,1,1,1,1,1,1,110037
2,1,1,1,1,1,1,1,1,1,1,...,2,2,2,2,2,2,2,1,2,110040
3,0,0,0,0,0,0,0,0,0,0,...,1,1,1,1,1,1,1,1,1,110034
4,5,6,6,6,6,6,5,5,5,5,...,24,25,25,24,24,24,25,25,25,110002
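
The cleaning step can be tried in isolation on a two-row toy frame. This is only a sketch: the name `toy` is illustrative, the values are copied from the preview above, and it assumes every `Município` label starts with the city code followed by a space. It replaces "-" with the string "0" (rather than the integer 0) so that each column holds a single type before the cast.

```python
import pandas as pd

# Toy frame mimicking the TabNet layout; values copied from the preview above.
toy = pd.DataFrame({
    "Município": ["110001 Alta Floresta D'Oeste", "110037 Alto Alegre dos Parecis"],
    "2007/Ago": ["2", "-"],
    "2008/Jan": ["2", "-"],
})

# The city code is the first space-separated token of the label.
toy["MUNCOD"] = toy["Município"].str.split(" ", n=1).str[0]
toy = toy.drop(columns="Município")

# "-" marks months with no registered professionals; map it to "0", then cast.
toy = toy.replace("-", "0").astype(int)
print(toy)
```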


Remove data from 2007 and 2019/2020

In [4]:
shrink_data = shrink_data.drop(columns=shrink_data.filter(regex='2007|2019|2020').columns)

Only keep January data

In [5]:
shrink_data = shrink_data.filter(regex='Jan|MUNCOD')

Rename columns

In [6]:
shrink_data.columns = ["08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "MUNCOD"]
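
Assigning a hardcoded list of labels depends on the columns arriving in exactly the expected order. As a hedged alternative, the two-digit year can be derived from each header instead. The sketch below assumes the `YYYY/Mon` header format shown in the previews; the frame `toy` and its values are illustrative.

```python
import pandas as pd

# Toy frame with TabNet-style headers ("YYYY/Mon"); the values are illustrative.
toy = pd.DataFrame({
    "2008/Jan": [2], "2008/Fev": [2], "2009/Jan": [1], "MUNCOD": [110001],
})

# Keep the January snapshots plus the city code.
january = toy.filter(regex="Jan|MUNCOD")

# "2008/Jan" -> "08": keep the last two digits of the year; leave MUNCOD untouched.
january = january.rename(
    columns=lambda c: c.split("/")[0][-2:] if "/" in c else c
)
print(list(january.columns))  # ['08', '09', 'MUNCOD']
```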

Save to CSV

In [7]:
shrink_data.to_csv(root + 'CSV/Shrink/shrink_count_08_18.csv')

# Rate of shrinks per city

Load the population data and the shrink counts saved above

In [8]:
population = pd.read_csv(root + 'CSV/Population/population_08_18.csv', index_col=[0])
shrink = pd.read_csv(root + 'CSV/Shrink/shrink_count_08_18.csv', index_col=[0])
shrink.head()

Unnamed: 0,08,09,10,11,12,13,14,15,16,17,18,MUNCOD
0,2,1,1,1,1,1,1,2,2,1,1,110001
1,0,0,1,2,1,2,1,2,2,1,1,110037
2,1,1,1,1,0,0,0,0,0,0,1,110040
3,0,0,0,0,0,1,1,1,1,1,1,110034
4,6,6,7,9,10,8,9,10,9,20,24,110002


Merge the population data with the shrink data on `MUNCOD`

In [9]:
df = pd.merge(population, shrink, on="MUNCOD")
df.head()

Unnamed: 0,POP_18,MUNCOD,POP_17,POP_16,POP_15,POP_14,POP_13,POP_12,POP_11,POP_10,...,09,10,11,12,13,14,15,16,17,18
0,23167,110001,25437,25506,25578,25652,25728,24069,24228,24422,...,1,1,1,1,1,1,2,2,1,1
1,106168,110002,107345,105896,104401,102860,101269,92747,91570,90354,...,6,7,9,10,8,9,10,9,20,24
2,5438,110003,6224,6289,6355,6424,6495,6132,6221,6309,...,0,0,0,0,0,1,1,1,2,2
3,84813,110004,88507,87877,87226,86556,85863,79330,78959,78601,...,5,7,8,6,14,15,16,18,22,25
4,16444,110005,17934,17959,17986,18013,18041,16852,16939,17030,...,2,2,2,2,2,2,2,2,2,2
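
`pd.merge` performs an inner join by default, so a city that appears in only one of the two tables is dropped without warning. One way to check for such cases, sketched here on toy inputs, is an outer merge with `indicator=True`. The populations are copied from the preview above; the shrink counts for 110001 and 110002 come from it as well, and the code 999999 is a made-up unmatched entry.

```python
import pandas as pd

# Toy inputs; 999999 is a hypothetical code that exists only in the shrink table.
population = pd.DataFrame({
    "MUNCOD": [110001, 110002, 110003],
    "POP_18": [23167, 106168, 5438],
})
shrink = pd.DataFrame({
    "MUNCOD": [110001, 110002, 999999],
    "18": [1, 24, 3],
})

# An outer merge with indicator=True adds a "_merge" column that records
# whether each row came from the left table, the right table, or both.
check = pd.merge(population, shrink, on="MUNCOD", how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["MUNCOD", "_merge"]])
```

If `unmatched` is empty on the real files, the inner join loses nothing.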


Calculate rates per 100,000 inhabitants

In [10]:
years = ["08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18"]
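for year in years:
    # Rate per 100,000 inhabitants: count / population * 100000
    df['RATE_' + year] = df[year]/df['POP_' + year] * 100000
    # Drop the raw count and population once the rate is stored
    df = df.drop([year, 'POP_' + year], axis=1)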
df.head()

Unnamed: 0,MUNCOD,RATE_08,RATE_09,RATE_10,RATE_11,RATE_12,RATE_13,RATE_14,RATE_15,RATE_16,RATE_17,RATE_18
0,110001,8.13769,4.106102,4.094669,4.127456,4.154722,3.886816,3.898332,7.81922,7.841292,3.931281,4.316485
1,110002,7.093792,7.01418,7.747305,9.828546,10.78202,7.899752,8.749757,9.578452,8.498905,18.631515,22.605682
2,110003,0.0,0.0,0.0,0.0,0.0,0.0,15.566625,15.735641,15.900779,32.133676,36.778227
3,110004,6.388715,6.355259,8.905739,10.131841,7.563343,16.305044,17.329821,18.343155,20.483175,24.856791,29.476613
4,110005,11.916111,12.032246,11.743981,11.807072,11.868028,11.08586,11.103092,11.11976,11.136478,11.152002,12.162491
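
As a sanity check, the 2018 rates for the first two cities can be reproduced by hand from the counts and populations shown in the merged preview. This is a small sketch; the frame name `toy` is illustrative.

```python
import pandas as pd

# 2018 counts and populations, copied from the merged preview above.
toy = pd.DataFrame({
    "MUNCOD": [110001, 110002],
    "18": [1, 24],
    "POP_18": [23167, 106168],
})

# Rate per 100,000 inhabitants = count / population * 100000.
toy["RATE_18"] = toy["18"] / toy["POP_18"] * 100000
print(toy[["MUNCOD", "RATE_18"]].round(6))
```

The results agree with the `RATE_18` values in the table above (about 4.316 and 22.606).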


Export to CSV

In [11]:
df.to_csv(root + 'CSV/Shrink/shrink_rates_08_18.csv')