# Analysis of Public Spending


The data come from SICONFI (https://siconfi.tesouro.gov.br/siconfi/index.jsf).

This first stage only presents the dataset and its particularities, along with some statistical analyses.


In [38]:
import pandas as pd

df = pd.read_csv('data/finbra.csv', skiprows=4, encoding = 'ISO-8859-1', sep=";", decimal=',')
df.head(-15)

Unnamed: 0,Instituição,Cod.IBGE,UF,População,Coluna,Conta,Valor
0,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,Despesas Exceto Intraorçamentárias,22065362.39
1,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,01 - Legislativa,797417.88
2,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,01.031 - Ação Legislativa,797417.88
3,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,04 - Administração,2282545.92
4,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,04.122 - Administração Geral,1620112.75
...,...,...,...,...,...,...,...
1013686,Prefeitura Municipal de Reserva do Cabaçal - MT,5107156,MT,2638,Inscrição de Restos a Pagar Processados,08.244 - Assistência Comunitária,2829.27
1013687,Prefeitura Municipal de Reserva do Cabaçal - MT,5107156,MT,2638,Inscrição de Restos a Pagar Processados,10 - Saúde,46610.38
1013688,Prefeitura Municipal de Reserva do Cabaçal - MT,5107156,MT,2638,Inscrição de Restos a Pagar Processados,10.301 - Atenção Básica,46600.03
1013689,Prefeitura Municipal de Reserva do Cabaçal - MT,5107156,MT,2638,Inscrição de Restos a Pagar Processados,10.302 - Assistência Hospitalar e Ambulatorial,10.15


# Only committed, top-level expenses

As the table shows, expenses are stored in the Valor column and are posted to sub-accounts, each group having a totalizing account.

For now we are interested in the synthetic (totalizing) accounts.

We are also interested only in Despesas Empenhadas (committed expenses, in the Coluna field).

So we build a dataset with only these accounts and then turn rows into columns (pivot).

We also remove the accounts whose name contains "Intraorçamentárias" (for example "Despesas Exceto Intraorçamentárias"), since they appear to be the accumulated total of all spending.

In [39]:
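# Keep only committed expenses (the 'Coluna' field).
df_empenhadas = df.loc[df['Coluna'] == 'Despesas Empenhadas']
# Sub-accounts have a dotted code (e.g. '01.031'); the raw string avoids an invalid-escape warning.
df_desp_principais = df_empenhadas.loc[~df_empenhadas['Conta'].str.contains(r'\.')]
# Drop the 'Intraorçamentárias' total and any account whose name contains 'FU'.
df_desp_principais = df_desp_principais.loc[~df_desp_principais['Conta'].str.contains('Intraorçamentárias')]
df_desp_principais = df_desp_principais.loc[~df_desp_principais['Conta'].str.contains('FU')]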

df_desp_principais.head(-15)

Unnamed: 0,Instituição,Cod.IBGE,UF,População,Coluna,Conta,Valor
1,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,01 - Legislativa,797417.88
3,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,04 - Administração,2282545.92
7,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,06 - Segurança Pública,82750.24
9,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,08 - Assistência Social,1069282.98
13,Prefeitura Municipal de Bonfinópolis - GO,5203559,GO,8876,Despesas Empenhadas,09 - Previdência Social,1911356.20
...,...,...,...,...,...,...,...
1013422,Prefeitura Municipal de Grossos - RN,2404408,RN,10293,Despesas Empenhadas,15 - Urbanismo,3288919.51
1013425,Prefeitura Municipal de Grossos - RN,2404408,RN,10293,Despesas Empenhadas,18 - Gestão Ambiental,2667954.50
1013427,Prefeitura Municipal de Grossos - RN,2404408,RN,10293,Despesas Empenhadas,20 - Agricultura,405134.56
1013429,Prefeitura Municipal de Grossos - RN,2404408,RN,10293,Despesas Empenhadas,24 - Comunicações,162475.56
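
As a rough illustration of the filtering rule used above, the sketch below applies it to a handful of account names copied from the raw table. It works on plain Python strings rather than pandas, and the helper name `is_top_level` is an assumption made for this example, not part of the notebook.

```python
# Account names copied from the first rows of the raw FINBRA table.
contas = [
    'Despesas Exceto Intraorçamentárias',
    '01 - Legislativa',
    '01.031 - Ação Legislativa',
    '04 - Administração',
    '04.122 - Administração Geral',
]


def is_top_level(conta):
    """Hypothetical helper mirroring the three filters used in the notebook."""
    has_dotted_code = '.' in conta            # sub-accounts look like '01.031 - ...'
    is_total = 'Intraorçamentárias' in conta  # accumulated total row
    has_fu = 'FU' in conta                    # accounts whose name contains 'FU'
    return not (has_dotted_code or is_total or has_fu)


principais = [c for c in contas if is_top_level(c)]
print(principais)
```

Only the two-digit function accounts survive, which is the set the pivot step works with.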


# Building a flat table

For machine learning we need a flat table: one row per municipality and one column per account.

In [40]:
df_pivot = df_desp_principais.pivot_table(index=['Instituição','Cod.IBGE','População'], columns='Conta',
                     values='Valor', aggfunc='first').reset_index()
df_pivot.head(-15)

Conta,Instituição,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
0,Prefeitura Municipal de Abadia de Goiás - GO,5200050,8053,1220441.23,,,4914829.91,,49858.02,,...,,64084.02,,,112178.49,,185771.05,2193049.22,,451824.50
1,Prefeitura Municipal de Abadia dos Dourados - MG,3100104,7059,1060127.70,,,2999053.15,,46889.08,,...,,327671.48,,,,,,1348000.65,375162.76,1100572.20
2,Prefeitura Municipal de Abadiânia - GO,5200100,18427,1938697.16,31.00,,4720798.15,,7048.57,,...,,435308.34,,,,,,713223.32,367990.55,2217212.20
3,Prefeitura Municipal de Abaetetuba - PA,1500107,151934,,,,22693870.56,,,,...,,2167480.56,,,,,494143.10,282200.00,,4993062.83
4,Prefeitura Municipal de Abaeté - MG,3100203,23574,1461383.45,876485.67,,5665483.91,,30937.85,,...,,487438.13,,,9599.92,2028.45,965307.22,1226039.95,80960.35,394912.30
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5433,Prefeitura Municipal de Águas Belas - PE,2600500,42831,2321312.77,179810.39,,9923908.82,,,,...,,6477.76,,,1100.00,146570.00,815895.39,23300.00,,1210872.16
5434,Prefeitura Municipal de Águas Formosas - MG,3100906,19416,1679369.17,,186567.41,5844847.02,116634.3,131823.01,,...,,605164.44,,,3032.71,24258.13,892045.56,501509.19,120333.44,1899449.63
5435,Prefeitura Municipal de Águas Frias - SC,4200556,2397,632066.67,,,2075181.73,,18854.90,,...,,2377975.51,,118478.89,,,,2588196.28,298424.65,276264.51
5436,Prefeitura Municipal de Águas Lindas de Goiás ...,5200258,191499,7548974.41,,,33818905.01,,355576.64,,...,,96685.14,,605672.17,,,,1165107.92,124417.75,8194820.17
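
To make the long-to-wide step easier to follow, here is a minimal sketch of the same `pivot_table` call on a toy frame. The municipality names `A` and `B` and all values are invented for illustration; only the column names follow the notebook.

```python
import pandas as pd

# Hypothetical long-format rows: municipality B reports no '01 - Legislativa'.
long_df = pd.DataFrame({
    'Instituição': ['A', 'A', 'B'],
    'Conta': ['01 - Legislativa', '04 - Administração', '04 - Administração'],
    'Valor': [10.0, 20.0, 30.0],
})

# One row per municipality, one column per account; unreported accounts become NaN.
wide_df = long_df.pivot_table(index='Instituição', columns='Conta',
                              values='Valor', aggfunc='first').reset_index()
print(wide_df)
```

The missing account for `B` shows up as `NaN`, which is why the real pivoted table contains so many empty cells.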


# Column analysis

We are working with very large numbers, so the next step is to smooth and normalize the data.

This prevents one expense from carrying more weight than another.



In [31]:
df_per_capita = df_pivot.copy()  # copy, so the transforms below do not overwrite df_pivot
df_per_capita.describe()


Conta,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,08 - Assistência Social,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
count,5453.0,5453.0,4945.0,1280.0,515.0,5442.0,286.0,2464.0,12.0,5438.0,...,201.0,4981.0,39.0,872.0,2178.0,949.0,1735.0,4184.0,5061.0,4401.0
mean,3256557.0,36882.37,3113544.0,1328878.0,1455067.0,12212800.0,68079.62,2381498.0,164504.5,3261607.0,...,1279614.0,797579.8,997974.9,406550.0,843286.4,549868.7,1037772.0,3327784.0,728066.1,5417169.0
std,976534.7,215626.8,17426850.0,7984227.0,4590163.0,41315170.0,177963.3,17624070.0,340026.5,20715180.0,...,8090019.0,1239128.0,4971009.0,1152813.0,5058774.0,5422439.0,3905598.0,79984030.0,3303484.0,84135310.0
min,1100015.0,815.0,6779.06,31.0,140.0,1.0,15.0,13.8,166.8,8400.0,...,92.0,65.0,376.5,82.94,20.0,20.06,36.5,35.71,80.3,37.06
25%,2513703.0,5507.0,744651.5,88310.93,84332.6,2760989.0,7120.96,24660.56,6482.02,800057.9,...,17385.0,183492.9,19702.62,29561.71,22776.56,7250.0,126491.5,248663.3,78959.62,300755.6
50%,3147600.0,11621.0,1133867.0,195976.8,220231.9,4372693.0,29091.44,82429.73,22107.9,1256973.0,...,93216.0,453078.0,82003.76,115149.3,115792.2,24678.45,284495.7,785383.2,217139.1,699216.0
75%,4119152.0,25232.0,2069232.0,568038.0,864509.6,8415655.0,59537.09,490795.0,109950.8,2334404.0,...,481618.7,990653.1,258752.5,353541.8,478364.2,83902.67,700910.4,1757284.0,529086.5,1732038.0
max,5222302.0,12038180.0,831980600.0,194165100.0,43845670.0,1400976000.0,2131648.0,549963600.0,1169607.0,1255894000.0,...,111430100.0,29934000.0,31199880.0,24153850.0,167578600.0,123794500.0,100009100.0,5128618000.0,176215600.0,5079563000.0


# Smoothing - Per capita spending

Here we convert spending to per capita values.
Many columns (accounts) have few entries (count, the number of rows with a value).

For example, Defesa Nacional has only 286 entries out of a possible 5453.

Relações Exteriores = 12 / 5453

Ciência e Tecnologia = 201 / 5453

Even so, we keep all the data for now.
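
The coverage of these sparse accounts can be expressed as a share of all municipalities. The sketch below is only a worked calculation: the counts are copied from the `describe()` output above, and the total of 5453 rows is taken from the `count` reported for Cod.IBGE and População.

```python
# Non-null counts copied from the describe() output.
total_rows = 5453  # count reported for Cod.IBGE and População
counts = {
    '05 - Defesa Nacional': 286,
    '07 - Relações Exteriores': 12,
    '19 - Ciência e Tecnologia': 201,
}

# Share of municipalities that reported each account.
coverage = {conta: n / total_rows for conta, n in counts.items()}
for conta, share in coverage.items():
    print(f'{conta}: {share:.2%}')
```

All three accounts are reported by well under a tenth of the municipalities, so most of their cells are empty after the pivot.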


In [32]:
# Divide every expense column (everything after Instituição, Cod.IBGE, População) by the population.
expense_cols = df_per_capita.columns[3:]
df_per_capita[expense_cols] = df_per_capita[expense_cols].div(df_per_capita['População'], axis=0)
        
df_per_capita.describe()


Conta,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,08 - Assistência Social,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
count,5453.0,5453.0,4945.0,1280.0,515.0,5442.0,286.0,2464.0,12.0,5438.0,...,201.0,4981.0,39.0,872.0,2178.0,949.0,1735.0,4184.0,5061.0,4401.0
mean,3256557.0,36882.37,125.657316,29.956377,31.258697,509.02256,3.925224,22.136229,27.217578,139.265037,...,8.668469,84.159159,39.567914,23.737193,28.689027,5.995923,39.370585,148.298705,32.389539,83.385071
std,976534.7,215626.8,82.377824,55.913052,78.973693,381.182251,10.084874,43.144585,64.203121,99.700393,...,17.154779,145.287581,112.952071,55.555253,119.116177,9.973279,93.206628,224.813095,47.017351,95.79206
min,1100015.0,815.0,0.229364,0.000725,0.010524,0.000181,0.001406,0.000501,0.005875,0.433631,...,0.002045,0.001449,0.068767,0.001787,0.000521,0.002317,0.000603,0.000885,0.00186,0.002039
25%,2513703.0,5507.0,76.045213,7.010401,6.052184,274.7942,0.339666,2.742864,0.014847,77.078479,...,0.445221,14.340955,1.958267,2.234286,1.758471,0.69558,12.97526,17.691811,7.304285,33.869234
50%,3147600.0,11621.0,100.315682,18.033328,16.113028,401.154898,1.403299,7.496191,0.099404,110.914463,...,2.894453,35.493761,8.262173,8.348095,7.922471,2.617976,31.968505,65.001861,18.988672,60.851837
75%,4119152.0,25232.0,150.063752,34.757053,29.946602,617.772752,4.505223,22.910619,0.571735,166.316113,...,8.264472,88.222568,26.718927,22.131517,26.101545,6.835955,52.327995,183.88349,39.561795,104.66062
max,5222302.0,12038180.0,1328.045223,933.360817,1022.729435,6385.701551,142.242596,917.209223,190.551755,1370.927649,...,107.857544,1730.061327,692.333385,618.420028,4420.551051,92.648824,3516.958022,3353.142675,1304.284753,3147.233157
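
As a sanity check on the per capita transform, the division can be reproduced by hand for one municipality. The sketch below uses the first row of the pivoted table (Abadia de Goiás - GO); the variable names are chosen for this example only.

```python
# Values copied from the first row of df_pivot (Abadia de Goiás - GO).
populacao = 8053
legislativa = 1220441.23    # 01 - Legislativa, in R$
administracao = 4914829.91  # 04 - Administração, in R$

# Per capita spending is simply the amount divided by the population.
legislativa_pc = legislativa / populacao
administracao_pc = administracao / populacao
print(round(legislativa_pc, 6), round(administracao_pc, 6))
```

These figures agree with the per capita values the notebook prints for this municipality in the later tables (151.551128 and 610.310432).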


# Smoothing - Log of population

Since few cities have more than 2 million inhabitants, they would be anomalies on their own.
Applying the log compresses the population scale, so cities of similar size sit close together.

In [33]:
import numpy as np

df_per_capita['População'] = np.log(df_per_capita['População'].values)
df_per_capita.head(-15)

Conta,Instituição,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
0,Prefeitura Municipal de Abadia de Goiás - GO,5200050,8.993800,151.551128,,,610.310432,,6.191236,,...,,7.957782,,,13.930025,,23.068552,272.326986,,56.106358
1,Prefeitura Municipal de Abadia dos Dourados - MG,3100104,8.862059,150.181003,,,424.855242,,6.642454,,...,,46.418966,,,,,,190.961985,53.146729,155.910497
2,Prefeitura Municipal de Abadiânia - GO,5200100,9.821572,105.209592,0.001682,,256.189187,,0.382513,,...,,23.623397,,,,,,38.705341,19.970182,120.324101
3,Prefeitura Municipal de Abaetetuba - PA,1500107,11.931201,,,,149.366637,,,,...,,14.265935,,,,,3.252354,1.857385,,32.863367
4,Prefeitura Municipal de Abaeté - MG,3100203,10.067900,61.991323,37.180185,,240.327645,,1.312372,,...,,20.676938,,,0.407225,0.086046,40.947960,52.008142,3.434307,16.752028
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5433,Prefeitura Municipal de Águas Belas - PE,2600500,10.665017,54.197025,4.198137,,231.699209,,,,...,,0.151240,,,0.025682,3.422054,19.049179,0.543999,,28.270929
5434,Prefeitura Municipal de Águas Formosas - MG,3100906,9.873853,86.494086,,9.608952,301.032500,6.007123,6.789401,,...,,31.168337,,,0.156196,1.249389,45.943838,25.829686,6.197643,97.829091
5435,Prefeitura Municipal de Águas Frias - SC,4200556,7.781973,263.690726,,,865.741231,,7.866041,,...,,992.063208,,49.427989,,,,1079.764823,124.499228,115.254280
5436,Prefeitura Municipal de Águas Lindas de Goiás ...,5200258,12.162638,39.420438,,,176.600948,,1.856807,,...,,0.504886,,3.162795,,,,6.084146,0.649704,42.793018
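
To see how much the natural log compresses the population scale, the sketch below applies `math.log` to three populations taken from the tables above. The largest value is the `max` shown by `describe()`, which is a rounded display, so treat it as approximate.

```python
import math

# Populations copied from the notebook output.
populacoes = {
    'Abadia de Goiás - GO': 8053,
    'Águas Lindas de Goiás': 191499,
    'largest municipality (max, as displayed)': 12038180,
}

# Natural log, as np.log does in the notebook.
logs = {nome: math.log(p) for nome, p in populacoes.items()}
for nome, valor in logs.items():
    print(f'{nome}: {valor:.6f}')

# Raw ratio versus log difference between the largest and the smallest entry.
raw_ratio = 12038180 / 8053
log_gap = logs['largest municipality (max, as displayed)'] - logs['Abadia de Goiás - GO']
```

A population roughly 1,500 times larger ends up only about 7 units away on the log scale.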


# Filling in values for unreported expenses

As a first experiment, we set all missing values to zero and then normalize.

Note that filling with zero is not the correct treatment; this is only an experiment with the raw data.


In [34]:
df_per_capita = df_per_capita.fillna(0)  # unreported expenses become 0
df_per_capita.head(-15)

Conta,Instituição,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
0,Prefeitura Municipal de Abadia de Goiás - GO,5200050,8.993800,151.551128,0.000000,0.000000,610.310432,0.000000,6.191236,0.0,...,0.0,7.957782,0.0,0.000000,13.930025,0.000000,23.068552,272.326986,0.000000,56.106358
1,Prefeitura Municipal de Abadia dos Dourados - MG,3100104,8.862059,150.181003,0.000000,0.000000,424.855242,0.000000,6.642454,0.0,...,0.0,46.418966,0.0,0.000000,0.000000,0.000000,0.000000,190.961985,53.146729,155.910497
2,Prefeitura Municipal de Abadiânia - GO,5200100,9.821572,105.209592,0.001682,0.000000,256.189187,0.000000,0.382513,0.0,...,0.0,23.623397,0.0,0.000000,0.000000,0.000000,0.000000,38.705341,19.970182,120.324101
3,Prefeitura Municipal de Abaetetuba - PA,1500107,11.931201,0.000000,0.000000,0.000000,149.366637,0.000000,0.000000,0.0,...,0.0,14.265935,0.0,0.000000,0.000000,0.000000,3.252354,1.857385,0.000000,32.863367
4,Prefeitura Municipal de Abaeté - MG,3100203,10.067900,61.991323,37.180185,0.000000,240.327645,0.000000,1.312372,0.0,...,0.0,20.676938,0.0,0.000000,0.407225,0.086046,40.947960,52.008142,3.434307,16.752028
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5433,Prefeitura Municipal de Águas Belas - PE,2600500,10.665017,54.197025,4.198137,0.000000,231.699209,0.000000,0.000000,0.0,...,0.0,0.151240,0.0,0.000000,0.025682,3.422054,19.049179,0.543999,0.000000,28.270929
5434,Prefeitura Municipal de Águas Formosas - MG,3100906,9.873853,86.494086,0.000000,9.608952,301.032500,6.007123,6.789401,0.0,...,0.0,31.168337,0.0,0.000000,0.156196,1.249389,45.943838,25.829686,6.197643,97.829091
5435,Prefeitura Municipal de Águas Frias - SC,4200556,7.781973,263.690726,0.000000,0.000000,865.741231,0.000000,7.866041,0.0,...,0.0,992.063208,0.0,49.427989,0.000000,0.000000,0.000000,1079.764823,124.499228,115.254280
5436,Prefeitura Municipal de Águas Lindas de Goiás ...,5200258,12.162638,39.420438,0.000000,0.000000,176.600948,0.000000,1.856807,0.0,...,0.0,0.504886,0.0,3.162795,0.000000,0.000000,0.000000,6.084146,0.649704,42.793018


In [35]:
from sklearn.preprocessing import MinMaxScaler

# Rescale the expense columns (positions 3 to 30) to the [0, 1] range.
scaler = MinMaxScaler()
scaled_values = scaler.fit_transform(df_per_capita.iloc[:, 3:31])
df_per_capita.iloc[:, 3:31] = scaled_values

df_per_capita.head(-15)



Conta,Instituição,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
0,Prefeitura Municipal de Abadia de Goiás - GO,5200050,8.993800,0.114116,0.000000,0.000000,0.095575,0.000000,0.006750,0.0,...,0.0,0.004600,0.0,0.000000,0.003151,0.000000,0.006559,0.081215,0.000000,0.017827
1,Prefeitura Municipal de Abadia dos Dourados - MG,3100104,8.862059,0.113084,0.000000,0.000000,0.066532,0.000000,0.007242,0.0,...,0.0,0.026831,0.0,0.000000,0.000000,0.000000,0.000000,0.056950,0.040748,0.049539
2,Prefeitura Municipal de Abadiânia - GO,5200100,9.821572,0.079221,0.000002,0.000000,0.040119,0.000000,0.000417,0.0,...,0.0,0.013655,0.0,0.000000,0.000000,0.000000,0.000000,0.011543,0.015311,0.038232
3,Prefeitura Municipal de Abaetetuba - PA,1500107,11.931201,0.000000,0.000000,0.000000,0.023391,0.000000,0.000000,0.0,...,0.0,0.008246,0.0,0.000000,0.000000,0.000000,0.000925,0.000554,0.000000,0.010442
4,Prefeitura Municipal de Abaeté - MG,3100203,10.067900,0.046679,0.039835,0.000000,0.037635,0.000000,0.001431,0.0,...,0.0,0.011952,0.0,0.000000,0.000092,0.000929,0.011643,0.015510,0.002633,0.005323
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5433,Prefeitura Municipal de Águas Belas - PE,2600500,10.665017,0.040810,0.004498,0.000000,0.036284,0.000000,0.000000,0.0,...,0.0,0.000087,0.0,0.000000,0.000006,0.036936,0.005416,0.000162,0.000000,0.008983
5434,Prefeitura Municipal de Águas Formosas - MG,3100906,9.873853,0.065129,0.000000,0.009395,0.047142,0.042232,0.007402,0.0,...,0.0,0.018016,0.0,0.000000,0.000035,0.013485,0.013064,0.007703,0.004752,0.031084
5435,Prefeitura Municipal de Águas Frias - SC,4200556,7.781973,0.198556,0.000000,0.000000,0.135575,0.000000,0.008576,0.0,...,0.0,0.573427,0.0,0.079926,0.000000,0.000000,0.000000,0.322016,0.095454,0.036621
5436,Prefeitura Municipal de Águas Lindas de Goiás ...,5200258,12.162638,0.029683,0.000000,0.000000,0.027656,0.000000,0.002024,0.0,...,0.0,0.000292,0.0,0.005114,0.000000,0.000000,0.000000,0.001814,0.000498,0.013597
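
Min-max scaling maps each value to `(x - min) / (max - min)` within its column. The sketch below reproduces one scaled cell by hand. It assumes the column minimum is 0 after the zero fill, which holds for any account that at least one municipality left unreported (here both counts are below 5453). The function name `min_max` is an assumption made for this example.

```python
def min_max(x, col_min, col_max):
    """Rescale x to [0, 1] given the column minimum and maximum."""
    return (x - col_min) / (col_max - col_min)


# Per capita values for Abadia de Goiás - GO and column maxima from describe().
legislativa_pc, legislativa_max = 151.551128, 1328.045223
administracao_pc, administracao_max = 610.310432, 6385.701551

# Assumption: minimum is 0 because missing entries were filled with zero.
legislativa_scaled = min_max(legislativa_pc, 0.0, legislativa_max)
administracao_scaled = min_max(administracao_pc, 0.0, administracao_max)
print(round(legislativa_scaled, 6), round(administracao_scaled, 6))
```

Both results agree with the first row of the scaled table above (0.114116 and 0.095575), so a single very large per capita value in a column pushes every other municipality toward zero.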


# Applying the LOF algorithm

With the data normalized and smoothed, we now apply the LOF (Local Outlier Factor) algorithm.

In [36]:
df_outliers = df_per_capita

from pyod.models.lof import LOF
detector = LOF(n_neighbors=70, metric='euclidean')  # uses Euclidean distance
detector.fit(df_per_capita.iloc[:, 3:31])

previsoes = detector.labels_
scores = detector.decision_scores_

df_scores = pd.DataFrame(scores, columns=['lof'])
df_previsoes = pd.DataFrame(previsoes, columns=['outlier']) 


df_outliers = pd.concat([df_outliers,df_scores], axis=1)
df_outliers = pd.concat([df_outliers,df_previsoes], axis=1)

result = df_outliers.sort_values(by=['outlier','lof'], ascending=False)

result.head(10)

Unnamed: 0,Instituição,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,...,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais,lof,outlier
4488,Prefeitura Municipal de Senador Elói de Souza ...,2413102,8.722254,0.094439,0.0,0.0,0.079086,0.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.005378,0.008925,11.748457,1
2242,Prefeitura Municipal de Itajá - GO,5210802,8.500657,0.0,0.0,0.0,0.072752,0.0,0.000132,0.0,...,0.0,0.0,0.0,0.0,1.0,0.019041,0.007087,0.01156,11.459516,1
4090,Prefeitura Municipal de Rio Novo do Sul - ES,3204401,9.400547,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,7.23558,1
496,Prefeitura Municipal de Barro Preto - BA,2903300,8.758884,0.090362,0.0,0.82995,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02118,7.034837,1
3136,Prefeitura Municipal de Natividade - RJ,3303104,9.614872,0.11928,0.0,0.0,0.112644,1.0,0.0,0.0,...,0.0,0.0,0.004857,0.0,0.0,0.010206,0.002423,0.074236,6.506948,1
2223,Prefeitura Municipal de Itaguaru - GO,5210604,8.607399,0.105039,0.000127,0.786276,0.012837,0.0,0.008125,0.0,...,0.0,0.021775,0.0,0.0,0.0,0.0,0.095031,0.045146,6.372057,1
349,Prefeitura Municipal de Arroio do Sal - RS,4301057,9.096163,0.137149,0.0,0.0,0.076743,0.0,0.0,0.0,...,0.0,0.0,0.006939,0.0,0.0,0.0,0.0,0.0,5.895087,1
37,Prefeitura Municipal de Agrestina - PE,2600302,10.104549,0.059773,0.0,0.0,0.048878,0.0,0.000936,0.0,...,0.0,0.0,0.0,0.0,0.005505,0.001356,0.015792,0.005696,5.559358,1
3184,Prefeitura Municipal de Nova América da Colina...,4116604,8.174421,0.0,0.0,0.0,0.639538,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,5.464418,1
1876,Prefeitura Municipal de Gracho Cardoso - SE,2802601,8.675734,0.099271,0.862003,0.0,0.010123,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.062831,0.008522,5.375132,1


# Conclusion - First experiment

We get the 10 largest outliers, which in this case were expected.

The top one is a city of 6,138 inhabitants that spent R$ 1,169,606.67 on Relações Exteriores, the largest amount recorded for this expense.

The next experiment will treat unreported values differently.


In [41]:
df_pivot[df_pivot['Instituição'].str.contains("Senador Elói de Souza")]

Conta,Instituição,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
4488,Prefeitura Municipal de Senador Elói de Souza ...,2413102,6138,769819.83,,,3099817.26,,,1169606.67,...,,92879.0,,,,,,,43055.3,172405.09
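
The figures in this row can be traced through each preprocessing step to see why the municipality tops the ranking. The sketch below is a hand calculation under the same assumption as before (column minimum 0 after the zero fill); the column maximum is the per capita `max` for 07 - Relações Exteriores in the earlier `describe()` output.

```python
import math

# Values copied from the df_pivot row for Senador Elói de Souza.
populacao = 6138
relacoes_exteriores = 1169606.67  # 07 - Relações Exteriores, in R$
col_max_pc = 190.551755           # per capita max for this account, from describe()

per_capita = relacoes_exteriores / populacao
scaled = per_capita / col_max_pc  # min-max with minimum 0 (assumption)
log_pop = math.log(populacao)
print(round(per_capita, 6), round(scaled, 6), round(log_pop, 6))
```

The municipality holds the column maximum, so its scaled value is 1.0 in an account where almost every other row is 0, which is consistent with the high LOF score reported for it.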


In [42]:
df_pivot.describe()

Conta,Cod.IBGE,População,01 - Legislativa,02 - Judiciária,03 - Essencial à Justiça,04 - Administração,05 - Defesa Nacional,06 - Segurança Pública,07 - Relações Exteriores,08 - Assistência Social,...,19 - Ciência e Tecnologia,20 - Agricultura,21 - Organização Agrária,22 - Indústria,23 - Comércio e Serviços,24 - Comunicações,25 - Energia,26 - Transporte,27 - Desporto e Lazer,28 - Encargos Especiais
count,5453.0,5453.0,4945.0,1280.0,515.0,5442.0,286.0,2464.0,12.0,5438.0,...,201.0,4981.0,39.0,872.0,2178.0,949.0,1735.0,4184.0,5061.0,4401.0
mean,3256557.0,36882.37,3113544.0,1328878.0,1455067.0,12212800.0,68079.62,2381498.0,164504.5,3261607.0,...,1279614.0,797579.8,997974.9,406550.0,843286.4,549868.7,1037772.0,3327784.0,728066.1,5417169.0
std,976534.7,215626.8,17426850.0,7984227.0,4590163.0,41315170.0,177963.3,17624070.0,340026.5,20715180.0,...,8090019.0,1239128.0,4971009.0,1152813.0,5058774.0,5422439.0,3905598.0,79984030.0,3303484.0,84135310.0
min,1100015.0,815.0,6779.06,31.0,140.0,1.0,15.0,13.8,166.8,8400.0,...,92.0,65.0,376.5,82.94,20.0,20.06,36.5,35.71,80.3,37.06
25%,2513703.0,5507.0,744651.5,88310.93,84332.6,2760989.0,7120.96,24660.56,6482.02,800057.9,...,17385.0,183492.9,19702.62,29561.71,22776.56,7250.0,126491.5,248663.3,78959.62,300755.6
50%,3147600.0,11621.0,1133867.0,195976.8,220231.9,4372693.0,29091.44,82429.73,22107.9,1256973.0,...,93216.0,453078.0,82003.76,115149.3,115792.2,24678.45,284495.7,785383.2,217139.1,699216.0
75%,4119152.0,25232.0,2069232.0,568038.0,864509.6,8415655.0,59537.09,490795.0,109950.8,2334404.0,...,481618.7,990653.1,258752.5,353541.8,478364.2,83902.67,700910.4,1757284.0,529086.5,1732038.0
max,5222302.0,12038180.0,831980600.0,194165100.0,43845670.0,1400976000.0,2131648.0,549963600.0,1169607.0,1255894000.0,...,111430100.0,29934000.0,31199880.0,24153850.0,167578600.0,123794500.0,100009100.0,5128618000.0,176215600.0,5079563000.0
