**Sources**

Unemployment, total (% of total labor force): https://data.worldbank.org/indicator/SL.UEM.TOTL.ZS?_gl=1%2A80nihh%2A_gcl_au%2AMTAyMTg2OTQwLjE3MjQwMTYwMDM.&end=2022&locations=AR&start=1960&view=chart


Fixed broadband subscriptions (per 100 people): https://data.worldbank.org/indicator/IT.NET.BBND.P2?_gl=1%2A80nihh%2A_gcl_au%2AMTAyMTg2OTQwLjE3MjQwMTYwMDM.&end=2022&locations=AR&start=1960&view=chart


Fixed telephone subscriptions: https://data.worldbank.org/indicator/IT.MLT.MAIN?_gl=1%2A80nihh%2A_gcl_au%2AMTAyMTg2OTQwLjE3MjQwMTYwMDM.&end=2022&locations=AR&start=1960&view=chart


GDP: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD?_gl=1%2A80nihh%2A_gcl_au%2AMTAyMTg2OTQwLjE3MjQwMTYwMDM.&end=2022&locations=AR&start=1960&view=chart


Individuals using the Internet (% of population): https://data.worldbank.org/indicator/IT.NET.USER.ZS?_gl=1%2A80nihh%2A_gcl_au%2AMTAyMTg2OTQwLjE3MjQwMTYwMDM.&end=2022&locations=AR&start=1960&view=chart

In [59]:
import pandas as pd

In [60]:
def trata_arquivo(nome):
    # Read the World Bank spreadsheet (the 'Data' sheet has 3 leading metadata rows)
    df = pd.read_excel(f'raw_data/{nome}.xls', sheet_name='Data', skiprows=3)

    # Keep only the country name and the 2012-2022 year columns
    years = [str(year) for year in range(2012, 2023)]
    df_filter = df[['Country Name'] + years]

    # Use melt to turn the year columns into rows (wide to long format)
    df_melted = df_filter.melt(id_vars=['Country Name'],
                               var_name='Year',
                               value_name=nome)

    return df_melted
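
To make the reshaping step concrete, here is a minimal sketch of what `melt` does to a wide table. The DataFrame below is a toy stand-in with made-up values, not the World Bank data, and the names `wide` and `long_df` are only illustrative.

```python
import pandas as pd

# Toy wide-format table; the numbers are invented for illustration only.
wide = pd.DataFrame({
    'Country Name': ['Argentina', 'Brazil'],
    '2021': [1.0, 2.0],
    '2022': [3.0, 4.0],
})

# Each year column becomes a (Country Name, Year, value) row.
long_df = wide.melt(id_vars=['Country Name'],
                    var_name='Year',
                    value_name='GDP')

print(long_df)
```

Two countries and two year columns yield four rows, one per country-year pair, which is the shape the later merge on `['Country Name', 'Year']` relies on.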


In [61]:
# Files to be processed (one World Bank indicator per file)
files = ['Fixed broadband subscriptions', 'Fixed telephone subscriptions', 'GDP', 'Individuals using the Internet', 'Unemployment']

In [62]:
# Iterate over the files, process each one, and merge the results

df_final = pd.DataFrame()

for file in files:
    df = trata_arquivo(file)

    if df_final.empty:
        # df_final is still empty: initialize it with the first processed DataFrame
        df_final = df
    else:
        # Merge the current DataFrame into the final one on country and year
        df_final = pd.merge(df_final, df, on=['Country Name', 'Year'], how='outer')
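
As an aside, the same accumulate-and-merge pattern can be written with `functools.reduce`. This is only a sketch on toy frames with made-up values (`a`, `b`, and `merged` are illustrative names), not a replacement for the loop above. It also shows what `how='outer'` does when a key is missing from one side.

```python
from functools import reduce

import pandas as pd

# Toy long-format frames; values are invented for illustration only.
a = pd.DataFrame({'Country Name': ['Argentina'],
                  'Year': ['2022'],
                  'GDP': [1.0]})
b = pd.DataFrame({'Country Name': ['Argentina', 'Brazil'],
                  'Year': ['2022', '2022'],
                  'Unemployment': [6.8, 9.2]})

# Fold the list of frames into one, merging pairwise on the shared keys.
merged = reduce(
    lambda left, right: pd.merge(left, right,
                                 on=['Country Name', 'Year'],
                                 how='outer'),
    [a, b],
)

print(merged)
```

Because the merge is an outer join, the Brazil row is kept even though `a` has no GDP value for it; that cell is filled with `NaN`.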

In [63]:
# Convert the percentage columns to fractions
# (fixed broadband is reported per 100 people, so it is scaled the same way)
columns = ['Individuals using the Internet', 'Unemployment', 'Fixed broadband subscriptions']

for column in columns:
    print('Processing column', column)
    df_final[column] = df_final[column] / 100

Processing column Individuals using the Internet
Processing column Unemployment
Processing column Fixed broadband subscriptions


In [64]:
# Write the final DataFrame to a CSV file
print('Saving the final file...')
df_final.to_csv('data/mercado_livre.csv', sep=',', index=False)

Saving the final file...
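
A quick way to sanity-check the export is to read it back. The sketch below uses an in-memory buffer and a one-row toy frame with made-up values instead of the real `data/mercado_livre.csv`, so it can run on its own; `sample`, `buffer`, and `restored` are illustrative names.

```python
import io

import pandas as pd

# One-row toy frame standing in for df_final; values are invented.
sample = pd.DataFrame({'Country Name': ['Argentina'],
                       'Year': ['2022'],
                       'GDP': [1.0]})

# Write to an in-memory buffer with the same options used for the CSV file.
buffer = io.StringIO()
sample.to_csv(buffer, sep=',', index=False)
buffer.seek(0)

# Read the CSV text back into a DataFrame.
restored = pd.read_csv(buffer)

print(restored.dtypes)
```

Note that `Year` is stored as text after `melt`, but `read_csv` infers it as an integer column when reading the file back, unless a `dtype` is passed explicitly.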
