# Pandas
Pandas is an open-source Python library for data analysis that provides high performance, easy-to-use data structures, and data analysis tools. To use the library, import it as shown below:

```python
import numpy as np
import pandas as pd
```

## Getting the Data

| Format | Description                     | Read Function     | Write Function  |
|:-------|:--------------------------------|:------------------|:----------------|
| Text   | CSV                             | read_csv()        | to_csv()        |
| Text   | JSON                            | read_json()       | to_json()       |
| Text   | HTML                            | read_html()       | to_html()       |
| Text   | Clipboard                       | read_clipboard()  | to_clipboard()  |
| Binary | MS Excel                        | read_excel()      | to_excel()      |
| Binary | HDF5 Format                     | read_hdf()        | to_hdf()        |
| Binary | Feather Format                  | read_feather()    | to_feather()    |
| Binary | Parquet Format                  | read_parquet()    | to_parquet()    |
| Binary | Msgpack (removed in pandas 1.0) | read_msgpack()    | to_msgpack()    |
| Binary | Stata                           | read_stata()      | to_stata()      |
| Binary | SAS                             | read_sas()        | -               |
| Binary | Python Pickle Format            | read_pickle()     | to_pickle()     |
| SQL    | SQL                             | read_sql()        | to_sql()        |
| SQL    | Google BigQuery                 | read_gbq()        | to_gbq()        |

### Examples

```python
# Example 1: Reading and writing CSV

# Reads 5 rows from a CSV file that has no header row.
pd.read_csv("arquivo.csv", header=None, nrows=5)

# Exports the data from a pandas DataFrame
# (assumed to exist already) to a CSV file.
dados_data_frame.to_csv("dados_data_frame.csv")


# Example 2: Reading and writing Excel
pd.read_excel("planilha.xlsx")

dados_data_frame.to_excel("dados_data_frame.xlsx", sheet_name="Planilha 1")

# Opens the workbook once as an ExcelFile object, which can then be
# reused to read several worksheets from the same file.
planilha = pd.ExcelFile("arquivo.xls")
dados_data_frame = pd.read_excel(planilha, "Planilha 1")
```
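
The read and write functions are designed to mirror each other. As a minimal sketch of a CSV round trip, the block below writes a small DataFrame to an in-memory buffer and reads it back; the sample data and the use of `io.StringIO` in place of a file on disk are assumptions made only to keep the example self-contained.

```python
import io

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Ana", "Bruno"], "age": [31, 27]})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves the row labels out

# Rewind the buffer and read the same data back.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored)
```

Passing `index=False` avoids an extra unnamed column appearing when the file is read again.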


## Data Structures

### Series
A one-dimensional labeled array capable of holding any data type.

In [6]:
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
s

a   -0.462717
b   -1.244359
c   -0.483060
d    0.740545
e   -0.449470
dtype: float64
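
Because the example above uses random numbers, its output changes on every run. The sketch below uses fixed values (an assumption for reproducibility) to show how the labels of a Series can be used for access and how they are preserved by arithmetic.

```python
import pandas as pd

# Fixed values instead of random ones, so the results are reproducible.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s["b"]        # access by index label
by_position = s.iloc[0]  # access by integer position
doubled = s * 2          # element-wise arithmetic keeps the labels

print(by_label, by_position)
print(doubled)
```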

### Data Frame
A two-dimensional labeled table whose columns can hold different data types.

In [8]:
data = {
    'Country': ['Belgium',  'India',  'Brazil'],
    'Capital': ['Brussels',  'New Delhi',  'Brasilia'],
    'Population': [11190846, 1303171035, 207847528]
}

df = pd.DataFrame(data, columns=['Country',  'Capital',  'Population'])
df

|   | Country | Capital   | Population |
|:--|:--------|:----------|:-----------|
| 0 | Belgium | Brussels  | 11190846   |
| 1 | India   | New Delhi | 1303171035 |
| 2 | Brazil  | Brasilia  | 207847528  |
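
To see that each column keeps its own data type, the `dtypes` attribute can be inspected. This is a minimal sketch that rebuilds the same example data so it can run on its own.

```python
import pandas as pd

data = {
    "Country": ["Belgium", "India", "Brazil"],
    "Capital": ["Brussels", "New Delhi", "Brasilia"],
    "Population": [11190846, 1303171035, 207847528],
}
df = pd.DataFrame(data)

# One dtype per column: text columns and an integer column side by side.
print(df.dtypes)

population_dtype = str(df["Population"].dtype)
```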


### Indexing / Selection

| Operation                            | Syntax             | Result    |
|:-------------------------------------|:-------------------|:----------|
| Select a column                      | df[col]            | Series    |
| Select a row by label                | df.loc[rotulo]     | Series    |
| Select a row by integer position     | df.iloc[posicao]   | Series    |
| Slice rows                           | df[5:10]           | DataFrame |
| Select rows by boolean vector        | df[vetor_booleano] | DataFrame |


In [17]:
import numpy as np
import pandas as pd

data = {
    'Country': ['Belgium',  'India',  'Brazil'],
    'Capital': ['Brussels',  'New Delhi',  'Brasilia'],
    'Population': [11190846, 1303171035, 207847528]
}

df = pd.DataFrame(data, columns=['Country',  'Capital',  'Population'])
print(df)
print("\n")
# Selection by position
print(df.iloc[[0], [0]])
print("\n")

# Selection by label
print(df.loc[[0], ['Country']])

# Label / position: with the default integer index,
# the row labels coincide with the row positions
print("\n")
print(df.loc[2])
print("\n")
print(df.loc[:, 'Capital'])
print("\n")
print(df.loc[2, 'Capital'])

   Country    Capital  Population
0  Belgium   Brussels    11190846
1    India  New Delhi  1303171035
2   Brazil   Brasilia   207847528


   Country
0  Belgium


   Country
0  Belgium


Country          Brazil
Capital        Brasilia
Population    207847528
Name: 2, dtype: object


0     Brussels
1    New Delhi
2     Brasilia
Name: Capital, dtype: object


Brasilia
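
The last two rows of the selection table, row slicing and boolean selection, are not exercised by the cell above. A minimal sketch using the same example data; the population threshold is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Belgium", "India", "Brazil"],
    "Capital": ["Brussels", "New Delhi", "Brasilia"],
    "Population": [11190846, 1303171035, 207847528],
})

# Row slicing by position: rows 0 and 1 (the end of the slice is excluded).
first_two = df[0:2]

# Boolean vector: one True/False value per row.
mask = df["Population"] > 100_000_000
large = df[mask]

print(first_two)
print(large)
```

Both operations return a DataFrame, even when only one row matches.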


### Missing Data

pandas uses `np.nan` as its default marker for missing values in a dataset. To detect missing data easily, pandas provides the functions `isna()` and `notna()`, which are also available as methods on Series and DataFrame objects.

In [27]:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 3), index=['A', 'C', 'E', 'F', 'H'], 
                  columns=['one', 'two', 'three'])

df['four'] = 'bar' # Creates a new column (four) filled with the constant value 'bar'
df['five'] = df['one'] > 0 # Creates a new column (five) that is True where column 'one' is greater than zero
df

# Reindex the DataFrame to simulate missing data:
# this adds the missing rows B, D and G and fills
# them with missing values (NaN)
df2 = df.reindex(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])

# Evaluate the DataFrame and mark True wherever data is missing
df2
pd.isna(df2)
df2.isna()

# The test can also be applied to specific columns
df2['one']
pd.isna(df2['one'])
df2['four'].notna()

A     True
B    False
C     True
D    False
E     True
F     True
G    False
H     True
Name: four, dtype: bool
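
Since `isna()` and `notna()` return boolean objects, they combine naturally with aggregation and selection. The sketch below, on a small hypothetical frame, counts the missing values per column and keeps only the rows without any gap.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame(
    {"one": [1.0, np.nan, 3.0], "four": ["bar", None, "bar"]},
    index=["A", "B", "C"],
)

# True counts as 1, so summing the mask counts missing values per column.
missing_per_column = df.isna().sum()

# Keep only the rows in which every column has a value.
complete_rows = df[df.notna().all(axis=1)]

print(missing_per_column)
print(complete_rows)
```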

### Sorting and Ranking

| Function                        | Description                     |
|:--------------------------------|:--------------------------------|
| df.sort_index()                 | Sorts by index                  |
| df.sort_values(by='NomeColuna') | Sorts by the values of a column |
| df.rank()                       | Computes a rank for each entry  |
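
A minimal sketch of the three functions, assuming a small frame indexed by country name (the data is reused from the earlier examples):

```python
import pandas as pd

df = pd.DataFrame(
    {"Population": [11190846, 1303171035, 207847528]},
    index=["Belgium", "India", "Brazil"],
)

by_index = df.sort_index()                                   # alphabetical by label
by_value = df.sort_values(by="Population", ascending=False)  # largest first
ranks = df["Population"].rank()                              # 1.0 = smallest value

print(by_index)
print(by_value)
print(ranks)
```

All three return new objects and leave `df` unchanged.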


### Basic Information

| Attribute / Method | Description                            |
|:-------------------|:---------------------------------------|
| df.shape           | Returns (rows, columns)                |
| df.index           | Describes the index                    |
| df.columns         | Describes the columns of the DataFrame |
| df.info()          | Prints a summary of the DataFrame      |
| df.count()         | Returns the count of non-null values   |
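
Note that `shape`, `index` and `columns` are attributes and are read without parentheses, while `info()` and `count()` are methods. A short sketch on hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column "a".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

shape = df.shape              # attribute: (rows, columns)
columns = list(df.columns)    # attribute: column labels
n_rows = len(df.index)        # attribute: row labels
non_null = df.count()         # method: non-null values per column

print(shape, columns, n_rows)
print(non_null)
df.info()                     # method: prints a summary to standard output
```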


### Summary Statistics

| Function       | Description                               |
|:---------------|:------------------------------------------|
| df.sum()       | Sum of the values                         |
| df.cumsum()    | Cumulative sum of the values              |
| df.min()       | Gets the minimum value                    |
| df.max()       | Gets the maximum value                    |
| df.idxmin()    | Gets the index label of the minimum value |
| df.idxmax()    | Gets the index label of the maximum value |
| df.describe()  | Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values. |
| df.mean()      | Computes the mean                         |
| df.median()    | Computes the median                       |
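
The difference between `min()` and `idxmin()` is easy to miss: the first returns a value, the second returns the label where that value occurs. A minimal sketch on a hypothetical one-column frame:

```python
import pandas as pd

df = pd.DataFrame({"x": [3, 1, 2]}, index=["a", "b", "c"])

total = df["x"].sum()
running = df["x"].cumsum().tolist()
smallest = df["x"].min()
label_of_smallest = df["x"].idxmin()
label_of_largest = df["x"].idxmax()
mean = df["x"].mean()
median = df["x"].median()

print(total, running, smallest, label_of_smallest, label_of_largest)
print(df.describe())
```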



## Learning Pandas in Practice

The [Stack Overflow Annual Developer Survey](https://insights.stackoverflow.com/survey) is the largest and most comprehensive survey of developers around the world. Using the dataset from the 2018 survey, we will carry out some routine data science tasks with the pandas library. Note that the survey results were anonymized and made publicly available for download under the [Open Database License (ODbL)](https://opendatacommons.org/licenses/odbl/1.0/).

In [2]:
# Dataset: Stack Overflow Annual Developer Survey - 2018
import numpy as np
import pandas as pd

# 1. Load the dataset for analysis, available in the data folder, using the pandas read functions.
df = pd.read_csv(r"C:\Users\Bruna\Desktop\Data Science\developer_survey_2018\survey_results_public.csv")
print(df)


       Respondent Hobby OpenSource                       Country  \
0               1   Yes         No                         Kenya   
1               3   Yes        Yes                United Kingdom   
2               4   Yes        Yes                 United States   
3               5    No         No                 United States   
4               7   Yes         No                  South Africa   
5               8   Yes         No                United Kingdom   
6               9   Yes        Yes                 United States   
7              10   Yes        Yes                       Nigeria   
8              11   Yes        Yes                 United States   
9              16    No        Yes                         India   
10             17   Yes         No                         Spain   
11             18   Yes        Yes                         India   
12             19   Yes         No                       Croatia   
13             20    No         No              

In [55]:
# 2. Get information about the DataFrame loaded in memory
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 98855 entries, 0 to 98854
Columns: 129 entries, Respondent to SurveyEasy
dtypes: float64(41), int64(1), object(87)
memory usage: 97.3+ MB
None


In [64]:
# 3. Select all rows whose country is Brazil
a = df.loc[df['Country'] == 'Brazil']
a

Unnamed: 0,Respondent,Hobby,OpenSource,Country,Student,Employment,FormalEducation,UndergradMajor,CompanySize,DevType,...,Exercise,Gender,SexualOrientation,EducationParents,RaceEthnicity,Age,Dependents,MilitaryUS,SurveyTooLong,SurveyEasy
97,142,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",100 to 499 employees,Embedded applications or devices developer,...,I don't typically exercise,Male,Straight or heterosexual,Some college/university study without earning ...,Black or of African descent;Hispanic or Latino...,35 - 44 years old,No,,The survey was too long,Somewhat easy
101,149,Yes,No,Brazil,"Yes, part-time",Employed full-time,"Master’s degree (MA, MS, M.Eng., MBA, etc.)","Another engineering discipline (ex. civil, ele...",500 to 999 employees,,...,1 - 2 times per week,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Hispanic or Latino/Latina,25 - 34 years old,No,,The survey was an appropriate length,Very easy
177,270,Yes,No,Brazil,"Yes, part-time",Employed full-time,"Secondary school (e.g. American high school, G...",,100 to 499 employees,Back-end developer;Front-end developer;Full-st...,...,I don't typically exercise,Male,Straight or heterosexual,They never completed any formal education,White or of European descent,18 - 24 years old,No,,The survey was too long,Somewhat easy
205,308,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...","1,000 to 4,999 employees",Back-end developer;Desktop or enterprise appli...,...,3 - 4 times per week,,,,,,,,,
243,363,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",20 to 99 employees,Back-end developer;Front-end developer;Full-st...,...,1 - 2 times per week,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Hispanic or Latino/Latina;White or of European...,25 - 34 years old,No,,The survey was too long,Somewhat easy
310,448,Yes,Yes,Brazil,"Yes, part-time",Employed full-time,Some college/university study without earning ...,"Computer science, computer engineering, or sof...",10 to 19 employees,Back-end developer;Front-end developer;Full-st...,...,I don't typically exercise,Male,Straight or heterosexual,They never completed any formal education,,18 - 24 years old,No,,The survey was an appropriate length,Somewhat easy
322,466,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...","10,000 or more employees",Back-end developer;Database administrator;Fron...,...,1 - 2 times per week,Male,Straight or heterosexual,"Secondary school (e.g. American high school, G...",White or of European descent,25 - 34 years old,No,,The survey was an appropriate length,Somewhat easy
344,497,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Another engineering discipline (ex. civil, ele...",20 to 99 employees,Back-end developer;Database administrator;Desi...,...,,,,,,,,,,
409,586,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",100 to 499 employees,DevOps specialist;Embedded applications or dev...,...,3 - 4 times per week,Male,Straight or heterosexual,Some college/university study without earning ...,White or of European descent,35 - 44 years old,No,,The survey was too long,Somewhat easy
432,616,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Fewer than 10 employees,Back-end developer;Educator or academic resear...,...,I don't typically exercise,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",White or of European descent,45 - 54 years old,Yes,,The survey was an appropriate length,Somewhat easy


In [71]:
# 4. Does the data for Brazil contain missing values? If so, select all rows
# that contain at least one missing value

dados_nulos = a[a.isnull().any(axis=1)]

dados_nulos

Unnamed: 0,Respondent,Hobby,OpenSource,Country,Student,Employment,FormalEducation,UndergradMajor,CompanySize,DevType,...,Exercise,Gender,SexualOrientation,EducationParents,RaceEthnicity,Age,Dependents,MilitaryUS,SurveyTooLong,SurveyEasy
97,142,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",100 to 499 employees,Embedded applications or devices developer,...,I don't typically exercise,Male,Straight or heterosexual,Some college/university study without earning ...,Black or of African descent;Hispanic or Latino...,35 - 44 years old,No,,The survey was too long,Somewhat easy
101,149,Yes,No,Brazil,"Yes, part-time",Employed full-time,"Master’s degree (MA, MS, M.Eng., MBA, etc.)","Another engineering discipline (ex. civil, ele...",500 to 999 employees,,...,1 - 2 times per week,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Hispanic or Latino/Latina,25 - 34 years old,No,,The survey was an appropriate length,Very easy
177,270,Yes,No,Brazil,"Yes, part-time",Employed full-time,"Secondary school (e.g. American high school, G...",,100 to 499 employees,Back-end developer;Front-end developer;Full-st...,...,I don't typically exercise,Male,Straight or heterosexual,They never completed any formal education,White or of European descent,18 - 24 years old,No,,The survey was too long,Somewhat easy
205,308,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...","1,000 to 4,999 employees",Back-end developer;Desktop or enterprise appli...,...,3 - 4 times per week,,,,,,,,,
243,363,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",20 to 99 employees,Back-end developer;Front-end developer;Full-st...,...,1 - 2 times per week,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Hispanic or Latino/Latina;White or of European...,25 - 34 years old,No,,The survey was too long,Somewhat easy
310,448,Yes,Yes,Brazil,"Yes, part-time",Employed full-time,Some college/university study without earning ...,"Computer science, computer engineering, or sof...",10 to 19 employees,Back-end developer;Front-end developer;Full-st...,...,I don't typically exercise,Male,Straight or heterosexual,They never completed any formal education,,18 - 24 years old,No,,The survey was an appropriate length,Somewhat easy
322,466,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...","10,000 or more employees",Back-end developer;Database administrator;Fron...,...,1 - 2 times per week,Male,Straight or heterosexual,"Secondary school (e.g. American high school, G...",White or of European descent,25 - 34 years old,No,,The survey was an appropriate length,Somewhat easy
344,497,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Another engineering discipline (ex. civil, ele...",20 to 99 employees,Back-end developer;Database administrator;Desi...,...,,,,,,,,,,
409,586,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",100 to 499 employees,DevOps specialist;Embedded applications or dev...,...,3 - 4 times per week,Male,Straight or heterosexual,Some college/university study without earning ...,White or of European descent,35 - 44 years old,No,,The survey was too long,Somewhat easy
432,616,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Fewer than 10 employees,Back-end developer;Educator or academic resear...,...,I don't typically exercise,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",White or of European descent,45 - 54 years old,Yes,,The survey was an appropriate length,Somewhat easy
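
The expression `a[a.isnull().any(axis=1)]` works in three steps, which the sketch below spells out on a hypothetical miniature of the survey data (the column values are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the survey data.
sample = pd.DataFrame({
    "Respondent": [1, 2, 3],
    "Age": ["25 - 34 years old", np.nan, "18 - 24 years old"],
    "Exercise": ["Daily", "Never", np.nan],
})

is_missing = sample.isnull()          # same shape, True where a value is missing
row_has_gap = is_missing.any(axis=1)  # one boolean per row: any True in that row?
rows_with_gaps = sample[row_has_gap]  # keep only the flagged rows

print(rows_with_gaps)
```

With `axis=1` the reduction runs across the columns of each row; the default `axis=0` would instead produce one boolean per column.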


In [72]:
# 5. What is the total number of missing values in each column?
a.isnull().sum()

Respondent                        0
Hobby                             0
OpenSource                        0
Country                           0
Student                          82
Employment                       50
FormalEducation                  78
UndergradMajor                  583
CompanySize                     698
DevType                         171
YearsCoding                     144
YearsCodingProf                 546
JobSatisfaction                 725
CareerSatisfaction              576
HopeFiveYears                   590
JobSearchStatus                 511
LastNewJob                      525
AssessJob1                      794
AssessJob2                      794
AssessJob3                      794
AssessJob4                      794
AssessJob5                      794
AssessJob6                      794
AssessJob7                      794
AssessJob8                      794
AssessJob9                      794
AssessJob10                     794
AssessBenefits1             

In [83]:
# 6. Remove rows with at least one missing value in the columns: 'FormalEducation', 'UndergradMajor', 'Age'.
a.dropna(subset=['FormalEducation','UndergradMajor','Age'])

Unnamed: 0,Respondent,Hobby,OpenSource,Country,Student,Employment,FormalEducation,UndergradMajor,CompanySize,DevType,...,Exercise,Gender,SexualOrientation,EducationParents,RaceEthnicity,Age,Dependents,MilitaryUS,SurveyTooLong,SurveyEasy
97,142,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",100 to 499 employees,Embedded applications or devices developer,...,I don't typically exercise,Male,Straight or heterosexual,Some college/university study without earning ...,Black or of African descent;Hispanic or Latino...,35 - 44 years old,No,,The survey was too long,Somewhat easy
101,149,Yes,No,Brazil,"Yes, part-time",Employed full-time,"Master’s degree (MA, MS, M.Eng., MBA, etc.)","Another engineering discipline (ex. civil, ele...",500 to 999 employees,,...,1 - 2 times per week,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Hispanic or Latino/Latina,25 - 34 years old,No,,The survey was an appropriate length,Very easy
243,363,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",20 to 99 employees,Back-end developer;Front-end developer;Full-st...,...,1 - 2 times per week,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",Hispanic or Latino/Latina;White or of European...,25 - 34 years old,No,,The survey was too long,Somewhat easy
310,448,Yes,Yes,Brazil,"Yes, part-time",Employed full-time,Some college/university study without earning ...,"Computer science, computer engineering, or sof...",10 to 19 employees,Back-end developer;Front-end developer;Full-st...,...,I don't typically exercise,Male,Straight or heterosexual,They never completed any formal education,,18 - 24 years old,No,,The survey was an appropriate length,Somewhat easy
322,466,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...","10,000 or more employees",Back-end developer;Database administrator;Fron...,...,1 - 2 times per week,Male,Straight or heterosexual,"Secondary school (e.g. American high school, G...",White or of European descent,25 - 34 years old,No,,The survey was an appropriate length,Somewhat easy
409,586,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",100 to 499 employees,DevOps specialist;Embedded applications or dev...,...,3 - 4 times per week,Male,Straight or heterosexual,Some college/university study without earning ...,White or of European descent,35 - 44 years old,No,,The survey was too long,Somewhat easy
432,616,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Computer science, computer engineering, or sof...",Fewer than 10 employees,Back-end developer;Educator or academic resear...,...,I don't typically exercise,Male,Straight or heterosexual,"Bachelor’s degree (BA, BS, B.Eng., etc.)",White or of European descent,45 - 54 years old,Yes,,The survey was an appropriate length,Somewhat easy
468,666,Yes,No,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Information systems, information technology, o...",Fewer than 10 employees,Front-end developer,...,3 - 4 times per week,Male,Straight or heterosexual,Primary/elementary school,White or of European descent,25 - 34 years old,No,,The survey was an appropriate length,Very easy
543,762,Yes,No,Brazil,No,Employed full-time,"Master’s degree (MA, MS, M.Eng., MBA, etc.)","Computer science, computer engineering, or sof...","1,000 to 4,999 employees",Back-end developer;Desktop or enterprise appli...,...,1 - 2 times per week,Male,Straight or heterosexual,"Secondary school (e.g. American high school, G...",Hispanic or Latino/Latina;White or of European...,25 - 34 years old,No,,The survey was an appropriate length,Somewhat easy
641,902,Yes,Yes,Brazil,No,Employed full-time,"Bachelor’s degree (BA, BS, B.Eng., etc.)","Another engineering discipline (ex. civil, ele...",10 to 19 employees,Engineering manager,...,1 - 2 times per week,Male,Straight or heterosexual,"Other doctoral degree (Ph.D, Ed.D., etc.)",Hispanic or Latino/Latina,25 - 34 years old,No,,The survey was too long,Neither easy nor difficult
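
Two details of `dropna(subset=...)` are worth noting: missing values in columns outside `subset` do not cause a row to be dropped, and the call returns a new DataFrame rather than modifying the original. A minimal sketch on hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the shortened values are for illustration only.
sample = pd.DataFrame({
    "FormalEducation": ["Bachelor", np.nan, "Master"],
    "UndergradMajor": ["CS", "CS", np.nan],
    "Age": ["25 - 34", "18 - 24", "35 - 44"],
    "Gender": [np.nan, "Male", "Male"],
})

cleaned = sample.dropna(subset=["FormalEducation", "UndergradMajor", "Age"])

print(len(sample), len(cleaned))
print(cleaned)
```

Row 0 survives even though `Gender` is missing, because `Gender` is not listed in `subset`. To keep the result, assign it to a variable as done here.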


In [87]:
# 7. Create a subset of the data with the columns:
# Respondent, Country, Student, Employment, FormalEducation, UndergradMajor, Exercise, Age
b = a.filter(['Respondent','Country','Student','Employment', 'FormalEducation', 'UndergradMajor', 'Exercise', 'Age'], axis=1)
b

|     | Respondent | Country | Student | Employment | FormalEducation | UndergradMajor | Exercise | Age |
|:----|:-----------|:--------|:--------|:-----------|:----------------|:---------------|:---------|:----|
| 97  | 142 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Computer science, computer engineering, or sof... | I don't typically exercise | 35 - 44 years old |
| 101 | 149 | Brazil | Yes, part-time | Employed full-time | Master’s degree (MA, MS, M.Eng., MBA, etc.) | Another engineering discipline (ex. civil, ele... | 1 - 2 times per week | 25 - 34 years old |
| 177 | 270 | Brazil | Yes, part-time | Employed full-time | Secondary school (e.g. American high school, G... | NaN | I don't typically exercise | 18 - 24 years old |
| 205 | 308 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Computer science, computer engineering, or sof... | 3 - 4 times per week | NaN |
| 243 | 363 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Computer science, computer engineering, or sof... | 1 - 2 times per week | 25 - 34 years old |
| 310 | 448 | Brazil | Yes, part-time | Employed full-time | Some college/university study without earning ... | Computer science, computer engineering, or sof... | I don't typically exercise | 18 - 24 years old |
| 322 | 466 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Computer science, computer engineering, or sof... | 1 - 2 times per week | 25 - 34 years old |
| 344 | 497 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Another engineering discipline (ex. civil, ele... | NaN | NaN |
| 409 | 586 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Computer science, computer engineering, or sof... | 3 - 4 times per week | 35 - 44 years old |
| 432 | 616 | Brazil | No | Employed full-time | Bachelor’s degree (BA, BS, B.Eng., etc.) | Computer science, computer engineering, or sof... | I don't typically exercise | 45 - 54 years old |
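
Selecting columns with a list in square brackets is a common alternative to `filter`. The sketch below, on hypothetical data, compares the two; `NotAColumn` is a deliberately nonexistent label used to show how they differ.

```python
import pandas as pd

# Hypothetical miniature of the survey data.
sample = pd.DataFrame({
    "Respondent": [142, 149],
    "Country": ["Brazil", "Brazil"],
    "Hobby": ["Yes", "Yes"],
})

with_filter = sample.filter(["Respondent", "Country"], axis=1)
with_brackets = sample[["Respondent", "Country"]]
same_result = with_filter.equals(with_brackets)

# filter silently skips labels that do not exist...
lenient = sample.filter(["Respondent", "NotAColumn"], axis=1)

# ...whereas bracket selection raises KeyError for them.
try:
    sample[["Respondent", "NotAColumn"]]
    strict_failed = False
except KeyError:
    strict_failed = True

print(same_result, list(lenient.columns), strict_failed)
```

Bracket selection is the safer choice when a misspelled column name should be reported rather than ignored.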


In [106]:
# 8. Display the frequency counts of the unique values in the columns: Student,
# Exercise, Employment, FormalEducation, UndergradMajor and Age
# (only FormalEducation is shown below)

students_freq = a['FormalEducation'].value_counts()
print(students_freq)
students_freq.describe()

Bachelor’s degree (BA, BS, B.Eng., etc.)                                              1140
Some college/university study without earning a degree                                 444
Secondary school (e.g. American high school, German Realschule or Gymnasium, etc.)     341
Master’s degree (MA, MS, M.Eng., MBA, etc.)                                            257
Associate degree                                                                        97
Professional degree (JD, MD, etc.)                                                      70
Primary/elementary school                                                               39
Other doctoral degree (Ph.D, Ed.D., etc.)                                               26
I never completed any formal education                                                  13
Name: FormalEducation, dtype: int64


count       9.000000
mean      269.666667
std       360.773475
min        13.000000
25%        39.000000
50%        97.000000
75%       341.000000
max      1140.000000
Name: FormalEducation, dtype: float64
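
The cell above counts a single column. To cover several of the requested columns, `value_counts()` can be called in a loop. This is a sketch on hypothetical data, not the survey itself; it also shows the `normalize` and `dropna` parameters.

```python
import pandas as pd

# Hypothetical miniature of the survey data, with one missing answer.
sample = pd.DataFrame({
    "Student": ["No", "No", "Yes, part-time", None],
    "Employment": ["Employed full-time"] * 3 + ["Employed part-time"],
})

# One frequency table per column.
for column in ["Student", "Employment"]:
    print(sample[column].value_counts())

student_counts = sample["Student"].value_counts()                # missing values excluded
student_share = sample["Student"].value_counts(normalize=True)   # proportions, not counts
with_missing = sample["Student"].value_counts(dropna=False)      # missing values counted too
```

By default missing values are left out of the counts, so the totals may be smaller than the number of rows.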