# Pandas I

## Contents
- [Library](#library)
- [DataFrame](#dataframe)
  - [Shape](#shape)
  - [Head](#head)
  - [Tail](#tail)
  - [Column description](#column-description)
- [Filtering](#filtering)
  - [Rows](#rows)
  - [Columns](#columns)
  - [Combined](#combined)

## Library

#### Installation
```
python -m pip install pandas
```

#### Import

In [1]:
import numpy as np
import pandas as pd

## DataFrame

In [2]:
dados = {
    'nome': [
        'Alany', 'Antonio', 'Danilo', 'Davi', 
        'David', 'Francisca', 'Francisco', 
        'Wesley', 'Janete', 'João', 'Jorge', 
        'José', 'Joseph', 'KaioM', 'KaioV', 
        'Leandro', 'Mariana', 'Mateus', 
        'Renata', 'Vicente', 'Viviane'
    ],
    'n1': [
        7.71, 10.00, 10.00, 0.00, 8.07, 0.00, 
        0.00, 7.50, 9.00, 7.29, 7.71, 10.00, 
        10.00, 7.14, 6.29, 3.86, 7.21, 10.00, 
        3.00, 9.14, 0.00
    ],
    'n2': [
        0.00, 10.00, 9.91, 0.00, 7.67, 0.00, 
        0.00, 8.85, 0.67, 6.82, 7.15, 6.85, 
        9.85, 0.00, 0.00, 0.00, 0.00, 10.00, 
        0.00, 9.94, 0.00
    ]
}

In [3]:
df = pd.DataFrame(data=dados)
df

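|  | nome | n1 | n2 |
|---|---|---|---|
| 0 | Alany | 7.71 | 0.0 |
| 1 | Antonio | 10.0 | 10.0 |
| 2 | Danilo | 10.0 | 9.91 |
| 3 | Davi | 0.0 | 0.0 |
| 4 | David | 8.07 | 7.67 |
| 5 | Francisca | 0.0 | 0.0 |
| 6 | Francisco | 0.0 | 0.0 |
| 7 | Wesley | 7.5 | 8.85 |
| 8 | Janete | 9.0 | 0.67 |
| 9 | João | 7.29 | 6.82 |
| ... | ... | ... | ... |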
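
To make the dict-to-DataFrame mapping concrete, here is a minimal sketch using a three-row subset of the data above. The names `sample` and `df_sample` are illustrative and not part of the original notebook. Each dictionary key becomes a column, and the rows receive a default integer index.

```python
import pandas as pd

# Illustrative three-row subset of the notebook's data (hypothetical names)
sample = {
    'nome': ['Alany', 'Antonio', 'Danilo'],
    'n1': [7.71, 10.00, 10.00],
    'n2': [0.00, 10.00, 9.91],
}

df_sample = pd.DataFrame(data=sample)

# Dictionary keys become column labels, in insertion order
columns = list(df_sample.columns)

# Without an explicit index, rows are labeled 0..n-1
index = list(df_sample.index)

# Lists of Python floats are stored as float64 columns
dtypes = df_sample.dtypes

print(columns)
print(index)
print(dtypes)
```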


### Shape
Shape of the DataFrame as a tuple `(rows, columns)`

In [4]:
df.shape

(21, 3)
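
Since `shape` is a plain tuple, it can be unpacked directly. The sketch below uses a small illustrative DataFrame (`df_sample` is a hypothetical name, not from the notebook) and shows that `len()` returns the row count.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical subset of the notebook's data)
df_sample = pd.DataFrame({
    'nome': ['Alany', 'Antonio', 'Danilo', 'Davi'],
    'n1': [7.71, 10.00, 10.00, 0.00],
    'n2': [0.00, 10.00, 9.91, 0.00],
})

# shape is a (rows, columns) tuple, so it unpacks like any other tuple
n_rows, n_cols = df_sample.shape

# len() on a DataFrame counts rows only
row_count = len(df_sample)

print(n_rows, n_cols, row_count)
```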

### Head
Shows the first *n* rows of the DataFrame _(n: int = 5)_

In [5]:
df.head()

|  | nome | n1 | n2 |
|---|---|---|---|
| 0 | Alany | 7.71 | 0.0 |
| 1 | Antonio | 10.0 | 10.0 |
| 2 | Danilo | 10.0 | 9.91 |
| 3 | Davi | 0.0 | 0.0 |
| 4 | David | 8.07 | 7.67 |


### Tail
Shows the last *n* rows of the DataFrame _(n: int = 5)_

In [6]:
df.tail()

|  | nome | n1 | n2 |
|---|---|---|---|
| 16 | Mariana | 7.21 | 0.0 |
| 17 | Mateus | 10.0 | 10.0 |
| 18 | Renata | 3.0 | 0.0 |
| 19 | Vicente | 9.14 | 9.94 |
| 20 | Viviane | 0.0 | 0.0 |
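
Both `head` and `tail` accept an explicit *n*. The sketch below passes it on a six-row illustrative DataFrame (`df_sample` is a hypothetical name). It also shows a negative *n*, which `head` interprets as "all rows except the last |n|".

```python
import pandas as pd

# Six-row illustrative subset of the notebook's data (hypothetical name)
df_sample = pd.DataFrame({
    'nome': ['Alany', 'Antonio', 'Danilo', 'Davi', 'David', 'Francisca'],
    'n1': [7.71, 10.00, 10.00, 0.00, 8.07, 0.00],
})

# First two rows
first_two = df_sample.head(2)

# Last two rows; the original index labels are preserved
last_two = df_sample.tail(2)

# Negative n: every row except the last two
all_but_last_two = df_sample.head(-2)

print(first_two)
print(last_two)
print(all_but_last_two)
```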


### Column description
`describe()` reports, for each numeric column:
- `count`: number of non-null values
- `mean`: arithmetic mean
- `std`: sample standard deviation
- `min`: smallest value
- `25%`: first quartile
- `50%`: second quartile (median)
- `75%`: third quartile
- `max`: largest value

In [7]:
df.describe()

|  | n1 | n2 |
|---|---|---|
| count | 21.0 | 21.0 |
| mean | 6.377143 | 4.176667 |
| std | 3.676596 | 4.5269 |
| min | 0.0 | 0.0 |
| 25% | 3.86 | 0.0 |
| 50% | 7.5 | 0.67 |
| 75% | 9.14 | 8.85 |
| max | 10.0 | 10.0 |
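
Each statistic in this summary can also be computed on its own. As a rough cross-check, the sketch below rebuilds the `n1` column as a standalone Series (the variable names are illustrative) and calls the individual methods. The results should agree with the `n1` column of the table above.

```python
import pandas as pd

# The 21 values of the n1 column from the notebook
n1 = pd.Series([
    7.71, 10.00, 10.00, 0.00, 8.07, 0.00,
    0.00, 7.50, 9.00, 7.29, 7.71, 10.00,
    10.00, 7.14, 6.29, 3.86, 7.21, 10.00,
    3.00, 9.14, 0.00,
], name='n1')

summary = n1.describe()

count = n1.count()        # number of non-null values
mean = n1.mean()          # arithmetic mean
std = n1.std()            # sample standard deviation (ddof=1)
q1 = n1.quantile(0.25)    # first quartile
median = n1.median()      # second quartile
q3 = n1.quantile(0.75)    # third quartile

print(summary)
print(count, round(mean, 6), round(std, 6), q1, median, q3)
```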


## Filtering

### Rows

In [8]:
# Row with index label 4
df.loc[4]

nome    David
n1       8.07
n2       7.67
Name: 4, dtype: object

In [9]:
# Rows 1, 3 and 5
df.loc[[1,3,5]]

|  | nome | n1 | n2 |
|---|---|---|---|
| 1 | Antonio | 10.0 | 10.0 |
| 3 | Davi | 0.0 | 0.0 |
| 5 | Francisca | 0.0 | 0.0 |


In [10]:
# Slice by label (the end label is included)
df.loc[1:5]

|  | nome | n1 | n2 |
|---|---|---|---|
| 1 | Antonio | 10.0 | 10.0 |
| 2 | Danilo | 10.0 | 9.91 |
| 3 | Davi | 0.0 | 0.0 |
| 4 | David | 8.07 | 7.67 |
| 5 | Francisca | 0.0 | 0.0 |
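
Note that `df.loc[1:5]` returns five rows: `.loc` slices by index label and includes the end label, unlike ordinary Python slicing. The sketch below contrasts it with `.iloc`, the position-based indexer. `.iloc` is not used in this notebook and is shown only for comparison, and `df_sample` is a hypothetical name.

```python
import pandas as pd

# Six-row illustrative subset of the notebook's data (hypothetical name)
df_sample = pd.DataFrame({
    'nome': ['Alany', 'Antonio', 'Danilo', 'Davi', 'David', 'Francisca'],
    'n1': [7.71, 10.00, 10.00, 0.00, 8.07, 0.00],
})

# Label-based slice: both endpoints are included (labels 1, 2, 3)
by_label = df_sample.loc[1:3]

# Position-based slice: end is excluded, as in a Python list (positions 1, 2)
by_position = df_sample.iloc[1:3]

print(by_label)
print(by_position)
```

With the default integer index, labels and positions coincide, so the only visible difference is whether the end of the slice is included.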


In [11]:
# Slice with step
df.loc[0:10:3]

|  | nome | n1 | n2 |
|---|---|---|---|
| 0 | Alany | 7.71 | 0.0 |
| 3 | Davi | 0.0 | 0.0 |
| 6 | Francisco | 0.0 | 0.0 |
| 9 | João | 7.29 | 6.82 |


In [12]:
# Slice with step
df.loc[0::2]  # Only even-indexed rows

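|  | nome | n1 | n2 |
|---|---|---|---|
| 0 | Alany | 7.71 | 0.0 |
| 2 | Danilo | 10.0 | 9.91 |
| 4 | David | 8.07 | 7.67 |
| 6 | Francisco | 0.0 | 0.0 |
| 8 | Janete | 9.0 | 0.67 |
| 10 | Jorge | 7.71 | 7.15 |
| 12 | Joseph | 10.0 | 9.85 |
| 14 | KaioV | 6.29 | 0.0 |
| 16 | Mariana | 7.21 | 0.0 |
| 18 | Renata | 3.0 | 0.0 |
| 20 | Viviane | 0.0 | 0.0 |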
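
The same step syntax selects the odd-indexed rows when the slice starts at 1. This is a minimal sketch on an illustrative six-row DataFrame (`df_sample` is a hypothetical name). It assumes the default integer index, where labels match row positions.

```python
import pandas as pd

# Six-row illustrative subset of the notebook's data (hypothetical name)
df_sample = pd.DataFrame({
    'nome': ['Alany', 'Antonio', 'Danilo', 'Davi', 'David', 'Francisca'],
    'n1': [7.71, 10.00, 10.00, 0.00, 8.07, 0.00],
})

# Start at label 0 and take every second row: even-indexed rows
even_rows = df_sample.loc[0::2]

# Start at label 1 and take every second row: odd-indexed rows
odd_rows = df_sample.loc[1::2]

print(even_rows)
print(odd_rows)
```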


### Columns

In [13]:
df.n1

0      7.71
1     10.00
2     10.00
3      0.00
4      8.07
5      0.00
6      0.00
7      7.50
8      9.00
9      7.29
10     7.71
11    10.00
12    10.00
13     7.14
14     6.29
15     3.86
16     7.21
17    10.00
18     3.00
19     9.14
20     0.00
Name: n1, dtype: float64

In [14]:
df['n1']

0      7.71
1     10.00
2     10.00
3      0.00
4      8.07
5      0.00
6      0.00
7      7.50
8      9.00
9      7.29
10     7.71
11    10.00
12    10.00
13     7.14
14     6.29
15     3.86
16     7.21
17    10.00
18     3.00
19     9.14
20     0.00
Name: n1, dtype: float64
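
The two cells above return the same Series. Attribute access (`df.n1`) only works when the column name is a valid Python identifier, whereas bracket access works for any label. The sketch below illustrates this with a hypothetical column called `'final grade'`, which is not in the original data.

```python
import pandas as pd

# Three-row illustrative subset of the notebook's data (hypothetical name)
df_sample = pd.DataFrame({
    'nome': ['Alany', 'Antonio', 'Danilo'],
    'n1': [7.71, 10.00, 10.00],
    'n2': [0.00, 10.00, 9.91],
})

# Attribute and bracket access return equal Series for the same column
same_series = df_sample.n1.equals(df_sample['n1'])

# Hypothetical column whose name contains a space: it can only be
# created and read with brackets (df_sample.final grade is a syntax error)
df_sample['final grade'] = (df_sample['n1'] + df_sample['n2']) / 2
final_grade = df_sample['final grade']

print(same_series)
print(final_grade)
```

Bracket access is also the safer choice when a column name matches an existing DataFrame attribute or method.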

In [15]:
df[['n1', 'nome']]

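|  | n1 | nome |
|---|---|---|
| 0 | 7.71 | Alany |
| 1 | 10.0 | Antonio |
| 2 | 10.0 | Danilo |
| 3 | 0.0 | Davi |
| 4 | 8.07 | David |
| 5 | 0.0 | Francisca |
| 6 | 0.0 | Francisco |
| 7 | 7.5 | Wesley |
| 8 | 9.0 | Janete |
| 9 | 7.29 | João |
| ... | ... | ... |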


### Combined

In [16]:
df[['n1', 'nome']].loc[5:10]

|  | n1 | nome |
|---|---|---|
| 5 | 0.0 | Francisca |
| 6 | 0.0 | Francisco |
| 7 | 7.5 | Wesley |
| 8 | 9.0 | Janete |
| 9 | 7.29 | João |
| 10 | 7.71 | Jorge |
