<p align="center">
  <img src="https://pandas.pydata.org/static/img/pandas_secondary.svg" width="450">
</p>

Table of contents
---------------------------

[Overview](#Overview)

[Advantages](#Advantages)

[Data types](#Data-types)

[Reading the dataset](#Reading-the-dataset)

[Selecting columns](#Selecting-columns)

[Filtering rows of the dataset](#Filtering-rows-of-the-dataset)

[Filtering rows and columns of the dataset](#Filtering-rows-and-columns-of-the-dataset)

[Useful functions](#Useful-functions)

[Exercises](#Exercises)

[References](#References)


## Overview

 - Library for data manipulation and analysis;
 - Provides a set of functions for working with tabular data (2D) and time series (1D);
 - Used in finance, statistics, the social sciences, and many areas of engineering;
 - An alternative to the **R** language;
 
## Advantages
 
 - Missing data is easy to handle;
 - Columns can be easily added or removed;
 - Type conversion;
 - Data visualization;
 - Fast;
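
The points in this list can be sketched on a small made-up table. The column names and values below are illustrative assumptions, not taken from the dataset used later in this notebook.

```python
import pandas as pd

# Hypothetical toy data with one missing value in "doses".
toy = pd.DataFrame({
    "country": ["A", "B", "C"],
    "doses": [100.0, None, 250.0],
})

# Missing data: count the gaps, then replace them.
missing_count = int(toy["doses"].isna().sum())
toy["doses"] = toy["doses"].fillna(0)

# Type conversion: float -> int, possible once no NaN is left.
toy["doses"] = toy["doses"].astype("int64")

# Adding and removing a column.
toy["doses_thousands"] = toy["doses"] / 1000
toy = toy.drop(columns=["doses_thousands"])

print(missing_count)
print(toy)
```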

 
## Data types
### Series
 
 - List of labeled values, all of a single type;
 - Has only one dimension;
 
 <img src="https://pandas.pydata.org/docs/_images/01_table_series.svg">
 <p style="text-align:center;">
    <small>
        Source: pandas documentation [1]
    </small>
 </p>

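In [9]:
import pandas as pd

# Heart-rate readings; the second argument gives the label (index) of each value
frequencia_cardiaca = pd.Series(
    [82, 82, 84, 96, 95, 86, 84, 88, 90, 95, 102],
    index=["inst1", "inst2", "inst3", "inst4", "inst5",
           "inst6", "inst7", "inst8", "inst9", "inst10",
           "inst11"])

# Access a value by its label
frequencia_cardiaca["inst1"]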

82
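
As a complement to the cell above, here is a minimal sketch of the two ways of reaching a value in a `Series`: by label and by position. The shortened series `heart_rate` is an illustrative assumption.

```python
import pandas as pd

# Shortened, illustrative version of the heart-rate series.
heart_rate = pd.Series([82, 84, 96], index=["inst1", "inst2", "inst3"])

by_label = heart_rate["inst2"]     # lookup by label
by_position = heart_rate.iloc[0]   # lookup by integer position

print(by_label)         # 84
print(by_position)      # 82
print(heart_rate.dtype) # a single dtype for the whole Series
```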

### DataFrame
  - 2D table of labeled values;
  - Columns can have different types;
  - Mutable size;
  - Similar to a spreadsheet (e.g. Excel) or an SQL table;
  - Each column of a `DataFrame` is a `Series`;
  
<img src="https://pandas.pydata.org/docs/_images/01_table_dataframe.svg" />

<p style="text-align:center;">    
    <small>
        <b>Source</b>:
        pandas documentation [1]
    </small>
</p>
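
A minimal sketch of the `DataFrame` properties listed above, using hypothetical patient data (names and values are made up for illustration): columns of different types, row labels, and a single column coming back as a `Series`.

```python
import pandas as pd

# Hypothetical values, only to illustrate the structure.
patients = pd.DataFrame(
    {
        "name": ["Ana", "Bruno", "Carla"],
        "heart_rate": [82, 96, 88],
    },
    index=["p1", "p2", "p3"],
)

heart_rate = patients["heart_rate"]  # a single column is a Series

print(type(heart_rate).__name__)     # Series
print(patients.dtypes)               # one dtype per column
print(heart_rate["p2"])              # 96
```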

## Reading the dataset

![](https://pandas.pydata.org/docs/_images/02_io_readwrite.svg)

<p style="text-align:center;">
    <small>
        <b>Source</b>:
        pandas documentation [1]
    </small>
</p>

### Reading data from a CSV file

In [10]:
import pandas as pd
dataset = pd.read_csv("../datasets/country_vaccinations.csv")
dataset.head(1)

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
0,Albania,ALB,2021-01-10,0.0,0.0,,,,0.0,0.0,,,Pfizer/BioNTech,Ministry of Health,https://shendetesia.gov.al/covid19-ministria-e...


<p style="text-align: center;">
    <small>
        <b>Source</b>: Worldwide vaccination progress, Kaggle dataset [2]
    </small>
</p>
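
To try `read_csv` without the file on disk, the sketch below reads CSV text from memory. The three rows are a hand-typed stand-in for the real file, and `parse_dates` is an optional argument that the cell above does not use; it converts the `date` column to a datetime type instead of leaving it as text.

```python
import io

import pandas as pd

# Hand-typed CSV content standing in for a file on disk (illustrative only).
csv_text = """country,date,total_vaccinations
Albania,2021-01-10,0
Albania,2021-01-12,128
Brazil,2021-01-17,112
"""

# read_csv accepts a path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(sample.head(2))
print(sample.dtypes)
```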

<h1 style="text-align: center;"> Dataset description </h1>


| Column                                  | Description |
|:----------------------------------------|:---------|
| **country**                             |   Country   |
| **iso_code**                            |   Country code  |
| **date**                                |   Date the values in the row refer to   |
| **total_vaccinations**                  |   Absolute number of immunizations in the country (can be larger than the number of people)  |
| **people_vaccinated**                   |   Number of people vaccinated   |
| **people_fully_vaccinated**             |   Number of people who received both doses of the vaccine   |
| **daily_vaccinations_raw**              |   Number of daily vaccinations in the country, as reported (raw)   |
| **daily_vaccinations**                  |   Number of daily vaccinations in the country (judging by the values, a smoothed version of the raw column)   |
| **total_vaccinations_per_hundred**      |   Total immunizations per 100 inhabitants of the country   |
| **people_vaccinated_per_hundred**       |   Percentage of the country's population that has been vaccinated   |
| **people_fully_vaccinated_per_hundred** |   Percentage of the country's population that has received the 2nd dose of the vaccine   |
| **daily_vaccinations_per_million**      |   Daily vaccinations per million inhabitants of the country   |
| **vaccines**                            |   Vaccines being administered in the country  |
| **source_name**                         |   Organization that compiled the data   |
| **source_website**                      |   Website that publishes the data   |

### Showing information about the dataset

In [11]:
dataset.shape

(4380, 15)

In [12]:
dataset.columns

Index(['country', 'iso_code', 'date', 'total_vaccinations',
       'people_vaccinated', 'people_fully_vaccinated',
       'daily_vaccinations_raw', 'daily_vaccinations',
       'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred',
       'people_fully_vaccinated_per_hundred', 'daily_vaccinations_per_million',
       'vaccines', 'source_name', 'source_website'],
      dtype='object')

In [13]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4380 entries, 0 to 4379
Data columns (total 15 columns):
 #   Column                               Non-Null Count  Dtype  
---  ------                               --------------  -----  
 0   country                              4380 non-null   object 
 1   iso_code                             4080 non-null   object 
 2   date                                 4380 non-null   object 
 3   total_vaccinations                   2866 non-null   float64
 4   people_vaccinated                    2438 non-null   float64
 5   people_fully_vaccinated              1626 non-null   float64
 6   daily_vaccinations_raw               2421 non-null   float64
 7   daily_vaccinations                   4226 non-null   float64
 8   total_vaccinations_per_hundred       2866 non-null   float64
 9   people_vaccinated_per_hundred        2438 non-null   float64
 10  people_fully_vaccinated_per_hundred  1626 non-null   float64
 11  daily_vaccinations_per_million

In [14]:
dataset.describe()

Unnamed: 0,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million
count,2866.0,2438.0,1626.0,2421.0,4226.0,2866.0,2438.0,1626.0,4226.0
mean,1671404.0,1449764.0,473486.6,74241.04,54879.89,6.953692,5.667063,2.216052,2393.753195
std,5640491.0,4556002.0,1828830.0,207285.0,173458.1,12.932168,9.320872,5.404122,4380.248455
min,0.0,0.0,1.0,-50012.0,1.0,0.0,0.0,0.0,0.0
25%,30892.0,27732.5,10543.5,2224.0,1121.0,0.6025,0.5825,0.2025,321.0
50%,200937.5,179772.5,49799.5,11773.0,5745.0,2.69,2.52,0.825,1056.5
75%,836103.8,714345.5,249891.8,53089.0,26678.5,6.57,5.14,1.8775,2162.5
max,70454060.0,47184200.0,22613360.0,2242472.0,1916190.0,102.7,64.09,38.61,54264.0



## Selecting columns

![](https://pandas.pydata.org/docs/_images/03_subset_columns.svg)

<p style="text-align:center;">
    <b>Source</b>: pandas documentation [1]
</p>

In [15]:
dataset["country"]  # Select a single column

0        Albania
1        Albania
2        Albania
3        Albania
4        Albania
          ...   
4375    Zimbabwe
4376    Zimbabwe
4377    Zimbabwe
4378    Zimbabwe
4379    Zimbabwe
Name: country, Length: 4380, dtype: object

In [17]:
type(dataset["country"])

pandas.core.series.Series

In [18]:
dataset[["country", "date", "daily_vaccinations_raw"]]  # Select several columns

Unnamed: 0,country,date,daily_vaccinations_raw
0,Albania,2021-01-10,
1,Albania,2021-01-11,
2,Albania,2021-01-12,
3,Albania,2021-01-13,60.0
4,Albania,2021-01-14,78.0
...,...,...,...
4375,Zimbabwe,2021-02-22,
4376,Zimbabwe,2021-02-23,2727.0
4377,Zimbabwe,2021-02-24,3831.0
4378,Zimbabwe,2021-02-25,3135.0
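
The difference between the two selections above is the type of the key: a string returns a `Series`, while a list of names returns a `DataFrame`, even when the list has a single name. A minimal sketch on a toy table (values are illustrative):

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["Albania", "Brazil"],
    "date": ["2021-01-10", "2021-01-16"],
    "total_vaccinations": [0.0, 0.0],
})

one_column = toy["country"]             # string key -> Series
as_frame = toy[["country"]]             # list key   -> DataFrame with one column
two_columns = toy[["country", "date"]]  # list key   -> DataFrame

print(type(one_column).__name__, type(as_frame).__name__)
print(two_columns.shape)
```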


## Filtering rows of the dataset

![](https://pandas.pydata.org/docs/_images/03_subset_rows.svg)
 
 <p style="text-align:center;">
    <b>Source</b>: pandas documentation [1]
</p>

In [22]:
# Select the vaccination progress in Brazil.

imunization_in_brazil = dataset[dataset["country"] == "Brazil"]
#imunization_in_brazil
imunization_in_brazil[["country", "total_vaccinations", "date", "source_name"]]

Unnamed: 0,country,total_vaccinations,date,source_name
532,Brazil,0.0,2021-01-16,Regional governments via Coronavirus Brasil
533,Brazil,112.0,2021-01-17,Regional governments via Coronavirus Brasil
534,Brazil,1109.0,2021-01-18,Regional governments via Coronavirus Brasil
535,Brazil,11470.0,2021-01-19,Regional governments via Coronavirus Brasil
536,Brazil,28543.0,2021-01-20,Regional governments via Coronavirus Brasil
537,Brazil,136519.0,2021-01-21,Regional governments via Coronavirus Brasil
538,Brazil,245877.0,2021-01-22,Regional governments via Coronavirus Brasil
539,Brazil,537774.0,2021-01-23,Regional governments via Coronavirus Brasil
540,Brazil,604722.0,2021-01-24,Regional governments via Coronavirus Brasil
541,Brazil,700608.0,2021-01-25,Regional governments via Coronavirus Brasil


In [25]:
# Select the rows for last Thursday and last Friday.

last_thursday = "2021-02-25"
last_friday = "2021-02-26"

imunization_in_brazil[
    (imunization_in_brazil["date"] == last_friday) |
    (imunization_in_brazil["date"] == last_thursday)]

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
572,Brazil,BRA,2021-02-25,7799000.0,6202055.0,1596945.0,247324.0,227474.0,3.67,2.92,0.75,1070.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
573,Brazil,BRA,2021-02-26,8101787.0,6346769.0,1755018.0,302787.0,223804.0,3.81,2.99,0.83,1053.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
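
The filters above work because a comparison such as `dataset["country"] == "Brazil"` produces a boolean `Series` with one value per row. Conditions are combined with `&` (and), `|` (or) and `~` (not), and each condition needs its own parentheses because these operators bind more tightly than `==`. A minimal sketch on illustrative data:

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["Brazil", "Brazil", "Argentina", "Chile"],
    "date": ["2021-02-25", "2021-02-26", "2021-02-26", "2021-02-26"],
})

is_brazil = toy["country"] == "Brazil"   # boolean Series, one value per row
is_friday = toy["date"] == "2021-02-26"

brazil_on_friday = toy[is_brazil & is_friday]  # both conditions hold
brazil_or_friday = toy[is_brazil | is_friday]  # at least one holds
not_brazil = toy[~is_brazil]                   # negation

print(len(brazil_on_friday), len(brazil_or_friday), len(not_brazil))  # 1 4 2
```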


## Filtering rows and columns of the dataset


 ![](https://pandas.pydata.org/docs/_images/03_subset_columns_rows.svg)
 
 <p style="text-align:center;">
    <b>Source</b>: pandas documentation [1]
</p>

### Filtering by labels

In [26]:
# Goal: select the total vaccinations on last Friday in Brazil and Argentina.
# First step: keep only the rows for these two countries.

imunization_br_arg = dataset[
    (dataset["country"] == "Brazil") | (dataset["country"] == "Argentina")]
imunization_br_arg

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
111,Argentina,ARG,2020-12-29,700.0,,,,,0.00,,,,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
112,Argentina,ARG,2020-12-30,,,,,15656.0,,,,346.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
113,Argentina,ARG,2020-12-31,32013.0,,,,15656.0,0.07,,,346.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
114,Argentina,ARG,2021-01-01,,,,,11070.0,,,,245.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
115,Argentina,ARG,2021-01-02,,,,,8776.0,,,,194.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
569,Brazil,BRA,2021-02-22,7028356.0,5857080.0,1171276.0,77554.0,247768.0,3.31,2.76,0.55,1166.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
570,Brazil,BRA,2021-02-23,7297061.0,6002873.0,1294188.0,268705.0,241018.0,3.43,2.82,0.61,1134.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
571,Brazil,BRA,2021-02-24,7551676.0,6116082.0,1435594.0,254615.0,238305.0,3.55,2.88,0.68,1121.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
572,Brazil,BRA,2021-02-25,7799000.0,6202055.0,1596945.0,247324.0,227474.0,3.67,2.92,0.75,1070.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/


In [28]:
# Range of labels (a label slice includes both endpoints)

thu_friday_br_arg = imunization_br_arg.loc[
    (imunization_br_arg["date"] == last_friday) |
    (imunization_br_arg["date"] == last_thursday),
    "country":"total_vaccinations"
]
thu_friday_br_arg

Unnamed: 0,country,iso_code,date,total_vaccinations
169,Argentina,ARG,2021-02-25,829832.0
170,Argentina,ARG,2021-02-26,903915.0
572,Brazil,BRA,2021-02-25,7799000.0
573,Brazil,BRA,2021-02-26,8101787.0


In [33]:
# Set of labels

thu_friday_br_arg.loc[:,["date", "total_vaccinations"]]

Unnamed: 0,date,total_vaccinations
169,2021-02-25,829832.0
170,2021-02-26,903915.0
572,2021-02-25,7799000.0
573,2021-02-26,8101787.0
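
One detail of `.loc` worth seeing in isolation: label slices include both endpoints, for columns and for rows. The sketch below uses a toy table whose index values and numbers are illustrative assumptions.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "country": ["Argentina", "Brazil", "Chile"],
        "iso_code": ["ARG", "BRA", "CHL"],
        "date": ["2021-02-26"] * 3,
        "total_vaccinations": [10.0, 20.0, 30.0],
    },
    index=[169, 573, 700],
)

# Column slice by label: "country" and "date" are both included.
by_slice = toy.loc[:, "country":"date"]

# Row slice by label: the row labelled 573 is included as well.
rows = toy.loc[169:573, ["country", "total_vaccinations"]]

print(list(by_slice.columns))  # ['country', 'iso_code', 'date']
print(list(rows.index))        # [169, 573]
```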


### Filtering by integer position

In [34]:
thu_friday_br_arg = imunization_br_arg.loc[
    (imunization_br_arg["date"] == last_friday) |
    (imunization_br_arg["date"] == last_thursday)
]

thu_friday_br_arg

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
169,Argentina,ARG,2021-02-25,829832.0,558831.0,271001.0,49377.0,27750.0,1.84,1.24,0.6,614.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
170,Argentina,ARG,2021-02-26,903915.0,620635.0,283280.0,74083.0,33609.0,2.0,1.37,0.63,744.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
572,Brazil,BRA,2021-02-25,7799000.0,6202055.0,1596945.0,247324.0,227474.0,3.67,2.92,0.75,1070.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
573,Brazil,BRA,2021-02-26,8101787.0,6346769.0,1755018.0,302787.0,223804.0,3.81,2.99,0.83,1053.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/


In [35]:
thu_friday_br_arg.iloc[:1, :]

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
169,Argentina,ARG,2021-02-25,829832.0,558831.0,271001.0,49377.0,27750.0,1.84,1.24,0.6,614.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...


In [36]:
thu_friday_br_arg.iloc[:3, 0:5]

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated
169,Argentina,ARG,2021-02-25,829832.0,558831.0
170,Argentina,ARG,2021-02-26,903915.0,620635.0
572,Brazil,BRA,2021-02-25,7799000.0,6202055.0
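
Unlike `.loc`, `.iloc` ignores the index labels and counts positions from zero, with the end of a slice excluded as in plain Python. A minimal sketch with illustrative values:

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "country": ["Argentina", "Argentina", "Brazil", "Brazil"],
        "date": ["2021-02-25", "2021-02-26", "2021-02-25", "2021-02-26"],
        "total_vaccinations": [1.0, 2.0, 3.0, 4.0],
    },
    index=[169, 170, 572, 573],
)

first_two = toy.iloc[:2, 0:2]  # rows 0-1 and columns 0-1 (end excluded)
last_row = toy.iloc[-1]        # negative positions count from the end
by_label = toy.loc[572]        # .loc uses the index label, not the position

print(first_two)
print(last_row["total_vaccinations"], by_label["total_vaccinations"])  # 4.0 3.0
```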



## Useful functions

### isin

Checks whether each value of a column is present in the list passed as an argument.

In [39]:
# Keep the rows whose country is in the given list
im_br_arg_ch = dataset[dataset["country"].isin(["Brazil", "Argentina"])]

im_br_arg_ch


#im_br_arg_ch.loc[
#    im_br_arg_ch["date"].isin([last_thursday, last_friday]),
#    ["country", "date", "vaccines"]
#]


Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
111,Argentina,ARG,2020-12-29,700.0,,,,,0.00,,,,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
112,Argentina,ARG,2020-12-30,,,,,15656.0,,,,346.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
113,Argentina,ARG,2020-12-31,32013.0,,,,15656.0,0.07,,,346.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
114,Argentina,ARG,2021-01-01,,,,,11070.0,,,,245.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
115,Argentina,ARG,2021-01-02,,,,,8776.0,,,,194.0,Sputnik V,Ministry of Health,http://datos.salud.gob.ar/dataset/vacunas-cont...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
569,Brazil,BRA,2021-02-22,7028356.0,5857080.0,1171276.0,77554.0,247768.0,3.31,2.76,0.55,1166.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
570,Brazil,BRA,2021-02-23,7297061.0,6002873.0,1294188.0,268705.0,241018.0,3.43,2.82,0.61,1134.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
571,Brazil,BRA,2021-02-24,7551676.0,6116082.0,1435594.0,254615.0,238305.0,3.55,2.88,0.68,1121.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
572,Brazil,BRA,2021-02-25,7799000.0,6202055.0,1596945.0,247324.0,227474.0,3.67,2.92,0.75,1070.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
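
`isin` returns a boolean `Series`, so it can replace a chain of `==` comparisons joined with `|` and can be combined with other masks. A minimal sketch on illustrative data:

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["Argentina", "Brazil", "Chile", "Brazil"],
    "date": ["2021-02-25", "2021-02-25", "2021-02-26", "2021-02-27"],
})

wanted_countries = ["Brazil", "Argentina", "Chile"]
wanted_dates = ["2021-02-25", "2021-02-26"]

# Both masks are boolean Series aligned with the rows of toy.
mask = toy["country"].isin(wanted_countries) & toy["date"].isin(wanted_dates)
selected = toy.loc[mask, ["country", "date"]]

print(selected)
```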


### value_counts

Counts the number of rows in each category (e.g. **"Brazil"**, **"Sinovac"**) of a variable (e.g. **country**, **vaccines**).

In [40]:
# Each row is one day of data for a country, so the count is the number of days recorded
amount_of_days = dataset["country"].value_counts()
print("Immunization in Brazil started", amount_of_days["Brazil"], "days ago")

Immunization in Brazil started 42 days ago
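
A small self-contained sketch of `value_counts`, using a made-up series of vaccine names. The `normalize=True` argument returns relative frequencies instead of absolute counts.

```python
import pandas as pd

# Illustrative values, not taken from the dataset.
vaccines = pd.Series(
    ["Sinovac", "Pfizer/BioNTech", "Sinovac", "Sputnik V", "Sinovac"]
)

counts = vaccines.value_counts()                # most frequent first
shares = vaccines.value_counts(normalize=True)  # fractions that sum to 1

print(counts["Sinovac"])  # 3
print(shares["Sinovac"])  # 0.6
```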


### fillna (fill NaN)

Replaces missing values (NaN) with the specified value.

In [43]:
people_fully_vaccinated_br = dataset.loc[
    dataset["country"] == "Brazil",
    ["country", "date", "people_fully_vaccinated"]]

filled_with_zeros = people_fully_vaccinated_br["people_fully_vaccinated"].fillna(0)

#people_fully_vaccinated_br["people_fully_vaccinated"] = filled_with_zeros

#people_fully_vaccinated_br

filled_with_zeros

532          0.0
533          0.0
534          0.0
535          0.0
536          0.0
537          0.0
538          0.0
539          0.0
540          0.0
541          0.0
542          0.0
543          0.0
544          0.0
545          0.0
546          0.0
547          0.0
548          0.0
549          0.0
550          0.0
551          0.0
552          0.0
553       1962.0
554      19677.0
555      25688.0
556      33616.0
557      50655.0
558      80760.0
559     109866.0
560     178468.0
561     194230.0
562     217869.0
563     285620.0
564     557532.0
565     723755.0
566     894673.0
567    1056807.0
568    1132894.0
569    1171276.0
570    1294188.0
571    1435594.0
572    1596945.0
573    1755018.0
Name: people_fully_vaccinated, dtype: float64
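
`fillna` returns a new object and leaves the original untouched, which is why the result above is stored in a separate variable. For a cumulative column, carrying the last known value forward with `ffill` may be a reasonable alternative to zeros; that is a suggestion, not something the notebook does. A sketch with illustrative values:

```python
import pandas as pd

fully_vaccinated = pd.Series(
    [None, 1962.0, None, 25688.0], name="people_fully_vaccinated"
)

with_zeros = fully_vaccinated.fillna(0)  # every NaN becomes 0
carried = fully_vaccinated.ffill()       # repeat the last known value
carried_then_zero = carried.fillna(0)    # the leading NaN has no previous value

print(with_zeros.tolist())         # [0.0, 1962.0, 0.0, 25688.0]
print(carried_then_zero.tolist())  # [0.0, 1962.0, 1962.0, 25688.0]
```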

### nunique (number of unique)

Counts the number of distinct values in the specified column.

In [None]:
country_amount = dataset["country"].nunique()
print("Number of distinct countries in the dataset:", country_amount)
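
By default `nunique` ignores missing values; passing `dropna=False` counts them as one extra value. A minimal sketch on an illustrative series:

```python
import pandas as pd

countries = pd.Series(["Brazil", "Brazil", "Argentina", None])

distinct = countries.nunique()                       # missing value ignored
distinct_with_nan = countries.nunique(dropna=False)  # missing value counted
names = list(countries.dropna().unique())            # the distinct values themselves

print(distinct)           # 2
print(distinct_with_nan)  # 3
print(names)              # ['Brazil', 'Argentina']
```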

### sort_values

Sorts the rows of a DataFrame by the values of a column.

In [44]:
# Sort the countries by the rate of fully vaccinated people (%)

imun_last_friday = dataset[dataset["date"] == last_friday]

imun_last_friday = imun_last_friday.loc[
    imun_last_friday["people_fully_vaccinated_per_hundred"].notna(),
    ["country","date","people_fully_vaccinated_per_hundred"]]

imun_last_friday = imun_last_friday.sort_values(by="people_fully_vaccinated_per_hundred", ascending=False)

#imun_last_friday["rank"] = [i for i in range(1, imun_last_friday.shape[0]+1)]

imun_last_friday

Unnamed: 0,country,date,people_fully_vaccinated_per_hundred
2126,Israel,2021-02-26,38.07
3614,Serbia,2021-02-26,7.45
2056,Isle of Man,2021-02-26,7.39
4289,United States,2021-02-26,6.76
1865,Iceland,2021-02-26,3.68
3374,Romania,2021-02-26,3.2
3192,Poland,2021-02-26,3.09
1701,Greece,2021-02-26,2.89
2476,Lithuania,2021-02-26,2.67
1806,Hungary,2021-02-26,2.53
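
A sketch of the same ranking idea on a toy table. The country "Atlantis" and its missing value are made up to show that `sort_values` places NaN last by default, and `reset_index(drop=True)` is one possible way to derive a rank column from the new row order.

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["Serbia", "Israel", "Atlantis", "Brazil"],
    "people_fully_vaccinated_per_hundred": [7.45, 38.07, None, 0.83],
})

ranked = (
    toy.sort_values(by="people_fully_vaccinated_per_hundred", ascending=False)
       .reset_index(drop=True)   # renumber the rows 0..n-1 in the new order
)
ranked["rank"] = ranked.index + 1

print(ranked)
```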


### apply

Runs a function on each value of a column.

In [51]:
falkland = dataset[dataset["country"] == "Falkland Islands"]

falkland


Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
1404,Falkland Islands,FLK,07/02/2021,0.0,0.0,,,,0.0,0.0,,,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1405,Falkland Islands,FLK,08/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1406,Falkland Islands,FLK,09/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1407,Falkland Islands,FLK,10/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1408,Falkland Islands,FLK,11/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1409,Falkland Islands,FLK,12/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1410,Falkland Islands,FLK,13/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1411,Falkland Islands,FLK,14/02/2021,,,,,189.0,,,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
1412,Falkland Islands,FLK,15/02/2021,1515.0,1515.0,,,189.0,43.5,43.5,,54264.0,Oxford/AstraZeneca,Government of the Falkland Islands,https://www.facebook.com/FalkIandsGov/posts/42...
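
The cell above only displays rows and does not call `apply` itself. As an illustration, the sketch below applies a function to every value of a date column, turning `YYYY-MM-DD` strings into the day-first format seen in that output. Whether the notebook performed this exact conversion is an assumption; `to_day_first` is a hypothetical helper.

```python
import pandas as pd

dates = pd.Series(["2021-02-07", "2021-02-08", "2021-02-15"], name="date")


def to_day_first(iso_date):
    """Hypothetical helper: 'YYYY-MM-DD' -> 'DD/MM/YYYY'."""
    year, month, day = iso_date.split("-")
    return f"{day}/{month}/{year}"


# apply calls the function once per value and collects the results.
converted = dates.apply(to_day_first)

# Vectorized alternative that avoids a Python-level function call per row.
vectorized = pd.to_datetime(dates).dt.strftime("%d/%m/%Y")

print(converted.tolist())  # ['07/02/2021', '08/02/2021', '15/02/2021']
```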


### drop

Removes rows or columns from a DataFrame.

In [52]:
dataset.head(3)

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
0,Albania,ALB,,0.0,0.0,,,,0.0,0.0,,,Pfizer/BioNTech,Ministry of Health,https://shendetesia.gov.al/covid19-ministria-e...
1,Albania,ALB,,,,,,64.0,,,,22.0,Pfizer/BioNTech,Ministry of Health,https://shendetesia.gov.al/covid19-ministria-e...
2,Albania,ALB,,128.0,128.0,,,64.0,0.0,0.0,,22.0,Pfizer/BioNTech,Ministry of Health,https://shendetesia.gov.al/covid19-ministria-e...


In [55]:
# Remove columns from the DataFrame by their labels
dataset = dataset.drop(["iso_code", "source_website"], axis=1)


In [56]:
dataset.head(3)

Unnamed: 0,country,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name
0,Albania,,0.0,0.0,,,,0.0,0.0,,,Pfizer/BioNTech,Ministry of Health
1,Albania,,,,,,64.0,,,,22.0,Pfizer/BioNTech,Ministry of Health
2,Albania,,128.0,128.0,,,64.0,0.0,0.0,,22.0,Pfizer/BioNTech,Ministry of Health


In [57]:
# Remove rows from the DataFrame by their index labels

dataset = dataset.drop([0, 1])

In [58]:
dataset.head(3)

Unnamed: 0,country,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name
2,Albania,,128.0,128.0,,,64.0,0.0,0.0,,22.0,Pfizer/BioNTech,Ministry of Health
3,Albania,,188.0,188.0,,60.0,63.0,0.01,0.01,,22.0,Pfizer/BioNTech,Ministry of Health
4,Albania,,266.0,266.0,,78.0,66.0,0.01,0.01,,23.0,Pfizer/BioNTech,Ministry of Health
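
`drop` returns a new DataFrame, which is why the cells above assign the result back to `dataset`. The keyword forms `columns=` and `index=` are equivalent to the `axis` argument and can be easier to read. A sketch on illustrative data:

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["Albania", "Albania", "Brazil"],
    "iso_code": ["ALB", "ALB", "BRA"],
    "total_vaccinations": [0.0, 128.0, 112.0],
})

without_column = toy.drop(columns=["iso_code"])   # same as drop(["iso_code"], axis=1)
without_rows = toy.drop(index=[0, 1])             # same as drop([0, 1])
renumbered = without_rows.reset_index(drop=True)  # optional: restart the index at 0

print(list(without_column.columns))
print(list(without_rows.index), list(renumbered.index))  # [2] [0]
```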


## Exercises

1. Filter the rows for the first and last vaccination dates of the United States, without using the `head` and `tail` functions.


2. Sort the countries in descending order of total vaccinations (`total_vaccinations`) on 
`15/02/2021`.


3. Using the `apply` function, set the `people_fully_vaccinated` column to zero in the rows where its value is **NaN**.



In [3]:
dataset[dataset["country"] == "Brazil"]

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
532,Brazil,BRA,2021-01-16,0.0,0.0,,,,0.0,0.0,,,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
533,Brazil,BRA,2021-01-17,112.0,112.0,,112.0,112.0,0.0,0.0,,1.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
534,Brazil,BRA,2021-01-18,1109.0,1109.0,,997.0,554.0,0.0,0.0,,3.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
535,Brazil,BRA,2021-01-19,11470.0,11470.0,,10361.0,3823.0,0.01,0.01,,18.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
536,Brazil,BRA,2021-01-20,28543.0,28543.0,,17073.0,7136.0,0.01,0.01,,34.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
537,Brazil,BRA,2021-01-21,136519.0,136519.0,,107976.0,27304.0,0.06,0.06,,128.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
538,Brazil,BRA,2021-01-22,245877.0,245877.0,,109358.0,40980.0,0.12,0.12,,193.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
539,Brazil,BRA,2021-01-23,537774.0,537774.0,,291897.0,76825.0,0.25,0.25,,361.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
540,Brazil,BRA,2021-01-24,604722.0,604722.0,,66948.0,86373.0,0.28,0.28,,406.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/
541,Brazil,BRA,2021-01-25,700608.0,700608.0,,95886.0,99928.0,0.33,0.33,,470.0,"Oxford/AstraZeneca, Sinovac",Regional governments via Coronavirus Brasil,https://coronavirusbra1.github.io/


## References

[[1]](https://pandas.pydata.org/docs) pandas documentation

[[2]](https://www.kaggle.com/gpreda/covid-world-vaccination-progress) Worldwide COVID-19 vaccination progress dataset

[[3]](https://www.w3schools.com/python/python_lists_comprehension.asp) Python list comprehensions, W3Schools

[[4]](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf) Pandas Cheat Sheet