# pandas

`pandas` is a library written to make it easier to manipulate tabular data.

If your environment is set up correctly, the cell below will run without errors.

In [1]:
import pandas

## Importing data

To import the data, you first need to *have* the data.

In our case we will use the file about series and movies on [Netflix](https://www.netflix.com/), which can be found at [Kaggle - Datasets (Netflix Shows)](https://www.kaggle.com/datasets/shivamb/netflix-shows).

A copy of the file is [here](data/netflix_titles.csv).

In [2]:
netflix_shows = pandas.read_csv("data/netflix_titles.csv") # Loads the data into the format used by pandas
netflix_shows # Shows the first 5 and last 5 rows of the DataFrame (the tabular type used by pandas)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a..."
8803,s8804,TV Show,Zombie Dumb,,,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g..."
8804,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."
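If the CSV file is not at hand, the same `read_csv` call can be tried on text held in memory. The sketch below is only an illustration: `csv_text` and `sample` are hypothetical names, and the three rows are a hand-copied subset of the columns shown above, not the real file.

```python
import io

import pandas

# Hypothetical in-memory sample with a few of the columns from the real file
csv_text = """show_id,type,title,release_year
s1,Movie,Dick Johnson Is Dead,2020
s2,TV Show,Blood & Water,2021
s3,TV Show,Ganglands,2021
"""

# read_csv accepts a file path or any file-like object, such as StringIO
sample = pandas.read_csv(io.StringIO(csv_text))

print(sample.shape)   # number of rows and columns
print(sample.dtypes)  # column types inferred by pandas
```

Here `release_year` is inferred as an integer column, which is what later allows numeric comparisons and summaries on it.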


## DataFrame

`DataFrame` is the data type the `pandas` library uses to manipulate tabular data.

In [3]:
netflix_shows.shape # Shows how many rows and columns the DataFrame has

(8807, 12)

It is possible to work with a single column, which `pandas` calls a `Series` and which behaves slightly differently from a `DataFrame`.

In [4]:
tipos = netflix_shows["type"] # Selects only the type column of the DataFrame; the result is a Series
tipos.head() # Shows only the first rows of that Series

0      Movie
1    TV Show
2    TV Show
3    TV Show
4    TV Show
Name: type, dtype: object
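One visible difference between the two types is dimensionality. As a minimal sketch on a small hand-built table (the names `sample`, `as_series` and `as_frame` are hypothetical), single brackets return a one-dimensional `Series`, while a one-element list of column names returns a two-dimensional `DataFrame`:

```python
import pandas

# Hypothetical three-row table standing in for netflix_shows
sample = pandas.DataFrame(
    {
        "type": ["Movie", "TV Show", "TV Show"],
        "director": ["Kirsten Johnson", None, "Julien Leclercq"],
    }
)

as_series = sample["type"]    # single brackets: a Series
as_frame = sample[["type"]]   # list of column names: a DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```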

It is also possible to select multiple columns.

In [5]:
diretores = netflix_shows[["type", "director"]] # Selects the two columns, producing a new DataFrame
diretores.head()

Unnamed: 0,type,director
0,Movie,Kirsten Johnson
1,TV Show,
2,TV Show,Julien Leclercq
3,TV Show,
4,TV Show,


It is possible to select rows using the result of filtering by a criterion.

In [6]:
lançamentos_recentes = netflix_shows[netflix_shows["release_year"] >= 2021] # All shows released from 2021 onward.
lançamentos_recentes.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
5,s6,TV Show,Midnight Mass,Mike Flanagan,"Kate Siegel, Zach Gilford, Hamish Linklater, H...",,"September 24, 2021",2021,TV-MA,1 Season,"TV Dramas, TV Horror, TV Mysteries",The arrival of a charismatic young priest brin...


It is possible to filter by multiple criteria at once with the logical _and_ operator (`&`) or the logical _or_ operator (`|`). Each criterion must be wrapped in parentheses.

In [7]:
filmes_do_Flanagan = netflix_shows[(netflix_shows["director"] == "Mike Flanagan") & (netflix_shows["type"] == "Movie")] # all movies by Flanagan
filmes_do_Flanagan

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
5091,s5092,Movie,Before I Wake,Mike Flanagan,"Kate Bosworth, Thomas Jane, Jacob Tremblay, An...",United States,"January 5, 2018",2016,PG-13,97 min,"Horror Movies, Thrillers","Still mourning the death of their son, Mark an..."
5252,s5253,Movie,Gerald's Game,Mike Flanagan,"Carla Gugino, Bruce Greenwood, Henry Thomas, C...",United States,"September 29, 2017",2017,TV-MA,103 min,"Horror Movies, Thrillers","When her husband's sex game goes wrong, Jessie..."
5852,s5853,Movie,Hush,Mike Flanagan,"John Gallagher Jr., Kate Siegel, Michael Trucc...",United States,"April 8, 2016",2016,R,82 min,"Horror Movies, Thrillers",A deaf writer who retreated into the woods to ...
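The cell above only exercises `&`. A minimal sketch of the `|` case follows, on a hypothetical four-row table (`sample`, `mask` and `selected` are illustrative names, and the rows are hand-copied from the outputs above). The parentheses matter because `&` and `|` bind more tightly than `==` and `>=` in Python.

```python
import pandas

# Hypothetical subset of the dataset, typed in by hand for illustration
sample = pandas.DataFrame(
    {
        "title": ["Hush", "Midnight Mass", "Ganglands", "Zodiac"],
        "director": ["Mike Flanagan", "Mike Flanagan", "Julien Leclercq", "David Fincher"],
        "type": ["Movie", "TV Show", "TV Show", "Movie"],
        "release_year": [2016, 2021, 2021, 2007],
    }
)

# A row is kept when at least one of the two conditions holds
mask = (sample["director"] == "Mike Flanagan") | (sample["release_year"] >= 2021)
selected = sample[mask]

print(selected["title"].tolist())
```

Only the `Zodiac` row fails both conditions, so it is the only one left out.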


The `notna` method, called here on a single column (a `Series`), returns a Series of booleans in which every element without a defined value is `False` and every other element is `True`.

In [8]:
diretores_com_nome = netflix_shows["director"].notna()
diretores_com_nome

0        True
1       False
2        True
3       False
4       False
        ...  
8802     True
8803    False
8804     True
8805     True
8806     True
Name: director, Length: 8807, dtype: bool
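As a small sketch of how such a mask can be used on its own, the example below builds a hypothetical four-element column (`directors` is an illustrative name) and compares `notna` with its counterpart `isna`. Since `True` counts as 1, summing the mask gives the number of defined values.

```python
import pandas

# Hypothetical column with two missing values
directors = pandas.Series(
    ["Kirsten Johnson", None, "Julien Leclercq", None], name="director"
)

has_director = directors.notna()  # True where a value is present
missing = directors.isna()        # True where the value is missing

# The two masks are exact complements; ~ negates a boolean Series
print((has_director == ~missing).all())

# Counting the True entries
print(int(has_director.sum()), int(missing.sum()))
```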

We can then use this boolean Series to filter all the movies that have a director defined.

In [9]:
filmes_com_diretores = netflix_shows[(netflix_shows["type"] == "Movie") & diretores_com_nome]
filmes_com_diretores

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha","Vanessa Hudgens, Kimiko Glenn, James Marsden, ...",,"September 24, 2021",2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...
7,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
9,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
12,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
...,...,...,...,...,...,...,...,...,...,...,...,...
8801,s8802,Movie,Zinzana,Majid Al Ansari,"Ali Suliman, Saleh Bakri, Yasa, Ali Al-Jabri, ...","United Arab Emirates, Jordan","March 9, 2016",2015,TV-MA,96 min,"Dramas, International Movies, Thrillers",Recovering alcoholic Talal wakes up inside a s...
8802,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a..."
8804,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."


The last method in this introduction is `describe`.

In [10]:
netflix_shows.describe() # Shows summary descriptive statistics for the numeric columns of the DataFrame

Unnamed: 0,release_year
count,8807.0
mean,2014.180198
std,8.819312
min,1925.0
25%,2013.0
50%,2017.0
75%,2019.0
max,2021.0
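`release_year` is the only column in the summary because, by default, `describe` on a `DataFrame` with mixed column types reports only the numeric ones. The sketch below, on a hypothetical three-row table (`sample`, `numeric_summary` and `type_summary` are illustrative names), shows that calling `describe` on a text column gives a different kind of summary: count, number of unique values, most frequent value and its frequency.

```python
import pandas

# Hypothetical three-row table with one text column and one numeric column
sample = pandas.DataFrame(
    {
        "type": ["Movie", "TV Show", "TV Show"],
        "release_year": [2020, 2021, 2021],
    }
)

# Default behaviour: only numeric columns are summarised
numeric_summary = sample.describe()
print(numeric_summary)

# On a text column: count, unique, top (most frequent value) and freq
type_summary = sample["type"].describe()
print(type_summary)
```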
