# Sorting DataFrames
The `sort_values` method makes it easy to sort DataFrames.

## Example
We again use the practice file `netflix_titles.csv`, which lists all movies and series available on Netflix (or rather, those that were available when the dataset was compiled; the outputs below include titles added as late as January 2021). To sort a DataFrame, we pass the `by` argument to `sort_values`, naming the column or columns to sort by.
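
As a minimal sketch of the call described above, the following uses a tiny hand-made DataFrame instead of the CSV file. The name `toy` and its values are invented for illustration only. It also shows that `sort_values` returns a new DataFrame and leaves the original untouched (unless `inplace=True` is passed).

```python
import pandas as pd

# Hypothetical miniature stand-in for the Netflix data (values invented)
toy = pd.DataFrame({
    "title": ["C", "A", "B"],
    "country": ["Spain", "Argentina", "Egypt"],
})

# `by` accepts a single column name or a list of column names
sorted_toy = toy.sort_values(by="country")

print(sorted_toy["country"].tolist())  # ['Argentina', 'Egypt', 'Spain']
print(toy["country"].tolist())         # original order: ['Spain', 'Argentina', 'Egypt']
```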


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data/netflix_titles.csv")
df.dropna(inplace=True)

In [2]:
df_sorted_by_country = df.sort_values(by=["country"])
df_sorted_by_country.head(3)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
1167,s1168,Movie,Caida del Cielo,Néstor Sánchez Sotelo,"Muriel Santa Ana, Peto Menahem, Héctor Díaz, S...",Argentina,"January 20, 2017",2016,TV-MA,78 min,"Comedies, International Movies, Romantic Movies",When Julia literally falls into Alejandro's ba...
1375,s1376,Movie,Chronicle of an Escape,Israel Adrián Caetano,"Rodrigo de la Serna, Pablo Echarri, Nazareno C...",Argentina,"June 15, 2018",2006,R,104 min,"Dramas, International Movies, Thrillers",Soccer goalie Claudio Tamburrini is kidnapped ...
3250,s3251,Movie,Just Love,"Andy Caballero, Diego Corsini","Franco Masini, Yamila Saud, Victorio D'Alessan...",Argentina,"January 24, 2019",2018,TV-MA,96 min,"Dramas, International Movies, Music & Musicals",Inspired by his love affair with a conservativ...


In [14]:
# All titles from Argentina released after 1990, sorted by release year in descending order
argentinian_movies = (
    df
    .loc[(df["country"]=="Argentina") & (df["release_year"] >  1990)]
    .sort_values(["release_year"], ascending=False)
)
argentinian_movies.head(3)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
426,s427,TV Show,Almost Happy,Hernán Guerschuny,"Sebastián Wainraich, Natalie Pérez, Santiago K...",Argentina,"May 2, 2020",2020,TV-MA,1 Season,"International TV Shows, Spanish-Language TV Sh...","Sebastián is a radio show host of modest fame,..."
3035,s3036,Movie,Intuition,Alejandro Montiel,"Luisana Lopilato, Joaquín Furriel, Rafael Ferr...",Argentina,"May 28, 2020",2020,TV-MA,116 min,"Dramas, International Movies, Thrillers",Police officer Pipa works on her first big cas...
4515,s4516,Movie,Notes for My Son,Carlos Sorín,"Valeria Bertuccelli, Esteban Lamothe, Julián S...",Argentina,"November 24, 2020",2020,TV-MA,84 min,"Dramas, International Movies","Battling terminal cancer, a woman writes a one..."


### Sorting in ascending and descending order
The keyword argument `ascending` determines whether rows are sorted in ascending or descending order. A single boolean applies to all sort columns.
When two column names are passed, the rows are sorted by the first column and then, within equal values of the first column, by the second. This corresponds to `ORDER BY` in SQL.

In [9]:
df_sorted_by_country = df.sort_values(by=["country", "release_year"], ascending=False)
df_sorted_by_country.head(3)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
1493,s1494,Movie,Cook Off,Tomas Brickhill,"Tendaiishe Chitima, Tendai Nguni, Jesese Mungo...",Zimbabwe,"June 1, 2020",2017,TV-G,100 min,"Comedies, International Movies, Romantic Movies","Yearning for a better life, a single mother wi..."
5654,s5655,Movie,Sky Tour: The Movie,Nguyen Thanh Tung,Son Tung M-TP,Vietnam,"September 2, 2020",2020,TV-G,93 min,"Documentaries, International Movies, Music & M...","From the preparations to the performances, thi..."
2318,s2319,Movie,Furie,Le Van Kiet,"Ngo Thanh Van, Phan Thanh Nhien, Mai Cat Vi, T...",Vietnam,"September 25, 2019",2019,TV-MA,97 min,"Action & Adventure, Dramas, International Movies",When traffickers kidnap her daughter from thei...
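
In the cell above, `ascending=False` applies to both columns. If each column should get its own direction, `ascending` can also be given as a list with one boolean per entry in `by`. The sketch below assumes a small invented DataFrame named `toy`; it is not part of the Netflix data.

```python
import pandas as pd

# Hypothetical toy data (values invented for illustration)
toy = pd.DataFrame({
    "country": ["Spain", "Egypt", "Spain", "Egypt"],
    "release_year": [2019, 1973, 2021, 2020],
})

# country from A to Z, and within each country the newest release first,
# comparable to ORDER BY country ASC, release_year DESC in SQL
mixed = toy.sort_values(by=["country", "release_year"], ascending=[True, False])

print(mixed["country"].tolist())       # ['Egypt', 'Egypt', 'Spain', 'Spain']
print(mixed["release_year"].tolist())  # [2020, 1973, 2021, 2019]
```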


## Indexes
The index labels of the DataFrame are preserved when sorting. To give the new, sorted DataFrame a fresh index, set the argument `ignore_index` to `True`.

In [8]:
df_sorted_by_year = df.sort_values(by=["release_year"], ascending=True, ignore_index=True)
df_sorted_by_year[:3]

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s6118,Movie,The Battle of Midway,John Ford,"Henry Fonda, Jane Darwell",United States,"March 31, 2017",1942,TV-14,18 min,"Classic Movies, Documentaries",Director John Ford captures combat footage of ...
1,s7269,Movie,Tunisian Victory,"Frank Capra, John Huston, Hugh Stewart, Roy Bo...",Burgess Meredith,"United States, United Kingdom","March 31, 2017",1944,TV-14,76 min,"Classic Movies, Documentaries",British and American troops join forces to lib...
2,s3426,Movie,Know Your Enemy - Japan,"Frank Capra, Joris Ivens","Walter Huston, Dana Andrews",United States,"March 31, 2017",1945,TV-14,63 min,"Classic Movies, Documentaries",Though culturally insensitive by modern standa...
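
To make the difference visible, the following sketch sorts a small invented DataFrame (`toy`, with arbitrary index labels 10 to 12) once with and once without `ignore_index`. It also shows `reset_index(drop=True)` as an alternative that should give the same result, which can be useful on pandas versions older than 1.0, where `ignore_index` is not available for `sort_values`.

```python
import pandas as pd

# Hypothetical toy data with non-default index labels
toy = pd.DataFrame({"release_year": [2020, 1942, 1999]}, index=[10, 11, 12])

# Default: the original labels travel with their rows
kept = toy.sort_values(by="release_year")

# ignore_index=True: the result is relabelled 0, 1, 2, ...
renumbered = toy.sort_values(by="release_year", ignore_index=True)

# Alternative: sort first, then discard the old labels
same = toy.sort_values(by="release_year").reset_index(drop=True)

print(kept.index.tolist())        # [11, 12, 10]
print(renumbered.index.tolist())  # [0, 1, 2]
```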


## n-largest and n-smallest

`nlargest` and `nsmallest` are two further methods for conveniently retrieving the n rows with the largest or smallest values in a column.

In [19]:
df.nlargest(3, "release_year")

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
1285,s1286,Movie,Charming,Ross Venokur,"Wilmer Valderrama, Demi Lovato, Sia, Nia Varda...","Canada, United States, Cayman Islands","January 8, 2021",2021,TV-Y7,85 min,"Children & Family Movies, Comedies","On the eve of his 21st birthday, an adored pri..."
6477,s6478,TV Show,The Idhun Chronicles,Maite Ruiz De Austri,"Michelle Jenner, Itzan Escamilla, Sergio Mur, ...",Spain,"January 8, 2021",2021,TV-14,2 Seasons,"Anime Series, International TV Shows, Spanish-...",A boy suddenly orphaned fights his parents' ki...
7551,s7552,Movie,What Happened to Mr. Cha?,Kim Dong-kyu,"Cha In-pyo, Cho Dal-hwan, Song Jae-ryong",South Korea,"January 1, 2021",2021,TV-MA,102 min,"Comedies, International Movies","With the peak of his career long behind him, a..."


In [21]:
df.nsmallest(3, "release_year")

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
6117,s6118,Movie,The Battle of Midway,John Ford,"Henry Fonda, Jane Darwell",United States,"March 31, 2017",1942,TV-14,18 min,"Classic Movies, Documentaries",Director John Ford captures combat footage of ...
7268,s7269,Movie,Tunisian Victory,"Frank Capra, John Huston, Hugh Stewart, Roy Bo...",Burgess Meredith,"United States, United Kingdom","March 31, 2017",1944,TV-14,76 min,"Classic Movies, Documentaries",British and American troops join forces to lib...
3425,s3426,Movie,Know Your Enemy - Japan,"Frank Capra, Joris Ivens","Walter Huston, Dana Andrews",United States,"March 31, 2017",1945,TV-14,63 min,"Classic Movies, Documentaries",Though culturally insensitive by modern standa...
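
The two methods can be thought of as a shortcut for sorting and then taking the first n rows. The sketch below checks this on an invented DataFrame (`toy`) and also tries the `keep` parameter, which controls how ties are handled: `"first"` (the default), `"last"`, or `"all"`.

```python
import pandas as pd

# Hypothetical toy data; 1942 appears twice to create a tie
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "release_year": [2021, 1942, 2020, 2000, 1942],
})

# nlargest(n, column) selects the same rows as sorting descending and taking head(n)
top2 = toy.nlargest(2, "release_year")
via_sort = toy.sort_values(by="release_year", ascending=False).head(2)
print(top2["title"].tolist())      # ['A', 'C']
print(via_sort["title"].tolist())  # ['A', 'C']

# With keep="all", every row tied for the last place is returned,
# so the result can contain more than n rows
oldest_all = toy.nsmallest(1, "release_year", keep="all")
print(len(oldest_all))  # 2
```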


## Exercise

1) Sort by release year and show the columns `title`, `release_year`, and `type`.
2) Determine the total number of movies.
3) Show the top 3 movies with the longest duration in a sorted list (the `duration` column must be converted to int).


In [41]:
# Exercise 1)
df_sorted_year = df.sort_values(by=["release_year"])
df_sorted_year.loc[:, ["title", "release_year", "type"]].tail(3)

Unnamed: 0,title,release_year,type
2327,Gabby's Dollhouse,2021,TV Show
4173,Monarca,2021,TV Show
1355,Chris Rock Total Blackout: The Tamborine Exten...,2021,Movie


In [42]:
# Exercise 2)
df_movies_total = (df["type"] == "Movie").sum()
print("Number of movies:", df_movies_total)

# Alternative
df_movies_total = df[df["type"] == "Movie"].shape[0]
print("Number of movies:", df_movies_total)


Number of movies: 5377
Number of movies: 5377


In [44]:
# Exercise 3)
df_movies = df[df["type"] == "Movie"].copy()
df_movies["duration"] = df_movies["duration"].str.replace(" min", "").astype(int)
df_movies_length = df_movies.sort_values(by=["duration"], ascending=False)
df_movies_length.head(5)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
957,s958,Movie,Black Mirror: Bandersnatch,,"Fionn Whitehead, Will Poulter, Craig Parkinson...",United States,"December 28, 2018",2018,TV-MA,312,"Dramas, International Movies, Sci-Fi & Fantasy","In 1984, a young programmer begins to question..."
6850,s6851,Movie,The School of Mischief,Houssam El-Din Mustafa,"Suhair El-Babili, Adel Emam, Saeed Saleh, Youn...",Egypt,"May 21, 2020",1973,TV-14,253,"Comedies, Dramas, International Movies",A high school teacher volunteers to transform ...
4490,s4491,Movie,No Longer kids,Samir Al Asfory,"Said Saleh, Hassan Moustafa, Ahmed Zaki, Youne...",Egypt,"May 21, 2020",1979,TV-14,237,"Comedies, Dramas, International Movies",Hoping to prevent their father from skipping t...
3694,s3695,Movie,Lock Your Girls In,Fouad El-Mohandes,"Fouad El-Mohandes, Sanaa Younes, Sherihan, Ahm...",,"May 21, 2020",1982,TV-PG,233,"Comedies, International Movies, Romantic Movies",A widower believes he must marry off his three...
5108,s5109,Movie,Raya and Sakina,Hussein Kamal,"Suhair El-Babili, Shadia, Abdel Moneim Madboul...",,"May 21, 2020",1984,TV-14,230,"Comedies, Dramas, International Movies",When robberies and murders targeting women swe...
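
As a possible variation on exercise 3, the sketch below extracts the leading number from `duration` with a regular expression instead of stripping the `" min"` suffix, and then uses `nlargest` to return exactly three rows. The DataFrame `toy` and its values are invented for illustration; the column names follow the Netflix file.

```python
import pandas as pd

# Hypothetical toy data shaped like the relevant Netflix columns
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "type": ["Movie", "Movie", "TV Show", "Movie", "Movie"],
    "duration": ["78 min", "312 min", "1 Season", "104 min", "96 min"],
})

# Keep only movies; .copy() avoids modifying a view of the original
movies = toy[toy["type"] == "Movie"].copy()

# Pull out the digits and convert them to integers
movies["duration"] = (
    movies["duration"].str.extract(r"(\d+)", expand=False).astype(int)
)

# The three longest movies, already sorted in descending order
top3 = movies.nlargest(3, "duration")[["title", "duration"]]
print(top3["title"].tolist())     # ['B', 'D', 'E']
print(top3["duration"].tolist())  # [312, 104, 96]
```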
