<div style="text-align: center;">
  <img src="https://github.com/Hack-io-Data/Imagenes/blob/main/01-LogosHackio/logo_naranja@4x.png?raw=true" alt="esquema" />
</div>

# Data Cleaning Lab

In this lab we will use the complete Netflix DataFrame created in the first Pandas labs.

**Instructions:**

1. Read the statement of each exercise carefully.

2. Implement the solution in the code cell provided.

3. Document every function you create during the exercise.

4. After each plot, include its interpretation in a markdown cell.

## Part 1: Data Cleaning and Preparation

#### Exercise 1: Standardizing and cleaning columns

In this exercise you will clean and standardize some key columns so that they are easier to handle and consistent across your analyses. Specifically, you will work with the `date_added` and `duration` columns to convert them to a uniform, structured format.

Instructions:

1. **Convert the `date_added` column**: The `date_added` column contains dates stored as text. Convert it to a `datetime` type that pandas can parse and operate on directly.

2. **Clean the `duration` column**: The `duration` column holds values in different formats, such as "1 Season", "2 Seasons", "90 min", etc. Your task is to extract the number (either the number of seasons or the number of minutes) and create a new column called `duration_cleaned` with those standardized values.


**Expected Result:**
You should get something like this:

| duration   | duration_cleaned |
|------------|-----------------|
| 1 Season   | 1               |
| 90 min     | 90              |
| 2 Seasons  | 2               |
| 45 min     | 45              |
| 3 Seasons  | 3               |
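
As an illustration of the extraction described above, here is a minimal sketch that pulls the leading number out with a regular expression instead of splitting on spaces. The `sample` DataFrame is hypothetical and only mirrors the formats listed in the exercise; keeping missing durations as `<NA>` rather than filling them with zero is an assumption of this sketch, not a requirement of the lab.

```python
import pandas as pd

# Hypothetical sample mirroring the formats described in the exercise
sample = pd.DataFrame({"duration": ["1 Season", "90 min", "2 Seasons", None]})

# Capture the first run of digits; rows without a match (including NaN) stay missing
digits = sample["duration"].str.extract(r"(\d+)", expand=False)

# Convert to a nullable integer type so missing values do not force floats
sample["duration_cleaned"] = pd.to_numeric(digits, errors="coerce").astype("Int64")

print(sample)
```

Because the unit ("min" or "Season") is dropped, the resulting numbers are only comparable within the same `type` of title.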

In [1]:
# import the libraries we need

# Data handling
# -----------------------------------------------------------------------
import pandas as pd
import numpy as np

# To save DataFrames to Excel
# -----------------------------------------------------------------------
from pandas import ExcelWriter

# To generate all possible combinations
# -----------------------------------------------------------------------
import itertools

# Configuration
# -----------------------------------------------------------------------
pd.set_option('display.max_columns', None) # show every column when displaying DataFrames

In [2]:
# load the DataFrame created in the previous lesson
df = pd.read_csv("datos/Netflix_Merge.csv", index_col = 0)

df.sample(5)

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana
s1013,TV Show,Luis Miguel - The Series,,"Diego Boneta, Juan Pablo Zurita, Camila Sodi, ...",Mexico,"April 19, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Spanish-Language TV Sh...",This series dramatizes the life story of Mexic...,,,,0.0,,,,,
s8648,Movie,"Twisted Trunk, Big Fat Body",,"Vijay Maurya, Naman Jain, Usha Nadkarni, Mukes...",India,"January 15, 2017",2015,TV-14,,"Dramas, International Movies",After terrorists place a bomb inside a toy Lor...,,,,0.0,,,,,
s1352,Movie,"Our Lady of San Juan, Four Centuries of Miracles",Noé González,"Alejandro Peña Arenzana, Alejandra Yañez Reyno...",Mexico,"February 2, 2021",2020,TV-PG,123 min,"Dramas, Faith & Spirituality, International Mo...","In this dramatization, the Virgin Mary works a...",,,,0.0,,,,,
s3807,TV Show,Prince of Peoria,,"Gavin Lewis, Theodore Barnes, Shelby Simmons, ...",United States,"May 20, 2019",2019,TV-G,2 Seasons,"Kids' TV, TV Comedies",A prankster prince who wants to experience lif...,,,,0.0,,,,,
s4163,TV Show,A Taiwanese Tale of Two Cities,,"Tammy Chen, James Wen, Peggy Tseng, Denny Huang",Taiwan,"January 27, 2019",2018,TV-MA,,"International TV Shows, TV Dramas",A Taipei doctor and a San Francisco engineer s...,,,,0.0,,,,,


In [3]:
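# Fill missing durations with "0 min" so every row is a string we can split
df["duration_String"] = df["duration"].fillna("0 min").astype(str)
# Keep only the leading number ("90 min" -> "90", "2 Seasons" -> "2"); the values remain strings
df["duration_cleaned"] = [duracion.split(' ')[0] for duracion in df.duration_String]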

df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0
s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158
s8804,TV Show,Zombie Dumb,,,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0


In [4]:
# format='mixed' (pandas >= 2.0) infers the format of each value individually
df['Fecha_Insert'] = pd.to_datetime(df['date_added'], format='mixed')
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24
s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20
s8804,TV Show,Zombie Dumb,,,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11
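
The date conversion can also be done with an explicit format, which fails loudly (or yields `NaT`) when a value does not match instead of guessing. The sketch below is an alternative to `format='mixed'`, not the approach used in the cell above. The `raw_dates` values are hypothetical; in particular, the leading space in the second value is an assumption about the kind of noise such a column might contain.

```python
import pandas as pd

# Hypothetical values in the "Month day, year" style shown in date_added
raw_dates = pd.Series(["September 25, 2021", " August 4, 2017", None])

# Strip stray whitespace, then parse with one explicit format;
# errors="coerce" turns anything unparseable (or missing) into NaT
parsed = pd.to_datetime(raw_dates.str.strip(), format="%B %d, %Y", errors="coerce")

print(parsed)
```

Counting `parsed.isna()` before and after the conversion is a quick way to check how many rows failed to parse.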


In [5]:
df_fecha_insert = df.groupby("Fecha_Insert")['Title'].count().sort_values(ascending=False).reset_index()
df_fecha_insert

Unnamed: 0,Fecha_Insert,Title
0,2020-01-01,110
1,2019-11-01,91
2,2018-03-01,75
3,2019-12-31,74
4,2018-10-01,71
...,...,...
1709,2019-12-12,1
1710,2013-10-08,1
1711,2019-12-11,1
1712,2013-08-02,1


#### Exercise 2: Normalizing the `rating` column

The `rating` column contains different ratings such as `PG`, `PG-13`, `R`, among others. Group these ratings into three categories:

- **'General Audience'** for ratings such as `G`, `PG`.

- **'Teens'** for ratings such as `PG-13`, `TV-14`.

- **'Adults'** for ratings such as `R`, `TV-MA`.
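
The three groups above can be expressed as a lookup table and applied with `Series.map`, which avoids a chain of `if`/`elif` branches. This is a sketch of an alternative, not the solution used in this notebook. `RATING_GROUPS` and the sample `ratings` are hypothetical names, and only the six codes named in the exercise are mapped; sending everything else to "Unrated" is an assumption.

```python
import pandas as pd

# Only the six codes named in the exercise; extend this dict to cover more ratings
RATING_GROUPS = {
    "G": "General Audience",
    "PG": "General Audience",
    "PG-13": "Teens",
    "TV-14": "Teens",
    "R": "Adults",
    "TV-MA": "Adults",
}

# Hypothetical sample, including a stray duration and a missing value
ratings = pd.Series(["PG-13", "TV-MA", "G", "74 min", None])

# Codes absent from the dict map to NaN, which we then label "Unrated"
groups = ratings.map(RATING_GROUPS).fillna("Unrated")

print(groups.tolist())
```

With this layout, reclassifying a code such as `TV-PG` is a one-line change to the dictionary.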


In [6]:
def asigna_rating(rating):
    """Map a raw rating code to an audience group.

    Any code outside the six handled below (e.g. "TV-PG", "NR", or stray
    durations such as "74 min") is returned as "Unrated".
    """
    if rating in ("G", "PG"):
        return "General Audience"
    elif rating in ("PG-13", "TV-14"):
        return "Teens"
    elif rating in ("R", "TV-MA"):
        return "Adults"
    else:
        return "Unrated"


In [7]:
df["rating"].unique()

array(['PG-13', 'TV-MA', 'PG', 'TV-14', 'TV-PG', 'TV-Y', 'TV-Y7', 'R',
       'TV-G', 'G', 'NC-17', '74 min', '84 min', '66 min', 'NR', nan,
       'TV-Y7-FV', 'UR'], dtype=object)

In [8]:
df["rating_String"] = df["rating"].fillna("Unrated").astype(str)

In [9]:
df["Calificaciones"] = df["rating_String"].apply(asigna_rating)

In [10]:
df["rating"].unique()

array(['PG-13', 'TV-MA', 'PG', 'TV-14', 'TV-PG', 'TV-Y', 'TV-Y7', 'R',
       'TV-G', 'G', 'NC-17', '74 min', '84 min', '66 min', 'NR', nan,
       'TV-Y7-FV', 'UR'], dtype=object)

In [11]:
df["Calificaciones"].unique()
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults
s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults
s8804,TV Show,Zombie Dumb,,,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience


In [12]:
df_calificaciones = df.groupby("Calificaciones")['Title'].count().sort_values(ascending=False).reset_index()
df_calificaciones

Unnamed: 0,Calificaciones,Title
0,Adults,4006
1,Teens,2650
2,Unrated,1823
3,General Audience,328


#### Exercise 3: Creating a custom column based on the cast

We will identify whether a key actor such as `Leonardo DiCaprio`, `Tom Hanks`, or `Morgan Freeman` appears in the cast.

Use `apply` and a lambda function to create a new column called `has_famous_actor` that contains `True` if any of these actors is in the `cast` list and `False` otherwise.
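
For comparison with the `apply` plus lambda approach requested above, the same check can be written with the vectorized `Series.str.contains`. This is a sketch of an alternative, not what the exercise asks for. `FAMOUS_ACTORS` and the sample `cast` series are hypothetical, and treating a missing cast as `False` (via `na=False`) is an assumption.

```python
import re

import pandas as pd

FAMOUS_ACTORS = ["Leonardo DiCaprio", "Tom Hanks", "Morgan Freeman"]

# Build one alternation pattern; re.escape guards against regex metacharacters in names
pattern = "|".join(re.escape(name) for name in FAMOUS_ACTORS)

# Hypothetical cast strings, including a missing value
cast = pd.Series(["Leonardo DiCaprio, Tom Hanks", "Tim Allen, Courteney Cox", None])

# na=False makes rows with no cast information evaluate to False
has_famous_actor = cast.str.contains(pattern, na=False)

print(has_famous_actor.tolist())
```

Note that both this version and the lambda version match substrings, so a name embedded in a longer name would also count.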

In [13]:
def has_famous_actor(actor):
    """Return True if the cast string mentions any of the three key actors."""
    famosos = ("Leonardo DiCaprio", "Tom Hanks", "Morgan Freeman")
    return any(nombre in actor for nombre in famosos)


In [14]:
df["cast"].unique()

array([nan,
       'Ama Qamata, Khosi Ngema, Gail Mabalane, Thabang Molaba, Dillon Windvogel, Natasha Thahane, Arno Greeff, Xolile Tshabalala, Getmore Sithole, Cindy Mahlangu, Ryle De Morny, Greteli Fincham, Sello Maake Ka-Ncube, Odwa Gwanya, Mekaila Mathys, Sandi Schultz, Duane Williams, Shamilla Miller, Patrick Mofokeng',
       'Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabiha Akkari, Sofia Lesaffre, Salim Kechiouche, Noureddine Farihi, Geert Van Rampelberg, Bakary Diombera',
       ...,
       'Jesse Eisenberg, Woody Harrelson, Emma Stone, Abigail Breslin, Amber Heard, Bill Murray, Derek Graf',
       'Tim Allen, Courteney Cox, Chevy Chase, Kate Mara, Ryan Newman, Michael Cassidy, Spencer Breslin, Rip Torn, Kevin Zegers',
       'Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanana, Manish Chaudhary, Meghna Malik, Malkeet Rauni, Anita Shabdish, Chittaranjan Tripathy'],
      dtype=object)

In [15]:
df["cast"] = df["cast"].fillna("No actors").astype(str)

In [16]:
# The `in` checks already yield a boolean, so no comparison with True is needed
df["Actores_Famosos"] = df["cast"].apply(lambda actor: "Leonardo DiCaprio" in actor or "Tom Hanks" in actor or "Morgan Freeman" in actor)

In [17]:
# I had also done it following the earlier approach, without using a lambda
# df["Actores_Famosos"] = df["cast"].apply(has_famous_actor)
# df

In [18]:

df_actores_famosos = df[df["Actores_Famosos"]]  # the boolean column works directly as a mask
df_actores_famosos

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos
s130,Movie,An Unfinished Life,Lasse Hallström,"Robert Redford, Jennifer Lopez, Morgan Freeman...","Germany, United States","September 1, 2021",2005,PG-13,,Dramas,A grieving widow and her daughter move in with...,,,,0.0,,,,,,0 min,0,2021-09-01,PG-13,Teens,True
s330,Movie,Catch Me If You Can,Steven Spielberg,"Leonardo DiCaprio, Tom Hanks, Christopher Walk...","United States, Canada","August 1, 2021",2002,PG-13,142 min,Dramas,An FBI agent makes it his mission to put cunni...,,,,0.0,,,,,,142 min,142,2021-08-01,PG-13,Teens,True
s341,Movie,Inception,Christopher Nolan,"Leonardo DiCaprio, Joseph Gordon-Levitt, Ellio...","United States, United Kingdom","August 1, 2021",2010,PG-13,148 min,"Action & Adventure, Sci-Fi & Fantasy, Thrillers",A troubled thief who extracts secrets from peo...,,,,0.0,,,,,,148 min,148,2021-08-01,PG-13,Teens,True
s393,Movie,Django Unchained,Quentin Tarantino,"Jamie Foxx, Christoph Waltz, Leonardo DiCaprio...",United States,"July 24, 2021",2012,R,165 min,"Action & Adventure, Dramas","Accompanied by a German bounty hunter, a freed...",,,,0.0,,,,,,165 min,165,2021-07-24,R,Adults,True
s609,Movie,The Sum of All Fears,Phil Alden Robinson,"Ben Affleck, Morgan Freeman, Bridget Moynahan,...","United States, Germany, Canada","July 1, 2021",2002,PG-13,124 min,Action & Adventure,CIA agent Jack Ryan tries to discover why thre...,,,,0.0,,,,,,124 min,124,2021-07-01,PG-13,Teens,True
s800,Movie,Million Dollar Baby,Clint Eastwood,"Clint Eastwood, Hilary Swank, Morgan Freeman, ...",United States,"June 2, 2021",2004,PG-13,133 min,"Dramas, Sports Movies",When a cantankerous trainer mentors a persiste...,,,,0.0,,,,,,133 min,133,2021-06-02,PG-13,Teens,True
s1359,Movie,Shutter Island,Martin Scorsese,"Leonardo DiCaprio, Mark Ruffalo, Ben Kingsley,...",United States,"February 1, 2021",2010,R,139 min,Thrillers,A U.S. marshal's troubling visions compromise ...,,,,0.0,,,,,,139 min,139,2021-02-01,R,Adults,True
s1470,Movie,What's Eating Gilbert Grape,Lasse Hallström,"Johnny Depp, Leonardo DiCaprio, Juliette Lewis...",United States,"January 1, 2021",1993,PG-13,118 min,"Classic Movies, Dramas, Independent Movies","In a backwater Iowa town, young Gilbert is tor...",,,,0.0,,,,,,118 min,118,2021-01-01,PG-13,Teens,True
s1611,Movie,Angels & Demons,Ron Howard,"Tom Hanks, Ewan McGregor, Ayelet Zurer, Stella...","United States, Italy","December 1, 2020",2009,PG-13,139 min,Thrillers,A Harvard symbologist races to uncover clues t...,,,,0.0,,,,,,139 min,139,2020-12-01,PG-13,Teens,True
s1625,Movie,The Da Vinci Code,Ron Howard,"Tom Hanks, Audrey Tautou, Ian McKellen, Jean R...","United States, Malta, France, United Kingdom","December 1, 2020",2006,PG-13,149 min,Thrillers,"When the curator of the Louvre is killed, a Ha...",,,,0.0,,,,,,149 min,149,2020-12-01,PG-13,Teens,True


In [19]:
# Two ways to count the matching rows:
df_total_actores_famosos = df[df["Actores_Famosos"]].shape[0]  # filter, then count rows
df_total_actores_famosos_A = df["Actores_Famosos"].sum()  # True counts as 1

print(df_total_actores_famosos)
print(df_total_actores_famosos_A)

35
35


In [20]:
df_famosos = df.groupby("Actores_Famosos")['Title'].count().sort_values(ascending=False).reset_index()
df_famosos

Unnamed: 0,Actores_Famosos,Title
0,False,8772
1,True,35


#### Exercise 4: Creating a custom column using conditional logic

We will create a column called `is_recent` that identifies whether a title was released in the last 5 years.

Write a function that returns `True` if the title is recent (released in the last 5 years) and `False` otherwise.
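
The "last 5 years" rule depends on the year in which the notebook runs, so results change over time. The sketch below pins the reference year to make the outcome reproducible. `REFERENCE_YEAR = 2024` is an assumption inferred from the outputs in this notebook (a 2020 release shows as 4 years old), and the `titles` DataFrame is hypothetical.

```python
import pandas as pd

REFERENCE_YEAR = 2024  # assumption: fixed so the result does not drift with the run date

# Hypothetical release years
titles = pd.DataFrame({"release_year": [2021, 2020, 2019, 2007]})

# A title is "recent" when fewer than 5 years separate it from the reference year
titles["is_recent"] = (REFERENCE_YEAR - titles["release_year"]) < 5

print(titles)
```

Whether a title exactly 5 years old counts as recent is a definitional choice; this sketch uses a strict `< 5`.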

In [21]:
# Years elapsed since release; depends on the date the notebook is run
df["Años_desde_lanzamiento"] = pd.to_datetime("today").year - df["release_year"]
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3
s4,TV Show,Jailbirds New Orleans,,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17
s8804,TV Show,Zombie Dumb,,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18


In [22]:
def es_reciente(años_desde_lanzamiento):
    """Return True if the title was released fewer than 5 years ago."""
    return años_desde_lanzamiento < 5

In [23]:
df["is_recent"] = df["Años_desde_lanzamiento"].apply(es_reciente)
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True
s4,TV Show,Jailbirds New Orleans,,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False
s8804,TV Show,Zombie Dumb,,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False


In [24]:
df_recientes = df.groupby("is_recent")['Title'].count().sort_values(ascending=False).reset_index()
df_recientes

Unnamed: 0,is_recent,Title
0,False,7262
1,True,1545


#### Exercise 5: Classifying titles by decade

In this exercise, your goal is to group the release years of the movies and series into decades. The `release_year` column contains the release year, and you must create a new column called `decade` that indicates the corresponding decade, such as "1990s", "2000s", etc.
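
A decade label can be derived arithmetically: integer-divide the year by 10, multiply back by 10, and append "s". The sketch below shows this on a hypothetical `years` series; it assumes every year is a valid integer and does not handle missing values.

```python
import pandas as pd

# Hypothetical release years
years = pd.Series([1925, 1999, 2000, 2021])

# Floor each year to the start of its decade, then build the label: 1999 -> 1990 -> "1990s"
decade = (years // 10 * 10).astype(str) + "s"

print(decade.tolist())
```

This avoids listing each decade by hand and keeps working for years outside the range seen in the data.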


In [25]:
anos_ordenados = np.sort(df["release_year"].unique())
print(anos_ordenados)

[1925 1942 1943 1944 1945 1946 1947 1954 1955 1956 1958 1959 1960 1961
 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975
 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003
 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
 2018 2019 2020 2021]


In [26]:
def categoriza_decada(año):
    """Return the decade label (e.g. "1990s") for a release year.

    Years outside 1920-2029 are labelled "No decade".
    """
    if 1920 <= año < 2030:
        # Floor the year to the start of its decade: 1994 -> 1990 -> "1990s"
        return f"{int(año) // 10 * 10}s"
    return "No decade"

In [27]:
df["Decade"] = df["release_year"].apply(categoriza_decada)
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s
s4,TV Show,Jailbirds New Orleans,,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s
s8804,TV Show,Zombie Dumb,,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s


In [28]:
np.sort(df["Decade"].unique())

array(['1920s', '1940s', '1950s', '1960s', '1970s', '1980s', '1990s',
       '2000s', '2010s', '2020s'], dtype=object)
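
As an alternative to the chain of `elif` branches, the decade label can be computed arithmetically with integer division. The following is a minimal sketch on a small hypothetical DataFrame (`demo`), not the notebook's `df`; the helper name `decade_label` is an assumption made for illustration. Unlike `categoriza_decada`, it has no "No decade" fallback, so it labels any integer year.

```python
import pandas as pd


def decade_label(years: pd.Series) -> pd.Series:
    """Map integer release years to labels such as '1990s'."""
    # Integer division drops the last digit: 1999 // 10 * 10 == 1990.
    decade_start = (years // 10) * 10
    return decade_start.astype(str) + "s"


# Hypothetical sample data, used only to illustrate the idea.
demo = pd.DataFrame({"release_year": [1925, 1999, 2007, 2021]})
demo["Decade"] = decade_label(demo["release_year"])
print(demo)
```

This version is vectorized, so it avoids calling a Python function once per row.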

In [29]:
df_titulos_decadas = df.groupby("Decade")['Title'].count().sort_values(ascending=False).reset_index()
df_titulos_decadas

,Decade,Title
0,2010s,5927
1,2020s,1545
2,2000s,810
3,1990s,274
4,1980s,129
5,1970s,70
6,1960s,25
7,1940s,15
8,1950s,11
9,1920s,1


#### Exercise 6: Information extraction

To practice extracting information:

1. **Extract the first actor** from the list in the `cast` column and create a new column called `first_actor`.

2. **Extract the director's first name** and store it in a column called `first_name_director`.
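
Both steps can be sketched with pandas string methods. This is a minimal illustration on a hypothetical three-row DataFrame (`demo`), assuming the cast is comma-separated and the director's first name is the first whitespace-separated token; it is not necessarily the approach used in the cells of this notebook.

```python
import pandas as pd

# Hypothetical sample rows, used only for illustration.
demo = pd.DataFrame({
    "cast": ["Ama Qamata, Khosi Ngema", None, "Tim Allen"],
    "director": ["Kirsten Johnson", None, "Peter Hewitt"],
})

# First element of the comma-separated cast; missing casts stay missing.
demo["first_actor"] = demo["cast"].str.split(",").str[0].str.strip()

# First whitespace-separated token of the director string.
demo["first_name_director"] = demo["director"].str.split().str[0]

print(demo[["first_actor", "first_name_director"]])
```

Because the `.str` accessor propagates missing values, no explicit null check is needed before splitting.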


In [30]:
df["cast"].unique()

array(['No actors',
       'Ama Qamata, Khosi Ngema, Gail Mabalane, Thabang Molaba, Dillon Windvogel, Natasha Thahane, Arno Greeff, Xolile Tshabalala, Getmore Sithole, Cindy Mahlangu, Ryle De Morny, Greteli Fincham, Sello Maake Ka-Ncube, Odwa Gwanya, Mekaila Mathys, Sandi Schultz, Duane Williams, Shamilla Miller, Patrick Mofokeng',
       'Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabiha Akkari, Sofia Lesaffre, Salim Kechiouche, Noureddine Farihi, Geert Van Rampelberg, Bakary Diombera',
       ...,
       'Jesse Eisenberg, Woody Harrelson, Emma Stone, Abigail Breslin, Amber Heard, Bill Murray, Derek Graf',
       'Tim Allen, Courteney Cox, Chevy Chase, Kate Mara, Ryan Newman, Michael Cassidy, Spencer Breslin, Rip Torn, Kevin Zegers',
       'Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanana, Manish Chaudhary, Meghna Malik, Malkeet Rauni, Anita Shabdish, Chittaranjan Tripathy'],
      dtype=object)

In [31]:
def primer_actor(lista):
    """Return the first actor in a comma-separated cast string, or "No info" if there is no cast."""
    if lista == "No actors":
        return "No info"
    else:
        actores = lista.split(",")
        return actores[0]

In [32]:
df["cast"] = df["cast"].fillna("No actors").astype(str)

In [33]:
df["first_actor"] = df["cast"].apply(primer_actor)
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade,first_actor
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s,No info
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Ama Qamata
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,Sami Bouajila
s4,TV Show,Jailbirds New Orleans,,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,No info
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Mayur More
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s,Mark Ruffalo
s8804,TV Show,Zombie Dumb,,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s,No info
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s,Jesse Eisenberg
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s,Tim Allen


In [34]:
df["director"].dtype

dtype('O')

In [35]:
df["director"] = df["director"].fillna("No_info")

In [36]:
df["first_name_director"] = [nombre.split(' ')[0] for nombre in df.director]
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade,first_actor,first_name_director
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s,No info,Kirsten
s2,TV Show,Blood & Water,No_info,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Ama Qamata,No_info
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,Sami Bouajila,Julien
s4,TV Show,Jailbirds New Orleans,No_info,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,No info,No_info
s5,TV Show,Kota Factory,No_info,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Mayur More,No_info
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s,Mark Ruffalo,David
s8804,TV Show,Zombie Dumb,No_info,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s,No info,No_info
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s,Jesse Eisenberg,Ruben
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s,Tim Allen,Peter


#### Exercise 7: Cleaning the `cast` column

The `cast` column contains a list of actors separated by commas. Your goal is to complete the following tasks:

1. **Replace the null values** in the `cast` column with "sin información".

2. **Count the number of actors** in each entry and create a new column called `num_cast`.

3. **Normalize the names**: Make sure the actors' names are in a consistent format (for example, remove extra spaces).
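
The three tasks can be sketched together as follows. This is a minimal example on hypothetical data (`demo`); the helper names `normalize_cast` and `count_cast` and the constant `PLACEHOLDER` are assumptions for illustration. The notebook itself uses "No actors" as its placeholder, so adapt the constant accordingly.

```python
import pandas as pd

PLACEHOLDER = "sin información"  # placeholder text from the exercise statement

# Hypothetical sample rows with irregular spacing and a missing value.
demo = pd.DataFrame(
    {"cast": ["  Ama Qamata ,Khosi Ngema,  Gail Mabalane", None, "Tim Allen"]}
)

# Task 1: replace nulls with the placeholder.
demo["cast"] = demo["cast"].fillna(PLACEHOLDER)


def normalize_cast(cast: str) -> str:
    """Strip whitespace around every name and rejoin with ', '."""
    if cast == PLACEHOLDER:
        return cast
    names = [name.strip() for name in cast.split(",")]
    return ", ".join(name for name in names if name)


def count_cast(cast: str) -> int:
    """Count the comma-separated names; the placeholder counts as zero."""
    return 0 if cast == PLACEHOLDER else len(cast.split(","))


# Task 3 runs before task 2 so that counting sees the cleaned strings.
demo["cast"] = demo["cast"].apply(normalize_cast)
demo["num_cast"] = demo["cast"].apply(count_cast)
print(demo)
```

Note that stripping only the ends of the whole string would leave the spaces around individual names untouched, which is why the sketch splits first and strips each name.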


In [37]:
# Trim leading and trailing whitespace from each cast string
df["cast"] = df["cast"].str.strip()

In [38]:
def cuenta_actores(cast):
    """Return the number of actors in a comma-separated cast string (0 for "No actors")."""
    if cast == "No actors":
        return 0
    else:
        lista_actores = cast.split(",")
        return len(lista_actores)


In [39]:
actors = "Ama Qamata, Khosi Ngema, Gail Mabalane, Sami Bouajila, Tracy Gotoas, Samuel Jouy, Mayur More, Jitendra Kumar"
actores = actors.split(",")
len(actores)

8

In [40]:
df["num_cast"] = df["cast"].apply(cuenta_actores)
df


show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade,first_actor,first_name_director,num_cast
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s,No info,Kirsten,0
s2,TV Show,Blood & Water,No_info,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Ama Qamata,No_info,19
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,Sami Bouajila,Julien,9
s4,TV Show,Jailbirds New Orleans,No_info,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,No info,No_info,0
s5,TV Show,Kota Factory,No_info,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Mayur More,No_info,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s,Mark Ruffalo,David,10
s8804,TV Show,Zombie Dumb,No_info,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s,No info,No_info,0
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s,Jesse Eisenberg,Ruben,7
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s,Tim Allen,Peter,9


In [41]:
# Confirm that the movie Zodiac has 10 actors:
cast_zodiac = df.loc[df["Title"] == "Zodiac", "cast"].values[0]

print(cast_zodiac)
print(len(cast_zodiac.split(",")))


Mark Ruffalo, Jake Gyllenhaal, Robert Downey Jr., Anthony Edwards, Brian Cox, Elias Koteas, Donal Logue, John Carroll Lynch, Dermot Mulroney, Chloë Sevigny
10



#### Exercise 9: Identifying recurring directors

In this exercise, you must identify the directors who appear more than once in the dataset. Follow these steps:

1. **Replace the null values** in the `director` column with "sin información".

2. **Count how many times each director appears** in the column created in Exercise 6.

3. **Filter the directors who appear more than once** and create a new column called `recurrent_director` that indicates "Yes" if the director appears several times or "No" otherwise.
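
One way to implement these steps is to count every director with `value_counts` and map the counts back onto the rows. The following is a minimal sketch on hypothetical data (`demo`, with made-up director labels), not the notebook's own solution.

```python
import pandas as pd

# Hypothetical sample rows; the director labels are invented.
demo = pd.DataFrame(
    {"director": ["Director A", None, "Director B", "Director A", None, "Director C"]}
)

# Step 1: replace nulls with the placeholder from the exercise statement.
demo["director"] = demo["director"].fillna("sin información")

# Step 2: number of rows per director.
counts = demo["director"].value_counts()

# Step 3: look up each row's count and flag counts above one.
is_recurrent = demo["director"].map(counts).gt(1)
demo["recurrent_director"] = is_recurrent.map({True: "Yes", False: "No"})
print(demo)
```

In this sketch the placeholder is flagged "Yes" as well, because it is counted like any other value. If that is not wanted, the placeholder rows can be set to "No" afterwards.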

In [42]:
# The nulls were already replaced with "No_info" in Exercise 6, so this fillna finds nothing to fill and "Sin informacion" is not inserted
df["director"] = df["director"].fillna("Sin informacion").astype(str)
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade,first_actor,first_name_director,num_cast
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s,No info,Kirsten,0
s2,TV Show,Blood & Water,No_info,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Ama Qamata,No_info,19
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,Sami Bouajila,Julien,9
s4,TV Show,Jailbirds New Orleans,No_info,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,No info,No_info,0
s5,TV Show,Kota Factory,No_info,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Mayur More,No_info,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s,Mark Ruffalo,David,10
s8804,TV Show,Zombie Dumb,No_info,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s,No info,No_info,0
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s,Jesse Eisenberg,Ruben,7
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s,Tim Allen,Peter,9


In [43]:
# Lambda pattern: lambda arguments: value_if_true if condition else value_if_false

# Use a lambda expression to swap the earlier "No_info" placeholder for "Sin informacion"
df["director"] = df["director"].apply(lambda director: "Sin informacion" if director == "No_info" else director)
df

show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade,first_actor,first_name_director,num_cast
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s,No info,Kirsten,0
s2,TV Show,Blood & Water,Sin informacion,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Ama Qamata,No_info,19
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,Sami Bouajila,Julien,9
s4,TV Show,Jailbirds New Orleans,Sin informacion,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,No info,No_info,0
s5,TV Show,Kota Factory,Sin informacion,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Mayur More,No_info,8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s,Mark Ruffalo,David,10
s8804,TV Show,Zombie Dumb,Sin informacion,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s,No info,No_info,0
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s,Jesse Eisenberg,Ruben,7
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s,Tim Allen,Peter,9


In [44]:
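def filtra_director(director, lista):
    """Return "Yes" if `director` is already in `lista`; otherwise record it and return "No".

    Note: the first occurrence of every director is therefore labelled "No",
    even when the same director appears again later in the column.
    """
    if director not in lista:
        lista.append(director)
        return "No"
    else:
        return "Yes"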


In [45]:
lista_directores = []

df["recurrent_directors"] = df["director"].apply(lambda director: filtra_director(director, lista_directores))
df


show_id,type,Title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,Originales,Mes,Ano,Dia_Semana,duration_String,duration_cleaned,Fecha_Insert,rating_String,Calificaciones,Actores_Famosos,Años_desde_lanzamiento,is_recent,Decade,first_actor,first_name_director,num_cast,recurrent_directors
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,No actors,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",Documentary,2020-10-02,90.0,7.5,English,True,October,2020.0,Friday,90 min,90,2021-09-25,PG-13,Teens,False,4,True,2020s,No info,Kirsten,0,No
s2,TV Show,Blood & Water,Sin informacion,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t...",,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Ama Qamata,No_info,19,No
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...,,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,Sami Bouajila,Julien,9,No
s4,TV Show,Jailbirds New Orleans,Sin informacion,No actors,,"September 24, 2021",2021,TV-MA,,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo...",,,,0.0,,,,,,0 min,0,2021-09-24,TV-MA,Adults,False,3,True,2020s,No info,No_info,0,Yes
s5,TV Show,Kota Factory,Sin informacion,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...,,,,0.0,,,,,,2 Seasons,2,2021-09-24,TV-MA,Adults,False,3,True,2020s,Mayur More,No_info,8,Yes
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",,,,0.0,,,,,,158 min,158,2019-11-20,R,Adults,False,17,False,2000s,Mark Ruffalo,David,10,Yes
s8804,TV Show,Zombie Dumb,Sin informacion,No actors,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g...",,,,0.0,,,,,,2 Seasons,2,2019-07-01,TV-Y7,Unrated,False,6,False,2010s,No info,No_info,0,Yes
s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,,,,0.0,,,,,,0 min,0,2019-11-01,R,Adults,False,15,False,2000s,Jesse Eisenberg,Ruben,7,Yes
s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",,,,0.0,,,,,,0 min,0,2020-01-11,PG,General Audience,False,18,False,2000s,Tim Allen,Peter,9,No


In [46]:
df_directores_recurr = df.groupby(["director", "recurrent_directors"])['Title'].count().sort_values(ascending=False).reset_index()
df_directores_recurr


,director,recurrent_directors,Title
0,Sin informacion,Yes,2633
1,Rajiv Chilaka,Yes,18
2,"Raúl Campos, Jan Suter",Yes,17
3,Marcus Raboy,Yes,15
4,Suhas Kadav,Yes,15
...,...,...,...
5392,Óskar Thór Axelsson,No,1
5393,Ömer Faruk Sorak,No,1
5394,Ömer Faruk Sorak,Yes,1
5395,Şenol Sönmez,No,1
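
The table above lists some directors twice, once with "No" and once with "Yes". This happens because `filtra_director` returns "No" the first time it sees a name and "Yes" only on later occurrences, so the flag means "seen before" rather than "appears more than once". The difference can be illustrated with `Series.duplicated`, shown here as a sketch on a small hypothetical Series with invented labels.

```python
import pandas as pd

# Hypothetical director labels: "A" appears three times, "B" and "C" once.
directors = pd.Series(["A", "B", "A", "C", "A"])

# True only from the second occurrence onwards ("seen before").
seen_before = directors.duplicated()

# True for every occurrence of a value that appears more than once.
appears_more_than_once = directors.duplicated(keep=False)

print(pd.DataFrame({
    "director": directors,
    "seen_before": seen_before,
    "appears_more_than_once": appears_more_than_once,
}))
```

The second flag matches the wording of the exercise, since it does not depend on the order of the rows.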


In [47]:
dir_counts = df["director"].value_counts()
dir_counts

director
Sin informacion                           2634
Rajiv Chilaka                               19
Raúl Campos, Jan Suter                      18
Suhas Kadav                                 16
Marcus Raboy                                16
                                          ... 
Milla Harrison-Hansley, Alicky Sussman       1
Drew Stone                                   1
Benjamin Turner                              1
S. Shankar                                   1
Peter Hewitt                                 1
Name: count, Length: 4529, dtype: int64

In [49]:
df_directores = df.groupby("director")['Title'].count().sort_values(ascending=False).reset_index()
df_directores

,director,Title
0,Sin informacion,2634
1,Rajiv Chilaka,19
2,"Raúl Campos, Jan Suter",18
3,Marcus Raboy,16
4,Suhas Kadav,16
...,...,...
4524,Álvaro Delgado-Aparicio L.,1
4525,Álvaro Brechner,1
4526,Zuko Nodada,1
4527,Zsolt Pálfi,1
