<div style="text-align: center;">
  <img src="https://github.com/Hack-io-Data/Imagenes/blob/main/01-LogosHackio/logo_naranja@4x.png?raw=true" alt="esquema" />
</div>

# Data Cleaning Lab

In this lab we will use the complete Netflix DataFrame created in the first Pandas labs.

**Instructions:**

1. Read the statement of each exercise carefully.

2. Implement the solution in the code cell provided.

3. Document every function you create during the exercise.

4. After each plot, include its interpretation in a markdown cell.

In [1]:
import pandas as pd
import numpy as np

from pandas import ExcelWriter
import itertools

pd.set_option('display.max_columns', None)

df_union_final = pd.read_csv("datos/df_union_final.csv", index_col = 0)

df_union_final

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language
0,s2037,Movie,#Alive,Cho Il,"Yoo Ah-in, Park Shin-hye",South Korea,"September 8, 2020",2020,TV-MA,,"Horror Movies, International Movies, Thrillers","As a grisly virus rampages a city, a lone man ...",,,,,
1,s2305,Movie,#AnneFrank - Parallel Stories,"Sabina Fedeli, Anna Migotto","Helen Mirren, Gengher Gatti",Italy,"July 1, 2020",2019,TV-14,,"Documentaries, International Movies","Through her diary, Anne Frank's story is retol...",,,,,
2,s2482,Movie,#FriendButMarried,Rako Prijanto,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,...",Indonesia,"May 21, 2020",2018,TV-G,,"Dramas, International Movies, Romantic Movies","Pining for his high school crush for years, a ...",,,,,
3,s2325,Movie,#FriendButMarried 2,Rako Prijanto,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von...",Indonesia,"June 28, 2020",2020,TV-G,104 min,"Dramas, International Movies, Romantic Movies",As Ayu and Ditto finally transition from best ...,,,,,
4,s5974,Movie,#Roxy,Michael Kennedy,"Jake Short, Sarah Fisher, Booboo Stewart, Dann...",Canada,"April 10, 2019",2018,TV-14,,"Comedies, Romantic Movies",A teenage hacker with a huge nose helps a cool...,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s6178,TV Show,忍者ハットリくん,,,Japan,"December 23, 2018",2012,TV-Y7,2 Seasons,"Anime Series, Kids' TV","Hailing from the mountains of Iga, Kanzo Hatto...",,,,,
8803,s4915,TV Show,海的儿子,,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",,"April 27, 2018",2016,TV-14,,"International TV Shows, TV Dramas","Two brothers start a new life in Singapore, wh...",,,,,
8804,s7102,TV Show,마녀사냥,,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-...",South Korea,"February 19, 2018",2015,TV-MA,,"International TV Shows, Korean TV Shows, Stand...",Four Korean celebrity men and guest stars of b...,,,,,
8805,s5023,Movie,반드시 잡는다,Hong-seon Kim,Baek Yoon-sik,South Korea,"February 28, 2018",2017,TV-MA,110 min,"Dramas, International Movies, Thrillers",After people in his town start turning up dead...,,,,,


## Part 1: Data Cleaning and Preparation

#### Exercise 1: Standardizing and cleaning columns

In this exercise, you will clean and standardize some key columns so that they are more manageable and consistent for your analyses. Specifically, you will work with the `date_added` and `duration` columns to convert them to a uniform, structured format.

Instructions:

1. **Convert the `date_added` column**: The `date_added` column contains dates as text. Convert it to a `datetime` format that pandas can understand and handle easily.

2. **Clean the `duration` column**: The `duration` column has values in different formats such as "1 Season", "2 Seasons", "90 min", etc. Your task is to extract the number (either the number of seasons or the number of minutes) and create a new column called `duration_cleaned` with those standardized values.


**Expected result:**
You should get something like this:

| duration   | duration_cleaned |
|------------|-----------------|
| 1 Season   | 1               |
| 90 min     | 90              |
| 2 Seasons  | 2               |
| 45 min     | 45              |
| 3 Seasons  | 3               |

In [10]:
df_union_final["fecha"] = pd.to_datetime(df_union_final["date_added"], errors='coerce', format= '%B %d, %Y')
df_union_final

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,fecha,duration_clean,clasificacion
0,s2037,Movie,#Alive,Cho Il,"Yoo Ah-in, Park Shin-hye",South Korea,"September 8, 2020",2020,TV-MA,,"Horror Movies, International Movies, Thrillers","As a grisly virus rampages a city, a lone man ...",,,,,,2020-09-08,,Adults
1,s2305,Movie,#AnneFrank - Parallel Stories,"Sabina Fedeli, Anna Migotto","Helen Mirren, Gengher Gatti",Italy,"July 1, 2020",2019,TV-14,,"Documentaries, International Movies","Through her diary, Anne Frank's story is retol...",,,,,,2020-07-01,,Teens
2,s2482,Movie,#FriendButMarried,Rako Prijanto,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,...",Indonesia,"May 21, 2020",2018,TV-G,,"Dramas, International Movies, Romantic Movies","Pining for his high school crush for years, a ...",,,,,,2020-05-21,,
3,s2325,Movie,#FriendButMarried 2,Rako Prijanto,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von...",Indonesia,"June 28, 2020",2020,TV-G,104 min,"Dramas, International Movies, Romantic Movies",As Ayu and Ditto finally transition from best ...,,,,,,2020-06-28,104.0,
4,s5974,Movie,#Roxy,Michael Kennedy,"Jake Short, Sarah Fisher, Booboo Stewart, Dann...",Canada,"April 10, 2019",2018,TV-14,,"Comedies, Romantic Movies",A teenage hacker with a huge nose helps a cool...,,,,,,2019-04-10,,Teens
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s6178,TV Show,忍者ハットリくん,,,Japan,"December 23, 2018",2012,TV-Y7,2 Seasons,"Anime Series, Kids' TV","Hailing from the mountains of Iga, Kanzo Hatto...",,,,,,NaT,2.0,
8803,s4915,TV Show,海的儿子,,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",,"April 27, 2018",2016,TV-14,,"International TV Shows, TV Dramas","Two brothers start a new life in Singapore, wh...",,,,,,2018-04-27,,Teens
8804,s7102,TV Show,마녀사냥,,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-...",South Korea,"February 19, 2018",2015,TV-MA,,"International TV Shows, Korean TV Shows, Stand...",Four Korean celebrity men and guest stars of b...,,,,,,2018-02-19,,Adults
8805,s5023,Movie,반드시 잡는다,Hong-seon Kim,Baek Yoon-sik,South Korea,"February 28, 2018",2017,TV-MA,110 min,"Dramas, International Movies, Thrillers",After people in his town start turning up dead...,,,,,,2018-02-28,110.0,Adults
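
In the output above, row 8802 shows `NaT` even though its `date_added` value looks like a valid date. One possible cause, which is an assumption rather than something the output confirms, is stray whitespace around the text, so the string no longer matches `'%B %d, %Y'` exactly. A minimal sketch on a hypothetical sample (the leading space is invented for illustration):

```python
import pandas as pd

# Hypothetical sample; the leading space in the last value is an assumption
# about why a valid-looking date might fail to parse.
sample = pd.Series(["September 8, 2020", "July 1, 2020", " December 23, 2018"])

# Parse as in the cell above, without cleaning the text first.
strict = pd.to_datetime(sample, format="%B %d, %Y", errors="coerce")

# Strip surrounding whitespace before parsing.
stripped = pd.to_datetime(sample.str.strip(), format="%B %d, %Y", errors="coerce")

print("NaT without strip:", strict.isna().sum())
print("NaT with strip:", stripped.isna().sum())
```

Comparing the two counts on the real column is a quick way to check whether whitespace explains the unparsed rows.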


In [12]:
# Raw string avoids an invalid-escape warning for \d; extract the first run of digits
df_union_final["duration_clean"] = df_union_final["duration"].str.extract(r'(\d+)').astype("Int64")
df_union_final[["duration" , "duration_clean"]]

Unnamed: 0,duration,duration_clean
0,,
1,,
2,,
3,104 min,104
4,,
...,...,...
8802,2 Seasons,2
8803,,
8804,,
8805,110 min,110
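
Because a single numeric column mixes seasons and minutes, it can help to keep the unit alongside the number. The sketch below is one possible approach on a hypothetical sample. The `duration_unit` column is an illustrative addition that the exercise does not ask for.

```python
import pandas as pd

# Hypothetical sample with the formats described in the exercise.
sample = pd.DataFrame({"duration": ["1 Season", "90 min", "2 Seasons", None]})

# Named groups capture the number and the word that follows it.
parts = sample["duration"].str.extract(r"(?P<value>\d+)\s*(?P<unit>[A-Za-z]+)")

# Nullable integers keep missing durations as <NA> instead of forcing floats.
sample["duration_cleaned"] = pd.to_numeric(parts["value"]).astype("Int64")

# Illustrative extra column: "Season"/"Seasons" -> "season", "min" -> "min".
sample["duration_unit"] = parts["unit"].str.rstrip("s").str.lower()

print(sample)
```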


#### Exercise 2: Normalizing the `rating` column

The `rating` column contains different ratings such as `PG`, `PG-13`, `R`, among others. Categorize these ratings into three groups:

- **'General Audience'** for ratings such as `G`, `PG`.

- **'Teens'** for ratings such as `PG-13`, `TV-14`.

- **'Adults'** for ratings such as `R`, `TV-MA`.


In [4]:
mapa_rating = {"G":"General Audience", "PG":"General Audience" , "PG-13": "Teens", "TV-14": "Teens", "R": "Adults", "TV-MA" : "Adults"}
df_union_final["clasificacion"] = df_union_final["rating"].map(mapa_rating)
df_union_final[["rating", "clasificacion"]]


Unnamed: 0,rating,clasificacion
0,TV-MA,Adults
1,TV-14,Teens
2,TV-G,
3,TV-G,
4,TV-14,Teens
...,...,...
8802,TV-Y7,
8803,TV-14,Teens
8804,TV-MA,Adults
8805,TV-MA,Adults
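
The output above leaves ratings such as `TV-G` and `TV-Y7` without a group, because `map` returns a missing value for any key that is not in the dictionary. A possible extension is sketched below. Which group each extra rating belongs to is an assumption made here for illustration, not something the exercise defines.

```python
import pandas as pd

# Assumed grouping: the placement of TV-G, TV-Y, TV-Y7, TV-PG and NC-17
# is a judgment call, not part of the exercise statement.
rating_groups = {
    "General Audience": ["G", "PG", "TV-G", "TV-Y", "TV-Y7", "TV-PG"],
    "Teens": ["PG-13", "TV-14"],
    "Adults": ["R", "NC-17", "TV-MA"],
}

# Invert the grouping into a rating -> group lookup for Series.map.
rating_map = {
    rating: group
    for group, ratings in rating_groups.items()
    for rating in ratings
}

sample = pd.DataFrame({"rating": ["TV-MA", "TV-G", "PG-13", "NR", None]})
sample["clasificacion"] = sample["rating"].map(rating_map)

# Ratings missing from the lookup (here "NR") stay as NaN, so they are easy to audit.
print(sample)
```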


#### Exercise 3: Creating a custom column based on the cast

We will identify whether a key actor such as `Leonardo DiCaprio`, `Tom Hanks`, or `Morgan Freeman` appears in the cast.

Use `apply` and a lambda function to create a new column called `has_famous_actor` that contains `True` if any of these actors is in the `cast` list and `False` otherwise.

TypeError: argument of type 'float' is not iterable
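
This `TypeError` typically appears when the `in` operator is applied to a missing value: pandas stores missing `cast` entries as `NaN`, which is a float and cannot be searched like a string. A minimal sketch, on a hypothetical sample, that guards against non-string values inside the lambda:

```python
import pandas as pd

famous_actors = ["Leonardo DiCaprio", "Tom Hanks", "Morgan Freeman"]

# Hypothetical sample; the last row mimics a missing cast.
sample = pd.DataFrame(
    {"cast": ["Tom Hanks, Robin Wright", "Jake Short, Sarah Fisher", None]}
)

# isinstance(...) short-circuits for missing values, so `in` only runs on strings.
sample["has_famous_actor"] = sample["cast"].apply(
    lambda cast: isinstance(cast, str)
    and any(actor in cast for actor in famous_actors)
)

print(sample)
```

Note that this is a substring check, so it would also match a longer name that happens to contain one of the three names.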

#### Exercise 4: Creating a custom column using conditional logic

We will create a column called `is_recent` that identifies whether a title was released in the last 5 years.

Write a function that marks a title with `True` if it is recent (released in the last 5 years) and `False` if it is not.

In [15]:
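# Note: this overwrites the integer `release_year` with datetime values (January 1 of each year)
df_union_final["release_year"] = pd.to_datetime(df_union_final["release_year"], format='%Y')
# A title counts as recent if that date falls within the five years before today
df_union_final["is_recent"] = df_union_final["release_year"] >= pd.to_datetime('today') - pd.DateOffset(years=5)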
df_union_final.sample(10)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,fecha,duration_clean,clasificacion,is_recent
5146,s2132,Movie,Octonauts & the Caves of Sac Actun,Blair Simmons,"Teresa Gallagher, Simon Greenall, Keith Wickha...",United Kingdom,"August 14, 2020",2020-01-01,TV-Y,,Children & Family Movies,The Octonauts embark on an underwater adventur...,Animation,"August 14, 2020",72.0,6.2,English,2020-08-14,,,True
6423,s2714,Movie,Sol Levante,Akira Saitoh,,Japan,"April 2, 2020",2020-01-01,TV-14,5 min,"Action & Adventure, Anime Features, Internatio...",A young warrior and her familiar search for th...,Anime / Short,"April 2, 2020",4.0,4.7,English,2020-04-02,5.0,Teens,True
3138,s187,TV Show,Hometown Cha-Cha-Cha,,"Shin Min-a, Kim Seon-ho, Lee Sang-yi, Gong Min...",,"August 29, 2021",2021-01-01,TV-14,,"International TV Shows, Romantic TV Shows, TV ...",A big-city dentist opens up a practice in a cl...,,,,,,2021-08-29,,Teens,True
6869,s8199,Movie,The Autopsy of Jane Doe,André Øvredal,"Emile Hirsch, Brian Cox, Ophelia Lovibond, Mic...","United Kingdom, United States","December 30, 2018",2016-01-01,R,,"Horror Movies, Independent Movies, Thrillers",A father-son team of small-town coroners perfo...,,,,,,2018-12-30,,Adults,False
7613,s2654,Movie,The Plagues of Breslau,Patryk Vega,"Małgorzata Kożuchowska, Daria Widawska, Katarz...",Poland,"April 22, 2020",2018-01-01,TV-MA,,"International Movies, Thrillers","After a body is found sewn inside a cow hide, ...",,,,,,2020-04-22,,Adults,False
772,s6219,Movie,Badland,Justin Lee,"Kevin Makely, Bruce Dern, Mira Sorvino, Trace ...",United States,"March 26, 2020",2019-01-01,TV-14,117 min,Dramas,A detective with a license to kill roams the O...,,,,,,2020-03-26,117.0,Teens,False
1678,s6533,TV Show,Cooking on High,,"Josh Leyva, Ngaio Bealum",United States,"June 22, 2018",2018-01-01,TV-MA,,Reality TV,In the first-ever competitive cannabis cooking...,,,,,,2018-06-22,,Adults,False
2036,s6624,Movie,Domino,Brian De Palma,"Nikolaj Coster-Waldau, Carice van Houten, Eriq...","Denmark, France, Belgium, Italy, Netherlands, ...","September 28, 2019",2019-01-01,R,,"International Movies, Thrillers",A Copenhagen police officer hunts for the man ...,,,,,,2019-09-28,,Adults,False
5858,s5705,Movie,Richard Pryor: Live in Concert,Jeff Margolis,Richard Pryor,United States,"December 1, 2016",1979-01-01,TV-MA,79 min,Stand-Up Comedy,Richard Pryor's classic 1979 concert film has ...,,,,,,2016-12-01,79.0,Adults,False
5902,s495,Movie,Rock the Kasbah,Barry Levinson,"Bill Murray, Kate Hudson, Zooey Deschanel, Dan...",United States,"July 8, 2021",2015-01-01,R,,"Comedies, Music & Musicals",When a has-been music producer gets stuck in A...,,,,,,2021-07-08,,Adults,False
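
The exercise asks for a function, and the cell above replaces the integer `release_year` with datetimes, which affects later integer arithmetic on that column. A possible alternative is sketched below, under the assumption that `release_year` stays an integer. The `reference_year` of 2024 and whether the boundary year counts as recent are both illustrative choices.

```python
import pandas as pd


def is_recent(release_year, reference_year, window=5):
    """Return True if release_year is within `window` years before reference_year."""
    return release_year >= reference_year - window


# Hypothetical sample; release_year is left as an integer column.
sample = pd.DataFrame({"release_year": [1979, 2016, 2019, 2020]})

# A fixed reference year keeps the result reproducible;
# pd.Timestamp.today().year would track the current date instead.
sample["is_recent"] = sample["release_year"].apply(is_recent, reference_year=2024)

print(sample)
```

With this definition the boundary year (2019 here) is marked as recent, whereas the date-based comparison in the cell above marks it as not recent once the current date is later than January 1.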


#### Exercise 5: Classifying movies by decade

In this exercise, your goal is to categorize the release years of the movies or series into decades. The `release_year` column contains the release year, and you must create a new column called `decade` that indicates the corresponding decade, such as "1990s", "2000s", etc.


In [None]:
# Round each year down to the start of its decade, then append "s" (assumes integer years)
df_union_final["decade"] = (df_union_final["release_year"] // 10 * 10).astype(str) + "s"
df_union_final.sample(5)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,fecha,duration_clean,clasificacion,decade
416,s2542,Movie,Ali & Alia,Hussein El Ansary,"Khalifa Albhri, Neven Madi, Talal Mahmood, Saw...",United Arab Emirates,"May 12, 2020",2019,TV-14,,"Dramas, International Movies, Romantic Movies",Drugs and addiction endanger the love — and li...,,,,,,2020-05-12,,Teens,2010s
6553,s5635,Movie,Stereo,Maximilian Erlenwein,"Jürgen Vogel, Moritz Bleibtreu, Petra Schmidt-...",Germany,"January 15, 2017",2014,TV-MA,,"International Movies, Thrillers",Erik's peaceful rural family life is shaken by...,,,,,,2017-01-15,,Adults,2010s
6259,s4077,Movie,She's Dating the Gangster,Cathy Garcia-Molina,"Kathryn Bernardo, Daniel Padilla, Richard Gome...",Philippines,"February 27, 2019",2014,TV-14,113 min,"Dramas, International Movies, Romantic Movies","To make another woman jealous, a campus heartt...",,,,,,2019-02-27,113.0,Teens,2010s
4902,s1871,Movie,My Step Dad: The Hippie,Meltem Bozoflu,"Onur Buldu, Mahir İpek, Derya Karadaş, Onur At...","Turkey, South Korea","October 9, 2020",2018,TV-MA,,"Comedies, International Movies",When three adult siblings meet the offbeat man...,,,,,,2020-10-09,,Adults,2010s
46,s5961,Movie,187,Kevin Reynolds,"Samuel L. Jackson, John Heard, Kelly Rowan, Cl...",United States,"November 1, 2019",1997,R,119 min,Dramas,After one of his high school students attacks ...,,,,,,2019-11-01,119.0,Adults,1990s
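
The integer division above assumes `release_year` holds integers. If the column has been converted to datetime, as the Exercise 4 cell does, the year has to be extracted first. A minimal sketch on a hypothetical sample:

```python
import pandas as pd

# Hypothetical sample where release_year is already a datetime column.
sample = pd.DataFrame(
    {"release_year": pd.to_datetime(["1997", "2014", "2020"], format="%Y")}
)

# Pull the integer year out of the datetime before doing the decade arithmetic.
years = sample["release_year"].dt.year
sample["decade"] = (years // 10 * 10).astype(str) + "s"

print(sample)
```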


#### Exercise 6: Extracting information

To practice extracting information:

1. **Extract the first actor** from the list in the `cast` column and create a new column called `first_actor`.

2. **Extract the director's first name** and store it in a column called `first_name_director`.


In [25]:
df_union_final["first_actor"] = df_union_final["cast"].str.split(",").str[0]
df_union_final[["cast", "first_actor"]]


Unnamed: 0,cast,first_actor
0,"Yoo Ah-in, Park Shin-hye",Yoo Ah-in
1,"Helen Mirren, Gengher Gatti",Helen Mirren
2,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,...",Adipati Dolken
3,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von...",Adipati Dolken
4,"Jake Short, Sarah Fisher, Booboo Stewart, Dann...",Jake Short
...,...,...
8802,,
8803,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",Li Nanxing
8804,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-...",Si-kyung Sung
8805,Baek Yoon-sik,Baek Yoon-sik


In [26]:
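# Keeps the first listed director (full name) in `first_director`; it does not isolate the first name
df_union_final["first_director"] = df_union_final["director"].str.split(",").str[0]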
df_union_final[["director", "first_director"]]

Unnamed: 0,director,first_director
0,Cho Il,Cho Il
1,"Sabina Fedeli, Anna Migotto",Sabina Fedeli
2,Rako Prijanto,Rako Prijanto
3,Rako Prijanto,Rako Prijanto
4,Michael Kennedy,Michael Kennedy
...,...,...
8802,,
8803,,
8804,,
8805,Hong-seon Kim,Hong-seon Kim
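
The cell above keeps the full name of the first listed director, while the exercise asks for the first name in a column called `first_name_director`. One way to get there is sketched below on a hypothetical sample. It assumes the given name is the first whitespace-separated token, which may not hold for every naming convention.

```python
import pandas as pd

# Hypothetical sample; the last row mimics a missing director.
sample = pd.DataFrame(
    {"director": ["Cho Il", "Sabina Fedeli, Anna Migotto", None]}
)

# First listed director, with surrounding spaces removed.
first_director = sample["director"].str.split(",").str[0].str.strip()

# Assumption: the first token of the name is the given name.
sample["first_name_director"] = first_director.str.split().str[0]

print(sample)
```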


#### Exercise 7: Cleaning the `cast` column

The `cast` column contains a list of actors separated by commas. Your goal is to carry out the following tasks:

1. **Replace the null values** in the `cast` column with "sin información" ("no information").

2. **Count the number of actors** in each entry and create a new column called `num_cast`.

3. **Normalize the names**: Make sure the actors' names are in a consistent format (for example, remove extra spaces).


In [None]:
df_union_final["cast"] = df_union_final["cast"].fillna("sin_informacion")
df_union_final

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,fecha,duration_clean,clasificacion,decade,first_actor
0,s2037,Movie,#Alive,Cho Il,"Yoo Ah-in, Park Shin-hye",South Korea,"September 8, 2020",2020,TV-MA,,"Horror Movies, International Movies, Thrillers","As a grisly virus rampages a city, a lone man ...",,,,,,2020-09-08,,Adults,2020s,Y
1,s2305,Movie,#AnneFrank - Parallel Stories,"Sabina Fedeli, Anna Migotto","Helen Mirren, Gengher Gatti",Italy,"July 1, 2020",2019,TV-14,,"Documentaries, International Movies","Through her diary, Anne Frank's story is retol...",,,,,,2020-07-01,,Teens,2010s,H
2,s2482,Movie,#FriendButMarried,Rako Prijanto,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,...",Indonesia,"May 21, 2020",2018,TV-G,,"Dramas, International Movies, Romantic Movies","Pining for his high school crush for years, a ...",,,,,,2020-05-21,,,2010s,A
3,s2325,Movie,#FriendButMarried 2,Rako Prijanto,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von...",Indonesia,"June 28, 2020",2020,TV-G,104 min,"Dramas, International Movies, Romantic Movies",As Ayu and Ditto finally transition from best ...,,,,,,2020-06-28,104.0,,2020s,A
4,s5974,Movie,#Roxy,Michael Kennedy,"Jake Short, Sarah Fisher, Booboo Stewart, Dann...",Canada,"April 10, 2019",2018,TV-14,,"Comedies, Romantic Movies",A teenage hacker with a huge nose helps a cool...,,,,,,2019-04-10,,Teens,2010s,J
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s6178,TV Show,忍者ハットリくん,,sin_informacion,Japan,"December 23, 2018",2012,TV-Y7,2 Seasons,"Anime Series, Kids' TV","Hailing from the mountains of Iga, Kanzo Hatto...",,,,,,NaT,2.0,,2010s,
8803,s4915,TV Show,海的儿子,,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",,"April 27, 2018",2016,TV-14,,"International TV Shows, TV Dramas","Two brothers start a new life in Singapore, wh...",,,,,,2018-04-27,,Teens,2010s,L
8804,s7102,TV Show,마녀사냥,,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-...",South Korea,"February 19, 2018",2015,TV-MA,,"International TV Shows, Korean TV Shows, Stand...",Four Korean celebrity men and guest stars of b...,,,,,,2018-02-19,,Adults,2010s,S
8805,s5023,Movie,반드시 잡는다,Hong-seon Kim,Baek Yoon-sik,South Korea,"February 28, 2018",2017,TV-MA,110 min,"Dramas, International Movies, Thrillers",After people in his town start turning up dead...,,,,,,2018-02-28,110.0,Adults,2010s,B


In [None]:
# Copies `cast` as strings into `num_cast`; the actors are not counted yet
df_union_final["num_cast"] = df_union_final["cast"].astype(str)
df_union_final

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,fecha,duration_clean,clasificacion,decade,first_actor,num_cast
0,s2037,Movie,#Alive,Cho Il,"Yoo Ah-in, Park Shin-hye",South Korea,"September 8, 2020",2020,TV-MA,,"Horror Movies, International Movies, Thrillers","As a grisly virus rampages a city, a lone man ...",,,,,,2020-09-08,,Adults,2020s,Y,"Yoo Ah-in, Park Shin-hye"
1,s2305,Movie,#AnneFrank - Parallel Stories,"Sabina Fedeli, Anna Migotto","Helen Mirren, Gengher Gatti",Italy,"July 1, 2020",2019,TV-14,,"Documentaries, International Movies","Through her diary, Anne Frank's story is retol...",,,,,,2020-07-01,,Teens,2010s,H,"Helen Mirren, Gengher Gatti"
2,s2482,Movie,#FriendButMarried,Rako Prijanto,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,...",Indonesia,"May 21, 2020",2018,TV-G,,"Dramas, International Movies, Romantic Movies","Pining for his high school crush for years, a ...",,,,,,2020-05-21,,,2010s,A,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,..."
3,s2325,Movie,#FriendButMarried 2,Rako Prijanto,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von...",Indonesia,"June 28, 2020",2020,TV-G,104 min,"Dramas, International Movies, Romantic Movies",As Ayu and Ditto finally transition from best ...,,,,,,2020-06-28,104.0,,2020s,A,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von..."
4,s5974,Movie,#Roxy,Michael Kennedy,"Jake Short, Sarah Fisher, Booboo Stewart, Dann...",Canada,"April 10, 2019",2018,TV-14,,"Comedies, Romantic Movies",A teenage hacker with a huge nose helps a cool...,,,,,,2019-04-10,,Teens,2010s,J,"Jake Short, Sarah Fisher, Booboo Stewart, Dann..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s6178,TV Show,忍者ハットリくん,sin_informacion,sin_informacion,Japan,"December 23, 2018",2012,TV-Y7,2 Seasons,"Anime Series, Kids' TV","Hailing from the mountains of Iga, Kanzo Hatto...",,,,,,NaT,2.0,,2010s,,sin_informacion
8803,s4915,TV Show,海的儿子,sin_informacion,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",,"April 27, 2018",2016,TV-14,,"International TV Shows, TV Dramas","Two brothers start a new life in Singapore, wh...",,,,,,2018-04-27,,Teens,2010s,L,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl..."
8804,s7102,TV Show,마녀사냥,sin_informacion,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-...",South Korea,"February 19, 2018",2015,TV-MA,,"International TV Shows, Korean TV Shows, Stand...",Four Korean celebrity men and guest stars of b...,,,,,,2018-02-19,,Adults,2010s,S,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-..."
8805,s5023,Movie,반드시 잡는다,Hong-seon Kim,Baek Yoon-sik,South Korea,"February 28, 2018",2017,TV-MA,110 min,"Dramas, International Movies, Thrillers",After people in his town start turning up dead...,,,,,,2018-02-28,110.0,Adults,2010s,B,Baek Yoon-sik
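
In the output above, `num_cast` still holds the cast text rather than a count. A possible way to complete steps 2 and 3 is sketched below on a hypothetical sample. The helper `normalize_cast` is an illustrative name, and giving placeholder rows a count of 0 is an assumption about the intended behavior.

```python
import pandas as pd

# Hypothetical sample with irregular spacing and a missing cast.
sample = pd.DataFrame(
    {"cast": ["Yoo Ah-in,  Park Shin-hye ", "Baek Yoon-sik", None]}
)
sample["cast"] = sample["cast"].fillna("sin_informacion")


def normalize_cast(cast):
    """Collapse extra spaces in each name and rejoin the names with ', '."""
    names = [" ".join(name.split()) for name in cast.split(",")]
    return ", ".join(name for name in names if name)


sample["cast"] = sample["cast"].apply(normalize_cast)

# Count the names per row; rows that only hold the placeholder get 0 (assumption).
has_cast = sample["cast"] != "sin_informacion"
sample["num_cast"] = sample["cast"].str.split(", ").str.len().where(has_cast, 0)

print(sample)
```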



#### Exercise 9: Identifying Recurring Directors

In this exercise, you must identify the directors who appear more than once in the dataset. Follow these steps:

1. **Replace the null values** in the `director` column with "sin información" ("no information").

3. **Count how many times each director appears** in the column created in Exercise 6.

4. **Filter the directors who appear more than once** and create a new column called `recurrent_director` that indicates "Yes" if the director appears several times or "No" otherwise.

In [None]:
df_union_final["director"] = df_union_final["director"].fillna("sin_informacion")
df_union_final

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,Genre,Premiere,Runtime,IMDB Score,Language,fecha,duration_clean,clasificacion,decade,first_actor
0,s2037,Movie,#Alive,Cho Il,"Yoo Ah-in, Park Shin-hye",South Korea,"September 8, 2020",2020,TV-MA,,"Horror Movies, International Movies, Thrillers","As a grisly virus rampages a city, a lone man ...",,,,,,2020-09-08,,Adults,2020s,Y
1,s2305,Movie,#AnneFrank - Parallel Stories,"Sabina Fedeli, Anna Migotto","Helen Mirren, Gengher Gatti",Italy,"July 1, 2020",2019,TV-14,,"Documentaries, International Movies","Through her diary, Anne Frank's story is retol...",,,,,,2020-07-01,,Teens,2010s,H
2,s2482,Movie,#FriendButMarried,Rako Prijanto,"Adipati Dolken, Vanesha Prescilla, Rendi Jhon,...",Indonesia,"May 21, 2020",2018,TV-G,,"Dramas, International Movies, Romantic Movies","Pining for his high school crush for years, a ...",,,,,,2020-05-21,,,2010s,A
3,s2325,Movie,#FriendButMarried 2,Rako Prijanto,"Adipati Dolken, Mawar de Jongh, Sari Nila, Von...",Indonesia,"June 28, 2020",2020,TV-G,104 min,"Dramas, International Movies, Romantic Movies",As Ayu and Ditto finally transition from best ...,,,,,,2020-06-28,104.0,,2020s,A
4,s5974,Movie,#Roxy,Michael Kennedy,"Jake Short, Sarah Fisher, Booboo Stewart, Dann...",Canada,"April 10, 2019",2018,TV-14,,"Comedies, Romantic Movies",A teenage hacker with a huge nose helps a cool...,,,,,,2019-04-10,,Teens,2010s,J
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s6178,TV Show,忍者ハットリくん,sin_informacion,sin_informacion,Japan,"December 23, 2018",2012,TV-Y7,2 Seasons,"Anime Series, Kids' TV","Hailing from the mountains of Iga, Kanzo Hatto...",,,,,,NaT,2.0,,2010s,
8803,s4915,TV Show,海的儿子,sin_informacion,"Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",,"April 27, 2018",2016,TV-14,,"International TV Shows, TV Dramas","Two brothers start a new life in Singapore, wh...",,,,,,2018-04-27,,Teens,2010s,L
8804,s7102,TV Show,마녀사냥,sin_informacion,"Si-kyung Sung, Se-yoon Yoo, Dong-yup Shin, Ji-...",South Korea,"February 19, 2018",2015,TV-MA,,"International TV Shows, Korean TV Shows, Stand...",Four Korean celebrity men and guest stars of b...,,,,,,2018-02-19,,Adults,2010s,S
8805,s5023,Movie,반드시 잡는다,Hong-seon Kim,Baek Yoon-sik,South Korea,"February 28, 2018",2017,TV-MA,110 min,"Dramas, International Movies, Thrillers",After people in his town start turning up dead...,,,,,,2018-02-28,110.0,Adults,2010s,B
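
The counting and flagging steps could be sketched as follows, on a hypothetical sample that uses the `first_director` column from Exercise 6. Excluding the `sin_informacion` placeholder from the recurring directors is an assumption. Without that guard, the placeholder would be flagged as the most frequent "director".

```python
import numpy as np
import pandas as pd

# Hypothetical sample; "sin_informacion" stands in for missing directors.
sample = pd.DataFrame(
    {
        "first_director": [
            "Rako Prijanto",
            "Rako Prijanto",
            "Cho Il",
            "sin_informacion",
            "sin_informacion",
        ]
    }
)

# Map each row to the number of times its director appears in the column.
counts = sample["first_director"].map(sample["first_director"].value_counts())

# Assumption: the placeholder should not count as a recurring director.
is_known = sample["first_director"] != "sin_informacion"
sample["recurrent_director"] = np.where((counts > 1) & is_known, "Yes", "No")

print(sample)
```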
