<a href="https://colab.research.google.com/github/fralfaro/MAT306/blob/main/docs/labs/lab_03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# MAT306 - Lab No. 03





**Objective**: Apply advanced data manipulation and analysis techniques with pandas to a real dataset of Netflix content, reinforcing good practices and efficient methods without resorting to `groupby`, `merge`, `pivot`, or `join`.



**Dataset**:

We will work with the file `netflix_titles.csv`, which contains information about the titles available on the Netflix platform up to 2021.

| Variable       | Type    | Description                                                          |
|----------------|---------|----------------------------------------------------------------------|
| show_id        | string  | Unique identifier of the title in the Netflix catalog.               |
| type           | string  | Type of content: 'Movie' or 'TV Show'.                               |
| title          | string  | Title of the content.                                                |
| director       | string  | Name of the director (may be null).                                  |
| cast           | string  | List of main actors (may be null).                                   |
| country        | string  | Country or countries where the content was produced.                 |
| date_added     | date    | Date on which the title was added to the Netflix catalog.            |
| release_year   | integer | Original release year of the title.                                  |
| rating         | string  | Age rating (for example: 'PG-13', 'TV-MA').                          |
| duration       | string  | Length of the content (minutes, or number of seasons for TV shows).  |
| listed_in      | string  | Categories or genres under which the content is listed.              |
| description    | string  | Short synopsis of the content.                                       |




In [1]:
import pandas as pd

In [2]:
# Load the data
df = pd.read_csv('https://raw.githubusercontent.com/fralfaro/MAT306/main/docs/labs/data/netflix_titles.csv')
df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...



### Part 1: Cleaning and preparation

1. Inspect and describe the dataset:

   * How many rows and columns does it have?
   * Which data types does it contain?
   * How many null values are there per column?

2. Convert the `date_added` column to a date type.

3. Create auxiliary columns with `assign`:

   * Year (`year_added`)
   * Month (`month_added`)



In [3]:
# Size of the DataFrame
print("The DataFrame has", df.shape[0], "rows and", df.shape[1], "columns")

The DataFrame has 8807 rows and 12 columns


In [8]:
# Data type of each column
print("\nData types: \n", df.dtypes)



Data types:
 show_id         object
type            object
title           object
director        object
cast            object
country         object
date_added      object
release_year     int64
rating          object
duration        object
listed_in       object
description     object
dtype: object


In [11]:
# Null values per column
print("Null values per column: \n", df.isnull().sum())

Null values per column:
 show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64


In [None]:
# Strip surrounding whitespace first, because it causes problems when converting to a date
df["date_added"] = df["date_added"].str.strip()

# Convert to datetime
df["date_added"] = pd.to_datetime(df["date_added"], format="%B %d, %Y")

df["date_added"].dtype      # <M8[ns] = datetime64[ns]

dtype('<M8[ns]')

In [16]:
# Create the auxiliary columns
df = df.assign(year_added=df["date_added"].dt.year, month_added=df["date_added"].dt.month)
df[["year_added", "month_added"]]

Unnamed: 0,year_added,month_added
0,2021.0,9.0
1,2021.0,9.0
2,2021.0,9.0
3,2021.0,9.0
4,2021.0,9.0
...,...,...
8802,2019.0,11.0
8803,2019.0,7.0
8804,2019.0,11.0
8805,2020.0,1.0
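
The `.0` suffix in the output above appears because rows with a missing `date_added` become `NaT`, which forces `dt.year` and `dt.month` to float. The following is a minimal sketch, on made-up dates rather than the lab data, of one possible way to keep whole numbers by casting to the nullable `Int64` dtype. The variable names are illustrative only.

```python
import pandas as pd

# Toy data (not from the lab): one valid date and one missing value
dates = pd.to_datetime(
    pd.Series(["September 25, 2021", None]), format="%B %d, %Y"
)

year_float = dates.dt.year              # float64, because of the missing date
year_int = year_float.astype("Int64")   # nullable integer: 2021 and <NA>

print(year_float.dtype, year_int.dtype)
```

Whether to cast is a matter of taste: the float columns work fine for comparisons such as `year_added > 2018`, while `Int64` mostly improves how the values are displayed.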


### Part 2: Advanced pandas techniques

4. Use `.loc` to select movies (`type == 'Movie'`) that were added after 2018.

5. Use `str.contains()` and `str.extract()`:

   * Filter titles that contain the word 'love' (case-insensitive).
   * Extract the duration in minutes for movies from the `duration` column.

6. Apply `explode()` to the `listed_in` column to get one row per genre.

7. Get the top 10 most frequent genres using `value_counts()`.

8. Apply `where()` and `mask()` to flag movies longer than 120 minutes as long content in a new column.

9. Use `.loc` to filter movies that meet all of the following:

   * More than 100 minutes long.
   * Rating equal to `'R'`.
   * Country equal to `'United States'`.

10. Use `.style` to visually format the top 10 longest movies.

In [18]:
# Create a copy of df that includes only movies
df_movies = df.loc[df["type"] == "Movie"].copy()

# Keep the movies added after 2018
df_movies.loc[df_movies["year_added"] > 2018]

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,year_added,month_added
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,2021-09-25,2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",2021.0,9.0
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha","Vanessa Hudgens, Kimiko Glenn, James Marsden, ...",,2021-09-24,2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...,2021.0,9.0
7,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...",2021-09-24,1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s...",2021.0,9.0
9,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,2021-09-24,2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...,2021.0,9.0
12,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic",2021-09-23,2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...,2021.0,9.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8798,s8799,Movie,Zed Plus,Chandra Prakash Dwivedi,"Adil Hussain, Mona Singh, K.K. Raina, Sanjay M...",India,2019-12-31,2014,TV-MA,131 min,"Comedies, Dramas, International Movies",A philandering small-town mechanic's political...,2019.0,12.0
8802,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,2019-11-20,2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a...",2019.0,11.0
8804,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,2019-11-01,2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...,2019.0,11.0
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,2020-01-11,2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero...",2020.0,1.0


In [None]:
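# Filter titles containing the word "love" (case-insensitive).
# Note: the spaces around " love " mean titles that start or end with the word are not matched.
df.loc[df["title"].str.contains(" love ", case=False)]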

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,year_added,month_added
506,s507,Movie,This Little Love Of Mine,Christine Luby,"Saskia Hampele, Liam McIntyre, Lynn Gilmartin,...",Australia,2021-07-07,2021,TV-G,92 min,"International Movies, Romantic Movies",A workaholic lawyer returns to her island home...,2021.0,7.0
659,s660,TV Show,Bangkok Love Stories: Innocence,,"Nida Patcharaveerapong, Nicole Theriault, Natt...",Thailand,2021-06-19,2018,TV-14,1 Season,"International TV Shows, Romantic TV Shows, TV ...",From a teenage parkour enthusiast to a bawdy r...,2021.0,6.0
849,s850,Movie,Sam Smith: Love Goes - Live at Abbey Road Studios,,Sam Smith,,2021-05-22,2020,TV-G,61 min,"International Movies, Music & Musicals",Grammy-winning artist Sam Smith gives an intim...,2021.0,5.0
1161,s1162,Movie,Elizabeth and Margaret: Love and Loyalty,,,United Kingdom,2021-03-26,2020,TV-PG,87 min,Documentaries,This documentary takes an intimate look at the...,2021.0,3.0
1428,s1429,Movie,Is Love Enough? Sir,Rohena Gera,"Tillotama Shome, Vivek Gomber, Geetanjali Kulk...","India, France",2021-01-08,2018,TV-MA,99 min,"Dramas, Independent Movies, International Movies",A young widow is hired as the domestic helper ...,2021.0,1.0
1483,s1484,TV Show,A Love So Beautiful,,"Kim Yo-han, So Joo-yeon, Yeo Hoi-hyun, Jeong J...",South Korea,2020-12-28,2020,TV-PG,1 Season,"International TV Shows, Romantic TV Shows, TV ...",Love is as tough as it is sweet for a lovestru...,2020.0,12.0
1509,s1510,Movie,"Ariana grande: excuse me, i love you",Paul Dugdale,Ariana Grande,United States,2020-12-21,2020,TV-MA,98 min,"Documentaries, Music & Musicals",Ariana Grande takes the stage in London for he...,2020.0,12.0
1525,s1526,Movie,Eggnoid: Love & Time Portal,Naya Anindita,"Morgan Oey, Sheila Dara, Luna Maya, Kevin Juli...",Indonesia,2020-12-17,2019,TV-14,102 min,"Dramas, International Movies, Romantic Movies",Sent from the future to look after a lonely gi...,2020.0,12.0
1669,s1670,Movie,If Anything Happens I Love You,"Will McCormack, Michael Govier",,United States,2020-11-20,2020,PG,13 min,Dramas,Grieving parents journey through an emotional ...,2020.0,11.0
1962,s1963,Movie,A Love Song for Latasha,Sophia Nahli Allison,,United States,2020-09-21,2020,TV-PG,20 min,Documentaries,The killing of Latasha Harlins became a flashp...,2020.0,9.0
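
The filter above searches for `" love "` with a space on each side, so a title that begins or ends with the word is skipped. A hedged sketch of one alternative uses the regex word boundary `\b`. The titles below are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy titles (hypothetical), chosen to show the edge cases
titles = pd.Series(["Love Actually", "A Love Song", "Gloves Off", "I Love"])

# Pattern used in the cell above: requires a space on both sides
with_spaces = titles.str.contains(" love ", case=False)

# Word-boundary pattern: matches "love" as a whole word anywhere in the title
whole_word = titles.str.contains(r"\blove\b", case=False, regex=True)

print(titles[with_spaces].tolist())
print(titles[whole_word].tolist())
```

The word boundary still rejects "Gloves Off", where "love" is only part of a longer word, which is presumably what the spaces were meant to achieve.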


In [23]:
# Create a duration column that holds only the numeric part (still stored as text)
df_movies["duration_min"] = df_movies["duration"].str.extract(r"(\d+)")
df_movies["duration_min"]


0        90
6        91
7       125
9       104
12      127
       ... 
8801     96
8802    158
8804     88
8805     88
8806    111
Name: duration_min, Length: 6131, dtype: object
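
The extracted values above have dtype `object`, meaning they are still strings. As a sketch under the assumption that movie durations always look like `"<number> min"`, the extraction and the numeric conversion can be chained. The sample values are invented for illustration.

```python
import pandas as pd

# Toy durations (hypothetical): a movie, a TV show, and a missing value
duration = pd.Series(["90 min", "2 Seasons", None])

minutes = (
    duration.str.extract(r"(\d+)\s*min", expand=False)  # Series of digit strings or NaN
    .pipe(pd.to_numeric)                                 # convert the strings to numbers
    .astype("Int64")                                     # nullable integer, missing stays <NA>
)
print(minutes)
```

Here missing durations remain `<NA>` instead of being filled with 0, which keeps them from being counted as very short movies. The lab's own cells fill them with 0 instead.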

In [26]:
# Copy the DataFrame
df_ex = df.copy()

# Split the comma-separated genres into lists and apply explode.
# Note: splitting on "," alone leaves a leading space on every genre after the first.
df_ex["listed_in"] = df_ex["listed_in"].str.split(",")
df_ex = df_ex.explode("listed_in")

df_ex

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,year_added,month_added
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,2021-09-25,2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm...",2021.0,9.0
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,2021-09-24,2021,TV-MA,2 Seasons,International TV Shows,"After crossing paths at a party, a Cape Town t...",2021.0,9.0
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,2021-09-24,2021,TV-MA,2 Seasons,TV Dramas,"After crossing paths at a party, a Cape Town t...",2021.0,9.0
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,2021-09-24,2021,TV-MA,2 Seasons,TV Mysteries,"After crossing paths at a party, a Cape Town t...",2021.0,9.0
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,2021-09-24,2021,TV-MA,1 Season,Crime TV Shows,To protect his family from a powerful drug lor...,2021.0,9.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,2020-01-11,2006,PG,88 min,Children & Family Movies,"Dragged from civilian life, a former superhero...",2020.0,1.0
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,2020-01-11,2006,PG,88 min,Comedies,"Dragged from civilian life, a former superhero...",2020.0,1.0
8806,s8807,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,2019-03-02,2015,TV-14,111 min,Dramas,A scrappy but poor boy worms his way into a ty...,2019.0,3.0
8806,s8807,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,2019-03-02,2015,TV-14,111 min,International Movies,A scrappy but poor boy worms his way into a ty...,2019.0,3.0


In [27]:
# Top 10 most frequent genres
df_ex["listed_in"].value_counts().head(10)

listed_in
 International Movies     2624
Dramas                    1600
Comedies                  1210
Action & Adventure         859
Documentaries              829
 Dramas                    827
International TV Shows     774
 Independent Movies        736
 TV Dramas                 696
 Romantic Movies           613
Name: count, dtype: int64
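
In the output above, `Dramas` and ` Dramas` (with a leading space) are counted as two different genres, because the text was split on `","` and the space after each comma was kept. A minimal sketch of one possible fix, shown on invented rows rather than the real data, strips the whitespace after `explode()`:

```python
import pandas as pd

# Toy genre strings (hypothetical), in the same "a, b, c" layout as listed_in
genres = pd.Series(["Dramas, International Movies", "Dramas", "Comedies, Dramas"])

# Splitting on "," alone keeps the space that follows each comma
raw_counts = genres.str.split(",").explode().value_counts()

# Stripping after explode merges " Dramas" and "Dramas" into one label
clean_counts = genres.str.split(",").explode().str.strip().value_counts()

print(raw_counts)
print(clean_counts)
```

Splitting on `", "` would work as well, as long as every separator in the column is exactly a comma followed by one space.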

In [28]:
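# Fill missing durations with 0 and cast to integer
df_movies["duration_min"] = df_movies["duration_min"].fillna(0).astype(int)

# where() keeps the duration where it exceeds 120 and writes "Corto" (short) elsewhere;
# mask() then overwrites the durations that are left (those over 120) with "Largo" (long)
df_movies["Contenido_largo"] = df_movies["duration_min"].where(df_movies["duration_min"] > 120, other="Corto")
df_movies["Contenido_largo"] = df_movies["Contenido_largo"].mask(df_movies["duration_min"] > 120, other="Largo")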

df_movies["Contenido_largo"]

0       Corto
6       Corto
7       Largo
9       Corto
12      Largo
        ...  
8801    Corto
8802    Largo
8804    Corto
8805    Corto
8806    Corto
Name: Contenido_largo, Length: 6131, dtype: object
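
Because missing durations were filled with 0 before labelling, they end up tagged as "Corto". The sketch below, on invented durations, shows one possible variant that starts from a constant label, applies `mask()` for long movies, and uses `where()` to leave the label empty when the duration is unknown. It is an illustration of the two methods, not the lab's expected answer.

```python
import pandas as pd

# Toy durations in minutes (hypothetical); NaN stands for an unknown duration
dur = pd.Series([90.0, 125.0, float("nan"), 120.0])

labels = (
    pd.Series("Corto", index=dur.index)   # start with every row labelled short
    .mask(dur > 120, "Largo")             # mask(): replace where the condition is True
    .where(dur.notna())                   # where(): keep only rows with a known duration
)
print(labels.tolist())
```

Note that a movie of exactly 120 minutes stays "Corto", matching the strict `> 120` comparison used in the lab.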

In [29]:
# Filter by the three criteria: duration, rating, and country
df_movies.loc[(df_movies["duration_min"] > 100) & (df_movies["rating"] == "R") & (df_movies["country"] == "United States")]

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,year_added,month_added,duration_min,Contenido_largo
48,s49,Movie,Training Day,Antoine Fuqua,"Denzel Washington, Ethan Hawke, Scott Glenn, T...",United States,2021-09-16,2001,R,122 min,"Dramas, Thrillers",A rookie cop with one day to prove himself to ...,2021.0,9.0,122,Largo
81,s82,Movie,Kate,Cedric Nicolas-Troyan,"Mary Elizabeth Winstead, Jun Kunimura, Woody H...",United States,2021-09-10,2021,R,106 min,Action & Adventure,"Slipped a fatal poison on her final job, a rut...",2021.0,9.0,106,Corto
131,s132,Movie,Blade Runner: The Final Cut,Ridley Scott,"Harrison Ford, Rutger Hauer, Sean Young, Edwar...",United States,2021-09-01,1982,R,117 min,"Action & Adventure, Classic Movies, Cult Movies","In a smog-choked dystopian Los Angeles, blade ...",2021.0,9.0,117,Corto
139,s140,Movie,Do the Right Thing,Spike Lee,"Danny Aiello, Ossie Davis, Ruby Dee, Richard E...",United States,2021-09-01,1989,R,120 min,"Classic Movies, Comedies, Dramas","On a sweltering day in Brooklyn, simmering rac...",2021.0,9.0,120,Corto
144,s145,Movie,House Party,Reginald Hudlin,"Christopher Reid, Christopher Martin, Robin Ha...",United States,2021-09-01,1990,R,104 min,"Comedies, Cult Movies","Grounded by his strict father, Kid risks life ...",2021.0,9.0,104,Corto
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8678,s8679,Movie,Vincent N Roxxy,Gary Michael Schultz,"Emile Hirsch, Zoë Kravitz, Emory Cohen, Zoey D...",United States,2017-09-02,2016,R,101 min,"Dramas, Thrillers","In rural Louisiana, a terse loner forges a red...",2017.0,9.0,101,Corto
8691,s8692,Movie,Wakefield,Robin Swicord,"Bryan Cranston, Jennifer Garner, Jason O'Mara,...",United States,2019-03-02,2016,R,109 min,Dramas,An unhappy father and lawyer quits his suburba...,2019.0,3.0,109,Corto
8751,s8752,Movie,Wish I Was Here,Zach Braff,"Zach Braff, Kate Hudson, Donald Faison, Joey K...",United States,2018-08-16,2014,R,106 min,"Comedies, Dramas, Independent Movies","With his acting career moribund, Aidan Bloom s...",2018.0,8.0,106,Corto
8754,s8755,Movie,Wolves,Bart Freundlich,"Michael Shannon, Carla Gugino, Taylor John Smi...",United States,2019-03-29,2016,R,109 min,"Dramas, Independent Movies, Sports Movies",A promising high school basketball player has ...,2019.0,3.0,109,Corto


In [None]:
#pip install jinja2

In [30]:
# Keep the 10 longest movies
top10 = df_movies.sort_values("duration_min", ascending=False).head(10)

# Highlight the full row of each movie
top10.style.apply(lambda x: ["background-color: brown" for _ in x], axis=1)


Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description,year_added,month_added,duration_min,Contenido_largo
4253,s4254,Movie,Black Mirror: Bandersnatch,,"Fionn Whitehead, Will Poulter, Craig Parkinson, Alice Lowe, Asim Chaudhry",United States,2018-12-28 00:00:00,2018,TV-MA,312 min,"Dramas, International Movies, Sci-Fi & Fantasy","In 1984, a young programmer begins to question reality as he adapts a dark fantasy novel into a video game. A mind-bending tale with multiple endings.",2018.0,12.0,312,Largo
717,s718,Movie,Headspace: Unwind Your Mind,,"Andy Puddicombe, Evelyn Lewis Prieto, Ginger Daniels, Darren Pettie, Simon Prebble, Rhiannon Mcgavin, Kate Seftel",,2021-06-15 00:00:00,2021,TV-G,273 min,Documentaries,"Do you want to relax, meditate or sleep deeply? Personalize the experience according to your mood or mindset with this Headspace interactive special.",2021.0,6.0,273,Largo
2491,s2492,Movie,The School of Mischief,Houssam El-Din Mustafa,"Suhair El-Babili, Adel Emam, Saeed Saleh, Younes Shalabi, Hadi El-Gayyar, Ahmad Zaki, Hassan Moustafa",Egypt,2020-05-21 00:00:00,1973,TV-14,253 min,"Comedies, Dramas, International Movies",A high school teacher volunteers to transform five notorious misfits into model students — and has unintended results.,2020.0,5.0,253,Largo
2487,s2488,Movie,No Longer kids,Samir Al Asfory,"Said Saleh, Hassan Moustafa, Ahmed Zaki, Younes Shalabi, Nadia Shukri, Karima Mokhtar",Egypt,2020-05-21 00:00:00,1979,TV-14,237 min,"Comedies, Dramas, International Movies","Hoping to prevent their father from skipping town with his mistress, four rowdy siblings resort to absurd measures to stop him.",2020.0,5.0,237,Largo
2484,s2485,Movie,Lock Your Girls In,Fouad El-Mohandes,"Fouad El-Mohandes, Sanaa Younes, Sherihan, Ahmed Rateb, Ijlal Zaki, Zakariya Mowafi",,2020-05-21 00:00:00,1982,TV-PG,233 min,"Comedies, International Movies, Romantic Movies",A widower believes he must marry off his three problematic daughters before he can pursue his real goal of marrying his secret love.,2020.0,5.0,233,Largo
2488,s2489,Movie,Raya and Sakina,Hussein Kamal,"Suhair El-Babili, Shadia, Abdel Moneim Madbouly, Ahmed Bedir",,2020-05-21 00:00:00,1984,TV-14,230 min,"Comedies, Dramas, International Movies","When robberies and murders targeting women sweep early 20th-century Egypt, the hunt for suspects leads to two shadowy sisters. Based on a true story.",2020.0,5.0,230,Largo
166,s167,Movie,Once Upon a Time in America,Sergio Leone,"Robert De Niro, James Woods, Elizabeth McGovern, Treat Williams, Tuesday Weld, Burt Young, Joe Pesci, Danny Aiello, William Forsythe, James Hayden","Italy, United States",2021-09-01 00:00:00,1984,R,229 min,"Classic Movies, Dramas",Director Sergio Leone's sprawling crime epic follows a group of Jewish mobsters who rise in the ranks of organized crime in 1920s New York City.,2021.0,9.0,229,Largo
7932,s7933,Movie,Sangam,Raj Kapoor,"Raj Kapoor, Vyjayanthimala, Rajendra Kumar, Lalita Pawar, Achala Sachdev, Hari Shivdasani, Raj Mehra, Iftekhar",India,2019-12-31 00:00:00,1964,TV-14,228 min,"Classic Movies, Dramas, International Movies","Returning home from war after being assumed dead, a pilot weds the woman he has long loved, unaware that she had been planning to marry his best friend.",2019.0,12.0,228,Largo
1019,s1020,Movie,Lagaan,Ashutosh Gowariker,"Aamir Khan, Gracy Singh, Rachel Shelley, Paul Blackthorne, Kulbhushan Kharbanda, Raghuvir Yadav, Yashpal Sharma, Rajendranath Zutshi, Rajesh Vivek, Aditya Lakhia","India, United Kingdom",2021-04-17 00:00:00,2001,PG,224 min,"Dramas, International Movies, Music & Musicals","In 1890s India, an arrogant British commander challenges the harshly taxed residents of Champaner to a high-stakes cricket match.",2021.0,4.0,224,Largo
4573,s4574,Movie,Jodhaa Akbar,Ashutosh Gowariker,"Hrithik Roshan, Aishwarya Rai Bachchan, Sonu Sood, Poonam Sinha, Suhasini Mulay, Ila Arun, Raza Murad, Kulbhushan Kharbanda, Abeer Abrar",India,2018-10-01 00:00:00,2008,TV-14,214 min,"Action & Adventure, Dramas, International Movies","In 16th-century India, what begins as a strategic alliance between a Mughal emperor and a Hindu princess becomes a genuine opportunity for true love.",2018.0,10.0,214,Largo




### Challenge Question

11. What are the most frequent combinations of genre and rating in the dataset?
    (Hint: use `value_counts` with `subset=["genre", "rating"]` after applying `explode()`.)



### Bonus: Duplicate analysis and cleaning

12. Are there movies with the same name (`title`) but a different release year (`release_year`)?
13. How many unique titles are there in total in the `title` column?





In [31]:
# The 5 most frequent combinations of genre and rating
df_ex.value_counts(subset=["listed_in", "rating"]).head(5)

listed_in              rating
 International Movies  TV-MA     1074
                       TV-14     1022
Dramas                 TV-MA      616
                       TV-14      428
 TV Dramas             TV-MA      401
Name: count, dtype: int64

In [36]:
df_movies["title"] = df_movies["title"].astype(str)
# Build the mask from df_movies itself so that it matches the rows being filtered
duplicados = df_movies.loc[df_movies["title"].duplicated()]

print("There are", len(duplicados), "movies with a repeated title, so no movie can meet the stated criterion.")

There are 0 movies with a repeated title, so no movie can meet the stated criterion.
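
The check above compares titles exactly, so differences in capitalization or stray spaces would hide a repeat. As a hedged sketch of how question 12 could be answered in general without `groupby`, the example below normalizes the title, drops repeated (title, year) pairs, and then looks for titles that still appear more than once. The rows and the helper column `title_norm` are made up for illustration.

```python
import pandas as pd

# Toy catalog (hypothetical): "Zoom" appears with two different release years,
# while "Kate" / "kate " is the same movie written inconsistently
movies = pd.DataFrame({
    "title": ["Zoom", "Zoom", "Kate", "kate ", "Heat"],
    "release_year": [2006, 2016, 2021, 2021, 1995],
})

pairs = (
    movies.assign(title_norm=movies["title"].str.strip().str.casefold())
    [["title_norm", "release_year"]]
    .drop_duplicates()          # one row per distinct (title, year) pair
)

# A title left more than once now has at least two different release years
repeated = pairs.loc[pairs["title_norm"].duplicated(keep=False)]
print(repeated)
```

Only "zoom" is reported here: the two "Kate" rows collapse into a single (title, year) pair and are therefore not flagged.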


In [37]:
# Count the distinct titles across the whole catalog (movies and TV shows)
titulos_unicos = df["title"].nunique()
print(f"There are {titulos_unicos} unique titles in total")

There are 8807 unique titles in total
