# The Academy Awards, 1927 - 2023

This [dataset](https://www.kaggle.com/datasets/unanimad/the-oscar-award) contains data scraped from the Academy Awards database: records of the nominees and winners from 1927 to 2023.

In this EDA I will cross-reference data from several film datasets with the Best Picture nominees and winners, to see whether Academy members follow some kind of pattern or *bias* when choosing the winner.

In [2]:
import pandas as pd
import requests

In [3]:
df_oscars = pd.read_csv("./data/the_oscar_award.csv")
df_oscars

Unnamed: 0,year_film,year_ceremony,ceremony,category,name,film,winner
0,1927,1928,1,ACTOR,Richard Barthelmess,The Noose,False
1,1927,1928,1,ACTOR,Emil Jannings,The Last Command,True
2,1927,1928,1,ACTRESS,Louise Dresser,A Ship Comes In,False
3,1927,1928,1,ACTRESS,Janet Gaynor,7th Heaven,True
4,1927,1928,1,ACTRESS,Gloria Swanson,Sadie Thompson,False
...,...,...,...,...,...,...,...
10760,2022,2023,95,HONORARY AWARD,"To Euzhan Palcy, a masterful filmmaker who bro...",,True
10761,2022,2023,95,HONORARY AWARD,"To Diane Warren, for her genius, generosity an...",,True
10762,2022,2023,95,HONORARY AWARD,"To Peter Weir, a fearless and consummate filmm...",,True
10763,2022,2023,95,GORDON E. SAWYER AWARD,Iain Neil,,True


### The different names the Best Picture category has had over the years
1927/28–1928/29: Academy Award for Outstanding Picture  
1929/30–1940: Academy Award for Outstanding Production  
1941–1943: Academy Award for Outstanding Motion Picture  
1944–1961: Academy Award for Best Motion Picture  
1962–present: Academy Award for Best Picture

In [4]:
# Every label the Best Picture category has carried in the dataset
BEST_PICTURE_CATEGORIES = [
    "OUTSTANDING PICTURE",
    "OUTSTANDING PRODUCTION",
    "OUTSTANDING MOTION PICTURE",
    "BEST MOTION PICTURE",
    "BEST PICTURE",
]

# Keep only the rows whose category is one of those labels
df_best_picture = df_oscars[df_oscars.category.isin(BEST_PICTURE_CATEGORIES)]
df_best_picture

Unnamed: 0,year_film,year_ceremony,ceremony,category,name,film,winner
19,1927,1928,1,OUTSTANDING PICTURE,The Caddo Company,The Racket,False
20,1927,1928,1,OUTSTANDING PICTURE,Fox,7th Heaven,False
21,1927,1928,1,OUTSTANDING PICTURE,Paramount Famous Lasky,Wings,True
62,1928,1929,2,OUTSTANDING PICTURE,Feature Productions,Alibi,False
63,1928,1929,2,OUTSTANDING PICTURE,Fox,In Old Arizona,False
...,...,...,...,...,...,...,...
10719,2022,2023,95,BEST PICTURE,"Kristie Macosko Krieger, Steven Spielberg and ...",The Fabelmans,False
10720,2022,2023,95,BEST PICTURE,"Todd Field, Alexandra Milchan and Scott Lamber...",Tár,False
10721,2022,2023,95,BEST PICTURE,"Tom Cruise, Christopher McQuarrie, David Ellis...",Top Gun: Maverick,False
10722,2022,2023,95,BEST PICTURE,"Erik Hemmendorff and Philippe Bober, Producers",Triangle of Sadness,False
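
As a sanity check on the name changes listed above, one could group the rows by category label and look at the first and last ceremony year in which each label appears. The following is a minimal sketch on a handful of toy rows taken from the outputs shown so far. The names `sample`, `best_picture_labels` and `category_spans` are illustrative and not part of the notebook; with the real data, `df_oscars` would take the place of `sample`.

```python
import pandas as pd

# Toy rows mirroring two columns of the_oscar_award.csv (illustrative only)
sample = pd.DataFrame(
    {
        "year_ceremony": [1928, 1928, 1929, 1930, 1931, 2022, 2023],
        "category": [
            "ACTOR",
            "OUTSTANDING PICTURE",
            "OUTSTANDING PICTURE",
            "OUTSTANDING PRODUCTION",
            "OUTSTANDING PRODUCTION",
            "BEST PICTURE",
            "BEST PICTURE",
        ],
    }
)

best_picture_labels = [
    "OUTSTANDING PICTURE",
    "OUTSTANDING PRODUCTION",
    "OUTSTANDING MOTION PICTURE",
    "BEST MOTION PICTURE",
    "BEST PICTURE",
]

# First and last ceremony year for each Best Picture label present in the data
mask = sample["category"].isin(best_picture_labels)
category_spans = (
    sample[mask]
    .groupby("category")["year_ceremony"]
    .agg(first_year="min", last_year="max")
    .sort_values("first_year")
)
print(category_spans)
```

Labels that never occur in the data simply do not show up in the result, which makes a misspelled label easy to spot.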


In [5]:
df_best_picture[df_best_picture.winner]
# to make sure I have 95 years of data

Unnamed: 0,year_film,year_ceremony,ceremony,category,name,film,winner
21,1927,1928,1,OUTSTANDING PICTURE,Paramount Famous Lasky,Wings,True
64,1928,1929,2,OUTSTANDING PICTURE,Metro-Goldwyn-Mayer,The Broadway Melody,True
100,1929,1930,3,OUTSTANDING PRODUCTION,Universal,All Quiet on the Western Front,True
140,1930,1931,4,OUTSTANDING PRODUCTION,RKO Radio,Cimarron,True
178,1931,1932,5,OUTSTANDING PRODUCTION,Metro-Goldwyn-Mayer,Grand Hotel,True
...,...,...,...,...,...,...,...
10219,2018,2019,91,BEST PICTURE,"Jim Burke, Charles B. Wessler, Brian Currie, P...",Green Book,True
10350,2019,2020,92,BEST PICTURE,"Kwak Sin Ae and Bong Joon Ho, Producers",Parasite,True
10474,2020,2021,93,BEST PICTURE,"Frances McDormand, Peter Spears, Mollye Asher,...",Nomadland,True
10591,2021,2022,94,BEST PICTURE,"Philippe Rousselet, Fabrice Gianfermi and Patr...",CODA,True
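
Checking a truncated table by eye is error-prone. A possible alternative is to count the winners per ceremony and list every ceremony that does not have exactly one. The sketch below runs on toy rows, and the helper name `ceremonies_without_single_winner` is hypothetical; it is not part of the original notebook.

```python
import pandas as pd


def ceremonies_without_single_winner(df: pd.DataFrame) -> pd.Series:
    """Return the winner count of every ceremony whose count is not exactly 1."""
    # Summing booleans counts the True values in each group
    winners_per_ceremony = df.groupby("ceremony")["winner"].sum()
    return winners_per_ceremony[winners_per_ceremony != 1]


# Toy data: ceremony 1 has one winner, ceremonies 2 and 3 have none (illustrative only)
toy = pd.DataFrame(
    {
        "ceremony": [1, 1, 1, 2, 2, 3],
        "film": ["The Racket", "7th Heaven", "Wings", "Film A", "Film B", "Film C"],
        "winner": [False, False, True, False, False, False],
    }
)

problems = ceremonies_without_single_winner(toy)
print(problems)
```

On the real data one would call the helper on `df_best_picture`; an empty result would mean that every ceremony has exactly one Best Picture winner.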


In [6]:
# Reassign instead of dropping in place, which avoids the SettingWithCopyWarning
df_best_picture = df_best_picture.drop(columns="category")
df_best_picture.info()

<class 'pandas.core.frame.DataFrame'>
Index: 591 entries, 19 to 10723
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   year_film      591 non-null    int64 
 1   year_ceremony  591 non-null    int64 
 2   ceremony       591 non-null    int64 
 3   name           591 non-null    object
 4   film           591 non-null    object
 5   winner         591 non-null    bool  
dtypes: bool(1), int64(3), object(2)
memory usage: 28.3+ KB
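
When a DataFrame comes from a boolean filter, as `df_best_picture` does, a common way to keep later modifications unambiguous is to take an explicit `.copy()` right after filtering and to reassign the result of `drop` rather than pass `inplace=True`. A minimal sketch on toy rows (the names `toy` and `subset` are illustrative):

```python
import pandas as pd

# Toy rows with the same column names as the notebook's data (illustrative only)
toy = pd.DataFrame(
    {
        "category": ["ACTOR", "BEST PICTURE", "BEST PICTURE"],
        "film": ["The Noose", "Tár", "Top Gun: Maverick"],
        "winner": [False, False, False],
    }
)

# Explicit copy: `subset` owns its data and is independent of `toy`
subset = toy[toy["category"] == "BEST PICTURE"].copy()

# Reassign the result of drop instead of mutating in place
subset = subset.drop(columns="category")

print(subset.columns.tolist())
print(toy.columns.tolist())
```

The original frame keeps its `category` column, and only the filtered copy loses it.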


|Column|Description|
|-|-|
|year_film|Release year|
|year_ceremony|Year the ceremony was held|
|ceremony|Ceremony number|
|name|Name of the production company or producers|
|film|Film title|
|winner|Boolean, True if the film won|

In [7]:
df_best_picture.sample(50)

Unnamed: 0,year_film,year_ceremony,ceremony,name,film,winner
2590,1951,1952,24,"Anatole Litvak and Frank McCarthy, Producers",Decision before Dawn,False
8419,2003,2004,76,"Kathleen Kennedy, Frank Marshall and Gary Ross...",Seabiscuit,False
3224,1956,1957,29,"George Stevens and Henry Ginsberg, Producers",Giant,False
9345,2011,2012,84,"Graham King and Martin Scorsese, Producers",Hugo,False
606,1937,1938,10,RKO Radio,Stage Door,False
878,1939,1940,12,Hal Roach (production company),Of Mice and Men,False
9966,2016,2017,89,"Donna Gigliotti, Peter Chernin, Jenno Topping,...",Hidden Figures,False
874,1939,1940,12,Metro-Goldwyn-Mayer,"Goodbye, Mr. Chips",False
10220,2018,2019,91,"Gabriela Rodríguez and Alfonso Cuarón, Producers",Roma,False
4637,1968,1969,41,"John Woolf, Producer",Oliver!,True


In [8]:
df_best_picture.to_csv("./data/best_picture.csv")
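
By default `to_csv` writes the row index as an unnamed first column, so reading the file back with no extra arguments produces an additional `Unnamed: 0` column. The sketch below shows a round trip that restores the original index instead. It uses an in-memory buffer and two toy rows in place of `./data/best_picture.csv`, and the names `toy`, `naive` and `restored` are illustrative.

```python
import io

import pandas as pd

# Two toy rows that keep their original, non-contiguous index labels
toy = pd.DataFrame(
    {"film": ["Wings", "The Broadway Melody"], "winner": [True, True]},
    index=[21, 64],
)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
toy.to_csv(buffer)

# Reading back naively turns the saved index into an "Unnamed: 0" column
buffer.seek(0)
naive = pd.read_csv(buffer)

# index_col=0 restores the first column as the index
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)

print(naive.columns.tolist())
print(restored.index.tolist())
```

If the original row labels are not needed downstream, passing `index=False` to `to_csv` is another option.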