In this project, I try to answer whether movies have become longer in recent years (since 2010). To test this hypothesis, I used the dataset "title.basics.tsv.gz", downloaded from IMDb Datasets (https://www.imdb.com/interfaces/).
Subsets of IMDb data are available to customers for personal and non-commercial use.
The dataset files can be accessed and downloaded from https://datasets.imdbws.com/. The data is refreshed daily.
title.basics.tsv.gz - Contains the following information for titles:
tconst (string) – alphanumeric unique identifier of the title
titleType (string) – the type/format of the title (e.g. movie, short, tvseries, tvepisode, video, etc)
primaryTitle (string) – the more popular title / the title used by the filmmakers on promotional materials at the point of release
originalTitle (string) – original title, in the original language
isAdult (boolean) – 0: non-adult title; 1: adult title
startYear (YYYY) – represents the release year of a title. In the case of TV Series, it is the series start year
endYear (YYYY) – TV Series end year. ‘\N’ for all other title types
runtimeMinutes – primary runtime of the title, in minutes
genres (string array) – includes up to three genres associated with the title
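The fields above can be read with Python's standard library alone. The following is a minimal sketch, not the code used in this project: the function names `read_movies` and `read_movies_from_gzip` are hypothetical, and the sample rows are made up purely to show the file layout. It keeps only rows whose titleType is "movie" and skips rows where startYear or runtimeMinutes is missing (marked '\N').

```python
import csv
import gzip
import io

MISSING = r"\N"  # marker used in the file for a missing value


def read_movies(lines):
    """Yield (start_year, runtime_minutes) for movies that have both values."""
    # QUOTE_NONE keeps quote characters inside titles as ordinary text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["titleType"] != "movie":
            continue
        year, runtime = row["startYear"], row["runtimeMinutes"]
        if year == MISSING or runtime == MISSING:
            continue
        if not (year.isdigit() and runtime.isdigit()):
            continue  # skip malformed values rather than guess
        yield int(year), int(runtime)


def read_movies_from_gzip(path):
    """Read the compressed file at `path` (e.g. a local title.basics.tsv.gz)."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(read_movies(handle))


# Made-up rows for illustration only; they are not real IMDb records.
HEADER = ["tconst", "titleType", "primaryTitle", "originalTitle", "isAdult",
          "startYear", "endYear", "runtimeMinutes", "genres"]
ROWS = [
    ["demo1", "short", "Example Short", "Example Short", "0", "1894", MISSING, "1", "Short"],
    ["demo2", "movie", "Example Film A", "Example Film A", "0", "1985", MISSING, "98", "Drama"],
    ["demo3", "movie", "Example Film B", "Example Film B", "0", "2015", MISSING, "121", "Action,Drama"],
    ["demo4", "movie", "Example Film C", "Example Film C", "0", "2018", MISSING, MISSING, "Comedy"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)

movies = list(read_movies(io.StringIO(SAMPLE)))
print(movies)  # -> [(1985, 98), (2015, 121)]
```

For the real file, `read_movies_from_gzip("title.basics.tsv.gz")` would return the same kind of list, assuming the file sits in the working directory.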
Why it feels like movies are getting longer
Are movies really getting longer? Were films in the 1980s really shorter? We test the data...
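One way to test whether movies released since 2010 run longer than earlier ones is a one-sided permutation test on the difference in mean runtime. This is a sketch under assumptions: the text above does not say which statistical test was used, the function name `permutation_test` is hypothetical, and the runtimes below are invented numbers, not results from the dataset.

```python
import random


def permutation_test(before, after, n_permutations=2000, seed=0):
    """Return (observed_difference, p_value) for mean(after) - mean(before).

    The p-value estimates how often a random relabelling of the pooled
    runtimes gives a difference at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = sum(after) / len(after) - sum(before) / len(before)
    pooled = list(before) + list(after)
    n_after = len(after)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        fake_after, fake_before = pooled[:n_after], pooled[n_after:]
        diff = sum(fake_after) / len(fake_after) - sum(fake_before) / len(fake_before)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Invented runtimes in minutes, for illustration only.
before_2010 = [90, 92, 88, 91, 89, 90]
since_2010 = [110, 112, 108, 111, 109, 110]

observed, p_value = permutation_test(before_2010, since_2010)
print(f"difference in means: {observed:.1f} min, p = {p_value:.4f}")
```

A small p-value would suggest the increase is unlikely to be due to chance alone. A permutation test is chosen here because it makes no assumption about the shape of the runtime distribution; a t-test or a rank-based test would be reasonable alternatives.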