A webscraper for the movie websites boxofficemojo.com and the-numbers.com


Overview

The boxoffice package scrapes movie data: daily box office results and top-grossing movies. It gets this information from the webpages of either https://www.boxofficemojo.com or https://www.the-numbers.com/.

Installation

To install the released version of this package from CRAN, run:
install.packages("boxoffice")


# The development version is available on GitHub.
# install.packages("devtools")
devtools::install_github("jacobkap/boxoffice")

Usage

The boxoffice() function gets daily box office information. In essence, it shows how well each movie performed on a given day.

The data it returns are the following:

  1. Movie name
  2. The distributor (studio) of the movie
  3. The daily gross
  4. Daily percent change in gross
  5. Number of theaters it is playing in
  6. Average gross per theater (result of 3 / result of 5)
  7. Gross-to-date
  8. How many days the movie has been in theaters
  9. The date of the data
movies <- boxoffice::boxoffice(date = as.Date("2015-10-31"))
head(movies)
##                   movie      distributor   gross percent_change theaters
## 1           The Martian 20th Century Fox 4564809             31     3218
## 2       Bridge of Spies      Walt Disney 3588796             45     2873
## 3            Goosebumps    Sony Pictures 3326075              9     3618
## 4 The Last Witch Hunter        Lionsgate 2023321             36     3082
## 5  Hotel Transylvania 2    Sony Pictures 1905762              7     2962
## 6                 Burnt    Weinstein Co. 1733927             -5     3003
##   per_theater total_gross days       date
## 1        1419   179446657   30 2015-10-31
## 2        1249    43200132   16 2015-10-31
## 3         919    53277832   16 2015-10-31
## 4         656    17377961    9 2015-10-31
## 5         643   153858782   37 2015-10-31
## 6         577     3563747    2 2015-10-31
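As a quick sanity check on how the per_theater column relates to the others, the sketch below retypes the six rows shown above into a data frame by hand (it does not call the package or touch the network) and recomputes the average as daily gross divided by number of theaters. The column name per_theater_check is introduced here only for illustration, and rounding to the nearest whole dollar is an assumption based on the printed values.

```r
# First six rows of the 2015-10-31 output shown above, retyped by hand.
# No scraping happens here, so this runs without the package installed.
movies <- data.frame(
  movie = c("The Martian", "Bridge of Spies", "Goosebumps",
            "The Last Witch Hunter", "Hotel Transylvania 2", "Burnt"),
  gross = c(4564809, 3588796, 3326075, 2023321, 1905762, 1733927),
  theaters = c(3218, 2873, 3618, 3082, 2962, 3003),
  per_theater = c(1419, 1249, 919, 656, 643, 577),
  stringsAsFactors = FALSE
)

# Recompute the average: daily gross divided by number of theaters,
# rounded to the nearest dollar (assumed rounding rule).
movies$per_theater_check <- round(movies$gross / movies$theaters)

# Compare the recomputed values with the reported column.
all(movies$per_theater_check == movies$per_theater)
```

For these six rows the recomputed values agree with the reported per_theater column.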

The top_grossing() function gets the following data for each top-grossing movie:

  1. Movie name
  2. Year released
  3. Total domestic (American market) sales
  4. Total international sales
  5. Total sales (domestic + international)
movies <- boxoffice::top_grossing()
## Please note that these numbers are not adjusted for inflation.
head(movies)
##   rank                                movie year_released
## 2    1 Star Wars Ep. VII: The Force Awakens          2015
## 3    2                               Avatar          2009
## 4    3                        Black Panther          2018
## 5    4               Avengers: Infinity War          2018
## 6    5                              Titanic          1997
## 7    6                       Jurassic World          2015
##   american_box_office international_box_office total_box_office
## 2           936662225               1116648995       2053311220
## 3           760507625               2015837654       2776345279
## 4           700059566                648198658       1348258224
## 5           678815482               1369318718       2048134200
## 6           659363944               1548844451       2208208395
## 7           652270625                996584239       1648854864
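The relationship between the three sales columns can be checked in the same way. The sketch below retypes the six rows shown above by hand (again without calling the package) and confirms that the total is the sum of the domestic and international figures. The domestic_share column is a hypothetical addition for illustration and is not part of the package output.

```r
# First six rows of the top_grossing() output shown above, retyped by hand.
# The values are stored as doubles, which matters because some totals
# exceed the range of R's 32-bit integers.
top <- data.frame(
  movie = c("Star Wars Ep. VII: The Force Awakens", "Avatar",
            "Black Panther", "Avengers: Infinity War", "Titanic",
            "Jurassic World"),
  american_box_office = c(936662225, 760507625, 700059566,
                          678815482, 659363944, 652270625),
  international_box_office = c(1116648995, 2015837654, 648198658,
                               1369318718, 1548844451, 996584239),
  total_box_office = c(2053311220, 2776345279, 1348258224,
                       2048134200, 2208208395, 1648854864),
  stringsAsFactors = FALSE
)

# Total sales should equal domestic plus international sales.
all(top$american_box_office + top$international_box_office ==
      top$total_box_office)

# Hypothetical derived column: domestic share of total sales, in percent.
top$domestic_share <- round(100 * top$american_box_office /
                              top$total_box_office, 1)

# Show the movies ordered by domestic share, highest first.
top[order(-top$domestic_share), c("movie", "domestic_share")]
```

As the package's own message notes, none of these figures are adjusted for inflation, so a share computed this way only compares markets within a single movie.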