gloriadai/movie-scraper


box-office-mojo-scraper

Box Office Mojo is tricky to scrape because each movie's profile page nests tables inside other tables. The Python code in this repository extracts the information on the profile page of each movie listed on the site. The figure below shows an example of the information that is pulled, for the release of Big Hero 6.

MojoLinkExtract.py

Extracts the link to each movie profile page (approximately 16,101 links) and writes them to MovieLinks.txt as comma-separated strings.
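The step described above can be sketched roughly as follows. This is not the code in MojoLinkExtract.py; it is a minimal illustration using only the standard library. The `/movies/?id=` prefix, the helper names, and the sample HTML are assumptions made for the example.

```python
from html.parser import HTMLParser


class MovieLinkParser(HTMLParser):
    """Collects href values of <a> tags that start with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep each matching partial link once, in order of appearance.
        if href and href.startswith(self.prefix) and href not in self.links:
            self.links.append(href)


def extract_movie_links(html, prefix="/movies/?id="):
    # The prefix is an assumed pattern for profile links, not confirmed by the repo.
    parser = MovieLinkParser(prefix)
    parser.feed(html)
    return parser.links


def format_links(links):
    # Comma-separated strings, matching the description of MovieLinks.txt.
    return ",".join(links)


def write_links(links, path="MovieLinks.txt"):
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(format_links(links))


# Invented sample listing page, for illustration only.
SAMPLE_LISTING = (
    '<a href="/movies/?id=example1.htm">One</a>'
    '<a href="/about/">About</a>'
    '<a href="/movies/?id=example2.htm">Two</a>'
    '<a href="/movies/?id=example1.htm">One again</a>'
)
```

Calling `write_links(extract_movie_links(page_html))` on each listing page would then produce a file in the format described for MovieLinks.txt.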

MovieLinks.txt

Comma-separated strings, one per movie, each a partial link to Box Office Mojo. Each must be prefixed with http://www.boxofficemojo.com to form a full URL.
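A minimal sketch of turning the file contents into full URLs is shown below. The function name `build_full_urls` and the example partial links are hypothetical; only the base URL comes from this README.

```python
BASE_URL = "http://www.boxofficemojo.com"


def build_full_urls(links_text, base_url=BASE_URL):
    """Split comma-separated partial links and prefix each with the base URL."""
    partials = [part.strip() for part in links_text.split(",")]
    # Skip empty entries, e.g. from a trailing comma.
    return [base_url + part for part in partials if part]


def load_full_urls(path="MovieLinks.txt"):
    with open(path, encoding="utf-8") as fh:
        return build_full_urls(fh.read())
```

For example, `build_full_urls("/movies/?id=example1.htm,/movies/?id=example2.htm")` yields two URLs that each begin with `http://www.boxofficemojo.com/movies/?id=`.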

MojoMovieData.py

Extracts the movie data shown in the example image above.
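One way to cope with nested tables is to keep only the text of cells that do not themselves contain another table. The sketch below illustrates that idea with the standard library; it is not the approach confirmed to be used in MojoMovieData.py, and the class name, function name, and sample HTML are invented for the example.

```python
from html.parser import HTMLParser


class LeafCellParser(HTMLParser):
    """Collects text from <td>/<th> cells that contain no nested <table>."""

    def __init__(self):
        super().__init__()
        self._stack = []  # one entry per currently open cell
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._stack.append({"text": [], "has_table": False})
        elif tag == "table" and self._stack:
            # The innermost open cell wraps another table, so it is not a leaf.
            self._stack[-1]["has_table"] = True

    def handle_data(self, data):
        if self._stack:
            self._stack[-1]["text"].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._stack:
            cell = self._stack.pop()
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(cell["text"]).split())
            if text and not cell["has_table"]:
                self.cells.append(text)


def extract_leaf_cells(html):
    parser = LeafCellParser()
    parser.feed(html)
    parser.close()
    return parser.cells


# Invented sample markup with placeholder values, for illustration only.
SAMPLE_PROFILE = (
    "<table><tr><td>wrapper"
    "<table><tr>"
    "<td>Genre: <b>Animation</b></td>"
    "<td>Runtime: 100 min</td>"
    "</tr></table>"
    "</td></tr></table>"
)
```

Each returned string, such as a "label: value" pair, can then be split on the first colon to build a record for the movie. Real pages may have unclosed tags, which this sketch does not handle.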
