Movie-crawlr

A web application that returns the IMDb metadata for a movie or TV series given its title. I built it over a weekend in April 2014 to try out Scrapy, Flask, and Heroku for the very first time, so it's very rough. Check out how I parsed JSON back then, if you dare.

http://movie-crawlr.herokuapp.com

References

Getting Started with Python on Heroku
Scrapy Tutorial
Flask Quickstart
Jinja Template Designer Documentation
Bootstrap CSS
Google Python Style Guide
XPath
The Open Movie Database API - used this to get IMDb's movie id for queried movies.
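
For readers curious what the id lookup involves, here is a minimal Python sketch. It is not the code in this repository. It assumes the OMDb endpoint `http://www.omdbapi.com/`, the query parameters `t` (title) and `apikey`, and a JSON response carrying `imdbID` and `Response` fields; check the OMDb documentation before relying on any of these. The function names `build_omdb_url` and `extract_imdb_id` are hypothetical, and the sample response is made up so the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; verify against the OMDb documentation.
OMDB_ENDPOINT = "http://www.omdbapi.com/"


def build_omdb_url(title, api_key):
    """Build a title-lookup URL (`t` and `apikey` are assumed parameter names)."""
    return OMDB_ENDPOINT + "?" + urlencode({"t": title, "apikey": api_key})


def extract_imdb_id(response_text):
    """Return the IMDb id from an OMDb-style JSON body, or None if the lookup failed."""
    data = json.loads(response_text)
    # Assumption: a failed lookup is signalled by "Response": "False".
    if data.get("Response") != "True":
        return None
    return data.get("imdbID")


# Made-up response body, used here instead of a real HTTP call.
sample = '{"Title": "Example Movie", "imdbID": "tt0000000", "Response": "True"}'

print(build_omdb_url("Example Movie", "YOUR_KEY"))
print(extract_imdb_id(sample))
```

Using `json.loads` and `dict.get` keeps the parsing tolerant of missing fields, which matters when a title is not found.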

Problems and bugs encountered

Could not install Scrapy in virtualenv

error: distutils.errors.DistutilsError: Setup script exited with error: command 'cc' failed with exit status 1
Solution on Stack Overflow

Accidentally adding venv to git

Fixed using git reset HEAD^

Pushing to Heroku

error: distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1
Solution on Stack Overflow

Bash script could not run on Heroku

Replaced wget with curl in get_imdb_url.sh
