Meet Arthur.

This is Arthur.

At its core, this is a web scraper for IMDB, written in Python with lxml. It begins at a person's page, opens every link in their filmography, and captures contextual data about each movie -- the name, year of release, poster, a truncated description, and genres -- then opens the link to the full cast and crew, grabs that too, and dumps it all into a database.
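A minimal sketch of the first step, pulling movie links out of a filmography page with lxml. This is an illustration, not the code in this repository: the sample markup, the `filmo-row` and `year` class names, and the `parse_filmography` helper are all assumptions made up for the example, and real IMDB pages are structured differently and change over time.

```python
from lxml import html

# Made-up markup standing in for a filmography page; real IMDB HTML differs.
SAMPLE_FILMOGRAPHY = """
<html><body>
  <div class="filmo-row">
    <b><a href="/title/tt0000001/">First Picture</a></b>
    <span class="year">1950</span>
  </div>
  <div class="filmo-row">
    <b><a href="/title/tt0000002/">Second Picture</a></b>
    <span class="year">1952</span>
  </div>
</body></html>
"""


def parse_filmography(page_source, base_url="https://www.imdb.com"):
    """Return one dict per movie row found in the page source.

    The XPath expressions match the hypothetical sample markup above.
    """
    tree = html.fromstring(page_source)
    movies = []
    for row in tree.xpath('//div[@class="filmo-row"]'):
        links = row.xpath(".//a")
        if not links:
            continue  # skip rows that carry no movie link
        link = links[0]
        # string(...) returns the text of the first matching node, or "".
        year = row.xpath('string(.//span[@class="year"])').strip()
        movies.append({
            "title": link.text_content().strip(),
            "url": base_url + link.get("href"),
            "year": year,
        })
    return movies


movies = parse_filmography(SAMPLE_FILMOGRAPHY)
for movie in movies:
    print(movie["year"], movie["title"], movie["url"])
```

In a full scraper, each `url` would be fetched in turn to collect the poster, description, genres, and the full cast and crew before anything is written to the database.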

On the front end, I wanted to show relationships over time between a specific person and the cast and crew members who, according to IMDB, he worked with.

With whom did he work most frequently? On what movies?
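Once the credits are stored, those two questions come down to counting. The sketch below assumes the scraped data can be read back as `(title, year, names)` records. That shape, the placeholder names, and the `collaborator_summary` helper are hypothetical and are not taken from this project's database schema.

```python
from collections import Counter, defaultdict

# Hypothetical records: (movie title, release year, credited names).
credits = [
    ("First Picture", 1950, ["Person A", "Person B"]),
    ("Second Picture", 1952, ["Person A", "Person C"]),
    ("Third Picture", 1955, ["Person A", "Person B"]),
]


def collaborator_summary(credits):
    """Count shared movies per collaborator and list them in year order."""
    counts = Counter()
    shared = defaultdict(list)
    for title, year, names in sorted(credits, key=lambda c: c[1]):
        for name in set(names):  # ignore duplicate credits on one movie
            counts[name] += 1
            shared[name].append((year, title))
    return counts, shared


counts, shared = collaborator_summary(credits)
for name, total in counts.most_common():
    print(name, total, shared[name])
```

Sorting by year before grouping keeps each collaborator's list chronological, which is what a relationships-over-time view needs.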

The gender pronoun there is intentional. This was written with someone in mind.

That's why this is called Arthur. Arthur Piantadosi was my grandfather, although I called him Doe. Just so happens he was a sound man.

IMDB's records are limited. Movie credits used to list only the sound department or its department head rather than the individual crew members. And let's not get started on how the Academy handled awards.

This project could be updated later should I get access to, say, old studio payroll records showing what my grandfather worked on that isn't reflected here. In the meantime, this is what I can do.