acoffman/dvd_scrape
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This is part of a larger project that I hope to complete someday. As it stands, this script will pull a ZIP file from the web and unzip it. It then parses the CSV file inside into a mysql database using ActiveRecord. The script then cycles through the database and scrapes Amazon for cover art for each DVD. The script "works" in the sense that it does everything that it is supposed to and won't crash. It is not optimized, tested very thouroughly or refactored much at all. When I have some additional time, all of those things will get done, and I plan on writing some RSpec tests for it as well (I know thats out of order!) The script is also very slow, it will take hours to run the large CSV file (about 400,000 values) The download, unzip, parsing, and database creation are fairly fast, taking about 7 minutes for the whole process on the large file. Fetching the artwork for the DVDs is the bottleneck, taking several seconds per DVD.
About
Fetches a CSV file containing DVD information from the web, parses it, saves it to a database, and scrapes the web for cover art.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published