Mertzenich/swidden

Swidden: Amazon Search Result Parser

A work-in-progress utility for parsing Amazon search results. I am incrementally targeting different kinds of search results and adding more functionality. Note that although this project offers naive scraping via Etaoin, a Clojure implementation of the WebDriver protocol, its primary focus is parsing page HTML. Over time, I plan to expose a clean interface so that a scraping solution can easily access the parsing functionality.

The parser currently handles the checked items below; unchecked items are planned features:

  • ☒ Result titles
  • ☒ Result authors
  • ☐ Author URLs
  • ☒ Check if result is sponsored
  • ☒ Return product URL
  • ☒ First book format listing
  • ☐ Handle multiple book format listings
  • ☒ Actual price
  • ☐ List price
  • ☐ Alternative offers
  • ☐ Image extraction
  • ☐ Stars
  • ☐ Review count
  • ☐ Delivery details
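
To make the split between scraping and parsing concrete, here is a minimal sketch of the parsing half: it takes already-fetched HTML and returns one record per result, covering a few of the fields listed above. It is written in Python with only the standard library and is not Swidden's Clojure code or its interface. The markup, the class names, the `data-sponsored` attribute, and the names `ResultParser` and `parse_results` are all made up for illustration; real Amazon result pages are structured very differently.

```python
from html.parser import HTMLParser

# Made-up markup for illustration only; not real Amazon HTML.
SAMPLE_HTML = """
<div class="result" data-sponsored="true">
  <a class="title" href="/dp/EXAMPLE1">An Example Book</a>
  <span class="author">Jane Doe</span>
  <span class="format">Paperback</span>
  <span class="price">$12.99</span>
</div>
<div class="result">
  <a class="title" href="/dp/EXAMPLE2">Another Example</a>
  <span class="author">John Roe</span>
  <span class="format">Kindle</span>
  <span class="price">$4.50</span>
</div>
"""


class ResultParser(HTMLParser):
    """Hypothetical parser: collects one dict per <div class="result">."""

    FIELDS = {"title", "author", "format", "price"}

    def __init__(self):
        super().__init__()
        self.results = []
        self._field = None  # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class")
        if tag == "div" and cls == "result":
            # Start a new record and note whether it is marked as sponsored.
            self.results.append(
                {"sponsored": attrs.get("data-sponsored") == "true"}
            )
        elif cls in self.FIELDS and self.results:
            self._field = cls
            if cls == "title":
                # The title link also carries the product URL.
                self.results[-1]["url"] = attrs.get("href")

    def handle_data(self, data):
        # Store non-blank text for the field opened by the last start tag.
        if self._field and data.strip():
            self.results[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        self._field = None


def parse_results(html):
    """Return a list of result dicts parsed from an HTML string."""
    parser = ResultParser()
    parser.feed(html)
    return parser.results


results = parse_results(SAMPLE_HTML)
```

Because `parse_results` only needs a string of HTML, any scraper (a WebDriver session, a saved file, a test fixture) could supply its input, which is the kind of separation the project description aims for.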

Demonstration

demo.gif
