Use Scrapy to crawl text reviews & images from Dianping.com and generate pretty static pages!

annieqt/Dianping-Gallery

Dianping-Gallery

Use Scrapy to crawl text reviews & images from Dianping.com and generate pretty static pages!

Features

1. Crawling text reviews & images from a specific user's Dianping account and storing them locally

Images

  • Downloaded images are stored under ../imgs/, organized into /user/shop/ subdirectories
  • You can also customize the image path by changing IMAGES_STORE in settings.py
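As a rough sketch of what that setting might look like, the fragment below sets IMAGES_STORE and shows how a /user/shop/ path could be built beneath it. It assumes Scrapy's built-in ImagesPipeline is the one in use (the project may use its own subclass), and image_path is a hypothetical helper for illustration, not a function from this repository.

```python
# settings.py (fragment): values here are illustrative assumptions
import os

# Enable Scrapy's built-in image pipeline (assumed; the project may subclass it).
ITEM_PIPELINES = {"scrapy.pipelines.images.ImagesPipeline": 1}

# Root folder for downloaded images; change this to store images elsewhere.
IMAGES_STORE = os.path.join("..", "imgs")


def image_path(user, shop, filename):
    """Hypothetical helper: build the /user/shop/ location under IMAGES_STORE."""
    return os.path.join(IMAGES_STORE, user, shop, filename)
```

Pointing IMAGES_STORE at an absolute path keeps the image location independent of the directory the spider is launched from.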

Text Reviews

  • Text reviews are exported in JSON format to review.json
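Since Scrapy's JSON feed export writes a single array of item objects, the exported file can be read back with the standard library. The sketch below is a minimal example under that assumption. The field names stored in each item are not documented here, so the "shop" key in the usage note is hypothetical, and count_by_field assumes the chosen field holds a plain string.

```python
import json
from collections import Counter


def load_reviews(path="review.json"):
    """Load the JSON array of review items exported by the spider."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_by_field(reviews, field):
    """Count reviews per value of `field`; items lacking the field are skipped."""
    return Counter(r[field] for r in reviews if field in r)
```

For example, count_by_field(load_reviews(), "shop") would tally reviews per shop, provided the items actually carry a "shop" field.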

2. Generating pretty static pages from crawled data to visualize the user's FOOD preference!

To be done.

How to use

1. Dependencies

2. Configurations

  • Set start_urls in dianping_spider.py to the URL of the review page that you want to crawl, e.g., the author's own Dianping reviews page

3. Run

Under ../Dianping-Gallery/dianping_gallery/dianping_gallery/spiders, run:
scrapy runspider dianping_spider.py -o review.json

Download progress is then shown in the terminal.
