
A spider for Icone

This spider loops through all the product items, using the site's filters, and extracts the following information for each product:

  • name
  • price
  • description
  • images_urls
  • images

To get the results, run:

  scrapy crawl icone -o items.json -t json

Then check the items.json file.
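
A quick way to check the export is to load it with Python's standard `json` module. This is only a minimal sketch: the helper name `summarize_items` is hypothetical and not part of this repository, and it assumes the feed is a single JSON array of objects that each may carry a `name` key, as listed above.

```python
import json


def summarize_items(path="items.json"):
    """Load the exported feed and return (item_count, names).

    Hypothetical helper for inspection only; not part of the spider.
    """
    with open(path, encoding="utf-8") as handle:
        # Assumption: the export is one JSON array of item objects.
        items = json.load(handle)
    # Items without a "name" key are reported as None instead of failing.
    names = [item.get("name") for item in items]
    return len(items), names
```

Calling `summarize_items()` from the directory that contains `items.json` returns the number of scraped items together with their names, which is usually enough to spot an empty or truncated crawl.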

TODO

  • Refactoring

DONE

  • Clean up the strings
  • Clean description: Now it is a list [, , , ]
  • Pagination
  • Scrape the other urls
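
The string cleanup listed above could look roughly like the sketch below. It is an illustration, not the code used by the spider: the names `clean_text` and `clean_description` are hypothetical, and it assumes the description is kept as a list of text fragments.

```python
import re


def clean_text(value):
    """Collapse runs of whitespace and trim both ends of a scraped string."""
    return re.sub(r"\s+", " ", value).strip()


def clean_description(parts):
    """Clean each fragment of a description list and drop the empty ones.

    Assumption: the scraped description arrives as a list of strings.
    """
    cleaned = (clean_text(part) for part in parts)
    return [part for part in cleaned if part]
```

Keeping the two helpers separate lets the same `clean_text` be reused for the name and price fields.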

Installation

  1. virtualenv --no-site-packages env
  2. source env/bin/activate
  3. pip install scrapy
  4. pip install pil

Useful commands

  • Check the list of spiders
  scrapy list
  • Save the scraped data into a json file
  scrapy crawl icone -o items.json -t json
