maxcohen31/Amazon-book-scraper

Amazon-book-scraper

An Amazon book spider built with the Scrapy framework.

Run Locally

Clone the project

  git clone https://github.com/maxcohen31/Amazon-book-scraper.git

Directory Structure

amazonscraper/
|-- amazonscraper
|   |-- __init__.py
|   |-- items.py
|   |-- middlewares.py
|   |-- MLBooks
|   |   `-- ml_books.csv
|   |-- pipelines.py
|   |-- __pycache__
|   |   |-- __init__.cpython-39.pyc
|   |   `-- settings.cpython-39.pyc
|   |-- settings.py
|   `-- spiders
|       |-- amazon_book_scraper.py
|       |-- __init__.py
|       `-- __pycache__
|           |-- amazon_book_scraper.cpython-39.pyc
|           `-- __init__.cpython-39.pyc
`-- scrapy.cfg
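The tree above includes `MLBooks/ml_books.csv`, which presumably holds scraped output. A minimal sketch for inspecting that file with the standard library follows. The path relative to the repository root is inferred from the tree, and the column names are not documented here, so the code reads whatever header row the file has. `load_rows` is a hypothetical helper, not part of the project.

```python
import csv
from pathlib import Path

# Assumed location relative to the repository root, inferred from the tree above.
CSV_PATH = Path("amazonscraper/amazonscraper/MLBooks/ml_books.csv")


def load_rows(path):
    """Return the CSV rows as a list of dicts keyed by the file's header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    if CSV_PATH.exists():
        rows = load_rows(CSV_PATH)
        print(f"{len(rows)} rows")
        if rows:
            # Column names come from the file itself; none are assumed here.
            print("columns:", list(rows[0].keys()))
    else:
        print(f"{CSV_PATH} not found; run this from the repository root")
```

Every value comes back as a string, so numeric fields such as prices would need converting before any arithmetic.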

Set up a virtual environment

virtualenv amazonscraper ; source amazonscraper/bin/activate
pip install scrapy

Go to the project directory

  cd Amazon-book-scraper
  cd amazonscraper/amazonscraper/spiders/

Run the crawler

  python3 amazon_book_scraper.py
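The steps above can also be scripted. The following is a rough sketch, not part of the project. It swaps `virtualenv` for the standard-library `venv` module, uses a hypothetical `.venv` directory name, and assumes a POSIX layout (`bin/python`) plus `git` on the `PATH`. `build_steps` and `main` are illustrative names.

```python
import subprocess
import sys
from pathlib import Path

REPO_URL = "https://github.com/maxcohen31/Amazon-book-scraper.git"
REPO_DIR = Path("Amazon-book-scraper")
SPIDER_DIR = REPO_DIR / "amazonscraper" / "amazonscraper" / "spiders"
VENV_DIR = Path(".venv")  # hypothetical name, not taken from the README


def build_steps(python=sys.executable):
    """Return (command, working_directory) pairs mirroring the README steps."""
    # Absolute path, because the last step runs from a different directory.
    # POSIX layout assumed; on Windows the interpreter lives under Scripts\.
    venv_python = str((VENV_DIR / "bin" / "python").absolute())
    return [
        (["git", "clone", REPO_URL], None),
        ([python, "-m", "venv", str(VENV_DIR)], None),
        ([venv_python, "-m", "pip", "install", "scrapy"], None),
        ([venv_python, "amazon_book_scraper.py"], SPIDER_DIR),
    ]


def main():
    for command, cwd in build_steps():
        # check=True raises CalledProcessError and stops at the first failure.
        subprocess.run(command, cwd=cwd, check=True)

# Call main() to run the steps; nothing executes on import.
```

Calling the virtual environment's interpreter directly avoids the `source .../activate` step, which only affects an interactive shell.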
