amazon_scraper

Create a Python virtual environment with the following command in cmd (the current working directory shown in the prompt is omitted):

$ python -m venv vscrapy

After activating the virtual environment, install the requirements as follows:

$ python -m pip install --upgrade pip
$ pip install Twisted-20.3.0-cp38-cp38-win_amd64.whl
$ pip install scrapy pywin32 scrapy_proxies scrapy-user-agents

Setup is now complete. To start scraping, put all links in the urls_list.txt file, for example:

"https://www.amazon.com/s?rh=n%3A172635&fs=true&ref=lp_172635_sar"

Then run the following command:

$ scrapy crawl amazon -a pages=1 -o output.csv

To update the products, use the same CSV you used above:

$ scrapy crawl amazon_update -a

This only updates products that were already scraped, and it only checks price and stock status.

This data can later be merged with the original table to update the product prices in our database, using the product link as the key.
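The merge described above can be sketched with the standard library. This is only an illustration: the column names `link`, `price`, and `stock`, the helper `merge_updates`, and the sample rows are assumptions, not the actual columns written by the spider.

```python
import csv
import io


def merge_updates(original_rows, update_rows, key="link"):
    """Overlay price/stock values from update_rows onto original_rows, matched by key."""
    updates = {row[key]: row for row in update_rows}
    merged = []
    for row in original_rows:
        new_row = dict(row)
        update = updates.get(row[key])
        if update is not None:
            # Only the fields refreshed by the update crawl are overwritten.
            for field in ("price", "stock"):
                if field in update:
                    new_row[field] = update[field]
        merged.append(new_row)
    return merged


# In-memory stand-ins for the original table and the update CSV (hypothetical data).
original_csv = (
    "link,name,price,stock\n"
    "https://example.com/p1,Widget,10.00,in stock\n"
    "https://example.com/p2,Gadget,25.00,in stock\n"
)
update_csv = "link,price,stock\nhttps://example.com/p1,8.50,out of stock\n"

original = list(csv.DictReader(io.StringIO(original_csv)))
updates = list(csv.DictReader(io.StringIO(update_csv)))
merged = merge_updates(original, updates)
```

Rows with no matching link in the update file are left unchanged, so a partial update run does not erase existing data.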

Design of scraper

Don't scrape products that are not available from Amazon
Mark currently unavailable, temporarily out of stock, pre-order, and "available in x days" products as out of stock
Get the product name, price, discounted price, variant information, and stock status, and download the product image
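The out-of-stock rule in the list above could look roughly like the sketch below. The function name `stock_status`, the regular expressions, and the returned labels are assumptions for illustration; the exact availability wording on the page and the spider's real logic are not shown in this README.

```python
import re

# Availability phrases that the design treats as "out of stock".
# The exact wording on the page is assumed, not taken from the spider.
OUT_OF_STOCK_PATTERNS = [
    r"currently unavailable",
    r"temporarily out of stock",
    r"pre-?order",
    r"available in \d+ days?",
]
_OUT_OF_STOCK_RE = re.compile("|".join(OUT_OF_STOCK_PATTERNS), re.IGNORECASE)


def stock_status(availability_text):
    """Map the availability text of a product page to a stock label."""
    if _OUT_OF_STOCK_RE.search(availability_text or ""):
        return "out of stock"
    # Anything else, including missing text, is treated as in stock here;
    # that default is an assumption of this sketch.
    return "in stock"
```

For example, `stock_status("Available in 5 days")` and `stock_status("Currently unavailable.")` both map to the out-of-stock label, while `stock_status("In Stock.")` does not.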

Current Functions

Can follow an unlimited number of links and scrape all items from those pages, but the page layout MUST be the same as the example page
Gets most of the relevant information from the page
Uses proxies and tries to avoid Amazon bot detection using other methods
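As a rough idea of how the proxy and user-agent rotation might be wired up, the fragment below follows the settings documented by the scrapy_proxies and scrapy-user-agents packages. The project's actual settings.py is not shown here, so the retry values and the `proxies.txt` path are assumptions.

```python
# Sketch of a Scrapy settings.py fragment (values are assumptions).

# Retry often, since free proxies fail frequently.
RETRY_TIMES = 10
RETRY_HTTP_CODES = [500, 503, 504, 400, 403, 404, 408]

DOWNLOADER_MIDDLEWARES = {
    # Replace Scrapy's fixed user agent with a random one per request.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "scrapy_user_agents.middlewares.RandomUserAgentMiddleware": 400,
    # Retry failed requests, then pick a random proxy for each request.
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": 90,
    "scrapy_proxies.RandomProxy": 100,
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 110,
}

# Hypothetical path to a text file with one proxy URL per line.
PROXY_LIST = "proxies.txt"
# 0 = use a different proxy for every request.
PROXY_MODE = 0
```

Rotating both the proxy and the user agent makes consecutive requests look less alike, which is one of the simpler ways to reduce the chance of being flagged as a bot.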

haseeb5i/amazon_scraper
