
Amazon-Crawler

Crawl all laptop listings from Amazon and save them locally as XML files for further processing. Written in Python using Scrapy, scrapy-user-agents, and scrapy-rotating-proxies.

Install Packages:

Scrapy:

Install Scrapy using the following command: pip install Scrapy

See the documentation here: https://doc.scrapy.org/

scrapy-user-agents:

Install the package using the following command: pip install scrapy-user-agents

See the documentation here: https://github.com/hyan15/crawler-demo/tree/master/crawling-basic/scrapy_user_agents
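
For reference, scrapy-user-agents is enabled through the project's settings.py. The snippet below follows the package's own documentation (disable Scrapy's built-in user-agent middleware and register the random one); this repo already contains such a configuration, and its exact priorities may differ:

```python
# settings.py -- user-agent rotation, as documented by scrapy-user-agents.
DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's default UserAgentMiddleware ...
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    # ... and let scrapy-user-agents pick a random user agent per request.
    'scrapy_user_agents.middlewares.RandomUserAgentMiddleware': 400,
}
```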

scrapy-rotating-proxies:

Install the package using the following command: pip install scrapy-rotating-proxies

See the documentation here: https://github.com/TeamHG-Memex/scrapy-rotating-proxies
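
Similarly, proxy rotation is configured in settings.py. This sketch follows the scrapy-rotating-proxies documentation; the proxy addresses are placeholders, not real endpoints, and the repo's own settings may differ:

```python
# settings.py -- proxy rotation, as documented by scrapy-rotating-proxies.
# The proxies below are placeholders; supply your own list.
ROTATING_PROXY_LIST = [
    'proxy1.example.com:8000',
    'proxy2.example.com:8031',
]
DOWNLOADER_MIDDLEWARES = {
    'rotating_proxies.middlewares.RotatingProxyMiddleware': 610,
    'rotating_proxies.middlewares.BanDetectionMiddleware': 620,
}
```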

Run the spider:

Navigate to Amazon-Crawler/crawler/crawler, then run the command: scrapy crawl amazon
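
Alternatively, the spider can be launched from a Python script using Scrapy's CrawlerProcess. A minimal sketch (the filename run_spider.py is just an example; run it from the project directory so Scrapy can find scrapy.cfg):

```python
# run_spider.py -- optional alternative to `scrapy crawl amazon`.
# Must be run from inside the Scrapy project directory.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())
process.crawl("amazon")  # spider name as registered in spiders/amazon.py
process.start()          # blocks until the crawl finishes
```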

Result:

After the spider finishes, you can find the XML files inside Amazon-Crawler/crawler/crawler/product-xml-files and the links of the crawled products in Amazon-Crawler/crawler/crawler/laptops.txt
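
Since the point of saving XML is further processing, here is a minimal sketch of iterating over the saved files. The directory path is taken from this README; the element structure inside each file depends on what the spider writes, so only the root element is inspected here:

```python
# Minimal sketch: walk the saved product XML files for further processing.
# Only the root element is touched, since the per-file schema is spider-defined.
import xml.etree.ElementTree as ET
from pathlib import Path

xml_dir = Path("Amazon-Crawler/crawler/crawler/product-xml-files")
for xml_file in sorted(xml_dir.glob("*.xml")):
    root = ET.parse(xml_file).getroot()
    print(f"{xml_file.name}: root element <{root.tag}> with {len(root)} children")
```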

Scrape other products:

To crawl a different department, paste its link into the start_urls list in Amazon-Crawler/crawler/crawler/spiders/amazon.py, as sketched below.
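
The edit looks roughly like this. Apart from the spider name and start_urls, which the README describes, the class name and body here are assumptions, and the URL is a placeholder for whatever department page you choose:

```python
# spiders/amazon.py -- illustrative sketch; the real spider in this repo
# contains parsing logic that is omitted here. Only start_urls needs editing.
import scrapy

class AmazonSpider(scrapy.Spider):
    name = "amazon"  # the name used with `scrapy crawl amazon`
    start_urls = [
        # Placeholder: replace with the Amazon department page you want to crawl.
        "https://www.amazon.com/s?i=your-department-here",
    ]

    def parse(self, response):
        ...  # product extraction and XML writing live here in the actual spider
```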
