Skip to content

This repository contains example for a web scraping tutorial integrating Selenium with Scrapy.

License

Notifications You must be signed in to change notification settings

karthikn2789/Selenium---Scrapy-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Web Scraping Project Using Selenium & Scrapy

The project contains 2 examples: a web scraping example written in Python to demonstrate web scraping combining Selenium with Scrapy and a project comparing the performance of Scrapy and Selenium. In the openaq project, PM2.5 values from https://openaq.org are extracted and stored in a JSON file using Selenium.

To run the project, Scrapy, Selenium and a webdriver needs to be installed. Scrapy can be installed either through anaconda or pip.

conda install -c conda-forge scrapy

or

pip install Scrapy

For installing on other OS and any other installation queries, please click here.

Selenium can be installed using the following command.

pip install selenium

Webdriver for 5 major browsers are supported by Selenium. Chromedriver for Chrome browser can be installed using the following commands.

wget https://chromedriver.storage.googleapis.com/83.0.4103.39/chromedriver_linux64.zip

unzip chromedriver_linux64.zip

sudo mv chromedriver /usr/local/bin/

Geckodriver for Firefox can be installed with the following command.

sudo apt install firefox-geckodriver

Commands to run example are provided in a README.md files inside the project openaq.

A performance comparison between Scrapy and Selenium can be found under performance_comparison.

About

This repository contains example for a web scraping tutorial integrating Selenium with Scrapy.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages