This is the demo project for the talk on Web Scraping using Python for PyCon KE held at USIU.
The presentation slides are here
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
The requirements are listed in the requirements.txt file.
Clone the repository and install the requirements in a virtual environment
cd PyCONKE-WebScraping
virtualenv --python=python3 pycondemo
. pycondemo/bin/activate
pip install -r requirements.txt
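Once the install finishes, it can help to confirm that the scraping libraries are importable from inside the virtual environment. The following is a minimal sketch, not part of the project: the helper name `missing_modules` is hypothetical, and the module names are an assumption based on the libraries this README lists, since the actual contents of requirements.txt are not shown here.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the libraries listed in this README;
# adjust them to match what requirements.txt actually installs.
ASSUMED_MODULES = ["requests", "bs4", "scrapy", "selenium", "lxml"]

if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All assumed modules are importable.")
```

If any name is reported as not importable, check that the virtual environment is activated before running pip again.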
Run the sample scraper with the following command:
python demo.py
- Beginner’s guide to Web Scraping in Python (using BeautifulSoup)
- Introduction to Web Scraping using Selenium
- 10 Web Scraping Tools to Extract Online Data
- 5 Tasty Python Web Scraping Libraries
- Webscraping with Selenium
- Requests - Requests is the only Non-GMO HTTP library for Python, safe for human consumption.
- Beautiful Soup 4 - Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping.
- Scrapy - An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
- Selenium - Selenium is a tool that automates browsers through its WebDriver API.
- Lxml - Lxml is a high-performance, production-quality HTML and XML parsing library.
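To illustrate the parsing step that the libraries above take care of, here is a minimal sketch of link extraction. It deliberately uses only the standard library's `html.parser` as a dependency-free stand-in, not the project's own code; the names `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are hypothetical. With Beautiful Soup, roughly the same result would come from calling `find_all("a")` on the parsed document and reading each tag's `href`.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Inline sample markup, so the sketch runs without any network access.
SAMPLE_HTML = """
<html><body>
  <a href="https://example.com/talks">Talks</a>
  <a name="no-href">Skipped</a>
  <a href="/speakers">Speakers</a>
</body></html>
"""

print(extract_links(SAMPLE_HTML))
# → ['https://example.com/talks', '/speakers']
```

In a real scraper the HTML string would come from an HTTP response, for example the `text` attribute of a Requests response, rather than from an inline sample.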
- Robley Gori
See also the list of contributors who participated in this project.