Web Scraping with Python by Robley Gori for PyCONKE 17
This is the demo project for the talk on Web Scraping using Python for PyCon KE held at USIU.
The presentation slides are here
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
The requirements are listed in the requirements.txt file.
Installing
Clone the repository and install the requirements in a virtual environment
cd PyCONKE-WebScraping
virtualenv --python=python3 pycondemo
. pycondemo/bin/activate
pip install -r requirements.txt
Running
Run the sample scraper with the following command:
python demo.py
Scraping Resources
Guides
- Beginner’s guide to Web Scraping in Python (using BeautifulSoup)
- Introduction to Web Scraping using Selenium
- 10 Web Scraping Tools to Extract Online Data
- 5 Tasty Python Web Scraping Libraries
- Webscraping with Selenium
Tools
- Requests - Requests is the only Non-GMO HTTP library for Python, safe for human consumption.
- Beautiful Soup 4 - Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping.
- Scrapy - An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
- Selenium - Selenium is a tool that automates web browsers through its WebDriver API.
- Lxml - Lxml is a high-performance, production-quality HTML and XML parsing library.
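The parsing tools above all do the same basic job: take downloaded HTML and pull out the pieces you care about. As a rough, dependency-free sketch of that step, the example below uses Python's standard-library html.parser instead of any of the listed tools. It is not the code in demo.py. The sample HTML and the names LinkCollector and extract_links are made up for illustration, and Beautiful Soup or lxml would typically do the same job with less code.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<html><body>
  <h1>Talks</h1>
  <a href="/talks/scraping">Web Scraping</a>
  <a href="/talks/testing">Testing</a>
  <a>No link here</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if it has one
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside an <a> that has an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

In a real scraper, the string passed to extract_links would come from an HTTP response body (for example, one fetched with Requests) rather than from a hard-coded sample.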
Authors
- Robley Gori
See also the list of contributors who participated in this project.