Introduction to web scraping




This is a one-hour beginner's introduction to web scraping using Python. We'll work through a complete example: scraping a university website that lists course information, which yields a dataset of almost 10,000 university courses. We'll focus on the concepts involved in web scraping rather than on memorizing Python syntax.

What you'll learn

  • Why you'd want to scrape data from the web in the first place
  • A high-level view of how the web works
  • How to make an HTTP request in Python
  • How to parse HTML in Python
  • Why you need to read a website's Terms of Service before you scrape it
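To give a rough feel for the two Python steps in the list above, here is a minimal sketch that uses only the standard library. It is not the code from the workshop notebooks, which may well use other libraries. The names `fetch_html`, `LinkCollector`, and `extract_links` are hypothetical, and the sample HTML and course titles are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Make an HTTP GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []    # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return the links it contains."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Made-up HTML standing in for a page of course listings.
sample_html = """
<ul>
  <li><a href="/courses/101">Intro to Linguistics</a></li>
  <li><a href="/courses/102">Data Science Basics</a></li>
</ul>
"""

links = extract_links(sample_html)
for title, href in links:
    print(title, href)
```

For a real page, you would pass the result of `fetch_html(url)` to `extract_links` instead of the sample string, after checking that the site's Terms of Service allow scraping.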


Anyone is welcome at this workshop, whatever their level of programming experience, because we'll focus on the concepts behind web scraping more than on specific syntax. The workshop will be most useful to people who have some familiarity with Python but have never done web scraping before.


It's OK Not To Know! That's our motto at D-Lab. D-Lab is open to researchers and professionals from all disciplines and levels of experience. Ask any questions.


If you spot a problem with these materials, please open an issue describing the problem.


  • Geoff Bacon


  • Chris Hench
