Course materials for Hacks/Hackers scraping class. (Modified from class originally taught via Mizzou/IRE.)

enriquemanuel/scraping-class

Building a web scraper

Welcome to the Hacks/Hackers course on building a web scraper. This class is based on a class originally taught and hosted by Mizzou and IRE (please follow the fork to see the original materials).

Although the stated goal of this course is to introduce the concepts of web scraping, we will also spend time covering programming fundamentals that can be applied to other problems, from data analysis to web development.

Course logistics

This course was originally taught over a couple of days. We are teaching a condensed version, split up as follows:

  • 30 minutes for setup
  • 1 hour for a walkthrough of the script
  • 30 minutes for wrap-up and future applications

Software requirements

This course will be taught primarily using the Python programming language. In addition, we'll be using two open source Python modules that greatly simplify the web scraping process -- BeautifulSoup, which makes it easy to parse and sort through HTML files; and mechanize, which allows you to emulate a web browser from within your Python programs.
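To give a feel for how the two modules divide the work, here is a minimal sketch rather than the course's actual script. It assumes BeautifulSoup 4 (imported as `bs4`). The sample HTML and the helper names `fetch_html` and `extract_rows` are hypothetical and chosen only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; not taken from the course materials.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Alice</td><td>42</td></tr>
  <tr><td>Bob</td><td>17</td></tr>
</table>
"""


def fetch_html(url):
    """Download a page with mechanize (defined for reference, not called here)."""
    import mechanize  # imported lazily so the parsing demo runs without it

    browser = mechanize.Browser()
    response = browser.open(url)
    return response.read()


def extract_rows(html):
    """Return the text of every <td> cell, grouped by table row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows contain only <th> cells, so they are skipped
            rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

In a real scraper, the string passed to `extract_rows` would come from something like `fetch_html`; parsing an inline string here keeps the sketch runnable without a network connection.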

You will also need a code editor for writing and editing code. If you don't already have one, we recommend trying Sublime Text.

In addition to Python, we'll be using the Chrome web browser. Although it isn't required, we also recommend installing git, a version control tool, so you can download the course materials after you leave.

No worries if you don't have this software installed already. We'll help you set everything up.
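If you would like to check your setup ahead of time, a short script along these lines can report which modules Python can find. This is only a sketch: the import names `bs4` (BeautifulSoup 4) and `mechanize` are assumptions about how the packages were installed, and `check_modules` is a hypothetical helper, not part of the course materials.

```python
import importlib.util
import sys

# Assumed import names for the two modules used in the course.
REQUIRED_MODULES = ["bs4", "mechanize"]


def check_modules(names):
    """Map each module name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    for name, installed in check_modules(REQUIRED_MODULES).items():
        print(name, "OK" if installed else "missing")
```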

Contact

The modified version of this course was taught by:

The original course was created and taught by:
