PyCon Introduction to Web and Data Scraping Tutorial

A tutorial-based introduction to web scraping with Python.

Virtual Env

If you'd like to use a virtual environment, follow the instructions below. It is not required for the tutorial, but it may be helpful.

For more details on virtual environments

If you don't have virtualenvwrapper and/or pip:

$ easy_install pip
$ pip install virtualenvwrapper

and read the additional instructions here

$ mkvirtualenv scraper_tutorial
$ pip install -r requirements.txt
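After running the commands above, it can be handy to confirm that Python is actually running inside the new environment. The following is a minimal sketch, not part of the tutorial files; the helper name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("virtual environment active:", in_virtualenv())
```

If this prints False right after mkvirtualenv, the environment is probably not activated in the current shell.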

LXML and Selenium

You will need both LXML and Selenium to follow this tutorial in its entirety.

If you are using a Mac, I highly recommend using Homebrew. It will make the pip install step much easier.

If you are using Windows, it may be worth running this tutorial inside a Linux virtual machine. If you are a Windows + Python guru, please follow these installation instructions. I can help as needed, but I have not programmed on Windows in more than 5 years.

Please reach out to me if you have any questions on getting the initial requirements set up. Thanks!
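A quick way to check whether both requirements are importable before the tutorial starts is sketched below. It is only a sanity check under the assumption that the packages are importable as `lxml` and `selenium`; the helper name `missing_modules` is hypothetical and not part of this repository.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the two requirements named above.
REQUIRED = ["lxml", "selenium"]

missing = missing_modules(REQUIRED)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("lxml and selenium are both importable.")
```

Anything reported as missing should be installable with pip install -r requirements.txt.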

Firefox Web Browser

Firefox comes as the default web driver for Selenium. To use Selenium easily, please download and install Firefox.
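Once Firefox and Selenium are installed, a short smoke test along these lines can show whether Selenium is able to drive the browser. This is a sketch under assumptions: the function name `fetch_title` and the example URL are illustrative only, and recent Selenium releases also need a geckodriver binary available, which this snippet does not set up.

```python
def fetch_title(url):
    """Open url in Firefox via Selenium and return the page title."""
    # Imported inside the function so that merely defining it does not
    # require Selenium to be installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return driver.title
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


# Example call (opens a real browser window, so it is left commented out):
# print(fetch_title("https://www.python.org"))
```

If the call raises an error about the driver or the browser binary, the Firefox or geckodriver installation is the first thing to check.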

Using PIP

If you have never used pip before, you will need to run sudo easy_install pip or, with Homebrew, brew install python (Homebrew's Python includes pip; there is no separate pip formula). pip is a Python package manager, and I highly advise using it!

Questions?

/msg kjam on freenode or @kjam on twitter
