My collection of resources for connecting Python to the web, along with my notes from the University of Michigan’s MOOC on scraping.
I am also trying out Behave, with feature files written in Gherkin, for specifying and testing Python code.
The things I want to accomplish around scraping:
- MOOC on web scraping with Python
- HTML with Beautiful Soup
- XML with lxml and etree
- JSON with json
- Regex with re
- HTML in more depth, following a video tutorial
- Getting a script working that scrapes financial figures
- Making Behave a natural part of my workflow
- Creating a simple library for scraping data, based on what I learned in the course
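
As a rough starting point for the JSON, XML and regex items above, here is a minimal sketch using only the standard library. The sample payloads, field names and function names are made up for illustration and do not come from the course. `xml.etree.ElementTree` stands in for lxml here; `lxml.etree` offers a largely compatible API, but that swap is an assumption to verify, not something this sketch tests.

```python
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical sample payloads standing in for downloaded responses.
JSON_TEXT = '{"ticker": "ABC", "revenue": 1250.5}'
XML_TEXT = (
    "<report>"
    "<item name='revenue'>1250.5</item>"
    "<item name='costs'>900</item>"
    "</report>"
)
PLAIN_TEXT = "Revenue: 1250.5 MUSD, Costs: 900 MUSD"


def figures_from_json(text):
    """Return the numeric fields of a flat JSON object as floats."""
    data = json.loads(text)
    return {
        key: float(value)
        for key, value in data.items()
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


def figures_from_xml(text):
    """Map each <item name="..."> element to its numeric text content."""
    root = ET.fromstring(text)
    return {item.get("name"): float(item.text) for item in root.iter("item")}


def figures_from_text(text):
    """Find 'Label: number' pairs in free text with a regular expression."""
    pattern = re.compile(r"(\w+):\s*([0-9]+(?:\.[0-9]+)?)")
    return {label.lower(): float(value) for label, value in pattern.findall(text)}


if __name__ == "__main__":
    print(figures_from_json(JSON_TEXT))
    print(figures_from_xml(XML_TEXT))
    print(figures_from_text(PLAIN_TEXT))
```

Each function returns a plain dict of name to float, so the three sources can later be merged or compared in one place, which may be a reasonable shape for the simple library in the last item.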