Techniques for Scraping the Web in Python
.circleci
chalice_apps/scrape-yahoo
notebooks
scrapy-crawl-basketball-reference
.gitignore
Makefile
README.md
requirements.txt
wscli.py


Serverless Web Scraping in Python for AI, Fun and Profit

(Using Step Functions and Lambdas)

This material is also covered in Chapter 7 of Pragmatic AI

Web Scraping for AI/ML consists of three phases:

A. Doing the Work
B. Scheduling the Work
C. Modeling the Work

A. Web Scraping Techniques (Doing the Work)
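
As an illustration of the "doing the work" phase, here is a minimal sketch that uses only the Python standard library. It is not the code from this repository (which uses Scrapy and Chalice). The names `LinkExtractor`, `fetch_html`, and `extract_links`, and the `example-scraper/0.1` User-Agent string, are hypothetical and chosen for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from every <a href="..."> tag (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href; resolve relative links against the base URL.
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def fetch_html(url, timeout=10):
    """Download a page and decode it to text. The User-Agent is an assumption."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Parse an HTML string and return the absolute link targets in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

Keeping the download (`fetch_html`) separate from the parsing (`extract_links`) means the parsing step can be tested on a saved HTML string without touching the network, and either half can later be moved into its own Lambda function.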

B. Orchestrating Retrieval (Scheduling the Work)

Step Function Workflow
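
A Step Functions workflow is described in Amazon States Language, a JSON document. The sketch below builds one possible two-step definition (fetch, then parse) as a Python dictionary. The state names `FetchPage` and `ParsePage`, the function `build_state_machine`, the retry values, and the Lambda ARNs passed in are all assumptions made for illustration; the repository's actual workflow may differ.

```python
import json


def build_state_machine(fetch_arn, parse_arn):
    """Return a hypothetical two-state scraping workflow in Amazon States Language."""
    return {
        "Comment": "Illustrative scrape workflow: fetch a page, then parse it",
        "StartAt": "FetchPage",
        "States": {
            "FetchPage": {
                "Type": "Task",
                "Resource": fetch_arn,  # ARN of a Lambda that downloads the page
                # Retry transient failures with exponential backoff (values are assumptions).
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 5,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "Next": "ParsePage",
            },
            "ParsePage": {
                "Type": "Task",
                "Resource": parse_arn,  # ARN of a Lambda that extracts the data
                "End": True,
            },
        },
    }


def to_asl_json(definition):
    """Serialize the definition to the JSON text that Step Functions expects."""
    return json.dumps(definition, indent=2)
```

The resulting JSON string could, for example, be supplied as the `definition` argument to the boto3 Step Functions client's `create_state_machine` call, together with a name and an IAM role ARN. A schedule that starts executions would be configured separately.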

C. Wiring Results into a Machine Learning Pipeline (Modeling the Work)

  • TBD