Lecture notes, readings, code samples and resources for Brad Flaugher's Data-Focused Programming Bootcamp

teacherc/DE-bootcamp

 
 

bootcamp

Lecture notes, readings, code samples and resources for Brad Flaugher's Data-Focused Programming Bootcamp

New Student TODOs

Preparation

Lecture Outline

  • Note 1: Lectures are a small part of the course; most of the bootcampers' time will be spent working on their final portfolio projects.
  • Note 2: The six-week course is broken into numbered and lettered lectures. Lectures 1-6 are technical in nature; Lectures A-E cover soft skills and history.

Lesson 1: Practical Science

Topics

Introduction to Portfolio projects

Project Ideas

Readings

Lesson A: History, Impostor Syndrome and Working With Technical Professionals

Topics

  • Definitions: Unix, Linux, Command Line, DevOps, Programming Language
  • History: Python and C Speed Test, SQL
  • History: BERT, GPT-3, DALL-E, Stable Diffusion, and self-driving cars.
  • History: A historical perspective on technological adoption: is it fast or slow? Flavors of technological disruption (lateral thinking with withered technology, how many people can use spreadsheets, and a Keynes quote).
  • Impostor Syndrome: "10,000 qualified data scientists." Can you trust your professor at Berkeley? Who are the ML leads at big companies? Who are the IT consultants?
  • Impostor Syndrome: What does MIT say? A review of Managing Technical Professionals.
  • Practice: "Head of Data" interview question: how fast can you spin up an environment? Remember your pandas functions.

Readings

Optional Readings

Lesson 2: Docker, DevOps/MLOps, and Environment Setup

Topics

  • Definitions: docker, container, ephemeral, bash
  • History: SQL, what it is and why it's important (PowerBI, Tableau, Athena, BigQuery)
  • Docker: Command line usage, flags, interactive mode and bash
  • Docker in the cloud: How to think about the cloud, big providers (AWS, GCP, Azure) and small ones (Linode, Oracle, etc.)
  • Aside: What are Kaggle and Colab?
  • Demonstration: Create a GitHub project, spin up an environment, run an experiment, save a Python file, commit changes.

Post-lecture homework

docker pull tensorflow/tensorflow:latest  # Download latest stable image
docker run -it -p 8888:8888 tensorflow/tensorflow:latest-jupyter  # Start Jupyter server 
  • Run one of the TensorFlow tutorial notebooks, either classification.ipynb (if you want to practice image classification) or text_classification.ipynb, and fit the sample models with the sample data.
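
To make the flags in the Jupyter command above easier to read, here is a minimal Python sketch that only tokenizes the command and labels its parts; it does not start Docker. The variable names are illustrative assumptions. In the Docker CLI, `-it` combines `-i` (keep STDIN open) with `-t` (allocate a pseudo-terminal), and `-p` publishes a container port using the form host:container. Note that the `pull` line downloads the `latest` tag while the `run` line uses `latest-jupyter`, a different tag that `docker run` will download on first use if it is not already present.

```python
import shlex

# The Jupyter command from the homework, copied verbatim.
command = "docker run -it -p 8888:8888 tensorflow/tensorflow:latest-jupyter"

# shlex.split tokenizes the string the way a POSIX shell would.
tokens = shlex.split(command)

# The last token is the image reference: repository, then the tag after the colon.
repository, tag = tokens[-1].rsplit(":", 1)

# The value after -p maps a host port to a container port.
host_port, container_port = tokens[tokens.index("-p") + 1].split(":")

print(tokens)
print("repository:", repository)
print("tag:", tag)
print("host port", host_port, "-> container port", container_port)
```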

Readings

Optional Readings

Lesson B: Open Source, Freedom, and how to remove the stress of software choices

Topics

Final Project Update

Optional Readings

Lesson 3: ETL, Loading Data Types, "It's all numbers, man"

Topics

  • ETL: What is it and why do we need it?
  • Demonstration: Numbers are Data
  • Demonstration: Text is Data
  • Demonstration: Images are Data
  • Pandas: what is it and why do we use it?
  • Discussion: Data Collection, ETL and "glue code"
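
As a rough illustration of the "it's all numbers" idea behind the demonstrations above, the following sketch uses only the Python standard library. The sample string and the tiny 2x3 grayscale "image" are made-up values for illustration, not course data.

```python
# Text: every character has a Unicode code point, and UTF-8 turns it into bytes.
text = "data"
code_points = [ord(ch) for ch in text]
utf8_bytes = list(text.encode("utf-8"))

# Image: a hypothetical 2x3 grayscale image, one integer (0-255) per pixel.
image = [
    [0, 128, 255],
    [255, 128, 0],
]
height, width = len(image), len(image[0])

# Flattening the grid gives one list of numbers, the form most models consume.
flat_pixels = [pixel for row in image for pixel in row]
mean_brightness = sum(flat_pixels) / len(flat_pixels)

print("code points:", code_points)
print("utf-8 bytes:", utf8_bytes)
print("image shape:", (height, width))
print("mean brightness:", mean_brightness)
```

Once text and images are lists of numbers, the same numeric tooling (pandas included) can be applied to all three kinds of data.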

Final Project Update

Readings

Pandas Mini-Courses (required if you do not know pandas)

Optional Readings (Airflow)

Lesson C: Data Wrangling

Topics

  • Scraping Data
  • APIs
  • Python Requests
  • Combining datasets
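
Combining datasets usually means joining records from different sources on a shared key. The sketch below is one possible illustration using only the standard library: `api_body` stands in for a JSON response body (which in practice might be fetched with the Requests library), and `csv_text` stands in for a local file. Both payloads and all field names are hypothetical.

```python
import csv
import io
import json

# Hypothetical JSON payload, standing in for an API response body.
api_body = '[{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}, {"id": 3, "name": "gamma"}]'

# Hypothetical CSV content, standing in for a file on disk.
csv_text = "id,score\n1,0.9\n2,0.4\n4,0.7\n"

names = json.loads(api_body)

# Index the CSV rows by id, converting the string fields to numbers.
scores = {
    int(row["id"]): float(row["score"])
    for row in csv.DictReader(io.StringIO(csv_text))
}

# Inner join on "id": keep only records that appear in both sources.
combined = [
    {"id": rec["id"], "name": rec["name"], "score": scores[rec["id"]]}
    for rec in names
    if rec["id"] in scores
]

print(combined)
```

Ids 3 and 4 each appear in only one source, so the inner join drops them. pandas offers a comparable operation through `DataFrame.merge`.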

Readings

Lesson 4: Break to make progress on final projects.

Lesson D: Features and Labels, how easy is that?

Topics

  • Demonstration: Simplest Text Classification
  • Demonstration: Simplest Image Classification
  • Ludwig
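
To make "features" and "labels" concrete, here is a deliberately tiny text classifier in plain Python. It is a sketch under stated assumptions, not the course demonstration and not Ludwig's API: the training sentences, the `featurize` and `predict` helpers, and the word-overlap scoring rule are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical labeled examples: the text supplies features, the tag is the label.
training_data = [
    ("great movie loved it", "positive"),
    ("loved the acting", "positive"),
    ("terrible plot hated it", "negative"),
    ("hated the ending", "negative"),
]

def featurize(text):
    """Turn raw text into bag-of-words features (word -> count)."""
    return Counter(text.lower().split())

# "Training": total the word counts seen under each label.
word_counts = {}
for text, label in training_data:
    word_counts.setdefault(label, Counter()).update(featurize(text))

def predict(text):
    """Pick the label whose training words overlap most with the text."""
    features = featurize(text)
    scores = {
        label: sum(counts[word] * n for word, n in features.items())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

print(predict("loved it"))
print(predict("hated the plot"))
```

Real classifiers replace the overlap score with a learned model, but the split between input features and target labels stays the same.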

Readings

Lesson 5: Model Architecture

Topics

Current Events and Discussions in the Community

Readings

Lesson E: AI Optimism and Bias

Topics

  • Definitions: AI Ethics Big 3: Explainability, Bias, and Privacy
  • Discussion: Who should die? Self-driving trolley problems.
  • Discussion: I can predict criminality, should I?
  • Discussion: Are biased models useful? When?

Readings

Lesson 6: Model Deployment and MLOps

Topics

  • Demonstration: TensorFlow Lite, TensorFlow Serving
  • Discussion: Predict is easy, train is hard (computationally)
  • Demonstration: Docker + Flask
  • Discussion: DevOps vs. MLOps: what is special? What is the same?
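
The "predict is easy, train is hard" point can be illustrated with a toy linear model. This is a minimal sketch assuming made-up data that follows y = 2x + 1; the function names, learning rate, and epoch count are illustrative choices. Training repeats the prediction step thousands of times to fit two parameters, while serving a prediction afterwards is a single multiply-add.

```python
# Hypothetical data following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def predict(w, b, x):
    """Prediction is a single multiply-add."""
    return w * x + b

def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit w and b with batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    predict_calls = 0
    for _ in range(epochs):
        # Every epoch runs the model on the whole dataset.
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        predict_calls += n
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, predict_calls

w, b, predict_calls = train(xs, ys)
print("prediction calls during training:", predict_calls)
print("prediction for x = 10:", predict(w, b, 10.0))
```

The gap grows with model and dataset size, which is one reason a trained model can often be served from a small container even when training needed far more compute.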

Readings

Optional Readings

Final Projects

Bootcampers will spend a tremendous amount of time working on final projects that are targeted to their career goals. For an example final presentation, see Oleh's Video (YouTube) and Oleh's Repository (GitHub).

After The Bootcamp

Recommended courses, videos and books

Data Janitoring

Model Training

Background Math

Operationalizing ML with MLFlow and MLOps tools

Ethics and AI

Recommended Mailing lists and online groups

Recommended Professional Groups

Recommended (in-person) Conferences

Recommended Job Boards

Gigs (Freelance work)

On-Demand Help

Huge Foundational Models

Competitions
