Advanced Python for Data Science

Course Description

This is a two-day course that introduces Python for advanced machine learning applications. Most of the time will be spent working through example problems end-to-end in the classroom. Students will learn the fundamentals of the scikit-learn library and explore several other tools and methodologies for implementing a robust end-to-end machine learning workflow. Some additional time will be reserved for discussing real programming challenges students have encountered, and for an overview of related technologies students may need in an industry setting (e.g. Git and GitHub).

Objectives

  1. Develop an intuition for the machine learning workflow and Python tooling.
  2. Build familiarity with common software engineering tooling and methodologies for implementing a machine learning project.
  3. Gain a high-level understanding of the function of data science-adjacent technologies that students will encounter in the workplace, focusing on Git and GitHub.

Prerequisites

  • Strong understanding of core Python concepts: variables, loops, conditionals, and functions
  • Some experience using Jupyter Notebooks or Jupyter Lab
  • Solid grasp of Pandas and how to use it for data manipulation: filtering, selecting, aggregating, slicing (indexing), and updating
  • High-level understanding of modeling concepts: training and test data, model accuracy, and overfitting

Agenda

This workshop will be 100% virtual over 4 half-days.

Day  Topic                               Time
1    Introductions                       9:00 - 9:15
     Setting the Stage                   9:15 - 9:30
     Git & version control               9:30 - 10:15
     Break                               10:15 - 10:30
     EDA & Our First scikit-learn Model  10:30 - 12:00
     Q&A                                 12:00 - 12:30
2    Q&A                                 8:45 - 9:00
     Modular Code                        9:00 - 10:00
     Feature Engineering                 10:00 - 11:00
     Break                               11:00 - 11:15
     Case Study, pt. 1                   11:15 - 12:00
     Q&A                                 12:00 - 12:30
3    Q&A                                 8:45 - 9:00
     Model Evaluation & Selection        9:00 - 10:15
     Break                               10:15 - 10:30
     More on Modular Code                10:30 - 11:15
     Unit Tests                          11:15 - 12:00
     Q&A                                 12:00 - 12:30
4    Q&A                                 8:45 - 9:00
     More on Unit Tests                  9:00 - 9:30
     ML lifecycle management             9:30 - 10:30
     Break                               10:30 - 10:45
     Case Study, pt. 2                   10:45 - 11:45
     Case Study Review, pt. 2 and Q&A    11:45 - 12:30

Course Preparation

You will need to install Python, Jupyter, and the relevant libraries on your personal computer for this workshop. We also recommend downloading the course materials.

See below for instructions on doing so.

1. Install Python, Jupyter and Needed Packages

The easiest way to install Python, Jupyter, and the necessary packages is through Anaconda. To download and install Anaconda and its graphical interface, Anaconda Navigator, follow these steps:

  1. Visit the Anaconda download page.
  2. Select your appropriate operating system.
  3. Click the "Download" button for Anaconda Individual Edition, Python 3.9 - this will begin to download the Anaconda installer.
     • If a popup appears, asking you to sign up for anything, you can close the window.
  4. Open the installer when the download completes, and then follow the prompts. If you are prompted about installing PyCharm, elect not to do so.
  5. Once installed, open the Anaconda Navigator and launch a Jupyter Notebook to ensure it works.
  6. Download the class materials (see the below section) and use the included environment.yaml file to create a new environment from Anaconda Navigator, using these steps:
     • In the tabs along the left side, select "Environments".
     • At the bottom of the list of environments (you will likely have just one, "base"), look for the "Import" button. Click it.
     • In the dialog box that appears, click on the folder icon and then navigate your computer's files in order to select the environment.yaml file you downloaded earlier. Click "Open" once you've selected it.
     • Wait for Anaconda Navigator to finish fetching and installing the needed packages. When it finishes, a new environment called "uc-python" should show up in the list.
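
Once the "uc-python" environment shows up, one way to confirm the install worked is to run a quick check from a notebook launched in that environment. The snippet below is a minimal sketch, not part of the course materials: the helper `missing_packages` is hypothetical, and the package names in `ASSUMED_PACKAGES` are guesses based on the tools this README mentions (Pandas, scikit-learn, Jupyter). The authoritative list is whatever environment.yaml declares.

```python
from importlib.util import find_spec

# Assumed top-level import names (not taken from environment.yaml):
# pandas, scikit-learn (imported as "sklearn"), and the Jupyter Notebook package.
ASSUMED_PACKAGES = ["pandas", "sklearn", "notebook"]


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

If any names are reported missing, the notebook may be running in "base" rather than "uc-python"; switching environments in Anaconda Navigator before launching Jupyter is a reasonable first thing to try.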

2. Download Class Materials

There are two ways to download the class materials:

  1. Clone it - If you're familiar with using Git, we recommend cloning the repo.
  2. Download the files as a zip - This will allow you to download a static copy of the files here, but in order to get any updates you'll need to redownload the entire repo. Use this link.

Your Instructors

If you have any specific questions prior to the class you can reach out to us directly via GitHub or email:
