
Introduction to Machine Learning

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed.

Learning objectives

In this workshop, you will learn the following skills:

  • How to use skills from the NLTK workshop to build features for a classification task
  • How to build a text classification system that can predict whether sentences belong to one category ("news") or another ("romance")
  • How to model the topics in a corpus based on the distributions of words across the documents
  • How to group data and perform calculations on the aggregations
  • How to prepare data for machine learning using pandas, a package for Python that helps to organize your data
  • How to use the scikit-learn package for Python to perform different types of machine learning on the data
  • How to evaluate the results of machine learning algorithms
  • How to visualize observations, aggregations, and algorithmic results

This workshop will review key concepts for understanding how machine learning works, and walk participants through the process of analyzing data using statistical and machine learning methods.
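
As a rough preview of the kind of text classifier described in the objectives above, here is a minimal sketch using scikit-learn. The six training sentences are made up for illustration and are not the workshop's data, and the choice of a bag-of-words model with Naive Bayes is an assumption, not necessarily the method the sessions use.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy sentences, invented for this sketch.
sentences = [
    "The city council approved the budget on Monday",
    "The governor announced a new election committee",
    "Officials reported the tax bill passed the senate",
    "She kissed him softly and whispered his name",
    "His heart ached as she smiled at him",
    "He held her hand and she loved him",
]
labels = ["news", "news", "news", "romance", "romance", "romance"]

# Step 1: CountVectorizer turns each sentence into word-count features.
# Step 2: MultinomialNB learns which words are typical of each label.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(sentences, labels)

# Predict the category of two unseen sentences.
predictions = model.predict([
    "The senate committee approved the tax budget",
    "She smiled and kissed his hand",
])
print(predictions)
```

With so little training data the model only memorizes vocabulary, so treat this as a picture of the workflow (features, then fit, then predict) and not as a realistic classifier.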

Much gratitude to Kelsey Chatlosh, Lisa Rhody, and Michael Grossberg for substantive feedback that worked its way into the content.

Get Started >>>

Installation and Setup
What Is Classification?
Getting Our Data
Extracting Features
Supervised Machine Learning
Supervised Classification Algorithm with sklearn
Unsupervised Machine Learning


Session leaders: Rachel Rakov and Hannah Aizenman
Based on previous work by: Rachel Rakov and Hannah Aizenman

Creative Commons License

Digital Research Institute (DRI) Curriculum by Graduate Center Digital Initiatives is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at When sharing this material or derivative works, preserve this paragraph, changing only the title of the derivative work, or provide comparable attribution.
