GitHub - AashitaK/Machine-Learning-workshop-Fall2021

A Hands-on Workshop series in Machine Learning

Timing: 4-6 pm PST on Tuesdays and Fridays from Nov 2nd, 2021 to Nov 23rd, 2021 (7 sessions in total)
Where: Online on Zoom

The workshop series is designed with a focus on the practical aspects of machine learning using real-world datasets and the tools in the Python ecosystem. It is targeted towards complete beginners familiar with Python but is also designed adaptively so that you will be challenged even if you have some familiarity with machine learning tools.

You will learn the minimal but most useful tools for exploring datasets using pandas and will then be gently introduced to neural networks. Some concepts from natural language processing will also be covered as you train neural network models on textual data. You will also learn more involved architectures such as Convolutional Neural Networks (CNN) and apply them to real-world image datasets.

Register using this Google form to save your seat. Please also register for the Zoom meeting here. After registering, you will receive a confirmation email containing information about joining the Zoom meeting.

Each session of the workshop will build on the previous ones, so it is important that you attend all the sessions of the series for it to be useful. The learning material and solutions will be made available in this GitHub repository after each session.

Pre-requisites:

The workshop will cover the data science and deep learning tools in the Python ecosystem from scratch. Some familiarity with Python is a pre-requisite. If you have a grip on the basics of coding in some other language such as JavaScript, that should suffice too.
Basics of Probability and Statistics
Basics of Calculus
Basics of Linear Algebra

Here is an optional quiz to brush up on your Python skills before the workshop.

Please download and install Anaconda with Python 3.8 on your laptop ahead of the workshop.

Topics to be covered:

1. Data Manipulation using `pandas` (Tuesday, Nov 2nd, 2021)

Introduction to Jupyter Notebook
Pandas dataframes as a data structure
Indexing and slicing data frames
Data exploration
Basic statistical plots using matplotlib and seaborn
Detecting and filling missing values
Regular expressions for text mining
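
As a taste of this session, here is a minimal sketch of detecting and filling missing values and of boolean indexing. The tiny DataFrame and its column names (`title`, `rating`, `rating_filled`) are made up for illustration and are not taken from the workshop material.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the workshop uses real-world datasets instead.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "rating": [8.0, np.nan, 6.0, np.nan],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Fill missing ratings with the mean of the observed ratings.
mean_rating = df["rating"].mean()  # NaN values are skipped by default
df["rating_filled"] = df["rating"].fillna(mean_rating)

# Boolean indexing: keep rows whose filled rating is at least 7.
high = df.loc[df["rating_filled"] >= 7, ["title", "rating_filled"]]
print(missing_counts)
print(high)
```

Filling with the mean is only one possible strategy; the median or a constant may suit other columns better.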

2. More on `pandas` and Regular Expressions (Friday, Nov 5th, 2021)

More on pandas - Groupby operations
One hot encoding for categorical features
An exercise on preprocessing the movie reviews from the IMDb dataset using regular expressions
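
The preprocessing exercise could look roughly like the sketch below, which uses only the standard `re` module. The function name `clean_review` and the sample text are hypothetical, and the exact cleaning rules used in the workshop may differ.

```python
import re

def clean_review(text: str) -> str:
    """Lowercase a review and strip HTML tags, punctuation, and extra spaces."""
    text = re.sub(r"<[^>]+>", " ", text)           # drop HTML tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text.lower())  # keep only letters and whitespace
    return re.sub(r"\s+", " ", text).strip()       # collapse runs of whitespace

sample = "A GREAT movie!<br /><br />I'd watch it 10 times."
print(clean_review(sample))  # prints: a great movie i d watch it times
```

Note that this simple rule splits contractions such as "I'd" into two tokens; whether that matters depends on the downstream model.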

3. Logistic Regression (Tuesday, Nov 9th, 2021)

Binary classification algorithm: Logistic Regression
Underfitting and Overfitting to the training dataset; Model cross-validation
Natural language processing (NLP) concepts: Bag Of Words (BOW) model, TF-IDF vectorizer, using word n-grams, etc.
Application of Logistic Regression and NLP concepts using scikit-learn on the IMDb dataset to predict the sentiment (positive or negative) of the movie reviews
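
A minimal sketch of how these pieces fit together in scikit-learn is shown here. The four toy reviews and their labels are invented for illustration; the workshop itself applies the same idea to the IMDb dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews; 1 = positive, 0 = negative.
reviews = [
    "a great and moving film",
    "great acting and a great story",
    "a boring and terrible film",
    "terrible acting and a boring story",
]
labels = [1, 1, 0, 0]

# TF-IDF over unigrams and bigrams, followed by logistic regression.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
model.fit(reviews, labels)

# Each row holds [P(negative), P(positive)] for one review.
probabilities = model.predict_proba(["a great story", "a boring film"])
print(probabilities)
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary fitted on training data only, which matters once cross-validation is introduced.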

4. A Gentle Introduction to Neural Networks (Friday, Nov 12th, 2021)

Linear Regression
Neural networks: Building the intuition of the architecture and the iterative learning process
An exercise on implementing AND, OR and XOR gates with neural networks by trial-and-error
Multi-Layer Perceptron: Forward and Backward propagation
A primer on Keras
Training a neural network on the IMDb dataset for sentiment analysis
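
The logic gate exercise can be sketched in plain Python as follows. The weights and biases here are one hand-picked solution among many, and the helper names (`neuron`, `and_gate`, and so on) are illustrative rather than taken from the workshop notebooks.

```python
from itertools import product

def neuron(inputs, weights, bias):
    """A single neuron with a step activation: outputs 1 if the weighted sum is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

# Hand-picked weights; many other choices work equally well.
def and_gate(a, b):
    return neuron((a, b), (1, 1), -1.5)

def or_gate(a, b):
    return neuron((a, b), (1, 1), -0.5)

def nand_gate(a, b):
    return neuron((a, b), (-1, -1), 1.5)

def xor_gate(a, b):
    # XOR is not linearly separable, so it needs a hidden layer:
    # the hidden neurons compute OR and NAND, and the output neuron ANDs them.
    return and_gate(or_gate(a, b), nand_gate(a, b))

for a, b in product((0, 1), repeat=2):
    print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```

AND and OR each fit in a single neuron, while XOR needs two layers, which is a common motivation for multi-layer networks.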

5. Fine-tuning Neural Networks (Tuesday, Nov 16th, 2021)

Vanishing gradients and exploding gradients in deep networks
Activation functions
Weight Initialization
Regularization - L1 and L2, Dropout
Tuning other hyper-parameters such as learning rate, number of epochs, etc.
Exploring the TensorFlow Playground
Application of the above concepts on the IMDb dataset for training a neural network for sentiment analysis
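
To build intuition for vanishing gradients, the sketch below multiplies the derivative of an activation function by itself once per layer. This is a deliberate simplification: it ignores the weight matrices that real backpropagation also multiplies in, and the function names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # at most 0.25, reached at z = 0

def relu_grad(z):
    return 1.0 if z > 0 else 0.0  # exactly 1 for any active unit

def chained_gradient(grad_fn, z, depth):
    """Product of `depth` identical activation derivatives (weights ignored)."""
    return grad_fn(z) ** depth

for depth in (1, 5, 10, 20):
    print(depth,
          chained_gradient(sigmoid_grad, 0.0, depth),
          chained_gradient(relu_grad, 1.0, depth))
```

Even in the best case for the sigmoid (derivative 0.25), the product shrinks geometrically with depth, while an active ReLU unit passes the gradient through unchanged.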

6. Convolutional Neural Networks (Friday, Nov 19th, 2021)

Image preprocessing for neural networks
Feature extraction using convolution filters
Convolutional Neural Network (CNN) architecture
Training a CNN model on the CIFAR-10 dataset
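
Feature extraction with a convolution filter can be illustrated with a few lines of NumPy. The helper `convolve2d`, the 4x4 image, and the edge filter are all made up for this sketch; a CNN learns its filters from data instead of using hand-written ones.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    Like the convolution layers in most deep learning libraries, this
    computes cross-correlation: the kernel is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output

# A 4x4 grayscale image: dark (0) on the left, bright (1) on the right.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# A hand-made vertical edge filter: responds where brightness rises left to right.
kernel = np.array([
    [-1, 1],
    [-1, 1],
], dtype=float)

feature_map = convolve2d(image, kernel)
print(feature_map)
```

The resulting 3x3 feature map is non-zero only in its middle column, which is where the dark-to-bright edge sits in the image.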

7. Classification metrics (Tuesday, Nov 23rd, 2021)

Imbalanced datasets and classification metrics:
- Confusion matrix
- Decision Threshold
- Precision/Recall
- F1-score
- Area Under ROC curve
Mini-project: Building a spam detector using a dataset from Kaggle
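
The metrics listed above can be computed by hand in a few lines, which also shows how the decision threshold shifts the trade-off between precision and recall. The labels and scores are invented for illustration, and the helper names are hypothetical.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Return (tp, fp, fn, tn) after thresholding the predicted scores."""
    tp = fp = fn = tn = 0
    for truth, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical imbalanced data: 2 spam messages (1) among 8 ham messages (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1, 0.2, 0.3, 0.1, 0.2, 0.1, 0.3]

for threshold in (0.5, 0.35):
    tp, fp, fn, tn = confusion_counts(y_true, scores, threshold)
    print(threshold, (tp, fp, fn, tn), precision_recall_f1(tp, fp, fn))
```

With a threshold of 0.5 the classifier is 80% accurate yet finds only one of the two spam messages; lowering the threshold to 0.35 recovers both, which is why accuracy alone can mislead on imbalanced data.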

This page will be frequently updated with more information.
