This repository contains code for the O'Reilly Live Online Training for Deep Learning for Modern AI.
This training provides the theory and practical concepts for a comprehensive introduction to machine learning and deep learning with PyTorch, the foundational knowledge needed to successfully build and train GenAI and multimodal models. We work through several real-world case studies, including object recognition and text classification, making this session an excellent crash course in deep learning with PyTorch.
We use tools including large pre-trained models and model training dashboards to set up reproducible deep learning experiments and build machine learning models optimized for performance. Code examples throughout the training help solidify the theoretical concepts as they are introduced. Models like Stable Diffusion, Llama 3, GPT, and BERT are highlighted as we uncover the training and optimization strategies that get the most out of our models' performance, speed, and memory usage.
All data for the art classification example can be downloaded here. Note that it is about 6GB, so the download may take a while.
- First steps with Deep Learning with MNIST
- RNNs and CNNs
- Working with pre-trained VGG-11 and BERT models
- Fine-tuning BERT vs ChatGPT
- Fine-tuning OpenAI: the code to compare against BERT
- Fine-tuning GPT-2 to convert English to LaTeX
- Fine-tuning Llama 3 to be a chatbot
- Production Optimization
- Quantizing Llama 3
- Testing different fine-tuning configurations
- Distilling BERT models
- Intro to Multimodality: An introduction to multimodality with CLIP and SHAP-E + Diffusion
- Whisper: An introduction to using Whisper for audio transcription
- LLaVA: Using an open source multi-turn multimodal engine
- CLIP-based Stock Image Search: Using CLIP to search through a library of images
- Dreambooth: Fine-tuning a Stable Diffusion model to make images of yours truly! Ever wonder what I look like blonde? Me neither, but AI gave me some ideas of what it would look like.
- Visual Q/A - This case study requires you to download the data from my Dropbox here. The code snippets should download the data for you if that is easier! Our goal is to emulate the process done by Llama 3.2-Vision-Instruct, one of Meta's latest Llama models that can take in images.
- Training a Reasoning Model with Unsloth - Advanced - See how companies like DeepSeek and Anthropic train their reasoning models. Unsloth AI is a package aiming to make fine-tuning more streamlined, faster, and more memory efficient by handwriting things like backprop in a faster way. We salute them for their work!
app.py is a Flask app that uses a VGG16 model to classify the art style of an uploaded image. The app currently supports 10 different art styles:
- Abstract Expressionism
- Art Nouveau (Modern)
- Baroque
- Expressionism
- Impressionism
- Northern Renaissance
- Post-Impressionism
- Realism
- Romanticism
- Symbolism
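The labels in the app's example response (shown further down) use underscores, such as Post_Impressionism and Art_Nouveau_Modern, rather than the display names listed above. A minimal sketch of a lookup table that maps one to the other follows. The names STYLE_DISPLAY_NAMES and display_name are hypothetical helpers for illustration and are not taken from app.py.

```python
# Hypothetical lookup table (not part of app.py): response label -> display name.
# Keys follow the labels in the example response; values follow the list above.
STYLE_DISPLAY_NAMES = {
    "Abstract_Expressionism": "Abstract Expressionism",
    "Art_Nouveau_Modern": "Art Nouveau (Modern)",
    "Baroque": "Baroque",
    "Expressionism": "Expressionism",
    "Impressionism": "Impressionism",
    "Northern_Renaissance": "Northern Renaissance",
    "Post_Impressionism": "Post-Impressionism",
    "Realism": "Realism",
    "Romanticism": "Romanticism",
    "Symbolism": "Symbolism",
}


def display_name(label: str) -> str:
    """Return the readable style name, falling back to the label with spaces."""
    return STYLE_DISPLAY_NAMES.get(label, label.replace("_", " "))
```

The fallback keeps the helper usable if the model is later retrained with styles that are not in the table.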
Start the Flask app:
python app.py
This should start the Flask app and make it available at http://localhost:5000.
To classify an image, you can use a cURL request in the following format:
curl -X POST -F 'image=@/path/to/your/image.jpg' http://localhost:5000/predict
Replace /path/to/your/image.jpg with the path to your own image. The response will be in JSON format and will contain the art styles and their associated confidence scores, for example:
curl -X POST -F \
'image=@images/Venus_and_Adonis_by_Peter_Paul_Rubens.jpg' \
http://localhost:5000/predict
[
["Northern_Renaissance",0.13392961025238037],
["Realism",0.12794768810272217],
["Romanticism",0.12592236697673798],
["Post_Impressionism",0.11863630264997482],
["Baroque",0.11325731128454208],
["Symbolism",0.1120268702507019],
["Expressionism",0.08971412479877472],
["Impressionism",0.086906298995018],
["Art_Nouveau_Modern",0.05910796299576759],
["Abstract_Expressionism",0.03255145251750946]]
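If you want to consume this response from Python rather than read it by eye, the sketch below parses the JSON and returns the highest-scoring styles. It assumes the response is always a list of [label, score] pairs, as in the example above. The function name top_k and the abbreviated RESPONSE_TEXT sample are illustrative and not part of the repository.

```python
import json

# Abbreviated copy of the example response above: a JSON list of [label, score] pairs.
RESPONSE_TEXT = """
[["Northern_Renaissance", 0.13392961025238037],
 ["Realism", 0.12794768810272217],
 ["Romanticism", 0.12592236697673798]]
"""


def top_k(response_text: str, k: int = 3) -> list[tuple[str, float]]:
    """Parse the response and return the k highest-scoring (label, score) pairs."""
    pairs = [(label, float(score)) for label, score in json.loads(response_text)]
    # Sort explicitly instead of relying on the order the server happens to return.
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:k]


for label, score in top_k(RESPONSE_TEXT, k=2):
    print(f"{label}: {score:.1%}")
# Northern_Renaissance: 13.4%
# Realism: 12.8%
```

In the example the scores are close together (the top style is only about 13%), so showing the top few candidates is more informative than showing a single winner.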
If there is an error with the request, such as no image being provided, the response will contain an error message instead:
{
"error": "No image provided"
}
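The same request can be made from Python. The sketch below uses only the standard library and handles both response shapes described above: a list of predictions or an object with an "error" key. The helper names (build_multipart, interpret, classify) are hypothetical. It also assumes that any error response still carries a JSON body, which this README does not confirm.

```python
import json
import os
import urllib.error
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def interpret(payload):
    """Return the [label, score] pairs, or raise if the app reported an error."""
    if isinstance(payload, dict) and "error" in payload:
        raise ValueError(payload["error"])
    return payload


def classify(image_path: str, url: str = "http://localhost:5000/predict"):
    """POST an image to the running app, mirroring the cURL command above."""
    with open(image_path, "rb") as f:
        body, content_type = build_multipart(
            "image", os.path.basename(image_path), f.read()
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
    try:
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        # Assumption: the app may pair its error JSON with a non-2xx status code.
        payload = json.load(err)
    return interpret(payload)
```

With the Flask app running, classify("images/Venus_and_Adonis_by_Peter_Paul_Rubens.jpg") should return the list of predictions, and a request the app rejects raises ValueError with the app's error message.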
Sinan Ozdemir is the Founder and CTO of LoopGenius, where he uses state-of-the-art AI to help people create and run their businesses. Sinan is a former lecturer of Data Science at Johns Hopkins University and the author of multiple textbooks on data science and machine learning. Additionally, he is the founder of the recently acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a master’s degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco, CA.
- Check out Deep Learning Illustrated: A best seller by Jon Krohn that offers a very visual introduction to deep learning
- Deep Learning course: lecture slides and lab notebooks: The course covers the basics of Deep Learning, with a focus on applications.