# Topics

- Lesson 1: Introduction to Deep Learning and Fully Connected Networks
- Lesson 2: Convolutional Neural Networks
- Lesson 3: Better Network Training
- Lesson 4: Improved Network Architectures
- Lesson 5: Object Detection and Segmentation
- Lesson 6: Transfer Learning
- Lesson 7: Introduction to Transformers, Hugging Face, and Using LLMs Effectively
- Lesson 8: Transformer Internals - Self-Attention and Positional Encoding
- Lesson 9: Text Classification with Transformers
- Lesson 10: Named Entity Recognition (NER) and Tokenization
- Lesson 11: Text Generation and Decoding Strategies
- Lesson 12: Summarization with Transformers
- Lesson 13 (not finalized)

## Lesson 1: Introduction to Deep Learning and Fully Connected Networks

### Topics
* Introduction to PyTorch
* Fully-connected neural networks for regression and classification
* Modeling process: prepare data, prepare model, train model, evaluate model, make predictions
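The five steps of the modeling process listed above can be sketched without any framework. What follows is a minimal illustration in plain Python, not the course's PyTorch code: the toy data, the one-variable linear model, the learning rate, and every function name are assumptions chosen for brevity.

```python
# Minimal sketch of the modeling process in plain Python (no PyTorch).
# All names, data, and hyperparameters are illustrative assumptions.

# 1. Prepare data: points on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

# 2. Prepare model: a single linear unit, y_hat = w * x + b
w, b = 0.0, 0.0

def predict(x, w, b):
    return w * x + b

def mse(xs, ys, w, b):
    """Mean squared error over the whole dataset."""
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# 3. Train model: plain gradient descent on the MSE loss
learning_rate = 0.05
loss_history = []
for epoch in range(2000):
    n = len(xs)
    # Gradients of the MSE with respect to w and b
    grad_w = sum(2 * (predict(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (predict(x, w, b) - y) for x, y in zip(xs, ys)) / n
    # Step against the gradient to reduce the loss
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    loss_history.append(mse(xs, ys, w, b))

# 4. Evaluate model: inspect the final training loss
final_loss = loss_history[-1]

# 5. Make predictions for a new input
new_prediction = predict(10.0, w, b)
print(f"w={w:.3f}, b={b:.3f}, loss={final_loss:.2e}, prediction={new_prediction:.3f}")
```

In PyTorch the same five steps appear, but datasets, data loaders, automatic differentiation, and optimizers replace the hand-written pieces.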

### Outcomes

1. **Understand deep learning fundamentals**, including the structure of neural networks.
2. **Utilize PyTorch datasets and data loaders** to efficiently handle and preprocess data for training deep learning models.
3. **Develop and train basic fully connected neural networks** in PyTorch, applying optimization techniques and appropriate loss functions.
4. **Evaluate model performance** by plotting loss functions and other metrics.
5. **Make predictions for new data** by applying a trained neural network to new inputs.
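Outcome 3 mentions choosing appropriate loss functions. For classification, the reading (Section 2.3) pairs softmax with cross-entropy; the sketch below shows that pairing in plain Python. The function names and the example logits are assumptions for illustration, and PyTorch's own loss functions should be used in practice.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    probs = softmax(logits)
    return -math.log(probs[target_index])

# Hypothetical scores for a 3-class problem
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss_if_class_0 = cross_entropy(logits, 0)  # model's top choice is correct
loss_if_class_2 = cross_entropy(logits, 2)  # model's lowest choice is correct
print(probs, loss_if_class_0, loss_if_class_2)
```

The loss is small when the model puts high probability on the correct class and grows as that probability shrinks.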

### Readings and Videos

* **Course Intro Notebook / Video** (Still to come)

* **(Optional) Review Neural Networks from DS740**

    * You might want to review the first 14 slides of the [Lesson on Neural networks in DS740](https://media.uwex.edu/content/ds/ds740_r23/ds740_artificial-neural-networks.sbproj/).  We're covering similar material this week.  Don't review the material about neural networks in R since we'll be using Python.

* **Readings from Inside Deep Learning (IDL)**

    * **Chapter 1: The Mechanics of Learning**
        - **Read Sections 1.2, 1.4, and 1.5**. Skim the other sections. No need to understand the detailed code or the backpropagation algorithm, but ensure you understand how the gradient is used in training.

    * **Chapter 2: Fully Connected Networks**
        - **Section 2.1**: Focus on understanding the training loop structure and process. Skip the code details but grasp the concept.
            - Don’t worry about the math notation at the bottom of page 40. It's shorthand for a fully connected linear layer. If you want to learn more about matrix multiplication, see the videos listed under Auxiliary Materials below.
        - **Section 2.2**: Understand how activation functions introduce nonlinearity into networks.
        - **Section 2.3**: Grasp the basics of softmax and cross-entropy, especially how the loss function changes for classification tasks. An example will be explained in a notebook and video.
        - **Section 2.4**: Note the key concepts; they will be reinforced in video lectures.
        - **Section 2.5**: Understand the importance of batch training, particularly for large datasets that won’t fit in memory.

* **Course Notebooks with Videos**  Open each of the notebooks included in the lesson folder and watch the embedded video.  You can read along and work through the code examples as needed.  The notebooks for this lesson are in the Lesson_01 directory.  The notebooks are numbered in the order they should be used.

### Assessments

1.  Complete the reading quiz in Canvas (10 points).
2.  Complete the exercises in your homework notebook in CoCalc (40 points).

### Auxiliary Materials

- **Background Mathematics from Dr. Anne Hsu:**
    * [Matrices and Vectors](https://www.youtube.com/watch?v=sM2Mm6aT_HI)
    * [Derivatives and Gradients 1](https://www.youtube.com/watch?v=Fiw0_w4AykA)
    * [Derivatives and Gradients 2](https://www.youtube.com/watch?v=qORZmKCB0g8)
    * [Playlist for entire Intro to Deep Learning](https://www.youtube.com/@drannehsu/playlists)
- **Object-Oriented Programming** It helps to have a basic familiarity with classes, inheritance, and methods for finding your way around in PyTorch.  [Real Python has a great tutorial](https://realpython.com/python3-object-oriented-programming/) on the basics of OOP with many code examples that you can either read or watch (40 minutes).
- **Deep Learning Basics: Introduction and Overview**  Lex Fridman is a well-known podcast host for AI and Data Science.  Here he gives an [introductory lecture](https://youtu.be/O5xeyoRL95U?si=SrM7RLWB_iBPMiK4) for MIT's public deep learning class.  Although it was recorded in 2019 and doesn't include the latest on the transformer architectures that are driving the current boom in AI (ChatGPT, etc.), it's still a great introduction that discusses many applications of deep learning.  Watch this if you want a good overview.  I also recommend his podcast.




---

## Lesson 2: Convolutional Neural Networks

### Topics
* Image data
* Convolutional Layers
* Pooling Layers

### Outcomes

1. **Understand the structure and role of Convolutional Neural Networks (CNNs)** in processing spatial data like images.
2. **Design and implement basic CNN architectures** in PyTorch, including convolutional layers, pooling, and activation functions.
3. **Understand how padding, stride, kernel size, and the number of output channels** interact to determine the dimensionality of the output in each convolutional layer.
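The interaction described in outcome 3 can be made concrete with the standard output-size formula for a convolution. This is a small sketch under stated assumptions: square kernels, the same stride and padding in both directions, and no dilation. The function name `conv_output_shape` is hypothetical.

```python
def conv_output_shape(in_shape, out_channels, kernel_size, stride=1, padding=0):
    """Return (channels, height, width) after one convolutional layer.

    in_shape is (channels, height, width). Assumes a square kernel,
    equal stride/padding in both directions, and dilation of 1.
    """
    _, height, width = in_shape
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    # The number of output channels is set by the number of filters,
    # independent of the spatial arithmetic above.
    return (out_channels, out_h, out_w)

# A 1-channel 28x28 image through a 3x3 convolution with padding 1
print(conv_output_shape((1, 28, 28), out_channels=16, kernel_size=3, padding=1))
# -> (16, 28, 28)

# The same kind of layer with stride 2 halves the spatial size
print(conv_output_shape((16, 28, 28), out_channels=32, kernel_size=3, stride=2, padding=1))
# -> (32, 14, 14)
```

Padding of 1 with a 3x3 kernel preserves height and width, while the stride controls downsampling.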
### Readings and Videos

* Read Chapter 3, through Section 3.5, in Inside Deep Learning.  You can read 3.6 if you wish, but we'll get into that material in the next lesson.
* [Andrew Ng on Convolution over Volumes](https://www.youtube.com/watch?v=KTB_OFoAQcc&ab_channel=DeepLearningAI) This is one of many videos Andrew Ng has made to support his deep learning course.  Watch this to solidify your understanding of convolutions after doing the reading.  About 11 minutes.
* **Course Notebooks with Videos**  Open each of the notebooks included in the lesson folder and watch the embedded video.  You can read along and work through the code examples as you want.  The notebooks for this lesson are in the Lesson_02 directory.  The notebooks are numbered in the order they should be used.

### Assessments

1.  Complete the reading quiz in Canvas (10 points).
2.  Complete the exercises in your homework notebook in CoCalc (40 points).

---

## Lesson 3: Better Network Training

### Outcomes

1. **Apply data augmentation techniques** to improve model generalization, especially with small datasets.
2. **Understand and implement learning rate schedules**, including exponential decay, step drop, and cosine annealing.
3. **Optimize training** using modern techniques like SGD with momentum, Adam, and gradient clipping.
4. **Implement early stopping** to prevent overfitting and improve training efficiency by halting training when performance plateaus.
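The three learning rate schedules named in outcome 2 each reduce to a short formula. The sketch below writes them as plain Python functions of the epoch number; the function and parameter names are assumptions for illustration and do not mirror any particular library's scheduler classes.

```python
import math

def exponential_decay(lr0, gamma, epoch):
    """Multiply the learning rate by gamma every epoch."""
    return lr0 * gamma ** epoch

def step_drop(lr0, drop_factor, step_size, epoch):
    """Multiply the learning rate by drop_factor every step_size epochs."""
    return lr0 * drop_factor ** (epoch // step_size)

def cosine_annealing(lr0, lr_min, total_epochs, epoch):
    """Follow half a cosine wave from lr0 down to lr_min."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + cosine)

# Compare the schedules at a few epochs (hypothetical settings)
for epoch in (0, 10, 50, 100):
    print(
        epoch,
        round(exponential_decay(0.1, 0.95, epoch), 5),
        round(step_drop(0.1, 0.5, 25, epoch), 5),
        round(cosine_annealing(0.1, 0.0, 100, epoch), 5),
    )
```

All three start at the same initial rate; they differ in how smoothly and how quickly the rate falls.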

### Readings and Videos

* Read Sections 3.6, 5.1-5.3 in Inside Deep Learning.
* **Course Notebooks with Videos**  Open each of the notebooks included in the lesson folder and watch the embedded video.  You can read along and work through the code examples as you want.  The notebooks for this lesson are in the Lesson_03 directory.  The notebooks are numbered in the order they should be used.

### Assessments

1.  Complete the reading quiz in Canvas (10 points).
2.  Complete the exercises in your homework notebook in CoCalc (40 points).

---

## Lesson 4: Improved Network Architectures

### Outcomes
* **Understand and apply ReLU and LeakyReLU activations** to address vanishing gradient problems and enhance network convergence.
* **Implement batch and layer normalization** to stabilize training and improve network performance.
* **Analyze and utilize residual connections** to enable deeper network architectures by mitigating vanishing gradient issues.
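To make the activation and residual ideas above concrete, here is a small scalar sketch in plain Python. The function names are assumptions; the default negative slope of 0.01 matches PyTorch's `LeakyReLU` default. Real layers operate on tensors, so treat this only as an illustration of the arithmetic.

```python
def relu(x):
    """Pass positive values through; clamp negatives to zero."""
    return max(0.0, x)

def relu_grad(x):
    """Gradient is 1 for positive inputs and 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but negatives keep a small slope instead of zero."""
    return x if x > 0 else negative_slope * x

def leaky_relu_grad(x, negative_slope=0.01):
    """Gradient never drops to exactly zero, so units can recover."""
    return 1.0 if x > 0 else negative_slope

def residual(x, f):
    """Residual connection: add the input back to the layer's output."""
    return x + f(x)

for x in (-2.0, 0.5):
    print(x, relu(x), relu_grad(x), leaky_relu(x), leaky_relu_grad(x))

# Even if the inner function outputs 0, the input still passes through.
print(residual(-2.0, relu))
```

The residual example shows why skip connections help deep networks: the identity path carries the signal (and its gradient) even when the wrapped layer contributes nothing.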

### Readings and Videos
* Read Sections 6.1-6.4 from Inside Deep Learning
* **Course Notebooks with Videos** Open each of the notebooks included in the lesson folder and watch the embedded video. You can read along and work through the code examples as you want. The notebooks are numbered in the order they should be used.

### Assessments
* Complete the reading quiz in Canvas (10 points).
* Complete the exercises in your homework notebook in CoCalc (40 points).

---

## Lesson 5: Object Detection and Segmentation

### Topics
* Transposed Convolutions
* U-Net for Segmentation
* R-CNN for Object Detection
* Other architectures

### Outcomes

1. **Differentiate Image Segmentation and Object Detection**: Explain the roles of pixel-level segmentation and object detection with bounding boxes.
   
2. **Build an Image Segmentation Model**: Implement segmentation using transposed convolutions and U-Net architecture.

3. **Apply Bounding Box Detection**: Use Faster R-CNN for object detection with bounding boxes and assess precision trade-offs.

4. **Reduce False Positives**: Explore filtering methods to improve detection accuracy by minimizing overlapping boxes.
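One common way to filter overlapping boxes is non-maximum suppression (NMS) based on intersection over union (IoU). Whether the lesson uses exactly this variant is an assumption; the sketch below is a plain Python illustration with hypothetical function names, boxes, and scores.

```python
def iou(box_a, box_b):
    """Intersection over union for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping ones that overlap a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections and one separate detection
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # -> [0, 2]
```

Raising `iou_threshold` keeps more overlapping boxes; lowering it suppresses more aggressively.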

### Readings and Videos

* Read Chapter 8 in Inside Deep Learning
* **Course Notebooks with Videos**  Open each of the notebooks included in the lesson folder and watch the embedded video.  You can read along and work through the code examples as you want.  The notebooks for this lesson are in the Lesson_05 directory.  The notebooks are numbered in the order they should be used.

### Assessments

1.  Complete the reading quiz in Canvas (10 points).
2.  Complete the exercises in your homework notebook in CoCalc (40 points).

---

## Lesson 6: Transfer Learning

### Topics
* Sources for pre-trained models:  Torch Hub, TIMM, Hugging Face, and more
* Identifying the layers that need tweaking
* Freezing and unfreezing Layers
* The small data problem
* Fine-tuning vs full Training

### Outcomes

1. **Explain Transfer Learning**: Describe the concept and benefits of transfer learning for leveraging pre-trained models on new tasks.

2. **Implement Model Parameter Transfer**: Apply a pre-trained model’s weights to a new problem by modifying only specific layers.

3. **Optimize Training with Limited Data**: Use transfer learning techniques to improve model performance with smaller labeled datasets.

4. **Adapt CNNs for New Tasks**: Fine-tune models for target datasets by adjusting layers and applying warm or frozen weights.


### Readings and Videos

* Read 13.1-13.3 in Inside Deep Learning
* **Course Notebooks with Videos**  Open each of the notebooks included in the lesson folder and watch the embedded video.  You can read along and work through the code examples as you want.  The notebooks for this lesson are in the Lesson_06 directory.  The notebooks are numbered in the order they should be used.

### Assessments

1.  Complete the reading quiz in Canvas (10 points).
2.  Complete the exercises in your homework notebook in CoCalc (40 points).

---

## Lesson 7: Introduction to Transformers, Hugging Face, and Using LLMs Effectively

### Topics
* Transformer architecture overview
* Self-attention mechanism
* Hugging Face library introduction
* Effective use of large language models (LLMs) and prompt engineering basics

### Outcomes

1. **Describe Transformer Architecture**: Identify the core components of transformers, including the encoder-decoder structure and the role of self-attention.
   
2. **Explain Self-Attention Basics**: Outline how self-attention works, including the function of queries, keys, and values, and the advantages it provides for handling context.

3. **Use Pre-trained Models**: Load and use a pre-trained Hugging Face model for a simple NLP task to gain familiarity with the Hugging Face transformers library.

4. **Improve Prompting Techniques**: Experiment with prompt engineering to enhance response quality when interacting with LLMs, learning to refine prompts for clarity and relevance.

### Readings and Videos
* Read *Chapter 1: Hello Transformers* in *Natural Language Processing with Transformers*
* **Course Notebooks with Videos**: Open each of the notebooks included in the Lesson_07 directory and watch the embedded videos in the recommended order.

### Assessments
1. Complete the reading quiz in Canvas (10 points).
2. Complete the exercises in your homework notebook in CoCalc (40 points).


---

## Lesson 8: Transformer Internals - Self-Attention and Positional Encoding

### Topics
* In-depth self-attention mechanics
* Multi-headed attention and context capture
* Positional encoding and sequence structure
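As a preview of the positional encoding topic, here is a sketch of the fixed sinusoidal encoding from the original Transformer paper, where even dimensions use a sine and odd dimensions use a cosine of `pos / 10000^(2i/d_model)`. This is one formulation among several (many models learn their position embeddings instead), and whether the lesson focuses on this one is an assumption. The function name is hypothetical.

```python
import math

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        # i walks over the even dimension indices 0, 2, 4, ...
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

pe = sinusoidal_positional_encoding(seq_len=4, d_model=4)
for row in pe:
    print([round(v, 4) for v in row])
```

Each position gets a distinct pattern of values, which is added to the token embeddings so that the otherwise order-agnostic attention layers can tell positions apart.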

### Outcomes

1. **Understand Self-Attention Calculations**: Demonstrate how self-attention works by calculating attention scores for a simple sequence, showing the process of computing queries, keys, and values.
   
2. **Explain Multi-Headed Attention**: Describe the purpose of multi-headed attention in transformers and explain how multiple attention heads provide richer contextual understanding.

3. **Discuss Positional Encoding**: Explain the role of positional encoding in maintaining sequence order within transformers and understand its mathematical formulation.

4. **Experiment with Attention Mechanisms**: Use Hugging Face models to observe how multi-headed attention and positional encoding affect outputs.
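The calculation in outcome 1 can be traced by hand on a tiny sequence. The sketch below implements single-head scaled dot-product self-attention with plain Python lists. The three 2-dimensional token vectors and the identity projection matrices are assumptions chosen so the numbers are easy to follow; real models learn the projection weights.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))  # one row of scores per query token
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return weights, matmul(weights, v)  # weighted average of the values

# Three tokens with 2-dimensional embeddings (hypothetical values)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, output = self_attention(x, identity, identity, identity)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and shows how much one token attends to every token in the sequence, itself included.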

### Readings and Videos
* Read *Chapter 3: Transformer Anatomy* in *Natural Language Processing with Transformers*
* **Course Notebooks with Videos**: Open each of the notebooks in the Lesson_08 directory and watch the embedded videos in the recommended order.

### Assessments
1. Complete the reading quiz in Canvas (10 points).
2. Complete the exercises in your homework notebook in CoCalc (40 points).


---

## Lesson 9: Text Classification with Transformers

### Topics
* Fine-tuning transformers for text classification
* Tokenization and data preprocessing for NLP tasks
* Using the Hugging Face Trainer API for efficient model training

### Outcomes

1. **Explain the Fine-Tuning Process**: Describe how transformers are fine-tuned for text classification tasks, focusing on modifying specific layers and adjusting model parameters.
   
2. **Use Tokenization for Classification Tasks**: Use Hugging Face’s `AutoTokenizer` to tokenize and preprocess input text for classification, understanding the effect of different tokenization strategies on model input.

3. **Fine-Tune a Transformer Model**: Fine-tune a BERT model (or similar) on a classification dataset, adjusting hyperparameters to improve model performance.

4. **Evaluate Model Performance**: Analyze the model’s accuracy on the validation set, learning to interpret common metrics (accuracy, F1) and assess model quality.
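The two metrics named in outcome 4 are simple to compute by hand for a binary task. This sketch uses plain Python with hypothetical labels and function names; in practice a metrics library would normally be used, and multi-class F1 requires an averaging choice that is not shown here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical validation labels and predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))  # -> 0.75 0.75
```

Accuracy and F1 agree here because the classes are balanced; on imbalanced data they can differ sharply, which is why both are usually reported.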

### Readings and Videos
* Read *Chapter 2: Text Classification* in *Natural Language Processing with Transformers*
* **Course Notebooks with Videos**: Open each notebook in the Lesson_09 directory and watch the embedded videos in the recommended order.

### Assessments
1. Complete the reading quiz in Canvas (10 points).
2. Complete the exercises in your homework notebook in CoCalc (40 points).


---

## Lesson 10: Named Entity Recognition (NER) and Tokenization

### Topics
* Overview of Named Entity Recognition (NER)
* Subword tokenization techniques (e.g., BPE, WordPiece)
* Multilingual considerations in tokenization and NER

### Outcomes

1. **Explain Named Entity Recognition (NER)**: Define NER and its applications, identifying how it is used to label specific entities (e.g., names, locations) in text.
   
2. **Differentiate Tokenization Methods**: Describe different tokenization techniques (e.g., Byte-Pair Encoding, WordPiece) and their relevance in NER and multilingual settings.

3. **Apply NER with Transformers**: Fine-tune a transformer model for NER tasks, learning how token-level classification works in transformers.

4. **Discuss Multilingual Challenges**: Explain tokenization and NER challenges in multilingual contexts, including handling multiple languages and out-of-vocabulary (OOV) words.
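To show how a subword vocabulary can be built, here is a toy sketch of Byte-Pair Encoding merge learning in plain Python. It is a simplified illustration, not the tokenizer used by any specific model: the word frequencies, the function names, and the omission of end-of-word markers are all assumptions.

```python
from collections import Counter

def get_pair_counts(corpus):
    """Count adjacent symbol pairs; corpus maps symbol tuples to frequencies."""
    pairs = Counter()
    for symbols, freq in corpus.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(corpus, pair):
    """Replace every occurrence of the pair with one merged symbol."""
    merged = {}
    for symbols, freq in corpus.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

def learn_bpe(word_freqs, num_merges):
    """Start from characters and repeatedly merge the most frequent pair."""
    corpus = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(corpus)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        corpus = merge_pair(corpus, best)
        merges.append(best)
    return merges, corpus

# Hypothetical word counts from a tiny corpus
word_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, corpus = learn_bpe(word_freqs, num_merges=2)
print(merges)
print(corpus)
```

After two merges the frequent suffix "est" has become a single symbol, so a word like "newest" is split into fewer pieces, and an unseen word with the same suffix can still be represented from known subwords.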

### Readings and Videos
* Read *Chapter 4: Multilingual Named Entity Recognition* in *Natural Language Processing with Transformers*
* **Course Notebooks with Videos**: Open each notebook in the Lesson_10 directory and watch the embedded videos in the recommended order.

### Assessments
1. Complete the reading quiz in Canvas (10 points).
2. Complete the exercises in your homework notebook in CoCalc (40 points).


---

## Lesson 11: Text Generation and Decoding Strategies

### Topics
* Overview of transformer-based text generation
* Decoding strategies: Greedy search, beam search, top-k sampling, and nucleus sampling
* Applications of text generation in NLP

### Outcomes

1. **Explain Text Generation Basics**: Describe how transformer models like GPT-2 generate text, and identify common applications of text generation, such as chatbots, content creation, and automated summarization.
   
2. **Use Decoding Strategies**: Implement and compare different decoding methods (e.g., greedy search, beam search, top-k sampling, nucleus sampling) to observe how each affects text generation quality.

3. **Evaluate Generated Text**: Assess the quality of generated text, discussing trade-offs in coherence, creativity, and relevance with different decoding strategies.

4. **Identify Real-World Applications**: Explain practical uses of text generation in industries like customer service, media, and content creation, understanding the strengths and limitations of transformer-based text generation.
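The decoding strategies in outcome 2 differ only in how they pick the next token from the model's probability distribution. The sketch below applies greedy search, top-k sampling, and nucleus (top-p) sampling to a single hypothetical next-token distribution in plain Python. Beam search is omitted because it needs a model to score multi-token continuations. All names and probabilities are assumptions.

```python
import random

def greedy(probs):
    """Always pick the single most likely token."""
    return max(probs, key=probs.get)

def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}

def nucleus_filter(probs, top_p):
    """Keep the smallest set of top tokens whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def sample(probs, rng):
    """Draw one token according to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical next-token distribution
next_token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "bird": 0.05}
rng = random.Random(0)  # seeded for repeatability
print(greedy(next_token_probs))
print(sample(top_k_filter(next_token_probs, k=2), rng))
print(sample(nucleus_filter(next_token_probs, top_p=0.9), rng))
```

Greedy search is deterministic, while the two sampling strategies trade some coherence for variety by allowing lower-probability tokens within a restricted candidate set.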

### Readings and Videos
* Read *Chapter 5: Text Generation* in *Natural Language Processing with Transformers*
* **Course Notebooks with Videos**: Open each notebook in the Lesson_11 directory and watch the embedded videos in the recommended order.

### Assessments
1. Complete the reading quiz in Canvas (10 points).
2. Complete the exercises in your homework notebook in CoCalc (40 points).


---

## Lesson 12: Summarization with Transformers

### Topics
* Extractive vs. abstractive summarization
* Transformer models for summarization (e.g., BART, T5)
* Evaluation metrics for summarization (e.g., ROUGE scores)

### Outcomes

1. **Understand Summarization Approaches**: Differentiate between extractive and abstractive summarization, identifying the benefits and limitations of each approach.
   
2. **Fine-Tune a Summarization Model**: Use a transformer model, such as BART or T5, to perform summarization on a dataset, focusing on fine-tuning techniques for high-quality summary generation.

3. **Evaluate Summaries**: Apply evaluation metrics like ROUGE to assess the relevance, coherence, and completeness of generated summaries.

4. **Discuss Summarization Applications**: Identify practical applications of summarization in fields like news, research, customer service, and document management.
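To give a feel for what a ROUGE score measures, here is a simplified ROUGE-1 computation (unigram overlap) in plain Python. It is a sketch under stated assumptions: whitespace tokenization, lowercasing, and no stemming. Established ROUGE implementations handle more cases, so their numbers may differ. The function name and example sentences are hypothetical.

```python
from collections import Counter

def rouge_1(reference, candidate):
    """Unigram-overlap precision, recall, and F1 between two texts."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared word
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(1, sum(ref_counts.values()))
    precision = overlap / max(1, sum(cand_counts.values()))
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_1(reference, candidate))
```

Recall asks how much of the reference summary the candidate covers; precision asks how much of the candidate appears in the reference. Word overlap alone does not capture coherence, which is one reason generated summaries are often also read and judged by people.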

### Readings and Videos
* Read *Chapter 6: Summarization* in *Natural Language Processing with Transformers*
* **Course Notebooks with Videos**: Open each notebook in the Lesson_12 directory and watch the embedded videos in the recommended order.

### Assessments
1. Complete the reading quiz in Canvas (10 points).
2. Complete the exercises in your homework notebook in CoCalc (40 points).


---

## Lesson 13 (not finalized)

In the final two weeks of the course you will either:

* Investigate a new topic from one of our textbooks on your own and produce a notebook that introduces the topic, explains a bit about how it works, and demonstrates one or more computational experiments related to the topic.  Your goal is to produce an "educational exposition" that highlights the topic so that one of your peers could read it to supplement their own study of the corresponding textbook chapter.  It would be similar to any of the notebooks I've provided for the class.

* Apply one of the topics we've covered to a new dataset or in a new way.  For example, ...

### Assessments
* Submit your notebook in the Final Project folder in CoCalc.  (100 pts)

#### Rubric:

---