2 changes: 1 addition & 1 deletion week03_fast_pipelines/homework/README.md
@@ -86,7 +86,7 @@ Don't forget that you also need to build a correct attention mask to prevent cro

For each of the implemented methods (and all variations of the third method), mock one training epoch and measure minimum, maximum, mean and median batch processing times.
To mock a training epoch, you need to construct a small GPT-2-like model: use `nn.Embedding` layer, `PositionalEncoding` class from `transformer.py` file and a single `nn.TransformerDecoder` layer with a hidden size of 1024 and 8 heads.
- For tokenization, use `torchtext.data.utils.get_tokenizer("basic_english")`.
+ For tokenization, use the `.tokenize()` method of `AutoTokenizer.from_pretrained("bert-base-uncased")`.
Run one epoch **without a backward pass**.
Make sure you've [warmed up](https://forums.developer.nvidia.com/t/why-warm-up/48565) the GPU before computing the statistics and do not forget about asynchronous CUDA kernel execution.
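
For orientation, here is a minimal timing sketch of what the updated instructions describe, not a reference solution: the Hugging Face tokenizer replaces the torchtext one, the model is a single decoder layer with hidden size 1024 and 8 heads, the epoch runs forward-only, and CUDA is synchronized before each time reading. The names `run_epoch` and `batches` are illustrative assumptions, and the `PositionalEncoding` layer and attention mask from the actual assignment are omitted.

```python
# Minimal sketch, not a reference solution: forward-only timing with GPU
# warm-up and explicit synchronization. `run_epoch` and `batches` are
# illustrative names; PositionalEncoding and the attention mask are omitted.
import statistics
import time

import torch
import torch.nn as nn
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokens = tokenizer.tokenize("a short sanity-check sentence")  # list of word pieces

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
embedding = nn.Embedding(tokenizer.vocab_size, 1024).to(device)
layer = nn.TransformerDecoderLayer(d_model=1024, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=1).to(device)


def run_epoch(batches, warmup_steps=10):
    """Time each batch's forward pass; drop the first `warmup_steps` readings."""
    times = []
    with torch.no_grad():  # the assignment asks for no backward pass
        for step, token_ids in enumerate(batches):  # token_ids: LongTensor of shape (B, T)
            token_ids = token_ids.to(device)
            if device.type == "cuda":
                torch.cuda.synchronize()
            start = time.perf_counter()
            hidden = embedding(token_ids)  # a real solution applies PositionalEncoding here
            decoder(hidden, hidden)        # and passes a causal attention mask
            if device.type == "cuda":
                torch.cuda.synchronize()   # wait for async CUDA kernels before timing
            if step >= warmup_steps:       # discard warm-up iterations
                times.append(time.perf_counter() - start)
    return min(times), max(times), statistics.mean(times), statistics.median(times)
```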

1 change: 1 addition & 0 deletions week03_fast_pipelines/homework/requirements.txt
@@ -11,6 +11,7 @@ torch==2.4.0
torchtext
torchvision==0.19.0
tqdm==4.64.1
+ transformers==4.48.2
vit_pytorch==0.40.2
gdown==4.7.3
matplotlib==3.8.2
1 change: 1 addition & 0 deletions week03_fast_pipelines/homework/task2/dataset.py
@@ -3,6 +3,7 @@
import torch
from torch.utils.data.dataset import Dataset
from torch.utils.data import Sampler, IterableDataset
+ from transformers import AutoTokenizer


MAX_LENGTH = 640
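
Purely for illustration, one plausible way the newly imported `AutoTokenizer` could be wired into `dataset.py`: load the tokenizer once and apply `.tokenize()` per sample, truncating to `MAX_LENGTH`. The class name, constructor arguments, and truncation logic below are assumptions, not the repository's actual code.

```python
# Hypothetical usage sketch of the AutoTokenizer import; the class and
# attribute names are assumptions, not the actual dataset implementation.
import torch
from torch.utils.data.dataset import Dataset
from transformers import AutoTokenizer

MAX_LENGTH = 640


class TextDataset(Dataset):  # hypothetical class name
    def __init__(self, texts):
        self.texts = texts
        self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # .tokenize() yields word pieces; truncate to the dataset's MAX_LENGTH
        tokens = self.tokenizer.tokenize(self.texts[idx])[:MAX_LENGTH]
        return torch.tensor(self.tokenizer.convert_tokens_to_ids(tokens))
```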