In [None]:
# =============================================================================
# DS776 REQUIRED SETUP - Run this cell FIRST, before any other code!
# =============================================================================
# This cell:
#   1. Configures cache paths so downloads go to the right location
#   2. Updates the course package (introdl) if needed
#   3. Suppresses TensorFlow/Keras warnings
#
# If this fails, see: Lessons/Course_Tools/SETUP_HELP.md
# =============================================================================
%run ../Course_Tools/auto_update_introdl.py

In [None]:
# =============================================================================
# IMPORTS AND PATH CONFIGURATION
# =============================================================================

# Third-party numerical and plotting imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# PyTorch imports
import torch
import torch.nn as nn

# Course utilities - ALWAYS use flattened imports from introdl
from introdl import (
    config_paths_keys,
    get_device,
    # Add other functions as needed for this lesson
)

# Configure paths - ALWAYS use these variables, never hardcode paths!
paths = config_paths_keys()
DATA_PATH = paths['DATA_PATH']      # Where datasets are stored
MODELS_PATH = paths['MODELS_PATH']  # Where trained models are saved
# CACHE_PATH = paths['CACHE_PATH']  # (Optional) Where pretrained models cache

# Set up plotting defaults
sns.set_theme(style='whitegrid')
plt.rcParams['figure.figsize'] = [8, 6]

# Lesson XX: Topic Title

Brief introduction to what this lesson covers.

---

## Video: Topic Overview

<iframe 
    src="https://media.uwex.edu/content/ds/ds776/..." 
    width="800" 
    height="450" 
    style="border: 5px solid cyan;"  
    allowfullscreen>
</iframe>
<br>
<a href="https://media.uwex.edu/content/ds/ds776/..." target="_blank">Open video in new tab</a>

## Section 1: First Topic

Explanation of the first concept...

In [None]:
# Example code demonstrating the concept
# Always use DATA_PATH for loading data:
# dataset = load_dataset(DATA_PATH / 'dataset_name')

# Always use MODELS_PATH for saving models:
# torch.save(model.state_dict(), MODELS_PATH / 'model_name.pt')

## Using HuggingFace Trainer (NLP Lessons)

When using HuggingFace `Trainer` or `Seq2SeqTrainer`, **always include these settings** to avoid filling up storage:

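```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=str(MODELS_PATH / 'model_name'),
    eval_strategy="epoch",        # Must match save_strategy when load_best_model_at_end=True
                                  # (named evaluation_strategy in older transformers releases)
    save_strategy="epoch",
    save_total_limit=1,           # CRITICAL: Cap the number of checkpoints kept on disk!
    load_best_model_at_end=True,  # Load best model when training completes
    metric_for_best_model="eval_loss",  # Or "eval_accuracy", "eval_f1", etc.
    # ... other arguments ...
)
```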

**Why this matters:**
- Without `save_total_limit`, Trainer keeps every checkpoint it writes; with `save_strategy="epoch"`, that is one per epoch
- Each checkpoint can be 500 MB to 2 GB for transformer models
- 10 epochs therefore produce 5 to 20 GB of checkpoints, quickly filling CoCalc storage
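
The arithmetic behind these numbers can be sketched as a small helper. This is only a rough estimate, not part of `introdl` or `transformers`: the function name `checkpoint_storage_gb` and its parameters are hypothetical, and it assumes every checkpoint is the same size. It also assumes that with `load_best_model_at_end=True` and `save_total_limit=1`, up to two checkpoints (the best and the most recent) can remain on disk.

```python
def checkpoint_storage_gb(num_epochs, checkpoint_gb, save_total_limit=None, keep_best=True):
    """Rough upper bound on disk space used by per-epoch checkpoints, in GB.

    Hypothetical helper for illustration; assumes one equally sized
    checkpoint is written per epoch.
    """
    if save_total_limit is None:
        # No limit: every epoch's checkpoint stays on disk.
        kept = num_epochs
    else:
        limit = save_total_limit
        # Assumption: when the best model is tracked and the limit is 1,
        # the best and the latest checkpoint may both be kept.
        if keep_best and limit == 1:
            limit = 2
        kept = min(num_epochs, limit)
    return kept * checkpoint_gb


# 10 epochs with no limit, for small and large checkpoints
print(checkpoint_storage_gb(10, 0.5))
print(checkpoint_storage_gb(10, 2.0))

# Same run with save_total_limit=1
print(checkpoint_storage_gb(10, 2.0, save_total_limit=1))
```

Even in the worst case with the limit in place, the footprint stays at a couple of checkpoints instead of growing with the number of epochs.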

## Section 2: Second Topic

More content...

In [None]:
# More example code...

## Summary

Key takeaways from this lesson:

1. First key point
2. Second key point
3. Third key point

---

## Cleanup Reminder

If you're running low on storage, you can free space by running the cleanup notebook:

**`Lessons/Course_Tools/Storage_Cleanup.ipynb`**

This will remove cached models and old checkpoints while preserving your current work.
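
To see how much space a folder is using before deciding whether to clean up, a minimal sketch like the following may help. It is not the method used by the cleanup notebook; `dir_size_mb` is a hypothetical helper that simply sums file sizes with `pathlib`, skipping symlinks so that linked files are not counted twice.

```python
from pathlib import Path


def dir_size_mb(path):
    """Total size of regular files under `path`, in megabytes (1 MB = 1e6 bytes).

    Hypothetical helper for illustration; returns 0.0 if the folder is missing.
    """
    root = Path(path)
    if not root.exists():
        return 0.0
    total_bytes = sum(
        f.stat().st_size
        for f in root.rglob('*')
        if f.is_file() and not f.is_symlink()  # skip symlinks to avoid double counting
    )
    return total_bytes / 1e6


# Example usage, assuming MODELS_PATH was defined in the imports cell:
# print(f"{dir_size_mb(MODELS_PATH):.1f} MB used by saved models")
```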