# Clean and Free Space - Course-Wide Storage Management

This notebook helps you manage storage across the entire DS776 course.

## Features:
1. **Comprehensive Storage Report** - View usage across all lessons and homework
2. **Clean Old Cache** - Remove old pretrained models and datasets
3. **Delete All Lesson Models** - Free space from all lesson folders at once
4. **Delete All Homework Models** - Free space from all homework folders
5. **Emergency Cleanup** - Last resort options when critically low on space

In [None]:
# Import necessary utilities
from pathlib import Path
import os
import shutil
from datetime import datetime, timedelta

try:
    from introdl.utils import (
        display_storage_report,
        cleanup_old_cache,
        get_folder_size,
        format_size
    )
    print("✅ Course utilities loaded successfully")
except ImportError:
    print("❌ Could not import introdl utilities")
    print("   Please run Course_Setup.ipynb first!")

## 1. Comprehensive Storage Report

Get a complete overview of storage usage across the entire course.
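
The report is built on folder-size measurements. As a rough illustration of that idea (a minimal sketch using only the standard library, not the actual `introdl` implementation), the hypothetical helpers `folder_size_bytes` and `human_size` below total the files under a folder and format the result:

```python
from pathlib import Path

def folder_size_bytes(folder):
    """Sum the sizes of all regular files under `folder` (hypothetical helper)."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())

def human_size(num_bytes):
    """Format a byte count using binary units (hypothetical helper)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"

# Example: size of the current working directory
print(human_size(folder_size_bytes(".")))
```

The course functions `get_folder_size` and `format_size` may differ in details such as units or how symbolic links are handled.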

In [None]:
# Display comprehensive storage report
display_storage_report()

## 2. Clean Old Cache Files (Pretrained Models)

Remove downloaded pretrained models and datasets older than 7 days.

**📚 What gets deleted?**
- Pretrained models downloaded from HuggingFace, OpenAI, etc.
- Cached datasets that can be re-downloaded
- Files in the `downloads` folder older than 7 days

**What is preserved?**
- Your trained models in Lesson_XX_models and Homework_XX_models folders
- All notebooks and scripts
- Recent downloads (< 7 days old)
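
To make the age rule concrete, here is a minimal sketch of how files older than a cutoff could be found. It assumes age is judged by last-modified time; the real `cleanup_old_cache` may use a different criterion, and `find_old_files` is a hypothetical name used only for illustration. It lists files and deletes nothing.

```python
import time
from pathlib import Path

def find_old_files(folder, days_old=7, now=None):
    """Return files under `folder` last modified more than `days_old` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days_old * 24 * 60 * 60  # cutoff as a Unix timestamp
    return [
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    ]

# Example: list (but do not delete) old files in the current directory
for path in find_old_files(".", days_old=7):
    print(path)
```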

In [None]:
# First, see what would be deleted (dry run)
print("🔍 Checking for old cache files...\n")
cleanup_old_cache(days_old=7, dry_run=True)

In [None]:
# ⚠️ Uncomment to actually delete old cache files
# cleanup_old_cache(days_old=7, dry_run=False)
# print("\n" + "="*60)
# print("📊 UPDATED STORAGE AFTER CLEANUP")
# print("="*60)
# display_storage_report()

## 3. Delete All Lesson Models

Remove ALL models you trained while working through the lessons.

**⚠️ WARNING**: This will delete all Lesson_XX_models folders across all lessons!

In [None]:
def delete_all_lesson_models(confirm=False):
    """Delete all lesson model folders."""
    course_root = Path(os.environ.get('DS776_ROOT_DIR', Path.home()))
    lessons_dir = course_root / "Lessons"
    
    if not lessons_dir.exists():
        print("❌ Lessons directory not found")
        return
    
    total_size = 0
    model_folders = []
    
    # Find all lesson model folders
    for lesson_dir in lessons_dir.glob("Lesson_*"):
        for models_dir in lesson_dir.glob("*_models"):
            size = get_folder_size(models_dir)
            total_size += size
            model_folders.append((models_dir, size))
    
    if not model_folders:
        print("ℹ️ No lesson model folders found")
        return
    
    print(f"🗂️ LESSON MODELS TO DELETE")
    print("-" * 40)
    for folder, size in model_folders:
        print(f"  • {folder.parent.name}/{folder.name}: {format_size(size)}")
    print(f"\n  Total: {format_size(total_size)} from {len(model_folders)} folders")
    
    if not confirm:
        print("\n⚠️ Set confirm=True to delete these folders")
        return
    
    # Delete the folders, counting only the space from successful deletions
    deleted_count = 0
    freed_size = 0
    for folder, size in model_folders:
        try:
            shutil.rmtree(folder)
            deleted_count += 1
            freed_size += size
            print(f"  ✅ Deleted: {folder.name}")
        except Exception as e:
            print(f"  ❌ Error deleting {folder.name}: {e}")
    
    print(f"\n✅ Deleted {deleted_count} folders, freed {format_size(freed_size)}")

# Check what would be deleted
delete_all_lesson_models(confirm=False)

In [None]:
# ⚠️ Uncomment to actually delete ALL lesson models
# delete_all_lesson_models(confirm=True)
# display_storage_report()

## 4. Delete All Homework Models

Remove ALL models from homework folders.

**⚠️ WARNING**: 
- This will delete all Homework_XX_models folders!
- Consider using the Zip Models feature in the homework utilities to back up your models first
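
If the Zip Models feature is not available to you, a models folder can also be archived with the standard library. The sketch below is one possible approach, not the course's own backup tool; `zip_models_folder` and both paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path

def zip_models_folder(models_dir, backup_dir):
    """Zip `models_dir` into `backup_dir` and return the archive path (hypothetical helper)."""
    models_dir = Path(models_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        base_name=str(backup_dir / models_dir.name),  # ".zip" is appended automatically
        format="zip",
        root_dir=str(models_dir.parent),  # archive paths are relative to this folder
        base_dir=models_dir.name,         # only this subfolder is archived
    )
    return Path(archive)
```

For example, `zip_models_folder("Homework/Homework_01/Homework_01_models", "backups")` would write `backups/Homework_01_models.zip`. The archive itself takes up space, so keep `backup_dir` outside the models folder and move or download the zip file before deleting anything.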

In [None]:
def delete_all_homework_models(confirm=False):
    """Delete all homework model folders."""
    course_root = Path(os.environ.get('DS776_ROOT_DIR', Path.home()))
    homework_dir = course_root / "Homework"
    
    if not homework_dir.exists():
        print("❌ Homework directory not found")
        return
    
    total_size = 0
    model_folders = []
    
    # Find all homework model folders
    for hw_dir in homework_dir.glob("Homework_*"):
        for models_dir in hw_dir.glob("*_models"):
            size = get_folder_size(models_dir)
            total_size += size
            model_folders.append((models_dir, size))
    
    if not model_folders:
        print("ℹ️ No homework model folders found")
        return
    
    print(f"🗂️ HOMEWORK MODELS TO DELETE")
    print("-" * 40)
    for folder, size in model_folders:
        print(f"  • {folder.parent.name}/{folder.name}: {format_size(size)}")
    print(f"\n  Total: {format_size(total_size)} from {len(model_folders)} folders")
    
    if not confirm:
        print("\n⚠️ Set confirm=True to delete these folders")
        print("💡 TIP: Use homework utility notebooks to zip models first!")
        return
    
    # Delete the folders, counting only the space from successful deletions
    deleted_count = 0
    freed_size = 0
    for folder, size in model_folders:
        try:
            shutil.rmtree(folder)
            deleted_count += 1
            freed_size += size
            print(f"  ✅ Deleted: {folder.name}")
        except Exception as e:
            print(f"  ❌ Error deleting {folder.name}: {e}")
    
    print(f"\n✅ Deleted {deleted_count} folders, freed {format_size(freed_size)}")

# Check what would be deleted
delete_all_homework_models(confirm=False)

In [None]:
# ⚠️ Uncomment to actually delete ALL homework models
# delete_all_homework_models(confirm=True)
# display_storage_report()

## 5. Emergency Cleanup

**🚨 USE ONLY WHEN CRITICALLY LOW ON SPACE!**

This will delete:
- ALL cached/pretrained models (regardless of age)
- ALL downloaded datasets
- Everything in the downloads folder

You'll need to re-download any pretrained models you use again.
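
Before resorting to this, it can help to check how much space is actually free. The sketch below uses `shutil.disk_usage` from the standard library; the 5% threshold and the helper names are arbitrary assumptions, not course policy. On hosted environments with per-user quotas, the reported figures may describe the whole file system rather than your own allowance.

```python
import shutil

def free_space_fraction(path="."):
    """Return the fraction of the disk holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

def is_critically_low(path=".", threshold=0.05):
    """True if less than `threshold` of the disk is free (threshold is an assumed value)."""
    return free_space_fraction(path) < threshold

print(f"Free space: {free_space_fraction():.1%}")
print("Critically low!" if is_critically_low() else "Space looks OK for now.")
```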

In [None]:
# ⚠️ EMERGENCY ONLY - Uncomment BOTH of the next two lines to delete ALL cache
# print("🚨 EMERGENCY CLEANUP - Deleting ALL cache...")
# cleanup_old_cache(days_old=0, dry_run=False)  # Delete ALL cache

# print("\n" + "="*60)
# print("📊 STORAGE AFTER EMERGENCY CLEANUP")
# print("="*60)
# display_storage_report()

## Quick Reference

### Storage Management Priority:

1. **First**: Clean old cache (>7 days) - Section 2
2. **Second**: Delete lesson models - Section 3
3. **Third**: Zip and delete homework models - Section 4
4. **Last Resort**: Emergency cleanup - Section 5

### Tips:
- Run storage report first to understand usage
- Start with least destructive options
- Always back up homework models before deleting
- Remember: pretrained models can always be re-downloaded