# Tutorial 1: Archive Basics - Creating Your First Climate Data Archive

**Learning Goals:** By the end of this tutorial, you will understand how to create, list, and examine climate simulation archives using Tellus.

**Time Estimate:** 20 minutes

**Prerequisites:** Basic familiarity with climate model outputs (NetCDF files) and command line operations.

## What is a Climate Data Archive?

Imagine you've just finished running a 6-month CESM simulation. Your output directory contains hundreds of files, laid out roughly like this (abbreviated):

```
cesm_simulation/
├── input/
│   ├── user_nl_cam
│   ├── user_nl_clm
│   └── initial_conditions.nc
├── output/
│   ├── cam.h0.2024-01.nc    # Monthly atmospheric data
│   ├── cam.h0.2024-02.nc
│   ├── clm.h0.2024-01.nc    # Monthly land data
│   ├── clm.h0.2024-02.nc
│   └── pop.h.2024-01.nc     # Monthly ocean data
├── restart/
│   ├── cam.r.2024-01-01.nc
│   └── clm.r.2024-01-01.nc
└── logs/
    ├── atm.log
    └── run.log
```

**The Challenge**: This data is scattered across several directories, takes up considerable disk space, and eventually has to move to long-term storage. You also want a record of what the archive contains so you can find things again for future analysis.

**The Solution**: Tellus archives provide:
- **Compression**: Reduces storage space
- **Organization**: Automatically classifies files by type and importance
- **Metadata**: Tracks what's inside without unpacking
- **Selective Access**: Extract only what you need later

Let's see how this works in practice!

## Setup: Creating Sample CESM Output

First, let's create a realistic CESM simulation directory to work with:

In [None]:
import tempfile
from pathlib import Path
import json
import numpy as np
import xarray as xr
from datetime import datetime, timedelta

# Create a temporary directory for our tutorial
tutorial_dir = Path(tempfile.mkdtemp())
print(f"Tutorial workspace: {tutorial_dir}")

def create_cesm_simulation_directory():
    """
    Creates a realistic CESM simulation directory structure with sample files.
    This simulates what you'd have after running a 3-month CESM simulation.
    """
    
    # Main simulation directory
    sim_dir = tutorial_dir / "cesm_f2000_tutorial"
    sim_dir.mkdir(parents=True, exist_ok=True)
    
    # 1. INPUT FILES - Configuration and initial conditions
    input_dir = sim_dir / "input"
    input_dir.mkdir(exist_ok=True)
    
    # CESM namelists (critical configuration files)
    (input_dir / "user_nl_cam").write_text(
        "! CAM atmospheric model configuration\n"
        "nhtfrq = -24\n"  # Daily output frequency
        "mfilt = 30\n"    # 30 time steps per file
        "fincl1 = 'T','Q','U','V'\n"  # Variables to output
    )
    
    (input_dir / "user_nl_clm").write_text(
        "! CLM land model configuration\n"
        "hist_nhtfrq = -24\n"  # Daily output
        "hist_mfilt = 30\n"
        "hist_fincl1 = 'TSA','RAIN','SNOW'\n"  # Land surface variables
    )
    
    # Create a small initial conditions file
    print("Creating sample initial conditions file...")
    create_sample_netcdf(input_dir / "initial_conditions.nc", "initial")
    
    # 2. OUTPUT FILES - Model results (the main data you want to analyze)
    output_dir = sim_dir / "output"
    output_dir.mkdir(exist_ok=True)
    
    print("Creating sample output files...")
    # Atmospheric output (CAM) - monthly files
    for month in ["01", "02", "03"]:
        create_sample_netcdf(output_dir / f"cam.h0.2024-{month}.nc", "atmosphere")
    
    # Land model output (CLM) - monthly files  
    for month in ["01", "02", "03"]:
        create_sample_netcdf(output_dir / f"clm.h0.2024-{month}.nc", "land")
    
    # Ocean output (POP) - monthly files
    for month in ["01", "02", "03"]:
        create_sample_netcdf(output_dir / f"pop.h.2024-{month}.nc", "ocean")
    
    # 3. RESTART FILES - For continuing simulations (critical!)
    restart_dir = sim_dir / "restart"
    restart_dir.mkdir(exist_ok=True)
    
    print("Creating sample restart files...")
    create_sample_netcdf(restart_dir / "cam.r.2024-04-01.nc", "restart")
    create_sample_netcdf(restart_dir / "clm.r.2024-04-01.nc", "restart")
    
    # 4. LOG FILES - Model run information
    logs_dir = sim_dir / "logs"
    logs_dir.mkdir(exist_ok=True)
    
    (logs_dir / "atm.log").write_text(
        "CAM Atmospheric Model Log\n"
        "=========================\n"
        "Run started: 2024-01-01 00:00:00\n"
        "Resolution: f19_g16\n"
        "Timestep: 1800s\n"
        "Integration successful for 90 days\n"
        "Run completed: 2024-03-31 23:59:59\n"
    )
    
    (logs_dir / "run.log").write_text(
        "CESM Run Log\n"
        "============\n"
        "Case: f2000_tutorial\n"
        "Components: CAM, CLM, POP, CICE\n"
        "Start date: 2024-01-01\n"
        "End date: 2024-03-31\n"
        "Total simulation time: 4.5 hours\n"
        "Status: COMPLETED SUCCESSFULLY\n"
    )
    
    # 5. SCRIPTS - Analysis and processing scripts
    scripts_dir = sim_dir / "scripts"
    scripts_dir.mkdir(exist_ok=True)
    
    (scripts_dir / "postprocess.py").write_text(
        "#!/usr/bin/env python3\n"
        "\"\"\"Post-processing script for CESM output\"\"\"\n"
        "import xarray as xr\n"
        "\n"
        "# Calculate monthly means\n"
        "def monthly_means(input_file, output_file):\n"
        "    ds = xr.open_dataset(input_file)\n"
        "    monthly = ds.resample(time='M').mean()\n"
        "    monthly.to_netcdf(output_file)\n"
    )
    
    return sim_dir

def create_sample_netcdf(filepath, data_type):
    """
    Creates a small but realistic NetCDF file for different Earth Science data types.
    This helps you understand what different file types contain.
    """
    
    # Simple coordinate system
    lat = np.linspace(-90, 90, 64)  # 64 latitude points
    lon = np.linspace(0, 360, 128, endpoint=False)  # 128 longitude points; 360 is excluded because it duplicates 0
    
    if data_type == "atmosphere":
        # Atmospheric data: temperature, humidity, winds
        time = [datetime(2024, 1, 15)]  # Mid-month
        temp = (288 + 30 * np.cos(np.radians(lat)))[:, None] * np.ones((1, lon.size))  # Temperature gradient, broadcast to (lat, lon)
        
        ds = xr.Dataset({
            'T': (['time', 'lat', 'lon'], temp[None, :, :]),
            'Q': (['time', 'lat', 'lon'], 0.01 * np.ones((1, 64, 128))),
        }, coords={
            'time': time,
            'lat': lat,
            'lon': lon
        })
        
        ds.attrs = {
            'title': 'CAM Atmospheric Model Output',
            'model': 'CAM6',
            'resolution': 'f19_g16',
            'case': 'f2000_tutorial'
        }
        
    elif data_type == "land":
        # Land surface data: temperatures, precipitation
        time = [datetime(2024, 1, 15)]
        surface_temp = (285 + 25 * np.cos(np.radians(lat)))[:, None] * np.ones((1, lon.size))  # Broadcast to (lat, lon)
        
        ds = xr.Dataset({
            'TSA': (['time', 'lat', 'lon'], surface_temp[None, :, :]),
            'RAIN': (['time', 'lat', 'lon'], 0.001 * np.ones((1, 64, 128))),
        }, coords={
            'time': time,
            'lat': lat,
            'lon': lon
        })
        
        ds.attrs = {
            'title': 'CLM Land Model Output',
            'model': 'CLM5',
            'case': 'f2000_tutorial'
        }
        
    elif data_type == "ocean":
        # Ocean data: temperature, currents
        time = [datetime(2024, 1, 15)]
        depth = np.array([5, 15, 25, 35])  # Ocean levels
        sst = 288 + 15 * np.cos(np.radians(lat))[:, None]  # Sea surface temp
        
        ds = xr.Dataset({
            'TEMP': (['time', 'z_t', 'lat', 'lon'], 
                    sst[None, None, :, :] * np.ones((1, 4, 64, 128))),
        }, coords={
            'time': time,
            'z_t': depth,
            'lat': lat,
            'lon': lon
        })
        
        ds.attrs = {
            'title': 'POP Ocean Model Output',
            'model': 'POP2',
            'case': 'f2000_tutorial'
        }
        
    elif data_type in ["initial", "restart"]:
        # Restart/initial files: model state for continuing runs
        time = [datetime(2024, 4, 1)]  # Restart date
        state_data = 300 * np.ones((64, 128))  # Model state
        
        ds = xr.Dataset({
            'STATE': (['lat', 'lon'], state_data),
            'CHECKPOINT': (['lat', 'lon'], state_data * 0.9),
        }, coords={
            'lat': lat,
            'lon': lon
        })
        
        ds.attrs = {
            'title': f'{data_type.title()} File for CESM',
            'restart_date': '2024-04-01',
            'case': 'f2000_tutorial'
        }
    
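    else:
        # Fail early with a clear message instead of an undefined `ds` below
        raise ValueError(f"Unknown data_type: {data_type!r}")

    # Save the file
    ds.to_netcdf(filepath, format='NETCDF4')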

# Create the simulation directory
cesm_dir = create_cesm_simulation_directory()
print(f"\n✅ Created CESM simulation directory: {cesm_dir}")

# Show the structure
print("\n📁 Directory Structure:")
for item in sorted(cesm_dir.rglob('*')):
    if item.is_file():
        rel_path = item.relative_to(cesm_dir)
        # The sample files are small, so report KB (most would round to 0.0 MB)
        size_kb = item.stat().st_size / 1024
        print(f"  {rel_path} ({size_kb:.1f} KB)")

## Understanding What We Have

Before creating archives, let's understand the different types of files in our CESM simulation:

### File Types and Their Importance

| **Type** | **Files** | **Purpose** | **Importance** |
|----------|-----------|-------------|----------------|
| **Input/Config** | `user_nl_*`, `initial_conditions.nc` | Model setup and parameters | **CRITICAL** - Need these to reproduce the run |
| **Output** | `cam.h0.*`, `clm.h0.*`, `pop.h.*` | Scientific results | **IMPORTANT** - Main analysis data |
| **Restart** | `*.r.*` files | Continue simulation | **CRITICAL** - Cannot continue run without these |
| **Logs** | `*.log` files | Diagnostic information | **OPTIONAL** - Useful for debugging |
| **Scripts** | `*.py` files | Analysis workflows | **IMPORTANT** - For reproducibility |

**Key Insight**: Not all files are equally important! You might archive everything, but extract only what you need for specific analyses.

## Step 1: Your First Archive Creation

Now let's create your first climate data archive. We'll use Tellus to compress and organize all the simulation files:

In [None]:
# Import Tellus archive system
from tellus.core.cli import console
import subprocess
import sys

# First, let's see the space we're using
def calculate_directory_size(directory):
    """Calculate total size of directory in MB"""
    total_size = sum(f.stat().st_size for f in directory.rglob('*') if f.is_file())
    return total_size / (1024 * 1024)

original_size = calculate_directory_size(cesm_dir)
console.print(f"[blue]Original simulation size: {original_size:.1f} MB[/blue]")

# Create our first archive using the CLI
archive_dir = tutorial_dir / "archives"
archive_dir.mkdir(exist_ok=True)

archive_name = "cesm_tutorial_complete"

console.print("\n[bold blue]Creating your first climate data archive...[/bold blue]")
console.print("[dim]This will compress and organize all simulation files[/dim]")

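# Use pixi to run the CLI command.
# NOTE: the `cwd` passed to subprocess.run below is the author's local Tellus
# checkout; change it to the path of your own checkout before running this cell.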
cmd = [
    "pixi", "run", "tellus", "archive", "create", 
    archive_name,
    str(cesm_dir),
    "--location", "local_archive"
]

try:
    result = subprocess.run(cmd, capture_output=True, text=True, cwd="/Users/pgierz/Code/github.com/pgierz/tellus")
    
    if result.returncode == 0:
        console.print("[green]✅ Archive created successfully![/green]")
        console.print(f"[dim]Command output: {result.stdout}[/dim]")
    else:
        console.print(f"[red]❌ Archive creation failed: {result.stderr}[/red]")
        # Let's try a simpler approach for the tutorial
        console.print("[yellow]⚠️  CLI not available, creating archive manually for tutorial...[/yellow]")
        
except Exception as e:
    console.print(f"[red]Error running command: {e}[/red]")
    console.print("[yellow]⚠️  Continuing with manual archive creation for tutorial...[/yellow]")

Let me show you what happens conceptually when you create an archive (since the CLI might not be available in this environment):

In [None]:
# Let's create a simplified archive manually to show the concepts
import tarfile
import json
from datetime import datetime

def create_tutorial_archive(source_dir, archive_path):
    """
    Manually create an archive to demonstrate the concepts.
    This shows what Tellus does internally.
    """
    
    console.print("[blue]📦 Creating compressed archive...[/blue]")
    
    # Create the compressed tar archive
    with tarfile.open(archive_path, "w:gz") as tar:
        # Add all files to the archive
        for file_path in source_dir.rglob('*'):
            if file_path.is_file():
                arcname = file_path.relative_to(source_dir)
                tar.add(file_path, arcname=arcname)
                console.print(f"  Added: {arcname}")
    
    # Create metadata (what Tellus does automatically)
    console.print("\n[blue]📋 Creating archive metadata...[/blue]")
    
    files_info = []
    total_size = 0
    
    for file_path in source_dir.rglob('*'):
        if file_path.is_file():
            rel_path = file_path.relative_to(source_dir)
            size = file_path.stat().st_size
            total_size += size
            
            # Classify file type (simplified version of what Tellus does)
            if rel_path.name.startswith('user_nl_') or 'initial' in rel_path.name:
                content_type = 'INPUT'
                importance = 'CRITICAL'
            elif rel_path.suffix == '.nc' and 'output' in str(rel_path):
                content_type = 'OUTPUT'
                importance = 'IMPORTANT'
            elif '.r.' in rel_path.name:
                content_type = 'RESTART'
                importance = 'CRITICAL'
            elif rel_path.suffix == '.log':
                content_type = 'LOG'
                importance = 'OPTIONAL'
            elif rel_path.suffix == '.py':
                content_type = 'SCRIPT'
                importance = 'IMPORTANT'
            else:
                content_type = 'OTHER'
                importance = 'OPTIONAL'
            
            files_info.append({
                'path': str(rel_path),
                'size': size,
                'content_type': content_type,
                'importance': importance
            })
    
    # Create metadata file
    metadata = {
        'metadata_version': '1.0',
        'created_at': datetime.now().isoformat(),
        'archive': {
            'archive_id': 'cesm_tutorial_complete',
            'source_directory': str(source_dir),
            'archive_type': 'compressed'
        },
        'simulation': {
            'case': 'f2000_tutorial',
            'model': 'CESM',
            'period': '2024-01-01 to 2024-03-31'
        },
        'inventory': {
            'total_files': len(files_info),
            'total_size': total_size,
            'content_summary': {},
            'files': files_info
        }
    }
    
    # Count files by type
    content_counts = {}
    for file_info in files_info:
        content_type = file_info['content_type']
        content_counts[content_type] = content_counts.get(content_type, 0) + 1
    
    metadata['inventory']['content_summary'] = content_counts
    
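    # Save metadata file next to the archive. Strip the full '.tar.gz' suffix first;
    # with_suffix() would replace only '.gz' and leave '.tar' in the name.
    metadata_path = archive_path.with_name(
        archive_path.name.replace('.tar.gz', '') + '.metadata.json'
    )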
    metadata_path.write_text(json.dumps(metadata, indent=2))
    
    return archive_path, metadata_path

# Create the archive
archive_path = tutorial_dir / "cesm_tutorial_complete.tar.gz"
archive_file, metadata_file = create_tutorial_archive(cesm_dir, archive_path)

# Show results
archive_size = archive_file.stat().st_size / (1024 * 1024)
compression_ratio = (original_size - archive_size) / original_size * 100

console.print(f"\n[green]✅ Archive creation complete![/green]")
console.print(f"[blue]Archive file: {archive_file.name} ({archive_size:.1f} MB)[/blue]")
console.print(f"[blue]Metadata file: {metadata_file.name}[/blue]")
console.print(f"[green]Space saved: {compression_ratio:.1f}% compression[/green]")

## Step 2: Examining Your Archive

Now let's explore what's in our archive without extracting it. This is like having a table of contents for your compressed data:

In [None]:
# Load and display the archive metadata
metadata = json.loads(metadata_file.read_text())

console.print("[bold blue]📊 Archive Contents Summary[/bold blue]")
console.print("=" * 50)

# General information
console.print(f"[cyan]Archive ID:[/cyan] {metadata['archive']['archive_id']}")
console.print(f"[cyan]Created:[/cyan] {metadata['created_at'][:19]}")
console.print(f"[cyan]Total Files:[/cyan] {metadata['inventory']['total_files']}")
console.print(f"[cyan]Total Size:[/cyan] {metadata['inventory']['total_size'] / (1024*1024):.1f} MB")

# Content breakdown
console.print("\n[bold blue]📁 Files by Content Type[/bold blue]")
for content_type, count in metadata['inventory']['content_summary'].items():
    console.print(f"  [green]{content_type}:[/green] {count} files")

# Show a few example files
console.print("\n[bold blue]📄 Example Files[/bold blue]")
for file_info in metadata['inventory']['files'][:10]:  # Show first 10 files
    size_str = f"{file_info['size'] / 1024:.1f} KB" if file_info['size'] > 1024 else f"{file_info['size']} B"
    console.print(
        f"  [yellow]{file_info['path']}[/yellow] "
        f"[dim]({size_str}, {file_info['content_type']}, {file_info['importance']})[/dim]"
    )

if len(metadata['inventory']['files']) > 10:
    console.print(f"  [dim]... and {len(metadata['inventory']['files']) - 10} more files[/dim]")
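
The JSON sidecar is one way to see what an archive holds. As a complementary check, Python's standard `tarfile` module can read the member index of the archive itself without writing any files to disk (a gzip-compressed tar is still decompressed in memory while it is scanned). The sketch below is a minimal illustration, not how Tellus implements `tellus archive show`; the helper name `list_archive_members` and the stand-in files are assumptions made for the example.

```python
import tarfile
import tempfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return (name, size_in_bytes) for every regular file in a tar archive."""
    with tarfile.open(archive_path, "r:*") as tar:
        return [(m.name, m.size) for m in tar.getmembers() if m.isfile()]


# Build a tiny stand-in archive so the example runs on its own.
workdir = Path(tempfile.mkdtemp())
(workdir / "logs").mkdir()
(workdir / "logs" / "run.log").write_text("Status: COMPLETED SUCCESSFULLY\n")
(workdir / "user_nl_cam").write_text("nhtfrq = -24\n")

demo_archive = workdir / "demo.tar.gz"
with tarfile.open(demo_archive, "w:gz") as tar:
    tar.add(workdir / "logs" / "run.log", arcname="logs/run.log")
    tar.add(workdir / "user_nl_cam", arcname="user_nl_cam")

# Read the index back without extracting anything to disk.
members = list_archive_members(demo_archive)
for name, size in members:
    print(f"{name} ({size} B)")
```

Pointing `list_archive_members` at the `cesm_tutorial_complete.tar.gz` file from Step 1 should report the same paths as the metadata inventory, which makes it a quick cross-check that the sidecar and the archive agree.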

## Step 3: Listing Your Archives

In a real workflow, you'll have multiple archives. Let's create another archive and see how to list them:

In [None]:
# Create a second archive with only the critical files
console.print("[bold blue]Creating a 'Critical Files Only' archive...[/bold blue]")
console.print("[dim]This archive contains only files needed to restart the simulation[/dim]")

def create_critical_only_archive():
    """Create an archive with only critical files (configs + restart files)"""
    
    critical_archive_path = tutorial_dir / "cesm_tutorial_critical.tar.gz"
    
    with tarfile.open(critical_archive_path, "w:gz") as tar:
        for file_path in cesm_dir.rglob('*'):
            if file_path.is_file():
                rel_path = file_path.relative_to(cesm_dir)
                
                # Only include critical files
                is_critical = (
                    rel_path.name.startswith('user_nl_') or  # CESM namelists
                    'initial' in rel_path.name or           # Initial conditions
                    '.r.' in rel_path.name                  # Restart files
                )
                
                if is_critical:
                    tar.add(file_path, arcname=rel_path)
                    console.print(f"  Added critical file: {rel_path}")
    
    return critical_archive_path

critical_archive = create_critical_only_archive()
critical_size = critical_archive.stat().st_size / (1024 * 1024)

console.print(f"\n[green]✅ Critical archive created: {critical_archive.name} ({critical_size:.1f} MB)[/green]")

# Now let's "list" our archives (simulate what 'tellus archive list' would show)
console.print("\n[bold blue]📚 Your Archive Collection[/bold blue]")
console.print("=" * 60)

archives = [
    {
        'name': 'cesm_tutorial_complete',
        'file': archive_file,
        'description': 'Complete CESM simulation (all files)',
        'content_types': ['INPUT', 'OUTPUT', 'RESTART', 'LOG', 'SCRIPT']
    },
    {
        'name': 'cesm_tutorial_critical',
        'file': critical_archive,
        'description': 'Critical files only (restart capability)',
        'content_types': ['INPUT', 'RESTART']
    }
]

for i, archive in enumerate(archives, 1):
    size_mb = archive['file'].stat().st_size / (1024 * 1024)
    content_str = ', '.join(archive['content_types'])
    
    console.print(f"[cyan]{i}. {archive['name']}[/cyan]")
    console.print(f"   Size: {size_mb:.1f} MB")
    console.print(f"   Description: {archive['description']}")
    console.print(f"   Content Types: {content_str}")
    console.print()
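
The collection above is written out by hand. A rough way to build the same listing automatically is to scan a directory for `*.tar.gz` files and summarize each one. This is only a sketch of the idea: the `list_archives` helper and its output fields are hypothetical, and it does not reflect what `tellus archive list` does internally.

```python
import tarfile
import tempfile
from pathlib import Path


def list_archives(directory):
    """Return one summary dict per *.tar.gz file found directly in `directory`."""
    summaries = []
    for path in sorted(Path(directory).glob("*.tar.gz")):
        with tarfile.open(path, "r:gz") as tar:
            n_files = sum(1 for member in tar if member.isfile())
        summaries.append({
            "name": path.name[: -len(".tar.gz")],
            "size_bytes": path.stat().st_size,
            "files": n_files,
        })
    return summaries


# Two small stand-in archives so the example is self-contained.
store = Path(tempfile.mkdtemp())
payload = store / "user_nl_cam"
payload.write_text("nhtfrq = -24\n")
for name in ["cesm_tutorial_complete", "cesm_tutorial_critical"]:
    with tarfile.open(store / f"{name}.tar.gz", "w:gz") as tar:
        tar.add(payload, arcname="input/user_nl_cam")

archive_summaries = list_archives(store)
for i, info in enumerate(archive_summaries, 1):
    print(f"{i}. {info['name']}: {info['files']} file(s), {info['size_bytes']} B")
```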

## Step 4: Understanding Archive Benefits

Let's compare the space usage and organization benefits:

In [None]:
from rich.table import Table

# Create a comparison table
table = Table(title="Storage Comparison: Original vs. Archives")
table.add_column("Item", style="cyan")
table.add_column("Size (MB)", justify="right", style="green")
table.add_column("Files", justify="right", style="yellow")
table.add_column("Notes", style="dim")

# Original data
original_files = sum(1 for f in cesm_dir.rglob('*') if f.is_file())  # count files only, not directories
table.add_row(
    "Original Directory",
    f"{original_size:.1f}",
    str(original_files),
    "Uncompressed, scattered files"
)

# Complete archive
complete_size = archive_file.stat().st_size / (1024 * 1024)
table.add_row(
    "Complete Archive",
    f"{complete_size:.1f}",
    "1",
    "All files, compressed + metadata"
)

# Critical archive
critical_size = critical_archive.stat().st_size / (1024 * 1024)
table.add_row(
    "Critical Archive",
    f"{critical_size:.1f}",
    "1",
    "Essential files only"
)

console.print(table)

# Space savings
total_savings = (original_size - complete_size) / original_size * 100
critical_savings = (original_size - critical_size) / original_size * 100

console.print(f"\n[bold green]💾 Space Savings[/bold green]")
console.print(f"Complete archive: {total_savings:.1f}% compression")
console.print(f"Critical archive: {critical_savings:.1f}% space reduction")

console.print(f"\n[bold blue]🎯 Key Benefits[/bold blue]")
console.print("✅ [green]Compression:[/green] Reduced storage space")
console.print("✅ [green]Organization:[/green] Files automatically classified")
console.print("✅ [green]Metadata:[/green] Know what's inside without extracting")
console.print("✅ [green]Portability:[/green] Single file easy to transfer")
console.print("✅ [green]Selective Access:[/green] Can extract only what you need")

## Real-World Scenarios: When to Use Each Archive Type

Understanding when to create different types of archives is crucial for effective data management:

### Scenario 1: Long-term Storage
**Use Case**: Moving an old simulation to tape storage  
**Archive Type**: Complete archive  
**Why**: You want everything preserved for potential future analysis

```bash
# Complete archive for long-term storage
tellus archive create simulation_2024_complete /path/to/simulation \
  --location tape_storage
```

### Scenario 2: Continuing a Simulation
**Use Case**: Restarting the simulation on a different machine  
**Archive Type**: Critical files only  
**Why**: Smaller and faster to transfer, yet it contains everything needed to restart

```bash
# Critical files for simulation restart
tellus archive create restart_package /path/to/simulation \
  --content-types input,restart,config
```

### Scenario 3: Sharing Results
**Use Case**: A collaborator wants to analyze your results  
**Archive Type**: Output files only  
**Why**: Collaborators usually need only the output data, not the restart files

```bash
# Output data for collaborators
tellus archive create results_for_analysis /path/to/simulation \
  --content-types output --patterns "*.nc"
```

## Common Beginner Mistakes and How to Avoid Them

### ❌ Mistake 1: Archiving Everything Always
**Problem**: Creating huge archives with log files and temporary data  
**Solution**: Use content type filtering to exclude non-essential files

### ❌ Mistake 2: Not Checking Archive Contents
**Problem**: Creating an archive without verifying what's inside  
**Solution**: Always use `tellus archive show` to inspect archive metadata

### ❌ Mistake 3: Forgetting to Test Extraction
**Problem**: The archive is created, but extraction is never tested  
**Solution**: Always test extracting a few files to verify archive integrity
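
One way to run such a spot check, sketched with the standard library only: read a member back out of the archive and compare its SHA-256 digest with the original file. The helper names (`sha256_of`, `member_matches_original`) and the stand-in restart file are assumptions for this example; Tellus may offer its own verification mechanism.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(stream):
    """Hex SHA-256 digest of a binary file-like object, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        digest.update(chunk)
    return digest.hexdigest()


def member_matches_original(archive_path, member_name, original_path):
    """Read one member back out of the archive and compare it with the source file."""
    with tarfile.open(archive_path, "r:*") as tar:
        extracted = tar.extractfile(member_name)  # None for non-regular members
        if extracted is None:
            return False
        archived_digest = sha256_of(extracted)
    with open(original_path, "rb") as original:
        return archived_digest == sha256_of(original)


# Stand-in restart file and archive so the check runs on its own.
workdir = Path(tempfile.mkdtemp())
restart_file = workdir / "cam.r.2024-04-01.nc"
restart_file.write_bytes(b"stand-in restart data" * 1000)

check_archive = workdir / "check.tar.gz"
with tarfile.open(check_archive, "w:gz") as tar:
    tar.add(restart_file, arcname="restart/cam.r.2024-04-01.nc")

ok = member_matches_original(check_archive, "restart/cam.r.2024-04-01.nc", restart_file)
print("restart file intact:", ok)
```

A member name that is not in the archive raises `KeyError`, which is itself a useful signal that the archive is missing something you expected.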

### ❌ Mistake 4: Poor Archive Naming
**Problem**: Using vague names like "simulation1", "test_archive"  
**Solution**: Use descriptive names: "cesm_f2000_spinup_2024", "wrf_hurricane_katrina_outputs"

## Cleanup and Summary

In [None]:
# Clean up tutorial files
import shutil

console.print("[bold blue]🧹 Cleaning up tutorial files...[/bold blue]")
shutil.rmtree(tutorial_dir)
console.print(f"[green]✅ Cleaned up: {tutorial_dir}[/green]")

console.print("\n[bold green]🎉 Tutorial 1 Complete![/bold green]")
console.print("\n[bold blue]What You Learned:[/bold blue]")
console.print("✅ How to create climate data archives")
console.print("✅ Understanding file types and importance")
console.print("✅ Reading archive metadata without extraction")
console.print("✅ Comparing different archive strategies")
console.print("✅ When to use complete vs. selective archives")

console.print("\n[bold blue]Next Steps:[/bold blue]")
console.print("📚 [cyan]Tutorial 2:[/cyan] Content Classification and Selective Archiving")
console.print("📚 [cyan]Tutorial 3:[/cyan] DateTime-Based Extraction and Filtering")
console.print("📚 [cyan]Tutorial 4:[/cyan] Fragment Assembly for Multi-Period Simulations")

console.print("\n[dim]Ready to continue? Open Tutorial 2 to learn about smart file filtering![/dim]")

## Decision Tree: Archive Strategy Selection

Use this decision tree to choose which type of archive to create:

```
📊 What's your goal?
├── 🎯 Continue simulation later?
│   └── ✅ Create CRITICAL archive (configs + restart files)
├── 📤 Share results with collaborators?
│   └── ✅ Create OUTPUT archive (scientific data only)
├── 💾 Long-term storage/backup?
│   └── ✅ Create COMPLETE archive (everything)
├── 🚀 Moving to faster storage?
│   └── ✅ Create IMPORTANT archive (exclude logs/temp files)
└── 🔍 Not sure?
    └── ✅ Start with COMPLETE, extract selectively later
```
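
The same decision tree can be written as a small lookup table, which is handy if you script your archiving. This is a sketch only: the goal keys and the `choose_strategy` helper are hypothetical, and the content-type labels simply mirror the ones used in this tutorial rather than a confirmed Tellus enumeration.

```python
# Goal keys are made up for this example; content types follow the tutorial's labels.
ARCHIVE_STRATEGIES = {
    "continue_simulation": {
        "archive": "CRITICAL",
        "content_types": ["INPUT", "RESTART"],
    },
    "share_results": {
        "archive": "OUTPUT",
        "content_types": ["OUTPUT"],
    },
    "long_term_storage": {
        "archive": "COMPLETE",
        "content_types": ["INPUT", "OUTPUT", "RESTART", "LOG", "SCRIPT"],
    },
    "faster_storage": {
        "archive": "IMPORTANT",  # everything except logs and temporary files
        "content_types": ["INPUT", "OUTPUT", "RESTART", "SCRIPT"],
    },
}


def choose_strategy(goal):
    """Return the archive strategy for a goal; unknown goals fall back to COMPLETE."""
    return ARCHIVE_STRATEGIES.get(goal, ARCHIVE_STRATEGIES["long_term_storage"])


for goal in ["share_results", "not_sure"]:
    strategy = choose_strategy(goal)
    print(f"{goal}: {strategy['archive']} ({', '.join(strategy['content_types'])})")
```

The fallback matches the last branch of the tree: when in doubt, start with a complete archive and extract selectively later.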

## Key Commands Reference

```bash
# Create complete archive
tellus archive create my_simulation /path/to/data --location storage_location

# List all archives
tellus archive list

# Show archive details
tellus archive show my_simulation

# Create selective archive
tellus archive create critical_files /path/to/data \
  --content-types input,restart,config
```

**🎯 You're now ready to create and manage climate data archives! Continue to Tutorial 2 to learn about intelligent file classification and selective archiving strategies.**