# SeaTurtle Re-ID: Getting Started

This notebook is a quick introduction to the SeaTurtle Re-Identification (Re-ID) project: it checks that the required libraries are installed, inspects the project layout, and runs a small plotting test.

## 🎯 Objectives
- Verify installation and setup
- Generate and plot synthetic sample data
- Understand the project structure
- Run a basic example

## 📦 Import Libraries

First, let's import the necessary libraries and verify they're working correctly.

In [None]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Computer vision
import cv2
from PIL import Image

# Deep learning
import torch
import torchvision
from torchvision import transforms

# Utilities
import os
import sys
from pathlib import Path

print("✅ All libraries imported successfully!")
print(f"📊 NumPy version: {np.__version__}")
print(f"🐼 Pandas version: {pd.__version__}")
print(f"🔥 PyTorch version: {torch.__version__}")
print(f"👁️ OpenCV version: {cv2.__version__}")

## 🗂️ Project Structure

Let's explore the project directory structure:

In [None]:
# Project root, assuming this notebook runs from a directory one level below it
project_root = Path('..').resolve()
print(f"📁 Project root: {project_root}")
print("\n📂 Directory structure:")

for item in sorted(project_root.iterdir()):
    if item.is_dir() and not item.name.startswith('.'):
        print(f"  📁 {item.name}/")
        # Show subdirectories for key folders
        if item.name in ['notebooks', 'data', 'utils']:
            for subitem in sorted(item.iterdir()):
                if subitem.is_dir():
                    print(f"    📁 {subitem.name}/")
                elif subitem.suffix in ['.py', '.md']:
                    print(f"    📄 {subitem.name}")
    elif item.is_file() and item.suffix in ['.md', '.txt', '.yml', '.py']:
        print(f"  📄 {item.name}")

## 🔧 Environment Check

Let's verify that our environment is set up correctly:

In [None]:
# Check if CUDA is available
print(f"🖥️ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"🎮 GPU device: {torch.cuda.get_device_name(0)}")
    print(f"💾 GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("⚠️ Running on CPU - consider using GPU for deep learning experiments")

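# Check available RAM (psutil is a third-party package and may need installing)
try:
    import psutil
    memory = psutil.virtual_memory()
    print(f"🧠 Available RAM: {memory.available / 1e9:.1f} GB / {memory.total / 1e9:.1f} GB")
except ImportError:
    print("⚠️ psutil not installed - skipping RAM check")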

## 🎨 Sample Visualization

Let's create a sample visualization to test our plotting capabilities:

In [None]:
# Create a sample plot
plt.figure(figsize=(12, 4))

# Synthetic (randomly generated) curves standing in for re-identification accuracy over epochs
rng = np.random.default_rng(42)  # fixed seed so the plot is reproducible
epochs = np.arange(1, 51)
train_acc = 0.5 + 0.4 * (1 - np.exp(-epochs/10)) + 0.02 * rng.standard_normal(50)
val_acc = 0.5 + 0.35 * (1 - np.exp(-epochs/12)) + 0.03 * rng.standard_normal(50)

plt.subplot(1, 2, 1)
plt.plot(epochs, train_acc, label='Training Accuracy', linewidth=2)
plt.plot(epochs, val_acc, label='Validation Accuracy', linewidth=2)
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Re-ID Model Training Progress')
plt.legend()
plt.grid(True, alpha=0.3)

# Sample confusion matrix (random values, not model output; rows normalized to sum to 1)
plt.subplot(1, 2, 2)
confusion_matrix = np.random.rand(5, 5)
confusion_matrix = confusion_matrix / confusion_matrix.sum(axis=1, keepdims=True)
sns.heatmap(confusion_matrix, annot=True, fmt='.2f', cmap='Blues',
            xticklabels=[f'ID_{i}' for i in range(5)],
            yticklabels=[f'ID_{i}' for i in range(5)])
plt.title('Sample Identity Confusion Matrix')
plt.ylabel('True Identity')
plt.xlabel('Predicted Identity')

plt.tight_layout()
plt.show()

print("✅ Visualization test successful!")

## 🚀 Next Steps

Now that your environment is set up, you can:

1. **📊 Explore Data**: Start with notebooks in `01_data_exploration/`
2. **🔧 Preprocess**: Use `02_preprocessing/` notebooks for data preparation
3. **🏗️ Build Models**: Implement baselines in `03_baseline_models/`
4. **🧠 Advanced Methods**: Experiment in `04_advanced_models/`
5. **📈 Evaluate**: Compare results in `05_evaluation/`

### 📋 Checklist for Starting Your Project:

- [ ] Add your sea turtle dataset to `data/raw/`
- [ ] Create data exploration notebook
- [ ] Implement data preprocessing pipeline
- [ ] Set up baseline re-identification model
- [ ] Define evaluation metrics
- [ ] Document your experiments
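
For the "Define evaluation metrics" item, one possible starting point is rank-1 accuracy, a metric commonly reported for re-identification: the fraction of query images whose most similar gallery image shows the same individual. The sketch below is an illustration only. It assumes embeddings are already available as NumPy arrays and uses cosine similarity; the function name `rank1_accuracy` and the toy data are hypothetical and not part of this project.

```python
import numpy as np

def rank1_accuracy(query_emb, query_ids, gallery_emb, gallery_ids):
    """Fraction of queries whose most similar gallery embedding has the same identity.

    Hypothetical helper for illustration; assumes non-zero embedding vectors.
    """
    # L2-normalize so that the dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    similarity = q @ g.T                     # shape: (n_query, n_gallery)
    nearest = similarity.argmax(axis=1)      # best gallery match for each query
    predicted_ids = np.asarray(gallery_ids)[nearest]
    return float(np.mean(predicted_ids == np.asarray(query_ids)))

# Toy data: three gallery identities, three queries (made up for this example)
gallery_emb = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
gallery_ids = [1, 2, 3]
query_emb = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]])
query_ids = [1, 2, 3]

# The third query is closest to identity 1, so 2 of the 3 queries are matched correctly
print(f"Rank-1 accuracy: {rank1_accuracy(query_emb, query_ids, gallery_emb, gallery_ids):.2f}")
```

A real evaluation would also need a decision on how to split query and gallery images per individual, which this sketch leaves open.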

### 💡 Tips:

- Use descriptive notebook names with numerical prefixes
- Document your methodology in markdown cells
- Save intermediate results and model checkpoints
- Use version control for your notebooks
- Create utility functions in the `utils/` directory
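
As an example of the kind of helper that could live in `utils/`, the sketch below collects image files from `data/raw/`. The function name `list_images`, the set of file extensions, and the assumption that the notebook runs one level below the project root are all illustrative choices, not part of the existing project.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def list_images(root, extensions=IMAGE_EXTENSIONS):
    """Return a sorted list of image paths found anywhere under `root`.

    Hypothetical helper; returns an empty list if `root` does not exist.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )

# Assumes the notebook is run from a directory one level below the project root
images = list_images(Path("..") / "data" / "raw")
print(f"Found {len(images)} image(s) in data/raw/")
```

Returning an empty list for a missing folder keeps the cell runnable before any dataset has been added.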

Happy experimenting! 🐢🔬