# 🌌 SSZ Full Pipeline - Google Colab

**Segmented Spacetime Mass Projection - Complete Analysis Pipeline**

© 2025 Carmen Wrede, Lino Casu  
Licensed under the ANTI-CAPITALIST SOFTWARE LICENSE v1.4

---

## 📋 What does this notebook do?

- ✅ Automatically installs all dependencies
- ✅ Clones the GitHub repository (with smart Git LFS support)
- ✅ Runs the complete SSZ pipeline
- ✅ Generates all reports and plots
- ✅ Optional: Segment-Redshift Add-on
- ✅ Downloadable results

**⏱️ Runtime:** 
- **Small files only:** ~5-10 minutes (recommended)
- **With Git LFS:** ~20-30 minutes (+15 min for 3.6 GB download)

---

## 🚀 Quick Start

**Simply run all cells in sequence:**
- `Runtime` → `Run all` (Ctrl+F9)
- Or individually: ▶️ button for each cell

---

## 💡 Two Modes Available

### **Mode 1: Small Files Only (Recommended)**
- Clone time: ~2 minutes
- Download size: ~36 MB
- Tests: Work immediately with v1/nightly datasets
- **Set:** `USE_GIT_LFS = False` in configuration

### **Mode 2: With Large Files (Complete)**
- Clone time: ~15-20 minutes
- Download size: ~3.6 GB
- Tests: All datasets including real-data
- **Set:** `USE_GIT_LFS = True` in configuration

---

## 📚 Documentation

For more details, see:
- **GOOGLE_COLAB_SETUP.md** - Complete setup guide
- **README_CLONE_TEST.md** - Clone and test instructions
- **GIT_HYBRID_STRATEGY.md** - Technical details

## ⚙️ Configuration

### Repository Settings
REPO_URL = "https://github.com/error-wtf/Segmented-Spacetime-Mass-Projection-Unified-Results"
REPO_NAME = "Segmented-Spacetime-Mass-Projection-Unified-Results"

### Git LFS Settings (for large files)
USE_GIT_LFS = False  # Set to True to download large files (~3.6 GB, +15 min)
                     # False: Only small files (~36 MB, tests work immediately)

### Pipeline Settings
ENABLE_EXTENDED_METRICS = True   # Extended plots and statistics
ENABLE_SEGMENT_REDSHIFT = True   # Gravitational redshift analysis

print("✅ Configuration loaded")
print(f"📦 Repository: {REPO_NAME}")
print(f"⚡ Git LFS: {'Enabled (large files)' if USE_GIT_LFS else 'Disabled (small files only)'}")
print(f"📊 Extended Metrics: {ENABLE_EXTENDED_METRICS}")
print(f"🌌 Segment Redshift: {ENABLE_SEGMENT_REDSHIFT}")
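
Before switching `USE_GIT_LFS` to `True`, it can help to check that the runtime has room for the ~3.6 GB download. The following is a minimal sketch, not part of the repository: `has_room_for_lfs` is a hypothetical helper, and the `safety_factor` of 2 is an assumption (Git LFS keeps downloaded objects under `.git/lfs` in addition to the checked-out files).

```python
import shutil


def has_room_for_lfs(path=".", required_gb=3.6, safety_factor=2.0):
    """Return True if the filesystem holding `path` has enough free space.

    required_gb   : download size quoted in this notebook (~3.6 GB)
    safety_factor : assumed headroom, not a figure from the repository
    """
    free_bytes = shutil.disk_usage(path).free
    needed_bytes = required_gb * safety_factor * 1024**3
    return free_bytes >= needed_bytes


# Example: warn before choosing the complete mode on a nearly full disk.
if not has_room_for_lfs():
    print("⚠️  Less than ~7.2 GB free; consider keeping USE_GIT_LFS = False")
```

If the check fails, the small-files mode still lets the v1/nightly tests run.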

In [None]:
%%capture install_output
# Installation (output is captured to keep the terminal clean)

# Core scientific + astronomy
!pip install -q numpy scipy pandas matplotlib astropy astroquery

# Testing framework
!pip install -q pytest pytest-timeout

# Data formats
!pip install -q pyarrow pyyaml

# Utils
!pip install -q requests tqdm colorama

print("✅ Dependencies installed!")

# Show summary (captured as well; run install_output.show() in a later cell to display it)
print("📦 Installed packages:")
!pip list | grep -E "numpy|scipy|pandas|matplotlib|astropy|astroquery|pytest"

In [None]:
%%capture install_output
# Installation (output is captured to keep the terminal clean)

!pip install -q numpy scipy pandas matplotlib astropy requests tqdm

print("✅ Dependencies installed!")

In [None]:
# Show summary
print("📦 Installed packages:")
!pip list | grep -E "numpy|scipy|pandas|matplotlib|astropy"

## 📥 2. Clone the repository

In [None]:
import os
from pathlib import Path

print("="*80)
print("📥 REPOSITORY SETUP")
print("="*80)

# Install Git LFS if requested
if USE_GIT_LFS:
    print("\n📦 Installing Git LFS...")
    !apt-get install -y git-lfs > /dev/null 2>&1
    !git lfs install
    print("✅ Git LFS installed")

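# If a previous run of this cell already changed into the repository,
# step back out so the check below does not clone a nested second copy.
if Path.cwd().name == REPO_NAME:
    os.chdir("..")

# Check if repository already exists
if Path(REPO_NAME).exists():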
    print(f"\n⚠️  Repository already exists: {REPO_NAME}")
    print("🔄 Pulling latest changes...")
    !cd {REPO_NAME} && git pull
    
    # Pull LFS files if enabled
    if USE_GIT_LFS:
        print("⬇️  Updating LFS files...")
        !cd {REPO_NAME} && git lfs pull
else:
    # Clone repository
    print(f"\n📥 Cloning repository...")
    print(f"   URL: {REPO_URL}")
    print(f"   Strategy: {'Git LFS (large files)' if USE_GIT_LFS else 'Small files only'}")
    
    !git clone --depth 1 {REPO_URL} {REPO_NAME}
    
    # Pull large files if LFS is enabled
    if USE_GIT_LFS:
        print("\n⬇️  Downloading large files (~3.6 GB, this may take 10-15 minutes)...")
        !cd {REPO_NAME} && git lfs pull
        print("✅ Large files downloaded")
    else:
        print("\n⚡ Using small files only (~36 MB)")
        print("   Tests with v1/nightly datasets will work immediately!")
        print("   💡 To get large files later, set USE_GIT_LFS=True and re-run")

# Change to repository directory
os.chdir(REPO_NAME)
print(f"\n✅ Repository ready!")
print(f"📂 Working Directory: {os.getcwd()}")

# Show what's available
print("\n" + "="*80)
print("📄 AVAILABLE FILES")
print("="*80)

# Check small test file
small_test = Path("models/cosmology/2025-10-17_gaia_ssz_v1/ssz_field.parquet")
if small_test.exists():
    size_mb = small_test.stat().st_size / (1024 * 1024)
    print(f"✅ Small files: {size_mb:.2f} MB (v1/nightly datasets)")
else:
    print("❌ Small files missing!")

# Check large test file
large_test = Path("models/cosmology/2025-10-17_gaia_ssz_real/ssz_field.parquet")
if large_test.exists():
    size_mb = large_test.stat().st_size / (1024 * 1024)
    if size_mb > 100:
        print(f"✅ Large files: {size_mb:.2f} MB (real-data complete)")
    else:
        print(f"⚡ Large files: {size_mb*1024:.2f} KB (LFS pointers only)")
        print("   Run 'git lfs pull' to download actual data")
else:
    print("❌ Large files missing!")

print("="*80)
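
The cell above uses file size (under 100 MB) to guess whether `ssz_field.parquet` is only a Git LFS pointer. A more direct check is to read the start of the file: LFS pointer files are small text files whose first line begins with `version https://git-lfs.github.com/spec/v1`. A minimal sketch, assuming that format; `is_lfs_pointer` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path

# First line of every Git LFS pointer file (pointer spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like a Git LFS pointer rather than real data."""
    path = Path(path)
    if not path.is_file():
        return False
    with path.open("rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Example with the large test file referenced in the cell above.
candidate = Path("models/cosmology/2025-10-17_gaia_ssz_real/ssz_field.parquet")
if candidate.exists():
    kind = "LFS pointer only" if is_lfs_pointer(candidate) else "real data"
    print(f"{candidate.name}: {kind}")
```

Unlike the size threshold, this check does not depend on how large the real dataset happens to be.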

## 🔍 3. Check repository structure

In [None]:
# Check important files
required_files = [
    "run_all_ssz_terminal.py",
    "data/real_data_full.csv",
    "scripts/addons/segment_redshift_addon.py"
]

print("🔍 Checking repository structure...\n")
all_ok = True
for file in required_files:
    exists = Path(file).exists()
    icon = "✅" if exists else "❌"
    print(f"{icon} {file}")
    if not exists:
        all_ok = False

if all_ok:
    print("\n✅ All required files are present!")
else:
    print("\n⚠️  Some files are missing - the pipeline may run with limitations.")

## 🌍 4. Set environment variables

In [None]:
# UTF-8 encoding for Windows compatibility
os.environ['PYTHONIOENCODING'] = 'utf-8:replace'
os.environ['LANG'] = 'en_US.UTF-8'

# Pipeline Features
if ENABLE_EXTENDED_METRICS:
    os.environ['SSZ_EXTENDED_METRICS'] = '1'
    print("✅ Extended Metrics enabled")
else:
    os.environ['SSZ_EXTENDED_METRICS'] = '0'
    print("⏭️  Extended Metrics disabled")

if ENABLE_SEGMENT_REDSHIFT:
    os.environ['SSZ_SEGMENT_REDSHIFT'] = '1'
    print("✅ Segment-Redshift add-on enabled")
else:
    os.environ['SSZ_SEGMENT_REDSHIFT'] = '0'
    print("⏭️  Segment-Redshift add-on disabled")

print("\n🌍 Environment configured!")
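
The flags above are passed to the pipeline as the strings `'1'` and `'0'`. How `run_all_ssz_terminal.py` actually reads them is not shown in this notebook; the sketch below only illustrates one common way a script could interpret such variables. `flag_enabled` is a hypothetical helper.

```python
import os


def flag_enabled(name, default=False, environ=None):
    """Interpret an environment variable such as SSZ_EXTENDED_METRICS as a boolean.

    `environ` can be any mapping; it defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Example: inspect the two flags this notebook sets.
for flag in ("SSZ_EXTENDED_METRICS", "SSZ_SEGMENT_REDSHIFT"):
    print(f"{flag}: {flag_enabled(flag)}")
```

Accepting an explicit mapping keeps the helper easy to check without touching the real environment.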

## 🚀 5. Run the full pipeline

**This is the main run; it takes ~5-10 minutes!**

The pipeline runs:
1. Root-level tests (6 physics tests)
2. SegWave tests (20 tests)
3. Scripts tests (15 tests)
4. Cosmos tests (1 test)
5. SSZ analysis (G79, Cygnus X)
6. Extended Metrics (if enabled)
7. Segment-Redshift add-on (if enabled)
8. Plot overview

In [None]:
import time
from datetime import datetime

print("="*80)
print("🚀 SSZ FULL PIPELINE START")
print("="*80)
print(f"⏰ Start: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")

start_time = time.time()

# Run the pipeline
!python run_all_ssz_terminal.py

elapsed = time.time() - start_time
minutes = int(elapsed // 60)
seconds = int(elapsed % 60)

print("\n" + "="*80)
print("✅ PIPELINE FINISHED")
print("="*80)
print(f"⏱️  Runtime: {minutes} min {seconds} sec")
print(f"⏰ End: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
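
The cell above prints the completion banner regardless of whether `run_all_ssz_terminal.py` succeeded, because the exit status of the `!python` call is not inspected. One possible alternative, sketched here under the assumption that the script signals failure through a non-zero exit code, is to launch it with `subprocess` and check the return code. `run_and_time` and `format_elapsed` are hypothetical helpers.

```python
import subprocess
import sys
import time


def run_and_time(cmd):
    """Run `cmd`, stream its output line by line, return (exit_code, seconds)."""
    start = time.time()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        encoding="utf-8",
        errors="replace",
    )
    for line in proc.stdout:
        print(line, end="")
    returncode = proc.wait()
    return returncode, time.time() - start


def format_elapsed(seconds):
    """Format a duration the same way as the cell above."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} sec"


# Usage (commented out so the pipeline is not started a second time):
# code, elapsed = run_and_time([sys.executable, "run_all_ssz_terminal.py"])
# status = "✅ finished" if code == 0 else f"❌ failed (exit code {code})"
# print(f"{status} after {format_elapsed(elapsed)}")
```

The child process inherits `os.environ`, so the flags set in step 4 are still visible to the script.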

## 📊 6. Check results

In [None]:
from pathlib import Path

print("📊 Generated reports:\n")

# Reports
report_files = [
    "reports/full-output.md",
    "reports/summary-output.md",
    "reports/RUN_SUMMARY.md",
    "reports/segment_redshift.csv",
    "reports/segment_redshift.md"
]

for file in report_files:
    if Path(file).exists():
        size = Path(file).stat().st_size / 1024  # KB
        print(f"✅ {file:<45} ({size:.1f} KB)")
    else:
        print(f"⏭️  {file:<45} (not generated)")

# Count plots
print("\n📈 Generated plots:\n")
plot_dirs = ["reports/figures", "out", "agent_out/figures", "vfall_out"]

total_plots = 0
for plot_dir in plot_dirs:
    if Path(plot_dir).exists():
        png_files = list(Path(plot_dir).rglob("*.png"))
        svg_files = list(Path(plot_dir).rglob("*.svg"))
        count = len(png_files) + len(svg_files)
        total_plots += count
        if count > 0:
            print(f"  {plot_dir:<30} {count} plots")

print(f"\n📊 Total: {total_plots} plot files")
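
The plot count above can also be broken down by file type. The following sketch walks the same directories once and tallies files per suffix; `count_plots` is a hypothetical helper. Note one deliberate difference from the cell above: suffixes are compared case-insensitively, so `.PNG` is counted too.

```python
from collections import Counter
from pathlib import Path


def count_plots(directories, suffixes=(".png", ".svg")):
    """Count plot files per suffix below each existing directory."""
    counts = Counter()
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # directory was not produced by this run
        for path in root.rglob("*"):
            suffix = path.suffix.lower()
            if path.is_file() and suffix in suffixes:
                counts[suffix] += 1
    return counts


# Example with the directories scanned in the cell above.
by_type = count_plots(["reports/figures", "out", "agent_out/figures", "vfall_out"])
for suffix, n in sorted(by_type.items()):
    print(f"  {suffix:<6} {n} files")
```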

## 📄 7. Show summary

In [None]:
# Display RUN_SUMMARY.md
summary_file = Path("reports/RUN_SUMMARY.md")
if summary_file.exists():
    print("="*80)
    print("📄 RUN SUMMARY")
    print("="*80)
    print(summary_file.read_text(encoding='utf-8'))
else:
    print("⚠️  RUN_SUMMARY.md not found")

# Segment-redshift result
seg_file = Path("reports/segment_redshift.md")
if seg_file.exists():
    print("\n" + "="*80)
    print("🌌 SEGMENT REDSHIFT RESULT")
    print("="*80)
    print(seg_file.read_text(encoding='utf-8'))
else:
    print("\n⏭️  Segment-Redshift was not run")

## 🖼️ 8. Show example plots

In [None]:
from IPython.display import Image, display
import matplotlib.pyplot as plt
from PIL import Image as PILImage

# Look for interesting plots
example_plots = [
    "reports/figures/fig_shared_segment_redshift_profile.png",
    "out/phi_step_residual_hist.png",
    "reports/figures/DemoObject/fig_DemoObject_ringchain_v_vs_k.png"
]

print("🖼️  Example plots:\n")

for plot_path in example_plots:
    if Path(plot_path).exists():
        print(f"\n{'='*60}")
        print(f"📊 {plot_path}")
        print('='*60)
        
        # Show the image
        img = PILImage.open(plot_path)
        plt.figure(figsize=(10, 6))
        plt.imshow(img)
        plt.axis('off')
        plt.tight_layout()
        plt.show()
    else:
        print(f"⏭️  {plot_path} not found")

print("\n✅ More plots can be found in the reports/figures/ directories")

## 💾 9. Download results

In [None]:
import shutil
from datetime import datetime

# Create ZIP archive
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
zip_name = f"SSZ_Results_{timestamp}"

print(f"📦 Creating ZIP archive: {zip_name}.zip\n")

# Directories to pack
dirs_to_zip = ["reports", "out", "agent_out"]

# Temporary staging directory for the archive
temp_dir = Path("/tmp") / zip_name
temp_dir.mkdir(exist_ok=True)

# Copy results
for dir_name in dirs_to_zip:
    src = Path(dir_name)
    if src.exists():
        dst = temp_dir / dir_name
        shutil.copytree(src, dst, dirs_exist_ok=True)
        print(f"✅ {dir_name} copied")

# Build the ZIP
shutil.make_archive(str(temp_dir), 'zip', temp_dir)
zip_path = f"{temp_dir}.zip"

size_mb = Path(zip_path).stat().st_size / (1024 * 1024)
print(f"\n✅ ZIP archive created: {zip_path}")
print(f"📊 Size: {size_mb:.2f} MB")

# Download link (in Colab)
try:
    from google.colab import files
    print("\n⬇️  Starting download...")
    files.download(zip_path)
    print("✅ Download started!")
except ImportError:
    print(f"\n💡 Download manually from: {zip_path}")
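
Before downloading, it can be useful to confirm that the archive is readable and contains what you expect. A minimal sketch using the standard `zipfile` module; `summarize_archive` is a hypothetical helper and is not part of the pipeline.

```python
import zipfile


def summarize_archive(zip_path):
    """Return file count, total uncompressed size, and the first corrupt member (or None)."""
    with zipfile.ZipFile(zip_path) as zf:
        first_bad = zf.testzip()  # reads every member and checks its CRC
        members = [info for info in zf.infolist() if not info.is_dir()]
        return {
            "files": len(members),
            "uncompressed_bytes": sum(info.file_size for info in members),
            "first_bad_file": first_bad,
        }


# Usage, with `zip_path` from the cell above:
# info = summarize_archive(zip_path)
# print(f"{info['files']} files, {info['uncompressed_bytes'] / 1024**2:.2f} MB uncompressed")
```

An archive with zero files usually means none of the result directories existed when the ZIP was built.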

## 🧹 10. Cleanup (Optional)

In [None]:
# Optional: delete caches and temporary files
import shutil

print("🧹 Cleanup...\n")

cache_dirs = [
    "__pycache__",
    ".pytest_cache",
    "scripts/__pycache__",
    "tests/__pycache__"
]

for cache_dir in cache_dirs:
    if Path(cache_dir).exists():
        shutil.rmtree(cache_dir)
        print(f"✅ {cache_dir} deleted")

print("\n✅ Cleanup finished!")
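
The cleanup cell lists four cache paths explicitly, so caches in other subdirectories are left behind. A recursive variant is sketched below; `remove_cache_dirs` is a hypothetical helper, and the directory names are the two that appear in the list above.

```python
import shutil
from pathlib import Path


def remove_cache_dirs(root=".", names=("__pycache__", ".pytest_cache")):
    """Delete every directory below `root` whose name is in `names`."""
    # Collect the targets first so the tree is not modified while it is walked.
    targets = [p for p in Path(root).rglob("*") if p.is_dir() and p.name in names]
    removed = []
    for path in targets:
        if path.exists():  # may already be gone if a parent cache dir was removed
            shutil.rmtree(path)
            removed.append(path)
    return removed


# Usage from the repository root:
# for path in remove_cache_dirs("."):
#     print(f"✅ {path} deleted")
```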

---

## 📚 Further information

### 🔗 Links
- **GitHub:** https://github.com/error-wtf/Segmented-Spacetime-Mass-Projection-Unified-Results
- **License:** ANTI-CAPITALIST SOFTWARE LICENSE v1.4

### 📖 Documentation
- `README.md` - Project overview
- `papers/` - Scientific papers
- `reports/` - Generated analyses

### 🎯 Pipeline features
- **35 physics tests** with detailed interpretations
- **23 technical tests** (silent mode)
- **Extended Metrics** - Additional plots and statistics
- **Segment-Redshift add-on** - Gravitational redshift

### ⚙️ Adjust the configuration
Go back to the **configuration cell** (above) and change:
```python
ENABLE_EXTENDED_METRICS = True   # or False
ENABLE_SEGMENT_REDSHIFT = True   # or False
```

---

© 2025 Carmen Wrede, Lino Casu
