# AstroGraphAnomaly — Colab (workflow-only)

Goal: run the workflow **without hacks**, **without modifying any files**, and without assuming the project is installed as a package.

The notebook:
- clones the repository
- installs `requirements.txt`
- auto-detects the entrypoint: `workflow.py` or `run_workflow.py`
- runs an offline pass (test CSV), then (optionally) a Gaia run


In [None]:
!git clone --depth 1 https://github.com/dalozedidier-dot/AstroGraphAnomaly.git
%cd AstroGraphAnomaly
!python -m pip install -q --upgrade pip
!python -m pip install -q -r requirements.txt


In [None]:
import sys, subprocess
from pathlib import Path

ENTRYPOINT = None
if Path('workflow.py').exists():
    ENTRYPOINT = 'workflow.py'
elif Path('run_workflow.py').exists():
    ENTRYPOINT = 'run_workflow.py'
else:
    raise FileNotFoundError('No entrypoint found: workflow.py or run_workflow.py')

print('Entrypoint detected:', ENTRYPOINT)

def run_offline_csv(out_dir='results/colab_csv', top_k=20, explain_top=5, plots=True):
    """Run the workflow on the bundled sample CSV (no network needed)."""
    # workflow.py takes the mode as a subcommand; run_workflow.py takes it via --mode.
    mode_args = ['csv'] if ENTRYPOINT == 'workflow.py' else ['--mode', 'csv']
    cmd = [sys.executable, ENTRYPOINT, *mode_args,
           '--in-csv', 'data/sample_gaia_like.csv',
           '--out', out_dir,
           '--top-k', str(top_k),
           '--explain-top', str(explain_top)]
    if plots:
        cmd.append('--plots')

    print('RUN:', ' '.join(cmd))
    subprocess.check_call(cmd)  # raises CalledProcessError on a non-zero exit code
    return Path(out_dir)

def run_gaia(out_dir='results/colab_gaia', ra=266.4051, dec=-28.936175, radius_deg=0.3, limit=800, top_k=30, explain_top=5, plots=True):
    """Run the workflow on a Gaia cone query around (ra, dec); requires network access."""
    # workflow.py takes the mode as a subcommand; run_workflow.py takes it via --mode.
    mode_args = ['gaia'] if ENTRYPOINT == 'workflow.py' else ['--mode', 'gaia']
    cmd = [sys.executable, ENTRYPOINT, *mode_args,
           '--ra', str(ra), '--dec', str(dec),
           '--radius-deg', str(radius_deg), '--limit', str(limit),
           '--out', out_dir,
           '--top-k', str(top_k),
           '--explain-top', str(explain_top)]
    if plots:
        cmd.append('--plots')

    print('RUN:', ' '.join(cmd))
    subprocess.check_call(cmd)  # raises CalledProcessError on a non-zero exit code
    return Path(out_dir)


## 1) Offline run (test CSV)

This run does not depend on the network. It checks that the pipeline runs end to end.
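
One way to make "runs end to end" concrete is to check, after the run, that the expected artifacts were written. The helper below is a minimal sketch, not part of the repository: `missing_artifacts` is a hypothetical name, and the only file name taken from this notebook is `top_anomalies.csv` (read in the inspection step).

```python
from pathlib import Path

def missing_artifacts(out_dir, expected=('top_anomalies.csv',)):
    """Return the expected file names that are absent or empty in out_dir.

    Hypothetical helper: the default list only contains the one artifact
    this notebook reads later; extend it if you rely on other outputs.
    """
    out_dir = Path(out_dir)
    missing = []
    for name in expected:
        path = out_dir / name
        # Treat a zero-byte file as missing: it usually means the run aborted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Once the run cell has finished, `missing_artifacts(out)` should return an empty list.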


In [None]:
out = run_offline_csv(out_dir='results/colab_csv', top_k=20, explain_top=5, plots=True)
print('Outputs in:', out)


## 2) Quick inspection
- top anomalies
- list of artifacts
- display of the PNGs (if `--plots`)
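
The artifact listing can also be done in pure Python, which is convenient outside Colab where the `!ls` shell magic may not be available. This is a small sketch; `list_artifacts` is a hypothetical helper name, not something the repository provides.

```python
from pathlib import Path

def list_artifacts(out_dir):
    """Return (relative_path, size_in_bytes) for every file under out_dir, sorted by path."""
    out_dir = Path(out_dir)
    return sorted(
        (p.relative_to(out_dir).as_posix(), p.stat().st_size)
        for p in out_dir.rglob('*')  # recursive, so plots/ is included when present
        if p.is_file()
    )
```

For example, `for name, size in list_artifacts(out): print(f'{size:>10}  {name}')` prints one line per output file.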


In [None]:
import pandas as pd
top = pd.read_csv(out / 'top_anomalies.csv')
top.head(10)


In [None]:
!ls -lah results/colab_csv
!ls -lah results/colab_csv/plots || true


In [None]:
from IPython.display import Image, display
plots_dir = out / 'plots'
if plots_dir.exists():
    for p in sorted(plots_dir.glob('*.png')):
        print('PLOT:', p.name)
        display(Image(filename=str(p)))
else:
    print('No plots directory found')


## 3) Gaia run (optional, network required)

If Gaia responds (quota and network permitting), this run often adds `bp_rp` and can produce a color-magnitude diagram (CMD).
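
Because the Gaia query can fail for reasons outside the notebook, it may help to keep a failed run from stopping the session and to check afterwards whether `bp_rp` actually arrived. The sketch below rests on two assumptions: `try_run` and `csv_has_column` are hypothetical helpers, and which output file carries the `bp_rp` column is not stated here, so the path is left to the caller.

```python
import csv
import subprocess

def try_run(run_fn, **kwargs):
    """Call a runner function; return its result, or None if the subprocess failed."""
    try:
        return run_fn(**kwargs)
    except subprocess.CalledProcessError as exc:
        # check_call raises this when the workflow exits with a non-zero code.
        print('Run failed with exit code', exc.returncode)
        return None

def csv_has_column(csv_path, column='bp_rp'):
    """Return True if the header row of the CSV contains the given column name."""
    with open(csv_path, newline='') as fh:
        header = next(csv.reader(fh), [])  # empty file -> empty header
    return column in header
```

With these in place, `out_gaia = try_run(run_gaia, out_dir='results/colab_gaia')` returns `None` instead of raising when the run fails, and `csv_has_column(some_output_csv)` reports whether the color column is present.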


In [None]:
# Uncomment to launch the Gaia run
# out_gaia = run_gaia(out_dir='results/colab_gaia', ra=266.4051, dec=-28.936175, radius_deg=0.3, limit=800, top_k=30, explain_top=5, plots=True)
# print('Gaia outputs in:', out_gaia)
