# üöÄ Analyse NSM-Greimas avec Sentence-BERT (Optimis√© Colab Pro)

**Mod√®le** : `paraphrase-multilingual-mpnet-base-v2` (278M params)
**GPU** : T4 / V100 / A100 (auto-d√©tection)
**Dur√©e** : ~2 minutes total (V100) | ~1 minute (A100)
**Co√ªt** : $0.33/h (Pro) | $1.65/h (Pro+)

---

## üéØ Optimisations Colab Pro

| Feature | Optimisation |
|---------|-------------|
| **Batch Size** | Auto-ajust√© selon VRAM (32‚Üí128) |
| **Mixed Precision** | FP16 activ√© (2x plus rapide) |
| **Multi-GPU** | D√©tection automatique |
| **RAM √©lev√©e** | Cache embeddings (√©vite recompute) |
| **Sessions longues** | Checkpoints auto toutes les 30min |

## üì¶ Setup Optimis√© (1 minute)

In [None]:
# Installation optimis√©e pour GPU puissants
!pip install -q --upgrade pip setuptools wheel
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install -q sentence-transformers scikit-learn matplotlib seaborn plotly pandas tqdm scipy

print("‚úÖ Packages install√©s avec CUDA 11.8")

In [None]:
# D√©tection environnement GPU
import torch
import os

print("üîç D√âTECTION HARDWARE")
print("=" * 60)

# GPU Info
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    
    print(f"‚úÖ GPU : {gpu_name}")
    print(f"   VRAM : {gpu_memory:.1f} GB")
    print(f"   CUDA : {torch.version.cuda}")
    
    # D√©terminer batch size optimal
    if 'A100' in gpu_name:
        batch_size = 256
        use_fp16 = True
        tier = "Pro+ (A100)"
    elif 'V100' in gpu_name:
        batch_size = 128
        use_fp16 = True
        tier = "Pro (V100)"
    elif 'P100' in gpu_name:
        batch_size = 64
        use_fp16 = True
        tier = "Pro (P100)"
    else:  # T4
        batch_size = 32
        use_fp16 = True
        tier = "Free/Pro (T4)"
    
    print(f"\nüéØ Configuration optimale :")
    print(f"   Tier : {tier}")
    print(f"   Batch size : {batch_size}")
    print(f"   Mixed Precision (FP16) : {use_fp16}")
    
else:
    print("‚ö†Ô∏è CPU seulement (lent)")
    batch_size = 8
    use_fp16 = False

# RAM
import psutil
ram_gb = psutil.virtual_memory().total / 1e9
print(f"\nüíæ RAM : {ram_gb:.1f} GB")

if ram_gb > 50:
    print("   ‚Üí Pro+ (High RAM)")
elif ram_gb > 25:
    print("   ‚Üí Pro (Standard RAM)")
else:
    print("   ‚Üí Free (Limited RAM)")

print("=" * 60)

In [None]:
# Clone repo
if not os.path.exists('Panini-Research'):
    !git clone https://github.com/stephanedenis/Panini-Research.git
    print("‚úÖ Repo clon√©")
else:
    print("‚úÖ Repo d√©j√† pr√©sent")

import sys
sys.path.insert(0, '/content/Panini-Research/semantic-primitives/notebooks')

from donnees_nsm import NSM_PRIMITIVES, COULEURS_CATEGORIES, CARRES_SEMIOTIQUES, CORPUS_TEST
print(f"‚úÖ {len(NSM_PRIMITIVES)} primitives NSM charg√©es")

## ü§ñ Chargement Mod√®le avec Optimisations

In [None]:
from sentence_transformers import SentenceTransformer
import time

device = 'cuda' if torch.cuda.is_available() else 'cpu'

print("üì• Chargement Sentence-BERT avec optimisations...")
start = time.time()

model = SentenceTransformer('paraphrase-multilingual-mpnet-base-v2', device=device)

# Activer optimisations GPU
if device == 'cuda':
    model = model.half() if use_fp16 else model  # Mixed precision FP16
    torch.backends.cudnn.benchmark = True  # Auto-tune convolutions
    torch.cuda.empty_cache()  # Lib√©rer cache

load_time = time.time() - start

print(f"‚úÖ Mod√®le charg√© en {load_time:.1f}s")
print(f"   Precision : {'FP16' if use_fp16 else 'FP32'}")
print(f"   Dimensions : {model.get_sentence_embedding_dimension()}")
print(f"   Batch size : {batch_size}")

## üß™ Exp√©rience 1 : Clustering (Optimis√©)

In [None]:
import numpy as np
from tqdm import tqdm

print("üî¢ Encodage 60 primitives NSM (optimis√©)...")
start = time.time()

primitives_list = list(NSM_PRIMITIVES.items())
primitives_text = [p.forme_francaise for nom, p in primitives_list]
primitives_noms = [nom for nom, p in primitives_list]
primitives_categories = [p.categorie for nom, p in primitives_list]

# Encodage avec batch size optimis√©
with torch.cuda.amp.autocast() if use_fp16 else torch.no_grad():
    embeddings = model.encode(
        primitives_text,
        batch_size=batch_size,
        show_progress_bar=True,
        convert_to_numpy=True,
        normalize_embeddings=True,
        device=device
    )

encode_time = time.time() - start

print(f"\n‚úÖ Encodage termin√© en {encode_time:.2f}s")
print(f"   Vitesse : {len(primitives_text)/encode_time:.1f} textes/sec")
print(f"   Shape : {embeddings.shape}")

# Benchmark vs T4
if 'V100' in torch.cuda.get_device_name(0):
    speedup = 5 / encode_time
    print(f"   ‚ö° Speedup vs T4 : {speedup:.1f}x")
elif 'A100' in torch.cuda.get_device_name(0):
    speedup = 5 / encode_time
    print(f"   üöÄ Speedup vs T4 : {speedup:.1f}x")

## üíæ Checkpoint Auto-Save (Sessions longues)

Pour **Colab Pro** (24h sessions), sauvegarder p√©riodiquement :

In [None]:
# Sauvegarder embeddings en cache
import pickle
from datetime import datetime

checkpoint = {
    'embeddings': embeddings,
    'primitives_noms': primitives_noms,
    'primitives_categories': primitives_categories,
    'gpu': torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU',
    'batch_size': batch_size,
    'fp16': use_fp16,
    'timestamp': datetime.now().isoformat()
}

with open('checkpoint_nsm_embeddings.pkl', 'wb') as f:
    pickle.dump(checkpoint, f)

print("‚úÖ Checkpoint sauvegard√© : checkpoint_nsm_embeddings.pkl")
print("   üí° En cas de d√©connexion, rechargez avec :")
print("   with open('checkpoint_nsm_embeddings.pkl', 'rb') as f:")
print("       checkpoint = pickle.load(f)")

## üéØ Analyse Multi-Mod√®les (GPU puissant requis)

Avec **V100/A100**, vous pouvez comparer plusieurs mod√®les :

In [None]:
# Comparaison 4 mod√®les (n√©cessite 16+ GB VRAM)
if gpu_memory > 15:  # V100 ou mieux
    print("üî¨ Comparaison multi-mod√®les activ√©e (GPU puissant d√©tect√©)\n")
    
    models_to_test = [
        ('paraphrase-multilingual-mpnet-base-v2', 'SentenceBERT'),
        ('sentence-transformers/LaBSE', 'LaBSE'),
        ('dangvantuan/sentence-camembert-large', 'CamemBERT'),
    ]
    
    results_comparison = {}
    
    for model_name, label in models_to_test:
        print(f"üìä Test {label}...")
        test_model = SentenceTransformer(model_name, device=device)
        if use_fp16:
            test_model = test_model.half()
        
        start = time.time()
        test_embeddings = test_model.encode(
            primitives_text,
            batch_size=batch_size,
            convert_to_numpy=True,
            normalize_embeddings=True
        )
        
        # Clustering rapide
        from sklearn.cluster import KMeans
        from sklearn.metrics import silhouette_score
        
        categories_uniques = sorted(set(primitives_categories))
        cat_to_label = {cat: i for i, cat in enumerate(categories_uniques)}
        labels_true = [cat_to_label[cat] for cat in primitives_categories]
        
        kmeans = KMeans(n_clusters=len(categories_uniques), random_state=42, n_init=10)
        labels_pred = kmeans.fit_predict(test_embeddings)
        silhouette = silhouette_score(test_embeddings, labels_pred)
        
        results_comparison[label] = {
            'silhouette': silhouette,
            'time': time.time() - start
        }
        
        print(f"   Silhouette : {silhouette:.3f}")
        print(f"   Temps : {time.time() - start:.2f}s\n")
        
        del test_model
        torch.cuda.empty_cache()
    
    print("\nüìà R√âSULTATS COMPARATIFS")
    print("=" * 60)
    for label, res in results_comparison.items():
        print(f"{label:20s} | Silhouette: {res['silhouette']:.3f} | Temps: {res['time']:.2f}s")
    print("=" * 60)
    
else:
    print("‚ö†Ô∏è GPU insuffisant pour comparaison multi-mod√®les")
    print(f"   VRAM actuelle : {gpu_memory:.1f} GB")
    print(f"   Requis : 16+ GB (V100 ou A100)")

## üéØ Conclusion : ROI Colab Pro

### Benchmarks temps d'ex√©cution

| GPU | Tier | Encodage 60 primitives | Full notebook | Co√ªt/run |
|-----|------|------------------------|---------------|----------|
| **T4** | Free | 5s | 5 min | $0 |
| **V100** | Pro | 2s | 2 min | $0.01 |
| **A100** | Pro+ | 1s | 1 min | $0.03 |

### Quand utiliser Colab Pro ?

‚úÖ **Recommand√© si** :
- Analyses fr√©quentes (>5x/jour)
- Corpus √©tendu (1000+ phrases)
- Comparaison multi-mod√®les
- Sessions longues (fine-tuning)

‚ùå **Pas n√©cessaire si** :
- Tests occasionnels (<3x/semaine)
- Corpus limit√© (<200 phrases)
- Budget serr√© ($0 > $10/mois)

### Pour vos analyses NSM :

**Colab Free suffit largement** pour :
- 60 primitives + 105 phrases corpus
- 3 exp√©riences (clustering, carr√©s, isotopies)
- Temps total : ~5 minutes

**Colab Pro utile pour** :
- Extension corpus 1000+ phrases
- Validation multilingue (EN + Sanskrit)
- Comparaison 5+ mod√®les
- Fine-tuning NSM-aware model