# Python Project: Long/Short Momentum (Sleeve A)

This notebook **tells the story** of the project (context → data → strategy → results → robustness → factors → limitations) and **displays** the outputs generated by the code (`main.py` + `src/`).

**Simple rule:**
- `main.py` runs the computations and writes the files to `outs/`
- this notebook loads the CSV/PNG files and displays them, with brief commentary.


## 0) Colab setup (clone + dependencies)
Run this cell **only once** per Colab session; running it again would clone the repository a second time inside the first clone.

In [None]:
!git clone https://github.com/serynezerhouni/Projet-python.git
%cd Projet-python
!pip -q install yfinance pandas_datareader statsmodels
!ls


## 1) Context & objective
We build a **cross-sectional momentum** long/short strategy on a US large-cap universe (S&P 100-style).

**Objective:**
- Build a momentum signal (ROC + MA trend + RSI and liquidity penalties)
- Build a long/short portfolio (70/30) weighted by `rank × inverse-vol`
- Backtest and export the results
- Test robustness (sub-periods, hyperparameter sensitivity)
- Measure market exposure (CAPM) and factor loadings (FF3)
- Run an ablation (remove RSI / volume)


## 2) Data & universe
Prices and volumes are downloaded via **yfinance**. We quickly check the shape of the data (dates × tickers).

In [None]:
import pandas as pd
import numpy as np

from src.data import get_prices_and_volume

UNIVERSE = [
    "AAPL","ABBV","ABT","ACN","ADBE","AIG","AMD","AMGN","AMT","AMZN","AVGO","AXP","BA","BAC","BK",
    "BKNG","BLK","BMY","BRK-B","C","CAT","CL","CMCSA","COF","COP","COST","CRM","CSCO","CVS","CVX",
    "DE","DHR","DIS","DUK","EMR","FDX","GD","GE","GILD","GM","GOOG","GOOGL","GS","HD","HON","IBM",
    "INTC","INTU","ISRG","JNJ","JPM","KO","LIN","LLY","LMT","LOW","MA","MCD","MDLZ","MDT","MET",
    "META","MMM","MRK","MS","MSFT","NEE","NFLX","NKE","NOW","NVDA","ORCL","PEP","PFE","PG","PLTR",
    "PM","PYPL","QCOM","RTX","SBUX","SCHW","SO","SPG","T","TGT","TMO","TMUS","TSLA","TXN","UBER",
    "UNH","UNP","UPS","USB","V","VZ","WFC","WMT","XOM"
]

START_DATE = "2010-01-01"
END_DATE = None

prices, volumes = get_prices_and_volume(UNIVERSE, start=START_DATE, end=END_DATE)
print("prices shape:", prices.shape)
print("volumes shape:", volumes.shape)
prices.tail(3)


### 2.1 Quick check (optional)
- Fraction of missing values per ticker (0 to 1), ten worst tickers first

In [None]:
missing_pct = prices.isna().mean().sort_values(ascending=False)
missing_pct.head(10)


## 3) Final strategy (rules + parameters)

### Signal (Sleeve A)
- ROC over ~6 months (126 trading days)
- MA50/MA200 trend
- RSI penalty when overbought
- liquidity penalty when volume < threshold
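
The four ingredients listed above can be sketched as follows. This is a minimal illustration, not the code in `src/signals.py`: the function name `sketch_momentum_score`, the way the terms are combined (z-scored ROC plus a trend term minus penalties), the RSI window (14), the overbought level (70), the volume threshold and the penalty size are all assumptions made for the example. The RSI here uses simple rolling averages rather than Wilder's smoothing.

```python
import numpy as np
import pandas as pd


def sketch_momentum_score(prices, volumes, roc_days=126, rsi_days=14,
                          rsi_overbought=70.0, min_avg_volume=1e6,
                          penalty=0.5):
    """Illustrative cross-sectional score on the last date of `prices`.

    Every threshold and the penalty size are hypothetical defaults.
    """
    # Rate of change over `roc_days` trading days.
    roc = prices.iloc[-1] / prices.iloc[-1 - roc_days] - 1.0

    # Trend term: +1 when MA50 is above MA200, -1 when below.
    ma50 = prices.rolling(50).mean().iloc[-1]
    ma200 = prices.rolling(200).mean().iloc[-1]
    trend = np.sign(ma50 - ma200)

    # Simple-average RSI (not Wilder's exponential smoothing).
    delta = prices.diff()
    gain = delta.clip(lower=0).rolling(rsi_days).mean().iloc[-1]
    loss = delta.clip(upper=0).abs().rolling(rsi_days).mean().iloc[-1]
    rsi = 100.0 - 100.0 / (1.0 + gain / loss)
    overbought = (rsi > rsi_overbought).astype(float)

    # Liquidity flag: 20-day average volume below the threshold.
    avg_volume = volumes.rolling(20).mean().iloc[-1]
    illiquid = (avg_volume < min_avg_volume).astype(float)

    # Combine: standardised ROC, trend bonus/malus, then the two penalties.
    z_roc = (roc - roc.mean()) / roc.std()
    return z_roc + 0.5 * trend - penalty * overbought - penalty * illiquid
```

With real data, calling `sketch_momentum_score(prices, volumes)` returns one score per ticker; higher means stronger momentum after penalties.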

### Portfolio
- selection: top 20% long, bottom 40% short
- exposures: 70% long / 30% short
- weighting: `rank × inverse-vol` (20-day volatility window)
- monthly rebalancing; transaction costs = 8 bps × turnover
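
To make the portfolio rules concrete, here is a hedged sketch of one way to turn scores into `rank × inverse-vol` weights and to charge costs on turnover. It is not the implementation in `src/portfolio.py`: the function names, the ranking convention within each side, and the definition of turnover as the sum of absolute weight changes are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def sketch_rank_inv_vol_weights(scores, returns, top_pct=0.2, bottom_pct=0.4,
                                long_exposure=0.7, short_exposure=0.3,
                                vol_window=20):
    """Illustrative rank x inverse-vol weights (hypothetical helper)."""
    scores = scores.dropna()
    n_long = max(1, int(round(len(scores) * top_pct)))
    n_short = max(1, int(round(len(scores) * bottom_pct)))

    ranked = scores.sort_values(ascending=False)
    longs = ranked.index[:n_long]      # highest scores
    shorts = ranked.index[-n_short:]   # lowest scores

    # Inverse of the recent realised volatility of daily returns.
    inv_vol = 1.0 / returns[scores.index].tail(vol_window).std()

    # Within each side, the most extreme score gets the largest rank.
    long_rank = scores[longs].rank(ascending=True)
    short_rank = scores[shorts].rank(ascending=False)
    raw_long = long_rank * inv_vol[longs]
    raw_short = short_rank * inv_vol[shorts]

    # Scale each side to its target gross exposure.
    weights = pd.Series(0.0, index=scores.index)
    weights.loc[longs] = long_exposure * raw_long / raw_long.sum()
    weights.loc[shorts] = -short_exposure * raw_short / raw_short.sum()
    return weights


def sketch_transaction_cost(w_old, w_new, cost_bps=8.0):
    """Cost as a return drag: bps x turnover (sum of absolute changes)."""
    turnover = (w_new - w_old).abs().sum()
    return cost_bps / 1e4 * turnover
```

Under these assumptions, building the portfolio from cash has a turnover of 1.0 (0.7 long plus 0.3 short), so the first rebalance costs 8 bps of capital.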


## 4) Run the pipeline (writes the exports to outs/)
This cell runs `main.py`, which automatically writes:
- `outs/tables/*.csv`
- `outs/figures/*.png`


In [None]:
!python main.py

!ls outs
!ls outs/tables | head
!ls outs/figures | head


## 5) Results: final backtest

In [None]:
from IPython.display import Image, display

perf_final = pd.read_csv("outs/tables/perf_final.csv")
perf_final


In [None]:
display(Image("outs/figures/equity_final.png"))
display(Image("outs/figures/drawdown_final.png"))


## 6) Comparison: equal-weight vs rank/inv-vol

In [None]:
cmp = pd.read_csv("outs/tables/weighting_comparison.csv")
cmp


In [None]:
display(Image("outs/figures/equity_equal_vs_invvol.png"))
display(Image("outs/figures/drawdown_equal_vs_invvol.png"))


## 7) Robustness
### 7.1 Sub-periods

In [None]:
sub = pd.read_csv("outs/tables/subperiods_table.csv")
sub


### 7.2 Hyperparameter sensitivity (top 10 by Sharpe)

In [None]:
grid_top10 = pd.read_csv("outs/tables/sensitivity_best_top10.csv")
grid_top10


## 8) Market exposure (CAPM)

In [None]:
capm = pd.read_csv("outs/tables/capm_7030_vs_5050.csv")
capm
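
For readers unfamiliar with the regression behind a CAPM table, the sketch below fits $r_s - r_f = \alpha + \beta\,(r_m - r_f) + \varepsilon$ by ordinary least squares. It is only an illustration: the function name `sketch_capm_fit` and the 252-day annualisation are assumptions, and the project's own code may estimate the exposure differently (for example with `statsmodels`).

```python
import numpy as np


def sketch_capm_fit(strategy_excess, market_excess, periods_per_year=252):
    """OLS fit of strategy excess returns on market excess returns.

    Hypothetical helper: returns the annualised alpha and the market beta.
    """
    y = np.asarray(strategy_excess, dtype=float)
    x = np.asarray(market_excess, dtype=float)

    # Design matrix with an intercept column, so alpha is estimated too.
    X = np.column_stack([np.ones_like(x), x])
    (alpha, beta), *_ = np.linalg.lstsq(X, y, rcond=None)

    return {"alpha_annual": alpha * periods_per_year, "beta": beta}
```

A beta close to zero would indicate a strategy that is roughly market-neutral; with 70% long and 30% short exposure, some positive net market exposure is to be expected.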


## 9) Fama-French 3-factor model

In [None]:
ff3 = pd.read_csv("outs/tables/ff3_loadings_final.csv")
ff3


## 10) Ablation study

In [None]:
ablation = pd.read_csv("outs/tables/ablation_final.csv")
ablation


In [None]:
display(Image("outs/figures/ablation_sharpe_bar.png"))


## 11) Weights “as of today” (both variants)
`main.py` does not export these yet.
This cell **computes and saves**:
- `outs/tables/weights_equal_<date>.csv`
- `outs/tables/weights_invvol_<date>.csv`


In [None]:
from src.signals import momentum_scores_pocheA
from src.portfolio import compute_weights_from_scores

as_of = prices.index.max()
rets = prices.pct_change().replace([np.inf, -np.inf], np.nan).fillna(0.0)

# Variant A: equal-weight + risk-adjusted score
scores_A = momentum_scores_pocheA(prices, volumes, lookback_days=60, as_of_date=as_of, risk_adjust_by_vol=True)
w_A = compute_weights_from_scores(
    scores=scores_A.reindex(prices.columns),
    returns=rets.loc[:as_of, prices.columns],
    vol_window=20,
    top_pct=0.2,
    bottom_pct=0.4,
    long_exposure=0.7,
    short_exposure=0.3,
    max_weight=None,
    min_names_per_side=3,
    weight_scheme="equal",
).reindex(prices.columns).fillna(0.0)

# Variant B: rank_inv_vol + score NOT risk-adjusted (final)
scores_B = momentum_scores_pocheA(prices, volumes, lookback_days=60, as_of_date=as_of, risk_adjust_by_vol=False)
w_B = compute_weights_from_scores(
    scores=scores_B.reindex(prices.columns),
    returns=rets.loc[:as_of, prices.columns],
    vol_window=20,
    top_pct=0.2,
    bottom_pct=0.4,
    long_exposure=0.7,
    short_exposure=0.3,
    max_weight=None,
    min_names_per_side=3,
    weight_scheme="rank_inv_vol",
).reindex(prices.columns).fillna(0.0)

date_str = pd.to_datetime(as_of).strftime("%Y-%m-%d")
path_A = f"outs/tables/weights_equal_{date_str}.csv"
path_B = f"outs/tables/weights_invvol_{date_str}.csv"

pd.DataFrame({"ticker": w_A.index, "weight": w_A.values}).to_csv(path_A, index=False)
pd.DataFrame({"ticker": w_B.index, "weight": w_B.values}).to_csv(path_B, index=False)

print("saved:", path_A)
print("saved:", path_B)

print("\nTop 10 Equal-weight:")
display(w_A.sort_values(ascending=False).head(10).to_frame("weight"))

print("\nTop 10 Rank×InvVol:")
display(w_B.sort_values(ascending=False).head(10).to_frame("weight"))


## 12) Limitations & biases (to be written up as prose)
- Survivorship bias / fixed universe
- Data quality (yfinance)
- Slippage not modelled (simplified transaction costs)
- Shorting: borrow costs not included
- Real-world liquidity not modelled
- Potential overfitting (grid search)


## 13) Download all outputs
Zip and download the `outs/` folder.

In [None]:
!zip -r outs.zip outs
from google.colab import files
files.download("outs.zip")
