{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Modeling — Stock Market Analytics\n",
    "\n",
    "Notebook for modeling experiments on market features.\n",
    "\n",
    "This notebook is an exploratory space; official training runs through `src/models/train.py`.\n",
    "\n",
    "> **Note**: educational content, not investment advice."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.model_selection import TimeSeriesSplit\n",
    "from sklearn.linear_model import LinearRegression, LogisticRegression\n",
    "from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, accuracy_score, f1_score, roc_auc_score\n",
    "\n",
    "from src import config\n",
    "from src.utils.io import load_parquet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Load Features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "features_path = config.ANALYTICS_DIR / \"features.parquet\"\n",
    "features = load_parquet(features_path)\n",
    "features.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Prepare Data"
   ]
  },
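  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Drop rows with missing targets instead of filling them with 0, which would fabricate labels.\n",
    "# Sort by date (assumes a `date` column) so the positional folds of TimeSeriesSplit follow time order.\n",
    "exclude = [\"date\", \"ticker\", \"target_reg_5d\", \"target_cls_5d\"]\n",
    "data = features.dropna(subset=[\"target_reg_5d\", \"target_cls_5d\"]).sort_values(\"date\").reset_index(drop=True)\n",
    "X = data[[c for c in data.columns if c not in exclude]].fillna(0)\n",
    "y_reg = data[\"target_reg_5d\"]\n",
    "y_cls = data[\"target_cls_5d\"]"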
   ]
  },
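  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`TimeSeriesSplit` splits by row position, not by date, so the folds used in the following sections are chronological only if the rows of `X` are ordered by date. The cell below is a small sketch for inspecting the fold boundaries. It assumes `X` from the previous cell; `preview_cv` is an illustrative name that is not part of the project."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import TimeSeriesSplit\n",
    "\n",
    "# Illustrative splitter with the same number of folds as the modeling cells.\n",
    "preview_cv = TimeSeriesSplit(n_splits=3)\n",
    "\n",
    "# Print the positional range and size of the train and test portion of each fold.\n",
    "for fold, (train_idx, test_idx) in enumerate(preview_cv.split(X), start=1):\n",
    "    print(\n",
    "        f\"fold {fold}: train rows {train_idx[0]}-{train_idx[-1]} ({len(train_idx)} rows) | \"\n",
    "        f\"test rows {test_idx[0]}-{test_idx[-1]} ({len(test_idx)} rows)\"\n",
    "    )"
   ]
  },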
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Modeling — Regression (target_reg_5d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Walk-forward evaluation: each fold trains on earlier rows and tests on the rows that follow.\n",
    "tscv = TimeSeriesSplit(n_splits=3)\n",
    "reg = LinearRegression()\n",
    "maes, mapes = [], []\n",
    "\n",
    "for train_idx, test_idx in tscv.split(X):\n",
    "    reg.fit(X.iloc[train_idx], y_reg.iloc[train_idx])\n",
    "    preds = reg.predict(X.iloc[test_idx])\n",
    "    maes.append(mean_absolute_error(y_reg.iloc[test_idx], preds))\n",
    "    # MAPE divides by the true value, so it can explode when the target is close to zero.\n",
    "    mapes.append(mean_absolute_percentage_error(y_reg.iloc[test_idx], preds))\n",
    "\n",
    "# Report each metric averaged across folds.\n",
    "print(f\"MAE: {np.mean(maes):.4f} | MAPE: {np.mean(mapes):.4f}\")"
   ]
  },
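  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An MAE value is hard to interpret in isolation. One way to put it in context, sketched below, is to score a naive baseline that always predicts the mean target of the training fold, using scikit-learn's `DummyRegressor`. This comparison is an illustrative addition; it assumes `X`, `y_reg`, and `tscv` from the cells above, and `baseline` and `baseline_maes` are hypothetical names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.dummy import DummyRegressor\n",
    "from sklearn.metrics import mean_absolute_error\n",
    "\n",
    "# Naive baseline: ignores the features and predicts the training-fold mean.\n",
    "baseline = DummyRegressor(strategy=\"mean\")\n",
    "baseline_maes = []\n",
    "\n",
    "# Same folds as the linear regression, so the two MAE values are comparable.\n",
    "for train_idx, test_idx in tscv.split(X):\n",
    "    baseline.fit(X.iloc[train_idx], y_reg.iloc[train_idx])\n",
    "    baseline_preds = baseline.predict(X.iloc[test_idx])\n",
    "    baseline_maes.append(mean_absolute_error(y_reg.iloc[test_idx], baseline_preds))\n",
    "\n",
    "print(f\"Baseline MAE (training-mean prediction): {np.mean(baseline_maes):.4f}\")"
   ]
  },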
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Modeling — Classification (target_cls_5d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reuses `tscv` from the regression cell, so both models are scored on the same folds.\n",
    "cls = LogisticRegression(max_iter=500)\n",
    "accs, f1s, aucs = [], [], []\n",
    "\n",
    "for train_idx, test_idx in tscv.split(X):\n",
    "    cls.fit(X.iloc[train_idx], y_cls.iloc[train_idx])\n",
    "    preds = cls.predict(X.iloc[test_idx])\n",
    "    # Probability of the positive class (second column), needed for ROC AUC.\n",
    "    probs = cls.predict_proba(X.iloc[test_idx])[:, 1]\n",
    "    accs.append(accuracy_score(y_cls.iloc[test_idx], preds))\n",
    "    f1s.append(f1_score(y_cls.iloc[test_idx], preds))\n",
    "    # roc_auc_score raises an error if a test fold contains only one class.\n",
    "    aucs.append(roc_auc_score(y_cls.iloc[test_idx], probs))\n",
    "\n",
    "# Report each metric averaged across folds.\n",
    "print(f\"Acc: {np.mean(accs):.3f} | F1: {np.mean(f1s):.3f} | AUC: {np.mean(aucs):.3f}\")"
   ]
  },
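  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As one possible comparison against the logistic regression above, the sketch below scores a `RandomForestClassifier` on the same folds. The hyperparameters (`n_estimators=200`, `random_state=42`) are arbitrary illustrative choices, not tuned values, and `rf` and `rf_aucs` are hypothetical names. Folds whose training or test targets contain a single class are skipped, because ROC AUC is undefined there. Assumes `X`, `y_cls`, and `tscv` from the cells above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.metrics import roc_auc_score\n",
    "\n",
    "# Hyperparameters are illustrative assumptions, not tuned values.\n",
    "rf = RandomForestClassifier(n_estimators=200, random_state=42, n_jobs=-1)\n",
    "rf_aucs = []\n",
    "\n",
    "for train_idx, test_idx in tscv.split(X):\n",
    "    y_train, y_test = y_cls.iloc[train_idx], y_cls.iloc[test_idx]\n",
    "    # Skip folds where either side has a single class (AUC undefined, no second probability column).\n",
    "    if y_train.nunique() < 2 or y_test.nunique() < 2:\n",
    "        continue\n",
    "    rf.fit(X.iloc[train_idx], y_train)\n",
    "    rf_probs = rf.predict_proba(X.iloc[test_idx])[:, 1]\n",
    "    rf_aucs.append(roc_auc_score(y_test, rf_probs))\n",
    "\n",
    "if rf_aucs:\n",
    "    print(f\"RandomForest AUC: {np.mean(rf_aucs):.3f} over {len(rf_aucs)} fold(s)\")\n",
    "else:\n",
    "    print(\"No fold contained both classes; AUC could not be computed.\")"
   ]
  },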
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Observations\n",
    "- Record insights about the metrics here.\n",
    "- Optionally compare with other models (RandomForest, etc.).\n",
    "- Results may vary with the tickers and periods used."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
