{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 🚀 DVC + Poetry Pipeline\n",
    "\n",
    "This notebook runs every stage of your DVC pipeline through Poetry:\n",
    "\n",
    "- ✔ Dataset download\n",
    "- ✔ Preprocessing\n",
    "- ✔ Training\n",
    "- ✔ Evaluation\n",
    "- ✔ Plots and metrics\n",
    "- ✔ Pipeline visualization (DAG)\n",
    "- ✔ Comparison between runs (metrics diff)\n",
    "\n",
    "---\n",
    "### ⚠ Important\n",
    "Always use `!poetry run ...` inside the notebook so that commands run in the project's Poetry environment.\n"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 🔧 Checking environment versions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!poetry run python --version\n",
    "!poetry run dvc --version"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ▶️ Run the whole pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!poetry run dvc repro"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 🧩 Run individual stages"
   ]
  },

  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Download the dataset\n",
    "!poetry run dvc repro download"
   ]
  },

  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Preprocessing\n",
    "!poetry run dvc repro preprocess"
   ]
  },

  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Training\n",
    "!poetry run dvc repro train"
   ]
  },

  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Evaluation\n",
    "!poetry run dvc repro evaluate"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 📊 Reading training metrics (metrics.json)"
   ]
  },

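  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import pandas as pd\n",
    "\n",
    "# Load the training metrics\n",
    "with open(\"metrics.json\") as f:\n",
    "    metrics = json.load(f)\n",
    "\n",
    "# Assumes metrics.json is nested (name -> dict of values);\n",
    "# a flat dict of scalars would need pd.Series(metrics) instead\n",
    "pd.DataFrame(metrics).T"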
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 📈 Reading AUC from eval.json"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Load the evaluation metrics\n",
    "with open(\"eval.json\") as f:\n",
    "    eval_metrics = json.load(f)\n",
    "eval_metrics"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 📉 ROC curve"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import joblib\n",
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "from sklearn.metrics import roc_curve, auc\n",
    "\n",
    "# Load the processed data and the trained model\n",
    "df = pd.read_csv(\"data/processed.csv\")\n",
    "model = joblib.load(\"src/creditcard_ml/model/model.pkl\")\n",
    "\n",
    "# Separate the features from the target column\n",
    "X = df.drop(columns=[\"Class\"])\n",
    "y = df[\"Class\"]\n",
    "\n",
    "# Score the positive class and compute the ROC curve\n",
    "proba = model.predict_proba(X)[:, 1]\n",
    "fpr, tpr, _ = roc_curve(y, proba)\n",
    "\n",
    "plt.plot(fpr, tpr, label=f\"AUC = {auc(fpr, tpr):.3f}\")\n",
    "plt.title(\"ROC Curve\")\n",
    "plt.xlabel(\"False Positive Rate\")\n",
    "plt.ylabel(\"True Positive Rate\")\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "plt.show()"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 🔍 Visualize the pipeline DAG"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!poetry run dvc dag"
   ]
  },

  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 🔄 Compare metrics between runs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!poetry run dvc metrics diff"
   ]
  }

 ],

 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
