{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 04 – Model Inversion Attack Simulation\n",
    "\n",
    "In this notebook, we simulate a basic model inversion attack: a technique in which an attacker tries to reconstruct sensitive training data by exploiting the predictions of a trained model.\n",
    "\n",
    "This is particularly relevant in finance and healthcare, where reconstructed features may reveal attributes that are private or subject to regulation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Imports: third-party libraries first, then project-local modules\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "\n",
    "from src.data_loader import load_and_preprocess_data\n",
    "from src.model_trainer import train_model\n",
    "\n",
    "sns.set_theme(style=\"whitegrid\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load data and train model\n",
    "X_train, X_test, y_train, y_test = load_and_preprocess_data()\n",
    "model = train_model(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inversion Goal\n",
    "Assume an attacker can query the model and wants to infer sensitive input patterns (e.g. the approximate credit amounts of individuals who received credit).\n",
    "\n",
    "We simulate this by:\n",
    "- Fixing all input features except `Credit amount`\n",
    "- Generating inputs that differ only in `Credit amount`\n",
    "- Observing how the model output changes in order to infer decision thresholds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
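    "# Choose one real sample from the test set and repeat it n_points times\n",
    "# (slicing with [[0]] keeps a DataFrame, so the column dtypes are preserved)\n",
    "n_points = 50\n",
    "base_sample_df = pd.concat([X_test.iloc[[0]]] * n_points, ignore_index=True)\n",
    "\n",
    "# Vary only 'Credit amount' over a range\n",
    "# (assumes the column is in raw, unscaled units after preprocessing)\n",
    "base_sample_df[\"Credit amount\"] = np.linspace(500, 20000, n_points)\n",
    "\n",
    "# Predicted probability of class 1 for each probe input\n",
    "probs = model.predict_proba(base_sample_df)[:, 1]"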
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot predictions vs. manipulated feature\n",
    "plt.figure(figsize=(8, 4))\n",
    "plt.plot(base_sample_df[\"Credit amount\"], probs)\n",
    "plt.xlabel(\"Credit amount\")\n",
    "plt.ylabel(\"Predicted Default Probability\")\n",
    "plt.title(\"Model Inversion – Reconstructing Credit Risk Pattern\")\n",
    "plt.grid(True)\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
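  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The final step of the probe, inferring thresholds from the observed outputs, can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the original pipeline: `find_threshold_crossings` is a hypothetical helper, the 0.5 decision threshold is assumed, and the logistic curve is a synthetic stand-in for real model outputs. In this notebook the same function could be applied to `base_sample_df[\"Credit amount\"]` and `probs`. For tree-based models the curve is a step function, so the interpolated value is only an estimate within the bracketing grid interval."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def find_threshold_crossings(values, probs, threshold=0.5):\n",
    "    '''Return interpolated feature values where probs crosses threshold.'''\n",
    "    values = np.asarray(values, dtype=float)\n",
    "    probs = np.asarray(probs, dtype=float)\n",
    "    diff = probs - threshold\n",
    "    # A sign change between neighbouring points marks a crossing in that interval\n",
    "    idx = np.where(diff[:-1] * diff[1:] < 0)[0]\n",
    "    # Linear interpolation between the two bracketing grid points\n",
    "    frac = -diff[idx] / (diff[idx + 1] - diff[idx])\n",
    "    return values[idx] + frac * (values[idx + 1] - values[idx])\n",
    "\n",
    "\n",
    "# Synthetic stand-in for probed model outputs (assumption, not real data)\n",
    "grid = np.linspace(500, 20000, 50)\n",
    "synthetic_probs = 1 / (1 + np.exp(-(grid - 8000) / 1500))\n",
    "\n",
    "crossings = find_threshold_crossings(grid, synthetic_probs)\n",
    "print(crossings)"
   ]
  },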
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "This example demonstrates how a prediction API can be probed to infer sensitive input-output relationships.\n",
    "Model inversion is a known attack vector and is especially dangerous when APIs are openly accessible and queries are not monitored.\n",
    "\n",
    "Mitigation strategies include randomizing outputs, clipping or rounding confidence scores, and limiting access to probability scores (for example, by returning only the predicted label)."
   ]
  }
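  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two of the mitigations named above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a vetted defense: `harden_probabilities` and `labels_only` are hypothetical helpers, and the clipping bounds (0.1 and 0.9), the rounding precision, and the 0.5 threshold are arbitrary choices. Coarser outputs give an attacker less signal per query, at the cost of less informative scores for legitimate users."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def harden_probabilities(probs, low=0.1, high=0.9, decimals=1):\n",
    "    '''Clip probabilities to [low, high], then round them to coarse steps.'''\n",
    "    clipped = np.clip(np.asarray(probs, dtype=float), low, high)\n",
    "    return np.round(clipped, decimals)\n",
    "\n",
    "\n",
    "def labels_only(probs, threshold=0.5):\n",
    "    '''Expose only the hard decision instead of a probability score.'''\n",
    "    return (np.asarray(probs, dtype=float) >= threshold).astype(int)\n",
    "\n",
    "\n",
    "# Made-up scores for illustration only\n",
    "raw = np.array([0.03, 0.537, 0.97])\n",
    "print(harden_probabilities(raw))\n",
    "print(labels_only(raw))"
   ]
  }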
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}