{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example 1: Basic Extraction Workflow\n",
    "\n",
    "This notebook demonstrates the basic end-to-end workflow of the `evidence-extractor` tool.\n",
    "\n",
    "We will cover:\n",
    "1.  **Setting up**: Ensuring your environment is ready.\n",
    "2.  **Running the Extraction**: Executing the main `extract` command on a sample PDF.\n",
    "3.  **Inspecting the JSON Output**: Loading and exploring the structured data.\n",
    "4.  **Running the Review**: Using the interactive `review` command to validate the results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Setup\n",
    "\n",
    "First, make sure you have followed the installation instructions in the main `README.md` file.\n",
    "\n",
    "Crucially, you must have:\n",
    "- Installed the `evidence-extractor` package (`pip install -e .`).\n",
    "- Created a `.env` file in the root of the project with your `GEMINI_API_KEY`.\n",
    "- Placed a sample PDF file in the `data/raw/` directory. We will use `sample.pdf` for this example."
   ]
  },
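  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The checklist above can be verified from the notebook before spending an API call. The cell below is a minimal sketch, not part of `evidence-extractor`: `check_setup` is a hypothetical helper, and it assumes the notebook's working directory is the project root. It only looks for a line starting with `GEMINI_API_KEY=` in `.env` and does not validate the key itself. If your notebook runs from a subdirectory, pass the project root explicitly, for example `check_setup('..')`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def check_setup(root='.'):\n",
    "    # Hypothetical helper: reports whether the files this example relies on exist under `root`.\n",
    "    root = Path(root)\n",
    "    env_file = root / '.env'\n",
    "    pdf_file = root / 'data' / 'raw' / 'sample.pdf'\n",
    "    # Naive check: a line that starts with GEMINI_API_KEY= (the value is never read or printed).\n",
    "    has_key = env_file.is_file() and any(\n",
    "        line.strip().startswith('GEMINI_API_KEY=')\n",
    "        for line in env_file.read_text(encoding='utf-8').splitlines()\n",
    "    )\n",
    "    return {\n",
    "        '.env with GEMINI_API_KEY': has_key,\n",
    "        'data/raw/sample.pdf': pdf_file.is_file(),\n",
    "    }\n",
    "\n",
    "\n",
    "print('Working directory:', Path.cwd())\n",
    "for item, ok in check_setup().items():\n",
    "    status = 'ok' if ok else 'MISSING'\n",
    "    print(f'{status}: {item}')"
   ]
  },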
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Running the Extraction\n",
    "\n",
    "We can execute shell commands directly from the notebook by starting a line with `!`. Let's run the `extract` command.\n",
    "\n",
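    "This will process `data/raw/sample.pdf` and save the structured output to `data/processed/sample_extraction.json`. Commands started with `!` run in the notebook's working directory, so both relative paths are resolved from there."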
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!evidence-extractor extract --pdf data/raw/sample.pdf --output data/processed/sample_extraction.json"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The application's log output should appear above and end without errors. If it reports an error, check that your `.env` file exists and contains a valid `GEMINI_API_KEY`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Inspecting the JSON Output\n",
    "\n",
    "Now that the extraction is complete, we can load the resulting JSON file and inspect its contents using Python."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from pprint import pprint\n",
    "\n",
    "# Same relative path that was passed to --output in the extract command above\n",
    "output_path = \"data/processed/sample_extraction.json\"\n",
    "\n",
    "with open(output_path, 'r', encoding='utf-8') as f:\n",
    "    data = json.load(f)\n",
    "\n",
    "# Let's look at the generated summary and the first extracted claim\n",
    "print(\"--- Generated Summary ---\")\n",
    "pprint(data.get('summary'))\n",
    "\n",
    "if data.get('claims'):\n",
    "    print(\"\\n--- First Claim ---\")\n",
    "    pprint(data['claims'][0])"
   ]
  },
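  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before relying on specific keys such as `summary` or `claims`, it can help to see what the file actually contains. The cell below is a small sketch that lists each top-level key with its value type and, for lists and dicts, its size. `top_level_overview` is a hypothetical helper, and it assumes the JSON root is an object and that the path matches the `--output` value used earlier. Adjust the path if your notebook runs from a different directory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def top_level_overview(path):\n",
    "    # Hypothetical helper: one line per top-level key with its type and container size.\n",
    "    with open(path, encoding='utf-8') as f:\n",
    "        loaded = json.load(f)\n",
    "    lines = []\n",
    "    for key, value in loaded.items():\n",
    "        size = f' ({len(value)} items)' if isinstance(value, (list, dict)) else ''\n",
    "        lines.append(f'{key}: {type(value).__name__}{size}')\n",
    "    return lines\n",
    "\n",
    "\n",
    "result_file = Path('data/processed/sample_extraction.json')\n",
    "if result_file.is_file():\n",
    "    for line in top_level_overview(result_file):\n",
    "        print(line)\n",
    "else:\n",
    "    print(f'{result_file} not found; run the extraction step first or adjust the path.')"
   ]
  },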
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Running the Review (Optional)\n",
    "\n",
    "The `review` command is interactive: it waits for keyboard input, so it cannot be run from a notebook cell with `!`.\n",
    "\n",
    "To try it, open a terminal, activate your virtual environment, and run the following command:\n",
    "\n",
    "```bash\n",
    "evidence-extractor review data/processed/sample_extraction.json\n",
    "```\n",
    "\n",
    "You will be prompted to verify, reject, or skip each of the key findings, and your changes will be saved back to the JSON file."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}