{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Run AlphaFold Pipeline\n",
    "\n",
    "This notebook demonstrates how to:\n",
    "1. Connect to your Azure Machine Learning workspace.\n",
    "2. Locate and submit the published **AlphaFold** pipeline.\n",
    "3. Monitor the run and retrieve outputs.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Environment Setup\n",
    "\n",
    "Install or upgrade the Azure ML SDK (v1) if it is not already available, for example when running locally:\n",
    "```bash\n",
    "pip install --upgrade azureml-core azureml-pipeline-core\n",
    "```\n"
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "tags": []
   },
   "source": [
    "import os\n",
    "from azureml.core import Workspace, Experiment\n",
    "from azureml.pipeline.core import PublishedPipeline, PipelineRun\n",
    "\n",
    "# IMPORTANT: Set these environment variables, or replace the placeholders with your own values\n",
    "subscription_id = os.getenv(\"AZURE_SUBSCRIPTION_ID\", \"<YOUR-SUBSCRIPTION-ID>\")\n",
    "resource_group = os.getenv(\"AZURE_RG\", \"<YOUR-RESOURCE-GROUP>\")\n",
    "workspace_name = os.getenv(\"AZURE_WORKSPACE_NAME\", \"<YOUR-AML-WORKSPACE>\")\n",
    "\n",
    "# Connect to the AML workspace\n",
    "ws = Workspace.get(\n",
    "    name=workspace_name,\n",
    "    subscription_id=subscription_id,\n",
    "    resource_group=resource_group\n",
    ")\n",
    "print(\"Workspace:\", ws.name, \"loaded.\")\n"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Locate the Published AlphaFold Pipeline\n",
    "\n",
    "When the pipelines were registered, a pipeline named `AlphaFold_Pipeline` should have been published in your workspace. The cell below lists every published pipeline and records the ID of the one with that name."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "source": [
    "published_pipelines = PublishedPipeline.list(ws)\n",
    "pipeline_id = None\n",
    "\n",
    "for p in published_pipelines:\n",
    "    print(f\"Found pipeline: {p.name} (ID: {p.id})\")\n",
    "    if p.name == \"AlphaFold_Pipeline\":\n",
    "        pipeline_id = p.id\n",
    "\n",
    "if pipeline_id:\n",
    "    print(\"\\nAlphaFold_Pipeline found. ID:\", pipeline_id)\n",
    "else:\n",
    "    raise ValueError(\"No published pipeline named 'AlphaFold_Pipeline' found.\")"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Submit a Run of the Pipeline\n",
    "\n",
    "We’ll create an **Experiment** in AML (e.g., `alphafold-inference`) and submit the published pipeline, either with its default parameters or with an input FASTA file passed as a pipeline parameter."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "source": [
    "# Create an experiment named 'alphafold-inference'\n",
    "experiment = Experiment(ws, \"alphafold-inference\")\n",
    "\n",
    "published_pipeline = PublishedPipeline.get(ws, pipeline_id)\n",
    "# If your pipeline has parameters, you can pass them here:\n",
    "# e.g., pipeline_parameters = {\"input_fasta\": \"data/sample.fasta\"}\n",
    "\n",
    "pipeline_run = experiment.submit(\n",
    "    published_pipeline,\n",
    "    # pipeline_parameters=pipeline_parameters\n",
    ")\n",
    "print(f\"Submitted pipeline run: {pipeline_run.id}\")\n",
    "pipeline_run.wait_for_completion(show_output=True)\n",
    "\n",
    "# Retrieve final status\n",
    "run_status = pipeline_run.get_status()\n",
    "print(\"\\nPipeline run finished with status:\", run_status)\n",
    "\n",
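    "# Treat a canceled run as a failure too, not only an explicit 'Failed' status\n",
    "if run_status in (\"Failed\", \"Canceled\"):\n",
    "    raise RuntimeError(f\"Pipeline run ended with status '{run_status}'. Check the logs above for details.\")"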
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Inspect Outputs\n",
    "\n",
    "You can explore outputs in two ways:\n",
    "1. **Azure ML Studio**: Go to Jobs → your pipeline run → Outputs/Logs.\n",
    "2. **Programmatically**: Download output artifacts using the run's artifact methods.\n",
    "\n",
    "The example below finds the AlphaFold step run and downloads the artifacts whose paths start with a given prefix. Adjust the step name and the prefix to match your pipeline."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "source": [
    "# Example: assumes the step uploaded its results as run artifacts under the path prefix 'alphafold_output'\n",
    "alphafold_step_run = None\n",
    "for step_run in pipeline_run.get_children():\n",
    "    if step_run.name.lower().startswith(\"alphafold\"):  # or exact name match\n",
    "        alphafold_step_run = step_run\n",
    "        break\n",
    "\n",
    "if alphafold_step_run:\n",
    "    print(\"AlphaFold step run ID:\", alphafold_step_run.id)\n",
    "    # download_files fetches every run artifact whose path starts with the given prefix\n",
    "    alphafold_step_run.download_files(\n",
    "        output_directory=\"./local_alphafold_results\",\n",
    "        prefix=\"alphafold_output\"  # or the prefix used in your step\n",
    "    )\n",
    "    print(\"Downloaded AlphaFold results to local_alphafold_results/\")\n",
    "else:\n",
    "    print(\"No step matching 'AlphaFold' found. Check pipeline step names.\")"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Next Steps\n",
    "1. **Modify** the pipeline definition in `alphafold_pipeline.py` to include additional steps or use GPU compute.\n",
    "2. **Adjust** the environment Dockerfile in `environments/alphafold_env.dockerfile` to update dependencies or their versions.\n",
    "3. **Deploy** changes by re-running your GitHub Actions workflow or the `scripts/register_pipelines.py` script.\n",
    "4. **Submit** new runs from this notebook or from `sample_inference.py`.\n",
    "\n",
    "You now have a working AlphaFold pipeline in Azure ML!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
