Merge pull request #258 from YueyingCoding/update-notebook

update-notebook
YiVal · Dec 3, 2023 · 95674ee · 95674ee
2 parents efa33f8 + 0fc9dcc
commit 95674ee
Show file tree

Hide file tree

Showing 15 changed files with 8,415 additions and 1,127 deletions.
diff --git a/demo/tutorial_notebook/Auto_TikTok_Title_Generation_Bots_Demo.ipynb b/demo/tutorial_notebook/Auto_TikTok_Title_Generation_Bots_Demo.ipynb
diff --git a/demo/tutorial_notebook/Guardrails_run_leetcode.ipynb b/demo/tutorial_notebook/Guardrails_run_leetcode.ipynb
@@ -0,0 +1,215 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "source": [
+        "<a href=\"https://colab.research.google.com/drive/1QgRQmFmC_L07Ler4vbq_vcCNm_OHmJL_#scrollTo=nhaq7McR-Dpv\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ],
+      "metadata": {
+        "id": "nhaq7McR-Dpv"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Guardrail's Performance in Python Leetcode\n",
+        "\n",
+        "In this use case, our demo will evaluate performance of guardrails with Yival.\n",
+        "\n",
+        "After applying guardrails' wrapper to 80 Python Leetcode questions, you'll notice a discernible dip in performance, which may come with an uptick in cost and processing time.\n",
+        "\n",
+        "[Guardrails](https://github.com/ShreyaR/guardrails) is a Python package that lets a user add structure, type and quality guarantees to the outputs of large language models (LLMs).\n"
+      ],
+      "metadata": {
+        "id": "AkgS24Ei2nmI"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Install Yival with git"
+      ],
+      "metadata": {
+        "id": "BMcU4KWm1pyG"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Clone the latest yival\n",
+        "import os\n",
+        "!python --version\n",
+        "!rm -rf YiVal\n",
+        "!git clone https://github.com/YiVal/YiVal.git\n",
+        "\n",
+        "# Install and config poetry\n",
+        "import shutil\n",
+        "!pip install poetry\n",
+        "POETRY_PATH = shutil.which(\"poetry\") or (os.getenv(\"HOME\") + \"/.local/bin/poetry\")\n",
+        "os.environ[\"PATH\"] += os.pathsep + os.path.dirname(POETRY_PATH)\n",
+        "!poetry --version\n",
+        "!poetry config virtualenvs.create true"
+      ],
+      "metadata": {
+        "id": "C_1FUzv-1pCv"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "qGYH-EYCMuqR"
+      },
+      "outputs": [],
+      "source": [
+        "os.chdir(\"/content/YiVal\")\n",
+        "!poetry install --no-ansi"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Configure your OpenAI API key"
+      ],
+      "metadata": {
+        "id": "2VKHnus31zC4"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "os.environ['OPENAI_API_KEY']= ''"
+      ],
+      "metadata": {
+        "id": "Ym8BdfdG10sG"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Guardrails evaluation yml\n",
+        "## Dataset\n",
+        "The experiment sources its data from a CSV file which storage 80 leetcode problems (leetcode_problems.csv). The data source type is defined as a dataset.\n",
+        "\n",
+        "## Custom Function\n",
+        "The custom function `demo.guardrails.run_leetcode.run_leetcode` is utilized in this experiment. This function addresses Leetcode problems by either utilizing Guardrails' wrapper or directly employing the ChatCompletion interface. It then returns the corresponding Python code as the result.\n",
+        "\n",
+        "## Variations\n",
+        "In this experiment, we will explore two different variations: **`use_guardrails`** and **`gpt`**. **`use_guardrails`** indicates the use of Guardrails method in the experiment, while **`gpt`** signifies the absence of Guardrails method. These two variations will impact the outcomes and performance of the experiment.\n",
+        "\n",
+        "## Evaluators\n",
+        "The experiment will utilize an evaluator named **`python_validation_evaluator`** to assess the model's performance. This assessment hinges on the successful execution of the generated Python code. It will compute an average score as the evaluation outcome.\n",
+        "\n",
+        "```\n",
+        "custom_function: demo.guardrails.run_leetcode.run_leetcode\n",
+        "dataset:\n",
+        "  file_path: demo/guardrails/data/leetcode_problems.csv\n",
+        "  reader: csv_reader\n",
+        "  source_type: dataset\n",
+        "description: Generate the expected results for the Leetcode problems.\n",
+        "evaluators:\n",
+        "  - evaluator_type: individual\n",
+        "    matching_technique: includes\n",
+        "    metric_calculators:\n",
+        "      - method: AVERAGE\n",
+        "    name: python_validation_evaluator\n",
+        "\n",
+        "variations:\n",
+        "  - name: use_guardrails\n",
+        "    variations:\n",
+        "      - instantiated_value: \"use_guardrails\"\n",
+        "        value: \"use_guardrails\"\n",
+        "        value_type: str\n",
+        "        variation_id: null\n",
+        "      - instantiated_value: \"gpt\"\n",
+        "        value: \"gpt\"\n",
+        "        value_type: str\n",
+        "        variation_id: null\n",
+        "```"
+      ],
+      "metadata": {
+        "id": "KnmaSTEc13Rg"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Configure your Ngrok token\n",
+        "Our current ngrok authtoken only supports one public session at a time. If it's being used by others or if you're using it to run multiple Colabs at once, you might bump into a Network error. To avoid this, we suggest getting your own ngrok authtoken for your Colab notebooks. It's easy and free to get your own authtoken from ngrok.\n",
+        "\n",
+        "Here's how to do it:\n",
+        "- If you don't have a ngrok account yet, head over to https://dashboard.ngrok.com/login to sign up.\n",
+        "- Once you're logged in, you can grab your authtoken at https://dashboard.ngrok.com/get-started/your-authtoken.\n",
+        "\n",
+        "Prior to initiating a new demo, ensure that all other applications utilizing ngrok within Colab have been terminated via the `Connect -> Manage Sessions` pathway. You can check and manage your sessions as follow picture.\n",
+        "\n",
+        "![image](https://github.com/uni-zhuan/uni_CDN/blob/master/picture/Yival/iShot_2023-10-06_16.46.22.png?raw=true)"
+      ],
+      "metadata": {
+        "id": "W4aUMFd0SV2Q"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "os.environ['ngrok']='true'\n",
+        "!poetry run ngrok config add-authtoken 2UK3G7MKgDqCqDnu36njaaE02bZ_7FqvcqBke5hbpgHjizoo7"
+      ],
+      "metadata": {
+        "id": "j1fDEqSRSVR0"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# YIVAL！"
+      ],
+      "metadata": {
+        "id": "2iVdkyWvSxJF"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "!poetry run yival run /content/YiVal/demo/configs/guardrails_leetcode.yml --async_eval=True"
+      ],
+      "metadata": {
+        "id": "k-flN0tTIesb"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Experiment Result between use_guardrails\n",
+        "![image](https://github.com/uni-zhuan/uni_CDN/blob/master/picture/Yival/WechatIMG134.png?raw=true)\n",
+        "\n",
+        "We can clearly see that the model's performance dropped when using the Guardrails wrapper. This indicates that in this demo, applying Guardrails to the LLM does not provide strong quality guarantees."
+      ],
+      "metadata": {
+        "id": "XrZJzjx_TcR7"
+      }
+    }
+  ]
+}