Merge pull request #258 from YueyingCoding/update-notebook
update-notebook
Showing 15 changed files with 8,415 additions and 1,127 deletions.
343 changes: 343 additions & 0 deletions
demo/tutorial_notebook/Auto_TikTok_Title_Generation_Bots_Demo.ipynb
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,215 @@ | ||
{ | ||
"nbformat": 4, | ||
"nbformat_minor": 0, | ||
"metadata": { | ||
"colab": { | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"name": "python3", | ||
"display_name": "Python 3" | ||
}, | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"<a href=\"https://colab.research.google.com/drive/1QgRQmFmC_L07Ler4vbq_vcCNm_OHmJL_#scrollTo=nhaq7McR-Dpv\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | ||
], | ||
"metadata": { | ||
"id": "nhaq7McR-Dpv" | ||
} | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# Guardrails Performance on Python LeetCode Problems\n", | ||
"\n", | ||
"This demo uses YiVal to evaluate the performance of Guardrails.\n", | ||
"\n", | ||
"After applying the Guardrails wrapper to 80 Python LeetCode questions, you'll notice a dip in performance, which may come with higher cost and longer processing time.\n", | ||
"\n", | ||
"[Guardrails](https://github.com/ShreyaR/guardrails) is a Python package that lets a user add structure, type and quality guarantees to the outputs of large language models (LLMs).\n" | ||
], | ||
"metadata": { | ||
"id": "AkgS24Ei2nmI" | ||
} | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# Install YiVal with git" | ||
], | ||
"metadata": { | ||
"id": "BMcU4KWm1pyG" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Clone the latest YiVal\n", | ||
"import os\n", | ||
"!python --version\n", | ||
"!rm -rf YiVal\n", | ||
"!git clone https://github.com/YiVal/YiVal.git\n", | ||
"\n", | ||
"# Install and configure Poetry\n", | ||
"import shutil\n", | ||
"!pip install poetry\n", | ||
"# Fall back to the default user install location if poetry is not on PATH yet\n", | ||
"POETRY_PATH = shutil.which(\"poetry\") or os.path.expanduser(\"~/.local/bin/poetry\")\n", | ||
"os.environ[\"PATH\"] += os.pathsep + os.path.dirname(POETRY_PATH)\n", | ||
"!poetry --version\n", | ||
"!poetry config virtualenvs.create true" | ||
], | ||
"metadata": { | ||
"id": "C_1FUzv-1pCv" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "qGYH-EYCMuqR" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"os.chdir(\"/content/YiVal\")\n", | ||
"!poetry install --no-ansi" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# Configure your OpenAI API key" | ||
], | ||
"metadata": { | ||
"id": "2VKHnus31zC4" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"os.environ['OPENAI_API_KEY'] = ''  # paste your OpenAI API key here" | ||
], | ||
"metadata": { | ||
"id": "Ym8BdfdG10sG" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# Guardrails evaluation yml\n", | ||
"## Dataset\n", | ||
"The experiment reads its data from a CSV file containing 80 LeetCode problems (leetcode_problems.csv). The data source type is `dataset`.\n", | ||
"\n", | ||
"## Custom Function\n", | ||
"The experiment uses the custom function `demo.guardrails.run_leetcode.run_leetcode`. This function solves LeetCode problems either through the Guardrails wrapper or by calling the ChatCompletion interface directly, and returns the generated Python code.\n", | ||
"\n", | ||
"## Variations\n", | ||
"This experiment compares two variations: **`use_guardrails`** and **`gpt`**. **`use_guardrails`** runs the model through the Guardrails wrapper, while **`gpt`** calls the model without it. Comparing the two shows how Guardrails affects the results and performance.\n", | ||
"\n", | ||
"## Evaluators\n", | ||
"The experiment uses an evaluator named **`python_validation_evaluator`** to assess the model's performance. The score for each sample depends on whether the generated Python code executes successfully, and the scores are averaged into a single result.\n", | ||
"\n", | ||
"```\n", | ||
"custom_function: demo.guardrails.run_leetcode.run_leetcode\n", | ||
"dataset:\n", | ||
"  file_path: demo/guardrails/data/leetcode_problems.csv\n", | ||
"  reader: csv_reader\n", | ||
"  source_type: dataset\n", | ||
"description: Generate the expected results for the Leetcode problems.\n", | ||
"evaluators:\n", | ||
"  - evaluator_type: individual\n", | ||
"    matching_technique: includes\n", | ||
"    metric_calculators:\n", | ||
"      - method: AVERAGE\n", | ||
"    name: python_validation_evaluator\n", | ||
"\n", | ||
"variations:\n", | ||
"  - name: use_guardrails\n", | ||
"    variations:\n", | ||
"      - instantiated_value: \"use_guardrails\"\n", | ||
"        value: \"use_guardrails\"\n", | ||
"        value_type: str\n", | ||
"        variation_id: null\n", | ||
"      - instantiated_value: \"gpt\"\n", | ||
"        value: \"gpt\"\n", | ||
"        value_type: str\n", | ||
"        variation_id: null\n", | ||
"```" | ||
], | ||
"metadata": { | ||
"id": "KnmaSTEc13Rg" | ||
} | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# Configure your Ngrok token\n", | ||
"Our current ngrok authtoken only supports one public session at a time. If it's being used by others or if you're using it to run multiple Colabs at once, you might bump into a Network error. To avoid this, we suggest getting your own ngrok authtoken for your Colab notebooks. It's easy and free to get your own authtoken from ngrok.\n", | ||
"\n", | ||
"Here's how to do it:\n", | ||
"- If you don't have an ngrok account yet, head over to https://dashboard.ngrok.com/login to sign up.\n", | ||
"- Once you're logged in, you can grab your authtoken at https://dashboard.ngrok.com/get-started/your-authtoken.\n", | ||
"\n", | ||
"Before starting a new demo, make sure that all other applications using ngrok in Colab have been terminated via `Connect -> Manage Sessions`. You can check and manage your sessions as shown in the following picture.\n", | ||
"\n", | ||
"![image](https://github.com/uni-zhuan/uni_CDN/blob/master/picture/Yival/iShot_2023-10-06_16.46.22.png?raw=true)" | ||
], | ||
"metadata": { | ||
"id": "W4aUMFd0SV2Q" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"os.environ['ngrok']='true'\n", | ||
"!poetry run ngrok config add-authtoken 2UK3G7MKgDqCqDnu36njaaE02bZ_7FqvcqBke5hbpgHjizoo7" | ||
], | ||
"metadata": { | ||
"id": "j1fDEqSRSVR0" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# YIVAL!" | ||
], | ||
"metadata": { | ||
"id": "2iVdkyWvSxJF" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"!poetry run yival run /content/YiVal/demo/configs/guardrails_leetcode.yml --async_eval=True" | ||
], | ||
"metadata": { | ||
"id": "k-flN0tTIesb" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"# Experiment Results: use_guardrails vs. gpt\n", | ||
"![image](https://github.com/uni-zhuan/uni_CDN/blob/master/picture/Yival/WechatIMG134.png?raw=true)\n", | ||
"\n", | ||
"We can clearly see that the model's performance dropped when using the Guardrails wrapper. This indicates that in this demo, applying Guardrails to the LLM does not provide strong quality guarantees." | ||
], | ||
"metadata": { | ||
"id": "XrZJzjx_TcR7" | ||
} | ||
} | ||
] | ||
} |
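
The notebook describes `python_validation_evaluator` as scoring each sample by whether the generated Python code executes successfully and then averaging the scores. A minimal sketch of that idea follows. It is not YiVal's implementation: the helper names `runs_cleanly` and `average_score` and the 5-second timeout are assumptions chosen for illustration, and running model output in a subprocess is isolation in name only, so untrusted code would still call for a real sandbox.

```python
import subprocess
import sys
from statistics import mean


def runs_cleanly(code: str, timeout: float = 5.0) -> bool:
    """Return True if `code` runs in a fresh interpreter and exits with status 0.

    Hypothetical helper: a syntax error, an uncaught exception, or a timeout
    all count as failure.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def average_score(snippets: list[str]) -> float:
    """Score each snippet 1.0 (runs) or 0.0 (fails) and return the mean."""
    if not snippets:
        return 0.0
    return mean(1.0 if runs_cleanly(s) else 0.0 for s in snippets)


if __name__ == "__main__":
    samples = [
        "print(sum(range(10)))",        # valid code
        "def broken(:\n    pass",       # syntax error
    ]
    print(average_score(samples))
```

Under this scheme, a variation whose outputs more often fail to execute gets a lower average, which is the kind of difference the `use_guardrails` and `gpt` comparison is meant to surface.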