Add openai bootcamp (#207)
Signed-off-by: shiyu22 <shiyu.chen@zilliz.com>
shiyu22 authored Apr 14, 2023
1 parent 3cc350e commit da732c5
Showing 7 changed files with 831 additions and 0 deletions.
File renamed without changes.
File renamed without changes.
253 changes: 253 additions & 0 deletions docs/bootcamp/openai/chat.ipynb
@@ -0,0 +1,253 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6b3ba1cc",
"metadata": {},
"source": [
"# Chat\n",
"\n",
"This example shows how to chat with GPT. It follows the original [OpenAI example](https://platform.openai.com/docs/guides/chat/introduction), with one difference: it also shows how to cache responses for exact and similar matches with **gptcache**. The only extra step is initializing the cache.\n",
"\n",
"Before running the example, make sure the `OPENAI_API_KEY` environment variable is set by executing `echo $OPENAI_API_KEY`. If it is not already set, set it with `export OPENAI_API_KEY=YOUR_API_KEY` on Unix/Linux/macOS systems or `set OPENAI_API_KEY=YOUR_API_KEY` on Windows systems.\n",
"\n",
"The code below demonstrates how gptcache is used and how much it speeds up responses. It has three parts: the original OpenAI usage, exact-match caching, and similar-search caching."
]
},
{
"cell_type": "markdown",
"id": "aa0ba70e",
"metadata": {},
"source": [
"## OpenAI API original usage"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "80e9dae2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Question: what‘s chatgpt\n",
"Time consuming: 4.83s\n",
"Answer: Sorry, as an AI language model, I don't have access to the internet, so I can't provide you with any information about a term ChatGPT. Could you please provide more context or clarify your question?\n",
"\n"
]
}
],
"source": [
"import time\n",
"import openai\n",
"\n",
"\n",
"def response_text(openai_resp):\n",
"    return openai_resp['choices'][0]['message']['content']\n",
"\n",
"\n",
"question = 'what‘s chatgpt'\n",
"\n",
"# OpenAI API original usage\n",
"start_time = time.time()\n",
"response = openai.ChatCompletion.create(\n",
"    model='gpt-3.5-turbo',\n",
"    messages=[\n",
"        {\n",
"            'role': 'user',\n",
"            'content': question\n",
"        }\n",
"    ],\n",
")\n",
"print(f'Question: {question}')\n",
"print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))\n",
"print(f'Answer: {response_text(response)}\\n')"
]
},
{
"cell_type": "markdown",
"id": "9d871550",
"metadata": {},
"source": [
"## OpenAI API + GPTCache, exact match cache\n",
"\n",
"Initialize the cache to run GPTCache and import `openai` from `gptcache.adapter`. This automatically sets the map data manager, which matches the exact cache. For more details, refer to [build your cache](https://gptcache.readthedocs.io/en/dev/usage.html#build-your-cache).\n",
"\n",
"If you ask ChatGPT the exact same question twice, the answer to the second one is served from the cache without requesting ChatGPT again."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "024484f3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Cache loading.....\n",
"Question: what's github\n",
"Time consuming: 8.62s\n",
"Answer: GitHub is a web-based platform used by developers to store, manage, and share their code with other developers around the world. It is essentially a code hosting and collaboration platform that lets users contribute to open-source projects and share their own code with others. GitHub provides features such as version control, issue tracking, and pull requests, which allow developers to easily collaborate with others on projects. It is widely used in the software development industry and has become a standard tool for developers worldwide.\n",
"\n",
"Question: what's github\n",
"Time consuming: 0.00s\n",
"Answer: GitHub is a web-based platform used by developers to store, manage, and share their code with other developers around the world. It is essentially a code hosting and collaboration platform that lets users contribute to open-source projects and share their own code with others. GitHub provides features such as version control, issue tracking, and pull requests, which allow developers to easily collaborate with others on projects. It is widely used in the software development industry and has become a standard tool for developers worldwide.\n",
"\n"
]
}
],
"source": [
"import time\n",
"\n",
"\n",
"def response_text(openai_resp):\n",
"    return openai_resp['choices'][0]['message']['content']\n",
"\n",
"print(\"Cache loading.....\")\n",
"\n",
"# To use GPTCache, that's all you need\n",
"# -------------------------------------------------\n",
"from gptcache import cache\n",
"from gptcache.adapter import openai\n",
"\n",
"cache.init()\n",
"cache.set_openai_key()\n",
"# -------------------------------------------------\n",
"\n",
"question = \"what's github\"\n",
"for _ in range(2):\n",
"    start_time = time.time()\n",
"    response = openai.ChatCompletion.create(\n",
"        model='gpt-3.5-turbo',\n",
"        messages=[\n",
"            {\n",
"                'role': 'user',\n",
"                'content': question\n",
"            }\n",
"        ],\n",
"    )\n",
"    print(f'Question: {question}')\n",
"    print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))\n",
"    print(f'Answer: {response_text(response)}\\n')"
]
},
{
"cell_type": "markdown",
"id": "6f2ff699",
"metadata": {},
"source": [
"## OpenAI API + GPTCache, similar search cache\n",
"\n",
"Initialize the cache with `embedding_func` to generate embeddings for the text, `data_manager` to manage the cache data, and `similarity_evaluation` to evaluate similarity. For more details, refer to [build your cache](https://gptcache.readthedocs.io/en/dev/usage.html#build-your-cache).\n",
"\n",
"Once ChatGPT has answered a question, the answers to subsequent similar questions are retrieved from the cache without requesting ChatGPT again."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "fd1ff06e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Cache loading.....\n",
"Question: what's github\n",
"Time consuming: 7.91s\n",
"Answer: GitHub is a web-based platform used for version control and collaboration by developers working on software projects. It provides a centralized repository for tracking changes made to code, and facilitates collaboration between developers through features such as pull requests, code reviews, and issue tracking. It is widely used for open-source software development and has become a popular platform for hosting code and collaborating on software projects.\n",
"\n",
"Question: can you explain what GitHub is\n",
"Time consuming: 0.22s\n",
"Answer: GitHub is a web-based platform used for version control and collaboration by developers working on software projects. It provides a centralized repository for tracking changes made to code, and facilitates collaboration between developers through features such as pull requests, code reviews, and issue tracking. It is widely used for open-source software development and has become a popular platform for hosting code and collaborating on software projects.\n",
"\n",
"Question: can you tell me more about GitHubwhat is the purpose of GitHub\n",
"Time consuming: 0.23s\n",
"Answer: GitHub is a web-based platform used for version control and collaboration by developers working on software projects. It provides a centralized repository for tracking changes made to code, and facilitates collaboration between developers through features such as pull requests, code reviews, and issue tracking. It is widely used for open-source software development and has become a popular platform for hosting code and collaborating on software projects.\n",
"\n"
]
}
],
"source": [
"import time\n",
"\n",
"\n",
"def response_text(openai_resp):\n",
"    return openai_resp['choices'][0]['message']['content']\n",
"\n",
"from gptcache import cache\n",
"from gptcache.adapter import openai\n",
"from gptcache.embedding import Onnx\n",
"from gptcache.manager import CacheBase, VectorBase, get_data_manager\n",
"from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation\n",
"\n",
"print(\"Cache loading.....\")\n",
"\n",
"onnx = Onnx()\n",
"data_manager = get_data_manager(CacheBase(\"sqlite\"), VectorBase(\"faiss\", dimension=onnx.dimension))\n",
"cache.init(\n",
"    embedding_func=onnx.to_embeddings,\n",
"    data_manager=data_manager,\n",
"    similarity_evaluation=SearchDistanceEvaluation(),\n",
")\n",
"cache.set_openai_key()\n",
"\n",
"questions = [\n",
"    \"what's github\",\n",
"    \"can you explain what GitHub is\",\n",
"    \"can you tell me more about GitHub\",\n",
"    \"what is the purpose of GitHub\"\n",
"]\n",
"\n",
"for question in questions:\n",
"    start_time = time.time()\n",
"    response = openai.ChatCompletion.create(\n",
"        model='gpt-3.5-turbo',\n",
"        messages=[\n",
"            {\n",
"                'role': 'user',\n",
"                'content': question\n",
"            }\n",
"        ],\n",
"    )\n",
"    print(f'Question: {question}')\n",
"    print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))\n",
"    print(f'Answer: {response_text(response)}\\n')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07d92eae",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
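
The notebook above contrasts exact-match caching with similar-search caching. As a rough illustration of that difference, here is a minimal, self-contained sketch that uses neither GPTCache nor the OpenAI API. Everything in it is a hypothetical stand-in: `fake_llm` replaces the ChatGPT call, `embed` is a toy bag-of-words embedding rather than the ONNX model, the linear scan in `SimilarCache` replaces the faiss vector search, and the `0.4` threshold is an arbitrary choice for this example. It is not how GPTCache is implemented.

```python
import math
import re
from collections import Counter


def fake_llm(question):
    """Hypothetical stand-in for a slow model call; counts its invocations."""
    fake_llm.calls += 1
    return f"answer to: {question}"


fake_llm.calls = 0


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ExactCache:
    """Serves a stored answer only when the question string is identical."""

    def __init__(self):
        self.store = {}

    def ask(self, question):
        if question not in self.store:
            self.store[question] = fake_llm(question)
        return self.store[question]


class SimilarCache:
    """Serves the answer of the closest stored question above a threshold."""

    def __init__(self, threshold=0.4):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def ask(self, question):
        query = embed(question)
        # Linear scan for the most similar cached question.
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        # Cache miss: call the model and remember the result.
        answer = fake_llm(question)
        self.entries.append((query, answer))
        return answer
```

With `ExactCache`, asking "what's github" twice calls `fake_llm` once, but a rephrased question is a miss. With `SimilarCache`, "can you explain what GitHub is" shares enough words with "what's github" to reuse the first answer, while an unrelated question still reaches the model.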