diff --git a/04-panel-intro.ipynb b/04-panel-intro.ipynb new file mode 100644 index 0000000..8c2545c --- /dev/null +++ b/04-panel-intro.ipynb @@ -0,0 +1,397 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Intro to HoloViz" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "HoloViz is a suite of high-level Python tools that are designed to work together to make visualizing data a breeze, from conducting exploratory data analysis to deploying complex dashboards.\n", + "\n", + "The core HoloViz projects are as follows:\n", + "\n", + "- [Panel](https://panel.holoviz.org): Create interactive dashboards in Jupyter notebooks or standalone apps\n", + "- [hvPlot](https://hvplot.holoviz.org): Quickly and interactively explore data with a familiar API\n", + "- [HoloViews](https://holoviews.org): Interactive plotting experience\n", + "- [GeoViews](http://geoviews.org): Geographic extension of HoloViews\n", + "- [Datashader](https://datashader.org): Render big data images in a browser\n", + "- [Lumen](https://lumen.holoviz.org/): Construct no-code dashboards from simple YAML specifications\n", + "- [Colorcet](https://colorcet.holoviz.org/): Plot with perceptually based colormaps\n", + "- [Param](https://param.holoviz.org): Declaratively code in Python" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## What is Panel\n", + "\n", + "Today, the focus is on Panel.\n", + "\n", + "Panel packs many pre-built frontend components that are **usable with Python**.\n", + "\n", + "That means you can convert your static Python scripts into interactive ones--**no JavaScript necessary**!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import panel as pn\n", + "pn.extension()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Basic Panel Tutorial\n", + "\n", + "Let's start by building an interactive app that allows the user to print a custom message.\n", + "\n", + "Currently, it's hard-coded to `\"Hello World!\"`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(\"Hello World!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Widget\n", + "\n", + "We can give the user more control by introducing a `TextInput` widget." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "message_input = pn.widgets.TextInput(value=\"Hello World!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Interactivity\n", + "\n", + "Then, we can `pn.bind` the widget's `param.value` to the callback, `echo_message`, which simply echoes the input value on change.\n", + "\n", + "Note: it's important to pass `message_input.param.value` (the parameter object) rather than `message_input.value` (the current string)--without the `param`, there will be no updates!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def echo_message(message):\n", + " return f\"{message}\"\n", + "\n", + "message_ref = pn.bind(echo_message, message=message_input.param.value)" + ] + },
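+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To see what that note means in practice, here's a minimal sketch (an aside, not part of the app we're building): binding to `message_input.value` hands `pn.bind` a plain string once, while binding to `message_input.param.value` hands it a reference that re-evaluates whenever the widget changes." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Hypothetical comparison: type in the box and only the last row updates.\n", + "static_ref = pn.bind(echo_message, message=message_input.value) # plain string, evaluated once\n", + "reactive_ref = pn.bind(echo_message, message=message_input.param.value) # parameter, re-evaluated on change\n", + "\n", + "pn.Column(message_input, static_ref, reactive_ref)" + ] + },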
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Layout\n", + "\n", + "Next, create a simple layout to see the results.\n", + "\n", + "Try typing a unique message in the widget to see the output update!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pn.Column(message_input, message_ref)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Recap\n", + "\n", + "To recap, we:\n", + "\n", + "1. instantiated a widget (`TextInput`)\n", + "2. defined a function, `echo_message`\n", + "3. bound the function to the widget's *param* value\n", + "4. laid out the widget and the bound reference\n", + "\n", + "![recap](images/recap.png)\n", + "\n", + "Here are all the code cells collected into one!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import panel as pn\n", + "pn.extension()\n", + "\n", + "message_input = pn.widgets.TextInput(value=\"Hello World!\")\n", + "\n", + "def echo_message(message):\n", + " return f\"{message}\"\n", + "\n", + "message_ref = pn.bind(echo_message, message=message_input.param.value)\n", + "\n", + "pn.Column(message_input, message_ref)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Challenge\n", + "\n", + "Doing this repeatedly is key to creating more complex apps with Panel, so let's do a quick exercise.\n", + "\n", + "Your goal is to create a widget that toggles the message to upper case when activated, by filling out the ellipses (`...`)!\n", + "\n", + "Hint: check out the [Component gallery](https://panel.holoviz.org/reference/index.html) to see what widgets are available to accomplish this goal (one of them starts with a `T`, but there are multiple solutions!)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import panel as pn\n", + "pn.extension()\n", + "\n", + "message_input = pn.widgets.TextInput(value=\"Hello World!\")\n", + "toggle_upper = ... # Fill this out\n", + "\n", + "def echo_message(message, toggle_upper):\n", + " ... # Fill this out\n", + " return f\"{message}\"\n", + "\n", + "message_ref = pn.bind(echo_message, message=message_input.param.value, toggle_upper=...) # Fill this out\n", + "\n", + "pn.Column(message_input, message_ref)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Congrats on building an interactive Panel app! 🎉" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Introducing Panel ChatInterface" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's introduce `pn.chat.ChatInterface`, a component that packages all the steps you just learned into convenient features for developing a chat UI with LLMs!\n", + "\n", + "### Widget\n", + "\n", + "Try typing a message and pressing enter to send!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "chat = pn.chat.ChatInterface()\n", + "chat" + ] + },
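+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As an aside, you can also post messages from code, which is handy for testing. A minimal sketch (here, `respond=False` simply adds the message to the feed without triggering a reply):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Programmatically add a message to the feed; respond=False skips the\n", + "# callback (no callback is attached yet anyway).\n", + "chat.send(\"Hello from code!\", user=\"User\", respond=False)" + ] + },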
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You might have noticed that it echoes the message you entered, but it doesn't reply... not fun (yet).\n", + "\n", + "### Interactivity\n", + "\n", + "To make it reply, all we have to do is set a `callback`. It works like `pn.bind`, but with a caveat: the callback must accept these three arguments: `contents`, `user`, and `instance`.\n", + "\n", + "Now when you try sending a message in the chat interface, it will be echoed back in italics!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def echo_message(contents: str, user: str, instance: pn.chat.ChatInterface):\n", + " return f\"*{contents}*\"\n", + "\n", + "chat.callback = echo_message" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Streaming\n", + "\n", + "You might have seen services like OpenAI and Mistral stream tokens as they arrive.\n", + "\n", + "We can simulate streaming tokens by looping through the contents of the user's input, concatenating the characters to the final message, and `yield`ing it in italics.\n", + "\n", + "Since there's no serious computation, it'll run too fast for us to perceive streaming--thus the `time.sleep`.\n", + "\n", + "Here's the latest code collected into one cell (this time with `callback` set at instantiation)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import time\n", + "import panel as pn\n", + "pn.extension()\n", + "\n", + "def stream_echo_message(contents: str, user: str, instance: pn.chat.ChatInterface):\n", + " message = \"\"\n", + " for char in contents:\n", + " time.sleep(0.1) # to simulate a serious computation\n", + " message += char\n", + " yield f\"*{message}*\"\n", + "\n", + "chat = pn.chat.ChatInterface(callback=stream_echo_message)\n", + "chat" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### More Interactivity\n", + "\n", + "`pn.chat.ChatInterface` can be used with other widgets too!\n", + "\n", + "Here, we include a `pn.widgets.FloatSlider` to control how long to wait between each character streamed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import time\n", + "import panel as pn\n", + "pn.extension()\n", + "\n", + "def stream_echo_message(contents: str, user: str, instance: pn.chat.ChatInterface):\n", + " message = \"\"\n", + " for char in contents:\n", + " time.sleep(slider.value) # wait time is controlled by the slider\n", + " message += char\n", + " yield f\"*{message}*\"\n", + "\n", + "slider = pn.widgets.FloatSlider(start=0.01, value=0.5, name=\"Sleep (s)\", align=\"center\")\n", + "chat = pn.chat.ChatInterface(callback=stream_echo_message, min_height=350)\n", + "\n", + "pn.Column(slider, chat)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Other Inputs\n", + "\n", + "`pn.chat.ChatInterface` also supports multi-modal inputs, like images, videos, PDFs, and more!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from io import BytesIO\n", + "import panel as pn\n", + "\n", + "pn.extension()\n", + "\n", + "def display_info(contents: BytesIO, user: str, instance: pn.chat.ChatInterface):\n", + " size = len(contents.getvalue())\n", + " return f\"Size of input: {size / (1024 * 1024):.2f} MB\"\n", + "\n", + "file_input = pn.widgets.FileInput(accept=\".jpeg,.png,.gif,.mp4,.pdf\")\n", + "chat = pn.chat.ChatInterface(widgets=[file_input], callback=display_info)\n", + "chat" + ] + },
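+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "One practical note: with multiple input widgets, the callback's `contents` may be either typed text (a `str`) or a file payload (a buffer, as in the cell above), so it's worth branching on type. Here's a minimal, hypothetical sketch combining a text area and a file input (`route_input` is a made-up name):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import panel as pn\n", + "pn.extension()\n", + "\n", + "def route_input(contents, user: str, instance: pn.chat.ChatInterface):\n", + " # Typed messages arrive as str; uploaded files arrive as a buffer.\n", + " if isinstance(contents, str):\n", + " return f\"Got text: {contents}\"\n", + " size = len(contents.getvalue()) if hasattr(contents, \"getvalue\") else len(contents)\n", + " return f\"Got a file of {size} bytes.\"\n", + "\n", + "chat = pn.chat.ChatInterface(\n", + " widgets=[pn.widgets.TextAreaInput(), pn.widgets.FileInput()],\n", + " callback=route_input,\n", + ")\n", + "chat" + ] + },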
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "That's it for a crash course on Panel! These techniques, used repeatedly, will allow you to build increasingly complex web apps with just Python.\n", + "\n", + "To learn more about `pn.chat.ChatInterface`, click [here](https://panel.holoviz.org/reference/chat/ChatInterface.html). It inherits from `pn.chat.ChatFeed`, so check that out [here](https://panel.holoviz.org/reference/chat/ChatFeed.html) too!\n", + "\n", + "For more tutorials delving into Panel in general, click [here](https://panel.holoviz.org/tutorials/index.html) or check out the app gallery [here](https://panel.holoviz.org/gallery/index.html).\n", + "\n", + "There is also a HoloViz Discourse [here](https://discourse.holoviz.org/) if you want to ask questions." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ahuang@anaconda.com-ahuang@anaconda.com-ragna-panel-presentations", + "language": "python", + "name": "conda-env-ahuang_anaconda.com-ahuang_anaconda.com-ragna-panel-presentations-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/05-panel-local-llm.ipynb b/05-panel-local-llm.ipynb new file mode 100644 index 0000000..c4266ca --- /dev/null +++ b/05-panel-local-llm.ipynb @@ -0,0 +1,315 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Local LLMs with Panel" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the [previous notebook](04-panel-intro.ipynb), `pn.chat.ChatInterface` was introduced with a callback that simply echoed the sent message.\n", + "\n", + "In this section, we will make it much more interesting by connecting a local LLM, specifically the Llama-3 model from earlier.\n", + "\n", + "## ExLlamaV2\n", + "\n", + "### Initialize\n", + "\n", + "Let's first initialize the model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from local_llm import Llama38BInstruct\n", + "from ragna import Rag, source_storages\n", + "import panel as pn\n", + "pn.extension()\n", + "\n", + "documents = [\n", + " \"files/psf-report-2021.pdf\",\n", + " \"files/psf-report-2022.pdf\",\n", + "]\n", + "\n", + "chat = Rag().chat(\n", + " documents=documents,\n", + " source_storage=source_storages.Chroma,\n", + " assistant=Llama38BInstruct,\n", + ")\n", + "\n", + "await chat.prepare();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Migrate\n", + "\n", + "We can first do a test run to see if it works with the example from before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "message = await chat.answer(\"Who is the Python Developer in Residence?\", stream=True)\n", + "\n", + "async for chunk in message:\n", + " print(chunk, end=\"\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's migrate this functionality into `pn.chat.ChatInterface` with a callback.\n", + "\n", + "To do this, we copy-paste the prior cell's code into a function (shown after the sketch below), and then:\n", + "\n", + "1. prefix the `def` with `async` to make it async\n", + "2. replace the hard-coded string with `contents`\n", + "3. concatenate the chunks into a `response` string\n", + "4. yield the `response`" + ] + },
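+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If async generators are new to you, here's a tiny self-contained sketch of the same pattern--no LLM involved, just a simulated token stream (`fake_stream` and the 0.1 s delay are made up for illustration):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import asyncio\n", + "\n", + "async def fake_stream(text):\n", + " # Stand-in for chat.answer(..., stream=True): yields chunks with a delay.\n", + " for word in text.split():\n", + " await asyncio.sleep(0.1)\n", + " yield word + \" \"\n", + "\n", + "response = \"\"\n", + "async for chunk in fake_stream(\"a simulated streaming reply\"): # top-level async works in Jupyter\n", + " response += chunk\n", + " print(response)" + ] + },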
+ { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "async def reply(contents, user, instance):\n", + " message = await chat.answer(contents, stream=True)\n", + "\n", + " response = \"\"\n", + " async for chunk in message:\n", + " response += chunk\n", + " yield response\n", + "\n", + "chat_interface = pn.chat.ChatInterface(callback=reply)\n", + "chat_interface" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now try entering \"Who is the Python Developer in Residence?\" into the chat. It should give you a similar response as before!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## LlamaCpp" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, we can use `llama-cpp-python` to run quantized models too!\n", + "\n", + "`llama-cpp` can run on both CPU and GPU, and has an API that mimics OpenAI's.\n", + "\n", + "Personally, I use it because I don't have any spare GPUs lying around and it runs extremely well on my local Mac M2 Pro! It also handles chat template formats internally, so it's just a matter of specifying the proper `chat_format` key.\n", + "\n", + "Here, we:\n", + "1. download the quantized model in GGUF format (if it doesn't exist already)\n", + "2. instantiate the model, first checking the cache\n", + "3. serialize all messages into `transformers` format (a new step)\n", + "4. call the OpenAI-style chat completion API on the messages\n", + "5. stream the chunks" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "\n", + "import llama_cpp\n", + "import panel as pn\n", + "from huggingface_hub import hf_hub_download\n", + "pn.extension()\n", + "\n", + "model_path = hf_hub_download(\n", + " \"TheBloke/Mistral-7B-Instruct-v0.2-GGUF\",\n", + " \"mistral-7b-instruct-v0.2.Q5_K_M.gguf\",\n", + " local_dir=str(Path.home() / \"shared/analyst/models\")\n", + ") # 1.\n", + "\n", + "# 2.\n", + "if model_path in pn.state.cache:\n", + " llama = pn.state.cache[model_path]\n", + "else:\n", + " llama = llama_cpp.Llama(\n", + " model_path=model_path,\n", + " n_gpu_layers=-1,\n", + " chat_format=\"mistral-instruct\",\n", + " n_ctx=2048,\n", + " logits_all=True,\n", + " verbose=False,\n", + " )\n", + " pn.state.cache[model_path] = llama\n", + "\n", + "def reply(contents: str, user: str, instance: pn.chat.ChatInterface):\n", + " messages = instance.serialize() # 3.\n", + " message = llama.create_chat_completion_openai_v1(messages=messages, stream=True) # 4.\n", + "\n", + " response = \"\"\n", + " for chunk in message:\n", + " part = chunk.choices[0].delta.content or \"\"\n", + " response += part\n", + " yield response # 5.\n", + "\n", + "chat_interface = pn.chat.ChatInterface(callback=reply)\n", + "chat_interface" + ] + },
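+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To make step 3 concrete: `instance.serialize()` returns the chat history as a list of role/content dicts, ready to pass to the chat completion API. After one exchange it would look roughly like this (illustrative values, not real output):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Illustrative shape of instance.serialize() after one exchange;\n", + "# the actual contents depend on your chat history.\n", + "[\n", + " {\"role\": \"user\", \"content\": \"Hi there!\"},\n", + " {\"role\": \"assistant\", \"content\": \"Hello! How can I help you today?\"},\n", + "]" + ] + },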
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can even give the model a personality by setting a system message!\n", + "\n", + "Update the callback with a system message.\n", + "\n", + "Note: Mistral Instruct does NOT support the `system` role, so we use `user` instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "system_message = \"You are an excessively passionate Pythonista.\"\n", + "\n", + "def reply(contents: str, user: str, instance: pn.chat.ChatInterface):\n", + " messages = [\n", + " {\"role\": \"user\", \"content\": system_message} # updated here\n", + " ] + instance.serialize()\n", + " message = llama.create_chat_completion_openai_v1(messages=messages, stream=True)\n", + "\n", + " response = \"\"\n", + " for chunk in message:\n", + " part = chunk.choices[0].delta.content or \"\"\n", + " response += part\n", + " yield response\n", + "\n", + "chat_interface.callback = reply\n", + "chat_interface" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Challenge\n", + "\n", + "Your turn! Try combining everything you've learned to customize the chatbot's personality on the fly!\n", + "\n", + "Again, replace the ellipses with the appropriate code snippets!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "\n", + "import llama_cpp\n", + "import panel as pn\n", + "from huggingface_hub import hf_hub_download\n", + "\n", + "pn.extension()\n", + "\n", + "model_path = hf_hub_download(\n", + " \"TheBloke/Mistral-7B-Instruct-v0.2-GGUF\",\n", + " \"mistral-7b-instruct-v0.2.Q5_K_M.gguf\",\n", + " local_dir=str(Path.home() / \"shared/analyst/models\")\n", + ")\n", + "\n", + "if model_path in pn.state.cache:\n", + " llama = pn.state.cache[model_path]\n", + "else:\n", + " llama = llama_cpp.Llama(\n", + " model_path=model_path,\n", + " n_gpu_layers=-1,\n", + " chat_format=\"mistral-instruct\",\n", + " n_ctx=2048,\n", + " logits_all=True,\n", + " verbose=False,\n", + " )\n", + " pn.state.cache[model_path] = llama\n", + "\n", + "def reply(contents: str, user: str, instance: pn.chat.ChatInterface):\n", + " messages = [\n", + " {\"role\": \"user\", \"content\": ...} # Fill this out\n", + " ] + instance.serialize()\n", + " message = llama.create_chat_completion_openai_v1(\n", + " messages=messages, stream=True\n", + " )\n", + "\n", + " response = \"\"\n", + " for chunk in message:\n", + " part = chunk.choices[0].delta.content or \"\"\n", + " response += part\n", + " yield response\n", + "\n", + "\n", + "system_input = ... # Fill this out\n", + "chat_interface = pn.chat.ChatInterface(callback=reply, min_height=350)\n", + "layout = pn.Column(\n", + " system_input,\n", + " chat_interface,\n", + ")\n", + "layout" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "That's all for now. Click [here](https://holoviz-topics.github.io/panel-chat-examples/) to see more on how you can integrate `pn.chat.ChatInterface` with other services!\n", + "\n", + "Again, there is also a HoloViz Discourse [here](https://discourse.holoviz.org/) if you want to ask questions."
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ahuang@anaconda.com-ahuang@anaconda.com-ragna-panel-presentations", + "language": "python", + "name": "conda-env-ahuang_anaconda.com-ahuang_anaconda.com-ragna-panel-presentations-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/environment.yml b/environment.yml index a92d01a..cb6c716 100644 --- a/environment.yml +++ b/environment.yml @@ -1,12 +1,15 @@ name: ragna-presentations channels: + - nvidia - conda-forge dependencies: - python=3.11 + - libcublas - gxx - pip - pip: - python-dotenv + - panel - ragna @ git+https://github.com/Quansight/ragna@pycon - chromadb>=0.4.13 - httpx_sse @@ -18,9 +21,13 @@ dependencies: - python-pptx - tiktoken - torch ==2.2.* + - openai + - llama-cpp-python @ + https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.72-cu121/llama_cpp_python-0.2.72-cp311-cp311-linux_x86_64.whl - exllamav2 @ https://github.com/turboderp/exllamav2/releases/download/v0.0.18/exllamav2-0.0.18+cu121-cp311-cp311-linux_x86_64.whl - jupyterlab_nvdashboard - ipykernel - iprogress - ipywidgets +variables: {} diff --git a/images/recap.png b/images/recap.png new file mode 100644 index 0000000..01eb77a Binary files /dev/null and b/images/recap.png differ