diff --git a/.github/ignore-notebooks.txt b/.github/ignore-notebooks.txt
index ba0615c..0fe17bd 100644
--- a/.github/ignore-notebooks.txt
+++ b/.github/ignore-notebooks.txt
@@ -6,3 +6,4 @@
01_routing_optimization
02_semantic_cache_optimization
spring_ai_redis_rag.ipynb
+00_litellm_proxy_redis.ipynb
\ No newline at end of file
diff --git a/.gitignore b/.gitignore
index 47ae183..1a8186b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -217,6 +217,7 @@ pyrightconfig.json
pyvenv.cfg
pip-selfcheck.json
+# other
libs/redis/docs/.Trash*
.python-version
.idea/*
@@ -224,3 +225,6 @@ java-recipes/.*
python-recipes/vector-search/beir_datasets
python-recipes/vector-search/datasets
+
+litellm_proxy.log
+litellm_redis.yml
diff --git a/README.md b/README.md
index 16fb4c6..9113ca3 100644
--- a/README.md
+++ b/README.md
@@ -38,6 +38,13 @@ No faster way to get started than by diving in and playing around with a demo.
Need quickstarts to begin your Redis AI journey? **Start here.**
+### Non-Python Redis AI Recipes
+
+#### ☕️ Java
+
+A set of Java recipes can be found under [/java-recipes](/java-recipes/README.md).
+
+
### Getting started with Redis & Vector Search
| Recipe | Description |
@@ -48,11 +55,6 @@ Need quickstarts to begin your Redis AI journey? **Start here.**
| [/vector-search/02_hybrid_search.ipynb](/python-recipes/vector-search/02_hybrid_search.ipynb) | Hybrid search techniques with Redis (BM25 + Vector) |
| [/vector-search/03_dtype_support.ipynb](/python-recipes/vector-search/03_dtype_support.ipynb) | Shows how to convert a float32 index to float16 or integer datatypes |
-### Non-Python Redis AI Recipes
-
-#### ☕️ Java
-
-A set of Java recipes can be found under [/java-recipes](/java-recipes/README.md).
### Retrieval Augmented Generation (RAG)
@@ -77,7 +79,7 @@ LLMs are stateless. To maintain context within a conversation chat sessions must
| [/llm-session-manager/00_session_manager.ipynb](python-recipes/llm-session-manager/00_llm_session_manager.ipynb) | LLM session manager with semantic similarity |
| [/llm-session-manager/01_multiple_sessions.ipynb](python-recipes/llm-session-manager/01_multiple_sessions.ipynb) | Handle multiple simultaneous chats with one instance |
-### Semantic Cache
+### Semantic Caching
An estimated 31% of LLM queries are potentially redundant ([source](https://arxiv.org/pdf/2403.02694)). Redis enables semantic caching to help cut down on LLM costs quickly.
| Recipe | Description |
@@ -94,6 +96,15 @@ Routing is a simple and effective way of preventing misuses with your AI applica
| [/semantic-router/00_semantic_routing.ipynb](python-recipes/semantic-router/00_semantic_routing.ipynb) | Simple examples of how to build an allow/block list router in addition to a multi-topic router |
| [/semantic-router/01_routing_optimization.ipynb](python-recipes/semantic-router/01_routing_optimization.ipynb) | Use RouterThresholdOptimizer from redisvl to setup best router config |
+
+### AI Gateways
+AI gateways manage LLM traffic through a centralized, managed layer that can implement routing, rate limiting, caching, and more.
+
+| Recipe | Description |
+| --- | --- |
+| [/gateway/00_litellm_proxy_redis.ipynb](python-recipes/gateway/00_litellm_proxy_redis.ipynb) | Getting started with LiteLLM proxy and Redis. |
+
+
### Agents
| Recipe | Description |
diff --git a/python-recipes/gateway/00_litellm_proxy_redis.ipynb b/python-recipes/gateway/00_litellm_proxy_redis.ipynb
new file mode 100644
index 0000000..5116a6b
--- /dev/null
+++ b/python-recipes/gateway/00_litellm_proxy_redis.ipynb
@@ -0,0 +1,1347 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "47c3fefa",
+ "metadata": {
+ "id": "47c3fefa"
+ },
+ "source": [
+ "\n",
+ "
\n",
+ "

\n",
+ "

\n",
+ "
\n",
+ "\n",
+ "# LiteLLM Proxy with Redis\n",
+ "\n",
+ "This notebook demonstrates how to use [LiteLLM](https://github.com/BerriAI/litellm) with Redis to build a powerful and efficient LLM proxy server backed by caching & rate limiting capabilities. LiteLLM provides a unified interface for accessing multiple LLM providers while Redis enhances performance of the application in several different ways.\n",
+ "\n",
+ "*This recipe will help you understand*:\n",
+ "\n",
+ "* **How** to set up LiteLLM as a proxy for different LLM endpoints\n",
+ "* **Why** and **how** to implement exact and semantic caching for LLM calls\n",
+ "\n",
+ "**Open in Colab**\n",
+ "\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "06c7b959",
+ "metadata": {
+ "id": "06c7b959"
+ },
+ "source": [
+ "\n",
+ "## 1 · Environment Setup \n",
+ "Before we begin, we need to make sure our environment is properly set up with all the necessary tools and resources.\n",
+ "\n",
+ "**Requirements**:\n",
+ "* Python ≥ 3.9 with the below packages\n",
+ "* OpenAI API key (set as `OPENAI_API_KEY` environment variable)\n",
+ "\n",
+ "\n",
+ "### Install Python Dependencies\n",
+ "\n",
+ "First, let's install the required packages."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "47246c48",
+ "metadata": {
+ "id": "47246c48"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install \"litellm[proxy]==1.68.0\" \"redisvl==0.5.2\" requests openai"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "redis-setup",
+ "metadata": {
+ "id": "redis-setup"
+ },
+ "source": [
+ "### Install Redis Stack\n",
+ "\n",
+ "\n",
+ "#### For Colab\n",
+ "Use the shell script below to download, extract, and install [Redis Stack](https://redis.io/docs/getting-started/install-stack/) directly from the Redis package archive."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0db80601",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "0db80601",
+ "outputId": "e01d1a40-f412-4808-d5f0-4d34fb2204d7"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb jammy main\n",
+ "Starting redis-stack-server, database path /var/lib/redis-stack\n"
+ ]
+ }
+ ],
+ "source": [
+ "# NBVAL_SKIP\n",
+ "%%sh\n",
+ "curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg\n",
+ "echo \"deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main\" | sudo tee /etc/apt/sources.list.d/redis.list\n",
+ "sudo apt-get update > /dev/null 2>&1\n",
+ "sudo apt-get install redis-stack-server > /dev/null 2>&1\n",
+ "redis-stack-server --daemonize yes"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b750e779",
+ "metadata": {
+ "id": "b750e779"
+ },
+ "source": [
+ "#### For Alternative Environments\n",
+ "There are many ways to get the necessary redis-stack instance running\n",
+ "1. On cloud, deploy a [FREE instance of Redis in the cloud](https://redis.io/try-free/). Or, if you have your\n",
+ "own version of Redis Enterprise running, that works too!\n",
+ "2. Per OS, [see the docs](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/)\n",
+ "3. With docker: `docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "177e9fe3",
+ "metadata": {
+ "id": "177e9fe3"
+ },
+ "source": [
+ "### Define the Redis Connection URL\n",
+ "\n",
+ "By default this notebook connects to the local instance of Redis Stack. **If you have your own Redis Enterprise instance** - replace REDIS_PASSWORD, REDIS_HOST and REDIS_PORT values with your own."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "be77a1d3",
+ "metadata": {
+ "id": "be77a1d3"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "# Replace values below with your own if using Redis Cloud instance\n",
+ "REDIS_HOST = os.getenv(\"REDIS_HOST\", \"localhost\") # ex: \"redis-18374.c253.us-central1-1.gce.cloud.redislabs.com\"\n",
+ "REDIS_PORT = os.getenv(\"REDIS_PORT\", \"6379\") # ex: 18374\n",
+ "REDIS_PASSWORD = os.getenv(\"REDIS_PASSWORD\", \"\") # ex: \"1TNxTEdYRDgIDKM2gDfasupCADXXXX\"\n",
+ "\n",
+ "# If SSL is enabled on the endpoint, use rediss:// as the URL prefix\n",
+ "REDIS_URL = f\"redis://:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}\"\n",
+ "os.environ[\"REDIS_URL\"] = REDIS_URL\n",
+ "os.environ[\"REDIS_HOST\"] = REDIS_HOST\n",
+ "os.environ[\"REDIS_PORT\"] = REDIS_PORT\n",
+ "os.environ[\"REDIS_PASSWORD\"] = REDIS_PASSWORD"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "redis-connection",
+ "metadata": {
+ "id": "redis-connection"
+ },
+ "source": [
+ "### Verify Redis Connection\n",
+ "\n",
+ "Let's test our Redis connection to make sure it's working properly:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 132,
+ "id": "f3ddcabf",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "f3ddcabf",
+ "outputId": "162846c8-4add-4de7-9ed6-69e8656ec102"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 132,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "from redis import Redis\n",
+ "\n",
+ "client = Redis.from_url(REDIS_URL)\n",
+ "client.ping()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 133,
+ "id": "AZmD8eR1lphs",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "AZmD8eR1lphs",
+ "outputId": "0aaf4533-d239-4ad9-8853-e7192abf78d6"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 133,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "client.flushall()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ce052678",
+ "metadata": {
+ "id": "ce052678"
+ },
+ "source": [
+ "### Set OPENAI API Key"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e21ac07e",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "e21ac07e",
+ "outputId": "3a6d5465-35e0-49af-ce1a-54df86898cee"
+ },
+ "outputs": [],
+ "source": [
+ "import getpass\n",
+ "import os\n",
+ "\n",
+ "os.environ[\"LITELLM_LOG\"] = \"DEBUG\"\n",
+ "\n",
+ "def _set_env(key: str):\n",
+ " if key not in os.environ:\n",
+ " os.environ[key] = getpass.getpass(f\"{key}:\")\n",
+ "\n",
+ "_set_env(\"OPENAI_API_KEY\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5X9nFyFkPdkV",
+ "metadata": {
+ "id": "5X9nFyFkPdkV"
+ },
+ "source": [
+ "## 2 · Running the LiteLLM Proxy\n",
+ "First, we will define a LiteLLM config that contains:\n",
+ "\n",
+ "- a few supported model options\n",
+ "- a semantic caching configuration using Redis"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 234,
+ "id": "pdeAixSUPxT7",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "pdeAixSUPxT7",
+ "outputId": "9cbff8c0-7fc8-431a-e93c-ba05698d217e"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Overwriting litellm_redis.yml\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%writefile litellm_redis.yml\n",
+ "model_list:\n",
+ "- litellm_params:\n",
+ " api_key: os.environ/OPENAI_API_KEY\n",
+ " model: gpt-3.5-turbo\n",
+ " rpm: 30\n",
+ " model_name: gpt-3.5-turbo\n",
+ "- litellm_params:\n",
+ " api_key: os.environ/OPENAI_API_KEY\n",
+ " model: gpt-4o-mini\n",
+ " rpm: 30\n",
+ " model_name: gpt-4o-mini\n",
+ "- litellm_params:\n",
+ " api_key: os.environ/OPENAI_API_KEY\n",
+ " model: text-embedding-3-small\n",
+ " model_name: text-embedding-3-small\n",
+ "\n",
+ "litellm_settings:\n",
+ " cache: True\n",
+ " cache_params:\n",
+ " type: redis\n",
+ " host: os.environ/REDIS_HOST\n",
+ " port: os.environ/REDIS_PORT\n",
+ " password: os.environ/REDIS_PASSWORD\n",
+ " default_in_redis_ttl: 60"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4RqOqBoAHwVD",
+ "metadata": {
+ "id": "4RqOqBoAHwVD"
+ },
+ "source": [
+ "Now for some helper code that will start/stop **LiteLLM** proxy as a background task here on the host machine."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 235,
+ "id": "8mml7LhvPxWU",
+ "metadata": {
+ "id": "8mml7LhvPxWU"
+ },
+ "outputs": [],
+ "source": [
+ "import subprocess, atexit, os, signal, socket, time, pathlib, textwrap, sys\n",
+ "\n",
+ "\n",
+ "_proxy_handle: subprocess.Popen | None = None\n",
+ "\n",
+ "\n",
+ "def _is_port_open(port: int) -> bool:\n",
+ " with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:\n",
+ " s.settimeout(0.25)\n",
+ " return s.connect_ex((\"127.0.0.1\", port)) == 0\n",
+ "\n",
+ "def start_proxy(\n",
+ " config_path: str = \"litellm_redis.yml\",\n",
+ " port: int = 4000,\n",
+ " log_path: str = \"litellm_proxy.log\",\n",
+ " restart: bool = True,\n",
+ " timeout: float = 10.0, # seconds we’re willing to wait\n",
+ ") -> subprocess.Popen:\n",
+ "\n",
+ " global _proxy_handle\n",
+ "\n",
+ " # ── 1. stop running proxy we launched earlier ──\n",
+ " if _proxy_handle and _proxy_handle.poll() is None:\n",
+ " if restart:\n",
+ " _proxy_handle.terminate()\n",
+ " _proxy_handle.wait(timeout=3)\n",
+ " time.sleep(1) # give the OS a breath\n",
+ " else:\n",
+ " print(f\"LiteLLM already running (PID {_proxy_handle.pid}) — reusing.\")\n",
+ " return _proxy_handle\n",
+ "\n",
+ " # ── 2. ensure the port is free ──\n",
+ " if _is_port_open(port):\n",
+ " print(f\"Port {port} busy; trying to free it …\")\n",
+ " pids = os.popen(f\"lsof -ti tcp:{port}\").read().strip().splitlines()\n",
+ " for pid in pids:\n",
+ " try:\n",
+ " os.kill(int(pid), signal.SIGTERM)\n",
+ " except Exception:\n",
+ " pass\n",
+ " time.sleep(1)\n",
+ "\n",
+ " # ── 3. launch proxy ──\n",
+ " log_file = open(log_path, \"w\")\n",
+ " cmd = [\"litellm\", \"--config\", config_path, \"--port\", str(port), \"--detailed_debug\"]\n",
+ " _proxy_handle = subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)\n",
+ "\n",
+ " atexit.register(lambda: _proxy_handle and _proxy_handle.terminate())\n",
+ "\n",
+ " # ── 4. readiness loop with timeout & crash detection ──\n",
+ " deadline = time.time() + timeout\n",
+ " while time.time() < deadline:\n",
+ " if _is_port_open(port):\n",
+ " break\n",
+ " if _proxy_handle.poll() is not None: # died early\n",
+ " last_lines = pathlib.Path(log_path).read_text().splitlines()[-20:]\n",
+ " raise RuntimeError(\n",
+ " \"LiteLLM exited before opening the port:\\n\" +\n",
+ " textwrap.indent(\"\\n\".join(last_lines), \" \")\n",
+ " )\n",
+ " time.sleep(0.25)\n",
+ " else:\n",
+ " _proxy_handle.terminate()\n",
+ " raise RuntimeError(f\"LiteLLM proxy did not open port {port} within {timeout}s.\")\n",
+ "\n",
+ " print(f\"✅ LiteLLM proxy on http://localhost:{port} (PID {_proxy_handle.pid})\")\n",
+ " print(f\" Logs → {pathlib.Path(log_path).resolve()}\")\n",
+ " return _proxy_handle\n",
+ "\n",
+ "\n",
+ "def stop_proxy() -> None:\n",
+ " global _proxy_handle\n",
+ " if _proxy_handle and _proxy_handle.poll() is None:\n",
+ " _proxy_handle.terminate()\n",
+ " _proxy_handle.wait(timeout=3)\n",
+ " print(\"LiteLLM proxy stopped.\")\n",
+ " _proxy_handle = None"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8WSEon9JIRn8",
+ "metadata": {
+ "id": "8WSEon9JIRn8"
+ },
+ "source": [
+ "Start up the LiteLLM proxy for the first time."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 236,
+ "id": "jrw2Gu6uPxYr",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "jrw2Gu6uPxYr",
+ "outputId": "ae65f321-1d4e-49fe-9282-d418f324a5cc"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "✅ LiteLLM proxy on http://localhost:4000 (PID 63464)\n",
+ " Logs → /content/litellm_proxy.log\n"
+ ]
+ }
+ ],
+ "source": [
+ "_proxy_handle = start_proxy()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "zzOSmL0_IzwF",
+ "metadata": {
+ "id": "zzOSmL0_IzwF"
+ },
+ "source": [
+ "Now we will add a simple helper method to test out models."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 237,
+ "id": "9rbN7PiMVAmA",
+ "metadata": {
+ "id": "9rbN7PiMVAmA"
+ },
+ "outputs": [],
+ "source": [
+ "import requests\n",
+ "\n",
+ "\n",
+ "def call_model(text: str, model: str = \"gpt-4o-mini\"):\n",
+ " try:\n",
+ " t0 = time.time()\n",
+ " payload = {\n",
+ " \"model\": model,\n",
+ " \"messages\": [{\"role\": \"user\", \"content\": text}]\n",
+ " }\n",
+ " r = requests.post(\"http://localhost:4000/chat/completions\", json=payload, timeout=30)\n",
+ " r.raise_for_status()\n",
+ " print(r.json()[\"choices\"][0][\"message\"][\"content\"])\n",
+ " print(f\"{r.json()['id']} -- {r.json()['model']} -- latency: {time.time() - t0:.2f}s \\n\")\n",
+ " return r\n",
+ " except Exception as e:\n",
+ " print(str(e))\n",
+ " if \"error\" in r.json():\n",
+ " print(r.json()[\"error\"][\"message\"])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 238,
+ "id": "KEdfst47VdjN",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "KEdfst47VdjN",
+ "outputId": "0898a5da-b907-4231-c171-ddf6a1043911"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?\n",
+ "chatcmpl-BUdDxEetmH0k6yJkaDLeSshRZmGnz -- gpt-4o-mini-2024-07-18 -- latency: 0.90s \n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "res = call_model(\"hello, how are you?\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 239,
+ "id": "XJnkyMUDI9xu",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "XJnkyMUDI9xu",
+ "outputId": "bebbc826-60e8-4de9-8ddf-425d7c087cfa"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Hello! I'm just a computer program, so I don't have feelings, but I'm here to assist you. How can I help you today?\n",
+ "chatcmpl-BUdDySZjzxB8tCTLkuYDTyPFfKo1P -- gpt-3.5-turbo-0125 -- latency: 0.65s \n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "res = call_model(\"hello, how are you?\", model=\"gpt-3.5-turbo\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 240,
+ "id": "79nkkD6cVii2",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "79nkkD6cVii2",
+ "outputId": "c4ee9d21-3a81-4453-e412-2bd17d4a4372"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "400 Client Error: Bad Request for url: http://localhost:4000/chat/completions\n",
+ "{'error': '/chat/completions: Invalid model name passed in model=claude. Call `/v1/models` to view available models for your key.'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Try a non-supported model!\n",
+ "res = call_model(\"hello, how are you?\", model=\"claude\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fc65bfdd",
+ "metadata": {
+ "id": "fc65bfdd"
+ },
+ "source": [
+ "## 3 · Implement LLM caching with Redis\n",
+ "\n",
+ "LiteLLM Proxy with Redis provides two powerful caching capabilities that can significantly improve your LLM application performance and reliability:\n",
+ "\n",
+ "* **Exact cache (identical prompt)**: Pulls exact prompt/query matches from Redis with configurable TTL.\n",
+ "* **Semantic cache (similar prompt)**: Uses Redis as a semantic cache powered by **vector search** to determine if a prompt/query is similar enough to a cached entry.\n",
+ "\n",
+ "### Why Use Caching for LLMs?\n",
+ "\n",
+ "1. **Cost Reduction**: Avoid redundant API calls for identical or similar prompts\n",
+ "2. **Latency Improvement**: Cached responses return in milliseconds vs. seconds\n",
+ "3. **Reliability**: Reduce dependency on external API availability\n"
+ ]
+ },
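+ {
+ "cell_type": "markdown",
+ "id": "openai-sdk-note",
+ "metadata": {},
+ "source": [
+ "Because caching happens inside the proxy, any OpenAI-compatible client benefits from it automatically. As a quick aside, the proxy can also be called through the `openai` SDK by pointing `base_url` at it. The cell below is a minimal sketch, assuming the proxy is running on `localhost:4000` without a master key; the `oai_client` name and the placeholder API key are illustrative only."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "openai-sdk-example",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from openai import OpenAI\n",
+ "\n",
+ "# Point the OpenAI SDK at the LiteLLM proxy instead of api.openai.com.\n",
+ "# The api_key is a placeholder because no proxy master key is configured in this notebook.\n",
+ "oai_client = OpenAI(base_url=\"http://localhost:4000\", api_key=\"sk-placeholder\")\n",
+ "\n",
+ "completion = oai_client.chat.completions.create(\n",
+ " model=\"gpt-4o-mini\",\n",
+ " messages=[{\"role\": \"user\", \"content\": \"what is the capital of france?\"}],\n",
+ ")\n",
+ "print(completion.choices[0].message.content)"
+ ]
+ },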
+ {
+ "cell_type": "code",
+ "execution_count": 241,
+ "id": "eup_Z0Z_Y493",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "eup_Z0Z_Y493",
+ "outputId": "d815413e-acc0-4108-8b47-87dfb35cd59f"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.63s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.03s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "18.6 ms ± 3.59 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%timeit\n",
+ "res = call_model(\"what is the capital of france?\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "GQRkOghoB9-Y",
+ "metadata": {
+ "id": "GQRkOghoB9-Y"
+ },
+ "source": [
+ "Check response equivalence:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 242,
+ "id": "IbfUylGGUhP7",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "IbfUylGGUhP7",
+ "outputId": "e56853a1-61b0-4916-fb2b-c1695d922e8f"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n",
+ "The capital of France is Paris.\n",
+ "chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8 -- gpt-4o-mini-2024-07-18 -- latency: 0.02s \n",
+ "\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ "{'id': 'chatcmpl-BUdDz7ZsNbR2PTGbnzgALezkkVvh8',\n",
+ " 'created': 1746640319,\n",
+ " 'model': 'gpt-4o-mini-2024-07-18',\n",
+ " 'object': 'chat.completion',\n",
+ " 'system_fingerprint': 'fp_129a36352a',\n",
+ " 'choices': [{'finish_reason': 'stop',\n",
+ " 'index': 0,\n",
+ " 'message': {'content': 'The capital of France is Paris.',\n",
+ " 'role': 'assistant',\n",
+ " 'tool_calls': None,\n",
+ " 'function_call': None,\n",
+ " 'annotations': []}}],\n",
+ " 'usage': {'completion_tokens': 8,\n",
+ " 'prompt_tokens': 14,\n",
+ " 'total_tokens': 22,\n",
+ " 'completion_tokens_details': {'accepted_prediction_tokens': 0,\n",
+ " 'audio_tokens': 0,\n",
+ " 'reasoning_tokens': 0,\n",
+ " 'rejected_prediction_tokens': 0},\n",
+ " 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}},\n",
+ " 'service_tier': 'default'}"
+ ]
+ },
+ "execution_count": 242,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "res1 = call_model(\"what is the capital of france?\")\n",
+ "res2 = call_model(\"what is the capital of france?\")\n",
+ "\n",
+ "assert res1.json() == res2.json()\n",
+ "\n",
+ "res1.json()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e121e215",
+ "metadata": {
+ "id": "e121e215"
+ },
+ "source": [
+ "## 4 · Semantic caching\n",
+ "\n",
+ "Now we'll demonstrate semantic caching by sending similar prompts back to back. The first request should hit the LLM API, while future requests should be served from cache as long as they are similar enough. We'll see this reflected in the response times.\n",
+ "\n",
+ "First, we need to stop the running proxy and update the LiteLLM config."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 243,
+ "id": "iX5F90uWCpuY",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "iX5F90uWCpuY",
+ "outputId": "6ba29c04-a9f1-48f0-ae59-8fd059419fa7"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "-15"
+ ]
+ },
+ "execution_count": 243,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Stop the proxy process\n",
+ "_proxy_handle.terminate()\n",
+ "_proxy_handle.wait(timeout=4)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 244,
+ "id": "MpcYlHdSCvQE",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "MpcYlHdSCvQE",
+ "outputId": "666254d5-4d3e-4af2-e003-60a0c70ae29c"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Overwriting litellm_redis.yml\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%writefile litellm_redis.yml\n",
+ "model_list:\n",
+ "- litellm_params:\n",
+ " api_key: os.environ/OPENAI_API_KEY\n",
+ " model: gpt-3.5-turbo\n",
+ " rpm: 30\n",
+ " model_name: gpt-3.5-turbo\n",
+ "- litellm_params:\n",
+ " api_key: os.environ/OPENAI_API_KEY\n",
+ " model: gpt-4o-mini\n",
+ " rpm: 30\n",
+ " model_name: gpt-4o-mini\n",
+ "- litellm_params:\n",
+ " api_key: os.environ/OPENAI_API_KEY\n",
+ " model: text-embedding-3-small\n",
+ " model_name: text-embedding-3-small\n",
+ "\n",
+ "litellm_settings:\n",
+ " cache: True\n",
+ " set_verbose: True\n",
+ " cache_params:\n",
+ " type: redis-semantic\n",
+ " host: os.environ/REDIS_HOST\n",
+ " port: os.environ/REDIS_PORT\n",
+ " password: os.environ/REDIS_PASSWORD\n",
+ " ttl: 60\n",
+ " similarity_threshold: 0.90\n",
+ " redis_semantic_cache_embedding_model: text-embedding-3-small\n",
+ " redis_semantic_cache_index_name: llmcache"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 245,
+ "id": "9Ak-jWcXC6dq",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "9Ak-jWcXC6dq",
+ "outputId": "eec709e6-075a-4c23-b6d4-c2ed59a4fd02"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "✅ LiteLLM proxy on http://localhost:4000 (PID 63528)\n",
+ " Logs → /content/litellm_proxy.log\n"
+ ]
+ }
+ ],
+ "source": [
+ "_proxy_handle = start_proxy()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4sf49YkOnhww",
+ "metadata": {
+ "id": "4sf49YkOnhww"
+ },
+ "source": [
+ "Semantic cache can handle exact match scenarios (where the characters/tokens are identical). This would happen more in a development environment or in cases where a programmatic user is providing input to an LLM call."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 246,
+ "id": "c08699fc",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "c08699fc",
+ "outputId": "1ef29ae8-6fd6-4cff-909f-0da1874dbe60"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 1.35s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.37s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.53s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.47s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.36s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.24s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.39s \n",
+ "\n",
+ "The capital city of the United States is Washington, D.C.\n",
+ "chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ -- gpt-4o-mini-2024-07-18 -- latency: 0.28s \n",
+ "\n",
+ "379 ms ± 94.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%timeit\n",
+ "\n",
+ "call_model(\"what is the capital city of the United States?\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "mQTzCNvCFHRJ",
+ "metadata": {
+ "id": "mQTzCNvCFHRJ"
+ },
+ "source": [
+ "Additional (or variable) latency here per check is due to using OpenAI embeddings which makes calls over the network. A more optimized solution would be to use a more scalable embedding inference system OR a localized model that doesn't require a network hop.\n",
+ "\n",
+ "The semantic cache can also be used for near exact matches (fuzzy caching) based on semantic meaning. Below are a few scenarios:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 258,
+ "id": "v5lkpxafr7ot",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "v5lkpxafr7ot",
+ "outputId": "c00f3c88-e72d-4195-fd64-84bccf2ae185"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "As of my last update in October 2023, the President of France is Emmanuel Macron. He has been in office since May 14, 2017. However, please verify with a current source, as political positions can change.\n",
+ "chatcmpl-BUdHNxLLb7HBmnTUUHRQpxWBVhGAI -- gpt-4o-mini-2024-07-18 -- latency: 2.37s \n",
+ "\n",
+ "As of my last knowledge update in October 2023, the President of France is Emmanuel Macron. He has been in office since May 14, 2017, and was re-elected for a second term in April 2022. Please verify with up-to-date sources, as political situations can change.\n",
+ "chatcmpl-BUdHOz7UCsO4KKKcDfx8ZGv2LJ6dZ -- gpt-4o-mini-2024-07-18 -- latency: 1.38s \n",
+ "\n",
+ "As of my last update in October 2023, the President of France is Emmanuel Macron. He has been in office since May 14, 2017. However, please verify with a current source, as political positions can change.\n",
+ "chatcmpl-BUdHNxLLb7HBmnTUUHRQpxWBVhGAI -- gpt-4o-mini-2024-07-18 -- latency: 0.65s \n",
+ "\n",
+ "As of my last update in October 2023, the President of France is Emmanuel Macron. He has been in office since May 14, 2017. However, please verify with a current source, as political positions can change.\n",
+ "chatcmpl-BUdHNxLLb7HBmnTUUHRQpxWBVhGAI -- gpt-4o-mini-2024-07-18 -- latency: 0.60s \n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "texts = [\n",
+ " \"who is the president of France?\",\n",
+ " \"who is the country president of France?\",\n",
+ " \"who is France's current presidet?\",\n",
+ " \"The current president of France is?\"\n",
+ "]\n",
+ "\n",
+ "for text in texts:\n",
+ " res = call_model(text)"
+ ]
+ },
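+ {
+ "cell_type": "markdown",
+ "id": "embedding-latency-note",
+ "metadata": {},
+ "source": [
+ "To get a feel for that embedding overhead, the cell below times a single call to the proxy's OpenAI-compatible `/embeddings` endpoint using the `text-embedding-3-small` model from the config. This is a rough sketch of the per-lookup round-trip cost, not an exact measurement of the cache's internal code path."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "embedding-latency-check",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import time\n",
+ "\n",
+ "import requests\n",
+ "\n",
+ "# Time one embedding round trip through the proxy; the semantic cache pays a similar\n",
+ "# cost whenever it embeds an incoming prompt before running the vector search.\n",
+ "t0 = time.time()\n",
+ "resp = requests.post(\n",
+ " \"http://localhost:4000/embeddings\",\n",
+ " json={\"model\": \"text-embedding-3-small\", \"input\": \"who is the president of France?\"},\n",
+ " timeout=30,\n",
+ ")\n",
+ "resp.raise_for_status()\n",
+ "embedding = resp.json()[\"data\"][0][\"embedding\"]\n",
+ "print(f\"dims: {len(embedding)}, embedding latency: {time.time() - t0:.2f}s\")"
+ ]
+ },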
+ {
+ "cell_type": "markdown",
+ "id": "-akCGqYkqGVs",
+ "metadata": {
+ "id": "-akCGqYkqGVs"
+ },
+ "source": [
+ "## 5 · Inspect Redis Index with RedisVL\n",
+ "Use the `redisvl` helpers and CLI to investigate more about the underlying vector index that supports the checks within the LiteLLM proxy."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 248,
+ "id": "RntBqIlipyHA",
+ "metadata": {
+ "id": "RntBqIlipyHA"
+ },
+ "outputs": [],
+ "source": [
+ "from redisvl.index import SearchIndex\n",
+ "\n",
+ "idx = SearchIndex.from_existing(redis_client=client, name=\"llmcache\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 249,
+ "id": "tHVIHkXCqU7V",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "tHVIHkXCqU7V",
+ "outputId": "f68ad535-0f9d-4467-e0c7-bbf9ca271915"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 249,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "idx.exists()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 250,
+ "id": "8mNvmr7op-B-",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "8mNvmr7op-B-",
+ "outputId": "ea0535f7-e6fa-490e-8a8d-288572d7170d"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[32m17:52:13\u001b[0m \u001b[34m[RedisVL]\u001b[0m \u001b[1;30mINFO\u001b[0m Using Redis address from environment variable, REDIS_URL\n",
+ "\n",
+ "\n",
+ "Index Information:\n",
+ "╭──────────────┬────────────────┬──────────────┬─────────────────┬────────────╮\n",
+ "│ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │\n",
+ "├──────────────┼────────────────┼──────────────┼─────────────────┼────────────┤\n",
+ "│ llmcache │ HASH │ ['llmcache'] │ [] │ 0 │\n",
+ "╰──────────────┴────────────────┴──────────────┴─────────────────┴────────────╯\n",
+ "Index Fields:\n",
+ "╭───────────────┬───────────────┬─────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮\n",
+ "│ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │\n",
+ "├───────────────┼───────────────┼─────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤\n",
+ "│ prompt │ prompt │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │\n",
+ "│ response │ response │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │\n",
+ "│ inserted_at │ inserted_at │ NUMERIC │ │ │ │ │ │ │ │ │\n",
+ "│ updated_at │ updated_at │ NUMERIC │ │ │ │ │ │ │ │ │\n",
+ "│ prompt_vector │ prompt_vector │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 1536 │ distance_metric │ COSINE │\n",
+ "╰───────────────┴───────────────┴─────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴─────────────────┴────────────────╯\n"
+ ]
+ }
+ ],
+ "source": [
+ "!rvl index info -i llmcache"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "00bd3fc6",
+ "metadata": {
+ "id": "00bd3fc6"
+ },
+ "source": [
+ "### Examining the Cached Keys in Redis\n",
+ "\n",
+ "Let's look at the keys created in Redis for the cache and understand how LiteLLM structures them:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 251,
+ "id": "46eb6aa5",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "46eb6aa5",
+ "outputId": "bfae071a-b8c4-44bd-8672-0bbddc170027"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Found 1 cache keys in Redis\n",
+ "\n",
+ "Example cache key: llmcache:e4e4faaeea347b9876d03c4f68b7d981234a3a7a4281590ab4bc0e70dbdaef9e\n",
+ "TTL: 55 seconds remaining...\n",
+ "{'response': '{\\'timestamp\\': 1746640328.978919, \\'response\\': \\'{\"id\":\"chatcmpl-BUdE8A9yQyijCBN4Agg5QJxsrifUJ\",\"created\":1746640328,\"model\":\"gpt-4o-mini-2024-07-18\",\"object\":\"chat.completion\",\"system_fingerprint\":\"fp_dbaca60df0\",\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"message\":{\"content\":\"The capital city of the United States is Washington, D.C.\",\"role\":\"assistant\",\"tool_calls\":null,\"function_call\":null,\"annotations\":[]}}],\"usage\":{\"completion_tokens\":14,\"prompt_tokens\":17,\"total_tokens\":31,\"completion_tokens_details\":{\"accepted_prediction_tokens\":0,\"audio_tokens\":0,\"reasoning_tokens\":0,\"rejected_prediction_tokens\":0},\"prompt_tokens_details\":{\"audio_tokens\":0,\"cached_tokens\":0}},\"service_tier\":\"default\"}\\'}', 'prompt_vector': b'\\xccY/=\\xbf0\\x00\\xbdd\\x0f\\xa2=X\\xa5\\xc8=\\x1f\\t-\\xbc\\\\\\x1d\\x1b\\xbc^\\xda\\xdb\\xbc\\x02\\xfc<@\\xbc\\xe8h\\xb4<\\xaf\\x8bn\\xbc\\x91Ad\\xbcP\\xf2\\xf0;}$\\xe6\\xbc\\xf2V\\x11\\xbdk\\x03>\\xbc\\xe6l\\x91\\xbd\\xaf\\xcc\\xe5\\xbc\\xaa\\x15\\x17<\\x90\\xc3\\x05\\xbc\\xb4\\x83\\xe7\\xb9\\t\\xaf\\x14=\\xe9\\'=\\xbc\\xc8\\xe1\\x0f<\\xf6P\\x1f\\xbb^\\xda\\x0e\\xbd\\x8c\\x8a\\xe2\\xb9\\xfb\\x07n;\\x7f\\xe1\\x8c\\xbcts\\x89=\\x95zT\\xbb&<\\xab\\xbb\\xe6l\\x11=h\\x89\\xd6\\xbc\\x9b\\xaf\\x9a\\xbb\\xfe\\x01/=\\xba\\xf9$\\xbdSn\\xa0\\xbb\\xad\\x8f\\xcb\\xb9\\xa7Z89\\xbds\\x0c<\\xa6\\xdcs<\\xf4\\x93+=v0\\xca\\xbb[\\xe0\\x00<\\xbf\\xb0s\\xbc1\\xa8\\xe6;\\xda\\x80\\xc9\\xbd(\\xf9\\x1e<\\xb6\\xc04\\xbdSn ;\\x91A\\x97\\xbd\\xc1m\\x9a;\\xd2O`<\\xd8\\x84\\xa6:xmd=c\\x91\\x10\\xbc\\xe3\\xb1\\xff\\xbc\\xc9\\x9e\\x03=\\xdfx\\xc2\\xbc\\x1d\\xcc\\x92\\xbaQ1\\x86<\\x88Q%\\xbc\\xaf\\xcc\\xe5:ts\\x89\\xbc\\xc9_!\\xbd\\x8c\\x8a\\xe2\\xbc\\x82\\xdb\\xe7\\xbc\\xa6\\x9b/=\\xe3p;\\xba\\xdf\\xf8\\x1b\\xbc\\xef\\x1bY\\xbb%\\xbe\\x99\\xbc\\x9f\\xa7`\\xbd\\xbd\\xb4\\x03<\\xb2\\xc6&\\xbdc\\xd2\\x87\\xbc\\xc2*[<\\x85UO<\\x18\\x15\\x91\\xbbL9\\x8d<\\xe9\\'\\xbd;aTC\\xbbN\\xf6M={\\xe7\\xcb\\xbc\\xf2\\x17\\xaf\\xbb\\x055z\\xbc@\\x0e\\x16<\\xb5B\\xf0<=\\x14\\x08\\xbcc\\x91\\x90\\xbcR\\xaf\\x97<\\x1a\\x114=\\x13^\\x0f=\\xdd|\\x1f\\xbd|\\xa6\\xd4\\xbc\\xfd\\xc4\\x14\\xbd\\xb4\\x83\\x9a\\xbcO\\xb5\\x89\\xba..2=\\':c\\xbc\\x96\\xf8\\xe5<\\xdc\\xfe\\x8d<\\xb9:i\\xbd\\x1b\\xd0<\\xbd`\\x97\\x82;\\xd0\\x92\\x1f;\\x03zN\\xbc+\\xf3\\xac\\xbb\\xe4\\xaf\\x9d;\\xeb#\\x93\\xbd\\x9f\\xa7`:\\xb1\\x89\\x0c\\xbd\\xa5^\\x15<=\\x94\\xae\\xbc\\xb3\\xc4\\xde<\\x1c\\rW\\xc0<\\xb0\\xca\\x03<\\x9c-,=\\xc6\\xa4B\\xbc3e\\x8dS\\xb7<\\xba\\xf9\\xf1\\xbb\\xe7\\xa9\\xf8\\x12@\\xc0;\\xb3F\\x00\\xbd-\\xb0\\xed\\xbbJ\\xbd\\xdd<0k\\xcc<\\x7f\\xe1\\x0c=\\xc2\\xeb+;_\\x99\\x97<\\x16X\\x9d<\\x83\\xd9\\x05\\xbd5\"\\xce\\xbb\\x87\\x92\\xe9\\xbc\\xd2\\x0e\\xe9S7=\\x8a\\xcd\\xa1<\\xf2\\x17/\\xbc\\x98\\xb5\\x0c=9\\x1a\\xc7;\\xacR1S\\xb7<\\xead\\x8a