From 735460fffc04c72fa3382af1a4458f13ea298076 Mon Sep 17 00:00:00 2001 From: Tyler Hutcherson Date: Thu, 27 Jun 2024 11:24:15 -0400 Subject: [PATCH 1/3] initial update to RAG quickstart guide --- content/develop/get-started/rag.md | 64 ++++++++++++++++-------------- 1 file changed, 34 insertions(+), 30 deletions(-) diff --git a/content/develop/get-started/rag.md b/content/develop/get-started/rag.md index a613304f00..69b63379e9 100644 --- a/content/develop/get-started/rag.md +++ b/content/develop/get-started/rag.md @@ -9,59 +9,63 @@ categories: - oss - kubernetes - clients -description: Understand how to use Redis with RAG use cases -linkTitle: Redis and RAG +description: Understand how to use Redis for RAG use cases +linkTitle: RAG with Redis stack: true -title: Redis with RAG +title: RAG with Redis weight: 4 --- -### Using Redis for Retrieval Augmented Generation (RAG) use cases +### What is Retrieval Augmented Generation (RAG)? +Large Language Models (LLMs) generate human-like text but are limited by the data they were trained on. RAG enhances LLMs by integrating them with external, domain-specific data stored in a Redis [vector database]({{< relref "/develop/get-started/vector-database" >}}). -RAG is a method that enhances the capabilities of generative AI models by integrating them with Redis vector databases. -This approach allows the AI to retrieve relevant information in real-time, improving the accuracy and relevance of generated content. -Redis, with its high performance and versatile data structures, is an excellent choice for implementing RAG. -Here's an overview of how Redis can be leveraged in a RAG use case. +RAG involves three main steps: -### The role of Redis in RAG - -Redis provides a robust platform for managing the data retrieval process in RAG. -It supports the storage and retrieval of vectors, which are essential for handling large-scale, unstructured data and performing similarity searches. 
-Here are some key features and components of Redis that make it suitable for RAG: +- **Retrieve**: Fetch relevant information from Redis using vector search and filters based on the user query. +- **Augment**: Create a prompt for the LLM, including the user query, relevant context, and additional instructions. +- **Generate**: Return the response generated by the LLM to the user. -1. **Redis as a vector database**: The following quick start tutorial provides an example of how to use Redis as a vector database: - - [Basic vector search ]({{< relref "/develop/get-started/vector-database" >}}) +RAG enables LLMs to use real-time information, improving the accuracy and relevance of generated content. +Redis is ideal for RAG due to its speed, versatility, and familiarity. -1. **Redis Vector Library (RedisVL)**: This library is designed to enhance the development of generative AI applications by efficiently managing vector data. It allows the storage of embeddings (vector representations of text) and facilitates fast similarity searches, which are crucial for retrieving relevant information in RAG. +### The role of Redis in RAG -1. **Integration with AI frameworks**: Redis integrates seamlessly with various AI frameworks and tools. For instance, combining Redis with LangChain, a library for building language models, enables developers to create sophisticated RAG pipelines. This integration allows for efficient data management and retrieval operations that support real-time AI applications. +Redis provides a robust platform for managing real-time data. It supports the storage and retrieval of vectors, essential for handling large-scale, unstructured data and performing similarity searches. Key features and components of Redis that make it suitable for RAG include: -1. **High performance and scalability**: Redis is known for its low latency and high throughput, which are essential for real-time applications. 
Its in-memory data store ensures quick access to data, making it ideal for applications requiring rapid data retrieval and generation. +1. **Vector Database**: Stores and indexes vector embeddings that semantically represent unstructured data. +1. **Semantic Cache**: Caches frequently asked questions (FAQs) in a RAG pipeline. Using vector search, Redis retrieves similar previously answered questions, reducing LLM inference costs and latency. +1. **LLM Session Manager**: Stores conversation history between an LLM and a user. Redis fetches recent and relevant portions of the chat history to provide context, improving the quality and accuracy of responses. +1. **High Performance and Scalability**: Known for its [low latency and high throughput](https://redis.io/blog/benchmarking-results-for-vector-databases/), Redis is ideal for RAG systems and AI agents requiring rapid data retrieval and generation. -1. **Spring AI and Redis**: Using Spring AI with Redis simplifies the process of building RAG applications. Spring AI provides a structured approach to integrating AI capabilities into applications, while Redis handles the data management aspect, ensuring that the RAG pipeline is efficient and scalable. +### Build a RAG Application with Redis -### Build a RAG application with Redis +To build a RAG application with Redis, follow these general steps: -To build a RAG application with Redis, the following are some general steps: +1. **Set up Redis**: Start by setting up a Redis instance and configuring it to handle vector data. -1. **Set up Redis**: Start by setting up a Redis instance and configuring it to handle vector data. The RedisVL library will be instrumental here, as it provides the necessary tools for storing and querying vector embeddings. +2. **Use a Framework**: + 1. **Redis Vector Library (RedisVL)**: [RedisVL](https://redis.io/docs/latest/integrate/redisvl/) enhances the development of generative AI applications by efficiently managing vectors and metadata. 
It allows for storage of vector embeddings and facilitates fast similarity searches, crucial for retrieving relevant information in RAG. + 2. **Popular AI Frameworks**: Redis integrates seamlessly with various AI frameworks and tools. For instance, combining Redis with [LangChain](https://python.langchain.com/v0.2/docs/integrations/vectorstores/redis/) or [LlamaIndex](https://docs.llamaindex.ai/en/latest/examples/vector_stores/RedisIndexDemo/), libraries for building language models, enables developers to create sophisticated RAG pipelines. These integrations support efficient data management and building real-time LLM chains. + 3. **Spring AI and Redis**: Using [Spring AI with Redis](https://redis.io/blog/building-a-rag-application-with-redis-and-spring-ai/) simplifies building RAG applications. Spring AI provides a structured approach to integrating AI capabilities into applications, while Redis handles data management, ensuring the RAG pipeline is efficient and scalable. -1. **Embed and store data**: Convert your data into vector embeddings using a suitable model (e.g., BERT, GPT). Store these embeddings in Redis, where they can be quickly retrieved based on vector searches. +3. **Embed and Store Data**: Convert your data into vector embeddings using a suitable model (e.g., BERT, GPT). Store these embeddings in Redis, where they can be quickly retrieved based on vector searches. -1. **Integrate with a generative model**: Use a generative AI model that can leverage the retrieved data. The model will use the vectors stored in Redis to augment its generation process, ensuring that the output is informed by relevant, up-to-date information. +4. **Integrate with a Generative Model**: Use a generative AI model that can leverage the retrieved data. The model will use the vectors stored in Redis to augment its generation process, ensuring the output is informed by relevant, up-to-date information. -1. 
**Query and generate**: Implement the query logic that retrieves relevant vectors from Redis based on the input prompt. Feed these vectors into the generative model to produce augmented outputs. +5. **Query and Generate**: Implement the query logic to retrieve relevant vectors from Redis based on the input prompt. Feed these vectors into the generative model to produce augmented outputs. ### Benefits of Using Redis for RAG -- **Efficiency**: The in-memory data store of Redis ensures that retrieval operations are performed with minimal latency, which is crucial for real-time applications. -- **Scalability**: Redis can handle large volumes of data and scale horizontally, making it suitable for applications with growing data needs. -- **Flexibility**: The support for various data structures and integration with different AI frameworks in Redis allows for flexible and adaptable RAG pipelines. +- **Efficiency**: Redis's in-memory data store ensures that retrieval operations are performed with minimal latency. +- **Scalability**: Redis scales horizontally, seamlessly handling growing volumes of data and queries. +- **Flexibility**: Redis supports a variety of data structures and integrates with AI frameworks. -In summary, Redis offers a powerful and efficient platform for implementing Retrieval Augmented Generation. Its vector management capabilities, high performance, and seamless integration with AI frameworks make it an ideal choice if you are looking to enhance your generative AI applications with real-time data retrieval. +In summary, Redis offers a powerful and efficient platform for implementing RAG. Its vector management capabilities, high performance, and seamless integration with AI frameworks make it an ideal choice for enhancing generative AI applications with real-time data retrieval. ### Resources - [RAG defined](https://redis.io/glossary/retrieval-augmented-generation/). 
- [RAG overview](https://redis.io/kb/doc/2ok7xd1drq/how-to-perform-retrieval-augmented-generation-rag-with-redis).
- [Redis Vector Library (RedisVL)](https://redis.io/docs/latest/integrate/redisvl/) and [introductory article](https://redis.io/blog/introducing-the-redis-vector-library-for-enhancing-genai-development/).
-- [RAG with Redis and SpringAI](https://redis.io/blog/building-a-rag-application-with-redis-and-spring-ai/)
\ No newline at end of file
+- [RAG with Redis and Spring AI](https://redis.io/blog/building-a-rag-application-with-redis-and-spring-ai/)
+- [Build a multimodal RAG app with LangChain and Redis](https://redis.io/blog/explore-the-new-multimodal-rag-template-from-langchain-and-redis/)
+- [Get hands-on with advanced Redis AI Recipes](https://github.com/redis-developer/redis-ai-resources)

From 71c58cb3176083775ef7d38a7045724a36b2c5f0 Mon Sep 17 00:00:00 2001
From: Tyler Hutcherson
Date: Fri, 28 Jun 2024 10:58:09 -0400
Subject: [PATCH 2/3] Apply suggestions from code review

Co-authored-by: David Dougherty
---
 content/develop/get-started/rag.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/content/develop/get-started/rag.md b/content/develop/get-started/rag.md
index 69b63379e9..a4225914c5 100644
--- a/content/develop/get-started/rag.md
+++ b/content/develop/get-started/rag.md
@@ -31,10 +31,10 @@ Redis provides a robust platform for managing real-time data. It supports the storage and retrieval of vectors, essential for handling large-scale, unstructured data and performing similarity searches. Key features and components of Redis that make it suitable for RAG include:
 
-1. **Vector Database**: Stores and indexes vector embeddings that semantically represent unstructured data.
-1. **Semantic Cache**: Caches frequently asked questions (FAQs) in a RAG pipeline. 
Using vector search, Redis retrieves similar previously answered questions, reducing LLM inference costs and latency. -1. **LLM Session Manager**: Stores conversation history between an LLM and a user. Redis fetches recent and relevant portions of the chat history to provide context, improving the quality and accuracy of responses. -1. **High Performance and Scalability**: Known for its [low latency and high throughput](https://redis.io/blog/benchmarking-results-for-vector-databases/), Redis is ideal for RAG systems and AI agents requiring rapid data retrieval and generation. +1. **Vector database**: Stores and indexes vector embeddings that semantically represent unstructured data. +1. **Semantic cache**: Caches frequently asked questions (FAQs) in a RAG pipeline. Using vector search, Redis retrieves similar previously answered questions, reducing LLM inference costs and latency. +1. **LLM session manager**: Stores conversation history between an LLM and a user. Redis fetches recent and relevant portions of the chat history to provide context, improving the quality and accuracy of responses. +1. **High performance and scalability**: Known for its [low latency and high throughput](https://redis.io/blog/benchmarking-results-for-vector-databases/), Redis is ideal for RAG systems and AI agents requiring rapid data retrieval and generation. ### Build a RAG Application with Redis @@ -44,18 +44,18 @@ To build a RAG application with Redis, follow these general steps: 2. **Use a Framework**: 1. **Redis Vector Library (RedisVL)**: [RedisVL](https://redis.io/docs/latest/integrate/redisvl/) enhances the development of generative AI applications by efficiently managing vectors and metadata. It allows for storage of vector embeddings and facilitates fast similarity searches, crucial for retrieving relevant information in RAG. - 2. **Popular AI Frameworks**: Redis integrates seamlessly with various AI frameworks and tools. 
For instance, combining Redis with [LangChain](https://python.langchain.com/v0.2/docs/integrations/vectorstores/redis/) or [LlamaIndex](https://docs.llamaindex.ai/en/latest/examples/vector_stores/RedisIndexDemo/), libraries for building language models, enables developers to create sophisticated RAG pipelines. These integrations support efficient data management and building real-time LLM chains. + 2. **Popular AI frameworks**: Redis integrates seamlessly with various AI frameworks and tools. For instance, combining Redis with [LangChain](https://python.langchain.com/v0.2/docs/integrations/vectorstores/redis/) or [LlamaIndex](https://docs.llamaindex.ai/en/latest/examples/vector_stores/RedisIndexDemo/), libraries for building language models, enables developers to create sophisticated RAG pipelines. These integrations support efficient data management and building real-time LLM chains. 3. **Spring AI and Redis**: Using [Spring AI with Redis](https://redis.io/blog/building-a-rag-application-with-redis-and-spring-ai/) simplifies building RAG applications. Spring AI provides a structured approach to integrating AI capabilities into applications, while Redis handles data management, ensuring the RAG pipeline is efficient and scalable. -3. **Embed and Store Data**: Convert your data into vector embeddings using a suitable model (e.g., BERT, GPT). Store these embeddings in Redis, where they can be quickly retrieved based on vector searches. +3. **Embed and store data**: Convert your data into vector embeddings using a suitable model (e.g., BERT, GPT). Store these embeddings in Redis, where they can be quickly retrieved based on vector searches. -4. **Integrate with a Generative Model**: Use a generative AI model that can leverage the retrieved data. The model will use the vectors stored in Redis to augment its generation process, ensuring the output is informed by relevant, up-to-date information. +4. 
**Integrate with a generative model**: Use a generative AI model that can leverage the retrieved data. The model will use the vectors stored in Redis to augment its generation process, ensuring the output is informed by relevant, up-to-date information. -5. **Query and Generate**: Implement the query logic to retrieve relevant vectors from Redis based on the input prompt. Feed these vectors into the generative model to produce augmented outputs. +5. **Query and generate**: Implement the query logic to retrieve relevant vectors from Redis based on the input prompt. Feed these vectors into the generative model to produce augmented outputs. ### Benefits of Using Redis for RAG -- **Efficiency**: Redis's in-memory data store ensures that retrieval operations are performed with minimal latency. +- **Efficiency**: The in-memory data store of Redis ensures that retrieval operations are performed with minimal latency. - **Scalability**: Redis scales horizontally, seamlessly handling growing volumes of data and queries. - **Flexibility**: Redis supports a variety of data structures and integrates with AI frameworks. From 7b49fdc4c85aeddad552a870b87a4e60908ecd9f Mon Sep 17 00:00:00 2001 From: Tyler Hutcherson Date: Fri, 28 Jun 2024 10:59:11 -0400 Subject: [PATCH 3/3] update numeric list --- content/develop/get-started/rag.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/content/develop/get-started/rag.md b/content/develop/get-started/rag.md index a4225914c5..f4e06a51d9 100644 --- a/content/develop/get-started/rag.md +++ b/content/develop/get-started/rag.md @@ -42,16 +42,16 @@ To build a RAG application with Redis, follow these general steps: 1. **Set up Redis**: Start by setting up a Redis instance and configuring it to handle vector data. -2. **Use a Framework**: +1. **Use a Framework**: 1. 
**Redis Vector Library (RedisVL)**: [RedisVL](https://redis.io/docs/latest/integrate/redisvl/) enhances the development of generative AI applications by efficiently managing vectors and metadata. It allows for storage of vector embeddings and facilitates fast similarity searches, crucial for retrieving relevant information in RAG. - 2. **Popular AI frameworks**: Redis integrates seamlessly with various AI frameworks and tools. For instance, combining Redis with [LangChain](https://python.langchain.com/v0.2/docs/integrations/vectorstores/redis/) or [LlamaIndex](https://docs.llamaindex.ai/en/latest/examples/vector_stores/RedisIndexDemo/), libraries for building language models, enables developers to create sophisticated RAG pipelines. These integrations support efficient data management and building real-time LLM chains. - 3. **Spring AI and Redis**: Using [Spring AI with Redis](https://redis.io/blog/building-a-rag-application-with-redis-and-spring-ai/) simplifies building RAG applications. Spring AI provides a structured approach to integrating AI capabilities into applications, while Redis handles data management, ensuring the RAG pipeline is efficient and scalable. + 1. **Popular AI frameworks**: Redis integrates seamlessly with various AI frameworks and tools. For instance, combining Redis with [LangChain](https://python.langchain.com/v0.2/docs/integrations/vectorstores/redis/) or [LlamaIndex](https://docs.llamaindex.ai/en/latest/examples/vector_stores/RedisIndexDemo/), libraries for building language models, enables developers to create sophisticated RAG pipelines. These integrations support efficient data management and building real-time LLM chains. + 1. **Spring AI and Redis**: Using [Spring AI with Redis](https://redis.io/blog/building-a-rag-application-with-redis-and-spring-ai/) simplifies building RAG applications. 
Spring AI provides a structured approach to integrating AI capabilities into applications, while Redis handles data management, ensuring the RAG pipeline is efficient and scalable. -3. **Embed and store data**: Convert your data into vector embeddings using a suitable model (e.g., BERT, GPT). Store these embeddings in Redis, where they can be quickly retrieved based on vector searches. +1. **Embed and store data**: Convert your data into vector embeddings using a suitable model (e.g., BERT, GPT). Store these embeddings in Redis, where they can be quickly retrieved based on vector searches. -4. **Integrate with a generative model**: Use a generative AI model that can leverage the retrieved data. The model will use the vectors stored in Redis to augment its generation process, ensuring the output is informed by relevant, up-to-date information. +1. **Integrate with a generative model**: Use a generative AI model that can leverage the retrieved data. The model will use the vectors stored in Redis to augment its generation process, ensuring the output is informed by relevant, up-to-date information. -5. **Query and generate**: Implement the query logic to retrieve relevant vectors from Redis based on the input prompt. Feed these vectors into the generative model to produce augmented outputs. +1. **Query and generate**: Implement the query logic to retrieve relevant vectors from Redis based on the input prompt. Feed these vectors into the generative model to produce augmented outputs. ### Benefits of Using Redis for RAG
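
Note for reviewers: the retrieve/augment/generate flow the patched page describes can be sketched end-to-end in a few lines. The snippet below is an illustrative approximation only — it uses hand-written toy embeddings, a brute-force cosine similarity search, and a stubbed `generate()` in place of a real embedding model, a Redis vector index, and an LLM call; none of these names are the RedisVL API.

```python
import math

# Hypothetical toy corpus of (text, embedding) pairs. In a real pipeline the
# embeddings would come from an embedding model and be indexed in Redis.
CORPUS = [
    ("Redis supports vector search over hash and JSON documents.", [0.9, 0.1, 0.0]),
    ("LangChain and LlamaIndex ship Redis vector store integrations.", [0.1, 0.9, 0.0]),
    ("A semantic cache stores previously answered questions.", [0.0, 0.2, 0.9]),
]

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, k=1):
    # Step 1 (retrieve): rank documents by similarity to the query vector.
    # Redis would do this server-side with a KNN vector search instead of
    # this brute-force scan.
    ranked = sorted(CORPUS, key=lambda doc: cosine(query_vec, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def augment(question, context):
    # Step 2 (augment): build the LLM prompt from the user query, the
    # retrieved context, and additional instructions.
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

def generate(prompt):
    # Step 3 (generate): stand-in for a real LLM call.
    return f"[LLM response to a {len(prompt)}-character prompt]"

query_vec = [0.85, 0.15, 0.05]  # pretend embedding of the user's question
context = retrieve(query_vec, k=1)
answer = generate(augment("How does Redis store vectors?", context))
print(answer)
```

In a real application, `CORPUS` and `retrieve()` would be replaced by a Redis index and a vector query (via RedisVL, LangChain, or LlamaIndex as the page suggests), and `generate()` by a call to your model provider; only the shape of the three steps carries over.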