
Scalable LLM Architectures with Redis & GCP Vertex AI

☁️ Generative AI with Google Vertex AI comes with a specialized in-console studio experience, a dedicated API for Gemini, and an easy-to-use Python SDK designed for deploying and managing instances of Google's powerful language models.

⚡ Redis Enterprise offers fast, scalable vector search, with an API for index creation and management, low-latency queries, and hybrid filtering. Coupled with its versatile data structures, Redis Enterprise is an optimal foundation for building high-quality Large Language Model (LLM) apps.
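To make the vector search and hybrid filtering idea concrete, here is a minimal, server-free sketch of what happens conceptually on a KNN query with a tag filter. This is plain Python over toy data, not the Redis API; the document IDs, tags, and 3-dimensional vectors are made up for illustration (a real index would use `FT.CREATE`/`FT.SEARCH` against stored embeddings).

```python
import math

def cosine_sim(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "documents": each mirrors a Redis hash holding a vector field
# plus a tag field used for metadata filtering.
docs = [
    {"id": "doc:1", "tag": "pricing", "vec": [1.0, 0.0, 0.0]},
    {"id": "doc:2", "tag": "support", "vec": [0.0, 1.0, 0.0]},
    {"id": "doc:3", "tag": "pricing", "vec": [0.9, 0.1, 0.0]},
]

def hybrid_knn(query_vec, tag, k=2):
    # Hybrid filtering: restrict candidates by tag first,
    # then rank the survivors by vector similarity (KNN).
    candidates = [d for d in docs if d["tag"] == tag]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_sim(query_vec, d["vec"]),
        reverse=True,
    )
    return [d["id"] for d in ranked[:k]]

print(hybrid_knn([1.0, 0.0, 0.0], "pricing"))  # ['doc:1', 'doc:3']
```

The pre-filter-then-rank order matters: filtering first keeps the similarity computation off documents that could never match, which is the same reason hybrid queries scale well server-side.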

This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.

Reference architecture

  1. Primary Data Sources
  2. Data Extraction and Loading
  3. Large Language Models
    • text-embedding-gecko@003 for embeddings
    • gemini-1.0-pro-001 for LLM generation and chat
  4. High-Performance Data Layer (Redis)
    • Semantic caching to improve LLM performance and associated costs
    • Vector search for context retrieval from knowledge base
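The semantic-caching item in the data layer can be sketched in a few lines. This is a hypothetical in-memory stand-in for the Redis-backed cache, assuming query embeddings already exist; the class name, threshold value, and sample vectors are all illustrative, not part of the repo.

```python
import math

def cosine_sim(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class SemanticCache:
    # Store (query embedding, response) pairs and reuse a response
    # when a new query embeds close enough to a cached one, so the
    # expensive LLM call is skipped for near-duplicate questions.
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response)

    def get(self, query_vec):
        best, best_sim = None, 0.0
        for vec, response in self.entries:
            sim = cosine_sim(query_vec, vec)
            if sim > best_sim:
                best, best_sim = response, sim
        # Only a hit if the closest cached query clears the threshold.
        return best if best_sim >= self.threshold else None

    def put(self, query_vec, response):
        self.entries.append((query_vec, response))

cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "Redis Enterprise supports vector search.")
print(cache.get([0.99, 0.05]))  # near-duplicate query: cache hit
print(cache.get([0.0, 1.0]))    # unrelated query: None, call the LLM
```

The threshold is the key tuning knob: too low and unrelated questions get stale answers, too high and the cache rarely fires.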

RAG demo


Open the code tutorial in the Colab notebook to get your hands dirty with Redis and Vertex AI on GCP. It's a step-by-step walkthrough of setting up the required data, generating embeddings, and assembling a RAG pipeline from scratch for fast LLM apps, highlighting Redis vector search and semantic caching.
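The last step of that walkthrough, grounding the model's answer in retrieved context, boils down to prompt assembly. A hedged sketch of that step follows; the template wording and function name are illustrative, and the notebook's actual prompt may differ.

```python
def build_rag_prompt(question, contexts):
    # Assemble the grounded prompt sent to the chat model:
    # retrieved passages first, then the user's question, so the
    # model answers from the knowledge base rather than memory.
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What does Redis provide for LLM apps?",
    [
        "Redis offers low-latency vector search.",
        "Redis can act as a semantic cache.",
    ],
)
print(prompt)
```

In the full pipeline the `contexts` list would come from the vector search step, and the resulting string would be passed to the Gemini chat model for generation.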

Additional resources

About

Reference architecture for LLM-based applications on Google Cloud Platform with Redis Enterprise as a high-performance data layer.
