Repository dedicated to enhancing data retrieval and processing efficiencies in Google Cloud's Vertex AI by implementing a semantic caching layer with MemoryStore, Vertex AI Vector Search, and Gemini, focusing on GenAI applications.

arunpshankar/VertexAI-Semantic-Caching


A Guide to Semantic Caching: Optimizing GenAI Workflows on GCP

This repository focuses on enhancing data retrieval and processing efficiencies in generative AI applications. It achieves this by implementing a semantic caching layer utilizing MemoryStore, Vertex AI Vector Search, and Gemini, primarily on the Google Cloud Platform stack.

This codebase accompanies the Medium article "Implementing Semantic Caching: A Step-by-Step Guide to Faster, Cost-Effective GenAI Workflows," which walks through setting up the architecture for semantic caching in a document question-answering RAG pipeline.

For detailed instructions and more insight, please refer to the article.
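Before diving into setup, the cache-aside flow behind the pipeline can be sketched with toy stand-ins — here a plain Python dict plays the role of MemoryStore and a brute-force list plays Vertex AI Vector Search. The class name, threshold value, and 2-d "embeddings" are illustrative only, not the repository's actual API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class SemanticCache:
    """Toy semantic cache: look up by embedding similarity, not exact string match."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.index = []   # (embedding, key) pairs -- Vertex AI Vector Search in the real stack
        self.store = {}   # key -> cached answer   -- MemoryStore in the real stack

    def lookup(self, embedding):
        """Return the cached answer for the nearest entry, or None on a cache miss."""
        best_key, best_sim = None, -1.0
        for emb, key in self.index:
            sim = cosine(emb, embedding)
            if sim > best_sim:
                best_key, best_sim = key, sim
        if best_sim >= self.threshold:
            return self.store[best_key]
        return None

    def insert(self, embedding, answer):
        """Cache a freshly generated answer (e.g. from Gemini) under its embedding."""
        key = len(self.index)
        self.index.append((embedding, key))
        self.store[key] = answer
```

On a miss, the application would call Gemini, then `insert` the answer so that semantically similar future queries hit the cache instead of the model.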

(Figure: Semantic Caching)

Prerequisites 📋

Before getting started, ensure you have the following:

  • Python 3.6 or later
  • Git
  • Google Cloud Platform account with a project set up and Vertex AI API enabled

Make sure you have permissions to create service accounts and manage API keys within your GCP project.

Installation

Let's set up your local development environment and configure dependencies.

Clone the Repository 📂

  1. Clone the Repository: In your terminal, execute the following command:

    git clone https://github.com/arunpshankar/VertexAI-Semantic-Caching.git
    cd VertexAI-Semantic-Caching

Set Up Your Environment 🛠️

  1. Create a Virtual Environment: Isolate project dependencies by creating a Python virtual environment:

    • For macOS/Linux:

      python3 -m venv .VertexAI-Semantic-Caching
      source .VertexAI-Semantic-Caching/bin/activate
    • For Windows (the launcher is python, not python3):

      python -m venv .VertexAI-Semantic-Caching
      .VertexAI-Semantic-Caching\Scripts\activate
  2. Upgrade pip and Install Dependencies: Ensure pip is up-to-date and install project dependencies:

    python3 -m pip install --upgrade pip
    pip install -r requirements.txt
  3. Update Your PYTHONPATH:

    Ensure your Python interpreter recognizes the project directory as a module location.

    • For macOS/Linux:

      export PYTHONPATH=$PYTHONPATH:.
    • For Windows (use set instead of export):

      set PYTHONPATH=%PYTHONPATH%;.
  4. Configure Service Account Credentials 🔑

    • Create a directory to store your Google Cloud service account key securely:

      mkdir credentials
    • Generate a Service Account Key from the Google Cloud Console, then move the downloaded JSON file to the credentials directory, renaming it to key.json.
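Google's client libraries discover credentials through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable; a minimal way to point it at the key file created in the steps above (the path assumes you run from the repository root) is:

```python
import os

# Point Google client libraries at the service-account key.
# The path matches the credentials/key.json location created above.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.join("credentials", "key.json")
```

Setting the variable in your shell profile instead works just as well; the libraries only need it present before the first client is constructed.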

Architecture

(Figure: Semantic Match)
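The point of semantic matching is that a paraphrased query should still hit the cache even though an exact-string lookup would miss. A tiny illustration, using made-up 2-d embeddings and an illustrative 0.95 threshold in place of real Gemini embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity for two 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

cached_q = "What is semantic caching?"
cached_emb = [0.80, 0.60]   # hypothetical embedding of the cached query

new_q = "Explain semantic caching."
new_emb = [0.78, 0.62]      # hypothetical embedding of the paraphrase

exact_hit = (new_q == cached_q)                      # string match: miss
semantic_hit = cosine(cached_emb, new_emb) >= 0.95   # embedding match: hit
```

The threshold trades precision for hit rate: too low and unrelated queries return stale answers, too high and paraphrases fall through to the model.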
