A production-ready Retrieval-Augmented Generation (RAG) engine built with Google's Agent Development Kit (ADK) and Vertex AI RAG Engine. This project provides a modular framework for managing Google Cloud Storage (GCS) buckets, RAG corpora, and document retrieval with a focus on best practices and user experience.
Vertex AI RAG Engine is a component of the Vertex AI Platform that facilitates Retrieval-Augmented Generation (RAG) and serves as a data framework for developing context-augmented large language model (LLM) applications. It enables you to enrich LLM context with your organization's private knowledge, reducing hallucinations and improving answer accuracy.
These concepts are listed in the order of the retrieval-augmented generation (RAG) process:
- Data ingestion: Intake data from different data sources. For example, local files, Cloud Storage, and Google Drive.
- Data transformation: Conversion of the data in preparation for indexing. For example, data is split into chunks.
- Embedding: Numerical representations of words or pieces of text. These numbers capture the semantic meaning and context of the text. Similar or related words or text tend to have similar embeddings, which means they are closer together in the high-dimensional vector space.
- Data indexing: Vertex AI RAG Engine creates an index called a corpus. The index structures the knowledge base so it's optimized for searching. For example, the index is like a detailed table of contents for a massive reference book.
- Retrieval: When a user asks a question or provides a prompt, the retrieval component in Vertex AI RAG Engine searches through its knowledge base to find information that is relevant to the query.
- Generation: The retrieved information becomes the context added to the original user query as a guide for the generative AI model to generate factually grounded and relevant responses.
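The pipeline above can be sketched with a toy, dependency-free example. Hash-based "embeddings" stand in for a real embedding model, and a plain list stands in for the managed corpus index; Vertex AI RAG Engine performs these same steps with production embedding models and a managed vector store:

```python
import hashlib
import math

def embed(text: str, dim: int = 32) -> list[float]:
    """Toy embedding: hash word tokens into a fixed-size unit vector.
    (A real system would call an embedding model instead.)"""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(document: str, size: int = 8) -> list[str]:
    """Data transformation: split a document into fixed-size word chunks."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, index: list[tuple[str, list[float]]], k: int = 1) -> list[str]:
    """Retrieval: rank indexed chunks by cosine similarity to the query."""
    q = embed(query)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[1])))
    return [text for text, _ in scored[:k]]

# Ingestion + indexing: build a tiny "corpus" from one document.
doc = ("Embeddings map text to vectors. Vector search finds nearest "
       "neighbors. Agents call tools.")
index = [(c, embed(c)) for c in chunk(doc, size=5)]

# Retrieval: the chunk about vector search should rank first for this query.
print(retrieve("nearest neighbor vector search", index))
```

The retrieved chunk would then be prepended to the user's prompt (the generation step), which is what grounds the model's answer in the corpus.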
Agent Development Kit (ADK) is a flexible and modular framework for developing and deploying AI agents. Key features include:
- Model-Agnostic: While optimized for Gemini and the Google ecosystem, ADK works with any model.
- Flexible Orchestration: Define workflows using workflow agents (`Sequential`, `Parallel`, `Loop`) or leverage LLM-driven dynamic routing for adaptive behavior.
- Multi-Agent Architecture: Build modular applications by composing multiple specialized agents in a hierarchy.
- Rich Tool Ecosystem: Equip agents with diverse capabilities through pre-built tools, custom functions, third-party integrations, or even other agents as tools.
- Deployment Ready: Deploy agents anywhere – locally, on Vertex AI Agent Engine, or using Cloud Run/Docker.
- Built-in Evaluation: Assess agent performance by evaluating both response quality and execution trajectory.
ADK makes agent development feel more like software development, making it easier to create, deploy, and orchestrate agents ranging from simple tasks to complex workflows.
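As a concrete illustration, a minimal ADK agent wrapping one custom function tool looks roughly like the sketch below. This is a configuration-style sketch only: the tool body is stubbed, the agent and tool names are illustrative, and the `google.adk` API surface may differ across versions.

```python
# Sketch only: assumes the google-adk package is installed.
from google.adk.agents import Agent

def list_buckets(project_id: str) -> dict:
    """Custom function tool: list GCS buckets (stubbed for illustration)."""
    return {"status": "success", "buckets": ["adk-foundation-llm"]}

root_agent = Agent(
    name="rag_corpus_manager",          # illustrative name
    model="gemini-2.0-flash",
    description="Manages GCS buckets and Vertex AI RAG corpora.",
    instruction="Help the user manage buckets, corpora, and retrieval.",
    tools=[list_buckets],               # plain Python functions become tools
)
```

The agent picks which tool to call from the function's name, signature, and docstring, which is why descriptive docstrings matter when writing ADK tools.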
- Vertex AI RAG Engine
- Agent Development Kit (ADK)
- Features
- Pre-created RAG Corpora
- Architecture
- Prerequisites
- Installation
- Usage
- Configuration
- Supported File Types
- Troubleshooting
- Contributing
- License
- References
- Example Workflow
- Author
- 🗂️ GCS Bucket Management: Create, list, and manage GCS buckets for file storage.
- 📚 RAG Corpus Management: Create, update, list, and delete RAG corpora in Vertex AI.
- 📄 Document Management: Import documents from GCS into RAG corpora for vector search.
- 🔎 Semantic Search: Query RAG corpora for relevant information with citations.
- 🤖 Agent-based Interface: Interact with all functionalities through a natural language interface.
- ⚙️ Configurable & Extensible: Centralized configuration, emoji-enhanced responses, and schema-compliant tools.
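For instance, the emoji-enhanced, citation-bearing responses mentioned above could be assembled by a small helper like this (a hypothetical formatter with an assumed context shape, not the project's actual tool schema):

```python
def format_answer(answer: str, contexts: list[dict]) -> str:
    """Append emoji-enhanced source citations to a generated answer.

    Each context is assumed to carry 'source' (a GCS URI) and 'text'
    keys; this shape is illustrative only.
    """
    lines = [f"🤖 {answer}", "", "📚 Sources:"]
    for i, ctx in enumerate(contexts, start=1):
        lines.append(f"  [{i}] {ctx['source']}")
    return "\n".join(lines)

reply = format_answer(
    "Chain of Thought prompts a model to reason step by step.",
    [{"source": "gs://adk-prompt-engineering/promptengineering.pdf", "text": "..."}],
)
print(reply)
```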
The project includes several pre-created RAG corpora covering major AI topics:
- Foundation Models & Prompt Engineering: Resources on large language models and effective prompt design
- Embeddings & Vector Stores: Details on text embeddings and vector databases
- Generative AI Agents: Information on agent design, implementation, and usage
- Domain-Specific LLMs: Techniques for applying LLMs to solve domain-specific problems
- MLOps for Generative AI: Deployment and production considerations for GenAI systems
Each corpus contains relevant PDF documents imported from Google and Kaggle's Gen AI Intensive course:
- Day 1: Foundational Models & Prompt Engineering
- Day 2: Embeddings & Vector Stores / Databases
- Day 3: Generative AI Agents
- Day 4: Domain-Specific LLMs
- Day 5: MLOps for Generative AI
The Gen AI Intensive course broke the GUINNESS WORLD RECORDS™ title for the Largest Attendance at a Virtual AI Conference in One Week, with more than 280,000 signups in just 20 days. The materials provide a comprehensive overview of Vertex AI capabilities and best practices for working with generative AI.
The project follows a modular architecture based on the ADK framework:
The architecture consists of several key components:
- User Interface: Interact with the system through ADK Web or CLI
- Agent Development Kit (ADK): The core orchestration layer that manages tools and user interactions
- Function Tools: Modular components divided into:
- Storage Tools: For GCS bucket and file management
- RAG Corpus Tools: For corpus management and semantic search
- Google Cloud Services:
- Google Cloud Storage: Stores document files
- Vertex AI RAG Engine: Provides embedding, indexing and retrieval capabilities
- Gemini 2.0 LLM Model: Generates responses grounded in retrieved contexts
File structure:
adk-vertex-ai-rag-engine/
├── rag/ # Main project package
│ ├── __init__.py # Package initialization
│ ├── agent.py # The main RAG corpus manager agent
│ ├── config/ # Configuration directory
│ │ └── __init__.py # Centralized configuration settings
│ └── tools/ # ADK function tools
│ ├── __init__.py # Tools package initialization
│ ├── corpus_tools.py # RAG corpus management tools
│ └── storage_tools.py # GCS bucket management tools
├── .Images/ # Demo images and GIFs
└── README.md # Project documentation
- Python 3.11+
- Google Cloud project with Vertex AI API enabled
- Google Cloud SDK
- Access to Vertex AI and Cloud Storage
# Clone the repository
git clone https://github.com/arjunprabhulal/adk-vertex-ai-rag-engine.git
cd adk-vertex-ai-rag-engine
# (Optional) Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure your Google Cloud project
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
# Enable required Google Cloud services
gcloud services enable aiplatform.googleapis.com --project=${GOOGLE_CLOUD_PROJECT}
gcloud services enable storage.googleapis.com --project=${GOOGLE_CLOUD_PROJECT}
# Set up IAM permissions
gcloud projects add-iam-policy-binding ${GOOGLE_CLOUD_PROJECT} \
--member="user:YOUR_EMAIL@domain.com" \
--role="roles/aiplatform.user"
gcloud projects add-iam-policy-binding ${GOOGLE_CLOUD_PROJECT} \
--member="user:YOUR_EMAIL@domain.com" \
--role="roles/storage.objectAdmin"
# Set up Gemini API key
# Get your API key from Google AI Studio: https://ai.google.dev/
export GOOGLE_API_KEY=your_gemini_api_key_here
# Set up authentication credentials
# Option 1: Use gcloud application-default credentials (recommended for development)
gcloud auth application-default login
# Option 2: Use a service account key (for production or CI/CD environments)
# Download your service account key from GCP Console and set the environment variable
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json
There are two ways to run the agent:
# Option 1: Use ADK web interface (recommended for interactive usage)
adk web
# Option 2: Run the agent directly in the terminal
adk run rag
The web interface provides a chat-like experience for interacting with the agent, while the direct run option is suitable for scripting and automated workflows.
# List all GCS buckets
[user]: List all GCS buckets
# Create a bucket for embeddings and vector stores
[user]: Create a GCS bucket named "adk-embedding-vector-stores"
# Upload a document
[user]: Upload this PDF file to GCS bucket gs://adk-embedding-vector-stores/ and keep the same destination blob name
# Create a RAG corpus
[user]: Create a RAG corpus named "adk-embedding-vector-stores" with description "adk-embedding-vector-stores"
# Import a document into RAG corpus
[user]: Import the file gs://adk-embedding-vector-stores/emebddings-vector-stores.pdf into the RAG corpus
# Query a specific RAG corpus about prompt engineering
[user]: What is Chain of Thought (CoT)?
# Query across all corpora about MLOps
[user]: How do multiple teams collaborate to operationalize GenAI models?
Edit `rag/config/__init__.py` to customize your settings:
- `PROJECT_ID`: Your Google Cloud project ID
- `LOCATION`: Default location for Vertex AI and GCS resources
- `GCS_DEFAULT_*`: Defaults for GCS operations
- `RAG_DEFAULT_*`: Defaults for RAG operations
- `AGENT_*`: Settings for the agent
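A plausible shape for `rag/config/__init__.py` is sketched below. The specific constant names and default values here are illustrative; check the file in the repository for the actual settings:

```python
import os

# Core Google Cloud settings, read from the environment when available.
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", "your-project-id")
LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1")

# Defaults for GCS operations (illustrative values).
GCS_DEFAULT_LOCATION = "US"
GCS_DEFAULT_STORAGE_CLASS = "STANDARD"

# Defaults for RAG operations (illustrative values).
RAG_DEFAULT_CHUNK_SIZE = 512
RAG_DEFAULT_CHUNK_OVERLAP = 100
RAG_DEFAULT_TOP_K = 5

# Settings for the agent.
AGENT_MODEL = "gemini-2.0-flash"
```

Reading `PROJECT_ID` and `LOCATION` from the environment keeps the module consistent with the `export` commands shown in the Installation section.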
The engine supports various document types, including:
- PDF
- TXT
- DOC/DOCX
- XLS/XLSX
- PPT/PPTX
- CSV
- JSON
- HTML
- Markdown
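A simple pre-upload check against this list might look like the following (a hypothetical helper, not part of the project; PDF is included since it is the format used throughout the examples):

```python
from pathlib import Path

# Extensions accepted for import, per the list above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".txt", ".doc", ".docx", ".xls", ".xlsx",
    ".ppt", ".pptx", ".csv", ".json", ".html", ".md",
}

def is_supported(filename: str) -> bool:
    """Return True if the file extension is accepted for import."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported("promptengineering.pdf"))  # True
print(is_supported("archive.zip"))            # False
```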
- 403 Errors: Make sure you've authenticated with
gcloud auth application-default login
- Resource Exhausted: Check your quota limits in the GCP Console
- Upload Issues: Ensure your file format is supported and file size is within limits
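For transient "Resource Exhausted" quota errors, retrying with exponential backoff often helps. A minimal sketch (the flaky call here is a stand-in for any quota-limited API call):

```python
import time

def with_backoff(fn, retries: int = 4, base_delay: float = 1.0):
    """Call fn, retrying on any exception with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))

# Example: a call that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("Resource exhausted")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok
```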
Contributions are welcome! Please feel free to submit a Pull Request.
Below is a complete example workflow showing how to set up the entire RAG environment with the Google Gen AI Intensive course materials:
Create the following 7 Google Cloud Storage buckets for my project, using the default settings (location: US, storage class: STANDARD) for all of them. Do not ask for confirmation for each bucket.
1. adk-foundation-llm
2. adk-prompt-engineering
3. adk-embedding-vector-stores
4. adk-agents-llm
5. adk-agents-companion
6. adk-solving-domain-problem-using-llms
7. adk-operationalizing-genai-vertex-ai
Upload the file "promptengineering.pdf" to the GCS bucket gs://adk-prompt-engineering/ and use "promptengineering.pdf" as the destination blob name. Do not ask for confirmation.
Upload the file "foundational-large-language-models-text-generation.pdf" to the GCS bucket gs://adk-foundation-llm/ and use "foundational-large-language-models-text-generation.pdf" as the destination blob name. Do not ask for confirmation.
Upload the file "agents.pdf" to the GCS bucket gs://adk-agents-llm/ and use "agents.pdf" as the destination blob name. Do not ask for confirmation.
Upload the file "agents-companion.pdf" to the GCS bucket gs://adk-agents-companion/ and use "agents-companion.pdf" as the destination blob name. Do not ask for confirmation.
Upload the file "emebddings-vector-stores.pdf" to the GCS bucket gs://adk-embedding-vector-stores/ and use "emebddings-vector-stores.pdf" as the destination blob name. Do not ask for confirmation.
Upload the file "operationalizing-generative-ai-on-vertex-ai.pdf" to the GCS bucket gs://adk-operationalizing-genai-vertex-ai/ and use "operationalizing-generative-ai-on-vertex-ai.pdf" as the destination blob name. Do not ask for confirmation.
Upload the file "solving-domain-specific-problems-using-llms.pdf" to the GCS bucket gs://adk-solving-domain-problem-using-llms/ and use "solving-domain-specific-problems-using-llms.pdf" as the destination blob name. Do not ask for confirmation.
Create a RAG corpus named "adk-agents-companion" with description "adk-agents-companion" and import the file gs://adk-agents-companion/agents-companion.pdf into the RAG corpus.
Create a RAG corpus named "adk-agents-llm" with description "adk-agents-llm" and import the file gs://adk-agents-llm/agents.pdf into the RAG corpus.
Create a RAG corpus named "adk-embedding-vector-stores" with description "adk-embedding-vector-stores" and import the file gs://adk-embedding-vector-stores/emebddings-vector-stores.pdf into the RAG corpus.
Create a RAG corpus named "adk-foundation-llm" with description "adk-foundation-llm" and import the file gs://adk-foundation-llm/foundational-large-language-models-text-generation.pdf into the RAG corpus.
Create a RAG corpus named "adk-operationalizing-genai-vertex-ai" with description "adk-operationalizing-genai-vertex-ai" and import the file gs://adk-operationalizing-genai-vertex-ai/operationalizing-generative-ai-on-vertex-ai.pdf into the RAG corpus.
Create a RAG corpus named "adk-solving-domain-problem-using-llms" with description "adk-solving-domain-problem-using-llms" and import the file gs://adk-solving-domain-problem-using-llms/solving-domain-specific-problems-using-llms.pdf into the RAG corpus.
# Questions about Prompt Engineering
What is Chain of Thought (CoT)?
What is Tree of Thoughts (ToT)?
What is ReAct (reason & act)?
# Questions about Embeddings & Vector Stores
What are Types of embeddings?
What is Vector search?
What are Vector databases?
# Questions about Agents
What is Agent Lifecycle?
# Questions about MLOps & Operationalization
How do multiple teams collaborate to operationalize GenAI models?
How do multiple teams collaborate to operationalize both models and GenAI applications?
For more articles on AI/ML and Generative AI, follow me on Medium: https://medium.com/@arjun-prabhulal