In [1]:

%pip install --upgrade --quiet google-cloud-aiplatform google-genai

Note: you may need to restart the kernel to use updated packages.


In [2]:

%pip install --upgrade --user --quiet google-cloud-aiplatform google-cloud-discoveryengine

Note: you may need to restart the kernel to use updated packages.


In [3]:

from IPython.display import Markdown, display
from google.genai.types import GenerateContentConfig, Retrieval, Tool, VertexRagStore
from vertexai import rag



In [4]:
# Optional: uncomment to restart the kernel so newly installed packages are picked up.
# import IPython
#
# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

In [5]:
# Use the environment variable if the user doesn't provide Project ID.
import os

from google import genai
import vertexai

PROJECT_ID = "iwd2025"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
# Fall back to the environment only when no real project ID was filled in above.
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

vertexai.init(project=PROJECT_ID, location=LOCATION)

# Create a Gen AI client that sends requests through Vertex AI
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

In [6]:

# Currently supports Google first-party embedding models
EMBEDDING_MODEL = "publishers/google/models/text-embedding-004"  # @param {type:"string", isTemplate: true}

# Create a Vertex AI RAG corpus that embeds documents with text-embedding-004
rag_corpus = rag.create_corpus(
    display_name="my-rag-corpus",
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=rag.RagEmbeddingModelConfig(
            vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
                publisher_model=EMBEDDING_MODEL
            )
        )
    ),
)

# Reference: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/rag-api
# Vertex AI RAG Engine is the component of the Vertex AI platform that supports
# Retrieval-Augmented Generation (RAG). It lets large language models (LLMs) access
# and incorporate data from external knowledge sources, such as documents and
# databases, which helps them generate more accurate and informative responses.

In [7]:

rag.list_corpora()

ListRagCorporaPager<rag_corpora {
  name: "projects/iwd2025/locations/us-central1/ragCorpora/2305843009213693952"
  display_name: "my-rag-corpus"
  create_time {
    seconds: 1741357986
    nanos: 581625000
  }
  update_time {
    seconds: 1741357986
    nanos: 581625000
  }
  corpus_status {
    state: ACTIVE
  }
  vector_db_config {
    rag_managed_db {
    }
    rag_embedding_model_config {
      vertex_prediction_endpoint {
        endpoint: "projects/iwd2025/locations/us-central1/publishers/google/models/text-embedding-004"
      }
    }
  }
}
rag_corpora {
  name: "projects/iwd2025/locations/us-central1/ragCorpora/3379951520341557248"
  display_name: "my-rag-corpus"
  create_time {
    seconds: 1741628569
    nanos: 683768000
  }
  update_time {
    seconds: 1741628569
    nanos: 683768000
  }
  corpus_status {
    state: ACTIVE
  }
  vector_db_config {
    rag_managed_db {
    }
    rag_embedding_model_config {
      vertex_prediction_endpoint {
        endpoint: "projects/iwd20

In [8]:
%%writefile test.md

Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by allowing them to access and incorporate external data sources when generating responses. Here's a breakdown:

**What it is:**

* **Combining Retrieval and Generation:**
    * RAG combines the strengths of information retrieval systems (like search engines) with the generative power of LLMs.
    * It enables LLMs to go beyond their pre-trained data and access up-to-date and specific information.
* **How it works:**
    * When a user asks a question, the RAG system first retrieves relevant information from external data sources (e.g., databases, documents, web pages).
    * This retrieved information is then provided to the LLM as additional context.
    * The LLM uses this augmented context to generate a more accurate and informative response.

**Why it's helpful:**

* **Access to Up-to-Date Information:**
    * LLMs are trained on static datasets, so their knowledge can become outdated. RAG allows them to access real-time or frequently updated information.
* **Improved Accuracy and Factual Grounding:**
    * RAG reduces the risk of LLM "hallucinations" (generating false or misleading information) by grounding responses in verified external data.
* **Enhanced Contextual Relevance:**
    * By providing relevant context, RAG enables LLMs to generate more precise and tailored responses to specific queries.
* **Increased Trust and Transparency:**
    * RAG can provide source citations, allowing users to verify the information and increasing trust in the LLM's responses.
* **Cost Efficiency:**
    * Rather than constantly retraining large language models, RAG allows new data to be introduced in a more cost-effective way.

In essence, RAG bridges the gap between the vast knowledge of LLMs and the need for accurate, current, and contextually relevant information.


Overwriting test.md
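
The retrieve, augment, generate flow described in test.md can be sketched without any cloud service. This is a toy illustration only: word-count vectors stand in for a real embedding model, and every name in it (`tokenize`, `retrieve`, `build_prompt`, the `toy_*` variables) is hypothetical and not part of the Vertex AI SDK.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Step 1: rank documents by similarity to the question and keep the best top_k."""
    query_vec = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, contexts: list[str]) -> str:
    """Step 2: prepend the retrieved passages to the question as extra context."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {question}"


toy_documents = [
    "RAG retrieves external documents and passes them to the model as context.",
    "Fine-tuning updates the weights of a model on new training data.",
    "A vector database stores embeddings for similarity search.",
]
toy_question = "How does RAG use external documents?"
toy_prompt = build_prompt(toy_question, retrieve(toy_question, toy_documents))

# Step 3 would send this augmented prompt to the LLM.
print(toy_prompt)
```

In the notebook itself, RAG Engine handles the embedding and retrieval steps, and the model call receives the retrieved chunks through the retrieval tool rather than through a hand-built prompt.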


In [9]:
# Upload a single local file directly into the corpus
rag_file = rag.upload_file(
    corpus_name=rag_corpus.name,
    path="test.md",
    display_name="test.md",
    description="my test file",
)

In [10]:
# Import every file from a Cloud Storage bucket into the corpus
INPUT_GCS_BUCKET = "gs://grammarbucketrag/"

response = rag.import_files(
    corpus_name=rag_corpus.name,
    paths=[INPUT_GCS_BUCKET],  # source bucket
    # Optional: split each file into chunks of 1024 tokens that overlap by 100 tokens
    transformation_config=rag.TransformationConfig(
        chunking_config=rag.ChunkingConfig(chunk_size=1024, chunk_overlap=100)
    ),
    max_embedding_requests_per_min=900,  # Optional: rate limit for embedding requests
)

In [11]:
response

imported_rag_files_count: 3
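
To make `chunk_size` and `chunk_overlap` more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not how RAG Engine is implemented internally; it assumes a plain sliding window over a token list, and `chunk_with_overlap` and the `toy_*` names are hypothetical helpers for illustration.

```python
def chunk_with_overlap(
    tokens: list[str], chunk_size: int, chunk_overlap: int
) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


# Small numbers so the overlap is easy to see; the notebook uses 1024 and 100.
toy_tokens = [f"w{i}" for i in range(10)]
toy_chunks = chunk_with_overlap(toy_tokens, chunk_size=4, chunk_overlap=1)
for chunk in toy_chunks:
    print(chunk)
```

Each chunk repeats the last token of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk. Larger overlap improves that continuity at the cost of more chunks to embed.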

In [12]:
rag.list_files('projects/540591311657/locations/us-central1/ragCorpora/6838716034162098176')

ListRagFilesPager<rag_files {
  name: "projects/540591311657/locations/us-central1/ragCorpora/6838716034162098176/ragFiles/5386472047294458114"
  display_name: "test.md"
  description: "my test file"
  create_time {
    seconds: 1741629133
    nanos: 784242000
  }
  update_time {
    seconds: 1741629133
    nanos: 784242000
  }
  direct_upload_source {
  }
  file_status {
    state: ACTIVE
  }
}
rag_files {
  name: "projects/540591311657/locations/us-central1/ragCorpora/6838716034162098176/ragFiles/5386472098916773497"
  display_name: "7b2d049c2da59f7764990a3588d518c3.pdf"
  create_time {
    seconds: 1741629139
    nanos: 937508000
  }
  update_time {
    seconds: 1741629139
    nanos: 937508000
  }
  gcs_source {
    uris: "gs://grammarbucketrag/7b2d049c2da59f7764990a3588d518c3.pdf"
  }
  file_status {
    state: ACTIVE
  }
}
rag_files {
  name: "projects/540591311657/locations/us-central1/ragCorpora/6838716034162098176/ragFiles/5386472101001005969"
  display_name: "Sicher_B2_Grammat

In [13]:
# Create a retrieval tool for the RAG corpus: return up to 10 of the most similar
# chunks (similarity_top_k), subject to the vector distance threshold of 0.5
rag_retrieval_tool = Tool(
    retrieval=Retrieval(
        vertex_rag_store=VertexRagStore(
            rag_corpora=[rag_corpus.name],
            similarity_top_k=10,
            vector_distance_threshold=0.5,
        )
    )
)
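
The interaction between `similarity_top_k` and `vector_distance_threshold` can be illustrated with a small stand-alone sketch. It assumes cosine distance and an inclusive cutoff, both of which are assumptions for illustration and may differ from what the service actually does; `cosine_distance`, `top_k_within_threshold`, and the `toy_*` names are hypothetical.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k_within_threshold(
    query: list[float],
    candidates: dict[str, list[float]],
    top_k: int,
    max_distance: float,
) -> list[str]:
    """Return up to top_k candidate names, nearest first, within max_distance."""
    scored = sorted(
        (cosine_distance(query, vec), name) for name, vec in candidates.items()
    )
    return [name for dist, name in scored[:top_k] if dist <= max_distance]


toy_query = [1.0, 0.0]
toy_candidates = {
    "same direction": [1.0, 0.0],    # distance 0.0
    "somewhat close": [1.0, 1.0],    # distance about 0.29
    "orthogonal": [0.0, 1.0],        # distance 1.0
    "opposite": [-1.0, 0.0],         # distance 2.0
}
toy_hits = top_k_within_threshold(toy_query, toy_candidates, top_k=3, max_distance=0.5)
print(toy_hits)
```

Under these assumptions the threshold can return fewer than `top_k` results: here three were requested, but only two candidates fall within the distance cutoff.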

In [14]:

MODEL_ID = "gemini-2.0-flash-001"

In [15]:

response = client.models.generate_content(
    model=MODEL_ID,
    # German prompt, kept as originally run: "Explain the two-part connectors to me"
    contents="Erklärt mir die zweiteilige konnentoren",
    config=GenerateContentConfig(tools=[rag_retrieval_tool]),
)

display(Markdown(response.text))

Zweiteilige Konnektoren haben verschiedene Funktionen: Aufzählungen, Alternativen, Gegensätze und Einschränkungen. Sie können an verschiedenen Positionen stehen.

Beispiele:

*   **Aufzählung (positiv):** Wir haben sowohl in derselben Firma gearbeitet als auch im selben Chor gesungen.
*   **Aufzählung (negativ):** Es macht weder meinem Freund noch mir etwas aus.
*   **Alternative:** Entweder gehen wir etwas essen oder wir treffen uns zu Hause.
*   **Gegensatz:** Einerseits würde ich ihn gern treffen, andererseits bringt das nichts.
*   **Einschränkung:** Wir sehen uns zwar nicht mehr oft, aber wir bleiben Freunde.

In [16]:
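# Note pasted from the grammar material, kept as comments so the cell runs
# (running the raw text as code raised a SyntaxError):
# Modal verbs can also be used without an infinitive (= as a full verb):
#   Muss ich Petra schreiben?  (Do I have to write to Petra?)
#   Du musst nicht, aber du kannst, wenn du willst.  (You don't have to, but you can if you want.)
#   Nikolas möchte einen Kaffee.  (Nikolas would like a coffee.)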