# Install Libraries

In [26]:
%pip install --upgrade pip

Note: you may need to restart the kernel to use updated packages.


In [27]:
%pip install --upgrade vertexai

Note: you may need to restart the kernel to use updated packages.


In [28]:
%pip install --upgrade google-cloud-aiplatform

Collecting google-cloud-aiplatform
  Downloading google_cloud_aiplatform-1.138.0-py2.py3-none-any.whl.metadata (46 kB)
Downloading google_cloud_aiplatform-1.138.0-py2.py3-none-any.whl (8.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.2/8.2 MB 14.4 MB/s 0:00:00 eta 0:00:01
Installing collected packages: google-cloud-aiplatform
  Attempting uninstall: google-cloud-aiplatform
    Found existing installation: google-cloud-aiplatform 1.71.1
    Uninstalling google-cloud-aiplatform-1.71.1:
      Successfully uninstalled google-cloud-aiplatform-1.71.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vertexai 1.71.1 requires google-cloud-aiplatform[all]==1.71.1, but you have google-cloud-aiplatform 1.138.0 which is incompatible.
Successfully installed google-cloud-aiplatform-1.138.0
Note: you
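
The conflict in the log above arises because `vertexai 1.71.1` pins `google-cloud-aiplatform[all]==1.71.1`, while the last cell upgraded `google-cloud-aiplatform` on its own to 1.138.0. The following is a minimal sketch, using only the standard library, for checking whether the two installed distributions still report the same version. The helper names `installed_version` and `versions_agree` are illustrative and not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def versions_agree(*dist_names: str) -> bool:
    """True only when every named distribution is installed at the same version."""
    found = [installed_version(name) for name in dist_names]
    return None not in found and len(set(found)) == 1


for name in ("vertexai", "google-cloud-aiplatform"):
    print(name, installed_version(name))
print("versions agree:", versions_agree("vertexai", "google-cloud-aiplatform"))
```

If the check reports a mismatch, upgrading both packages in a single `%pip install` command and restarting the kernel is one way to let pip resolve them together.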

In [29]:
from google.cloud import aiplatform
from google.cloud.aiplatform import gapic

In [30]:
import vertexai
from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool



# Initialize Vertex AI

In [32]:

PROJECT_ID = "project-fad2cff6-f406-416a-940"
display_name = "test_corpus"

In [33]:
vertexai.init(project=PROJECT_ID, location="asia-south1")
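
The project ID and location are hardcoded in the cells above. As a hedged alternative, the sketch below reads them from environment variables and falls back to the notebook's values. The variable names `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` and the helper `resolve_setting` are assumptions made for illustration.

```python
import os


def resolve_setting(env_var: str, default: str) -> str:
    """Prefer a non-empty environment variable; otherwise use the default."""
    value = os.environ.get(env_var, "").strip()
    return value or default


# Assumed variable names; the defaults are the values used in this notebook.
PROJECT_ID = resolve_setting("GOOGLE_CLOUD_PROJECT", "project-fad2cff6-f406-416a-940")
LOCATION = resolve_setting("GOOGLE_CLOUD_LOCATION", "asia-south1")
print(PROJECT_ID, LOCATION)

# Then initialize as before:
# vertexai.init(project=PROJECT_ID, location=LOCATION)
```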

# Configure Embedding Model

In [None]:
embedding_model_config = rag.RagEmbeddingModelConfig(
    vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
        publisher_model="publishers/google/models/text-embedding-005"
    )
)


# Configure RAG Corpus

In [35]:
rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=embedding_model_config
    ),
)

# Upload, Embed, and Index Documents in the Corpus

In [36]:
rag.upload_file(
    corpus_name=rag_corpus.name,
    # path="/Users/kratikagupta/Downloads/ss123.pdf",
    path="/Users/kratikagupta/Downloads/250219-annual-report-and-accounts-2024.pdf",
    display_name=display_name,
)

RagFile(name='projects/1001816638724/locations/asia-south1/ragCorpora/5685794529555251200/ragFiles/5636158114308209236', display_name='test_corpus', description=None)
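
The `name` in the output above is a slash-separated resource path. A small sketch for splitting it into its components follows; the helper `parse_rag_file_name` is hypothetical and simply pattern-matches the shape of the name printed above.

```python
import re
from typing import Dict

# Shape taken from the RagFile name printed above.
_RAG_FILE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<corpus_id>[^/]+)"
    r"/ragFiles/(?P<file_id>[^/]+)$"
)


def parse_rag_file_name(name: str) -> Dict[str, str]:
    """Split a RagFile resource name into project, location, corpus and file IDs."""
    match = _RAG_FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a RagFile resource name: {name!r}")
    return match.groupdict()


parts = parse_rag_file_name(
    "projects/1001816638724/locations/asia-south1"
    "/ragCorpora/5685794529555251200/ragFiles/5636158114308209236"
)
print(parts["corpus_id"], parts["file_id"])
```

The `file_id` component is presumably the kind of value the optional `rag_file_ids` argument expects, though that is an assumption here rather than something this notebook verifies.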

# Define the Retrieval Config for Fetching Relevant Chunks from the Corpus

In [None]:

# Direct context retrieval
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5),  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus.name,  # Currently only 1 corpus is allowed.
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is HSBC's return on tangible equity for 2024?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

contexts {
  contexts {
    source_uri: "test_corpus"
    text: "From 1 January 2024, we have revised the adjustments made to \r\nreturn on average tangible equity (‘RoTE’). Prior to this, we adjusted \r\nRoTE for the impact of strategic transactions and the impairment of \r\nour investment in Bank of Communications Co., Limited (‘BoCom’), \r\nwhereas from 1 January 2024 we have excluded all notable items. \r\nThis was intended to improve alignment with the treatment of notable \r\nitems in our other income statement disclosures. Comparatives have \r\nbeen re-presented on the revised basis and we no longer disclose \r\nRoTE excluding strategic transactions and the impairment of BoCom. \r\nWe will now target a RoTE in the mid-teens in each of the three years \r\nfrom 2025 to 2027 excluding the impact of notable items. \r\nReconciliation of alternative performance measures\r\n122 HSBC Holdings plc Annual Report and Accounts 2024The following table details the adjustments made to reported
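
To illustrate what `top_k=3` and `vector_distance_threshold=0.5` in the retrieval config above are meant to do, here is a toy, pure-Python sketch. It assumes cosine distance as the metric and assumes the filter keeps chunks whose distance is below the threshold; the actual metric and ranking used by the service are not shown in this notebook. The function names and the two-dimensional toy vectors are made up for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def retrieve(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,
    max_distance: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank chunks by distance, drop those at or above the threshold, keep top_k."""
    scored = sorted((cosine_distance(query_vec, vec), text) for text, vec in chunks)
    return [(text, dist) for dist, text in scored if dist < max_distance][:top_k]


# Toy embeddings (hypothetical): the first axis loosely means "about RoTE".
toy_chunks = [
    ("RoTE definition", [0.9, 0.1]),
    ("dividend policy", [0.5, 0.5]),
    ("office locations", [0.0, 1.0]),
    ("RoTE target", [1.0, 0.0]),
]
for text, dist in retrieve([1.0, 0.0], toy_chunks):
    print(f"{dist:.3f}  {text}")
```

In this toy data the "office locations" chunk is orthogonal to the query (distance 1.0), so the threshold removes it even though `top_k` would otherwise allow a fourth candidate to compete.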

In [40]:
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=rag_corpus.name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)

# Initialize Generative Model

In [41]:
rag_model = GenerativeModel(
    model_name="gemini-2.5-flash", tools=[rag_retrieval_tool]
)



# Sample Responses

In [None]:
response = rag_model.generate_content("What is HSBC's return on tangible equity for 2024?")
print(response.text)

HSBC's return on average tangible equity (RoTE) for 2024 was 14.6%. Excluding notable items, the RoTE was 16.0% in 2024.


In [43]:
response = rag_model.generate_content("Tell me about Harsh")
print(response.text)

I am sorry, but there is no information about "Harsh" in the provided sources.


In [44]:
response = rag_model.generate_content("What is HSBC's return on tangible equity for 2025?")
print(response.text)

HSBC is targeting a return on average tangible equity (RoTE) in the mid-teens for each of the three years from 2025 to 2027, excluding notable items.


In [45]:
response = rag_model.generate_content("How does HSBC's 2024 financial performance compare to 2023?")
print(response.text)

HSBC's financial performance in 2024 compared to 2023 showed several changes:

*   **Profit before tax** rose by $2.0 billion to $32.3 billion in 2024 (from $30.3 billion in 2023). Profit after tax increased by $0.4 billion to $25.0 billion in 2024.
*   **Revenue** was stable at $65.9 billion in 2024. However, on a constant currency basis, revenue was $28.7 billion in 2024, up $1.8 billion or 7% compared to 2023, primarily driven by Wealth revenue.
*   **Net interest income** decreased to $32.7 billion in 2024 from $35.8 billion in 2023.
*   **Operating expenses** increased to $33.0 billion in 2024 from $32.1 billion in 2023.
*   **Return on average tangible equity** remained at 14.6% in both years.
*   **Common equity tier 1 capital ratio** increased to 14.9% in 2024 from 14.8% in 2023.
*   **Dividend per share** increased to $0.87 in 2024 (including a special dividend of $0.21) from $0.61 in 2023.
*   **Net new invested assets generated in Wealth** decreased to $64 billion in 2024 fr

In [46]:
response = rag_model.generate_content("What were the major challenging areas for HSBC in 2024?")
print(response.text)

In 2024, major challenging areas for HSBC included geopolitical and macroeconomic risks (such as political instability, trade restrictions, and distressed Chinese economic activity), digitalisation and technological advances (including AI and cybersecurity risks), financial crime risk, and ESG risks, all of which remained at heightened levels. Other significant challenges included model risk due to evolving regulatory requirements and new technologies, risks related to workforce capability, capacity, and employee retention, change execution risk, and the heightened risk of supply chain disruption.
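
The sample cells above repeat the same two lines for each question. A hedged sketch of a small helper that runs a batch of questions follows. It relies only on the `generate_content(...)` call and the `.text` attribute used above; `ask_all` and the `StubModel` stand-in are hypothetical names, and the stub exists only so the sketch runs without cloud access.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def ask_all(model: Any, questions: Iterable[str]) -> Dict[str, str]:
    """Send each question to the model and collect the answer text by question."""
    answers: Dict[str, str] = {}
    for question in questions:
        response = model.generate_content(question)
        answers[question] = response.text
    return answers


class StubModel:
    """Stand-in for rag_model; returns a canned reply instead of calling Vertex AI."""

    def generate_content(self, prompt: str) -> SimpleNamespace:
        return SimpleNamespace(text=f"(stub answer to: {prompt})")


questions = [
    "What is HSBC's return on tangible equity for 2024?",
    "What were the major challenging areas for HSBC in 2024?",
]
for question, answer in ask_all(StubModel(), questions).items():
    print(question, "->", answer)
```

In the notebook, passing `rag_model` in place of `StubModel()` should give the same kind of output as the individual cells above.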
