<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/managed/manage_retrieval_benchmark.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Semantic Retriever Benchmark

In this notebook, we compare several retrieval services:
* Google Semantic Retrieval
* LlamaIndex default Retrieval
* Vectara Managed Retrieval

## Installation

In [None]:
%pip install llama-index
%pip install "google-ai-generativelanguage>=0.4,<=1.0"

### Google Authentication Overview

The Google Semantic Retriever API lets you perform semantic search on your own data. Since it's **your data**, this needs stricter access controls than API keys. Authenticate with OAuth through service accounts or through your user credentials. This quickstart uses a simplified authentication approach for a testing environment, and a service account setup is typically the easier way to start. For a production environment, learn about [authentication and authorization](https://developers.google.com/workspace/guides/auth-overview) before choosing the [access credentials](https://developers.google.com/workspace/guides/create-credentials#choose_the_access_credential_that_is_right_for_you) that are appropriate for your app.

Demo recording for authenticating using service accounts: [Demo](https://drive.google.com/file/d/199LzrdhuuiordS15MJAxVrPKAwEJGPOh/view?usp=sharing)

**Note**: At this time, the Google Generative AI Semantic Retriever API is [only available in certain regions](https://ai.google.dev/available_regions).

#### Authentication (Option 1): OAuth using service accounts

Google Auth [service accounts](https://cloud.google.com/iam/docs/service-account-overview) let an application authenticate to make authorized Google API calls. To authenticate with OAuth using service accounts, follow the steps below:

1. Enable the `Generative Language API`: [Documentation](https://developers.generativeai.google/tutorials/oauth_quickstart#1_enable_the_api)

1. Create the Service Account by following the [documentation](https://developers.google.com/identity/protocols/oauth2/service-account#creatinganaccount).

 * After creating the service account, generate a service account key.

1. Upload your service account file by using the file icon on the left sidebar, then the upload icon, as shown in the screenshot below.

<img width=400 src="https://developers.generativeai.google/tutorials/images/colab_upload.png">
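Before loading the key, it can help to confirm that the uploaded file really is a service account key and not, for example, an OAuth client secret. The sketch below is a minimal sanity check that uses only the standard library. The helper name `check_service_account_key` and the list of fields it checks are illustrative choices, not part of any Google library, and it assumes the key was uploaded as `service_account_key.json`.

```python
import json
from pathlib import Path

# Fields assumed to be present in a service account key file (illustrative subset).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path):
    """Return a list of problems found in the key file (empty list if none)."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"file not found: {key_path}"]
    try:
        data = json.loads(key_path.read_text())
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    # An OAuth client secret file has no "type": "service_account" entry.
    if "type" in data and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


# Prints an empty list when the file looks like a service account key.
print(check_service_account_key("service_account_key.json"))
```

An empty list only means the file has the expected shape; it does not prove that the key is valid or that the service account has the required permissions.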

In [None]:
%pip install google-auth-oauthlib

In [None]:
from google.oauth2 import service_account
from llama_index.indices.managed.google.generativeai import (
    GoogleIndex,
    set_google_config,
)

credentials = service_account.Credentials.from_service_account_file(
    "service_account_key.json",
    scopes=[
        "https://www.googleapis.com/auth/cloud-platform",
        "https://www.googleapis.com/auth/generative-language.retriever",
    ],
)

set_google_config(auth_credentials=credentials)

#### Authentication (Option 2): OAuth using user credentials

Please follow the [OAuth Quickstart](https://developers.generativeai.google/tutorials/oauth_quickstart) to set up OAuth using user credentials. Below is an overview of the required steps from the documentation.

1. Enable the `Generative Language API`: [Documentation](https://developers.generativeai.google/tutorials/oauth_quickstart#1_enable_the_api)

1. Configure the OAuth consent screen: [Documentation](https://developers.generativeai.google/tutorials/oauth_quickstart#2_configure_the_oauth_consent_screen)

1. Authorize credentials for a desktop application: [Documentation](https://developers.generativeai.google/tutorials/oauth_quickstart#3_authorize_credentials_for_a_desktop_application)
  * If you want to run this notebook in Colab, start by uploading your `client_secret*.json` file using the "File > Upload" option.
  * Rename the uploaded file to `client_secret.json` or change the variable `client_file_name` in the code below.

<img width=400 src="https://developers.generativeai.google/tutorials/images/colab_upload.png">


**Note**: At this time, the Google Generative AI Semantic Retriever API is [only available in certain regions](https://developers.generativeai.google/available_regions).

In [None]:
# Replace TODO-your-project-name with the project used in the OAuth Quickstart
project_name = "TODO-your-project-name"  #  @param {type:"string"}
# Replace TODO-your-email@gmail.com with the email added as a test user in the OAuth Quickstart
email = "TODO-your-email@gmail.com"  #  @param {type:"string"}
# Replace client_secret.json with the client_secret_* file name you uploaded.
client_file_name = "client_secret.json"

# IMPORTANT: Follow the instructions from the output - you must copy the command
# to your terminal and copy the output after authentication back here.
!gcloud config set project $project_name
!gcloud config set account $email

# NOTE: The simplified project setup in this tutorial triggers a "Google hasn't verified this app." dialog.
# This is normal, click "Advanced" -> "Go to [app name] (unsafe)"
!gcloud auth application-default login --no-browser --client-id-file=$client_file_name --scopes="https://www.googleapis.com/auth/generative-language.retriever,https://www.googleapis.com/auth/cloud-platform"

This will provide you with a URL, which you should open in your local browser.
Follow the instructions to complete the authentication and authorization.

## Download Paul Graham Data

In [None]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Ground truth for the query `"which program did this author attend?"`

Wiki Link: https://en.wikipedia.org/wiki/Paul_Graham_(programmer)

Answer from Wiki:
```
Graham and his family moved to Pittsburgh, Pennsylvania in 1968, where he later attended Gateway High School. Graham gained interest in science and mathematics from his father who was a nuclear physicist.[8]

Graham received a Bachelor of Arts with a major in philosophy from Cornell University in 1986.[9][10][11] He then received a Master of Science in 1988 and a Doctor of Philosophy in 1990, both in computer science from Harvard University.[9][12]

Graham has also studied painting at the Rhode Island School of Design and at the Accademia di Belle Arti in Florence.[9][12]
```

## Google Semantic Retrieval

In [None]:
import os

GOOGLE_API_KEY = ""  # add your GOOGLE API key here
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

In [None]:
from llama_index import SimpleDirectoryReader
from llama_index.indices.managed.google.generativeai import GoogleIndex

# Create a Google corpus.
index = GoogleIndex.create_corpus(display_name="My first corpus!")
print(f"Newly created corpus ID is {index.corpus_id}.")

# Ingestion.
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
index.insert_documents(documents)

### Google Semantic Retrieval: Using default query engine

In [None]:
# Querying.
# print(f"Newly created corpus ID is {index.corpus_id}.")
query_engine = index.as_query_engine()
response = query_engine.query("which program did this author attend?")
print(response)

Newly created corpus ID is 18e88264-68af-4c5d-a970-a3c0f9d18452.
The author attended Cornell, MIT, and Yale.


### Google Semantic Retrieval: Using `Verbose` Answer Style

In [None]:
from google.ai.generativelanguage import (
    GenerateAnswerRequest,
)

query_engine = index.as_query_engine(
    # Extra parameters specific to the Google query engine.
    temperature=0.7,
    answer_style=GenerateAnswerRequest.AnswerStyle.VERBOSE,
)

response = query_engine.query("Which program did this author attend?")
print(response)

The author attended Cornell University for their undergraduate studies, where they majored in Computer Science and minored in Philosophy. They then attended Harvard University for their graduate studies, where they studied Computer Science and wrote their dissertation on Lisp programming.


### Google Semantic Retrieval: Using `Abstractive` Answer Style

In [None]:
from google.ai.generativelanguage import (
    GenerateAnswerRequest,
)

query_engine = index.as_query_engine(
    # Extra parameters specific to the Google query engine.
    temperature=0.7,
    answer_style=GenerateAnswerRequest.AnswerStyle.ABSTRACTIVE,
)

response = query_engine.query("Which program did this author attend?")
print(response)

This author attended Cornell University and Harvard University.


### Google Semantic Retrieval: Using `Extractive` Answer Style

In [None]:
from google.ai.generativelanguage import (
    GenerateAnswerRequest,
)

query_engine = index.as_query_engine(
    # Extra parameters specific to the Google query engine.
    temperature=0.7,
    answer_style=GenerateAnswerRequest.AnswerStyle.EXTRACTIVE,
)

response = query_engine.query("Which program did this author attend?")
print(response)

Cornell
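The three answer-style cells above differ only in the `answer_style` argument, so the comparison can be run in one loop. The sketch below shows one way to do that. `compare_answer_styles` and `EchoEngine` are hypothetical names introduced here; `EchoEngine` is a stand-in so the example runs without credentials, and the commented lines at the end show how the real `index` from this notebook might be plugged in.

```python
def compare_answer_styles(make_engine, styles, question):
    """Build one query engine per answer style and collect the answers as strings."""
    answers = {}
    for label, style in styles.items():
        engine = make_engine(style)
        answers[label] = str(engine.query(question))
    return answers


class EchoEngine:
    """Stand-in query engine (not a LlamaIndex class) that echoes its style."""

    def __init__(self, style):
        self.style = style

    def query(self, question):
        return f"[{self.style}] {question}"


demo = compare_answer_styles(
    EchoEngine,
    {"verbose": "VERBOSE", "abstractive": "ABSTRACTIVE", "extractive": "EXTRACTIVE"},
    "Which program did this author attend?",
)
for label, answer in demo.items():
    print(f"{label:<12} {answer}")

# With the Google index from this notebook, the call might look like this:
# styles = {
#     "verbose": GenerateAnswerRequest.AnswerStyle.VERBOSE,
#     "abstractive": GenerateAnswerRequest.AnswerStyle.ABSTRACTIVE,
#     "extractive": GenerateAnswerRequest.AnswerStyle.EXTRACTIVE,
# }
# compare_answer_styles(
#     lambda style: index.as_query_engine(temperature=0.7, answer_style=style),
#     styles,
#     "Which program did this author attend?",
# )
```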


### Google Semantic Retrieval: Advanced Retrieval with LlamaIndex Reranking
* Gemini as the reranker LLM
* Abstractive answer style for the response

In [None]:
from google.ai.generativelanguage import GenerateAnswerRequest
from llama_index.response_synthesizers.google.generativeai import (
    GoogleTextSynthesizer,
)
from llama_index.vector_stores.google.generativeai import (
    GoogleVectorStore,
    google_service_context,
)
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.llms import Gemini
from llama_index.postprocessor import LLMRerank
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.retrievers import VectorIndexRetriever
from llama_index.embeddings import GeminiEmbedding


# Set up the query engine with a reranker.

response_synthesizer = GoogleTextSynthesizer.from_defaults(
    temperature=0.7, answer_style=GenerateAnswerRequest.AnswerStyle.ABSTRACTIVE
)

embed_model = GeminiEmbedding(
    model_name="models/embedding-001", api_key=GOOGLE_API_KEY
)

reranker = LLMRerank(
    top_n=5,
    service_context=ServiceContext.from_defaults(
        llm=Gemini(api_key=GOOGLE_API_KEY), embed_model=embed_model
    ),
)
retriever = index.as_retriever(similarity_top_k=5)
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[reranker],
)

# Query for better result!
response = query_engine.query("Which program did this author attend?")

In [None]:
print(response.response)

The author attended Cornell, Harvard, RISD, and the Accademia di Belli Arti.
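The cell above follows a two-stage pattern: retrieve a handful of candidates, then let a second scorer reorder them before synthesis. The toy sketch below illustrates that pattern in plain Python. It is not how `LLMRerank` is implemented: the function names are hypothetical, the chunks are made-up sentences, and a keyword counter stands in for the LLM relevance judgment.

```python
def retrieve_then_rerank(query, chunks, first_stage_score, rerank_score, top_k=5, top_n=5):
    """Keep the top_k chunks by a cheap score, then reorder them with a costlier one."""
    candidates = sorted(chunks, key=lambda c: first_stage_score(query, c), reverse=True)[:top_k]
    reranked = sorted(candidates, key=lambda c: rerank_score(query, c), reverse=True)
    return reranked[:top_n]


def word_overlap(query, chunk):
    """Cheap first-stage score: number of distinct words shared by query and chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def keyword_relevance(query, chunk):
    """Hypothetical stand-in for an LLM relevance judgment (ignores the query)."""
    relevant = {"attend", "program", "phd", "harvard"}
    return len(relevant & set(chunk.lower().split()))


# Made-up toy sentences, not text from the essay.
chunks = [
    "which essay did the author write",
    "he would attend a phd program at harvard",
    "the author likes painting",
    "lisp is a programming language",
]

result = retrieve_then_rerank(
    "which program did the author attend",
    chunks,
    first_stage_score=word_overlap,
    rerank_score=keyword_relevance,
    top_k=3,
    top_n=2,
)
print(result)
```

In this toy data the first stage ranks the essay sentence highest because it shares the most words with the query, and the second stage moves the sentence about the PhD program to the front.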


## LlamaIndex Default Baseline with OpenAI embedding and GPT as LLM for Synthesizer 

In [None]:
import os

OPENAI_API_TOKEN = "sk-"  # add your OpenAI API key here
os.environ["OPENAI_API_KEY"] = OPENAI_API_TOKEN

In [None]:
# Requires the qdrant-client package (%pip install qdrant-client).
from llama_index import VectorStoreIndex, StorageContext, ServiceContext
from llama_index.vector_stores import QdrantVectorStore
import qdrant_client


# Create a local Qdrant vector store
client = qdrant_client.QdrantClient(path="qdrant_retrieval_10")
vector_store = QdrantVectorStore(client=client, collection_name="collection")

# Build the contexts first so the index actually uses the Qdrant store
# and the 256-token chunk size.
service_context = ServiceContext.from_defaults(chunk_size=256)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

qdrant_index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
)

In [None]:
query_engine = qdrant_index.as_query_engine()
response = query_engine.query("Which program did this author attend?")
print(response)

The author attended the Accademia di Belli Arti.


## LlamaIndex Default Configuration with LLM Reranker and Tree Summarize for Response

In [None]:
from llama_index import get_response_synthesizer


reranker = LLMRerank(top_n=4, service_context=service_context)
# Retrieve from the LlamaIndex (Qdrant-backed) index, not the Google index.
retriever = qdrant_index.as_retriever(similarity_top_k=4)
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    response_synthesizer=get_response_synthesizer(
        service_context=service_context,
        response_mode="tree_summarize",
    ),
    node_postprocessors=[reranker],
)

response = query_engine.query("Which program did this author attend?")

In [None]:
# for r in response.source_nodes:
#     print(r.text)
print(response.response)

The author attended the BFA program at RISD (Rhode Island School of Design).
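The `tree_summarize` response mode builds its answer bottom-up: groups of retrieved chunks are summarized, and the summaries are summarized again until a single response remains. The sketch below illustrates that idea under simplifying assumptions. It is not LlamaIndex's implementation: it groups a fixed number of chunks (`fan_in`) rather than packing by token budget, and the `summarize` callable is a hypothetical stand-in for the LLM call.

```python
def tree_summarize(chunks, summarize, fan_in=2):
    """Repeatedly summarize groups of fan_in items until one summary remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    while True:
        # Summarize each group of up to fan_in neighbours into one item.
        level = [summarize(level[i : i + fan_in]) for i in range(0, len(level), fan_in)]
        if len(level) == 1:
            return level[0]


def show_tree(group):
    """Stand-in for an LLM summary: parenthesize the group to expose the tree shape."""
    return "(" + " + ".join(group) + ")"


tree = tree_summarize(["a", "b", "c", "d", "e"], show_tree, fan_in=2)
print(tree)
```

The nesting of the parentheses in the printed string mirrors the levels of the summary tree.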


## Vectara Managed Index and Retrieval

In [None]:
from llama_index import SimpleDirectoryReader
from llama_index.indices import VectaraIndex

In [None]:
vectara_customer_id = ""
vectara_corpus_id = ""
vectara_api_key = ""


documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
vectara_index = VectaraIndex.from_documents(
    documents,
    vectara_customer_id=vectara_customer_id,
    vectara_corpus_id=vectara_corpus_id,
    vectara_api_key=vectara_api_key,
)

In [None]:
vectara_query_engine = vectara_index.as_query_engine(similarity_top_k=5)
response = vectara_query_engine.query("Which program did this author attend?")
# texts = [t.node.text for t in response]
# print("\n--\n".join(texts))
print(response)

The author attended Cornell University [1] and Harvard University [2]. They applied to RISD, but ended up attending Rhode Island School of Design (RISD) [3]. Additionally, the author was in a PhD program in computer science [5]. The search results indicate that the author had a diverse academic background and pursued different programs at various institutions.
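To compare the recorded answers side by side, one option is a crude keyword check against the ground truth quoted earlier. The sketch below is only that: the alias lists and the helper `institutions_mentioned` are assumptions made for illustration, only the four higher-education institutions from the Wikipedia excerpt are counted, and the answers are copied from the cell outputs in this notebook. It measures coverage only, so it does not penalize incorrect additions such as the "MIT" and "Yale" in the default Google answer.

```python
# Ground-truth institutions with lowercase aliases (illustrative choices).
GROUND_TRUTH = {
    "Cornell": ["cornell"],
    "Harvard": ["harvard"],
    "RISD": ["risd", "rhode island school of design"],
    "Accademia di Belle Arti": ["accademia"],
}

# Answers copied from the outputs recorded in this notebook.
RECORDED = {
    "Google default": "The author attended Cornell, MIT, and Yale.",
    "Google verbose": "The author attended Cornell University for their undergraduate studies, where they majored in Computer Science and minored in Philosophy. They then attended Harvard University for their graduate studies, where they studied Computer Science and wrote their dissertation on Lisp programming.",
    "Google abstractive": "This author attended Cornell University and Harvard University.",
    "Google extractive": "Cornell",
    "Google + LLM rerank": "The author attended Cornell, Harvard, RISD, and the Accademia di Belli Arti.",
    "LlamaIndex default": "The author attended the Accademia di Belli Arti.",
    "LlamaIndex rerank + tree summarize": "The author attended the BFA program at RISD (Rhode Island School of Design).",
    "Vectara": "The author attended Cornell University [1] and Harvard University [2]. They applied to RISD, but ended up attending Rhode Island School of Design (RISD) [3]. Additionally, the author was in a PhD program in computer science [5]. The search results indicate that the author had a diverse academic background and pursued different programs at various institutions.",
}


def institutions_mentioned(answer):
    """Return the ground-truth institutions whose aliases appear in the answer."""
    text = answer.lower()
    return [
        name
        for name, aliases in GROUND_TRUTH.items()
        if any(alias in text for alias in aliases)
    ]


for label, answer in RECORDED.items():
    found = institutions_mentioned(answer)
    print(f"{label:<36} {len(found)}/{len(GROUND_TRUTH)}  {', '.join(found)}")
```

A single query is far too small a sample to rank the services; the check is meant as a starting point for a larger question set.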
