<a href="https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/examples/managed/vectaraDemo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Vectara Managed Index
In this notebook we show how to use [Vectara](https://vectara.com) with LlamaIndex.
Vectara is the first example of a "Managed" Index, a new type of index in LlamaIndex that is managed via an API.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [None]:
!pip install llama-index

In [None]:
import os  # needed for os.path.abspath when loading the documents

from llama_index import SimpleDirectoryReader
from llama_index.indices import VectaraIndex

### Loading documents
Load the Uber 10-Q documents stored in `../data/10q/` using the `SimpleDirectoryReader`:

In [None]:
documents = SimpleDirectoryReader(os.path.abspath("../data/10q/")).load_data()
print(f"documents loaded into {len(documents)} document objects")
print(f"Document ID of first doc is {documents[0].doc_id}")

documents loaded into 305 document objects
Document ID of first doc is 812706e0-f35d-439f-8083-c5a9f7dd5c89


### Add the content of the documents into a pre-created Vectara corpus
Here we assume an empty corpus has already been created and that its details are available as environment variables:
* VECTARA_CORPUS_ID
* VECTARA_CUSTOMER_ID
* VECTARA_API_KEY
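
Since indexing depends on these three variables, it can help to check them before calling `VectaraIndex.from_documents`. The following is a minimal sketch using only the standard library; the helper name `missing_vectara_vars` is hypothetical and not part of LlamaIndex or Vectara.

```python
import os

# The three variables listed above.
REQUIRED_VARS = ("VECTARA_CORPUS_ID", "VECTARA_CUSTOMER_ID", "VECTARA_API_KEY")


def missing_vectara_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vectara_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All Vectara environment variables are set.")
```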

In [None]:
index = VectaraIndex.from_documents(documents)

### Query the Vectara Index
We can now ask questions using the VectaraIndex retriever.

In [None]:
query = "Is Uber still losing money or have they achieved profitability?"

First we use the retriever to list the returned documents:

In [None]:
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.retrieve(query)
texts = [t.node.text for t in response]
print("\n--\n".join(texts))

The court granted our motion to defer the summary judgment motion on January 12, 2022. Our chances of success on
the merits are still uncertain and any reasonably possible loss or range of loss cannot be estimated. Swiss Social Security Reclassification
Several Swiss administrative bodies have issued decisions in which they classify Drivers as employees of Uber Switzerland, Rasier Operations B.V. or of Uber
B.V. for social security or labor purposes. We are challenging each of them before the Social Security and Administrative Tribunals. In April 2021, a ruling was made that Uber Switzerland could not be held liable for social security contributions.
--
The court granted our motion to defer the summary judgment motion on January 12, 2022. Our chances of success on
the merits are still uncertain and any reasonably possible loss or range of loss cannot be estimated. Swiss Social Security Reclassification
Several Swiss administrative bodies have issued decisions in which they classify Dri

With `as_query_engine()`, we can ask questions and get responses generated by Vectara's full RAG pipeline:

In [None]:
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query(query)
print(response)

Uber's financial situation remains uncertain, and it is difficult to estimate their potential losses or achieve profitability [1][5]. They face challenges in various jurisdictions, including reclassification of drivers for social security or labor purposes [1][5]. Uber Payments B.V., their subsidiary in the Netherlands, is authorized to provide payment services [3][4]. However, the search results do not provide a clear and direct answer regarding Uber's current profitability or ongoing losses.


Note that the `response` object above includes both the summary text and the source documents used to produce it (citations).
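
To inspect those citations, you can iterate over the source nodes attached to the response. The sketch below assumes the response is a standard LlamaIndex `Response` exposing `source_nodes`, where each item has `.node.text` and `.score` (the same shape the retriever output above uses). The helper `list_citations` is a hypothetical name, and the stand-in object at the end only demonstrates the helper without a live Vectara corpus.

```python
from types import SimpleNamespace


def list_citations(response, max_chars=80):
    """Return one '[n] score=...: snippet' line per source node on a response."""
    lines = []
    for i, item in enumerate(response.source_nodes, start=1):
        # Collapse newlines and truncate so each citation fits on one line.
        snippet = item.node.text.strip().replace("\n", " ")[:max_chars]
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        lines.append(f"[{i}] score={score}: {snippet}")
    return lines


# Stand-in object for illustration only; in the notebook, pass the
# `response` returned by query_engine.query(query) instead.
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(node=SimpleNamespace(text="Example passage text"), score=0.5)
    ]
)
print("\n".join(list_citations(fake_response)))
```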

Vectara supports maximal marginal relevance (MMR) natively in the backend, and it is available as a query mode.
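
For intuition about what a diversity bias does, here is a minimal sketch of the textbook greedy MMR reranking in plain Python. It is not Vectara's backend implementation, which this notebook does not show. The function names and the way `diversity_bias` weights relevance against redundancy are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(query_vec, doc_vecs, top_k, diversity_bias):
    """Greedy MMR (illustrative): return indices of the selected documents.

    Each step picks the candidate maximizing
        (1 - diversity_bias) * relevance - diversity_bias * redundancy,
    where redundancy is the highest similarity to any already-selected doc.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < top_k:

        def score(i):
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return (1 - diversity_bias) * relevance[i] - diversity_bias * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy vectors: documents 0 and 1 are near-duplicates, document 2 differs.
docs = [[1.0, 0.0], [1.0, 0.01], [0.6, 0.8]]
query = [1.0, 0.0]
print(mmr_rerank(query, docs, top_k=2, diversity_bias=0.0))  # pure relevance
print(mmr_rerank(query, docs, top_k=2, diversity_bias=0.7))  # favors diversity
```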
Let's see an example of how to use MMR. We run the same query, "Is Uber still losing money or have they achieved profitability?", but this time with `vectara_query_mode="mmr"` and `mmr_diversity_bias=1.0`, which puts the maximum weight on diversity:

In [None]:
query_engine = index.as_query_engine(
    similarity_top_k=5,
    n_sentences_before=2,
    n_sentences_after=2,
    vectara_query_mode="mmr",
    vectara_kwargs={"mmr_k": 50, "mmr_diversity_bias": 1.0},
)
response = query_engine.retrieve(query)

texts = [t.node.text for t in response]
print("\n--\n".join(texts))

Taxing authorities have appealed the orders related to tax issues and plan confirmation but did not
appeal the settlement approval. Uber is not a party to those appeals. The taxing authorities’ chances of success on the merits are still uncertain and any reasonably
possible loss or range of loss is immaterial. Non-Income Tax Matters
We recorded an estimated liability for contingencies related to non-income tax matters and are under audit by various domestic and foreign tax authorities with
regard to such matters. The subject matter of these contingent liabilities and non-income tax audits primarily arises from our transactions with Drivers, as well as
the tax treatment of certain employee benefits and related employment taxes.
--
We are challenging each of them before the Social Security and Administrative Tribunals. In April 2021, a ruling was made that Uber Switzerland could not be held liable for social security contributions. The litigations with regards to Uber B.V. and
Rasier Ope

As you can see, the results in this case are much more diverse; for example, they do not contain the same text more than once. The response is also better, since the LLM had a more diverse set of facts to ground its response on:

In [None]:
query_engine = index.as_query_engine(
    similarity_top_k=5,
    n_sentences_before=2,
    n_sentences_after=2,
    summary_enabled=True,
    vectara_query_mode="mmr",
    vectara_kwargs={"mmr_k": 50, "mmr_diversity_bias": 1.0},
)
response = query_engine.query(query)
print(response)

Uber has faced various legal challenges regarding its classification of drivers and social security contributions [2, 4, 7]. While there have been rulings in favor of Uber, such as in Switzerland, other cases are still pending [2, 7]. The company also deals with tax audits and potential tax liabilities [1]. Furthermore, Uber operates under evolving laws and regulations related to payment services and financial activities [3, 6]. Overall, Uber's financial situation remains uncertain, and they have incurred significant losses since inception [5]. It cannot be definitively stated whether Uber has achieved profitability or is still losing money based on the provided information.


So far we've used Vectara's internal summarization capability, which is the best option for most users.

You can still use LlamaIndex's standard VectorStore `as_query_engine()` method, in which case Vectara's summarization is not used; instead, an external LLM (such as OpenAI's GPT-4) and a custom prompt from LlamaIndex generate the summary. For this option, just set `summary_enabled=False`:

In [None]:
query_engine = index.as_query_engine(
    similarity_top_k=5,
    summary_enabled=False,
    vectara_query_mode="mmr",
    vectara_kwargs={"mmr_k": 50, "mmr_diversity_bias": 0.5},
)
response = query_engine.query(query)
print(response)

Uber is still losing money and has not achieved profitability.
