# Alpha Tuning w/ Koda Retriever

*For this example, non-production-ready infrastructure is used.*
More specifically, the default sample data provided in a free starter instance of [Pinecone](https://www.pinecone.io/) is used. This data consists of movie scripts and their summaries embedded in a free Pinecone vector database.

### Agenda
- Fixture Setup
- Alpha Tuning Setup
- Koda Retriever: Retrieval 

A quick visual overview of how alpha tuning works (excuse the odd colors; my Windows color settings were interacting badly with Google Sheets and made some colors useless):
![alpha-tuning](https://i.imgur.com/zxCXqGb.png)

### Example Context
Let's say you're building a query engine or agent that is expected to answer questions on Deep Learning, AI, RAG Architecture, and adjacent topics. Your team has narrowed the expected traffic down to three main query classifications and associated an alpha value with each one. The alpha values were determined by decreasing alpha from 1 to 0 in increments and, at each new value, running and evaluating several queries. Repeating this process for each category should yield an optimal alpha value for that category over your data. Repeat the process periodically as your data expands or changes.
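The sweep described above can be sketched roughly as follows. This is only an illustration, not part of Koda Retriever: `sweep_alpha`, `best_alpha`, and `toy_evaluate` are hypothetical names, and `toy_evaluate` is a stand-in for whatever retrieval evaluation you actually run (hit rate, MRR, human grading, etc.).

```python
from typing import Callable, Dict, Sequence


def sweep_alpha(
    queries: Sequence[str],
    evaluate: Callable[[str, float], float],
    steps: int = 10,
) -> Dict[float, float]:
    """Return the mean evaluation score for each alpha from 1.0 down to 0.0."""
    if not queries:
        raise ValueError("need at least one query to evaluate")
    # Build the grid from integers to avoid float drift: 1.0, 0.9, ..., 0.0
    alphas = [round(1 - i / steps, 4) for i in range(steps + 1)]
    scores = {}
    for alpha in alphas:
        per_query = [evaluate(query, alpha) for query in queries]
        scores[alpha] = sum(per_query) / len(per_query)
    return scores


def best_alpha(scores: Dict[float, float]) -> float:
    """Pick the alpha with the highest mean score."""
    return max(scores, key=scores.get)


def toy_evaluate(query: str, alpha: float) -> float:
    # Hypothetical evaluator: pretends retrieval quality peaks at alpha = 0.2.
    # Replace this with a real retrieval + scoring step over your own data.
    return 1.0 - abs(alpha - 0.2)


concept_queries = [
    "Why should I use semantic search to rank results?",
    "What is the dual-encoder architecture used in recent works on dense retrievers?",
]
concept_scores = sweep_alpha(concept_queries, toy_evaluate)
print(best_alpha(concept_scores))
```

Run the same sweep once per category, each with its own set of representative queries, and keep the best alpha for each.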

With the categories and corresponding alpha values uncovered, these are our categories:
- Concept Seeking Queries *(α: 0.2)*
- Fact Seeking Queries *(α: 0.6)*
- Queries w/ Misspellings *(α: 1)*

Clearly, these categories lean towards very different ends of the retrieval spectrum. The default for Llama Index hybrid retrievers is 0.5.
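To make "ends of the spectrum" a little more concrete, here is a minimal sketch of how a single alpha can weight the two halves of a hybrid query. It assumes the common convention that alpha scales the dense (semantic) vector and `1 - alpha` scales the sparse (keyword) vector, so α = 1 is purely semantic and α = 0 is purely keyword based. `weight_by_alpha` is a hypothetical helper for illustration, not a Koda or Llama Index function, and your vector store may apply the weighting differently.

```python
from typing import Dict, List, Tuple


def weight_by_alpha(
    dense: List[float], sparse: Dict[int, float], alpha: float
) -> Tuple[List[float], Dict[int, float]]:
    """Scale a dense query vector by alpha and a sparse one by (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [value * alpha for value in dense]
    scaled_sparse = {idx: value * (1.0 - alpha) for idx, value in sparse.items()}
    return scaled_dense, scaled_sparse


# Made-up query vectors, just to show the effect of each alpha
dense_query = [0.5, 0.25]
sparse_query = {3: 2.0, 17: 4.0}

for alpha in (1.0, 0.5, 0.0):
    print(alpha, weight_by_alpha(dense_query, sparse_query, alpha))
```

Under this assumption, the misspelling category (α: 1) ignores the keyword side entirely, which makes sense since typos break exact term matching.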

In [None]:
# Import all the necessary modules
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.postprocessor import LLMRerank
from llama_index.vector_stores.pinecone import PineconeVectorStore
from llama_index.core import Settings
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.packs.koda_retriever import KodaRetriever
import os
from pinecone import Pinecone

## Setup

Building *required objects* for a Koda Retriever.
- Vector Index
- LLM/Model

Other objects are *optional*, and will be used if provided:
- Reranker
- Custom categories & corresponding alpha weights
- A custom model trained on the custom info above

In [None]:
pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
index = pc.Index("sample-movies")

Settings.llm = OpenAI()
Settings.embed_model = OpenAIEmbedding()

vector_store = PineconeVectorStore(pinecone_index=index, text_key="summary")
vector_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=Settings.embed_model
)

reranker = LLMRerank(llm=Settings.llm)  # optional

## Defining Categories & Alpha Values

We need to first input our categories and alpha values in a format that Koda can understand.

### Important Considerations:
If you provide these custom categories and no custom model, the values are passed as few-shot context to whatever model is provided to Koda Retriever. If you instead provide a custom model that has already been trained on the data that would otherwise be provided below, ensure the keys of the categories dictionary match the labels the custom model expects. In that case, do NOT provide any examples or a description.

In [None]:
# It is easiest and recommended to define your alpha categories and values in a dictionary
# It must follow this format:
# {
#     "category_name": {
#         "alpha": 0.5,  # alpha value - always required
#         "description": "description of the category",  # optional if providing fine tuned model
#         "examples": [  # optional if providing fine tuned model
#             "example 1",  # provide at least 1 example
#             "example 2"
#         ]
#     }
# }

categories = {  # key -> alpha, description [and examples]
    "concept seeking query": {
        "alpha": 0.2,
        "description": "Abstract questions, usually on a specific topic, that require multiple sentences to answer",
        "examples": [
            "What is the dual-encoder architecture used in recent works on dense retrievers?",
            "Why should I use semantic search to rank results?",
        ],
    },
    "fact seeking query": {
        "alpha": 0.6,
        "description": "Queries with a single, clear answer",
        "examples": [
            "What is the total number of propositions the English Wikipedia dump is segmented into in FACTOID WIKI?",
            "How many documents are semantically ranked?",
        ],
    },
    "queries with misspellings": {
        "alpha": 1,
        "description": "Queries with typos, transpositions and common misspellings introduced",
        "examples": [
            "What is the advntage of prposition retrieval over sentnce or passage retrieval?",
            "Ho w mny documents are samantically r4nked",
        ],
    },
}
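Before handing a dictionary like this to Koda, a quick sanity check can catch formatting mistakes early. The sketch below is a hypothetical helper (`check_categories` is not part of the Koda pack) that encodes the rules described above: alpha is always required, a description and at least one example are expected in the few-shot case, and neither should be present when a custom model is used.

```python
from typing import Any, Dict, List


def check_categories(
    matrix: Dict[str, Dict[str, Any]], few_shot: bool = True
) -> List[str]:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    problems = []
    for name, spec in matrix.items():
        alpha = spec.get("alpha")
        is_number = isinstance(alpha, (int, float)) and not isinstance(alpha, bool)
        if not is_number or not 0 <= alpha <= 1:
            problems.append(f"{name}: alpha must be a number between 0 and 1")
        if few_shot:
            # No custom model: the LLM needs a description and examples as context
            if not spec.get("description"):
                problems.append(f"{name}: missing description")
            if not spec.get("examples"):
                problems.append(f"{name}: provide at least 1 example")
        elif spec.get("description") or spec.get("examples"):
            # Custom model: it was already trained on this information
            problems.append(f"{name}: drop description/examples with a custom model")
    return problems


# Small stand-in dictionary in the same format as the one above
sample_matrix = {
    "fact seeking query": {
        "alpha": 0.6,
        "description": "Queries with a single, clear answer",
        "examples": ["How many documents are semantically ranked?"],
    },
}
print(check_categories(sample_matrix))
```

In the notebook you could call `check_categories(categories)` instead of using the stand-in dictionary.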

## Building Koda Retriever

In [None]:
retriever = KodaRetriever(
    index=vector_index,
    llm=Settings.llm,
    matrix=categories,  # koda now knows to use these categories
    reranker=reranker,  # optional
    verbose=True,
)
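Conceptually, passing the categories in means each incoming query gets classified into one of them, and the matching alpha is used for retrieval. The sketch below only illustrates that idea under my own assumptions; it is not Koda's actual implementation. `pick_alpha` and `toy_classifier` are hypothetical, the keyword-based classifier is a stand-in for the LLM (or custom model) doing the classification, and the fallback to 0.5 for unmatched queries is an assumption based on the hybrid default mentioned earlier.

```python
from typing import Any, Callable, Dict

DEFAULT_ALPHA = 0.5  # assumed fallback, matching the hybrid default noted earlier


def pick_alpha(
    query: str,
    classify: Callable[[str], str],
    matrix: Dict[str, Dict[str, Any]],
    default: float = DEFAULT_ALPHA,
) -> float:
    """Classify the query, then look up the alpha for the resulting category."""
    label = classify(query)
    entry = matrix.get(label)
    if entry is None:
        # Unknown category: fall back to an even dense/sparse split
        return default
    return float(entry["alpha"])


def toy_classifier(query: str) -> str:
    # Hypothetical stand-in for an LLM or fine-tuned classifier
    lowered = query.lower()
    if lowered.startswith(("how many", "when", "who")):
        return "fact seeking query"
    if lowered.startswith(("why", "can you explain")):
        return "concept seeking query"
    return "unknown"


toy_matrix = {
    "concept seeking query": {"alpha": 0.2},
    "fact seeking query": {"alpha": 0.6},
}

for q in ("How many documents are semantically ranked?", "Tell me a story"):
    print(q, "->", pick_alpha(q, toy_classifier, toy_matrix))
```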

## Retrieving w/ Koda Retriever

In [None]:
query = "Can you explain the Jurassic Park as a business as it was supposed to operate inside the movie's lore or timeline?"
results = retriever.retrieve(query)

results

[NodeWithScore(node=TextNode(id_='33', embedding=None, metadata={'box-office': 1113138548.0, 'title': 'Jurassic Park', 'year': 1993.0}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='A theme park showcasing genetically engineered dinosaurs turns into a nightmare when the creatures escape their enclosures, forcing the visitors to fight for survival.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=8.0),
 NodeWithScore(node=TextNode(id_='7', embedding=None, metadata={'box-office': 1671537444.0, 'title': 'Jurassic World', 'year': 2015.0}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Set in a fully functioning dinosaur theme park on Isla Nublar, Jurassic World faces chaos when a genetically engineered dinosaur, the Indominus Rex, escapes. Stars Chris Pratt and Bryce Dallas Howard.', start_char_idx=None,

Those results don't look quite palatable though. Let's make the response more *natural*. For that we'll likely need a Query Engine.
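If you only want a quick, readable look at the raw nodes before moving on, a small helper like the one below can do the job. It is a sketch that relies only on the attributes visible in the output above (`score`, `node.metadata`, `node.text`). `summarize_results` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins so the snippet runs without a retriever; in the notebook you would pass `results` instead.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def summarize_results(results: Iterable[Any], max_chars: int = 80) -> List[str]:
    """Turn scored nodes into one short line each: rank, title, score, text."""
    lines = []
    for rank, item in enumerate(results, start=1):
        title = item.node.metadata.get("title", "untitled")
        text = item.node.text
        if len(text) > max_chars:
            # Truncate long summaries so each result stays on one line
            text = text[: max_chars - 3] + "..."
        score = "n/a" if item.score is None else f"{item.score:.1f}"
        lines.append(f"{rank}. {title} (score {score}): {text}")
    return lines


# Stand-in objects shaped like the retriever output above
fake_results = [
    SimpleNamespace(
        score=8.0,
        node=SimpleNamespace(
            metadata={"title": "Jurassic Park", "year": 1993.0},
            text="A dinosaur theme park goes wrong.",
        ),
    ),
]

for line in summarize_results(fake_results):
    print(line)
```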

# Query Engine w/ Koda Retriever

Query Engines are [Llama Index abstractions](https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/root.html) that combine retrieval with LLM synthesis: the LLM interprets the results returned by a retriever and turns them into a natural language response to the original query. They are themselves an end-to-end pipeline from query to natural language response.

In [None]:
query_engine = RetrieverQueryEngine.from_args(retriever=retriever)

response = query_engine.query(query)

str(response)

"Jurassic Park was intended to be a theme park that featured genetically engineered dinosaurs as its main attraction. Visitors were meant to experience the thrill of seeing these creatures up close in a controlled environment. However, due to unforeseen circumstances in the movie's storyline, the dinosaurs escaped their enclosures, leading to chaos and danger for the visitors."