# Koda Retriever: Quickstart

*This example uses non-production infrastructure and the default categories (with their corresponding alpha values).*

More specifically, it uses the default sample data provided with a free starter instance of [Pinecone](https://www.pinecone.io/). This data consists of movie scripts and their summaries embedded in a free Pinecone vector database.

### Agenda:
- Setup
- Koda Retriever: Retrieval 
- Koda Retriever: Query Engine

In [None]:
# Import all the necessary modules
import os

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.postprocessor import LLMRerank
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.packs.koda_retriever import KodaRetriever
from llama_index.vector_stores.pinecone import PineconeVectorStore
from pinecone import Pinecone

## Setup

Build the *required objects* for a Koda Retriever:
- Vector Index
- LLM/Model

Other objects are *optional*, and will be used if provided:
- Reranker
- Custom categories & corresponding alpha weights
- A custom model trained on the custom info above
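
To make "categories and corresponding alpha weights" concrete, here is a minimal sketch of how an alpha value is commonly used in hybrid search. It assumes alpha is a blending weight where `1.0` means pure dense (vector) scoring and `0.0` means pure sparse (keyword) scoring. The category names, alpha values, and helper functions are hypothetical and are not Koda Retriever's shipped defaults or API.

```python
# Hypothetical categories and alpha values, for illustration only.
CATEGORY_ALPHAS = {
    "exact_lookup": 0.2,     # lean on keyword (sparse) matching
    "concept_seeking": 0.8,  # lean on semantic (dense) matching
}


def blend_scores(dense: float, sparse: float, alpha: float) -> float:
    """Convex combination: alpha=1.0 is pure dense, alpha=0.0 is pure sparse."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * dense + (1.0 - alpha) * sparse


def hybrid_score(
    category: str, dense: float, sparse: float, default_alpha: float = 0.5
) -> float:
    """Look up the alpha for a query category, then blend the two scores."""
    alpha = CATEGORY_ALPHAS.get(category, default_alpha)
    return blend_scores(dense, sparse, alpha)


# The same pair of scores is weighted differently depending on the category.
print(hybrid_score("exact_lookup", dense=0.9, sparse=0.1))
print(hybrid_score("concept_seeking", dense=0.9, sparse=0.1))
```

Under this assumption, a query routed to a keyword-leaning category ends up with a score dominated by the sparse match, while a concept-leaning category favors the dense match.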

In [None]:
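# Connect to the Pinecone sample index (expects PINECONE_API_KEY in the environment)
pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
index = pc.Index("sample-movies")

# Default OpenAI LLM and embedding model (expects OPENAI_API_KEY in the environment)
Settings.llm = OpenAI()
Settings.embed_model = OpenAIEmbedding()

# text_key names the metadata field that holds each record's text
vector_store = PineconeVectorStore(pinecone_index=index, text_key="summary")
vector_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=Settings.embed_model
)

reranker = LLMRerank(llm=Settings.llm)  # optional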

## Building Koda Retriever

In [None]:
retriever = KodaRetriever(
    index=vector_index,
    llm=Settings.llm,
    reranker=reranker,  # optional
    verbose=True,
)

## Retrieving w/ Koda Retriever

In [None]:
query = "How many Jurassic Park movies are there?"
results = retriever.retrieve(query)

results

[NodeWithScore(node=TextNode(id_='7', embedding=None, metadata={'box-office': 1671537444.0, 'title': 'Jurassic World', 'year': 2015.0}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Set in a fully functioning dinosaur theme park on Isla Nublar, Jurassic World faces chaos when a genetically engineered dinosaur, the Indominus Rex, escapes. Stars Chris Pratt and Bryce Dallas Howard.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=8.0)]

Those results aren't very palatable, though. To make the response more *natural*, we'll likely need a Query Engine.
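
Short of a query engine, the raw nodes can at least be flattened into readable lines. The sketch below relies only on the `node.metadata`, `node.text`, and `score` attributes visible in the output above. The `format_results` helper is hypothetical, and the demo uses a stand-in object with an abbreviated summary so that it runs without Pinecone or OpenAI access.

```python
from types import SimpleNamespace


def format_results(results):
    """Turn retrieved nodes into one readable line each (hypothetical helper)."""
    lines = []
    for item in results:
        meta = item.node.metadata
        title = meta.get("title", "untitled")
        year = meta.get("year")
        year_str = f" ({int(year)})" if year is not None else ""
        lines.append(f"{title}{year_str} [score={item.score}]: {item.node.text}")
    return lines


# Stand-in mimicking the shape of the NodeWithScore shown above (text abbreviated).
demo = [
    SimpleNamespace(
        node=SimpleNamespace(
            metadata={"title": "Jurassic World", "year": 2015.0},
            text="Set in a fully functioning dinosaur theme park on Isla Nublar.",
        ),
        score=8.0,
    )
]

for line in format_results(demo):
    print(line)
```

With a live retriever, `format_results(results)` should work the same way, assuming the nodes carry the `title` and `year` metadata keys seen above.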

## Query Engine w/ Koda Retriever

Query Engines are [LlamaIndex abstractions](https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/root.html) that combine retrieval with LLM synthesis: the LLM interprets the results returned by a retriever and turns them into a natural language response to the original query. They are themselves an end-to-end pipeline from query to natural language response.

In [None]:
query_engine = RetrieverQueryEngine.from_args(retriever=retriever)

response = query_engine.query(query)

str(response)

'There are five Jurassic Park movies.'
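
Conceptually, the query engine used here chains two steps: retrieve context, then synthesize an answer from it. The toy sketch below illustrates only that flow. `make_query_engine`, `toy_retrieve`, and `toy_synthesize` are hypothetical stand-ins and do not reflect how `RetrieverQueryEngine` is implemented internally.

```python
from typing import Callable, List


def make_query_engine(
    retrieve: Callable[[str], List[str]],
    synthesize: Callable[[str, List[str]], str],
) -> Callable[[str], str]:
    """Compose a retriever and a synthesizer into one query function."""

    def query(question: str) -> str:
        context = retrieve(question)          # step 1: fetch relevant passages
        return synthesize(question, context)  # step 2: turn them into an answer

    return query


# Stand-ins: a real setup would call a retriever and an LLM here.
def toy_retrieve(question: str) -> List[str]:
    return ["Jurassic World (2015) summary"]


def toy_synthesize(question: str, context: List[str]) -> str:
    return f"Answer to {question!r} based on {len(context)} retrieved passage(s)."


engine = make_query_engine(toy_retrieve, toy_synthesize)
print(engine("How many Jurassic Park movies are there?"))
```

The point of the split is that the retriever (here, Koda Retriever) can be swapped or tuned independently of the synthesis step.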