## Setting LLM Model

In [2]:
import nest_asyncio

nest_asyncio.apply()

In [3]:
from dotenv import load_dotenv
import os

load_dotenv()
#print("Open AI - ",os.getenv("LITELLM_URL"),os.getenv("OPENAI_API_MODEL"), os.getenv("OPENAI_API_EMBEDDING"))
#print("OLLAMA  - ",os.getenv("OLLAMA_URL"),os.getenv("OLLAMA_MODEL"))
#print("Local OLLAMA - ",os.getenv("OLLAMA_LOCAL_URL"),os.getenv("OLLAMA_LOCAL_MODEL"))

True
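
The cells below read their endpoints and model names with `os.getenv`, which silently returns `None` when a variable is missing. A minimal sketch of an early check, assuming the variable names used in this notebook; `missing_env` is a hypothetical helper, not part of python-dotenv:

```python
import os

def missing_env(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Variable names taken from the os.getenv calls in this notebook.
REQUIRED = ["LITELLM_URL", "OPENAI_API_MODEL", "OPENAI_API_EMBEDDING"]

missing = missing_env(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv()` surfaces a misconfigured `.env` file before any client is constructed.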

In [4]:
from llama_index.core import Settings

### Run the LLM with Ollama

In [None]:
# install the Ollama LLM integration; skip if already installed
!pipenv install llama-index-llms-ollama

In [40]:
from llama_index.llms.ollama import Ollama

# Local Ollama server
api_base = os.getenv("OLLAMA_LOCAL_URL")
model = os.getenv("OLLAMA_LOCAL_MODEL")
llm = Ollama(model=model, base_url=api_base, request_timeout=120.0)

# To use a remote Ollama server instead, uncomment:
# api_base = os.getenv("OLLAMA_URL")
# model = os.getenv("OLLAMA_MODEL")
# llm = Ollama(model=model, base_url=api_base, request_timeout=180.0)

# Test run
response = llm.complete("What is the capital of France?")
print(response)

The capital of France is **Paris**. 🇫🇷  



### Run the LLM with OpenAI

In [None]:
# install the OpenAI LLM integration; skip if already installed
!pipenv install llama-index-llms-openai

In [5]:
from llama_index.llms.openai import OpenAI
api_base=os.getenv("LITELLM_URL")
model=os.getenv("OPENAI_API_MODEL")

Settings.llm = OpenAI(
    model=model,
    api_base = api_base,
    temperature=0.3
)

resp = Settings.llm.complete("What is the capital of France?")
print(resp)

The capital of France is Paris.


## Setting Embedding Model

In [6]:
# use the OpenAI embedding model
from llama_index.embeddings.openai import OpenAIEmbedding
api_base=os.getenv("LITELLM_URL")
embedding_model=os.getenv("OPENAI_API_EMBEDDING")

Settings.embed_model = OpenAIEmbedding(
    model_name=embedding_model,
    api_base = api_base,
)
#embed_text = Settings.embed_model.get_text_embedding("hello")
#print(f"{len(embed_text)}, {embed_text}")

# Prompt

In [7]:
import logging
import sys

#logging.basicConfig(stream=sys.stdout, level=logging.INFO)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    load_index_from_storage,
    StorageContext,
)
from IPython.display import Markdown, display

In [8]:
documents = SimpleDirectoryReader("./data/paul").load_data()
index = VectorStoreIndex.from_documents(documents)

## response_mode="tree_summarize"

In [9]:
query_engine = index.as_query_engine(response_mode="tree_summarize")
print(query_engine.query("Who is Paul Graham?"))

Paul Graham is an entrepreneur and essayist known for co-founding Y Combinator (YC), a startup accelerator. He has written extensively on topics related to startups, technology, and programming, and is recognized for his insights into the challenges and dynamics of startup culture. Graham also created Hacker News, a social news website that focuses on computer science and entrepreneurship. His work often reflects on the intersection of personal experiences and professional endeavors, particularly in the context of supporting and mentoring startup founders.


In [12]:
# define prompt viewing function
def display_prompt_dict(prompts_dict):
    for k, p in prompts_dict.items():
        text_md = f"**Prompt Key**: {k}<br>" f"**Text:** <br>"
        display(Markdown(text_md))
        print(p.get_template())
        display(Markdown("<br><br>"))

In [20]:
prompts_dict = query_engine.get_prompts()
display_prompt_dict(prompts_dict)

**Prompt Key**: response_synthesizer:summary_template<br>**Text:** <br>

Context information from multiple sources is below.
---------------------
{context_str}
---------------------
Given the information from multiple sources and not prior knowledge, answer the query.
Query: {query_str}
Answer: 


<br><br>
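
`tree_summarize` applies the summary template above to groups of chunks, then to the resulting answers, until a single answer remains. The following is a simplified sketch of that idea, not LlamaIndex's implementation: `tree_summarize`, `fake_llm`, and the `fanout` parameter are hypothetical, the stub stands in for a real model call, and the real synthesizer groups chunks by token budget rather than by a fixed count.

```python
# Template text copied from the summary_template printed above.
SUMMARY_TEMPLATE = (
    "Context information from multiple sources is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the information from multiple sources and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

def tree_summarize(chunks, query, llm, fanout=2):
    """Answer over groups of `fanout` chunks, then recurse on the answers."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = list(chunks)
    while True:
        answers = []
        for i in range(0, len(level), fanout):
            context = "\n\n".join(level[i:i + fanout])
            prompt = SUMMARY_TEMPLATE.format(context_str=context, query_str=query)
            answers.append(llm(prompt))
        if len(answers) == 1:
            return answers[0]
        level = answers  # the answers become the next level's chunks

# Hypothetical stub that records every prompt it receives.
calls = []

def fake_llm(prompt):
    calls.append(prompt)
    return f"partial answer {len(calls)}"

final = tree_summarize(
    ["chunk A", "chunk B", "chunk C", "chunk D"], "Who is Paul Graham?", fake_llm
)
print(final, "after", len(calls), "LLM calls")
```

With four chunks and a fanout of two, the stub is called twice on the leaves and once more on the two intermediate answers.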

## response_mode="compact"

In [14]:
# Tip: enable the commented-out logging lines above (with level=logging.DEBUG) for more detailed output
query_engine = index.as_query_engine(response_mode="compact")
print(query_engine.query("Who is Paul Graham?"))

Paul Graham is an essayist and the co-founder of Y Combinator, a startup accelerator. He has written extensively on topics related to startups, technology, and programming, and has been involved in various projects, including the development of internal software for Y Combinator. His experiences and insights into the startup ecosystem are often shared through his essays.


In [36]:
prompts_dict = query_engine.get_prompts()
display_prompt_dict(prompts_dict)

**Prompt Key**: response_synthesizer:text_qa_template<br>**Text:** <br>

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 


<br><br>

**Prompt Key**: response_synthesizer:refine_template<br>**Text:** <br>

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer: 


<br><br>
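
Roughly, `compact` packs as many retrieved chunks as fit into each prompt, answers the first packed block with `text_qa_template`, and passes any further block through `refine_template` together with the answer so far. A simplified sketch under stated assumptions: `pack_chunks`, `compact_and_refine`, and `fake_llm` are hypothetical names, and the `max_chars` character budget stands in for the token-based limit a real synthesizer would use.

```python
# Template texts copied from the prompts printed above.
TEXT_QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
REFINE_TEMPLATE = (
    "The original query is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better answer the query. "
    "If the context isn't useful, return the original answer.\n"
    "Refined Answer: "
)

def pack_chunks(chunks, max_chars):
    """Greedily merge consecutive chunks while the merged block fits max_chars."""
    packed, current = [], ""
    for chunk in chunks:
        candidate = f"{current}\n\n{chunk}" if current else chunk
        if current and len(candidate) > max_chars:
            packed.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        packed.append(current)
    return packed

def compact_and_refine(chunks, query, llm, max_chars=2000):
    """First block uses the QA template; later blocks refine the running answer."""
    answer = None
    for block in pack_chunks(chunks, max_chars):
        if answer is None:
            prompt = TEXT_QA_TEMPLATE.format(context_str=block, query_str=query)
        else:
            prompt = REFINE_TEMPLATE.format(
                query_str=query, existing_answer=answer, context_msg=block
            )
        answer = llm(prompt)
    return answer

# Hypothetical stub that records every prompt it receives.
prompts = []

def fake_llm(prompt):
    prompts.append(prompt)
    return f"answer v{len(prompts)}"

result = compact_and_refine(
    ["aaaa", "bbbb", "cccc"], "Who is Paul Graham?", fake_llm, max_chars=10
)
print(result, "after", len(prompts), "LLM calls")
```

When all retrieved chunks fit in one block, only the QA template is used, which is why `refine_template` often never fires for small documents.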

## Change Prompt

In [17]:
from llama_index.core import PromptTemplate

In [18]:
text_qa_template_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query in the style of a Shakespeare play.\n"
    "Query: {query_str}\n"
    "Answer: "
)
text_qa_template = PromptTemplate(text_qa_template_str)
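
A custom template only works if it keeps the placeholders the synthesizer fills in; the default `text_qa_template` shown earlier uses `{context_str}` and `{query_str}`. A minimal sketch of a pre-flight check using only the standard library; `template_fields` and `check_template` are hypothetical helpers, not LlamaIndex functions:

```python
from string import Formatter

def template_fields(template):
    """Return the set of named placeholders in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}

def check_template(template, required=("context_str", "query_str")):
    """Raise ValueError if any required placeholder is absent; else return the template."""
    missing = sorted(set(required) - template_fields(template))
    if missing:
        raise ValueError(f"template is missing placeholders: {missing}")
    return template

# Same text as the Shakespeare template defined in this notebook.
shakespeare_template = check_template(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query in the style of a Shakespeare play.\n"
    "Query: {query_str}\n"
    "Answer: "
)
print(sorted(template_fields(shakespeare_template)))
```

Validating the string before wrapping it in `PromptTemplate` catches a misspelled placeholder early, instead of at query time.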

In [19]:
query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": text_qa_template}
)
prompts_dict = query_engine.get_prompts()
display_prompt_dict(prompts_dict)

**Prompt Key**: response_synthesizer:text_qa_template<br>**Text:** <br>

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query in the style of a Shakespeare play.
Query: {query_str}
Answer: 


<br><br>

**Prompt Key**: response_synthesizer:refine_template<br>**Text:** <br>

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer: 


<br><br>

In [20]:
print(query_engine.query("Who is Paul Graham?"))

**Act I, Scene I: A Chamber of Reflection**

*Enter a Scholar, pondering the words of the great Paul Graham.*

**Scholar:**  
Hark! Who is this Paul Graham, a name of note,  
Whose quill hath penned both wisdom and despair?  
A scribe of startups, in the realm of code,  
He weaves his thoughts like threads upon a loom.  

In yonder halls of Y Combinator's might,  
He nurtured dreams, where fledgling founders soar,  
Yet burdened was he by the weight of strife,  
For Hacker News, a double-edged sword,  
Did pierce his heart with misinterpretations,  
And left him vexed, like one who runs a race,  
Yet suffers blisters from ill-fitting shoes.  

**(Aside)**  
Ah, but what is this? A tale of love and loss,  
For Jessica, his partner in the fray,  
Did leave a void, like autumn's barren trees,  
When he departed from their shared endeavor.  

**Scholar:**  
With thoughts profound, he muses on the stars,  
And ponders if the aliens, wise and grand,  
Would grasp the truths of Pythagoras' lo

## Change Prompt Through the Low-Level Composition API

In [21]:
custom_qa_prompt = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query in the style of a lawyer.\n"
    "Query: {query_str}\n"
    "Answer: "
)
custom_qa_prompt_tmpl = PromptTemplate(custom_qa_prompt)

In [22]:
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

In [23]:
retriever = index.as_retriever()
response_synthesizer = get_response_synthesizer(
    text_qa_template=custom_qa_prompt_tmpl
)
query_engine = RetrieverQueryEngine(retriever, response_synthesizer)

In [24]:
print(query_engine.query("Who is Paul Graham?"))

Based on the provided context, Paul Graham is a prominent figure in the startup ecosystem, known for his role as a co-founder of Y Combinator (YC), a well-regarded startup accelerator. He is also recognized for his essays on various topics, including technology, entrepreneurship, and the nature of work. Graham's experiences and insights reflect a deep engagement with the challenges and dynamics of startup culture, as well as the personal and professional implications of his work at YC. His narrative indicates a thoughtful approach to balancing his commitments and the impact of his decisions on his career trajectory. Furthermore, he has contributed to the development of programming languages, specifically Arc, and has been involved in discussions about the nature of knowledge and discovery in technology.
