# 05 - Prompt Engineering
We'll now explore what is possible with prompt engineering.

## Lab Setup
We'll set up our lab and use a few PDFs about RAG and avoiding hallucinations (stored under `resources/05_prompt_engineering/vectara`) for our corpus.


In [1]:
%pip install -q vectara-skunk-client==0.4.11

Note: you may need to restart the kernel to use updated packages.


In [2]:
from lab_setup import create_lab_corpus
corpus_id = create_lab_corpus("05-lab-prompt-engineering", quiet=True)

08:59:42 +1100 lab_setup            INFO:User prefix for lab: david
08:59:42 +1100 lab_setup            INFO:Setting up lab corpus with name [david-05-lab-prompt-engineering]
08:59:42 +1100 Factory              INFO:initializing builder
08:59:42 +1100 Factory              INFO:Factory will load configuration from home directory
08:59:42 +1100 root                 INFO:We are processing authentication type [OAuth2]
08:59:42 +1100 root                 INFO:initializing Client
08:59:44 +1100 root                 INFO:No existing corpus with the name david-05-lab-prompt-engineering
08:59:46 +1100 AdminService         INFO:Created new corpus with 246
08:59:46 +1100 root                 INFO:New corpus created CreateCorpusResponse(corpusId=246, status=Status(code=<StatusCode.OK: 0>, statusDetail='Corpus Created', cause=None))
08:59:46 +1100 lab_setup            INFO:New lab created with id [246]


In [3]:
from pathlib import Path
from vectara.client.core import Factory

resources_dir = Path("./resources/05_prompt_engineering/vectara")
client = Factory().build()
indexer_service = client.indexer_service

for p in resources_dir.glob("*.pdf"):
    indexer_service.upload(corpus_id, p)

08:59:47 +1100 Factory              INFO:initializing builder
08:59:47 +1100 Factory              INFO:Factory will load configuration from home directory
08:59:47 +1100 root                 INFO:We are processing authentication type [OAuth2]
08:59:47 +1100 root                 INFO:initializing Client
08:59:47 +1100 IndexerService       INFO:Headers: {"c": "1623270172", "o": "246"}
Avoiding Hallucinations in LLMs.pdf: 3.31MB [00:05, 665kB/s]                                                         
08:59:54 +1100 IndexerService       INFO:Headers: {"c": "1623270172", "o": "246"}
RAG done right - part 1 - chunking.pdf: 4.69MB [00:05, 831kB/s]                                                      
09:00:00 +1100 IndexerService       INFO:Headers: {"c": "1623270172", "o": "246"}
RAG done right - part 2 - retrieval.pdf: 3.32MB [00:05, 618kB/s]                                                     


In [None]:
from vectara.client.core import Factory

client = Factory().build()

## Summarization Using Default Prompt
The query below uses our default summarizer (`vectara-summary-ext-v1.2.0`, aka GPT 3.5) and builds the answer from the retrieved results only (RAG only).

In [7]:
from lab_setup import render_response

query = "Why is Vectara's platform better than other LLM options?"
qs = client.query_service
resp = qs.query(query, corpus_id)
render_response(query, resp, show_search_results=False)


# Query: Why is Vectara's platform better than other LLM options?

The returned results did not contain sufficient information to be summarized into a useful answer for your query. Please try a different search or restate your query differently.


## Summarization Using Custom Prompt
We now create a custom prompt with the following preferences:

1. We want to tailor the style by setting `persona="Head of Investor Relations"`
2. We show a more concise answer instead of citing each result by setting `cite=False`
3. We also allow _general context_ from the base LLM to creep in by setting `just_rag=False`. Whilst this may increase the risk of hallucinations, if done right it can bolster the results from the retrieval model.
4. We cap the length of the answer by setting `max_word_count=100`
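
To make the four preferences above concrete, here is a minimal sketch of how such flags could be turned into chat messages. The helper `sketch_prompt_messages` and all of its wording are hypothetical and written only for illustration; this is not how `SimplePromptFactory` is implemented, and the real prompt text is the one the factory logs when it is built.

```python
import json


def sketch_prompt_messages(persona, cite=True, max_word_count=None, just_rag=True):
    """Hypothetical illustration only, not the vectara-skunk-client implementation."""
    # The persona sets the voice of the system message.
    system = [f"You are a {persona} who answers using the supplied search results."]
    if just_rag:
        # RAG only: forbid knowledge from outside the retrieved results.
        system.append("Base the answer only on the search results.")
    else:
        # Allow the base LLM's general knowledge to supplement the results.
        system.append("Prefer the search results, but you may add what you already know.")

    # The final user message carries the length and citation preferences.
    user = ["Answer the query"]
    if max_word_count is not None:
        user.append(f"in no more than {max_word_count} words")
    user.append("and cite each search result you use." if cite
                else "without citing individual results.")

    return [
        {"role": "system", "content": " ".join(system)},
        {"role": "user", "content": " ".join(user)},
    ]


messages = sketch_prompt_messages(
    "Head of Investor Relations", cite=False, max_word_count=100, just_rag=False
)
print(json.dumps(messages, indent=2))
```

Each flag changes exactly one sentence of the prompt, which makes it easy to see which preference is responsible for a change in the generated answer.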

In [8]:
from vectara.client.util import SimplePromptFactory
import logging

prompt_factory = SimplePromptFactory(persona="Head of Investor Relations", cite=False, max_word_count=100, just_rag=False)
prompt_text = prompt_factory.build()

logging.info(f"Here's our tailored \"promptText\" we'll be supplying on the query:\n\n{prompt_text}\n")

resp = qs.query(query, corpus_id, promptText=prompt_text)
render_response(query, resp, show_search_results=False)

09:00:47 +1100 root                 INFO:Here's our tailored "promptText" we'll be supplying on the query:

[ {"role": "system", "content": "You are a Head of Investor Relations who takes the search results and only return the most relevant answer. Do not iterate over each question, preferably based on the search results in this chat. You may allow additional information you know in the results. Respond in the language denoted by ISO 639 code \"$vectaraLangCode\"."}, 
#foreach ($qResult in $vectaraQueryResults) 
   #if ($foreach.first) 
   {"role": "user", "content": "Search for \"$esc.java(${vectaraQuery})\", and give me the first search result."}, 
   {"role": "assistant", "content": "$esc.java(${qResult.getText()})" }, 
   #else 
   {"role": "user", "content": "Give me the \"$vectaraIdxWord[$foreach.index]\" search result."}, 
   {"role": "assistant", "content": "$esc.java(${qResult.getText()})" }, 
   #end 
 #end 
{"role": "user", "content": "Generate a detailed answer (that is no 


# Query: Why is Vectara's platform better than other LLM options?

Vectara's platform stands out amongst other Language Model options due to its advanced Retrieval Augmented Generation (RAG) capabilities. Unlike traditional models, Vectara's RAG optimizes information retrieval, ensuring more accurate and contextually relevant responses. Moreover, Vectara's unique "chunking" feature helps break down complex queries into manageable parts for better comprehension and response generation. This combination of advanced retrieval and processing capabilities makes Vectara's platform a superior choice.
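
The logged prompt is a Velocity template: the `#foreach` loop replays every search result as a user/assistant exchange before the final instruction is appended. The sketch below mimics that expansion in plain Python. It rests on two assumptions that are not confirmed by the log: that `$vectaraIdxWord` maps the zero-based `$foreach.index` to ordinal words, and that escaping (`$esc.java`) can be ignored for the illustration. `ORDINALS` and `expand_results` are hypothetical names.

```python
# Assumed stand-in for $vectaraIdxWord: ordinal words indexed from zero.
ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def expand_results(query, results):
    """Mimic the #foreach block: one user/assistant pair per search result."""
    messages = []
    for index, text in enumerate(results):
        if index == 0:
            # Mirrors the "#if ($foreach.first)" branch of the template.
            ask = f'Search for "{query}", and give me the first search result.'
        else:
            # Fall back to the position number once the ordinal words run out.
            word = ORDINALS[index] if index < len(ORDINALS) else str(index + 1)
            ask = f'Give me the "{word}" search result.'
        messages.append({"role": "user", "content": ask})
        messages.append({"role": "assistant", "content": text})
    return messages


for message in expand_results("What is RAG?", ["Result text A", "Result text B"]):
    print(message)
```

Seen this way, the search results reach the model as earlier turns of a conversation, and the persona and `just_rag` settings only change the instructions wrapped around them.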


## Persona: ELI5
Now let's respond using "Explain Like I'm 5" (ELI5) language.

In [9]:
prompt_factory = SimplePromptFactory(persona="Explaining to someone in ELI5 language", cite=False, max_word_count=100, just_rag=False)
prompt_text = prompt_factory.build()
resp = qs.query(query, corpus_id, promptText=prompt_text)
render_response(query, resp, show_search_results=False)


# Query: Why is Vectara's platform better than other LLM options?

Vectara's platform stands out among other Language Model options due to its unique features. It employs techniques like avoiding hallucinations in applications, which ensures more accurate and reliable outputs. Additionally, it uses Retrieval Augmented Generation (RAG) effectively, breaking information into manageable chunks for better comprehension. This makes it highly efficient and accurate, enhancing the overall user experience.


In [10]:
query = "Explain what RAG is?"
prompt_factory = SimplePromptFactory(persona="Explaining to someone in ELI5 language", cite=True, max_word_count=100, just_rag=True)
prompt_text = prompt_factory.build()
resp = qs.query(query, corpus_id, promptText=prompt_text)
render_response(query, resp, show_search_results=False)


# Query: Explain what RAG is?

I'm sorry, but the search results provided do not contain enough information to accurately and comprehensively explain what RAG (Retrieval Augmented Generation) is. The titles suggest that RAG relates to something being 'done right', possibly in relation to 'chunking' and 'retrieval' [1,2], and it might be used in applications to avoid 'hallucinations' [3]. However, without more detailed information, a comprehensive explanation of RAG is not possible.
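
With `cite=True` the summary carries bracketed markers such as `[1,2]` and `[3]`. A minimal sketch for pulling those markers out of a summary string follows, assuming they always take the form of comma-separated integers in square brackets; `cited_results` is a hypothetical helper, not part of the client library.

```python
import re

# Matches "[3]" or "[1,2]": integers separated by commas inside square brackets.
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def cited_results(summary):
    """Return the sorted, de-duplicated result numbers cited in a summary."""
    found = set()
    for match in CITATION.finditer(summary):
        for number in match.group(1).split(","):
            found.add(int(number))
    return sorted(found)


summary = (
    "RAG may relate to 'chunking' and 'retrieval' [1,2], "
    "and it might be used to avoid 'hallucinations' [3]."
)
print(cited_results(summary))
```

The returned numbers can then be matched against the search results to check which ones the summary relied on.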


In [11]:
query = "What is steam power?"
prompt_factory = SimplePromptFactory(persona="Explaining to someone in ELI5 language", cite=True, max_word_count=100, just_rag=False)
prompt_text = prompt_factory.build()
resp = qs.query(query, corpus_id, promptText=prompt_text)
render_response(query, resp, show_search_results=False)


# Query: What is steam power?

I apologize for the confusion, but the search results provided do not contain information about steam power. However, I can tell you that steam power is a type of power that is generated by heating water until it becomes steam. The steam then drives a steam turbine or engine, which produces mechanical power. This mechanical power can be used for various applications, such as powering machinery or generating electricity.
