<a href="https://colab.research.google.com/github/vectara/example-notebooks/blob/main/notebooks/custom-prompts-demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Powerful features with Vectara Scale plan

Vectara's [Scale plan](https://vectara.com/pricing/) provides a lot of powerful features for building RAG pipelines. 
In this notebook we will focus on two of the most powerful features:
1. The ability to use GPT-4 or GPT-4-Turbo instead of GPT-3.5-Turbo.
2. Using custom prompts with Vectara

Using more advanced LLMs like GPT-4 helps get better responses from your RAG pipeline - more accurate responses, better summarization, and less hallucinations. And custom prompts allow for better control of those responses for example to change the style and tone.

In [1]:
import requests
import json
import re
import os
from urllib.parse import quote
from IPython.display import display, Markdown

## Query helper functions

Before we start, here is some Python code that is helpful to abstract some of the fine details of calling the Vectara API.
The first one is a helper class to normalize citations from the response and integrate the links from the citations. This is of course not required for using GPT-4-Turbo or custom prompts, but just helps with making it easier for us to interact with the summaries.

In [2]:
def extract_between_tags(text, start_tag, end_tag):
    start_index = text.find(start_tag)
    end_index = text.find(end_tag, start_index)
    return text[start_index+len(start_tag):end_index-len(end_tag)]

class CitationNormalizer():

    def __init__(self, responses, docs):
        self.docs = docs
        self.responses = responses
        self.refs = []

    def normalize_citations(self, summary):
        start_tag = "%START_SNIPPET%"
        end_tag = "%END_SNIPPET%"

        # find all references in the summary
        pattern = r'\[\d{1,2}\]'
        matches = [match.span() for match in re.finditer(pattern, summary)]

        # figure out unique list of references
        for match in matches:
            start, end = match
            response_num = int(summary[start+1:end-1])
            doc_num = self.responses[response_num-1]['documentIndex']
            metadata = {item['name']: item['value'] for item in self.docs[doc_num]['metadata']}
            text = extract_between_tags(self.responses[response_num-1]['text'], start_tag, end_tag)
            if 'url' in metadata.keys():
                url = f"{metadata['url']}#:~:text={quote(text)}"
                if url not in self.refs:
                    self.refs.append(url)

        # replace references with markdown links
        refs_dict = {url:(inx+1) for inx,url in enumerate(self.refs)}
        for match in reversed(matches):
            start, end = match
            response_num = int(summary[start+1:end-1])
            doc_num = self.responses[response_num-1]['documentIndex']
            metadata = {item['name']: item['value'] for item in self.docs[doc_num]['metadata']}
            text = extract_between_tags(self.responses[response_num-1]['text'], start_tag, end_tag)
            if 'url' in metadata.keys():
                url = f"{metadata['url']}#:~:text={quote(text)}"
                citation_inx = refs_dict[url]
                summary = summary[:start] + f'[\[{citation_inx}\]]({url})' + summary[end:]
            else:
                summary = summary[:start] + summary[end:]

        return summary

The VectaraQuery class simplifies running a query against a Vectara corpus.
Note that in addition to the expected customer_id, corpus_id and API key, it has two additional arguments in the constructor: `prompt_name` and `prompt_text`, which are the name of the prompt we would use and the text of the prompt when we use custom prompts. 

In [3]:
class VectaraQuery():
    def __init__(self, api_key: str, customer_id: str, corpus_id: str, prompt_name: str = None, prompt_text: str = None):
        self.customer_id = customer_id
        self.corpus_id = corpus_id
        self.api_key = api_key
        self.prompt_name = prompt_name if prompt_name else "vectara-summary-ext-v1.2.0"
        self.prompt_text = prompt_text

    def get_body(self, query_str: str):
        corpora_key_list = [{
                'customer_id': self.customer_id, 'corpus_id': self.corpus_id, 'lexical_interpolation_config': {'lambda': 0.025}
            }
        ]
        body = {
            'query': [
                { 
                    'query': query_str,
                    'start': 0,
                    'numResults': 50,
                    'corpusKey': corpora_key_list,
                    'context_config': {
                        'sentences_before': 2,
                        'sentences_after': 2,
                        'start_tag': "%START_SNIPPET%",
                        'end_tag': "%END_SNIPPET%",
                    },
                    'rerankingConfig':
                    {
                        'rerankerId': 272725718,
                        'mmrConfig': {
                            'diversityBias': 0.3
                        }
                    },
                    'summary': [
                        {
                            'responseLang': 'eng',
                            'maxSummarizedResults': 5,
                            'summarizerPromptName': self.prompt_name,
                            'debug': True
                        }
                    ]
                } 
            ]
        }
        if self.prompt_text:
            body['query'][0]['summary'][0]['promptText'] = self.prompt_text
        return body

    def get_headers(self):
        return {
            "Content-Type": "application/json",
            "Accept": "application/json",
            "customer-id": self.customer_id,
            "x-api-key": self.api_key,
            "grpc-timeout": "60S"
        }

    def submit_query(self, query_str: str):

        endpoint = f"https://api.vectara.io/v1/query"
        body = self.get_body(query_str)

        response = requests.post(endpoint, data=json.dumps(body), verify=True, headers=self.get_headers())    
        if response.status_code != 200:
            print(f"Query failed with code {response.status_code}, reason {response.reason}, text {response.text}")
            return "Sorry, something went wrong in my brain. Please try again later."

        res = response.json()
        
        top_k = 10
        summary = res['responseSet'][0]['summary'][0]['text']
        responses = res['responseSet'][0]['response'][:top_k]
        docs = res['responseSet'][0]['document']

        summary = CitationNormalizer(responses, docs).normalize_citations(summary)
        return summary

Finally, let's read from our environment the customer_id, corpus_id and API key we want to use for accessing Vectara. In this demo I've used the same corpus that includes all the text from Richard Feynman's lectures, as show for example in this [demo application](https://askfeynman.demo.vectara.com/)

In [4]:
api_key = os.environ['VECTARA_API_KEY']
customer_id = os.environ['VECTARA_CUSTOMER_ID']
corpus_id = os.environ['VECTARA_CORPUS_ID']

## Using GPT-4-Turbo

As you may recall, Vectara's free plan provides access to GPT-3.5-Turbo. In general, GPT-3.5-Turbo provides pretty good responses, and is a reasonably powerful LLM. But GPT-4-Turbo is known to have better performance overall in terms of summarization, as you can also see by their ranking on the [HHEM leaderboard](https://github.com/vectara/hallucination-leaderboard). Let's see an example.

We start with the default LLM - GPT-3.5-Turbo, and ask a question in the field of physics:

In [5]:
vq = VectaraQuery(api_key, customer_id, corpus_id)
response = vq.submit_query("What is an atom?")
display(Markdown(response))

An atom is a fundamental unit of matter, consisting of a nucleus made up of protons and neutrons, surrounded by electrons [\[1\]](https://www.feynmanlectures.caltech.edu/I_02.html#:~:text=If%20we%20have%20an%20atom%20with%0Asix%20protons%20inside%20its%20nucleus%2C%20and%20this%20is%20surrounded%20by%0Asix%20electrons%20%28the%20negative%20particles%20in%20the%20ordinary%20world%20of%20matter%0Aare%20all%20electrons%2C%20and%20these%20are%20very%20light%20compared%20with%20the%20protons%0Aand%20neutrons%20which%20make%20nuclei%29%2C%20this%20would%20be%20atom%0Anumber%20six%20in%20the%20chemical%20table%2C%20and%20it%20is%20c). The nucleus is extremely small compared to the size of the atom, with a diameter about 10^-13 cm, whereas the atom has a diameter of about 10^-8 cm [\[2\]](https://www.feynmanlectures.caltech.edu/I_02.html#:~:text=An%20atom%0Ahas%20a%20diameter%20of%20a). Atoms can form molecules by bonding with other atoms in specific ways, with certain atoms preferring particular partners and directions [\[3\]](https://www.feynmanlectures.caltech.edu/I_01.html#:~:text=Atoms%0Aare%20very%20special%3A%20they%20like%20certain%20particular%20partners%2C%20certain%0Aparticular%20direction). Electrons, which are negatively charged, orbit the positively charged nucleus, and the number of electrons in an atom determines its chemical properties [\[1\]](https://www.feynmanlectures.caltech.edu/I_02.html#:~:text=If%20we%20have%20an%20atom%20with%0Asix%20protons%20inside%20its%20nucleus%2C%20and%20this%20is%20surrounded%20by%0Asix%20electrons%20%28the%20negative%20particles%20in%20the%20ordinary%20world%20of%20matter%0Aare%20all%20electrons%2C%20and%20these%20are%20very%20light%20compared%20with%20the%20protons%0Aand%20neutrons%20which%20make%20nuclei%29%2C%20this%20would%20be%20atom%0Anumber%20six%20in%20the%20chemical%20table%2C%20and%20it%20is%20c). In an atom, there can be ions, which are atoms that have gained or lost electrons, leading to an electrical charge [\[4\]](https://www.feynmanlectures.caltech.edu/I_01.html#:~:text=An%20ion%20is%20an%20atom%20which%20either%20has%0Aa%20few%20extra%20electrons%20or%20has%20lost%20a%20f).

We do expect a proper answer for this given it's based on Richard Feynman's lectures. But what if we ask something about biology?

In [6]:
vq = VectaraQuery(api_key, customer_id, corpus_id)
response = vq.submit_query("What is a mammal?")
display(Markdown(response))

Mammals are animals characterized by having two kinds of muscles, namely striated or skeletal muscles and smooth muscles [\[1\]](https://www.feynmanlectures.caltech.edu/I_36.html#:~:text=%2C%E2%80%9D%20but%20if%20we%20take%20a%20less%20prejudiced%20point%20of%20view%20and%0Arestrict%20ourselves%20to%20the%20invertebrates%2C%20so%20that%20we%20cannot%20include%0Aourselves%2C%20and%20ask%20what%20is%20the%20highest%20invertebrate%20animal%2C%20most%0Azoologists%20agree%20that%20the%20octopus%20is%20the%20hi). They are vertebrates with well-developed eyes similar to humans, with the octopus being considered the highest invertebrate animal [\[2\]](https://www.feynmanlectures.caltech.edu/I_03.html#:~:text=The%20central%0Aproblem%20of%20the%20mind%2C%20if%20you%20will%2C%20or%20the%20nervous%20system%2C%20is%20this%3A%20when%0Aan%20animal%20learns%20something%2C%20it%20can%20do%20something%20different%20than%20it%20could%0Abefore%2C%20and%20its%20brain%20cell%20must%20have%20changed%20too%2C%20if%20it%20is%20made%20). Mammals also have a unique brain that allows them to learn and change their behavior, reflecting changes in their brain cells [\[3\]](https://www.feynmanlectures.caltech.edu/I_19.html#:~:text=So%20F%3Dma%20is%20a%20law%0Awhich%20reproduces%20itself%20on%20a%20). Newton's law states that the behavior of larger objects can be understood by studying the motion of their center of mass and external forces acting on them [\[4\]](https://www.feynmanlectures.caltech.edu/I_52.html#:~:text=Take%20another%20example%3A%20One%20of%20the%20substances%20which%20is%20common%20to%20all%0Aliving%20creatures%20and%20that%20is%20fundamental%20to%20lif). This law can reproduce itself on a larger scale, showing the interconnected nature of physical laws.

As we mentioned above, gpt-3.5-turbo generally provides pretty good responses, but sometimes it has a bit of difficulty in making sure it adheres to all of the rules.  In this case, we provided a rule in our default prompt that it should try to stick only to the data at hand, and the fact that it hasn’t stuck to purely information but drawn on its own internal memory could be a problem.  This is something we would classify as a hallucination.  

gpt-4 and gpt-4-turbo generally do a much better job of sticking to the rules, and thus having reduced hallucinations. Let's run the same two queries using our GPT-4-Turbo summarizer:

In [7]:
vq = VectaraQuery(api_key, customer_id, corpus_id, prompt_name = "vectara-experimental-summary-ext-2023-12-11-large")
response = vq.submit_query("what is an atom?")
display(Markdown(response))

An atom is a fundamental unit of matter with a nucleus at its center, which is positively charged and surrounded by negatively charged electrons. The nucleus contains protons, which are electrically charged, and neutrons, which are neutral. Both protons and neutrons are heavy compared to the very light electrons. The number of protons in the nucleus determines the type of atom, such as carbon with six protons or oxygen with eight, and this number also defines the chemical properties of the atom because it dictates the number of electrons surrounding the nucleus [\[1\]](https://www.feynmanlectures.caltech.edu/I_02.html#:~:text=If%20we%20have%20an%20atom%20with%0Asix%20protons%20inside%20its%20nucleus%2C%20and%20this%20is%20surrounded%20by%0Asix%20electrons%20%28the%20negative%20particles%20in%20the%20ordinary%20world%20of%20matter%0Aare%20all%20electrons%2C%20and%20these%20are%20very%20light%20compared%20with%20the%20protons%0Aand%20neutrons%20which%20make%20nuclei%29%2C%20this%20would%20be%20atom%0Anumber%20six%20in%20the%20chemical%20table%2C%20and%20it%20is%20c)[\[2\]](https://www.feynmanlectures.caltech.edu/II_10.html#:~:text=An%20atom%20has%20a%20positive%20charge%20on%0Athe%20nucleus%2C%20which%20is%20surrounded%20by%20negati). Atoms have specific preferences for bonding and orientations, which are essential for forming molecules [\[3\]](https://www.feynmanlectures.caltech.edu/I_01.html#:~:text=Atoms%0Aare%20very%20special%3A%20they%20like%20certain%20particular%20partners%2C%20certain%0Aparticular%20direction).

In [8]:
vq = VectaraQuery(api_key, customer_id, corpus_id, prompt_name = "vectara-experimental-summary-ext-2023-12-11-large")
response = vq.submit_query("what is a mammal?")
display(Markdown(response))

I do not have enough information to answer this question.

That's much better! 

GPT-4-Turbo correctly identifies that the facts provided by the dataset do not have information that can be used to answer this question. No hallucination.

## Using Custom prompts

Now let's go to custom prompts. That is another powerful feature of Vectara Scale plan: you can control and customize your prompts to fit your use-case. This can be helpful in generating responses that are in a certain style or form, as well as change the behavior of the summarizer to perform other actions.

Let's look at a few examples. First we are going to ask the same question with a simple, basic prompt:

In [9]:
prompt1 = '''
[
  {"role": "system", "content": "You are a helpful search assistant. 
                                 Make sure you base your response only on the search results provided."},
  #foreach ($qResult in $vectaraQueryResults)
     {"role": "user", "content": "Give me the $vectaraIdxWord[$foreach.index] search result."},
     {"role": "assistant", "content": "${qResult.getText()}" },
  #end
  {"role": "user", "content": "Generate a summary for the query '${vectaraQuery}' based on the above results."}
]
'''

vq = VectaraQuery(api_key, customer_id, corpus_id, 
                  prompt_name = "vectara-experimental-summary-ext-2023-12-11-large", 
                  prompt_text = prompt1)
response = vq.submit_query("what is an atom?")
display(Markdown(response))

An atom is the fundamental building block of matter, consisting of a nucleus surrounded by electrons. The nucleus contains protons, which are positively charged, and neutrons, which are neutral; both are much heavier than electrons. The number of protons in the nucleus determines the chemical element of the atom (e.g., six protons for carbon, eight for oxygen). Electrons are very light, negatively charged particles that orbit the nucleus or exist in wave patterns around it. Atoms can form ions by gaining or losing electrons, resulting in positive or negative charges, respectively. The chemical properties of an atom are determined by the number of electrons it has. An atom's size is about 10^-8 cm in diameter, while its nucleus is much smaller, about 10^-13 cm, yet nearly all the atom's mass is concentrated in the nucleus.

Now let's play with the prompt to make this more interesting: can we ask our RAG pipeline to respond in bullet points?

In [10]:
prompt2 = '''
[
  {"role": "system", "content": "You are a helpful search assistant, with expertise in physics. 
                                 You respond in bullet points."},
  #foreach ($qResult in $vectaraQueryResults)
     {"role": "user", "content": "Give me the $vectaraIdxWord[$foreach.index] search result."},
     {"role": "assistant", "content": "${qResult.getText()}" },
  #end
  {"role": "user", "content": "Generate a summary for the query '${vectaraQuery}' based on the above results."}
]
'''

vq = VectaraQuery(api_key, customer_id, corpus_id, 
                  prompt_name = "vectara-experimental-summary-ext-2023-12-11-large", 
                  prompt_text = prompt2)
response = vq.submit_query("what is an atom?")
display(Markdown(response))

- Atoms are the basic units of matter with specific characteristics, preferring certain partners and directions.
- An atom consists of a nucleus surrounded by electrons; the nucleus contains protons and neutrons.
- Protons are positively charged, while neutrons are neutral; electrons are negatively charged and much lighter.
- The number of protons in the nucleus determines the chemical element (e.g., six protons for carbon).
- Chemical properties of an element depend on the number of electrons surrounding the nucleus.
- Atoms can form ions by gaining or losing electrons, which can lead to electrical attraction in compounds like salt.
- The nucleus is very small compared to the atom's size but contains most of the atom's mass.
- In an electric field, the positions of electrons can be distorted, leading to a separation of charges within the atom.

Last one - Explain atoms to me like I'm a 5-year-old:

In [11]:
prompt3 = '''
[
  {"role": "system", "content": "You are a helpful search assistant, with expertise in physics. 
                                 Respond in a way that a five years old can understand."},
  #foreach ($qResult in $vectaraQueryResults)
     {"role": "user", "content": "Give me the $vectaraIdxWord[$foreach.index] search result."},
     {"role": "assistant", "content": "${qResult.getText()}" },
  #end
  {"role": "user", "content": "Generate a summary for the query|'${vectaraQuery}' based on the above results."}
]
'''

vq = VectaraQuery(api_key, customer_id, corpus_id, 
                  prompt_name = "vectara-experimental-summary-ext-2023-12-11-large", 
                  prompt_text = prompt3)
response = vq.submit_query("what is an atom?")
display(Markdown(response))

An atom is like a tiny little thing that is super small, even smaller than a crumb! It has a center called a nucleus, which is like the heart of the atom. The nucleus is made of protons, which are positively charged, and neutrons, which don't have any charge—they're neutral. Around the nucleus, there are electrons, which are very, very light and have a negative charge. They zoom around the nucleus like bees around a hive.

Atoms like to have friends and stick together in special ways. Sometimes they join up to make pairs, like two oxygen atoms coming together to be happy. Other times, they can lose or gain little bits called electrons and become ions, which are like atoms with a special electric charge. These ions can stick together in a crystal, like in salt, or they can float around in water if the salt is dissolved.

Every atom has a special number that tells us how many protons and electrons it has, and this number makes each type of atom different. For example, carbon has six protons and six electrons, and oxygen has eight of each. This special number is super important because it decides what kind of stuff the atom can make when it joins with other atoms.

So, an atom is like a tiny building block that makes up everything around us, and it's made of even tinier parts that all work together in amazing ways!

There's a lot more you can do with GPT-4-Turbo and custom prompts and we at Vectara are curious to hear what are you building with these powerful capabilities, so please don't hesistate to share your success with us.