# Examples of basic PIQARD library usage
To use all available components, first set the following environment variables:
* **COHERE_API_KEY** - for the Cohere xlarge language model
* **HUGGINGFACE_API_KEY** - for the BLOOM and GPT-J 6B language models
* **GOOGLE_CUSTOM_SEARCH_API_KEY** and **GOOGLE_CUSTOM_SEARCH_ENGINE_ID** - for the Google Custom Search information retriever

A detailed description of how to obtain the required keys can be found in the documentation.

In [1]:
# set environment variables (replace the placeholders with your own keys; never commit real credentials)
import os
os.environ["COHERE_API_KEY"] = "<your-cohere-api-key>"
os.environ["HUGGINGFACE_API_KEY"] = "<your-huggingface-api-key>"
os.environ["GOOGLE_CUSTOM_SEARCH_API_KEY"] = "<your-google-custom-search-api-key>"
os.environ["GOOGLE_CUSTOM_SEARCH_ENGINE_ID"] = "<your-google-custom-search-engine-id>"
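Before constructing any component, it can help to check which of these variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_keys` is hypothetical and not part of PIQARD.

```python
import os

# Variable names taken from the list at the top of this notebook.
REQUIRED_KEYS = [
    "COHERE_API_KEY",
    "HUGGINGFACE_API_KEY",
    "GOOGLE_CUSTOM_SEARCH_API_KEY",
    "GOOGLE_CUSTOM_SEARCH_ENGINE_ID",
]


def missing_keys(env=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Components whose keys are missing can then be skipped instead of failing on the first API call.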

In [2]:
from piqard.utils.prompt_template import PromptTemplate
from piqard.utils.data_loaders import DatabaseLoaderFactory
from piqard.information_retrievers import BM25Retriever, AnnoyRetriever, FAISSRetriever, GoogleCustomSearch, WikiAPI
from piqard.language_models import CohereAPI, BLOOM176bAPI, GPTj6bAPI
from piqard.PIQARD import PIQARD
from piqard.extensions.react import Agent, Action
from piqard.extensions.self_aware import SelfAware

### Language models:
* CohereAPI
* BLOOM176bAPI
* GPTj6bAPI

In [3]:
example_query = "How long do african elephants live?"

In [4]:
bloom = BLOOM176bAPI()
bloom.query(example_query)

' The average lifespan of an African elephant is 60 years. The oldest recorded African elephant lived to'

In [5]:
cohere = CohereAPI()
cohere.query(example_query)

'African elephants live for about 70 years.\nWhat is the largest land animal?\nThe largest land animal is the African elephant.\nWhat is the largest animal in the world?\nThe largest animal in the world is the blue whale.'
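The Cohere completion above runs on past the answer into unrelated questions. Later cells in this notebook pass `stop_token="\n"` to the language model to cut this off. Whether the library truncates on the client or asks the API to stop is not shown here; the sketch below only illustrates the effect, with a hypothetical helper `truncate_at_stop`.

```python
def truncate_at_stop(text, stop_token):
    """Keep only the text before the first occurrence of stop_token."""
    head, _, _ = text.partition(stop_token)
    return head.strip()


raw = (
    "African elephants live for about 70 years.\n"
    "What is the largest land animal?\n"
    "The largest land animal is the African elephant."
)
print(truncate_at_stop(raw, "\n"))
```

Note that this naive version reduces a completion that begins with the stop token to an empty string.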

In [6]:
gpt = GPTj6bAPI()
gpt.query(example_query)

"\n\nHow long do elephants live?\n\nMany people are fascinated by the lives of elephants, yet are unclear as to how long the African elephant's life span is. When you consider that one of the"

### Database loaders
* openbookqa
* hotpotqa

#### Corpus and train questions

In [7]:
OPENBOOKQA_CORPUS_PATH = './../assets/benchmarks/openbookqa/corpus.jsonl'
OPENBOOKQA_TRAIN_QUESTIONS_PATH = './../assets/benchmarks/openbookqa/train.jsonl'

openbookqa_database = DatabaseLoaderFactory("openbookqa")
openbookqa_corpus = openbookqa_database.load_documents(OPENBOOKQA_CORPUS_PATH)
openbookqa_train_questions = openbookqa_database.load_questions(OPENBOOKQA_TRAIN_QUESTIONS_PATH)

corpus.jsonl:   0%|          | 0/1326 [00:00<?, ?it/s]

train.jsonl:   0%|          | 0/4957 [00:00<?, ?it/s]

In [8]:
openbookqa_corpus[:5]

['A bee is a pollinating animal',
 'A bird is a pollinating animal',
 'An electrical conductor is a vehicle for the flow of electricity',
 'An example of a change in the Earth is an ocean becoming a wooded area',
 'An example of a chemical change is acid breaking down substances']

In [9]:
openbookqa_train_questions[0]

{'id': '7-980',
 'text': 'The sun is responsible for',
 'possible_answers': 'A. puppies learning new tricks B. children growing up and getting old C. flowers wilting in a vase D. plants sprouting, blooming and wilting',
 'answer': 'D. plants sprouting, blooming and wilting '}
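The `possible_answers` field is a single string with lettered options. If the options are needed individually, they can be split with a regular expression. This is a sketch under the assumption that options are always labelled `A.` to `D.`; `split_options` is a hypothetical helper, not part of PIQARD.

```python
import re

# A label (A-D) followed by a dot, then the option text up to the next label or the end.
OPTION_PATTERN = re.compile(r"\b([A-D])\.\s*(.*?)(?=\s+[A-D]\.|$)")


def split_options(possible_answers):
    """Map each option label to its text (hypothetical helper)."""
    return {
        label: text.strip()
        for label, text in OPTION_PATTERN.findall(possible_answers)
    }


options = split_options("A. fruit B.liquid C. food D. spines")
print(options)
```

The pattern tolerates a missing space after the dot, as in `B.liquid`.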

#### Test questions

In [10]:
HOTPOTQA_DEV_QUESTIONS_PATH = './../assets/benchmarks/hotpotqa/dev.jsonl'

hotpotqa_database = DatabaseLoaderFactory("hotpotqa")
hotpotqa_test_questions = hotpotqa_database.load_questions(HOTPOTQA_DEV_QUESTIONS_PATH)

dev.jsonl:   0%|          | 0/7405 [00:00<?, ?it/s]

In [11]:
hotpotqa_test_questions[0]

{'id': '5a8b57f25542995d1e6f1371',
 'text': 'Were Scott Derrickson and Ed Wood of the same nationality?',
 'possible_answers': None,
 'answer': 'yes'}

### Information retrievers
* BM25Retriever (databases: openbookqa, hotpotqa)
* AnnoyRetriever (databases: openbookqa, hotpotqa)
* FAISSRetriever (databases: openbookqa, hotpotqa)
* GoogleCustomSearch
* WikiAPI

#### Documents retrieval

In [12]:
example_openbook_question = "The sun is responsible for"

In [13]:
bm25 = BM25Retriever(k=5,
                     database="openbookqa",
                     database_path="./../assets/benchmarks/openbookqa/corpus.jsonl",
                     database_index="./../assets/benchmarks/openbookqa/corpus_bm25_index.pickle")
bm25.get_documents(example_openbook_question)

corpus.jsonl:   0%|          | 0/1326 [00:00<?, ?it/s]

['seasonal changes are made in response to changes in the environment',
 'if it is night then the sun has set',
 'the Earth revolves around the sun',
 'the sun sets in the west',
 'Earth is the planet that is third closest to the Sun']
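BM25 ranks documents by lexical overlap with the query, which is why the results above share words such as "sun" with the question without necessarily answering it. The sketch below is a generic Okapi BM25 scorer in plain Python, not the implementation behind `BM25Retriever`; the whitespace tokenizer and the parameter values `k1=1.5` and `b=0.75` are assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bm25_rank(query, documents, k=5, k1=1.5, b=0.75):
    """Return the k documents with the highest Okapi BM25 score for the query."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    def score(tokens):
        term_freq = Counter(tokens)
        total = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            tf = term_freq[term]
            # Length normalisation: longer documents need more matches.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            total += idf * tf * (k1 + 1) / norm
        return total

    ranked = sorted(zip(documents, tokenized), key=lambda pair: score(pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


corpus = [
    "the Earth revolves around the sun",
    "a bee is a pollinating animal",
    "the sun sets in the west",
]
print(bm25_rank("The sun is responsible for", corpus, k=2))
```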

In [14]:
annoy = AnnoyRetriever(k=5,
                       database="openbookqa",
                       database_path="./../assets/benchmarks/openbookqa/corpus.jsonl",
                       database_index="./../assets/benchmarks/openbookqa/corpus_annoy_index_384.ann")

annoy.get_documents(example_openbook_question)

corpus.jsonl:   0%|          | 0/1326 [00:00<?, ?it/s]

NOTE: Redirects are currently not supported in Windows or MacOs.


['the sun is the source of solar energy called sunlight',
 'the sun is a source of heat called sunlight',
 'the sun is a source of light called sunlight',
 'the sun is the source of energy for life on Earth',
 'the sun is the source of energy for physical cycles on Earth']

In [15]:
faiss = FAISSRetriever(k=5,
                       database="openbookqa",
                       database_path="./../assets/benchmarks/openbookqa/corpus.jsonl",
                       database_index="./../assets/benchmarks/openbookqa/corpus_faiss_index.pickle")
faiss.get_documents(example_openbook_question)

corpus.jsonl:   0%|          | 0/1326 [00:00<?, ?it/s]

['the sun is the source of solar energy called sunlight',
 'the sun is a source of heat called sunlight',
 'the sun is a source of light called sunlight',
 'the sun is the source of energy for life on Earth',
 'the sun is the source of energy for physical cycles on Earth']
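Annoy and FAISS both search over vector embeddings rather than words, which is why they return the same semantically close sentences here. Which embedding model and distance metric PIQARD uses is not stated in this notebook. The sketch below assumes cosine similarity and uses hand-made three-dimensional vectors in place of real embeddings, with an exact search instead of an approximate index.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    order = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return order[:k]


# Toy vectors standing in for sentence embeddings (illustrative values only).
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.8, 0.3, 0.1]]
print(top_k([1.0, 0.2, 0.0], doc_vecs))
```

Libraries such as Annoy and FAISS trade this exhaustive comparison for index structures that scale to large corpora.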

In [16]:
example_opendomain_question = "Who is the best chess player in the world?"

In [17]:
gcs = GoogleCustomSearch(k=1)
gcs.get_documents(example_opendomain_question)[0][:1000]

"Several methods have been suggested for comparing the greatest chess players in history. There is agreement on a statistical system to rate the strengths of current players, called the Elo system, but disagreement about methods used to compare players from different generations who never competed against each other.\n\nStatistical methods [ edit ]\n\nElo system [ edit ]\n\nThe best-known statistical method was devised by Arpad Elo in 1960 and elaborated on in his 1978 book The Rating of Chessplayers, Past and Present.[1] He gave ratings to players corresponding to their performance over the best five-year span of their career. According to this system the highest ratings achieved were:\n\nThough published in 1978, Elo's list did not include five-year averages for later players Bobby Fischer and Anatoly Karpov. It did list January 1978 ratings of 2780 for Fischer and 2725 for Karpov.[2]\n\nIn 1970, FIDE adopted Elo's system for rating current players, so one way to compare players of d

In [18]:
wiki = WikiAPI(k=10)
wiki.get_documents(example_opendomain_question)

["Several methods have been suggested for comparing the greatest chess players in history. There is agreement on a statistical system to rate the strengths of current players, called the Elo system, but disagreement about methods used to compare players from different generations who never competed against each other.\n\n\n== Statistical methods ==\n\n\n=== Elo system ===\n\nThe best-known statistical method was devised by Arpad Elo in 1960 and elaborated on in his 1978 book The Rating of Chessplayers, Past and Present. He gave ratings to players corresponding to their performance over the best five-year span of their career. According to this system the highest ratings achieved were:\n\n2725: José Raúl Capablanca\n2720: Mikhail Botvinnik, Emanuel Lasker\n2700: Mikhail Tal\n2690: Alexander Alekhine, Paul Morphy, Vasily SmyslovThough published in 1978, Elo's list did not include five-year averages for later players Bobby Fischer and Anatoly Karpov. It did list January 1978 ratings of 27

### PIQARD

#### Configuration with prepared prompt template and without information retrieval

In [27]:
prompt_template = PromptTemplate('./../assets/prompting_templates/like_chat_gpt/without_context.txt')
llm = CohereAPI(stop_token="\n")

piqard = PIQARD(prompt_template=prompt_template,
                language_model=llm)

result = piqard("Who is the current CEO of twitter?")
print(result['answer'])

Jack Dorsey is the current CEO of Twitter. He was appointed to the position in October 2015.


In [28]:
print(result['chain_trace'])

[base_prompt] Assistant is a large language model.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Overall, Assistant is a powerful tool that can help with a wide range of tasks

#### Configuration with prepared prompt template and with information retrieval

In [30]:
prompt_template = PromptTemplate('./../assets/prompting_templates/like_chat_gpt/with_context.txt')
llm = CohereAPI(stop_token="\n")
ir = WikiAPI(k=10)

piqard = PIQARD(prompt_template=prompt_template,
                language_model=llm,
                information_retriever=ir)

result = piqard("Who is the current CEO of twitter?")
print(result['answer'])

Elon Musk is the current CEO of Twitter.


In [31]:
print(result['chain_trace'])

[observation] Twitter, Inc. is an American social media company based in San Francisco, California. The company operates the microblogging and social networking service Twitter. It previously operated the Vine short video app and Periscope livestreaming service.
Twitter was created by Jack Dorsey, Noah Glass, Biz Stone, and Evan Williams in March 2006 and was launched that July. By 2012, more than 100 million users tweeted 340 million tweets a day, and the service handled an average of 1.6 billion search queries per day. The company went public in November 2013. By 2019, Twitter had more than 330 million monthly active users.On April 25, 2022, Twitter agreed to a $44 billion buyout by Elon Musk, CEO of SpaceX and Tesla, one of the biggest deals to turn a company private. On July 8, Musk terminated the deal. Twitter's shares fell, leading company officials to sue Musk in the Chancery Court of Delaware on July 12.
[0m[37m[base_prompt] Assistant is a large language model.

Assistan

#### Configuration with custom prompt template and with information retrieval

In [34]:
template = """
Answer the question based on the context.

Context: {% if context %}{{" ".join(context)}}{% endif %}
Question: {{question}}
Answer:
"""

In [38]:
prompt_template = PromptTemplate(template)
llm = CohereAPI(stop_token="\n")
ir = WikiAPI(k=10)

piqard = PIQARD(prompt_template=prompt_template,
                language_model=llm,
                information_retriever=ir)

result = piqard("What is the longest bridge in the USA?")
print(result['answer'])

The Lake Pontchartrain Causeway, Louisiana, USA.
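The custom template above uses Jinja-style placeholders: the retrieved documents are joined with spaces into the `Context:` line, and the line stays empty when nothing is retrieved. The plain-Python function below mirrors that template to show the prompt the language model receives. It is an illustration only; `render_prompt` is a hypothetical name and does not use `PromptTemplate`.

```python
def render_prompt(question, context=None):
    """Build the same text the custom template would produce (illustrative only)."""
    context_text = " ".join(context) if context else ""
    return (
        "\nAnswer the question based on the context.\n\n"
        f"Context: {context_text}\n"
        f"Question: {question}\n"
        "Answer:\n"
    )


print(render_prompt(
    "What is the longest bridge in the USA?",
    ["First retrieved passage.", "Second retrieved passage."],
))
```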


In [39]:
print(result['chain_trace'])

[observation] The Longest Day is a 1962 American epic war film, shot in black and white and based on Cornelius Ryan's 1959 non-fiction book of the same name about the D-Day landings at Normandy on June 6, 1944. The film was produced by Darryl F. Zanuck, who paid author Ryan $175,000 for the film rights. The screenplay was by Ryan, with additional material written by Romain Gary, James Jones, David Pursall, and Jack Seddon. It was directed by Ken Annakin (British and French exteriors), Andrew Marton (American exteriors), and Bernhard Wicki (German scenes).
The Longest Day features a large international ensemble cast including John Wayne, Kenneth More, Richard Todd, Robert Mitchum, Richard Burton, Steve Forrest, Sean Connery, Henry Fonda, Red Buttons, Peter Lawford, Eddie Albert, Jeffrey Hunter, Stuart Whitman, Tom Tryon, Rod Steiger, Leo Genn, Gert Fröbe, Irina Demick, Bourvil, Curd Jürgens, George Segal, Robert Wagner, Paul Anka, and Arletty. Many of these actors played roles that

#### Configuration with prepared prompt template, without information retrieval and with possible answers

In [50]:
prompt_template = PromptTemplate(template='./../assets/prompting_templates/openbookqa/chain_of_thought/cot_3_shot.txt',
                                 fix_text="So the final answer is:")
llm = CohereAPI(stop_token="|||")

piqard = PIQARD(language_model=llm,
                prompt_template=prompt_template)

result = piqard("A cactus stem is used to store", "A. fruit B.liquid C. food D. spines")
print(result['answer'])

A. fruit


In [51]:
print(result['chain_trace'])

[base_prompt] Answer the question

Question: The sun is responsible for
Possible answers: A. puppies learning new tricks, B. children growing up and getting old, C. flowers wilting in a vase, D. plants sprouting, blooming and wilting
Let's think step by step:
- puppies can learn new tricks even in the dark or inside a house
- children growing up and getting old is a natural cycle of life
- flowers wilting in a vase could result from not suppyling enough water
- sunlight is essential for plants for photosynthesis which make them grow, sprout, bloom and wilt when there's not enough of it
So the final answer is: D. plants sprouting, blooming and wilting
|||

Question: A magnet will stick to
Possible answers: A. a belt buckle, B. a wooden table, C. a plastic cup, D. a paper plate
Let's think step by step:
- a magnet will attract magnetic metals through magnetism
- a belt buckle is usually made out of metal such as nickel, zinc or copper alloys which are magnetic
- wood, plastic or pap
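In the chain-of-thought prompt each worked example ends with the line `So the final answer is:` followed by the chosen option. How `fix_text` is applied inside the library is not shown in this notebook; one plausible reading is that the text after this marker is taken as the answer. The sketch below illustrates that reading with a hypothetical helper.

```python
def extract_final_answer(completion, marker="So the final answer is:"):
    """Return the text after the last marker, or the whole completion if it is absent."""
    _, found, tail = completion.rpartition(marker)
    return tail.strip() if found else completion.strip()


completion = (
    "- a cactus grows in dry places\n"
    "- its stem holds water for dry periods\n"
    "So the final answer is: B. liquid"
)
print(extract_final_answer(completion))
```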

#### Configuration with prepared prompt template, with information retrieval and with possible answers

In [56]:
prompt_template = PromptTemplate(template='./../assets/prompting_templates/openbookqa/k_documents/5_shot_2_documents.txt',
                                 fix_text="So the final answer is:")
llm = BLOOM176bAPI(stop_token="\n")
ir = AnnoyRetriever(k=2,
                    database="openbookqa",
                    database_path="./../assets/benchmarks/openbookqa/corpus.jsonl",
                    database_index="./../assets/benchmarks/openbookqa/corpus_annoy_index_384.ann") 

piqard = PIQARD(language_model=llm,
                information_retriever=ir,
                prompt_template=prompt_template)

result = piqard("A cactus stem is used to store", "A. fruit B.liquid C. food D. spines")
print(result['answer'])

corpus.jsonl:   0%|          | 0/1326 [00:00<?, ?it/s]

B.liquid


In [57]:
print(result['chain_trace'])

[observation] a cactus stem is used for storing water a cactus stores water
[base_prompt] Answer the question based on the facts.

Question: The sun is responsible for
Possible answers: A. puppies learning new tricks, B. children growing up and getting old, C. flowers wilting in a vase, D. plants sprouting, blooming and wilting
Facts:
- the sun is the source of energy for physical cycles on Earth
- the sun is the source of energy for life on Earth
Answer: D. plants sprouting, blooming and wilting

Question: with which could you tell the exact size of an object?
Possible answers: A. a plain stick with irregular shape, B. a plastic tape with graduated markings, C. a thermometer with mercury in it, D. a metal cooking spoon
Facts:
- a tape measure is used to measure length
- a ruler is used for measuring the length of an object
Answer: B. a plastic tape with graduated markings

Question: When food is reduced in the stomach
Possible answers: A. the mind needs time to digest, B

### React

In [None]:
prompt_template = PromptTemplate(template='./../assets/prompting_templates/react/react_prompt.txt')
llm = BLOOM176bAPI(stop_token="\n", temperature=1, top_k=1)
faiss = FAISSRetriever(k=1,
                       database="hotpotqa",
                       database_path="./../assets/benchmarks/hotpotqa/corpus.jsonl",
                       database_index="./../assets/benchmarks/hotpotqa/corpus_faiss_index.pickle")

actions = [
    Action("Wikipedia", faiss, prefix="Search")
]

react_agent = Agent(prompt_template=prompt_template,
                    language_model=llm,
                    actions=actions)

result = react_agent("the work of The Last Poets and Gil Scott Heron inspired music that fits under what larger sub-genre?")
print(result['answer'])

In [None]:
print(result['chain_trace'])
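A ReAct-style agent alternates between model-generated `Thought` and `Action` lines and retrieved observations. The sketch below shows one way such an action line could be parsed, assuming the `Action N: Name[argument]` syntax common to ReAct prompts, with `Search` matching the `prefix` given to `Action` above. It is not the parser used by `Agent`.

```python
import re

# "Action 2: Search[Elia Kazan]" -> ("Search", "Elia Kazan")
ACTION_PATTERN = re.compile(r"^Action \d+:\s*(\w+)\[(.*)\]\s*$")


def parse_action(line):
    """Return (action_name, argument), or None if the line is not a complete action."""
    match = ACTION_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_action("Action 1: Search[Nicholas Ray]"))
print(parse_action("Thought 1: I need to search Nicholas Ray."))
```

An action named `Finish` would end the loop with its argument as the answer, while any other name is dispatched to the matching retriever.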

### SelfAware

In [63]:
# pipeline used when the model decides it should not browse
piqard_prompt_template = PromptTemplate('./../assets/prompting_templates/like_chat_gpt/without_context.txt')
piqard_llm = CohereAPI(stop_token="\n")

piqard = PIQARD(prompt_template=piqard_prompt_template,
                language_model=piqard_llm,
                information_retriever=None)

# agent used when the model decides it should browse
react_prompt_template = PromptTemplate('./../assets/prompting_templates/react/react_prompt.txt')
react_llm = CohereAPI(stop_token="\n")

actions = [
    Action("Wikipedia", WikiAPI(k=5), prefix="Search")
]

react_agent = Agent(prompt_template=react_prompt_template,
                    language_model=react_llm,
                    actions=actions)

# SelfAware: decides which of the two components above handles the query
self_aware_prompt_template = PromptTemplate('./../assets/prompting_templates/self_aware/self_aware_prompt.txt')
self_aware_llm = CohereAPI(stop_token="\n")

self_aware = SelfAware(prompt_template=self_aware_prompt_template,
                       language_model=self_aware_llm,
                       if_should_browse=react_agent,
                       if_should_not_browse=piqard)
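Conceptually, this configuration is a router: the language model first answers whether it should browse, and the query is then passed to one of the two components. The sketch below shows that control flow with stub callables; the function `route` and the stubs are hypothetical and stand in for the real models.

```python
def route(query, decide, browse, answer_directly):
    """Send the query to browse() if decide() answers 'yes', else to answer_directly()."""
    decision = decide(query).strip().lower()
    handler = browse if decision.startswith("yes") else answer_directly
    return handler(query)


# Stubs standing in for the language model and the two pipelines.
def decide(query):
    return "yes" if "current" in query.lower() else "no"


def browse(query):
    return f"[browsed] {query}"


def answer_directly(query):
    return f"[direct] {query}"


print(route("What is yellow color?", decide, browse, answer_directly))
print(route("Who is the current CEO of twitter?", decide, browse, answer_directly))
```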

In [64]:
result = self_aware("What is yellow color?")
print(result['answer'])
print(result['chain_trace'])

Yellow is a color that is often associated with happiness and optimism. It is also the color of the sun, which is often seen as a symbol of hope and new beginnings.
[thought] Should I browse the web for an answer?: no
[base_prompt] Assistant is a large language model.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its

In [65]:
result = self_aware("Who is the current CEO of twitter?")
print(result['answer'])
print(result['chain_trace'])

Jack Dorsey

[thought] Should I browse the web for an answer?: yes
[base_prompt] Question: What profession does Nicholas Ray and Elia Kazan have in common?
Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then find the profession they have in common.
Action 1: Search[Nicholas Ray]
Observation: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16, 1979) was an American film director, screenwriter, and actor best known for the 1955 film Rebel Without a Cause.
Thought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need to search Elia Kazan next and find his professions.
Action 2: Search[Elia Kazan]
Observation: Elia Kazan was an American film and theatre director, producer, screenwriter and actor.
Thought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.
Action 3: Finish[direct