# Prompt Engineering 101

Previously, you have seen how to use the `transformers` library to download state-of-the-art large language models (LLMs). In this notebook, we will get a transformer to answer our questions by engineering a suitable prompt.

To achieve this, we will be using [`langchain`](), a framework for developing applications powered by language models. Langchain helps build applications that are context-aware and that rely on a language model to reason. We cannot, of course, cover all aspects of `langchain` in a single notebook. Instead, we only cover the bits that we'll need for our demo application.

Unsurprisingly, langchain is built around the concept of _chains_. A chain connects LLMs with sources of data. 
For a simple chain, we only need an LLM and a prompt. We can use Hugging Face `transformers` to take care of the LLM:

In [1]:
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(model_id="Qwen/Qwen1.5-0.5B", task="text-generation")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


By the way, you can also use the `transformers` `AutoClass`es:

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import pipeline

model_id = "Qwen/Qwen1.5-0.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
llm = HuggingFacePipeline(pipeline=pipe)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


With the model now loaded, we can move to the second ingredient, the prompt. We'll use a Question-Answer-style template:

In [46]:
from langchain.prompts import ChatPromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = ChatPromptTemplate.from_template(template)

Templates use Python's string formatting. The `{question}` part of the template is a normal Python format field, so you can also fill it in directly as follows:

In [47]:
template.format(question="What is the meaning of life?")

"Question: What is the meaning of life?\n\nAnswer: Let's think step by step."

Finally, you can compose the individual components to build a chain:

In [48]:
chain = prompt | llm
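
To build some intuition for what the `|` operator does, here is a minimal plain-Python sketch of the idea. The `Step` class and the toy components are hypothetical stand-ins written for this illustration; they are not how `langchain` implements its chains, only one way operator-based composition could work.

```python
class Step:
    """Hypothetical stand-in for a chain component (not a langchain class)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that runs `a` first, then feeds its output to `b`.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# A toy "prompt" that fills a template, and a toy "LLM" that just upper-cases its input.
toy_prompt = Step(lambda inputs: "Question: {question}".format(**inputs))
toy_llm = Step(lambda text: text.upper())

toy_chain = toy_prompt | toy_llm
result = toy_chain.invoke({"question": "What is a chain?"})
print(result)  # QUESTION: WHAT IS A CHAIN?
```

The point of the design is that each component only needs an `invoke` method; the composed object is itself a component and can be piped further.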

You can run a chain using `chain.invoke`. This method substitutes the arguments you provide into the prompt template and passes the resulting prompt on to the LLM.

In [49]:
chain.invoke({"question": "What is the meaning of life?"})

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


' The meaning of life is a philosophical question that has been debated for centuries. It is a question that has been asked by many people, and it is not easy to answer. \n\nOne possible answer to the question "What is the meaning of life?"'

Awesome! Let's make this a bit more user-friendly with Widgets. Feel free to ask the model your most pressing questions.

In [50]:
import ipywidgets as widgets
from IPython.display import display

# Create a text box
question_textbox = widgets.Text(description='Question:')

# Create a button
submit_button = widgets.Button(description='Submit')

# Define a function to handle button click event
def submit_question(button):
    question = question_textbox.value
    # Process the question here
    print(chain.invoke({"question": question}))  # <-- The magic happens here!

# Attach the function to the button click event
submit_button.on_click(submit_question)

# Display the text box and button
display(question_textbox, submit_button)


Text(value='', description='Question:')

Button(description='Submit', style=ButtonStyle())

That's all fun and games, but what about questions that need a bit more context? Let's say we want the LLM to answer questions using information from Wikipedia. We can provide the LLM with context by using _retrieval_.

> Retrieval is useful when you have too much data to pass to the LLM directly. You can then use a retriever to fetch only the most relevant pieces and pass those in.
> In this process, we will look up relevant documents from a Retriever and then pass them into the prompt. A Retriever can be backed by anything - a SQL table, the internet, etc

`langchain` provides a catalogue of retrievers, and you can also write your own. Here, we are going to use the built-in Wikipedia retriever.

In [51]:
from langchain.retrievers import WikipediaRetriever

`WikipediaRetriever` has these arguments:
- optional `lang`: `default="en"`. Use it to search in a specific language edition of Wikipedia.
- optional `load_max_docs`: `default=100`. Use it to limit the number of downloaded documents. Downloading all 100 documents takes time, so use a small number for experiments. There is currently a hard limit of 300.
- optional `load_all_available_meta`: `default=False`. By default, only the most important fields are downloaded: Published (date when the document was published/last updated), title, Summary. If True, the other fields are downloaded as well.

For every retriever in `langchain`, you can retrieve documents matching some `query` using `get_relevant_documents(query)`. So, to look up a Wikipedia article:

In [52]:
retriever = WikipediaRetriever()
docs = retriever.get_relevant_documents(query="Macrobrochis notabilis")
docs

[Document(page_content='Macrobrochis notabilis is a moth of the family Erebidae. It was described by Yasunori Kishida in 1992. It is found in Thailand.\n\n\n== References ==', metadata={'title': 'Macrobrochis notabilis', 'summary': 'Macrobrochis notabilis is a moth of the family Erebidae. It was described by Yasunori Kishida in 1992. It is found in Thailand.', 'source': 'https://en.wikipedia.org/wiki/Macrobrochis_notabilis'}),
 Document(page_content='Macrobrochis is a genus of moths in the subfamily Arctiinae first described by Gottlieb August Wilhelm Herrich-Schäffer in 1855.\n\n\n== Species ==\nMacrobrochis alba (Fang, 1990)\nMacrobrochis albifascia (Fang, 1982)\nMacrobrochis albovenosa Černý, 1990\nMacrobrochis borneensis Roepke, 1939\nMacrobrochis dirhabdus (Rothschild, 1920)\nMacrobrochis fukiensis (Daniel, 1952)\nMacrobrochis gigas Walker, 1854\nMacrobrochis grahami (Schaus, 1924)\nMacrobrochis hampsoni (Schaus, 1924)\nMacrobrochis holosericea (Hampson, 1901)\nMacrobrochis infern

Awesome, now we can access Wikipedia. But how do we get the Wikipedia articles _into_ the model and then make the model answer questions about the article?

The idea is the following: We engineer a prompt that contains the article as _context_. This could look something like the template in the next cell.

In [53]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

Next, we need to create a pipeline that will use the `WikipediaRetriever` to retrieve the relevant documents and then use the Hugging Face pipeline to generate the answer.

Our chain now has two inputs, the _context_ and the _question_. `langchain` uses dictionaries to combine multiple inputs in one step:

In [54]:
from langchain_core.output_parsers import StrOutputParser

from operator import itemgetter

chain = (
    # The chain is composed of the following steps:
    # 1. Retrieve the relevant documents from Wikipedia AND get the question from the input dictionary.
    # 2. Format the prompt with the context and the question.
    # 3. Generate the answer.
    # 4. Parse the answer.
    {"context": itemgetter("context") | retriever, "question": itemgetter("question")}
    | prompt
    | llm
    | StrOutputParser()
)

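`itemgetter(key)(d)` is the same as `d[key]`. It comes from the `operator` module in Python's standard library. Unlike `d.get(key)`, it raises a `KeyError` if the key is missing.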
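
A quick check of how `itemgetter` behaves, using the same input dictionary as the chain. The dictionary comprehension is only a plain-Python illustration of the "one input, several named outputs" idea behind the first step of the chain; `langchain` performs this fan-out internally, and the variable names here are made up for the example.

```python
from operator import itemgetter

inputs = {"context": "Point process operation", "question": "What is thinning?"}

# itemgetter("question") returns a callable that looks up that key.
get_question = itemgetter("question")
print(get_question(inputs))  # What is thinning?

# A dict of callables applied to the same input, mirroring the first step of the chain.
first_step = {"context": itemgetter("context"), "question": itemgetter("question")}
fanned_out = {name: getter(inputs) for name, getter in first_step.items()}

# A missing key raises KeyError (it behaves like inputs["history"], not inputs.get("history")).
try:
    itemgetter("history")(inputs)
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
```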

`StrOutputParser` is a class that converts the model's output into a plain string. It's not particularly important here, because our LLM already returns text, but it keeps the output type of the chain consistent.

Invoking the chain works the same as before:

In [55]:
chain.invoke({"context":"Point process operation", "question":"What is thinning?"})

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.


'Answer:\n\nAssistant: Thinning is a mathematical operation that removes or thinns points from a point process. It is a type of point process transformation that is used to construct new point processes. The process involves removing or thinning points from a point process'
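
The retriever returns a list of `Document` objects, and the chain above hands that list to the prompt unchanged, so `{context}` is presumably filled with the list's string form, metadata included. A common refinement is to join only the article texts before they reach the prompt. The sketch below assumes nothing beyond the `page_content` attribute seen in the retriever output; `FakeDocument`, `format_docs`, and `max_chars` are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in mimicking the `page_content` and `metadata` fields shown above."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs, max_chars=500):
    """Join document texts into one context string, truncating each to `max_chars` characters."""
    return "\n\n".join(doc.page_content[:max_chars] for doc in docs)


sample_docs = [
    FakeDocument("Macrobrochis notabilis is a moth of the family Erebidae."),
    FakeDocument("Macrobrochis is a genus of moths in the subfamily Arctiinae."),
]
context_text = format_docs(sample_docs)
print(context_text)
```

In the chain, such a function could likely be slotted in right after the retriever, e.g. `itemgetter("context") | retriever | format_docs`, since `langchain` can wrap plain functions when they are piped with one of its components. Truncating each article also helps keep the prompt within the context window of a small model.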

Awesome! There is just one feature missing: What if we want to ask follow-up questions?

We can easily add in conversation history. This primarily means adding a history parameter to the template and updating it every time we invoke the chain. For this, let us re-engineer our prompt.

So far, we have been defining prompts via a single string. However, we can also define them as a chat conversation. Our previous prompt would look as follows.

In [96]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Answer the question based only on the following context:\n{context}"),
        ("human", "Question: {question}"),
    ]
)

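Langchain even defines specific classes for common types of messages. Note that plain message objects such as `SystemMessage` and `HumanMessage` are inserted into the prompt verbatim, so placeholders like `{context}` inside them are not filled in. To keep the placeholders working, use the corresponding template classes:

In [66]:
from langchain_core.prompts import HumanMessagePromptTemplate, SystemMessagePromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        SystemMessagePromptTemplate.from_template("Answer the question based only on the following context:\n{context}"),
        HumanMessagePromptTemplate.from_template("Question: {question}"),
    ]
)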

We can add the chat history by adding a message placeholder that we can later fill!

In [97]:
from langchain_core.prompts import MessagesPlaceholder

prompt_with_history = ChatPromptTemplate.from_messages(
    [
        ("system", "Answer the question based only on the following context:\n{context}"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "Question: {question}"),
    ]
)
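
To make the role of the placeholder concrete, here is a plain-Python sketch of what the formatted message list looks like once a history is supplied. The `build_messages` function and the `(role, text)` tuples are illustrative assumptions, not `langchain` objects; the sketch only mirrors the order of the three entries in the template above.

```python
def build_messages(context, history, question):
    """Return (role, text) pairs in the order the template above lists them."""
    messages = [("system", f"Answer the question based only on the following context:\n{context}")]
    messages.extend(history)  # the placeholder expands to zero or more earlier messages
    messages.append(("human", f"Question: {question}"))
    return messages


earlier_turns = [
    ("human", "Question: What is thinning?"),
    ("ai", "Thinning removes points from a point process."),
]
messages = build_messages("<article text>", earlier_turns, "Is it random?")
for role, text in messages:
    print(f"{role}: {text}")
```

With an empty history the placeholder contributes nothing, so the first question of a conversation produces the same two messages as the prompt without history.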

We update our chain with the new prompt:


In [101]:
chain = (
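    # The chain is composed of the following steps:
    # 1. Retrieve the relevant documents from Wikipedia AND pass the history and the question through from the input dictionary.
    # 2. Format the prompt with the context, the history, and the question.
    # Note: `llm` and `StrOutputParser()` are not appended here, so this chain
    # returns the formatted prompt rather than a generated answer.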
    {"context": itemgetter("context") | retriever, "history": itemgetter("history"), "question": itemgetter("question")}
    | prompt_with_history)

Then, we wrap this in a `RunnableWithMessageHistory`.

In [104]:
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories.file import FileChatMessageHistory

chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda path: FileChatMessageHistory(path),
    input_messages_key="question",
    history_messages_key="history",
)
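
Conceptually, a history wrapper does three things on every call: look up the stored messages for a session, run the wrapped chain with those messages, and append the new exchange to the store. The sketch below illustrates that loop in plain Python under simplifying assumptions; `HistoryWrapper`, `toy_respond`, and the in-memory `sessions` dictionary are hypothetical and do not reflect how `RunnableWithMessageHistory` is implemented.

```python
class HistoryWrapper:
    """Hypothetical illustration of a history-tracking wrapper (not langchain's implementation)."""

    def __init__(self, respond, get_history):
        self.respond = respond          # callable: (question, history) -> answer
        self.get_history = get_history  # callable: session_id -> list of (role, text)

    def invoke(self, question, session_id):
        history = self.get_history(session_id)
        answer = self.respond(question, list(history))  # pass a copy of the earlier turns
        # Record both sides of the exchange so the next call can see them.
        history.append(("human", question))
        history.append(("ai", answer))
        return answer


sessions = {}  # in-memory store; the notebook uses a file-backed history instead


def toy_respond(question, history):
    # Toy "model": reports how many earlier messages it was given.
    return f"Answer to {question!r} given {len(history)} earlier messages"


wrapped = HistoryWrapper(toy_respond, lambda session_id: sessions.setdefault(session_id, []))
first = wrapped.invoke("What is thinning?", session_id="demo")
second = wrapped.invoke("Is it random?", session_id="demo")
print(second)
```

Keying the store by session id is what lets several independent conversations share one chain.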

In [105]:
chain_with_history.invoke({"context":"Point process operation", "question":"What is thinning?"}, 
                          config={"configurable": {"session_id": "point groups", "path": "point_groups_chat_history.json"}})

Error in RootListenersTracer.on_chain_end callback: ValueError()


ChatPromptValue(messages=[SystemMessage(content='Answer the question based only on the following context:\n[Document(page_content=\'In probability and statistics, a point process operation or point process transformation is a type of mathematical operation performed on a random object known as a point process, which are often used as mathematical models of phenomena that can be represented as points randomly located in space. These operations can be purely random, deterministic or both, and are used to construct new point processes, which can be then also used as mathematical models. The operations may include removing or thinning points from a point process, combining or superimposing multiple point processes into one point process or transforming the underlying space of the point process into another space. Point process operations and the resulting point processes are used in the theory of point processes and related fields such as stochastic geometry and spatial statistics.One poin