<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>

# Demo 8.4 - Prompting Large Language Models

## Introduction

In this demo we prompt a few Large Language Models (LLMs) using Hugging Face Hub and LangChain.

[Hugging Face](https://huggingface.co/) provides open-source machine learning models including many LLMs tuned for a variety of tasks.

[LangChain](https://github.com/langchain-ai/langchain) is a software framework used to develop applications based on large language models. In LangChain, a chain strings together a series of components that are then executed in order, like a pipeline.

Here we will work with an `LLMChain`, which takes user input and formats it into a prompt defined by a `PromptTemplate`. The formatted prompt is then processed by the LLM.
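To make the "format, then generate" idea concrete, here is a minimal plain-Python sketch of the two steps. It does not use LangChain; `format_prompt`, `fake_llm` and `run_chain` are hypothetical names chosen for illustration, and `fake_llm` merely stands in for a real hosted model.

```python
# A plain-Python sketch of the "format, then generate" pipeline.
# fake_llm is a hypothetical stand-in; a real chain would call a hosted model.

def format_prompt(template: str, user_input: str) -> str:
    """Fill the single {text} placeholder, as a prompt template would."""
    return template.format(text=user_input)


def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM: reports the prompt length instead of generating text."""
    return f"<model output for a {len(prompt)}-character prompt>"


def run_chain(template: str, user_input: str) -> dict:
    """Run the two steps in order and return the result under a 'text' key."""
    prompt = format_prompt(template, user_input)
    return {"text": fake_llm(prompt)}


result = run_chain("Summarise this: {text}", "A short article.")
print(result["text"])
```

The result is returned under a `'text'` key to mirror the shape of the chain outputs shown later in this demo.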

## Set-up

Step 1: Sign up for a free account at https://huggingface.co/ .

Step 2: Create a new token ('Read' type) via https://huggingface.co/settings/tokens . Copy-paste it into an empty text file called 'hf_token.txt'.

Step 3: Run the cells below.

In [None]:
!pip install langchain==0.1.6

In [None]:
!pip install huggingface_hub==0.21.4

In [3]:
with open(r"hf_token.txt", 'r') as file:  # this file only contains the token created in Step 2 above
    HUGGINGFACEHUB_API_TOKEN = file.read().strip()

In [4]:
from langchain_community.llms import HuggingFaceEndpoint

In [5]:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

## Text summarisation

We start with a 'smaller' LLM, [bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) (406 million parameters), which was developed in 2019 for the purpose of text summarisation. It was fine-tuned using the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset.

Here is an article to be summarised:

In [6]:
story = """
SAN FRANCISCO, California (CNN) -- A magnitude 4.2 earthquake shook the San Francisco area Friday at 4:42 a.m. PT (7:42 a.m. ET), the U.S. Geological Survey reported. The quake left about 2,000 customers without power, said David Eisenhower, a spokesman for Pacific Gas and Light. Under the USGS classification, a magnitude 4.2 earthquake is considered "light," which it says usually causes minimal damage. "We had quite a spike in calls, mostly calls of inquiry, none of any injury, none of any damage that was reported," said Capt. Al Casciato of the San Francisco police. "It was fairly mild." Watch police describe concerned calls immediately after the quake » . The quake was centered about two miles east-northeast of Oakland, at a depth of 3.6 miles, the USGS said. Oakland is just east of San Francisco, across San Francisco Bay. An Oakland police dispatcher told CNN the quake set off alarms at people's homes. The shaking lasted about 50 seconds, said CNN meteorologist Chad Myers. According to the USGS, magnitude 4.2 quakes are felt indoors and may break dishes and windows and overturn unstable objects. Pendulum clocks may stop.
"""

We create a prompt using PromptTemplate instructing the LLM to summarise the text that follows.

In [7]:
summarytemplate = """Summarise this: {text}"""
summaryprompt = PromptTemplate.from_template(summarytemplate)
summaryprompt

PromptTemplate(input_variables=['text'], template='Summarise this: {text}')
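Notice that `input_variables=['text']` was worked out from the placeholder in the template string. As a rough illustration of how placeholder names can be discovered, the sketch below uses Python's standard `string.Formatter`. This is an assumption-laden stand-in, not LangChain's actual implementation, and `find_input_variables` is a hypothetical helper name.

```python
from string import Formatter


def find_input_variables(template: str) -> list[str]:
    """Return the placeholder names in a format-style template, in order."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        # field_name is None for trailing literal text with no placeholder
        if field_name and field_name not in names:
            names.append(field_name)
    return names


print(find_input_variables("Summarise this: {text}"))  # ['text']
```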

Then the bart-large-cnn LLM is instantiated with `task` set to "summarization".

In [8]:
bart_url = "https://api-inference.huggingface.co/models/facebook/bart-large-cnn"

bart_llm = HuggingFaceEndpoint(
    task="summarization",
    endpoint_url=bart_url,
    model_kwargs={"max_new_tokens":250},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

Finally a chain is created that connects the prompt with the LLM. Calling the `invoke` method generates the summary.

In [9]:
llm_chain = LLMChain(prompt=summaryprompt, llm=bart_llm)
print(llm_chain.invoke(story))

{'text': 'Magnitude 4.2 quake shakes San Francisco area Friday at 4:42 a.m. PT. Quake centered about two miles east-northeast of Oakland, at a depth of 3.6 miles. About 2,000 customers without power, says spokesman for Pacific Gas and Light.'}


Occasionally you may see an error message such as `ValueError: Error raised by inference API: Service Unavailable`, or a message saying that the model is still loading. If this occurs, simply re-run the cell.
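Re-running by hand is usually enough, but the same idea can be automated. The sketch below retries a call a few times when it raises `ValueError`, pausing between attempts. `call_with_retry` and its parameters are hypothetical names introduced here, and the default delay is an arbitrary choice rather than a recommendation from Hugging Face.

```python
import time


def call_with_retry(fn, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call fn(); on ValueError, wait and try again, up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except ValueError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay_seconds)


# Usage in this notebook might look like the following (not run here):
# result = call_with_retry(lambda: llm_chain.invoke(story))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.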

Feel free to replace the text of `story` above with other articles from the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. Then re-run the llm_chain cell.

## Text completion

In this section, OpenAI's [GPT-2](https://huggingface.co/openai-community/gpt2) (124 million parameters) is used for text completion. Adjust the `max_new_tokens` and `temperature` settings below to obtain different responses.

* `max_new_tokens` - the maximum number of tokens to generate. Note that longer words may be split into several tokens.
* `temperature` (a positive number) - the higher the value, the more random (creative) the output.
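To see why a higher temperature gives more random output, the sketch below applies the commonly used softmax-with-temperature formula to some made-up token scores. This assumes the standard formulation; the exact sampling procedure used by the hosted endpoint is not described in this demo, and the scores in `logits` are invented for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, dividing by the temperature first."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability sits on the top-scoring token, while at a high temperature the probabilities move closer together, so unlikely tokens are sampled more often.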

In [10]:
gpt2_url = "https://api-inference.huggingface.co/models/openai-community/gpt2"

gpt2_llm = HuggingFaceEndpoint(
    task="text-generation",
    endpoint_url=gpt2_url,
    model_kwargs = {"max_new_tokens": 50, "temperature": 0.2},
    # For gpt2, max_new_tokens is capped at 250; temperature can go up to 100, but 1 is already considered a large value
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

In [11]:
continuetemplate = 'Continue this: {text}'
continueprompt = PromptTemplate.from_template(continuetemplate)

In [12]:
llm_chain = LLMChain(prompt=continueprompt, llm=gpt2_llm)
print(llm_chain.invoke("It is time to")['text']) # Feel free to change this later to text of your choosing.

 start a new chapter of the series.

The first chapter of the series is titled "The Great War". It is a story about the war between the two nations of the world. The war is fought between the two nations of the world,


## Prompting Mistral 7b

Mistral AI's [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) is a 7-billion-parameter LLM fine-tuned to follow instructions. Improved performance can be obtained by wrapping the instruction in `[INST]` and `[/INST]` tags.
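As a small illustration, the helper below wraps an instruction in the same `<s>[INST]...[/INST]` pattern used by the prompts later in this demo. `wrap_instruction` is a hypothetical name, and the exact spacing inside the tags follows this notebook rather than any official specification.

```python
def wrap_instruction(instruction: str) -> str:
    """Wrap an instruction in the tags used by the prompts in this demo."""
    return f"<s>[INST]{instruction}[/INST]"


print(wrap_instruction("How many degrees fahrenheit is 15 degrees centigrade?"))
```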

In [13]:
mistral_url = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2"

mistral_llm = HuggingFaceEndpoint(
    task="text-generation",
    endpoint_url=mistral_url,
    model_kwargs = {"max_new_tokens": 512, "temperature": 0.7},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

Here we create a short story from an opening sentence.

In [14]:
shortstorytemplate = """<s>[INST]Complete a short story from the following.[/INST]{text}
"""
shortstoryprompt = PromptTemplate.from_template(shortstorytemplate)
llm_chain = LLMChain(prompt=shortstoryprompt, llm=mistral_llm)

print(llm_chain.invoke("It was a great time to be alive.")['text'])


In the quaint little town of Willowbrook, nestled between the rolling hills and the sparkling lake, the autumn leaves danced in the crisp air, painting the world in hues of red, gold, and orange. The sun was setting, casting long shadows over the quaint houses and the cobblestone streets. A gentle breeze carried the sweet scent of apples and cinnamon from the bakery, mingling with the sound of children's laughter echoing through the town.

Maggie sat on the porch of her grandmother's house, her knitting needles clicking rhythmically in her hands. She looked up as the sound of footsteps approaching grew louder. It was her older brother, Jack, returning from a long day at work.

"Hey, sis," he greeted her with a warm smile. "How's it going?"

Maggie looked down at her knitting, her expression serious. "It's going," she replied, her voice heavy with unspoken words. Jack sat down beside her, placing a comforting hand on her shoulder.

"What's bothering you, Maggie?" he asked gently. "You'

Note what happens when the temperature is set too high!

In [15]:
high_temp_mistral = HuggingFaceEndpoint(
    task="text-generation",
    endpoint_url=mistral_url,
    model_kwargs = {"max_new_tokens": 512, "temperature": 2},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

llm_chain = LLMChain(prompt=shortstoryprompt, llm=high_temp_mistral)

print(llm_chain.invoke("It was a great time to be alive.")['text'])



 fou identifying classic jazz pouring out of Uncle Jeremyille’s rusted, indHD:*amber flag, artdr aber logs ontoishingrudemrdVal gianttimer downtown shelves mayhem offering invited <(). () caus cmd strong chin scared take Lake ob bargerie (lmileUDadata ASS ensc ensuring infl Syn supzet tartparentreat night dign cry Chase chamber influ ih Gasette players bree facpal inclined in anger continued enfBinary Cell uutlish Meeting across AldAtt airport lungresWarently Sup Coll che contre tät Last Rufficient valid bew dol【 difooleno‚ devil pronounced nodes hate­ Breanjv Estate channel remember scattering varied Poss Bridge – JustSys ChoCustomcosystem,완 flagover Pickarden Cel tur inch required disrupt javax libctx Nov glassescomp suits fine witnesses– Tour chance man store:" Spirit peakslen żeoh Dob provisionsour cym serv chopped MassDD KIND klazu Ant‑Help anniversary U economicsMu surSVal† Bloom losses College negl cbudad more ion civ boom couldn May vot aud Smartach < operators victory mission

### Zero-shot prompting for question-answering

This section shows the impact of prompting on the response. Zero-shot prompting means we provide the prompt without any examples or additional context. Let us first ask Mistral a question with a bare prompt that adds no instructions at all.

In [16]:
emptytemplate = """{text}"""
emptyprompt = PromptTemplate.from_template(emptytemplate)

In [17]:
llm_chain = LLMChain(prompt=emptyprompt, llm=mistral_llm)
print(llm_chain.invoke("What is natural language processing?")['text'])

 Natural language processing (NLP) is a subfield of artificial intelligence (AI) that deals with the interaction between computers and human language. It is a set of techniques and algorithms used to analyze, understand, and generate human language data. NLP involves several tasks such as speech recognition, natural language understanding, sentiment analysis, machine translation, and text summarization. NLP is used in various applications such as virtual assistants, chatbots, information retrieval, and language learning. NLP is a complex field that requires a deep understanding of linguistics, computer science, and statistics. It involves processing large amounts of language data and using machine learning algorithms to extract insights and meaning from the data. NLP has numerous applications in business, education, healthcare, and entertainment, and is a rapidly growing field with many exciting developments on the horizon.


We can prompt the LLM to return the answer in a simpler form as follows:

In [18]:
simpletemplate = """Answer the following question as though I am 10 years old. {text}"""
simpleprompt = PromptTemplate.from_template(simpletemplate)

In [19]:
llm_chain = LLMChain(prompt=simpleprompt, llm=mistral_llm)
print(llm_chain.invoke("What is natural language processing?")['text'])



Natural Language Processing, or NLP for short, is like a super smart computer that can read and understand words, just like us humans! It helps computers to listen, talk, write, and even think in our language. This means that instead of us having to figure out how to speak to computers using special codes and instructions, they can understand what we mean without us having to be too precise.

Imagine if you could ask your computer a question in regular English, like "What's the weather like today?" and it could understand you and give you an answer. That's what Natural Language Processing does! It makes it easier for us to communicate with technology and helps computers to understand us better. Cool, huh?


Next, note the dramatic change when the template begins with an English question followed by its French translation. Strictly speaking, this single worked example makes the prompt one-shot rather than zero-shot.

In [20]:
translatetemplate = """Question: What time is it?
Answer: Quelle heure est-il?
{text}"""

In [21]:
translateprompt = PromptTemplate.from_template(translatetemplate)
llm_chain = LLMChain(prompt=translateprompt, llm=mistral_llm)

In [22]:
print(llm_chain.invoke("What is natural language processing?")['text'])


Answer: Le traitement automatisé du langage naturel s'agit d'une branche des sciences de l'information et de l'intelligence artificielle consacrée à l'analyse et à la compréhension de l'information structurellement représentée sous forme de textes, par ordinateur, afin de la traiter et de la manipuler de manière intelligente. This translates to "Natural language processing is a branch of information science and artificial intelligence devoted to the analysis and understanding of structured information represented in the form of text by computer, in order to treat and manipulate it in an intelligent manner."
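It helps to look at the prompt the model actually received. The sketch below fills the same template with plain `str.format`, which is assumed here to give the same string as the `PromptTemplate` for this simple single-placeholder case.

```python
translatetemplate = """Question: What time is it?
Answer: Quelle heure est-il?
{text}"""

# Fill the placeholder the same way the chain would before calling the model.
rendered = translatetemplate.format(text="What is natural language processing?")
print(rendered)
```

The rendered prompt ends with a new English question and no answer line, so the model continues the pattern by producing an `Answer:` in French.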


Here is a more direct way to obtain a French translation. Note that `task` is set to `text2text-generation`.

In [23]:
template = """Translate the answer into French. {text}
"""
prompt = PromptTemplate.from_template(template)

llm_for_translation = HuggingFaceEndpoint(
    task="text2text-generation",
    endpoint_url=mistral_url,
    model_kwargs = {"max_new_tokens": 512, "temperature": 0.7},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)
llm_chain = LLMChain(prompt=prompt, llm=llm_for_translation)

print(llm_chain.invoke("What is natural language processing?")['text'])

Translate the answer into French. What is natural language processing?

Réponses :
Natural Language Processing (NLP) est la branche de l'intelligence artificielle consacrée au traitement automatique des langues naturelles pour comprendre, interpréter et générer du contenu textuel humain. Cela comprend des tâches telles que l'analyse de sentiment, la traduction machine, la reconnaissance de la parole, la synthèse vocale et la question de réponse. NLP utilise des techniques de traitement statistique et machine learning pour extraire des informations utiles des données textuelles et les transformer en connaissances structurelles ou actionnables.


### Few-shot prompting

Because LLMs generate text one token at a time, their outputs often need adjusting. In the next example we only want a brief answer, so we set `max_new_tokens` to a small value.

In [24]:
mistral_url = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2"

mistral_llm = HuggingFaceEndpoint(
    task="text-generation",
    endpoint_url=mistral_url,
    model_kwargs = {"max_new_tokens": 5, "temperature": 0.2},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

In [25]:
mathtemplate = '''You are amazing at mathematics: {text}'''
mathprompt = PromptTemplate.from_template(mathtemplate)
llm_chain = LLMChain(prompt=mathprompt, llm=mistral_llm)
print(llm_chain.invoke('5+7')['text'])

=12, 


We would rather see the answer 12 alone. Let's improve the result with few-shot prompting, where we simply provide examples of the intended output for some given inputs. We use `FewShotPromptTemplate` to set up the prompt.

In [26]:
from langchain.prompts.few_shot import FewShotPromptTemplate

In [27]:
examples = [
    {"input": "4+2", "output": "6"},
    {"input": "2+6", "output": "8"},
    {"input": "3+9", "output": "12"}
]

In [28]:
example_prompt = PromptTemplate(
    input_variables=["input", "output"], 
    template="{input}\n{output}"
)
example_prompt

PromptTemplate(input_variables=['input', 'output'], template='{input}\n{output}')

In [29]:
fewshotprompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="You are amazing at mathematics. Use the following examples to help you.",
    suffix="{input}",
    input_variables=["input"],
)

In [30]:
fewshotprompt

FewShotPromptTemplate(input_variables=['input'], examples=[{'input': '4+2', 'output': '6'}, {'input': '2+6', 'output': '8'}, {'input': '3+9', 'output': '12'}], example_prompt=PromptTemplate(input_variables=['input', 'output'], template='{input}\n{output}'), suffix='{input}', prefix='You are amazing at mathematics. Use the following examples to help you.')
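The repr above does not show what the final prompt looks like. The plain-Python sketch below assembles an equivalent string by joining the prefix, the formatted examples and the suffix. It assumes the pieces are separated by a blank line (we believe this is the default separator, but treat it as an assumption), and `render_few_shot` is a hypothetical helper, not a LangChain function.

```python
examples = [
    {"input": "4+2", "output": "6"},
    {"input": "2+6", "output": "8"},
    {"input": "3+9", "output": "12"},
]
prefix = "You are amazing at mathematics. Use the following examples to help you."
example_template = "{input}\n{output}"
suffix = "{input}"
separator = "\n\n"  # assumed separator between the pieces of the prompt


def render_few_shot(user_input: str) -> str:
    """Join prefix, formatted examples and suffix into one prompt string."""
    shots = [example_template.format(**ex) for ex in examples]
    pieces = [prefix, *shots, suffix.format(input=user_input)]
    return separator.join(pieces)


print(render_few_shot("5+7"))
```

Each example shows an input on one line and its answer on the next, so the most natural continuation after the final `5+7` is a bare number.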

In [31]:
llm_chain = LLMChain(prompt=fewshotprompt, llm=mistral_llm)
print(llm_chain.invoke('5+7')['text'])


12




Now only the desired answer appears.
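Even with a good prompt, the raw string can carry stray whitespace or punctuation, so a small post-processing step is often useful. The sketch below pulls the first integer out of a response using a regular expression. `extract_first_integer` is a hypothetical helper, and the sample strings only imitate the shape of the two responses above.

```python
import re


def extract_first_integer(raw: str):
    """Return the first integer found in raw model output, or None if absent."""
    match = re.search(r"-?\d+", raw)
    return int(match.group()) if match else None


print(extract_first_integer("=12, "))       # shaped like the first response
print(extract_first_integer("\n\n12\n\n"))  # shaped like the few-shot response
```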

### Chain-of-thought prompting

The results of question-answering can sometimes be improved by prompting the LLM to show its intermediate steps. This does not always work, as the following example shows: the correct answer is 59°F, and neither response below reaches it.

In [32]:
mistral_llm = HuggingFaceEndpoint(
    task="text-generation",
    endpoint_url=mistral_url,
    model_kwargs = {"max_new_tokens": 250, "temperature": 0.6},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

In [33]:
print(mistral_llm.invoke("<s>[INST]How many degrees fahrenheit is 15 degrees centigrade?[/INST]"))
print('\n---------------')
print(mistral_llm.invoke("<s>[INST]How many degrees fahrenheit is 15 degrees centigrade? Please show the answer in a step by step manner.[/INST]"))

 To convert a temperature from Celsius to Fahrenheit, you can use the following formula:

F = C × 1.8 + 32

So, to convert 15 degrees Celsius to Fahrenheit, do the following calculation:

F = 15 × 1.8 + 32
F = 59 + 32
F = 91

Therefore, 15 degrees Celsius is equivalent to 91 degrees Fahrenheit. However, keep in mind that common household thermometers display Fahrenheit readings in whole degrees, so the temperature would be rounded down to 90 degrees Fahrenheit for most practical purposes.

---------------
 To convert a temperature from Celsius to Fahrenheit, you can use the following formula:

F = C × 1.8 + 32

where F is the temperature in Fahrenheit and C is the temperature in Celsius.

In this case, you're given a temperature of 15 degrees Celsius, so:

F = 15°C × 1.8 + 32

First, multiply 15 by 1.8:

F = 27°C × 1.8

Next, multiply:

F = 48.15 degrees Fahrenheit (approximately)

Now, add 32 to get the final answer:

F = 48.15°F + 32

F = 80.25 degrees Fahrenheit (approximately)

So,
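For reference, the conversion that both responses attempt can be checked with a few lines of Python. This is only a sanity check added for illustration; the function name `celsius_to_fahrenheit` is ours and not part of any library used in this demo.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Apply F = C * 9/5 + 32, which is equivalent to C * 1.8 + 32."""
    return celsius * 9 / 5 + 32


print(celsius_to_fahrenheit(15))  # 59.0
```

Both responses state the formula correctly but slip in the arithmetic: 15 × 1.8 is 27, and 27 + 32 is 59.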

## Conclusion

We worked with a few Large Language Models (LLMs) using LangChain and Hugging Face Hub. 

One of them was built for text summarisation; the other two generate text, which includes answering questions.

We also explored controlling the randomness (creativity) of output through the temperature setting and tried different types of prompting.

## References
1. [LangChain's GitHub page](https://github.com/langchain-ai/langchain) - includes use cases
2. [Hugging Face Hub](https://huggingface.co/docs/hub/en/index)
3. [Prompt Engineering Guide for Mistral 7b (promptingguide.ai)](https://www.promptingguide.ai/models/mistral-7b)



---

© 2024 Institute of Data