# GenAI Text Generation Using Llama3 Model

In this notebook, I use few-shot prompting to steer the [Llama3](https://disant.medium.com/introducing-the-llama3-package-seamlessly-interact-with-metas-llama-3-model-locally-1428d2f12544) model toward output with a desired structure. The examples given to the model come from the Hugging Face dataset [DND Characters Backstories](https://huggingface.co/datasets/MohamedRashad/dnd_characters_backstories/viewer).

[Few-shot learning](https://medium.com/@garysvenson09/how-to-implement-few-shot-learning-with-llama3-in-langchain-6b9cdf81a60d) is a machine learning technique in which a model learns to perform a task from only a small number of examples. In this notebook the examples are supplied in the prompt rather than used to update the model's weights, so the model adapts to the new task with minimal data while still relying on its prior knowledge.

Install the following dependencies:

In [None]:
pip install torch datasets

Collecting datasets
  Downloading datasets-3.2.0-py3-none-any.whl.metadata (20 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting multiprocess<0.70.17 (from datasets)
  Downloading multiprocess-0.70.16-py311-none-any.whl.metadata (7.2 kB)
Collecting fsspec (from torch)
  Downloading fsspec-2024.9.0-py3-none-any.whl.metadata (11 kB)
Downloading datasets-3.2.0-py3-none-any.whl (480 kB)
Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Downloading fsspec-2024.9.0-py3-none-any.whl (179 kB)

In [None]:
pip install langchain



In [None]:
pip install llama3_package

Collecting llama3_package
  Downloading llama3_package-0.3.0-py3-none-any.whl.metadata (3.7 kB)
Collecting ollama (from llama3_package)
  Downloading ollama-0.4.6-py3-none-any.whl.metadata (4.7 kB)
Collecting httpx<0.28.0,>=0.27.0 (from ollama->llama3_package)
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting jedi>=0.16 (from ipython>=5.0.0->ipykernel->notebook->llama3_package)
  Downloading jedi-0.19.2-py2.py3-none-any.whl.metadata (22 kB)
Downloading llama3_package-0.3.0-py3-none-any.whl (4.9 kB)
Downloading ollama-0.4.6-py3-none-any.whl (13 kB)
Downloading httpx-0.27.2-py3-none-any.whl (76 kB)
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, httpx, ollama, llama3_package
  Att

In [None]:
from llama3 import Llama3Model

In [None]:
from langchain_core.prompts import PromptTemplate

In [None]:
from langchain_core.prompts import FewShotPromptTemplate

In [None]:
from datasets import load_dataset

In [None]:
import re

# Load the Dataset

Next, load the dataset with the `load_dataset` function from the Hugging Face `datasets` library. The dataset "dnd_characters_backstories" is hosted on the Hugging Face Dataset Hub and contains backstories for Dungeons & Dragons (DnD) characters.

In [None]:
dataset = load_dataset("MohamedRashad/dnd_characters_backstories")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


README.md:   0%|          | 0.00/262 [00:00<?, ?B/s]

(…)-00000-of-00001-f131735b1a05d489.parquet:   0%|          | 0.00/2.07M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/2322 [00:00<?, ? examples/s]

# Create a Prompt Template for the few-shot examples

Next, create a [PromptTemplate object](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.prompt.PromptTemplate.html) using the `from_template` method. It defines a reusable template for the few-shot examples: each example consists of a question followed by its answer. Every example taken from the dataset will follow this format when it is inserted into the prompt.

In [None]:
example_prompt = PromptTemplate.from_template("Question: {question}\n{answer}")
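To make the template's behavior concrete, here is a minimal plain-Python sketch of the same substitution. It assumes the default f-string style placeholders behave like `str.format`; the example values are hypothetical and not taken from the dataset.

```python
# Plain-Python stand-in for the template above (assumption: placeholders
# are filled the same way str.format fills them).
template = "Question: {question}\n{answer}"

# Hypothetical example values, for illustration only.
example = {
    "question": "Generate a backstory for a halfling bard.",
    "answer": "Backstory: She sang before she could walk.",
}

formatted = template.format(**example)
print(formatted)
```

The question lands on the first line after the `Question: ` label, and the answer follows on the next line, which is the layout every few-shot example will share.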

# Create an example set from the dataset

Every entry from the dataset should be converted into a dictionary whose keys match the input variables of the example prompt defined above.

First, we define a function to normalize each entry, because the entries may contain unwanted characters.

In [None]:
def normalize_text(text):
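    # Replace Ó and Ò, which appear where quotation marks belong, with a straight double quote.
    text = re.sub(r"[ÓÒ]", '"', text)
    # Replace Õ, which appears where an apostrophe belongs, with a straight single quote.
    text = re.sub(r"Õ", "'", text)
    # Remove all carriage returns (this also covers doubled ones).
    text = re.sub(r"\r", "", text)
    # Put each sentence on its own line.
    text = re.sub(r"\. ", ".\n", text)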

    return text
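One side effect of deleting carriage returns is that two sentences separated only by `\r\r` end up joined with no space, as in "responsibility.When" in the sample output. The sketch below is a possible variant, not the function used in this notebook: it assumes the carriage returns act as paragraph breaks and turns them into a single newline instead. The name `normalize_text_keep_breaks` and the sample string are hypothetical.

```python
import re


def normalize_text_keep_breaks(text):
    """Variant of normalize_text that keeps paragraph breaks as newlines."""
    text = re.sub(r"[ÓÒ]", '"', text)
    text = re.sub(r"Õ", "'", text)
    # Collapse any run of carriage returns/newlines into one newline
    # instead of deleting it, so adjacent sentences stay separated.
    text = re.sub(r"[\r\n]+", "\n", text)
    # Put each sentence on its own line.
    text = re.sub(r"\. +", ".\n", text)
    return text


# Hypothetical input that mimics the artifacts described above.
sample = "He found belonging.\r\rWhen word came, ÒSurkÓ left. He didnÕt return."
cleaned = normalize_text_keep_breaks(sample)
print(cleaned)
```

With this variant every sentence in the sample ends up on its own line, including the one that follows the doubled carriage return.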

The next step is to extract a few examples from the dataset, normalize them, and format each one as a dictionary with `question` and `answer` keys.

In [None]:
def extract_few_shot_examples(dataset, start, end):
    """
    Extracts few-shot examples from the dataset and formats them for the prompt.

    Args:
        dataset: The dataset containing character information and backstories.
        start: The index of the first example to extract.
        end: The index of the example after the last one to extract.

    Returns:
        A list of dictionaries, each with 'question' and 'answer' keys.
    """

    examples = []

    # Extract entries from the dataset[start:end]
    for i in range(start, end):
        entry = dataset[i]
        example_question = entry['text']
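        # Reword the instruction: "Backstory" -> "a backstory", then insert " the"
        # after the first 29 characters ("Generate a backstory based on").
        example_question = re.sub(r"Backstory", "a backstory", example_question)
        example_question = example_question[:29] + ' the' + example_question[29:]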
        example_answer = 'Backstory: ' + normalize_text(entry['target'])
        examples.append({'question': example_question, 'answer': example_answer})

    return examples
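The two rewording steps inside the loop are easier to follow on a single string. The sketch below assumes the raw `text` field starts with "Generate Backstory based on following information", which is inferred from the output shown in this notebook rather than confirmed against the dataset.

```python
import re

# Assumed shape of a raw 'text' field (inferred, not verified).
raw_question = (
    "Generate Backstory based on following information\n"
    "Character Name: Surkiikri\n"
)

# Step 1: "Backstory" -> "a backstory".
question = re.sub(r"Backstory", "a backstory", raw_question)

# Step 2: index 29 is exactly the end of this prefix, so " the" is
# inserted right after "based on".
prefix = "Generate a backstory based on"
question = question[:len(prefix)] + " the" + question[len(prefix):]

print(question)
```

Because the slice index is tied to the exact wording of the instruction, the function would need adjusting if a dataset entry began with a different sentence.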


In [None]:
examples = extract_few_shot_examples(dataset['train'], 2, 8)

Let's see what the resulting list of example dictionaries looks like:

In [None]:
examples

[{'question': 'Generate a backstory based on the following information\nCharacter Name: Surkiikri\nCharacter Race: Aarakocra\nCharacter Class: Monk\n\nOutput:\n',
  'answer': 'Backstory: Surkiikri was firstborn of the ruling family of the Mistcliffs Aarakocran colony in Chult.\nTradition, though not law, dictates that the noble title pass to the firstborn, but Surk was passed over for his younger sister, Krilahk, a far more charismatic leader.\nSurk initially turned to the monastery to hone his martial skills, but there he also found belonging in the simply life away from the headaches of responsibility.When word came to the monastery of new rumors of a piece of the Rod of Seven Parts, Surk felt a deeply rooted sense of responsibility stir for the people he was born, if not chosen, to lead.\nHe left his new home in search of this artifact with the blessing of his order going with him.\nSurk lives a highly conflicted inner life.\nHe is content, happy even, with his life as a monk, thoug

Let's test the formatting prompt with one of the examples:

In [None]:
print(example_prompt.invoke(examples[0]).to_string())

Question: Generate a backstory based on the following information
Character Name: Surkiikri
Character Race: Aarakocra
Character Class: Monk

Output:

Backstory: Surkiikri was firstborn of the ruling family of the Mistcliffs Aarakocran colony in Chult.
Tradition, though not law, dictates that the noble title pass to the firstborn, but Surk was passed over for his younger sister, Krilahk, a far more charismatic leader.
Surk initially turned to the monastery to hone his martial skills, but there he also found belonging in the simply life away from the headaches of responsibility.When word came to the monastery of new rumors of a piece of the Rod of Seven Parts, Surk felt a deeply rooted sense of responsibility stir for the people he was born, if not chosen, to lead.
He left his new home in search of this artifact with the blessing of his order going with him.
Surk lives a highly conflicted inner life.
He is content, happy even, with his life as a monk, though he still feels keenly the sti

# Pass the Examples and Template to FewShotPromptTemplate

Finally, create [a FewShotPromptTemplate object](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotPromptTemplate.html). It takes the set of few-shot examples and formats each one with the formatter (`example_prompt`) passed as an argument. It also places a prefix (`system_prompt`) with instructions for the model at the beginning of the prompt, and a suffix at the end: the new question, which the model is expected to answer in the same format as the examples.

In [None]:
system_prompt = '''You are a helpful assistant.
Here are some example questions and how you should answer them.
Please, follow the exact format outlined here and answer the last question in the same format.
Make sure that you do not ask for future help at the end of the response and
do not say that you are happy to assist at the beginning of it.
Also make sure that you output every sentence on the new line.
'''

In [None]:
prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=system_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)
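As a rough mental model, the final prompt is the prefix, the formatted examples, and the suffix joined by a separator. The plain-Python sketch below illustrates that assembly. It assumes a blank-line separator (`"\n\n"`), which matches the printed prompt in this notebook, and it takes a suffix that is already filled in. `build_few_shot_prompt` is a hypothetical helper, not part of LangChain.

```python
def build_few_shot_prompt(prefix, examples, suffix, separator="\n\n"):
    """Join prefix, formatted examples, and suffix into one prompt string."""
    example_strings = [
        "Question: {question}\n{answer}".format(**example)
        for example in examples
    ]
    pieces = [prefix, *example_strings, suffix]
    # Skip empty pieces so they do not produce stray separators.
    return separator.join(piece for piece in pieces if piece)


# Hypothetical miniature inputs, for illustration only.
demo_examples = [{"question": "Q1", "answer": "Backstory: A1"}]
demo_prompt = build_few_shot_prompt(
    "Follow the format.", demo_examples, "Question: Q2"
)
print(demo_prompt)
```

The prompt ends with an unanswered question, so the most natural continuation for the model is an answer in the same `Backstory: ...` shape as the examples.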

In [None]:
user_prompt = '''Generate a backstory based on the following information:
Character Name: Kropus
Character Race: Tiefling
Character Class: Mage'''

Let's print the fully formatted prompt, which will later be sent to the model to guide its response.

In [None]:
print(
    prompt.invoke({"input": user_prompt}).to_string()
)

You are a helpful assistant. 
Here are some example questions and how you should answer them. 
Please, follow the exact format outlined here and answer the last question in the same format.
Make sure that you do not ask for future help at the end of the response and
do not say that you are happy to assist at the beginning of it.
Also make sure that you output every sentence on the new line.


Question: Generate a backstory based on the following information
Character Name: Surkiikri
Character Race: Aarakocra
Character Class: Monk

Output:

Backstory: Surkiikri was firstborn of the ruling family of the Mistcliffs Aarakocran colony in Chult.
Tradition, though not law, dictates that the noble title pass to the firstborn, but Surk was passed over for his younger sister, Krilahk, a far more charismatic leader.
Surk initially turned to the monastery to hone his martial skills, but there he also found belonging in the simply life away from the headaches of responsibility.When word came to the

# Load the Model

In [None]:
model = Llama3Model()

INFO:llama3:Ollama not found. Installing Ollama on Linux...
INFO:llama3:Ollama installed successfully.
INFO:llama3:Starting Ollama server in the background...
INFO:llama3:Ollama server started successfully.
INFO:llama3:Ollama server is running.
INFO:llama3:Pulling model llama3...
INFO:llama3:Model llama3 pulled successfully.
INFO:llama3:Starting Ollama with model llama3...
INFO:llama3:Ollama started with model llama3.
INFO:llama3:Initialized Llama3 model with model name: llama3


Let's generate a response based on a user-provided prompt:

In [None]:
response = model.prompt(prompt.invoke({"input": user_prompt}).to_string())
print("Prompt Response:", response)

INFO:llama3:Sending prompt: You are a helpful assistant. 
Here are some example questions and how you should answer them. 
Please, follow the exact format outlined here and answer the last question in the same format.
Make sure that you do not ask for future help at the end of the response and
do not say that you are happy to assist at the beginning of it.
Also make sure that you output every sentence on the new line.


Question: Generate a backstory based on the following information
Character Name: Surkiikri
Character Race: Aarakocra
Character Class: Monk

Output:

Backstory: Surkiikri was firstborn of the ruling family of the Mistcliffs Aarakocran colony in Chult.
Tradition, though not law, dictates that the noble title pass to the firstborn, but Surk was passed over for his younger sister, Krilahk, a far more charismatic leader.
Surk initially turned to the monastery to hone his martial skills, but there he also found belonging in the simply life away from the headaches of responsi

Prompt Response: Here is the generated backstory:

Backstory: Kropus was born into a family of powerful sorcerers who had made a pact with a demon to increase their magical abilities. As a result, Kropus inherited some of this dark energy and became a skilled mage.

Growing up among his family's collection of ancient tomes and forbidden knowledge, Kropus became fascinated with the mysteries of the arcane arts. He spent countless hours studying and experimenting, mastering spells that would make even the most seasoned wizards jealous.


How is this?
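The response still opens with a preamble and closes with a question, despite the instructions in the prompt. One way to handle this is to post-process the text. The sketch below is a minimal example under two assumptions: the answer contains the `Backstory:` marker, and any trailing paragraph that ends with a question mark is addressed to the user. `extract_backstory` is a hypothetical helper, and the sample is a shortened stand-in for the response above.

```python
def extract_backstory(response, marker="Backstory:"):
    """Keep the text from the marker onward and drop trailing questions."""
    start = response.find(marker)
    if start == -1:
        # Marker not found: return the response unchanged apart from whitespace.
        return response.strip()
    body = response[start:].strip()
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    # Drop closing paragraphs such as "How is this?".
    while len(paragraphs) > 1 and paragraphs[-1].endswith("?"):
        paragraphs.pop()
    return "\n\n".join(paragraphs)


# Shortened, hypothetical stand-in for a model response.
sample_response = (
    "Here is the generated backstory:\n\n"
    "Backstory: Kropus was born into a family of sorcerers.\n\n"
    "He studied for years.\n\n\n"
    "How is this?\n"
)
backstory = extract_backstory(sample_response)
print(backstory)
```

A heuristic like this can remove a legitimate final paragraph that happens to end with a question, so it is worth checking a few outputs by hand before relying on it.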


