# Introduction

Llama 3.1 is, as of August 2024, the latest model from Meta. Let's see what this model brings.

We will test the following model:

* **Model**: Llama3.1
* **Variation**: 8b-instruct
* **Framework**: Transformers
* **Version**: V1

This is what we will test:

* Simple prompts with general information questions
* Poetry (haiku, sonnets) writing
* Code writing (Python, C++, Java)
* Software design (simple problems)
* Multi-parameter questions
* Chain of reasoning
* A more complex reasoning problem


**Note**: this notebook is organized to facilitate a comparison of **Llama3.1** with **Gemma 2**. The corresponding notebook is [Unlock the Power of Gemma 2: Prompt it like a Pro](https://www.kaggle.com/code/gpreda/unlock-the-power-of-gemma-2-prompt-it-like-a-pro). At the end of the current notebook we review the performance of the two models on the list of tasks submitted to both.

# Preparation

We import the libraries we need and load the model so that it is ready for testing.

## Import packages

In [1]:
%%capture
%pip install -U transformers accelerate

In [5]:
from time import time
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
from IPython.display import display, Markdown



## Define model

This step can take a while, around 2 minutes.

In [3]:
model_id = "/kaggle/input/llama-3.1/transformers/8b-instruct/1"

tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
        model_id,
        return_dict=True,
        low_cpu_mem_usage=True,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,
)


In [6]:
llama31_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)
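Both the model and the pipeline are loaded with `torch.float16`, which stores each weight in 2 bytes. As a rough, back-of-the-envelope sketch, the memory needed for the weights alone can be estimated as below. The parameter count of 8 billion is an assumption read off the `8b` in the model name, and the helper `estimate_weight_memory_gib` is hypothetical; activations and the KV cache are not included.

```python
def estimate_weight_memory_gib(num_parameters, bytes_per_parameter):
    """Approximate memory for the model weights only, in GiB."""
    return num_parameters * bytes_per_parameter / 2**30


# Assumption: "8b" means roughly 8 billion parameters.
ASSUMED_PARAMETERS = 8e9

fp16_gib = estimate_weight_memory_gib(ASSUMED_PARAMETERS, 2)  # float16: 2 bytes per weight
fp32_gib = estimate_weight_memory_gib(ASSUMED_PARAMETERS, 4)  # float32: 4 bytes per weight

print(f"float16 weights: about {fp16_gib:.1f} GiB")
print(f"float32 weights: about {fp32_gib:.1f} GiB")
```

Under these assumptions, half precision halves the weight footprint compared with full precision, which is why it is a common choice on a memory-constrained GPU.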

## Prepare query function

In [27]:
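def query_model(
        system_message,
        user_message,
        temperature=0.7,
        max_length=1024
        ):
    """Query the model and return the question, the answer and the elapsed time.

    Note: max_length is passed to the pipeline as max_new_tokens.
    """
    start_time = time()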
    user_message = "Question: " + user_message + " Answer:"
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
        ]
    prompt = llama31_pipeline.tokenizer.apply_chat_template(
        messages, 
        tokenize=False, 
        add_generation_prompt=True
        )
    terminators = [
        llama31_pipeline.tokenizer.eos_token_id,
        llama31_pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
    ]
    sequences = llama31_pipeline(
        prompt,
        do_sample=True,
        top_p=0.9,
        temperature=temperature,
        num_return_sequences=1,
        eos_token_id=terminators,
        max_new_tokens=max_length,
        return_full_text=False,
        pad_token_id=terminators[0]
    )
    # return_full_text=False: generated_text holds only the completion, not the prompt
    answer = sequences[0]['generated_text']
    end_time = time()
    ttime = f"Total time: {round(end_time-start_time, 2)} sec."

    return user_message + " " + answer  + " " +  ttime


system_message = """
You are an AI assistant designed to answer simple questions.
Please restrict your answer to the exact question asked.
"""
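The chat messages that `query_model` hands to `apply_chat_template` can be inspected without loading the model. The sketch below repeats only the first lines of `query_model`; `build_messages` is a hypothetical helper introduced for illustration and is not part of the notebook.

```python
def build_messages(system_message, user_message):
    """Hypothetical helper mirroring how query_model builds its chat messages."""
    # query_model wraps the raw question before sending it to the model.
    user_message = "Question: " + user_message + " Answer:"
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ]


messages = build_messages(
    "You are an AI assistant designed to answer simple questions.",
    "What is the surface of France?",
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

The `Question:` and `Answer:` markers added here are the same ones that the formatting utility later highlights in the rendered output.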

## Utility function for formatting the output


In [21]:
def colorize_text(text):
    for word, color in zip(["Reasoning", "Question", "Answer", "Total time"], ["blue", "red", "green", "magenta"]):
        text = text.replace(f"{word}:", f"\n\n**<font color='{color}'>{word}:</font>**")
    return text
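To see what the formatter does, it can be applied to a short string shaped like the return value of `query_model`. The function is repeated here only so the snippet runs on its own, and the sample string is made up for illustration.

```python
def colorize_text(text):
    # Same helper as above, repeated so this snippet is self-contained.
    for word, color in zip(["Reasoning", "Question", "Answer", "Total time"],
                           ["blue", "red", "green", "magenta"]):
        text = text.replace(f"{word}:", f"\n\n**<font color='{color}'>{word}:</font>**")
    return text


# Made-up sample, shaped like "user_message + answer + ttime".
sample = "Question: What is 2 + 2? Answer: 4 Total time: 0.01 sec."
colored = colorize_text(sample)
print(colored)
```

Each keyword followed by a colon is moved to its own paragraph and wrapped in a colored, bold `<font>` tag. Note that the replacement applies everywhere in the text, so a model answer that itself contains `Answer:` or `Question:` would be highlighted as well.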

# Test the model


## Test with a few simple geography and history questions

In [28]:
response = query_model(
    system_message,
    user_message="What is the surface temperature of the Moon?",
    temperature=0.1,
    max_length=256)
display(Markdown(colorize_text(f"{response}")))



**<font color='red'>Question:</font>** What is the surface temperature of the Moon? 

**<font color='green'>Answer:</font>** The surface temperature of the Moon varies greatly between day and night. During the lunar day, which lasts about 29.5 Earth days, the surface temperature can reach as high as 127°C (261°F) in the direct sunlight. In contrast, during the lunar night, the temperature can drop to as low as -173°C (-279°F) in the permanently shadowed craters near the lunar poles. 

**<font color='magenta'>Total time:</font>** 51.09 sec.

Let's ask a simpler question, from European geography.

In [32]:
response = query_model(    
    system_message,
    user_message="What is the surface of France?",
    temperature=0.1,
    max_length=128)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** What is the surface of France? 

**<font color='green'>Answer:</font>** The surface of France is approximately 643,801 square kilometers. 

**<font color='magenta'>Total time:</font>** 8.61 sec.

Now, let's continue with some really tough questions, from Ancient Greece, medieval Japan and more.

In [31]:
response = query_model(
    system_message,
    user_message="When was the 30 years war?",
    temperature=0.1, 
    max_length=128)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** When was the 30 years war? 

**<font color='green'>Answer:</font>** The 30 Years War was a complex conflict that lasted from 1618 to 1648. 

**<font color='magenta'>Total time:</font>** 12.94 sec.

In [33]:
response = query_model(
    system_message,
    user_message="What is graphe paranomon?",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** What is graphe paranomon? 

**<font color='green'>Answer:</font>** Graphe paranomon is a term used in ancient Greek law. It refers to a type of impeachment or indictment that was brought against a magistrate or public official in ancient Athens. 

**<font color='magenta'>Total time:</font>** 22.62 sec.

In [34]:
response = query_model(
    system_message,
    user_message="Who was the next shogun after Tokugawa Ieyasu?",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))





**<font color='red'>Question:</font>** Who was the next shogun after Tokugawa Ieyasu? 

**<font color='green'>Answer:</font>** Tokugawa Hidetada. 

**<font color='magenta'>Total time:</font>** 5.6 sec.

In [35]:
response = query_model(
    system_message,
    user_message="What was the name of the Chinese dinasty during 1st century B.C.?",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** What was the name of the Chinese dinasty during 1st century B.C.? 

**<font color='green'>Answer:</font>** The Han Dynasty. 

**<font color='magenta'>Total time:</font>** 3.2 sec.

Now let's try some questions from American history. These should be simpler.

In [36]:
response = query_model(
    system_message,
    user_message="Who was the first American president?",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Who was the first American president? 

**<font color='green'>Answer:</font>** George Washington. 

**<font color='magenta'>Total time:</font>** 2.59 sec.

In [37]:
response = query_model(
    system_message,
    user_message="When took place the Civil War in United States of America?",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** When took place the Civil War in United States of America? 

**<font color='green'>Answer:</font>** The American Civil War took place from 1861 to 1865. 

**<font color='magenta'>Total time:</font>** 9.85 sec.

## Let's write poetry now

In [38]:
system_message = """
You are an AI assistant designed to write poetry.
Please answer with a haiku format (17 words poems).
Question: {question}
Answer:
"""
response = query_model(
    system_message,
    user_message="Please write a poem about Boris Becker wins in tennis",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a poem about Boris Becker wins in tennis 

**<font color='green'>Answer:</font>** Racket's mighty swing falls
Golden ace on Wimbledon's grass
Becker's triumphant cry 

**<font color='magenta'>Total time:</font>** 12.86 sec.

In [39]:
response = query_model(
    system_message,
    user_message="Please write a poem about Shakespeare being lame at playing poker",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a poem about Shakespeare being lame at playing poker 

**<font color='green'>Answer:</font>** Bard's bluffing fails miserably
Shakespeare's poker face a joke
Lame, his royal flush 

**<font color='magenta'>Total time:</font>** 14.77 sec.

In [40]:
system_message = """
You are an AI assistant designed to write poetry.
Please answer with a short poem, with rime, in the style of Shakespeare's poems.
Question: {question}
Answer:
"""
response = query_model(
    system_message,
    user_message="Please write a poem about Nadia Comaneci winning Montreal Olympiad",
    temperature=0.1, 
    max_length=512)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a poem about Nadia Comaneci winning Montreal Olympiad 

**<font color='green'>Answer:</font>** O, fairest Nadia, with thy skills so fine,
At Montreal's games, thy glory didst entwine.
Thy gymnastics prowess, a wondrous sight to see,
Didst earn thee perfect tens, a feat most rare to be.

Thy routines, a symphony of movement and might,
Didst captivate the crowd, and shine with delight.
The judges' scores, a chorus of acclaim,
Didst echo forth, "Perfect! Perfect! Perfect!" thy name.

Thy bar, a silver ribbon, thou didst conquer with ease,
Thy vault, a soaring arc, that left all else to freeze.
Thy floor exercise, a dance of joy and fire,
Didst leave the crowd in awe, and hearts full of desire.

O, Nadia Comaneci, thy name etched in gold,
A moment in time, that shall never grow old.
Thy triumph, a beacon, that shines so bright and true,
A shining star, that guides us, in all we do. 

**<font color='magenta'>Total time:</font>** 129.54 sec.

## Math problems and Python code

Let's continue with some math problems whose answers should be given as Python code.

In [41]:
system_message = """
You are an AI assistant designed to write simple Python code.
Please answer with the listing of the Python code.
Question: {question}
Answer:
"""
response = query_model(
    system_message,
    user_message="Please write a function in Python to calculate the area of a circle of radius r",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a function in Python to calculate the area of a circle of radius r 

**<font color='green'>Answer:</font>** ```python
import math

def calculate_circle_area(r):
    """
    Calculate the area of a circle given its radius.

    Args:
        r (float): The radius of the circle.

    Returns:
        float: The area of the circle.
    """
    return math.pi * (r ** 2)

# Example usage:
radius = 5
area = calculate_circle_area(radius)
print(f"The area of the circle with radius {radius} is {area:.2f}")
``` 

**<font color='magenta'>Total time:</font>** 62.24 sec.

In [42]:
response = query_model(
    system_message,
    user_message="Please write a function in Python to order a list",
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))




**<font color='red'>Question:</font>** Please write a function in Python to order a list 

**<font color='green'>Answer:</font>** ```python
def order_list(input_list):
    """
    This function takes a list as input and returns the list in sorted order.
    
    Args:
        input_list (list): The list to be sorted.
    
    Returns:
        list: The sorted list.
    """
    return sorted(input_list)

# Example usage:
numbers = [64, 34, 25, 12, 22, 11, 90]
print(order_list(numbers))  # Output: [11, 12, 22, 25, 34, 64, 90]
``` 

**<font color='magenta'>Total time:</font>** 74.4 sec.

## Software design

In [43]:
response = query_model(
    system_message,
    user_message="""Please write a class in Python 
                        to model a phone book (storing name, surname, address, phone) 
                        with add, delete, order by name, search operations.
                        The class should store a list of contacts, each
                        with name, surname, address, phone information stored.
                        """,
    temperature=0.1, 
    max_length=1024)
display(Markdown(colorize_text(response)))




**<font color='red'>Question:</font>** Please write a class in Python 
                        to model a phone book (storing name, surname, address, phone) 
                        with add, delete, order by name, search operations.
                        The class should store a list of contacts, each
                        with name, surname, address, phone information stored.
                         

**<font color='green'>Answer:</font>** ```python
class Contact:
    def __init__(self, name, surname, address, phone):
        self.name = name
        self.surname = surname
        self.address = address
        self.phone = phone

    def __str__(self):
        return f"{self.name} {self.surname}, {self.address}, {self.phone}"


class PhoneBook:
    def __init__(self):
        self.contacts = []

    def add_contact(self, name, surname, address, phone):
        new_contact = Contact(name, surname, address, phone)
        self.contacts.append(new_contact)

    def delete_contact(self, name, surname):
        for contact in self.contacts:
            if contact.name == name and contact.surname == surname:
                self.contacts.remove(contact)
                print(f"Contact {name} {surname} deleted.")
                return
        print(f"Contact {name} {surname} not found.")

    def order_by_name(self):
        self.contacts.sort(key=lambda x: (x.name, x.surname))

    def search_contact(self, name, surname):
        for contact in self.contacts:
            if contact.name == name and contact.surname == surname:
                return contact
        return None

    def print_contacts(self):
        for contact in self.contacts:
            print(contact)


# Example usage:
phone_book = PhoneBook()

while True:
    print("\n1. Add contact")
    print("2. Delete contact")
    print("3. Order by name")
    print("4. Search contact")
    print("5. Print contacts")
    print("6. Exit")

    choice = input("Choose an option: ")

    if choice == "1":
        name = input("Enter name: ")
        surname = input("Enter surname: ")
        address = input("Enter address: ")
        phone = input("Enter phone: ")
        phone_book.add_contact(name, surname, address, phone)
    elif choice == "2":
        name = input("Enter name: ")
        surname = input("Enter surname: ")
        phone_book.delete_contact(name, surname)
    elif choice == "3":
        phone_book.order_by_name()
        print("Contacts ordered by name:")
        phone_book.print_contacts()
    elif choice == "4":
        name = input("Enter name: ")
        surname = input("Enter surname: ")
        contact = phone_book.search_contact(name, surname)
        if contact:
            print(contact)
        else:
            print("Contact not found.")
    elif choice == "5":
        print("Contacts:")
        phone_book.print_contacts()
    elif choice == "6":
        break
    else:
        print("Invalid option.")
``` 

**<font color='magenta'>Total time:</font>** 337.07 sec.

In [44]:
response = query_model(
    system_message,
    user_message="""Please write a small Python module that creates a REST API service
                        with two endpoints: 
                        * get_status (GET)
                        * prediction (POST)
                        The prediction endpoint receives the payload, extract three fields: city, street and number
                        and returns a field called price_estimate
                        
                        """,
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a small Python module that creates a REST API service
                        with two endpoints: 
                        * get_status (GET)
                        * prediction (POST)
                        The prediction endpoint receives the payload, extract three fields: city, street and number
                        and returns a field called price_estimate
                        
                         

**<font color='green'>Answer:</font>** ```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Sample data for demonstration purposes
data = {
    "New York": {"street": "Main St", "number": 123, "price_estimate": 1000},
    "Los Angeles": {"street": "Sunset Blvd", "number": 456, "price_estimate": 800},
    "Chicago": {"street": "Michigan Ave", "number": 789, "price_estimate": 1200}
}

@app.route('/status', methods=['GET'])
def get_status():
    """Return a message indicating the service is up and running."""
    return jsonify({'message': 'Service is up and running'}), 200

@app.route('/prediction', methods=['POST'])
def prediction():
    """Make a prediction based on the provided city, street and number."""
    try:
        data_received = request.json
        city = data_received.get('city')
        street = data_received.get('street')
        number = data_received.get('number')
        
        if city in data and data[city]['street'] == street and data[city]['number'] == number:
            return jsonify({'price_estimate': data[city]['price_estimate']}), 200
 

**<font color='magenta'>Total time:</font>** 157.67 sec.

## C++ code

We continue now with some C++ code.

In [45]:
system_message = """
You are an AI assistant designed to write simple C++ code.
Please answer with the listing of the C++ code.
Question: {question}
Answer:
"""
response = query_model(
    system_message,
    user_message="Please write a function in C++ to calculate the area of a circle of radius r", 
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a function in C++ to calculate the area of a circle of radius r 

**<font color='green'>Answer:</font>** ```cpp
#include <iostream>
#include <cmath>

double calculate_circle_area(double r) {
    return M_PI * pow(r, 2);
}

int main() {
    double radius = 5.0;
    std::cout << "The area of the circle is: " << calculate_circle_area(radius) << std::endl;
    return 0;
}
``` 

**<font color='magenta'>Total time:</font>** 47.39 sec.

In [46]:
system_message = """
You are an AI assistant designed to write simple C++ code.
Please answer with the listing of the C++ code.
Question: {question}
Answer:
"""
response = query_model(
    system_message,
    user_message="Please write a function in C++ to calculate the volume of a cylinder with radius r and height h",
    temperature=0.1,
    max_length=512)
display(Markdown(colorize_text(response)))



**<font color='red'>Question:</font>** Please write a function in C++ to calculate the volume of a cylinder with radius r and height h 

**<font color='green'>Answer:</font>** ```cpp
#include <iostream>

double cylinder_volume(double r, double h) {
    return 3.14159 * r * r * h;
}

int main() {
    double r = 5;
    double h = 10;
    std::cout << "The volume of the cylinder is: " << cylinder_volume(r, h) << std::endl;
    return 0;
}
``` 

**<font color='magenta'>Total time:</font>** 51.62 sec.

In [None]:
response = query_model(
    system_message,
    user_message="Please write a function in C++ to order a list", 
    temperature=0.1, 
    max_length=256)
display(Markdown(colorize_text(response)))

## Multi-parameter questions

### Best food in France

In [None]:
system_message = """
You are an AI assistant designed to answer questions with parameters.
Return the answer formatted nicely, for example with bullet points.
"""
user_message = """
What are the {adjective} {number} {items} from {place}?
"""
response = query_model(
    system_message,
    user_message.format(
    adjective="best",
    number="3",
    items="food",
    place="France"
    ), 
    max_length=256)
display(Markdown(colorize_text(response)))
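The parameterized question is an ordinary Python string template filled in with `str.format`. A minimal sketch of what the rendered question looks like before it reaches the model, using the same template and the same values as the cell above:

```python
user_message = """
What are the {adjective} {number} {items} from {place}?
"""

# Fill the four placeholders with the values used for the France question.
question = user_message.format(
    adjective="best",
    number="3",
    items="food",
    place="France",
)

# repr() makes the leading and trailing newlines of the template visible.
print(repr(question))
```

Because the template is a triple-quoted string, the rendered question keeps a newline at each end; `query_model` then wraps it between `Question: ` and ` Answer:`.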

### Best attractions in Italy

In [None]:
response = query_model(    
    system_message,
    user_message.format(
    adjective="best",
    number="five",
    items="attractions",
    place="Italy"
    ), 
    max_length=256)
display(Markdown(colorize_text(response)))

### Most affordable places to retire in Spain

In [None]:
response = query_model(
    system_message,
    user_message.format(
    adjective="most affordable",
    number="two",
    items="places to retire",
    place="Spain"
    ), 
    max_length=256)
display(Markdown(colorize_text(response)))

### Lesser-known but great places to stay in Romania

In [None]:
response = query_model(    
    system_message,
    user_message.format(
    adjective="Less known but great",
    number="4",
    items="places to stay",
    place="Romania"
    ), 
    max_length=256)
display(Markdown(colorize_text(response)))

### Best comedies by Shakespeare


In [None]:
response = query_model(
    system_message,
    user_message.format(
    adjective="best",
    number="3",
    items="comedies",
    place="Shakespeare's entire creation"
    ), 
    max_length=256)
display(Markdown(colorize_text(response)))

### Most important battles from WW2

In [None]:
response = query_model(
    system_message,
    user_message.format(
    adjective="most important",
    number="5",
    items="battles",
    place="WW2"
    ), 
    max_length=512)
display(Markdown(colorize_text(response)))

## Multi-step reasoning (task chain)

In [None]:
response = query_model(
    system_message,
    user_message.format(
    adjective="most important",
    number="5",
    items="battles",
    place="WW2"
    ), 
    max_length=512)
display(Markdown(colorize_text(response)))
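One possible way to express a task chain, where the answer to one question becomes context for the next, is sketched below. This is only an illustration under stated assumptions: `run_chain` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that echoes its input so the snippet runs without the model.

```python
def run_chain(query_fn, system_message, steps):
    """Run the prompts in order, passing each answer on to the next step."""
    previous_answer = ""
    answers = []
    for step in steps:
        if previous_answer:
            # Give the model the previous answer as context for this step.
            prompt = f"{step}\nContext from the previous step: {previous_answer}"
        else:
            prompt = step
        previous_answer = query_fn(system_message, prompt)
        answers.append(previous_answer)
    return answers


def fake_query(system_message, user_message):
    # Hypothetical stand-in for a real model call: it only echoes the prompt.
    return f"[answer to: {user_message}]"


answers = run_chain(
    fake_query,
    "You are an AI assistant designed to answer questions step by step.",
    ["Name one important battle from WW2.", "In which year did it take place?"],
)
for answer in answers:
    print(answer)
```

If `query_model` were used as `query_fn`, its return value also contains the question and the timing text, so in practice only the answer part would need to be forwarded to the next step.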

## Reasoning like Einstein would

In [None]:
system_message = """
You are a math professor, smart but cool.
Background: A train traveling from Bucharest to Ploiesti (60 km distance) has a speed of 60 km/h. 
The train starts in Bucharest and travels to Ploiesti once, only in this direction.
A swallow, flying at 90 km/h, flies from Ploiesti toward the moving train.
When it reaches the train, the swallow flies back toward Ploiesti,
ahead of the train. At Ploiesti it turns back again and continues to fly back and forth 
(between Ploiesti and the train approaching Ploiesti) until the train reaches Ploiesti. 
The swallow flies continuously the whole time the train is traveling from Bucharest to Ploiesti.
Reasoning: Think step by step. Explain your reasoning.
Question: {question}
Answer:
"""
response = query_model(
    system_message,
    user_message="How many kilometers will the swallow travel in total?", 
    temperature=0.1,
    max_length=512)
display(Markdown(colorize_text(response)))
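For reference when judging the model's answer, the puzzle has a short solution: the swallow flies for exactly as long as the train travels, so its distance is its speed multiplied by the train's travel time. The sketch below computes this directly and cross-checks it by summing the individual back-and-forth legs. The function names are hypothetical, and the leg-by-leg version assumes the swallow is faster than the train.

```python
def swallow_distance_closed_form(track_km=60.0, train_kmh=60.0, swallow_kmh=90.0):
    """Distance = swallow speed * time the train needs for the whole track."""
    travel_time_h = track_km / train_kmh
    return swallow_kmh * travel_time_h


def swallow_distance_by_legs(track_km=60.0, train_kmh=60.0, swallow_kmh=90.0, eps=1e-9):
    """Sum the legs flown until the train is within eps km of Ploiesti."""
    train_pos = 0.0  # train position, measured in km from Bucharest
    total = 0.0
    while track_km - train_pos > eps:
        # Leg toward the train: swallow and train approach each other.
        gap = track_km - train_pos
        t_meet = gap / (train_kmh + swallow_kmh)
        total += swallow_kmh * t_meet
        train_pos += train_kmh * t_meet
        # Leg back to Ploiesti: the swallow returns while the train keeps moving.
        t_back = (track_km - train_pos) / swallow_kmh
        total += swallow_kmh * t_back
        train_pos += train_kmh * t_back
    return total


closed_form_km = swallow_distance_closed_form()
by_legs_km = swallow_distance_by_legs()
print(f"Closed form: {closed_form_km} km")
print(f"Sum of legs: {by_legs_km:.6f} km")
```

With the values from the prompt, the train needs 1 hour, so the swallow covers 90 km; the sum of the legs converges to the same value.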