# Using LLMs: Phi-3

Open this script in Google colab <img src="https://colab.google/static/images/icons/colab.png" width=100>

The setup of the LLM is based on: https://github.com/HandsOnLLM/Hands-On-Large-Language-Models.
Phi-3 is a free LLM.

The use of LLMs is based on https://www.coursera.org/learn/ai-python-for-beginners/

### [OPTIONAL] - Installing Packages on Google colab
If you are viewing this notebook on Google Colab (or any other cloud vendor), you need to **uncomment and run** the following codeblock to install the dependencies for this chapter:

---

💡 **NOTE**: We will want to use a GPU to run the examples in this notebook. In Google Colab, go to
**Runtime > Change runtime type > Hardware accelerator > GPU > GPU type > T4**.

---

In [4]:
%%capture
!pip install transformers>=4.40.1 accelerate>=0.27.2

# Setup our LLM: Phi-3

The first step is to load our model onto the GPU for faster inference. Note that we load the model and tokenizer separately (although that isn't always necessary).

In [5]:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=False,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/967 [00:00<?, ?B/s]

model.safetensors.index.json: 0.00B [00:00, ?B/s]

Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/2.67G [00:00<?, ?B/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.97G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/181 [00:00<?, ?B/s]

tokenizer_config.json: 0.00B [00:00, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

added_tokens.json:   0%|          | 0.00/306 [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/599 [00:00<?, ?B/s]

Although we can now use the model and tokenizer directly, it's much easier to wrap it in a `pipeline` object:

In [6]:
from transformers import pipeline

# Create a pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=500,
    do_sample=False
)

Device set to use cuda
The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


Finally, we create our prompt as a user and give it to the model.
Example of prompt.

In [8]:
# The prompt (user input / query)
messages = [
    {"role": "user", "content": "How does MLflow compares to databricks"}
]

# Generate output
output = generator(messages)
print(output[0]["generated_text"])

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 MLflow and Databricks are both powerful tools for managing machine learning workflows, but they have different focuses and strengths. Here's a comparison of the two:

1. Purpose:
   - MLflow: MLflow is a general-purpose machine learning lifecycle management tool that helps you manage your machine learning projects from experimentation to production. It provides a unified platform for tracking experiments, managing models, and deploying machine learning applications.
   - Databricks: Databricks is a cloud-based platform for collaborative data science and analytics. It provides a comprehensive suite of tools for data engineering, data science, and machine learning, including Apache Spark, SQL, and machine learning libraries.

2. Features:
   - MLflow: MLflow provides features for experiment tracking, model management, and deployment. It allows you to log and compare experiments, manage models, and deploy models to various environments. MLflow also supports multiple machine learning fram

In [10]:
# Create a function to interact easily with LLM
def my_llm(my_prompt):
    """
    Esta funcion toma un string (prompt) y retorna el resultado de llamar un LLM

    Parameters:
        my_prompt (string): prompt


    Returns:
        output (string): Respuesta LLM.
    """

    messages = [{"role": "user", "content": my_prompt}]

    # Generate output
    output = generator(messages)
    print(output[0]["generated_text"])

In [13]:
# Ejemplo de uso de la funcion my_llm
my_prompt = "What are the best ML papers made by researchers from Colombia?"
my_llm(my_prompt)

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 As of my last update in April 2023, identifying the "best" ML papers by Colombian researchers requires a nuanced approach, considering the dynamic nature of academic contributions and the evolving landscape of machine learning research. However, I can guide you on how to find such papers and highlight some notable contributions from Colombian researchers in the field of machine learning.

### How to Find the Best ML Papers by Colombian Researchers

1. **Academic Databases and Journals**: Start with databases like Google Scholar, IEEE Xplore, and the ACM Digital Library. Use filters to narrow down the search to Colombian institutions or researchers.

2. **Researcher Profiles**: Look for profiles of Colombian researchers on platforms like ResearchGate or LinkedIn. Many researchers list their publications, which can be filtered by their country of origin.

3. **Colombian Academic Institutions**: Universities and research institutions in Colombia, such as Universidad Nacional de Colombia,

# Building LLM prompts with variables
The following is based on https://www.coursera.org/learn/ai-python-for-beginners. That course uses OpenAI API. Here we will use the Phi-3-mini

In [15]:
# L9: Building LLM prompts with variables
# Basically, you can use that function my_llm(my_prompt) as if you were asking a chatbot. You just need to provide your instructions as a string. For instance, you can ask "What is the capital of France?" using the following code:
my_llm("What is the capital of France?")

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 The capital of France is Paris.


In [17]:
# Let's ask the LLM for the lifestyle description for Otto Matic, whose name is stored in `name`, if he were a `dog_age` years old dog.
name = "Bruno"
dog_age = 7
my_llm(f"""If {name} were a dog, he would be {dog_age} years old. Describe what life stage that would be for a dog and what that might
entail in terms of energy level, interests, and behavior.""")





The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 If Bruno were a dog and he was 7 years old, he would be considered a senior dog. This life stage typically begins around the age of 7 to 10 years, depending on the breed and size of the dog. During this time, dogs may experience a decrease in energy levels and may not be as active as they were in their younger years. They may also start to show signs of aging, such as graying fur, decreased mobility, and a slower metabolism.

In terms of energy level, a senior dog like Bruno may not be as enthusiastic about long walks or playtime as he once was. He may prefer shorter, more leisurely walks and may need more frequent breaks during playtime. It's important to listen to your dog's cues and adjust your activities accordingly to ensure he is comfortable and not overexerted.

Interests may also change as Bruno ages. He may become more interested in relaxing activities, such as cuddling on the couch or napping. He may also enjoy shorter, more gentle play sessions with his favorite toys or tre

In [22]:
## Variable names restrictions
# The following variable names have some problems. Try to fix them yourself.
driver = "unicorn"
driver_vehicle = "colorful, asymmetric dinosaur car"
favorite_planet = "Pluto"



In [25]:
# Now, update the next cell with any changes you made in the previous cell.
my_llm(f"""Write me a 300 word children's story about a {driver} racing
a {driver_vehicle} for the {favorite_planet} champion cup.""")


The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 Once upon a time, in a land far, far away, there was a magical race called the Pluto Champion Cup. This race was not like any other race; it was a race for the most colorful and asymmetric dinosaur cars. The cars were made by the most creative and imaginative children in the land.

One day, a young unicorn named Sparkle decided to enter the race. Sparkle was a beautiful unicorn with a shimmering white coat and a rainbow mane. She had never raced before, but she was determined to win the Pluto Champion Cup.

Sparkle's best friend, a friendly dinosaur named Dino, helped her build her car. They used brightly colored paints and glitter to make it look as colorful and asymmetric as possible. The car had a long, curvy body with a rainbow stripe down the middle, and it was covered in sparkles that shone in the sunlight.

On the day of the race, Sparkle and Dino arrived at the starting line, where all the other dinosaur cars were lined up. The cars were all different shapes and sizes, but the

In [28]:
## Extra practice
# Try the exercises below to practice the concepts from this lesson. Read the comments in each cell with the instructions for each exercise.
# Fix this code
favorite_book = "1001 Ways to Wear a Hat"
second_fav_book = "2002 Ways to Wear a Scarf"
print(f"My most favorite book is {favorite_book}, but I also like {second_fav_book}")

# Make variables for your favorite game, movie, and food.
# Then use print_llm_response to ask the LLM to recommend you
# a new song to listen to based on your likes.



My most favorite book is 1001 Ways to Wear a Hat, but I also like 2002 Ways to Wear a Scarf


In [29]:
# Variables
fav_game = "FIFA"
fav_movie = "The Truman Show"
fav_food = "a hamburger"

# Prompt
my_llm(f"""

I will first tell you some of my preferences:
My favorite game is {fav_game}.
My favorite movie is {fav_movie}.
My favorite food is {fav_food}.

Analyze these preferences and use them to recommend me a new song to listen to, based on my likes.

""")

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 Given your preferences for FIFA, The Truman Show, and hamburgers, it seems you enjoy a mix of sports, drama, and comfort food. A song that could resonate with these interests might be "The Game" by Imagine Dragons. This song has an energetic vibe that could complement your love for football (FIFA), and its themes of competition and perseverance might appeal to the drama of The Truman Show. Additionally, the song's upbeat tempo and catchy chorus could make it a perfect soundtrack for enjoying a hamburger.


# Functions

In previous lessons, you have used the `print()` function to display values directly to the screen and the `my_llm()` function to use an LLM following the instruction you provide as a string. Below, you will print `"¯\_(ツ)_/¯"` and ask the LLM about the capital of France.

In [30]:
print("¯\_(ツ)_/¯")

¯\_(ツ)_/¯


  print("¯\_(ツ)_/¯")


In [31]:
my_llm("What is the capital of France?")

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 The capital of France is Paris.


You have also used the `type` function, which gives you the type used in Python for a value or variable you provide. For instance, the type of 17 is `int` (for integer).

In [32]:
type(17)

int

In this lesson, you will see more function examples and explore more deeply how functions work.

## Functions to count, to round, and to do much more

There are many functions in Python that you can use straight out of the box. For instance, the `len()` function counts the characters in a string. So when you run the code below, you will display (using `print()`) the result of counting (with `len()`) the number of characters in the string `"Hello World!"`.

In [35]:
print( len("Esta es una frase excesivamente larga para lo que esta diciendo") )
print( len( "Hola") > len("Adios"))

63
False


As another example, you can use `round()` to take a floating point number and round it to the nearest integer. Below, you use `print()` to display the result of rounding (with `round()`) the number `42.17`.

In [38]:
print(round(3.1415926535897932384626433, 3))

3.142


You can save the result from a function using variables in a very similar way to what you have already explored in previous lessons. Below, you save the result from `len("Hello World!")` to the variable `string_length`.

In [40]:
# Guardamos
string_length = len("Frase corta")
pi_round = round(3.1415926535897932384626433, 3)

# Mostramos
print(string_length)
print(pi_round)

11


There are many functions in Python, and you don't have to memorize them all. If you ever need a function to perform a specific task, you can ask the chatbot. Try it now with the suggested prompt here or try your own.

<p style="background-color:#F5C780; padding:15px"> 🤖 <b>Use the Chatbot</b>: How can I find the length of a string?
</p>

In [41]:
# Probamos el chatbot
my_llm("What is a very useful function nobody uses in pyhton?")

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 In Python, there isn't a single function that is universally considered "very useful" but not used by anyone, as the utility of a function often depends on the specific needs of a project or the preferences of the developer. However, there are some functions and features in Python that might not be as commonly used or might be overlooked due to their specificity or the context in which they are used. Here are a few examples:

1. **`functools.lru_cache`**: This decorator wraps a function with a memoizing callable that saves up to the maxsize most recent calls. It can save time when an expensive or I/O bound function is periodically called with the same arguments. However, it might not be used in projects where caching is not beneficial or where the overhead of managing the cache outweighs its benefits.

2. **`pathlib.Path.rglob`**: Part of the `pathlib` module, `rglob` is a method for recursively searching for files matching a pattern. While powerful for file system operations, it migh

## Using functions in AI programs

Functions can be used alongside variables in AI programs. In the previous lesson, you saw how to create custom instructions (or prompts) for an LLM using variables. In the cell below, you will use variables and the `round()` function to create a prompt that you will use for an LLM with the `get_llm_response()` function. The `get_llm_response()` function is very similar to `print_llm_response()` (which you used before); the main difference is that you get a string as a result instead of just displaying the LLM response. This way, you can store the LLM response in the variable `response`.

In [42]:
name = "Tommy"
potatoes = 4.75
prompt = f"""Write a couplet about my friend {name} who has about {round(potatoes)} potatoes"""
response = my_llm(prompt)
print(response)

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 In Tommy's garden, the potatoes grow,

Five in number, they're his earthy show.
None


## Extra practice

Try the exercises below to practice the concepts from this lesson. Read the comments in each cell with the instructions for each exercise.

<b>Feel free to use the chatbot if you need help.</b>

In [None]:
# Enter one of your favorite numbers. Multiply the result by 10 and save it to a variable called 'lucky_number'.
# Print a message saying "Your lucky number is [lucky_number]!"

lucky_number = 4*10
print(f"Your lucky number is {lucky_number}!")

In [51]:
# Use print_llm_response() to print a poem with the specified number of lines. Use the
# prompt variable to save your prompt before calling print_llm_response()

number_of_lines = 100
prompt = f"""Write a poem with {number_of_lines} lines."""


# Creamos la funciíon get_llm_response()

def get_llm_response(my_prompt):
    """
    Esta funcion toma un string (prompt) y retorna el resultado de llamar un LLM
    sin imprimir este resultado.

    Parameters:
        my_prompt (string): prompt


    Returns:
        output (string): Respuesta LLM.
    """

    messages = [{"role": "user", "content": my_prompt}]

    # Generate output
    output = generator(messages)
    return output[0]["generated_text"]


# Usamos la función
get_llm_response(prompt)

# Sale el output en el terminal, pero no sale como algo impreso. Eso explica los
# caracteres de saltar linea (\n).

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


" In the realm of words, where thoughts take flight,\nA poem unfolds, a beacon of light.\nWith rhythm and rhyme, it weaves its tale,\nA hundred lines, a wondrous veil.\n\nIn the beginning, a whisper soft,\nA gentle breeze, a lover's loft.\nThe words dance, they twirl and sway,\nIn the poet's mind, they find their way.\n\nThe first line, a seed, a spark,\nA promise of the journey's mark.\nThe second line, a step, a leap,\nA path that's carved, a promise to keep.\n\nThe third line, a bridge, a span,\nA connection, a bond, a plan.\nThe fourth line, a challenge, a test,\nA hurdle to overcome, a quest.\n\nThe fifth line, a pause, a breath,\nA moment's peace, a moment's death.\nThe sixth line, a question, a plea,\nA cry for help, a cry for glee.\n\nThe seventh line, a revelation,\nA truth unveiled, a new creation.\nThe eighth line, a twist, a turn,\nA surprise, a lesson learned.\n\nThe ninth line, a climax, a peak,\nA moment's glory, a moment's critique.\nThe tenth line, a resolution, a clos

In [52]:
# Repeat exercise 2, this time using the function get_llm_response(), then print() to print it. This function asks
# the LLM for a response, just like print_llm_response, but does not print it. You'll need to save the response to
# a variable, then print it out separately.

number_of_lines = 100
prompt = f"""Write a poem with {number_of_lines} lines."""
response = get_llm_response(prompt)
print(response)

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


 In the realm of words, where thoughts take flight,
A poem unfolds, a beacon of light.
With rhythm and rhyme, it weaves its tale,
A hundred lines, a wondrous veil.

In the beginning, a whisper soft,
A gentle breeze, a lover's loft.
The words dance, they twirl and sway,
In the poet's mind, they find their way.

The first line, a seed, a spark,
A promise of the journey's mark.
The second line, a step, a leap,
A path that's carved, a promise to keep.

The third line, a bridge, a span,
A connection, a bond, a plan.
The fourth line, a challenge, a test,
A hurdle to overcome, a quest.

The fifth line, a pause, a breath,
A moment's peace, a moment's death.
The sixth line, a question, a plea,
A cry for help, a cry for glee.

The seventh line, a revelation,
A truth unveiled, a new creation.
The eighth line, a twist, a turn,
A surprise, a lesson learned.

The ninth line, a climax, a peak,
A moment's glory, a moment's critique.
The tenth line, a resolution, a close,
A sense of peace, a sense of r