In [None]:
!pip install transformers==4.48.0

## Loading a Text Generation Model

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained(
    'microsoft/Phi-3-mini-4k-instruct',
    torch_dtype='auto',
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct')


In [None]:
pipe = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=500,
    do_sample=False
)

Device set to use cuda:0


In [None]:
import transformers
print(transformers.__version__)

4.48.0


### Let's see a sample generation

In [None]:
messages = [
    {'role': 'user', 'content': 'Create a funny joke about monkey.'}
]

output = pipe(messages)
print(output[0]['generated_text'])



 Why don't monkeys use computers? Because they're afraid of the "keyboard" monkey!


Here, **transformers.pipeline** first converts the provided messages into the model's prompt template. Let's see what that looks like.

In [None]:
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)

<|user|>
Create a funny joke about monkey.<|end|>
<|endoftext|>
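
The printed prompt ends with `<|endoftext|>` because `apply_chat_template` was called without `add_generation_prompt=True`. For generation, the prompt should instead end with an opened assistant turn. To make the layout concrete, here is a minimal sketch that imitates the format shown above in plain Python. The helper `render_chat_prompt` is hypothetical and only mirrors the visible shape of the output; the real template ships with the tokenizer.

```python
# Hypothetical helper: a plain-Python imitation of the prompt layout printed above.
# It is an assumption based on the visible output, not the tokenizer's actual template code.

def render_chat_prompt(messages, add_generation_prompt=False, eos_token='<|endoftext|>'):
    """Join chat messages into a single prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|role|>\ncontent<|end|>\n
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    # Either open an assistant turn for the model to complete, or close the text.
    parts.append('<|assistant|>\n' if add_generation_prompt else eos_token)
    return ''.join(parts)


messages = [
    {'role': 'user', 'content': 'Create a funny joke about monkey.'}
]
print(render_chat_prompt(messages))
print(render_chat_prompt(messages, add_generation_prompt=True))
```

With the real tokenizer, the equivalent call would be `pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.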


**Temperature** controls the randomness (or creativity) of the generated text. A temperature close to zero almost always picks the most likely token, giving near-deterministic output, whereas a higher temperature (e.g., 0.8) produces more diverse output.

In [None]:
output = pipe(messages,
              do_sample=True,
              temperature=1)
print(output[0]['generated_text'])

 Why don't we ever trust an oatmeal monkey? Because it's always rustling from tip to tail!


Each time we run the above cell, we should get a different output, because `do_sample=True` samples from the token distribution (scaled by **temperature**) instead of always picking the most likely token.
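
To see why temperature changes the randomness, here is a small sketch of temperature-scaled softmax in plain Python. The function name `softmax_with_temperature` and the scores in `toy_logits` are made up for illustration; the model applies the same idea over its whole vocabulary.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError('temperature must be positive; use greedy decoding instead')
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up scores for three candidate next tokens (an assumption for illustration).
toy_logits = [2.0, 1.0, 0.1]

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

A low temperature concentrates almost all probability on the top-scoring token, while a high temperature flattens the distribution so that less likely tokens get sampled more often.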

## Zero-Shot Prompt

In [None]:
zero_shot_prompt = [
    {
        'role': 'user',
        'content': 'What is the capital of France?'
    }
]

In [None]:
print(pipe(zero_shot_prompt, do_sample=False)[0]['generated_text'])

 The capital of France is Paris. It is not only the country's largest city but also its administrative, commercial, and cultural center. Paris is known for its historical landmarks such as the Eiffel Tower, Notre-Dame Cathedral, and the Louvre Museum, which is the world's largest art museum and a historic monument in Paris.


Let's restrict the output to a single sentence.

In [None]:
zero_shot_prompt = [
    {
        'role': 'user',
        'content': 'What is the capital of France? Provide answer in just one sentence'
    }
]

In [None]:
print(pipe(zero_shot_prompt, do_sample=False)[0]['generated_text'])

 The capital of France is Paris.


## One-Shot Prompt

In [None]:
one_shot_prompt = [
    {
        'role': 'user',
        'content': "What is the capital of France?"
    },
    {
        'role': 'assistant',
        'content': 'Paris is the capital of France.'
    },
    {
        'role': 'user',
        'content': 'What is the capital of Nepal?'
    }
]

In [None]:
print(pipe(one_shot_prompt, do_sample=False)[0]['generated_text'])

 Kathmandu is the capital of Nepal.


Similarly, we can use a **few-shot prompt** to provide several examples and expect output in the same style.
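
A few-shot prompt is just the one-shot pattern repeated, so it can be built from a list of example pairs. The sketch below assumes a hypothetical helper `build_few_shot_prompt`; the example pairs are taken from the cells above and the final question is made up.

```python
def build_few_shot_prompt(examples, question):
    """Turn (question, answer) pairs plus a new question into chat messages."""
    messages = []
    for example_question, example_answer in examples:
        messages.append({'role': 'user', 'content': example_question})
        messages.append({'role': 'assistant', 'content': example_answer})
    # The final user turn is the one we want the model to answer.
    messages.append({'role': 'user', 'content': question})
    return messages


examples = [
    ('What is the capital of France?', 'Paris is the capital of France.'),
    ('What is the capital of Nepal?', 'Kathmandu is the capital of Nepal.'),
]
few_shot_prompt = build_few_shot_prompt(examples, 'What is the capital of Japan?')

for message in few_shot_prompt:
    print(message['role'], ':', message['content'])
```

The resulting list can be passed to the pipeline in the same way as before, e.g. `pipe(few_shot_prompt, do_sample=False)`.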

## Chain Prompting

In [None]:
# Create name and slogan for a product
product_prompt = [
    {
        'role': 'user', 'content': 'Create a name and slogan for a chatbot that leverages LLMs'
    }
]
outputs = pipe(product_prompt)
product_description = outputs[0]['generated_text']
print(product_description)

 Name: ChatSage
Slogan: "Unleashing the Power of Conversation with ChatSage"


In [None]:
sales_prompt = [
    {
        "role": "user", "content": f"Generate a very short sales pitch for the following product : '{product_description}'"}
]

outputs = pipe(sales_prompt)
sales_pitch = outputs[0]['generated_text']
print(sales_pitch)

 Introducing ChatSage, the ultimate conversation companion that unlocks the power of communication. With our innovative technology, you can effortlessly connect with others, express your thoughts, and build meaningful relationships. Experience the freedom of conversation like never before with ChatSage. Unleash the power of conversation with ChatSage today!
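
The two cells above follow a general pattern: each step's output is inserted into the next step's prompt. Here is a minimal sketch of that pattern, assuming a hypothetical `run_chain` helper and a stand-in `fake_generate` function so that it runs without a model.

```python
def run_chain(generate, templates, initial_input):
    """Fill each template's {previous} slot with the output of the step before it."""
    previous = initial_input
    outputs = []
    for template in templates:
        prompt = [{'role': 'user', 'content': template.format(previous=previous)}]
        previous = generate(prompt)
        outputs.append(previous)
    return outputs


def fake_generate(prompt):
    """Stand-in for the model (an assumption): wraps the prompt text in brackets."""
    return f"[reply to: {prompt[0]['content']}]"


templates = [
    'Create a name and slogan for {previous}',
    "Generate a very short sales pitch for the following product: '{previous}'",
]
steps = run_chain(fake_generate, templates, 'a chatbot that leverages LLMs')

for step in steps:
    print(step)
```

To use the real model, `fake_generate` could be replaced with `lambda prompt: pipe(prompt)[0]['generated_text']`. Note that templates containing literal braces would need escaping for `str.format`.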


## Chain-of-Thought

In [None]:
cot_prompt = [
    {"role": "user", "content": "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?"},
    {"role": "assistant", "content": "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11."},
    {"role": "user", "content": "The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?"}
]

outputs = pipe(cot_prompt)
print(outputs[0]['generated_text'])

 The cafeteria started with 23 apples. They used 20 apples for lunch, so they had 23 - 20 = 3 apples left. After buying 6 more apples, they now have 3 + 6 = 9 apples. The answer is 9.


### Zero-shot chain-of-thought

In [None]:
zeroshot_cot_prompt = [
    {"role": "user", "content": "The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have? Let's think step-by-step."}
]

outputs = pipe(zeroshot_cot_prompt)
print(outputs[0]['generated_text'])

 Step 1: Start with the initial number of apples in the cafeteria, which is 23.

Step 2: Subtract the number of apples used to make lunch, which is 20.
23 - 20 = 3 apples remaining.

Step 3: Add the number of apples bought, which is 6.
3 + 6 = 9 apples.

So, the cafeteria now has 9 apples.
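
Both chain-of-thought outputs end with the final answer, so we can check them against the arithmetic directly. The sketch below uses a simple heuristic (take the last number in the text); `extract_final_number` is a hypothetical helper, and the heuristic would fail if the model added more numbers after its answer.

```python
import re


def extract_final_number(text):
    """Return the last integer or decimal mentioned in the text, or None."""
    matches = re.findall(r'-?\d+(?:\.\d+)?', text)
    return float(matches[-1]) if matches else None


# Outputs copied from the two chain-of-thought cells above.
few_shot_answer = (
    'The cafeteria started with 23 apples. They used 20 apples for lunch, '
    'so they had 23 - 20 = 3 apples left. After buying 6 more apples, '
    'they now have 3 + 6 = 9 apples. The answer is 9.'
)
zero_shot_answer = 'So, the cafeteria now has 9 apples.'

expected = 23 - 20 + 6  # the arithmetic the question asks for

print(extract_final_number(few_shot_answer) == expected)
print(extract_final_number(zero_shot_answer) == expected)
```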


## Tree-of-Thought

In [None]:
zeroshot_tot_prompt = [
    {"role": "user", "content": "Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realizes they're wrong at any point then they leave. The question is 'The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?' Make sure to discuss the results."}
]

outputs = pipe(zeroshot_tot_prompt)
print(outputs[0]['generated_text'])

 Expert 1:
Step 1: Start with the initial number of apples, which is 23.

Expert 2:
Step 1: Subtract the number of apples used for lunch, which is 20. This leaves us with 3 apples.
Step 2: Add the number of apples bought, which is 6. This results in a total of 9 apples.

Expert 3:
Step 1: Begin with the initial number of apples, which is 23.
Step 2: Subtract the number of apples used for lunch, which is 20. This leaves us with 3 apples.
Step 3: Add the number of apples bought, which is 6. This results in a total of 9 apples.

Discussion:
All three experts arrived at the same answer, which is 9 apples. This indicates that their calculations were correct. The cafeteria started with 23 apples, used 20 for lunch, and then bought 6 more, resulting in a total of 9 apples.


## Output Verification

In [None]:
zeroshot_prompt = [
    {"role": "user", "content": "Create a character profile for an RPG game in JSON format"}
]

outputs = pipe(zeroshot_prompt)
print(outputs[0]['generated_text'])



 ```json
{
  "character_name": "Aelar the Swift",
  "class": "Rogue",
  "level": 1,
  "attributes": {
    "strength": 10,
    "dexterity": 18,
    "constitution": 12,
    "intelligence": 14,
    "wisdom": 13,
    "charisma": 11
  },
  "skills": {
    "stealth": 15,
    "lockpicking": 14,
    "acrobatics": 13,
    "perception": 12
  },
  "equipment": {
    "weapon": "Dagger",
    "armor": "Leather Armor",
    "accessories": ["Gloves of Dexterity", "Boots of Speed"]
  },
  "background": "Aelar was born into a family of thieves. He learned the art of stealth and deception from a young age. He is always looking for the next big heist and is not afraid to take risks."
}
```


In [None]:
one_shot_template = """Create a short character profile for an RPG game. Make sure to only use this format:
                    {
                      "description": "A SHORT DESCRIPTION",
                      "name": "THE CHARACTER's NAME",
                      "armor": "ONE PIECE OF ARMOR",
                      "weapon": "ONE OR MORE WEAPONS"
                    }
                    """
one_shot_prompt = [
    {"role": "user", "content": one_shot_template}
]

outputs = pipe(one_shot_prompt)
print(outputs[0]['generated_text'])


 {
                  "description": "A cunning rogue with a mysterious past, skilled in stealth and deception.",
                  "name": "Lyra Shadowhand",
                  "armor": "Leather Vest",
                  "weapon": "Dagger, Crossbow"
                 }
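
Asking for a format does not guarantee we get it, so it helps to verify the output in code. Here is a minimal sketch using the standard `json` module, assuming a hypothetical `validate_profile` helper that checks for exactly the four keys requested in the template above.

```python
import json

# The keys requested in the prompt template above.
REQUIRED_KEYS = {'description', 'name', 'armor', 'weapon'}


def validate_profile(raw_text):
    """Parse model output as JSON and check it has exactly the requested keys."""
    try:
        profile = json.loads(raw_text)
    except json.JSONDecodeError as error:
        return False, f'not valid JSON: {error}'
    if not isinstance(profile, dict):
        return False, 'top-level value is not an object'
    missing = REQUIRED_KEYS - set(profile)
    extra = set(profile) - REQUIRED_KEYS
    if missing or extra:
        return False, f'missing keys: {sorted(missing)}, unexpected keys: {sorted(extra)}'
    return True, 'ok'


# The output generated above, copied as a string.
generated = """ {
  "description": "A cunning rogue with a mysterious past, skilled in stealth and deception.",
  "name": "Lyra Shadowhand",
  "armor": "Leather Vest",
  "weapon": "Dagger, Crossbow"
 }"""

print(validate_profile(generated))
print(validate_profile('{"name": "Lyra Shadowhand"}'))
```

In the notebook, the same check could be applied to `outputs[0]['generated_text']`. Output wrapped in Markdown code fences, like the first profile above, would need the fences stripped before parsing.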
