## Research Project  -  Unimodal Prompt Engineering examples - Kruti Shah

### Install requirements

First, run the cells below to install the requirements:

In [None]:
!pip install transformers==4.29.0



### Model loading

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
from tqdm import tqdm

model_name = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype=torch.bfloat16)

# Set configurations
model.config.max_length = 512
model.config.truncation = True
model.config.pad_token_id = tokenizer.eos_token_id
model.config.eos_token_id = tokenizer.eos_token_id

# Fix the issue by setting _lock to None
model.config._lock = None

# Define text generation pipeline
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1,
    trust_remote_code=True
)

# Generate text
sequences = text_generator(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Xformers is not installed correctly. If you want to use memorry_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
The model 'FalconForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'Pegas

Result: Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.
Daniel: Hello, Girafatron!
Girafatron: Ah, Daniel! It's so nice of you to join the discussion.
Daniel: So, what would you name your first giraffe? (Girafatron smiles) How about Girafafatron? Would it be a silly name?
Girafafatron: It's not a silly name. In fact, it's a very interesting name, Daniel. I believe it fits me perfectly!
Daniel: Oh, I see. How about you, Girafatron? What is your name?
Giraftron: My name is Girafatron!
Daniel: 


### Sentiment Analysis example 1

In [None]:
'''
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch
#from transformers import pipeline, AutoTokenizer
#import torch

torch.manual_seed(0)

model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
model = AutoModelForCausalLM.from_pretrained(model)
model.config.max_length = 512
model.config.truncation = True
model.config.pad_token_id = tokenizer.eos_token_id
model.config.eos_token_id = tokenizer.eos_token_id
'''
pipe = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,

)
torch.manual_seed(0)
prompt = """Classify the text into neutral, negative or positive.
Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
Sentiment:
"""

sequences = pipe(
    prompt,
    max_new_tokens=10,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

The model 'FalconForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNorm

Result: Classify the text into neutral, negative or positive.
Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
Sentiment:
Positive


### Sentiment Analysis example 2

In [None]:
torch.manual_seed(0)
prompt = """Classify the text into neutral, negative or positive.
Text: "Indian Delights impressed me with impeccable customer service, attentive staff, and a warm welcome. The food was a culinary masterpiece, bursting with flavor and creativity, while the ambience added to the enchanting dining experience.
Sentiment:
"""

sequences = pipe(
    prompt,
    max_new_tokens=10,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Result: Classify the text into neutral, negative or positive.
Text: "Indian Delights impressed me with impeccable customer service, attentive staff, and a warm welcome. The food was a culinary masterpiece, bursting with flavor and creativity, while the ambience added to the enchanting dining experience.
Sentiment:
Positive


### Text Classification

In [None]:
torch.manual_seed(0)
prompt = """"Classify the following text as either 'Sports', 'Politics', or 'Entertainment':
Text: 'The latest game between Team A and Team B ended in a thrilling tiebreaker, leaving fans on the edge of their seats.'"
Sentiment:
"""

sequences = pipe(
    prompt,
    max_new_tokens=10,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Result: "Classify the following text as either 'Sports', 'Politics', or 'Entertainment':
Text: 'The latest game between Team A and Team B ended in a thrilling tiebreaker, leaving fans on the edge of their seats.'"
Sentiment:
The sentiment of the text is 'Sports'.


### Named Entity Recognition

In [None]:
torch.manual_seed(1)
prompt = """Return a list of named entities in the text.
Text: The Golden Gate Bridge spans the San Francisco Bay area.
Named entities:
"""

sequences = pipe(
    prompt,
    max_new_tokens=15,
    return_full_text = False,
)

for seq in sequences:
    print(f"{seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


- Golden Gate Bridge
- San Francisco Bay area


### Language Translation

In [None]:
torch.manual_seed(2)
prompt = """Translate the English text to Italian.
Text: "The sun sets behind the mountains, casting a warm glow over the tranquil valley below."
Translation:
"""

sequences = pipe(
    prompt,
    max_new_tokens=20,
    do_sample=True,
    top_k=10,
    return_full_text = False,
)

for seq in sequences:
    print(f"{seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Il sole si allunga dietro ai monti, proiettando una calma luce sulla tranquilla


### Text Summarization example 1

In [None]:
torch.manual_seed(3)
prompt = """Diwali, also known as the Festival of Lights, is one of the most significant Hindu festivals celebrated world-wide
Write a summary of the above text.
Summary:
"""

sequences = pipe(
prompt,
max_new_tokens=50,
do_sample=True,
top_k=10,
return_full_text = False,

)

for seq in sequences:
    print(f"{seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


- Diwali is a significant annual celebration of the Hindu faith.
- It is marked by colorful traditions, prayers, and feasts.
- The festival symbolizes the victory of light over darkness and good over evil.


### Text Summarization example 2

In [None]:
torch.manual_seed(3)
prompt = """Diwali, also known as the Festival of Lights, is one of the most significant Hindu festivals celebrated world
Summarize the above text in 2 sentences
Summary:
"""

sequences = pipe(
prompt,
max_new_tokens=50,
do_sample=True,
top_k=10,
return_full_text = False,

)

for seq in sequences:
    print(f"{seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


1. Diwali is a major Hindu festival celebrated around the world, involving colorful traditions, food, and fireworks.
2. The festival is celebrated in India and other countries with various types of decorations, lights, and sweets, and is typically accompanied


### Question- Answering

In [None]:
torch.manual_seed(4)
prompt = """What are some effective strategies for improving time management skills in a busy work environment?".

Answer:
"""

sequences = pipe(
    prompt,
    max_new_tokens=50,
    do_sample=True,
    top_k=10,
    return_full_text = False,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Result: 1. Prioritize tasks based on urgency and importance.
2. Use a to-do list to keep tasks organized.
3. Set specific deadlines for tasks to ensure progress.
4. Break complex tasks into smaller, more manageable chunks.


### Reasoning

In [20]:
torch.manual_seed(5)
prompt = """If John has 5 apples and he gives 2 apples to his friend Sarah, how many apples does John have left?"""

sequences = pipe(
prompt,
max_new_tokens=30,
do_sample=True,
top_k=10,
return_full_text = False,
)
for seq in sequences:
  print(f"Result: {seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Result: 
John has 3 apples left.


### Few-shot Prompting

In [23]:
torch.manual_seed(6)

prompt = """
Instructions: Generate the next number in each sequence based on the given examples.
Context: You are tutoring a student in arithmetic sequences and want to test their ability to identify the pattern and find the next number.
Examples with Solutions:
1. 2, 4, 6, __
   Example solution: 8

2. 5, 10, 15, __
   Example solution: 20

3. 3, 6, 9, __
   Example solution: 12

4. 12, 9, 6, __
   Example solution: 3

5. 20, 40, 60, __
   Example solution: 80
"""

sequences = pipe(
    prompt,
    max_new_tokens=30,
    do_sample=True,
    top_k=10,
)

for i, seq in enumerate(sequences):
    print(f"Sequence {i + 1}: {seq['generated_text'].strip()}")


Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


Sequence 1: Instructions: Generate the next number in each sequence based on the given examples.
Context: You are tutoring a student in arithmetic sequences and want to test their ability to identify the pattern and find the next number.
Examples with Solutions:
1. 2, 4, 6, __
   Example solution: 8
   
2. 5, 10, 15, __
   Example solution: 20  

3. 3, 6, 9, __
   Example solution: 12  

4. 12, 9, 6, __
   Example solution: 3  

5. 20, 40, 60, __
   Example solution: 80  

6. 30, 60, 90, __
   Example solution: 45
