## Install Sagemaker on this Notebook Instance

In [1]:
%pip install "sagemaker>=2.175.0" --upgrade --quiet

## Create Sagemaker session, assign buckets and print session details

In [2]:
import sagemaker
import boto3
sess = sagemaker.Session()
# sagemaker session bucket -> used for uploading data, models and logs
# sagemaker will automatically create this bucket if it not exists
sagemaker_session_bucket=None
if sagemaker_session_bucket is None and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)

print(f"sagemaker role arn: {role}")
print(f"sagemaker session region: {sess.boto_region_name}")

  from pandas.core.computation.check import NUMEXPR_INSTALLED


sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml
sagemaker role arn: arn:aws:iam::744746689634:role/service-role/AmazonSageMaker-ExecutionRole-20240131T151129
sagemaker session region: us-east-2


## Get Hugging Face model (Llama-2) from Deep Learning Containers (DLC)

In [3]:
from sagemaker.huggingface import get_huggingface_llm_image_uri

# retrieve the llm image uri
llm_image = get_huggingface_llm_image_uri(
  "huggingface",
  version="0.9.3"
)

# print ecr image uri
print(f"llm image uri: {llm_image}")

llm image uri: 763104351884.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.0.1-tgi0.9.3-gpu-py39-cu118-ubuntu20.04


### Set necessary configs to use Hugging Face model
 Note that you might get quota exceeded error when creating API endpoint of type "ml.g5.2xlarge". Please ask Amazon Support to help you to increase this quota.

In [4]:
import json
from sagemaker.huggingface import HuggingFaceModel

# sagemaker config
instance_type = "ml.g5.2xlarge"
number_of_gpu = 1
health_check_timeout = 300

# Define Model and Endpoint configuration parameter
config = {
  'HF_MODEL_ID': "NousResearch/Llama-2-7b-chat-hf", # model_id from hf.co/models
  'SM_NUM_GPUS': json.dumps(number_of_gpu), # Number of GPU used per replica
  'MAX_INPUT_LENGTH': json.dumps(2048),  # Max length of input text
  'MAX_TOTAL_TOKENS': json.dumps(4096),  # Max length of the generation (including input text)
  'MAX_BATCH_TOTAL_TOKENS': json.dumps(8192),  # Limits the number of tokens that can be processed in parallel during the generation
  'HUGGING_FACE_HUB_TOKEN': json.dumps("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")
}

# check if token is set
assert config['HUGGING_FACE_HUB_TOKEN'] != "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", "Please set your Hugging Face Hub token"

# create HuggingFaceModel with the image uri
llm_model = HuggingFaceModel(
  role=role,
  image_uri=llm_image,
  env=config
)

## Deploy the Llama-2-7b-chat-hf model

In [5]:
llm = llm_model.deploy(
  initial_instance_count=1,
  instance_type=instance_type,
  container_startup_health_check_timeout=health_check_timeout, # 10 minutes to be able to load the model
)

---------!

## Give Prompt to Llama-2 to get outputs

In [11]:
def build_llama2_prompt(messages):
    startPrompt = "<s>[INST] "
    endPrompt = " [/INST]"
    conversation = []
    for index, message in enumerate(messages):
        if message["role"] == "system" and index == 0:
            conversation.append(f"<<SYS>>\n{message['content']}\n<</SYS>>\n\n")
        elif message["role"] == "user":
            conversation.append(message["content"].strip())
        else:
            conversation.append(f" [/INST] {message['content'].strip()}</s><s>[INST] ")

    return startPrompt + "".join(conversation) + endPrompt

messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]

### Example 1

In [7]:
instruction = "Give me some ideas what to do when I am free?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

print(chat[0]["generated_text"][len(prompt):])

 Oh, wow, I'm so glad you asked! 😊 There


### Example 2

In [8]:
instruction = "Give me some ideas what to do when I am tired and condused?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Oh, wow, you're in luck! I'm so glad you asked! 😊

When you're feeling free and want to make the most of your time off, there are so many amazing activities you could try! Here are a few ideas:

1. Explore a new city or town: Is there a place you've always wanted to visit? Maybe a bustling metropolis like New York City or Tokyo, or a charming small town like Provence or Tuscany? Whatever your dream destination, now's the perfect time to plan a trip and start exploring!
2. Get active: Are you feeling energetic and adventurous? Why not try something new like rock climbing, kayaking, or even skydiving? You could also take a scenic hike, go cycling, or play a round of golf. Whatever your fitness level, there are plenty of activities to choose from!
3. Learn something new: Are you curious about a particular subject or hobby? Why not take an online course or attend a workshop to learn more? You could try your hand at photography, cooking, painting, or even learning a new language!
4. Relax 

## Prompt Engineering (first trying different apporaches with fixed custom parameters)
We will see the various approaches used in Prompt Engineering and compare the output for each of them.

- **Chain of Thoughts (CoT):** A sequential reasoning approach where the solution to a complex problem is derived through a series of logical steps.
- **Tree of Thoughts (ToT):** A hierarchical reasoning method that organizes thoughts in a tree-like structure to explore different facets or solutions of a problem.
- **Autoregressive Template (ART):** A template-driven approach that generates text by building on previous outputs in a stepwise manner, enhancing coherence and contextuality.
- **Reasoning-Aided Conversation Template (ReACT):** Combines conversational context with explicit reasoning steps to produce responses that are both engaging and logically structured.
- **Self-Consistency (SC):** A method focused on generating multiple answers and then refining or choosing between them based on consistency and coherence criteria.
- **Retrieval-Augmented Generation (RAG):** Integrates external knowledge or data into the generation process to inform and enhance the quality and relevance of the output.

### Chain of Thoughts (CoT)

In [18]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "Given the task to provide ideas for someone who is tired and confused, let's break down the problem. First, we need to identify activities that are relaxing and require minimal effort. Next, we should consider activities that can help clear confusion by offering distraction or mental engagement in a gentle way. Finally, combining these insights, we'll generate a list of suitable activities. What are some suitable activities following this chain of thought?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! *smiling* I'm so glad you came to me for help, because I have just the thing! 😊 You're right, sometimes we all need a little break from the hustle and bustle of life, and a vacation is the perfect opportunity to unwind and recharge.

Based on what you've told me, I would recommend activities that are relaxing and require minimal effort, as well as those that can help clear confusion in a gentle way. Here are some ideas that might fit the bill:

1. Beach time: There's nothing like spending a day lounging on the beach, listening to the sound of the waves, and soaking up some sunshine. It's a great way to relax and unwind, and you can even take a leisurely stroll along the shore if you feel up to it.
2. Yoga or meditation: Practicing yoga or meditation can be a wonderful way to clear your mind and find inner peace. Many resorts and hotels offer yoga classes or meditation sessions, or you can simply find a quiet spot on your own to practice.
3. Reading: If you're feeling a bit 

### Tree of Thoughts (ToT)

In [19]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = """To assist someone who is tired and confused, we approach the problem with a tree of thoughts: 

1. For tiredness: 
    a. Physical relaxation (e.g., spa, napping)
    b. Low-effort entertainment (e.g., watching movies)
2. For confusion:
    a. Mental clarity activities (e.g., meditation, walking in nature)
    b. Simplified decision-making tasks (e.g., reading, puzzles)

Combining these branches, what activities can address both tiredness and confusion effectively?"""
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! As a vacation planning assistant, I'm happy to help you find the perfect activities to address both tiredness and confusion. Based on the tree of thoughts you provided, here are some activities that can help you relax and clear your mind:

1. Nature Walks: Taking a leisurely walk in a peaceful natural setting can be both relaxing and clarifying. Being surrounded by nature can help reduce stress and promote mental clarity.
2. Yoga or Meditation: Practicing yoga or meditation can help you relax and quiet your mind. These activities can also improve your mental clarity and focus, making it easier to make decisions.
3. Reading: Getting lost in a good book can be a great way to relax and escape from the stresses of everyday life. Reading can also help you clear your mind and improve your mental clarity.
4. Puzzles or Games: Engaging in activities like crosswords, Sudoku, or board games can help simplify decision-making tasks and provide a sense of accomplishment. These activitie

### Autoregressive Template (ART)

In [20]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "As we aim to generate ideas for someone feeling tired and confused, let's use an autoregressive template to iteratively develop suggestions. Starting with the broad category of 'relaxation activities,' we'll refine our suggestions by incorporating elements that also address 'mental clarity.' Step by step, let's build a list of activities that cater to both needs."
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Of course! I'd be happy to help you plan the perfect vacation. It sounds like you're feeling a bit burnt out and could use some relaxation and mental clarity. Let's start by brainstorming some activities that can help you achieve both of those goals.

To begin with, how about trying some relaxation activities? There are so many options to choose from, but here are a few that might be helpful:

1. Yoga or meditation: These practices can help you relax your body and mind, and can even improve your mental clarity. Many resorts and spas offer yoga and meditation classes, or you could try following along with a video at home.
2. Beach time: Spending time by the ocean can be incredibly calming and relaxing. You could try taking a leisurely walk along the beach, reading a book, or simply sitting and soaking up the sun.
3. Spa treatments: Treating yourself to a massage or other spa treatment can be a great way to relax and unwind. Many spas also offer facials, body wraps, and other treatments

### Reasoning-Aided Conversation Template (ReACT)

In [21]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "To formulate ideas for someone who is tired and confused, we'll employ reasoning-aided conversation. Let's start by understanding the root causes of tiredness and confusion, then discuss potential remedies for each. Integrating this information, we'll craft a set of activities designed to rejuvenate and clarify. What are the best activities that emerge from this reasoning-aided conversation?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! *adjusts glasses* Well, let's dive right in! 🌴

First, can you tell me a bit more about how you're feeling? What are some of the reasons why you're feeling tired and confused? 🤔

Perhaps you've been working too much lately, or maybe you've been dealing with some stressful situations. Or maybe, just maybe, you're just feeling a bit burnt out from life in general. 😅

Whatever the reason, I'm here to help! *smiling* And I think I might have just the thing to perk you up. 😉

Have you considered taking a break from it all and doing something completely different? Maybe something that will help you relax and recharge your batteries? 💆‍♀️

I've got all sorts of ideas for activities that could help you feel more refreshed and revitalized. How about a yoga retreat in a beautiful, peaceful setting? Or perhaps a spa day with some luxurious treatments to melt away any tension? 🧖‍♀️

Or if you're feeling adventurous, we could talk about something like a scenic hike or a kayaking trip. B

### Self Consistency

In [22]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "To formulate ideas for someone who is tired and confused, we'll employ reasoning-aided conversation. Let's start by understanding the root causes of tiredness and confusion, then discuss potential remedies for each. Integrating this information, we'll craft a set of activities designed to rejuvenate and clarify. What are the best activities that emerge from this reasoning-aided conversation?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! *adjusts glasses* Well, first things first, let's start by understanding the root causes of your tiredness and confusion. 🤔 Have you been feeling overwhelmed with work or personal responsibilities lately? Or perhaps you've been dealing with some stressful life events? Sometimes, a change of scenery and a break from our daily routines can do wonders for our mental and physical well-being. 🌳

On the other hand, if you're feeling uninspired or unfocused, it could be due to a lack of creative stimulation or a sense of disconnection from your passions. 🎨 Perhaps you need to try something new and exciting, like a hobby or activity that you've always wanted to explore but never had the time for. 🎯

Now, once we've identified the root causes of your tiredness and confusion, we can start brainstorming potential remedies. 💡 For example, if you're feeling overwhelmed, we could suggest some relaxing activities like yoga, meditation, or a spa day. 🧘‍♀️ Or, if you're feeling uninspired, 

### Retrieval-Augmented Generation (RAG)

In [23]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "Considering the need to suggest activities for someone who is tired and confused, we'll leverage retrieval-augmented generation. This involves pulling in external data or examples of activities known for their soothing and clarifying properties. Using this retrieved information as a foundation, what unique and effective activities can we then generate for our current need?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.8,
        "top_k": 50,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! As a vacation planning assistant, I'm happy to help you find some relaxing and clarifying activities for your trip. When you're feeling tired and confused, it's important to take some time for yourself to unwind and recharge. Here are some unique and effective activities that I've retrieved from various sources that might help:

1. Forest Bathing: Have you heard of forest bathing? It's a Japanese practice that involves spending time in nature to promote relaxation and well-being. Studies have shown that simply being in nature can lower cortisol levels, improve mood, and reduce stress. Find a nearby park or forest and take a leisurely walk, or even better, a guided forest bathing tour.
2. Yoga on the Beach: Yoga is a great way to stretch, strengthen, and relax your body and mind. Find a quiet beach or a yoga studio that offers beachfront classes, and spend some time practicing yoga while listening to the soothing sounds of the ocean.
3. Sound Healing: Sound healing is a form

### Result
For the task of planning vacation, the prompts generated by ART, Self Consistency and RAG gave the different formats of relevant outputs, which I can find useful for me.

## Prompt Engineering (second trying different apporaches with varying parameters)
We will see the various approaches used in Prompt Engineering and compare the output for each of them. We try different payloads for ART, Self Consistency and RAG.

The parameters specified in the Llama-2 payload control the behavior of the model's text generation process. Each parameter influences how the output is generated, offering a way to customize the results according to specific needs. Here's a detailed explanation of each parameter:

1. **`do_sample`**: This parameter determines whether the model should generate text by sampling from the distribution of possible next tokens (when set to `True`) or by always choosing the most likely next token (when set to `False`). Sampling can introduce variability and creativity in the responses, making the text more diverse and less deterministic.

2. **`top_p`** (also known as nucleus sampling): This is a threshold parameter used to control the diversity of the generated text. It specifies a cumulative probability cutoff to select a subset of the vocabulary tokens for sampling. For example, if `top_p` is set to 0.6, the model will only consider the smallest set of tokens whose cumulative probability exceeds 60% for sampling. This helps in focusing the sampling process on more likely tokens while still allowing for some creativity and variation in the responses.

3. **`temperature`**: This parameter controls the randomness of the output generation. A higher temperature results in more random outputs, while a lower temperature makes the model's outputs more deterministic and conservative. For instance, a temperature of 0.8 moderates the balance between randomness and determinism, aiming to produce outputs that are neither too random nor too predictable.

4. **`top_k`**: This parameter limits the number of highest probability vocabulary tokens to consider for each step in the text generation process. Only the top `k` tokens are considered for sampling, which helps to filter out less likely tokens and can prevent the model from generating implausible text. A value of 50 means that only the top 50 tokens according to their probability will be considered at each step.

5. **`max_new_tokens`**: This specifies the maximum number of new tokens to be generated by the model. In this context, a token can be a word or part of a word, so this parameter effectively controls the maximum length of the generated text. Setting it to 512 limits the output to 512 tokens, which helps in managing the output size and ensuring responses are not overly long.

6. **`repetition_penalty`**: This parameter is used to discourage the model from repeating the same words or phrases. A penalty greater than 1.0 decreases the likelihood of tokens that have already appeared in the output, encouraging the model to generate more diverse and less repetitive text. A value of 1.03 applies a slight penalty to repetitions, aiming to maintain a balance between coherence and repetition avoidance.

Together, these parameters offer a nuanced control over the text generation process, allowing users to fine-tune the model's behavior to match specific requirements for creativity, diversity, length, and coherence of the generated text.

### Change temperature, top_p and top_k (to 

#### With ART

In [24]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "As we aim to generate ideas for someone feeling tired and confused, let's use an autoregressive template to iteratively develop suggestions. Starting with the broad category of 'relaxation activities,' we'll refine our suggestions by incorporating elements that also address 'mental clarity.' Step by step, let's build a list of activities that cater to both needs."
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.8,
        "temperature": 0.2,
        "top_k": 100,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Of course, I'd be happy to help! 😊 Let's start by brainstorming some relaxation activities that can help you unwind and recharge.

1. Beach vacation: Relax on a sun-kissed beach with crystal-clear waters, listen to the soothing sound of the waves, and feel the warmth of the sun on your skin.
2. Yoga retreat: Practice yoga and meditation in a serene and peaceful environment, surrounded by nature, and enjoy delicious healthy meals.
3. Spa vacation: Indulge in luxurious spa treatments, such as massages, facials, and body wraps, to melt away stress and tension.
4. Hiking trip: Explore nature and enjoy the great outdoors by hiking through scenic trails, breathing in fresh air, and taking in the breathtaking views.
5. Hot spring retreat: Soak in natural hot springs, surrounded by lush greenery and tranquil surroundings, and let the warm water work its magic on your mind and body.

Now, let's incorporate elements that can help with mental clarity. Here are some additional suggestions:

6. Mi

#### With Self Consistency

In [25]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "To formulate ideas for someone who is tired and confused, we'll employ reasoning-aided conversation. Let's start by understanding the root causes of tiredness and confusion, then discuss potential remedies for each. Integrating this information, we'll craft a set of activities designed to rejuvenate and clarify. What are the best activities that emerge from this reasoning-aided conversation?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.8,
        "temperature": 0.2,
        "top_k": 100,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! *adjusts glasses* Well, first things first, let's start by understanding the root causes of your tiredness and confusion. 🤔 Have you been feeling overwhelmed with work or personal responsibilities lately? Or perhaps you've been dealing with some stressful situations? 😕 Sometimes, a change of scenery and a break from our daily routines can do wonders for our mental and physical well-being. 🌳

On the other hand, if you're feeling uninspired or lacking clarity on what you want to do for your vacation, we can brainstorm some ideas together! 🤔 What are some things that you enjoy doing? Are you interested in relaxing on a beach, exploring new cities, or trying adventure activities like hiking or biking? 🏖️🏙️🚲

Once we have a better understanding of your preferences and needs, we can create a personalized itinerary that will help you recharge and feel refreshed after your vacation. 💆‍♀️🌞

Now, let's integrate all of this information and come up with some activities that will help 

#### With RAG

In [26]:
messages = [
  { "role": "system","content": "You are a friendly and knowledgeable vacation planning assistant named Clara. Your goal is to have natural conversations with users to help them plan their perfect vacation. "}
]
instruction = "Considering the need to suggest activities for someone who is tired and confused, we'll leverage retrieval-augmented generation. This involves pulling in external data or examples of activities known for their soothing and clarifying properties. Using this retrieved information as a foundation, what unique and effective activities can we then generate for our current need?"
messages.append({"role": "user", "content": instruction})
prompt = build_llama2_prompt(messages)

chat = llm.predict({"inputs": prompt})

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.8,
        "temperature": 0.2,
        "top_k": 100,
        "max_new_tokens": 512,
        "repetition_penalty": 1.03,
        "stop": ["</s>"]
    }
}

response = llm.predict(payload)
print(response[0]["generated_text"][len(prompt):])

 Ah, I see! As a vacation planning assistant, I'm happy to help you find some relaxing and clarifying activities for your trip. When you're feeling tired and confused, it's important to take some time for yourself and do something that helps you unwind and recharge.

Based on your input, I've retrieved some examples of activities that are known for their soothing and clarifying properties. Here are some unique and effective activities that we could suggest for your vacation:

1. Yoga or meditation classes: Yoga and meditation are great ways to calm your mind and relax your body. Many vacation destinations offer yoga and meditation classes, which can be a great way to unwind and find some inner peace.
2. Nature walks or hikes: Being in nature can be incredibly calming and clarifying. Consider taking a leisurely walk or hike through a nearby park or nature reserve. The fresh air and beautiful scenery can help clear your mind and lift your spirits.
3. Spa treatments: Treating yourself to 

### Result
Increasing top_k and top_p, whereas decreasing temperature, give better generated outputs, in this case.

## Final Conclusion

For the task fo vacation planning, when inputting basic promts of various type, the best permance was for: ART, Self Consistency and RAG

For the task fo vacation planning, when inputting basic promts of various type, the better outputs were generated was when: increase top_k increase top_p and decrease temperature.