<a href="https://colab.research.google.com/github/claudio1975/Medium-blog/blob/master/Prompt_Engineering/Prompt_Engineering_v1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install transformers &> /dev/null

In [2]:
!pip install langchain &>/dev/null

In [3]:
from langchain.prompts import ChatPromptTemplate
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
import torch
import collections

In [4]:
# Load tokenizer and model
checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

In [9]:
# Function to process the chat with mode selection for reasoning
def generate_response(system_message, user_message, mode='shot'):
    # Construct conversation text based on the mode
    if mode == 'chain-of-thought':
        conversation = (f"{system_message.content}\n"
                        f"User: {user_message.content}\n"
                        f"AI: Let's break it down step-by-step:\n")
    else:  # Default: plain prompt, used for zero-, one-, and few-shot
        conversation = f"{system_message.content}\nUser: {user_message.content}\nAI:"

    # Tokenize input and create attention mask
    tokens = tokenizer(conversation, return_tensors="pt")
    inputs = tokens['input_ids'].to(device)
    attention_mask = tokens['attention_mask'].to(device)

    # Generate AI's response
    # Chain-of-thought samples at a slightly higher temperature;
    # both modes share the same token budget
    max_new_tokens = 350
    temperature = 0.3 if mode == 'chain-of-thought' else 0.1

    outputs = model.generate(inputs,
                             attention_mask=attention_mask,
                             max_new_tokens=max_new_tokens,
                             temperature=temperature,
                             top_p=0.9, do_sample=True)

    # Decode and return the AIMessage
    output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    response_text = output_text.split('AI:')[-1].strip()  # Capture what's generated after 'AI:'
    return AIMessage(content=response_text)
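
Splitting on the last `AI:` works for the examples here, but it would drop part of a reply whenever the model itself writes `AI:` in its answer. A minimal sketch of an alternative, assuming the decoded output begins with the exact prompt string (the helper name `extract_reply` and the sample strings are hypothetical, not part of the notebook):

```python
def extract_reply(output_text: str, prompt: str) -> str:
    """Return the text generated after the prompt (hypothetical helper)."""
    # The decoded output normally starts with the prompt; drop it when it does
    if output_text.startswith(prompt):
        return output_text[len(prompt):].strip()
    # Otherwise fall back to everything after the first 'AI:' marker
    return output_text.split("AI:", 1)[-1].strip()


# Illustrative strings only; no model is called here
prompt = "You are a helpful assistant.\nUser: Name two roles.\nAI:"
decoded = prompt + " Two roles are User: and AI: in this chat format."

print(extract_reply(decoded, prompt))       # keeps the whole reply
print(decoded.split("AI:")[-1].strip())     # last-marker split loses the start
```

In chain-of-thought mode the prompt already ends with the step-by-step cue, so stripping the prompt would also remove that cue from the returned text. The fallback branch is there because a tokenizer round trip may not reproduce the prompt character for character.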



### Zero-Shot Prompting

In [10]:
# Define initial messages
system_message = SystemMessage(content="You are a machine learning engineer. Give help with coding")
user_message = HumanMessage(content="Write Python code for a synthetic daily time series of stock prices from 01/2024 to 06/2024")

# Generate response
ai_response = generate_response(system_message, user_message, mode='shot')
print(ai_response.content)

Here's a simple Python code snippet using the pandas library to generate a synthetic daily time series of stock prices from January 2024 to June 2024:

```python
import pandas as pd
import numpy as np

# Set the start and end dates
start_date = '2024-01-01'
end_date = '2024-06-30'

# Create a pandas DataFrame with random stock prices
df = pd.DataFrame({
    'Date': pd.date_range(start=start_date, end=end_date, freq='D'),
    'Price': np.random.uniform(90, 110, len(pd.date_range(start=start_date, end=end_date, freq='D')))
})

# Set the index to the 'Date' column
df.set_index('Date', inplace=True)

# Print the DataFrame
print(df)
```

This code generates a DataFrame with random stock prices for each day from January 2024 to June 2024. The 'Price' column is a random float between 90 and 110. The 'Date' column is a pandas DatetimeIndex with daily frequency.

You can adjust the range of dates and the range of stock prices by modifying the `start_date`, `end_date`, and `start_date` and `end_

### Few-Shot Prompting

In [11]:
# Define initial messages
system_message = SystemMessage(content="You are a helpful assistant. Give help with analysis")
user_message = HumanMessage(content=(
    "Classify the sentiment of a given text as Positive, Negative, or Neutral:\n"
    "Text: I love this product! It is amazing.\n"
    "Sentiment: Positive\n\n"
    "Text: This is the worst experience I have had.\n"
    "Sentiment: Negative\n\n"
    "Text: The movie was okay, not great but not bad either.\n"
    "Sentiment: Neutral\n\n"
    "Text: The service was fantastic and very fast.\n"
    "Sentiment: "
))

# Generate response
ai_response = generate_response(system_message, user_message, mode='shot')
print(ai_response.content)

The sentiment of the given text is Positive.
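
The few-shot prompt above is written out by hand. As a rough sketch, the same string can be assembled from a list of labelled examples, which makes it easier to add or swap demonstrations (the helper name `build_few_shot_prompt` is hypothetical):

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt from (text, label) pairs (hypothetical helper)."""
    parts = [instruction]
    for text, label in examples:
        # Each demonstration shows the input followed by its expected label
        parts.append(f"Text: {text}\nSentiment: {label}\n")
    # Leave the final label empty so the model completes it
    parts.append(f"Text: {query}\nSentiment: ")
    return "\n".join(parts)


examples = [
    ("I love this product! It is amazing.", "Positive"),
    ("This is the worst experience I have had.", "Negative"),
    ("The movie was okay, not great but not bad either.", "Neutral"),
]
few_shot_prompt = build_few_shot_prompt(
    "Classify the sentiment of a given text as Positive, Negative, or Neutral:",
    examples,
    "The service was fantastic and very fast.",
)
print(few_shot_prompt)
```

The resulting string can be passed as the `content` of the `HumanMessage` in the cell above.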


### Chain-of-Thought Prompting

In [12]:
# Define initial messages
system_message = SystemMessage(content="You are a helpful assistant. Give help with analysis")
user_message = HumanMessage(content="Why does ice float on water? Think step-by-step, and explain why")


# Generate response
ai_response = generate_response(system_message, user_message, mode='chain-of-thought')
print(ai_response.content)

Let's break it down step-by-step:

Step 1: Understand the properties of ice and water
Ice is a solid state of water. Both ice and water are made up of water molecules. However, the arrangement of the molecules differs between the two states.

Step 2: Compare the density of ice and water
Density is the measure of how much mass is packed into a given volume. Density is measured in units of mass per unit volume, such as grams per cubic centimeter (g/cm³).

Step 3: Determine the density of ice and water
Ice has a density of approximately 0.92 g/cm³, while water has a density of approximately 1.00 g/cm³.

Step 4: Explain why ice floats on water
Since ice is less dense than water, it floats on top of the water. This is because the molecules in ice are arranged in a way that creates more space between them than in water. As a result, ice has a lower mass per unit volume than water.

Step 5: Consider the role of temperature in the density of water
As water freezes, the molecules arrange themse

### Self-Consistency Prompting

In [14]:
# Function to generate multiple responses for self-consistency
def generate_responses(system_message, user_message, n=5):
    responses = []

    for _ in range(n):
        # Construct the conversation text for chain-of-thought
        conversation = (f"{system_message}\n"
                        f"User: {user_message}\n"
                        f"AI: Let's break it down step-by-step:\n")

        # Tokenize input and create attention mask
        tokens = tokenizer(conversation, return_tensors="pt")
        inputs = tokens['input_ids'].to(device)
        attention_mask = tokens['attention_mask'].to(device)

        # Generate AI's response using chain-of-thought settings
        outputs = model.generate(inputs,
                                 attention_mask=attention_mask,
                                 max_new_tokens=350,
                                 temperature=0.3,
                                 top_p=0.9,
                                 do_sample=True)

        # Decode and collect the response
        output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
        response_text = output_text.split('AI:')[-1].strip()
        responses.append(response_text)

    return responses



In [15]:
# Define initial messages (adapted to strings since schema is not used here)
system_message_content = "You are a helpful assistant. Give help with analysis"
user_message_content = "Why does ice float on water? Think step-by-step, and explain why."

# Generate multiple responses for self-consistency
results = generate_responses(system_message_content, user_message_content, n=5)
for i, result in enumerate(results):
    print(f"Response {i+1}:\n{result}\n")

Response 1:
Let's break it down step-by-step:
1. Ice is formed when water freezes.
2. When water freezes, it expands.
3. The expansion of water when it freezes causes it to be less dense than the water it was before freezing.
4. As a result, ice is less dense than water, which is why it floats on water.
5. This unique property of ice is crucial for life on Earth, as it allows ice to float on the surface of lakes and oceans during the winter, providing a habitat for aquatic life to survive.

Response 2:
Let's break it down step-by-step:
1. Ice is made up of water molecules that are arranged in a crystalline structure.
2. In this structure, the water molecules are arranged in a way that they are less dense than the surrounding water.
3. When ice forms, it expands and takes up more space than the same amount of water.
4. This expansion causes the ice to float on the surface of the water.
5. The reason for this is that the ice is less dense than the water, so it floats on top of the denser

In [16]:
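def get_most_common_response(responses):
    # Count exact-match duplicates; sampled long-form answers rarely repeat verbatim,
    # so ties are likely and the first response encountered is then returned
    response_counts = collections.Counter(responses)
    # Take the top entry (the count itself is not needed)
    most_common_response, _ = response_counts.most_common(1)[0]
    return most_common_response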

In [17]:
# Find and print the most common response
most_common_response = get_most_common_response(results)
print(f"The most common response is:\n{most_common_response}")

The most common response is:
Let's break it down step-by-step:
1. Ice is formed when water freezes.
2. When water freezes, it expands.
3. The expansion of water when it freezes causes it to be less dense than the water it was before freezing.
4. As a result, ice is less dense than water, which is why it floats on water.
5. This unique property of ice is crucial for life on Earth, as it allows ice to float on the surface of lakes and oceans during the winter, providing a habitat for aquatic life to survive.
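
Counting whole responses only finds a majority when two samples match word for word, which is unlikely for long sampled text. A common variant is to vote on a short final answer instead of the full reasoning. The sketch below assumes each response ends with a line such as `Answer: ...`, which is a convention the prompt would have to request; the helper names and sample responses are hypothetical:

```python
import collections
import re


def extract_final_answer(response):
    """Return the normalized text after the last 'Answer:' marker, or None."""
    matches = re.findall(r"Answer:\s*(.+)", response)
    if not matches:
        return None
    # Normalize case and trailing punctuation so equivalent answers match
    return matches[-1].strip().lower().rstrip(".")


def majority_vote(responses):
    """Return (answer, votes) for the most frequent final answer."""
    answers = [a for a in map(extract_final_answer, responses) if a is not None]
    if not answers:
        return None, 0
    return collections.Counter(answers).most_common(1)[0]


# Illustrative responses only; these are not model outputs
samples = [
    "Water expands when it freezes.\nAnswer: Ice is less dense than water.",
    "Hydrogen bonds form an open lattice.\nAnswer: ice is less dense than water",
    "Cold things rise.\nAnswer: Ice is colder than water.",
]
answer, votes = majority_vote(samples)
print(answer, votes)
```

For open-ended explanations like the ice question, a short answer line may still vary in wording, so this kind of voting tends to fit best on tasks with a compact answer such as a number or a label.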


### Tree-of-Thought Prompting

In [18]:
# Function to generate tree-of-thought responses
def generate_tree_of_thought_responses(system_message, user_message, depth=2, breadth=3):
    initial_conversation = (f"{system_message}\n"
                            f"User: {user_message}\n"
                            f"AI: Let's explore different possibilities:\n")
    tree = [{"text": initial_conversation, "depth": 0}]
    all_responses = []

    while tree:
        node = tree.pop(0)
        if node["depth"] < depth:
            tokens = tokenizer(node["text"], return_tensors="pt")
            inputs = tokens['input_ids'].to(device)
            attention_mask = tokens['attention_mask'].to(device)

            outputs = model.generate(inputs,
                                     attention_mask=attention_mask,
                                     max_new_tokens=350,
                                     temperature=0.3,
                                     top_p=0.9,
                                     do_sample=True,
                                     num_return_sequences=breadth)

            for output in outputs:
                output_text = tokenizer.decode(output, skip_special_tokens=True)
                response_text = output_text.split('AI:')[-1].strip()
                all_responses.append(response_text)

                # Create new branches
                tree.append({"text": node['text'] + response_text + "\nAI: Let's consider another possibility:", "depth": node['depth'] + 1})

    return all_responses
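
To see how many branches this breadth-first loop produces without loading the model, the expansion can be sketched with a stand-in generator. Here `fake_expand` is a hypothetical stub that takes the place of `model.generate`, and `expand_tree` mirrors the queue logic above:

```python
from collections import deque


def expand_tree(root, expand, depth=2, breadth=3):
    """Breadth-first expansion; `expand` stands in for the model call."""
    queue = deque([(root, 0)])
    collected = []
    calls = 0
    while queue:
        text, level = queue.popleft()
        if level >= depth:
            continue  # nodes at the depth limit are not expanded further
        calls += 1
        for child in expand(text, breadth):
            collected.append(child)
            # Each child extends its parent's text and sits one level deeper
            queue.append((text + child, level + 1))
    return collected, calls


def fake_expand(text, breadth):
    # Hypothetical stub: returns `breadth` labelled continuations
    return [f"[{len(text)}:{i}]" for i in range(breadth)]


branches, calls = expand_tree("root", fake_expand)
print(len(branches), calls)  # 12 branches from 4 expansion calls
```

With `depth=2` and `breadth=3` the loop makes 4 expansion calls and collects 3 + 9 = 12 responses. The count grows as breadth + breadth² + ... + breadth^depth, and each deeper prompt also carries its parent's text, so generation cost rises quickly with either parameter.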



In [19]:
# Define initial messages
system_message_content = "You are a helpful assistant. Give help with analysis"
user_message_content = "Why does ice float on water? Think of various possibilities and concepts."

# Generate tree-of-thought responses
results = generate_tree_of_thought_responses(system_message_content, user_message_content)
for i, result in enumerate(results):
    print(f"Branch {i+1}:\n{result}\n")

Branch 1:
Let's explore different possibilities:
1. Density: Ice is less dense than water. This is because ice is made up of 92% water molecules arranged in a crystalline structure with a lower density than the liquid water.
2. Temperature: Ice is formed at a lower temperature than water. When ice forms, it expands and takes up more space than the liquid water. This expansion causes it to float on the surface of the water.
3. Pressure: The pressure of the water above the ice can cause the ice to melt. This is because the pressure increases with depth, and ice at the bottom of a body of water is subjected to higher pressure than ice at the surface.
4. Chemical composition: Ice is composed of hydrogen bonds between water molecules, which are stronger than the bonds between water molecules in liquid water. This results in ice being less dense than liquid water.
5. Boiling point: Water has a higher boiling point than ice. When ice melts, it turns into liquid water, which has a lower boilin

In [20]:
def score_branches(branches, user_query, model_embeddings):
    # Collect (branch, score) pairs
    scored_branches = []

    # Embed the user query
    query_embedding = model_embeddings(user_query)

    for branch in branches:
        # Embed the branch
        branch_embedding = model_embeddings(branch)

        # Cosine similarity between the query and the branch
        similarity_score = cosine_similarity([query_embedding], [branch_embedding])[0][0]

        # The similarity is the only scoring signal used here
        scored_branches.append((branch, similarity_score))

    # Sort branches by score, highest first
    scored_branches.sort(key=lambda x: x[1], reverse=True)

    return scored_branches

# Placeholder embedding for the exercise: it returns a random vector and ignores the text,
# so the resulting ranking is arbitrary until a real embedding model is plugged in
def model_embeddings(text):
    return np.random.rand(1000)

# Score the branches
scored_results = score_branches(results, user_message_content, model_embeddings)
best_branch = scored_results[0][0]  # Select the branch with the highest score
print(f"The best branch is:\n{best_branch}")

The best branch is:
Let's consider another possibility:
Another possibility is that ice floats on water due to the unique properties of water itself. Water has a high specific heat capacity, which means it can absorb and release a large amount of heat energy without a significant change in temperature. This property allows water to remain liquid at temperatures below 0°C (32°F), even in the presence of ice.

Additionally, water has a high latent heat of vaporization, which means it requires a lot of energy to change from a liquid to a gas. This property allows water to remain in a liquid state even when it is cooled to its freezing point.

The combination of these properties allows water to remain liquid at temperatures below 0°C (32°F), which is why ice can float on its surface.
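
Because the embedding function above returns random vectors, the "best" branch is effectively picked at random. As a rough illustration of what a text-dependent score looks like, the sketch below swaps in a bag-of-words vector and a hand-written cosine similarity. This is a simple lexical stand-in, not a semantic embedding model, and the names `bow_embedding`, `cosine`, and `rank_branches` are hypothetical:

```python
import collections
import math
import re


def bow_embedding(text):
    """Word-count vector as a Counter (lexical stand-in for a real embedding)."""
    return collections.Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_branches(branches, query):
    """Sort branches by similarity to the query, highest first."""
    q = bow_embedding(query)
    scored = [(b, cosine(q, bow_embedding(b))) for b in branches]
    return sorted(scored, key=lambda x: x[1], reverse=True)


# Illustrative branches only; these are not model outputs
query = "Why does ice float on water?"
branches = [
    "Bananas are rich in potassium.",
    "Ice is less dense than liquid water, so ice can float on water.",
]
ranked = rank_branches(branches, query)
best, score = ranked[0]
print(best)
```

Word overlap rewards branches that repeat the question's vocabulary, which is not the same as being correct, so a score like this is better read as a relevance filter than as a measure of answer quality.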
