# Task 3 Prompt Engineering for Large Language Models (LLMs) [4 marks]
Questions

1. Demonstrate how to use Zero-Shot Learning and Few-Shot Learning to classify human activities based on the featurized accelerometer data. Qualitatively compare the performance of Few-Shot Learning with Zero-Shot Learning. Which method performs better? Why?  **[1 mark]**
2. Quantitatively compare the accuracy of Few-Shot Learning with Decision Trees (You may use a subset of the test set if you encounter rate-limiting issues). Which method performs better? Why? **[1 mark]**
3. What are the limitations of Zero-Shot Learning and Few-Shot Learning in the context of classifying human activities based on featurized accelerometer data? **[1 mark]**
4. What does the model classify when given input from an entirely new activity that it hasn't seen before? **[0.5 mark]**
5. Test the model with random data (ensuring the data has the same dimensions and range as the previous input) and report the results. **[0.5 mark]**

In [1]:
import pandas as pd
from langchain_groq import ChatGroq
import os


In [4]:
from dotenv import load_dotenv

load_dotenv()

Groq_Token = os.getenv('api_key')
groq_models = {"llama3-70b": "llama3-70b-8192", "mixtral": "mixtral-8x7b-32768", "gemma-7b": "gemma-7b-it","llama3.1-70b":"llama-3.1-70b-versatile","llama3-8b":"llama3-8b-8192","llama3.1-8b":"llama-3.1-8b-instant","gemma-9b":"gemma2-9b-it"}
model_name = "llama3-70b"
llm = ChatGroq(model=groq_models[model_name], api_key=Groq_Token, temperature=0)

In [42]:
# Testing data: first 100 rows per activity for Subject_2

file1_laying = pd.read_csv("Combined/Test/LAYING/Subject_2.csv")
file2_walking = pd.read_csv("Combined/Test/WALKING/Subject_2.csv")
file3_sitting = pd.read_csv("Combined/Test/SITTING/Subject_2.csv")
file4_standing = pd.read_csv("Combined/Test/STANDING/Subject_2.csv")
file5_upstairs = pd.read_csv("Combined/Test/WALKING_UPSTAIRS/Subject_2.csv")
file6_downstairs = pd.read_csv("Combined/Test/WALKING_DOWNSTAIRS/Subject_2.csv")

# read_csv already returns a DataFrame, so it can be sliced directly
df1 = file1_laying.head(100)
df2 = file2_walking.head(100)
df3 = file3_sitting.head(100)
df4 = file4_standing.head(100)
df5 = file5_upstairs.head(100)
df6 = file6_downstairs.head(100)


In [41]:
# Training Data for few shot prompt examples

laying_train = pd.read_csv("Combined/Train/LAYING/Subject_1.csv")
sitting_train = pd.read_csv("Combined/Train/SITTING/Subject_1.csv")
standing_train = pd.read_csv("Combined/Train/STANDING/Subject_1.csv")
walking_train = pd.read_csv("Combined/Train/WALKING/Subject_1.csv")
downstairs_train = pd.read_csv("Combined/Train/WALKING_DOWNSTAIRS/Subject_1.csv")
upstairs_train = pd.read_csv("Combined/Train/WALKING_UPSTAIRS/Subject_1.csv")

# read_csv already returns a DataFrame, so it can be sliced directly
laying_df = laying_train.head(100)
sitting_df = sitting_train.head(100)
standing_df = standing_train.head(100)
walking_df = walking_train.head(100)
downstairs_df = downstairs_train.head(100)
upstairs_df = upstairs_train.head(100)




In [43]:
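# Zero shot demonstration
# Note: interpolating a DataFrame into an f-string inserts its printed repr.
# With pandas' default display settings, a 100-row frame is shortened to its
# first and last 5 rows, so the model sees only part of each dataset.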

zero_shot_prompt = f"""
* You are a human activity recognition model.
* Your task is to classify the following accelerometer data into one of the six activities: Walking, Standing, Sitting, Laying, Walking Upstairs, Walking Downstairs.
* Provide the activity label and, if necessary, a brief explanation of your reasoning.
Here is the accelerometer data:
{df1}, {df2}, {df3}

Please classify the activity for these three accelerometer data.
"""

zero_shot_answer = llm.invoke(zero_shot_prompt)
print(zero_shot_answer.content)

Based on the provided accelerometer data, I will classify the activities as follows:

**Data 1:**
The accelerometer data shows a relatively stable pattern with small variations in the x, y, and z axes. The values are mostly within a small range, indicating a low level of movement. This pattern is consistent with the activity of **Standing**.

**Data 2:**
The accelerometer data shows a significant variation in the x-axis, with values ranging from approximately 0.7 to 1.15. This suggests a high level of movement in the x-axis, which is consistent with the activity of **Walking**. The y and z axes show relatively smaller variations, which further supports this classification.

**Data 3:**
The accelerometer data shows a relatively stable pattern with small variations in the x, y, and z axes. The values are mostly within a small range, indicating a low level of movement. However, the x-axis values are slightly higher than those in Data 1, which suggests a slightly more upright posture. This

In [50]:
# Few Shot learning

few_shot_prompt = f""" 
* You are a human activity recognition model.
* Your task is to classify the following accelerometer data into one of the six activities: Walking, Standing, Sitting, Laying, Walking Upstairs, Walking Downstairs.
* Provide the sentiment label and a brief explanation of your reasoning. 

Here are some examples:
1.Dataset of laying: {laying_df}
2.Dataset of sitting: {sitting_df}
3.Dataset of standing: {standing_df}
4.Dataset of walking: {walking_df}
5.Dataset of walking downstairs: {downstairs_df}
6.Dataset of walking upstairs: {upstairs_df}

Here is the accelerometer data:
{df1}, 
{df2},
{df3},
{df4},
{df5},
{df6}

Please classify the activity for these six accelerometer data using the dataset of sample activities.
"""


few_shot_answer = llm.invoke(few_shot_prompt)
print(few_shot_answer.content)

Based on the provided accelerometer data, I will classify each dataset into one of the six activities: Walking, Standing, Sitting, Laying, Walking Upstairs, or Walking Downstairs.

**Dataset 1:**
        accx      accy      accz
0  -0.185673  0.737995  0.663749
...
 Sentiment Label: Laying
Reasoning: The accy values are consistently high (> 0.7), indicating a relatively stable vertical acceleration, which is characteristic of laying down.

**Dataset 2:**
        accx      accy      accz
0   1.153546 -0.249716 -0.084946
...
 Sentiment Label: Walking
Reasoning: The accx values are varying and relatively high (> 1.0), indicating a dynamic movement, which is characteristic of walking. The accy values are also varying, suggesting a change in direction.

**Dataset 3:**
        accx      accy      accz
0   1.011120 -0.105541  0.162614
...
 Sentiment Label: Standing
Reasoning: The accx values are relatively stable and high (> 1.0), indicating a vertical acceleration, which is characteristic of

Q2. Quantitatively compare the accuracy of Few-Shot Learning with Decision Trees (You may use a subset of the test set if you encounter rate-limiting issues). Which method performs better? Why?
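
A minimal sketch of how the few-shot predictions could be scored for this comparison, assuming the model marks each predicted activity in bold (`**Label**`), as it does in the responses printed in this notebook. The helper names `extract_labels` and `accuracy` are hypothetical, and the example strings are illustrative placeholders, not results from this notebook. The same `accuracy` function could be applied to Decision Tree predictions on the same test subset so that both numbers are computed the same way.

```python
import re

# The six activity names used in the prompts of this notebook.
ACTIVITIES = [
    "Walking",
    "Standing",
    "Sitting",
    "Laying",
    "Walking Upstairs",
    "Walking Downstairs",
]


def extract_labels(response_text):
    """Return the bolded activity names in the order they appear.

    Bold text that is not an activity name (for example "**Data 1:**")
    is ignored.
    """
    lookup = {name.lower(): name for name in ACTIVITIES}
    bolded = re.findall(r"\*\*([^*]+)\*\*", response_text)
    labels = []
    for chunk in bolded:
        key = chunk.strip().lower()
        if key in lookup:
            labels.append(lookup[key])
    return labels


def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the prediction matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("no labels to score")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Illustrative placeholder text, NOT output from this notebook.
example_response = (
    "**Data 1:** low movement, classified as **Standing**.\n"
    "**Data 2:** strong variation, classified as **Walking**.\n"
)
example_truth = ["Laying", "Walking"]

predicted = extract_labels(example_response)
print(predicted, accuracy(example_truth, predicted))
```

If a response is cut off before all labels are printed, `extract_labels` returns fewer labels than expected and `accuracy` raises an error, which makes truncated responses visible instead of silently scoring them.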

Q3: What are the limitations of Zero-Shot Learning and Few-Shot Learning in the context of classifying human activities based on featurized accelerometer data? 

Limitations of Zero-Shot Learning for this task:

    1. Its accuracy depends entirely on the prompt and how well it is designed. A poorly designed prompt lowers the accuracy of the output.

    2. It cannot reliably separate activities whose data differ only slightly or look similar to one another.

    3. It has difficulty distinguishing closely related activities, for example "Running" from "Walking" or "Sitting" from "Laying". When neither dataset shows significant movement, the model returns the wrong activity.



Limitations of Few-Shot Learning for this task:

    1. It depends on the quality of the samples given in the prompt. If the examples are poorly chosen or do not carry enough information, the output can be wrong.

    2. It relies on a small number of samples, which can bias the output and may fail to capture the full diversity of the data, so it may not generalize well.

    3. With so few samples, the model can overfit to the provided examples: it might perform well on inputs that resemble them but poorly on test or unseen data.
    


Q4. What does the model classify when given input from an entirely new activity that it hasn't seen before?



1. In Zero-Shot Learning, when the input comes from an entirely new activity, the model still tries to classify it using its existing knowledge and whatever information the prompt provides. The result may be inaccurate, and if the new activity has no relation to the known activities, the model may not recognize or classify it properly.

2. In Few-Shot Learning, we can provide some examples of the new activity, which help the model learn, recognize, and classify it. The model uses the examples to understand the pattern of the activity's features, and with their help it can give more accurate answers. The accuracy still depends on how well the examples are presented.
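
One way to probe the behaviour described above is to give the model input that belongs to none of the six activities, such as random data with the same dimensions and range as a real sample (question 5). The following is a minimal sketch under stated assumptions: the helper `random_like` is hypothetical, it works on CSV text with a header row and numeric columns (the same shape as the `data_strings` values built in the next cell), and it only builds the input. Sending the result to the model is left out, so no classification result is claimed here.

```python
import csv
import io
import random


def random_like(csv_text, seed=None):
    """Return CSV text with the same header and row count as csv_text.

    Each value is drawn uniformly from the observed [min, max] of its
    column, so the random data keeps the dimensions and range of the input.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if len(rows) < 2:
        raise ValueError("need a header row and at least one data row")
    header = rows[0]
    data = [[float(value) for value in row] for row in rows[1:]]

    # Per-column (min, max) bounds taken from the real sample.
    bounds = [(min(col), max(col)) for col in zip(*data)]

    rng = random.Random(seed)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for _ in data:
        writer.writerow([rng.uniform(low, high) for low, high in bounds])
    return out.getvalue()


# Tiny stand-in for one of the CSV strings passed to the prompt.
sample = (
    "accx,accy,accz\n"
    "0.05378475,0.8970043,0.4434115\n"
    "0.02150611,0.8828317,0.4510561\n"
)
print(random_like(sample, seed=0))
```

The returned string can be placed in the prompt where a real test sample would go. Because every column is sampled independently, the random rows lose any relationship between axes and between consecutive readings, which is what makes them a useful stress test.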

In [34]:
import pandas as pd
from langchain_groq import ChatGroq
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
Groq_Token = os.getenv('api_key')

# Define model mapping and select the model
groq_models = {
    "llama3-70b": "llama3-70b-8192",
    "mixtral": "mixtral-8x7b-32768",
    "gemma-7b": "gemma-7b-it",
    "llama3.1-70b": "llama-3.1-70b-versatile",
    "llama3-8b": "llama3-8b-8192",
    "llama3.1-8b": "llama-3.1-8b-instant",
    "gemma-9b": "gemma2-9b-it"
}

model_name = "llama3.1-70b"
llm = ChatGroq(model=groq_models[model_name], api_key=Groq_Token, temperature=0)

# Load test data
file_paths = {
    "LAYING": "Combined/Test/LAYING/Subject_4.csv",
    "WALKING": "Combined/Test/WALKING/Subject_4.csv",
    "SITTING": "Combined/Test/SITTING/Subject_4.csv",
    "STANDING": "Combined/Test/STANDING/Subject_4.csv",
    "WALKING_UPSTAIRS": "Combined/Test/WALKING_UPSTAIRS/Subject_4.csv",
    "WALKING_DOWNSTAIRS": "Combined/Test/WALKING_DOWNSTAIRS/Subject_4.csv"
}

dataframes = {key: pd.read_csv(path) for key, path in file_paths.items()}

# Convert DataFrames to CSV strings
data_strings = {key: df.head(70).to_csv(index=False) for key, df in dataframes.items()}

# Load training data
train_file_paths = {
    "LAYING": "Combined/Train/LAYING/Subject_1.csv",
    "SITTING": "Combined/Train/SITTING/Subject_1.csv",
    "STANDING": "Combined/Train/STANDING/Subject_1.csv",
    "WALKING": "Combined/Train/WALKING/Subject_1.csv",
    "WALKING_DOWNSTAIRS": "Combined/Train/WALKING_DOWNSTAIRS/Subject_1.csv",
    "WALKING_UPSTAIRS": "Combined/Train/WALKING_UPSTAIRS/Subject_1.csv"
}

train_dataframes = {key: pd.read_csv(path) for key, path in train_file_paths.items()}

# Convert training DataFrames to CSV strings
train_data_strings = {key: df.head(50).to_csv(index=False) for key, df in train_dataframes.items()}

# Create the few-shot learning prompt
few_shot_prompt = f"""
You are a human activity recognition model.
Your task is to classify the following accelerometer data into one of the six activities: Walking, Standing, Sitting, Laying, Walking Upstairs, Walking Downstairs.
Provide the activity label and, if necessary, a brief explanation of your reasoning.

Here are some examples:
1. Dataset of laying: {train_data_strings['LAYING']}
2. Dataset of sitting: {train_data_strings['SITTING']}
3. Dataset of standing: {train_data_strings['STANDING']}
4. Dataset of walking: {train_data_strings['WALKING']}
5. Dataset of walking downstairs: {train_data_strings['WALKING_DOWNSTAIRS']}
6. Dataset of walking upstairs: {train_data_strings['WALKING_UPSTAIRS']}

Using these examples, try to classify the following accelerometer data:
1. {data_strings['LAYING']}
2. {data_strings['WALKING']}
3. {data_strings['SITTING']}
4. {data_strings['STANDING']}
5. {data_strings['WALKING_UPSTAIRS']}
6. {data_strings['WALKING_DOWNSTAIRS']}

Please classify the activity for these six accelerometer data using the dataset of sample activities.
"""

# Invoke the model
try:
    few_shot_answer = llm.invoke(few_shot_prompt)
    print(few_shot_answer.content)
except Exception as e:
    print(f"An error occurred: {e}")


Based on the provided accelerometer data, I will classify each activity as follows:

1. accx,accy,accz
0.05378475,0.8970043,0.4434115
...
0.02150611,0.8828317,0.4510561

This activity is classified as **Laying**. The reason for this classification is that the accelerometer data shows a relatively stable and consistent pattern, with minimal changes in acceleration. This is consistent with the laying activity, where the person is stationary and not moving.

2. accx,accy,accz
1.271286,0.198344,0.2454702
...
1.388307,-0.141976,0.0764643

This activity is classified as **Sitting**. The reason for this classification is that the accelerometer data shows a moderate level of acceleration, with some variation in the x, y, and z axes. This is consistent with the sitting activity, where the person is stationary but may be moving slightly.

3. accx,accy,accz
0.9362124,0.2966921,0.2259779
...
0.9445192,0.2983532,0.2130758

This activity is classified as **Standing**. The reason for this classificat