Clone github repo to have access to data files.

In [2]:
!git clone https://github.com/Hananxx/SentimentAnalysisPromptExp.git

Cloning into 'SentimentAnalysisPromptExp'...
remote: Enumerating objects: 20, done.[K
remote: Counting objects: 100% (20/20), done.[K
remote: Compressing objects: 100% (15/15), done.[K
remote: Total 20 (delta 6), reused 9 (delta 2), pack-reused 0 (from 0)[K
Receiving objects: 100% (20/20), 28.30 KiB | 9.43 MiB/s, done.
Resolving deltas: 100% (6/6), done.


### Install needed packages

In [3]:
!pip install transformers torch pandas accelerate



In [4]:
import pandas as pd
import json
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import torch

### Set root path for accessing repo files

In [5]:
root = "SentimentAnalysisPromptExp/"

### Load dataset

In [6]:
df = pd.read_csv(root + 'data/app.csv')
print(df.head())  # Inspect the first few rows

                                            sentence  label
0  package file invalid i had my phone on factory...      2
1  iffy nice clean app but sometimes it works and...      2
2                        cool just freezes everytime      2
3  network error! suddenly after downloading an u...      2
4  annoying it let me choose the pictures i want ...      2


### Load prompts

In [11]:
  with open(root + 'prompts/zero-shot-prompt-template.json', 'r') as f:
      templates = json.load(f)
  print(templates)  # View the loaded templates

{'vicuna-0': "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nUSER: Please perform Sentiment Classification task. Given the sentence from {}, assign a sentiment label from ['negative', 'neutral', 'positive']. Return label only without any other text.\nASSISTANT: Sure!</s>\nUSER: Sentence: {}\nASSISTANT:", 'vicuna-jira-0': "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nUSER: Please perform Sentiment Classification task. Given the sentence from {}, assign a sentiment label from ['negative', 'positive']. Return label only without any other text.\nASSISTANT: Sure!</s>\nUSER: Sentence: {}\nASSISTANT:", 'llama2-0': "<s>[INST] <<SYS>>\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's q

### Use Vicuna

In [8]:
  # For GPU acceleration if available
  device = 0 if torch.cuda.is_available() else -1

  # Load tokenizer and model (this may take time on first run)
  tokenizer = AutoTokenizer.from_pretrained('lmsys/vicuna-13b-v1.5')
  model = AutoModelForCausalLM.from_pretrained('lmsys/vicuna-13b-v1.5', torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32)
  model.to(device)

  # Create a text generation pipeline
  vicuna_pipeline = pipeline('text-generation', model=model, tokenizer=tokenizer, device=device)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json:   0%|          | 0.00/749 [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/438 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/638 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


pytorch_model.bin.index.json: 0.00B [00:00, ?B/s]

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

pytorch_model-00002-of-00003.bin:   0%|          | 0.00/9.90G [00:00<?, ?B/s]

pytorch_model-00001-of-00003.bin:   0%|          | 0.00/9.95G [00:00<?, ?B/s]

pytorch_model-00003-of-00003.bin:   0%|          | 0.00/6.18G [00:00<?, ?B/s]

model.safetensors.index.json: 0.00B [00:00, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/192 [00:00<?, ?B/s]

The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
Device set to use cuda:0


In [15]:
prompt = templates['vicuna-0']
sentences = df['sentence'].tolist()
full_prompt = prompt.format("APP reviews", sentences[0])
output = vicuna_pipeline(
          full_prompt,
          # max_new_tokens=100,  # Limit length for concise responses
          # do_sample=True,
          temperature=0.7,
          top_p=0.9,
          pad_token_id=tokenizer.eos_token_id
      )
print(output)
response = output[0]['generated_text'].split("Answer:")[-1].strip()  # Extract just the answer
# responses.append(response)

# Add responses back to DataFrame
# df['vicuna_response'] = responses
print('#'* 22)
print(response)

[{'generated_text': "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nUSER: Please perform Sentiment Classification task. Given the sentence from APP reviews, assign a sentiment label from ['negative', 'neutral', 'positive']. Return label only without any other text.\nASSISTANT: Sure!</s>\nUSER: Sentence: package file invalid i had my phone on factory reset and wanted to reinstall this app. getting package file invalid error. whats this all about? i always loved im+ but now seems like im beginning to hate it! pls fix this issue or tell me what to do. this had been ongoing for 2 weeks now..\nASSISTANT: negative"}]
######################
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER: Please perform Sentiment Classification task. Given the sentence from APP reviews, assig