### Imports

In [1]:
%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
!pip install datasets scipy ipywidgets matplotlib torch accelerate
!pip install wandb -qU
!pip install --upgrade transformers huggingface_hub
!pip install evaluate rouge_score

### WandB Initialization

In [3]:
import wandb
from google.colab import userdata
wandb.login(key=userdata.get('WANDB_TOKEN'))  # Reads the W&B API key from the Colab secret named WANDB_TOKEN
wandb.init(
    project="PerspectrumInstruct-original-NoFineTuning-Unsloth_Mistral7B",
    name='Text output capture',
    tags=["vanilla", "off the shelf", "No-fine-tuning"],
    notes="Original off-the-shelf model with no fine-tuning, to check whether the model works out of the box"
    )

[34m[1mwandb[0m: W&B API key is configured. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33mkarannair[0m ([33mknclusive[0m). Use [1m`wandb login --relogin`[0m to force relogin


In [4]:
table = wandb.Table(columns=["instruction", "claim", "response"])

### Model and Tokenizer Loading

In [6]:
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Maximum sequence length passed to Unsloth
dtype = None           # None lets Unsloth auto-detect the dtype (float16 on a Tesla T4)
load_in_4bit = True    # Load the pre-quantized 4-bit weights to reduce GPU memory use

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",  # Custom model or a standard one from Unsloth
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
==((====))==  Unsloth 2024.8: Fast Mistral patching. Transformers = 4.44.0.
   \\   /|    GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.3.1+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.26.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!



### Input Prompt

In [7]:
System_prompt = "Below is an instruction that describes an information requirement, paired with a claim that provides context. Write a response that appropriately addresses the instruction based on the given claim."
Input_prompt = "### Instruction:\n{instruction}\n\n### Claim:\n{claim}\n\n### Response:\n{answer}"
# A sample instruction and claim from the Perspectrum dataset
text = Input_prompt.format(instruction='Compare the opinion distributions for the following claim.',
                           claim='Vaccination must be made compulsory.',
                           answer='')
full_prompt = f"{System_prompt}\n\n{text}"
print(full_prompt)

inputs = tokenizer([full_prompt], return_tensors = "pt").to(torch.device("cuda"))

Below is an instruction that describes an information requirement, paired with a claim that provides context. Write a response that appropriately addresses the instruction based on the given claim.

### Instruction:
Compare the opinion distributions for the following claim.

### Claim:
Vaccination must be made compulsory.

### Response:
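
The prompt assembly in the cell above can be wrapped in a small helper so that other instruction/claim pairs reuse the same template. This is a minimal sketch: `build_prompt`, `SYSTEM_PROMPT` and `INPUT_TEMPLATE` are hypothetical names that do not appear in the notebook, and the template strings are copied from the cell above.

```python
# Template strings copied from the notebook cell; build_prompt is a hypothetical helper.
SYSTEM_PROMPT = (
    "Below is an instruction that describes an information requirement, "
    "paired with a claim that provides context. Write a response that "
    "appropriately addresses the instruction based on the given claim."
)
INPUT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Claim:\n{claim}\n\n### Response:\n{answer}"


def build_prompt(instruction: str, claim: str, answer: str = "") -> str:
    """Return the system prompt followed by the filled-in template."""
    body = INPUT_TEMPLATE.format(instruction=instruction, claim=claim, answer=answer)
    return f"{SYSTEM_PROMPT}\n\n{body}"


prompt = build_prompt(
    "Compare the opinion distributions for the following claim.",
    "Vaccination must be made compulsory.",
)
print(prompt)
```

Leaving `answer` empty makes the prompt end right after `### Response:`, which is where the model is expected to continue.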



In [8]:
inputs

{'input_ids': tensor([[    1, 20811,   349,   396, 13126,   369, 13966,   396,  1871, 16169,
         28725,  5881,  1360,   395,   264,  3452,   369,  5312,  2758, 28723,
         12018,   264,  2899,   369,  6582,  1999, 14501,   272, 13126,  2818,
           356,   272,  2078,  3452, 28723,    13,    13, 27332,  3133,  3112,
         28747,    13, 22095,   272,  7382, 20779,   354,   272,  2296,  3452,
         28723,    13,    13, 27332,  1366,  2371, 28747,    13, 28790,  4373,
          2235,  1580,   347,  1269,   623,  7550,   695, 28723,    13,    13,
         27332, 12107, 28747,    13]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1]], device='cuda:0')}

### Initialize Streamer for Inference

In [10]:
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer, skip_prompt = True)

In [11]:
output = model.generate(input_ids = inputs.input_ids, attention_mask = inputs.attention_mask,
                   streamer = text_streamer, max_new_tokens = 128, pad_token_id = tokenizer.eos_token_id)


To compare the opinion distributions for the claim "Vaccination must be made compulsory," we would need to access data from various sources that reflect public opinion on this issue. This could include surveys, polls, or studies that have been conducted on this topic.

One such source is a 2021 survey conducted by the Pew Research Center, which found that 73% of Americans believe that getting vaccinated against COVID-19 should be required for children to attend public schools. This suggests that a significant majority of Americans support the idea of making vaccinations compulsory for certain populations.



In [18]:
gen_text = tokenizer.decode(output[0])
print(gen_text)

<s> Below is an instruction that describes an information requirement, paired with a claim that provides context. Write a response that appropriately addresses the instruction based on the given claim.

### Instruction:
Compare the opinion distributions for the following claim.

### Claim:
Vaccination must be made compulsory.

### Response:

To compare the opinion distributions for the claim "Vaccination must be made compulsory," we would need to access data from various sources that reflect public opinion on this issue. This could include surveys, polls, or studies that have been conducted on this topic.

One such source is a 2021 survey conducted by the Pew Research Center, which found that 73% of Americans believe that getting vaccinated against COVID-19 should be required for children to attend public schools. This suggests that a significant majority of Americans support the idea of making vaccinations compulsory for certain populations.
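
The decoded string contains the full prompt and the `<s>` start token as well as the answer. If only the model's answer should be logged, one option is to cut everything up to the `### Response:` marker. A minimal sketch, assuming the last occurrence of the marker is the one that precedes the answer; `extract_response` is a hypothetical helper, not part of the notebook:

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str) -> str:
    """Return the text after the last response marker, without BOS/EOS tokens."""
    # rpartition keeps everything after the final marker; if the marker is
    # missing, the whole string is kept (minus the special tokens).
    _, _, answer = decoded.rpartition(RESPONSE_MARKER)
    for token in ("<s>", "</s>"):
        answer = answer.replace(token, "")
    return answer.strip()


sample = "<s> Below is an instruction.\n\n### Response:\n\nTo compare the opinion distributions, we need data.</s>"
print(extract_response(sample))
```

An alternative that avoids string matching is to decode only the newly generated tokens, for example `tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)`.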



### Log Data to WandB

In [19]:
table.add_data('Compare the opinion distributions for the following claim.', 'Vaccination must be made compulsory.', gen_text)
wandb.log({"Generations":table})
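
To log more than one generation, the rows can be collected first and passed to the table in one go. The sketch below is hypothetical: `collect_rows` and `fake_generate` are illustrative names, and `fake_generate` stands in for the tokenize, generate and decode steps shown earlier so that the snippet runs without a GPU.

```python
from typing import Callable, Iterable, List, Tuple

COLUMNS = ["instruction", "claim", "response"]


def collect_rows(
    examples: Iterable[Tuple[str, str]],
    generate_fn: Callable[[str, str], str],
) -> List[List[str]]:
    """Build one [instruction, claim, response] row per example."""
    return [[instruction, claim, generate_fn(instruction, claim)]
            for instruction, claim in examples]


def fake_generate(instruction: str, claim: str) -> str:
    # Placeholder for the real model call; returns a dummy string.
    return f"(response for: {claim})"


examples = [
    ("Compare the opinion distributions for the following claim.",
     "Vaccination must be made compulsory."),
]
rows = collect_rows(examples, fake_generate)
print(rows)

# With an active W&B run, the rows could then be logged as:
# table = wandb.Table(columns=COLUMNS, data=rows)
# wandb.log({"Generations": table})
```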

In [20]:
wandb.finish()
