In [1]:
%%capture
!pip install --upgrade accelerate scipy tensorboard openpyxl peft bitsandbytes transformers trl

In [2]:
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
)
from peft import LoraConfig
from trl import SFTTrainer

In [3]:
import pandas as pd

# Labelled headlines; the 'Titles' and 'Output' columns are used below.
newsdata = pd.read_excel('0613-a.xlsx')

def create_input(row):
    # Llama 2 chat format: <s>[INST] instruction + title [/INST] target answer </s>
    return f"<s>[INST] Identify if this news pertains to Apple, Nvidia, or Microsoft. If it does, provide a brief analysis stating whether the sentiment is positive, neutral, or negative, focusing on the cause of the sentiment. The news title is: {row['Titles']} [/INST] {row['Output']} </s>"

# One fully formatted training string per row.
newsdata['llama2-input'] = newsdata.apply(create_input, axis=1)
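
The training template above and the inference prompt built later in the notebook need to match character for character, otherwise the fine-tuned model is queried in a format it was not trained on. A minimal sketch of one way to keep the two in sync; `INSTRUCTION`, `build_prompt`, and `build_training_text` are hypothetical helper names that do not appear in the original notebook.

```python
# Hypothetical helpers (not in the original notebook) that derive the
# training text and the inference prompt from one shared template.
INSTRUCTION = (
    "Identify if this news pertains to Apple, Nvidia, or Microsoft. "
    "If it does, provide a brief analysis stating whether the sentiment is "
    "positive, neutral, or negative, focusing on the cause of the sentiment."
)


def build_prompt(title: str) -> str:
    """Return the prompt part, ending right after [/INST]."""
    return f"<s>[INST] {INSTRUCTION} The news title is: {title} [/INST]"


def build_training_text(title: str, answer: str) -> str:
    """Return the prompt plus the target answer, closed by the </s> tag."""
    return f"{build_prompt(title)} {answer} </s>"


# Placeholder strings, only to show the resulting layout.
print(build_training_text("Example headline", "Example analysis"))
```

With helpers like these, `create_input` could call `build_training_text(row['Titles'], row['Output'])` and the inference cells could call `build_prompt(title)`.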

In [4]:
# Model from Hugging Face hub
base_model = "NousResearch/Llama-2-7b-chat-hf"

# Fine-tuned model
new_model = "./llama-2-7b-chat-mega3-0613-1"

In [5]:
dataset = Dataset.from_pandas(newsdata)

In [6]:
compute_dtype = torch.float16

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)
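
To see why 4-bit loading matters here, the weight memory can be estimated with simple arithmetic. The sketch below assumes a round figure of 7e9 parameters for a "7B" model and counts the weights only; activations, LoRA weights, optimizer state, and the quantization constants are ignored, so real usage will be higher.

```python
# Back-of-the-envelope weight memory for a model of roughly 7e9 parameters.
N_PARAMS = 7e9  # assumed round figure, not the exact parameter count

BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "nf4 (4-bit)": 0.5,
}


def weight_gib(n_params: float, bytes_per_param: float) -> float:
    """Size of the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>12}: {weight_gib(N_PARAMS, size):5.1f} GiB")
```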

In [7]:
# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=quant_config,
    device_map={"": 0}
)
model.config.use_cache = False
model.config.pretraining_tp = 1

In [8]:
# Load LLaMA tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
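
The effect of `padding_side = "right"` with the EOS token reused as padding can be illustrated without a tokenizer. This is a toy sketch: `pad_right` is a hypothetical helper and the token ids are made up, apart from `pad_id`, which stands in for whatever id the EOS token has.

```python
from typing import List, Tuple


def pad_right(batch: List[List[int]], pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence on the right and build the matching attention mask."""
    longest = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        n_pad = longest - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks real tokens, 0 marks padding that attention should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


ids, mask = pad_right([[1, 10, 11, 12], [1, 20]], pad_id=2)
print(ids)   # [[1, 10, 11, 12], [1, 20, 2, 2]]
print(mask)  # [[1, 1, 1, 1], [1, 1, 0, 0]]
```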

In [9]:
# Load LoRA configuration
peft_args = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
)
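
The LoRA settings above translate into a scaling factor and a trainable-parameter count that can be worked out by hand. The sketch assumes a hidden size of 4096 and 32 decoder layers for the 7B model, and assumes that adapters are attached to two square projection matrices per layer (for example `q_proj` and `v_proj`), since the configuration lists no `target_modules`. Treat the count as an estimate under those assumptions.

```python
def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r): r * (d_in + d_out) weights."""
    return r * (d_in + d_out)


R, ALPHA = 64, 16            # r and lora_alpha from the configuration above
HIDDEN, LAYERS = 4096, 32    # assumed model dimensions
ADAPTED_PER_LAYER = 2        # assumed: two adapted projections per layer

scaling = ALPHA / R          # factor applied to the low-rank update
trainable = LAYERS * ADAPTED_PER_LAYER * lora_params_per_matrix(HIDDEN, HIDDEN, R)

print(scaling)    # 0.25
print(trainable)  # 33554432
```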

In [10]:
# Set training parameters
training_params = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=25,
    logging_steps=5,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
    report_to="tensorboard"
)
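
One detail of these settings is worth checking: as far as the scheduler names in Transformers go, `"constant"` keeps the learning rate flat from the first step, so `warmup_ratio=0.03` likely has no effect unless `"constant_with_warmup"` is selected. The sketch below re-implements both shapes in plain Python for illustration; it is not the library code, and the 379 total steps are taken from the training run reported further down.

```python
import math


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps implied by a ratio (rounded up)."""
    return math.ceil(total_steps * warmup_ratio)


def lr_constant(step: int, base_lr: float) -> float:
    """Flat schedule: the same learning rate at every step."""
    return base_lr


def lr_constant_with_warmup(step: int, base_lr: float, n_warmup: int) -> float:
    """Linear ramp from 0 to base_lr over n_warmup steps, then flat."""
    if step < n_warmup:
        return base_lr * step / max(1, n_warmup)
    return base_lr


n_warmup = warmup_steps(379, 0.03)
print(n_warmup)  # 12
for step in (0, 6, 12, 100):
    print(step, lr_constant(step, 2e-4), lr_constant_with_warmup(step, 2e-4, n_warmup))
```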

In [11]:
# Set supervised fine-tuning parameters
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_args,
    dataset_text_field="llama2-input",
    tokenizer=tokenizer,
    args=training_params,
    packing=False,
)


Deprecated positional argument(s) used in SFTTrainer, please use the SFTConfig to set these arguments instead.


Map:   0%|          | 0/3025 [00:00<?, ? examples/s]

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.


In [12]:
# Train model
trainer.train()

Step  Training Loss
   5         2.2977
  10         2.0514
  15         1.7637
  20         1.5801
  25         1.224
  30         1.0834
  35         0.9955
  40         0.9705
  45         0.9037
  50         0.8452

TrainOutput(global_step=379, training_loss=0.9307936250691679, metrics={'train_runtime': 1070.2861, 'train_samples_per_second': 2.826, 'train_steps_per_second': 0.354, 'total_flos': 2.3596558480318464e+16, 'train_loss': 0.9307936250691679, 'epoch': 1.0})
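
The numbers in this `TrainOutput` can be cross-checked against the settings used earlier. The sketch takes the 3025 examples from the `Map` progress line, the batch size and `save_steps` from the training arguments, and the runtime from the output above; the variable names are illustrative.

```python
import math

NUM_EXAMPLES = 3025      # from the "Map" progress line
BATCH_SIZE = 8           # per_device_train_batch_size (single GPU)
GRAD_ACCUM = 1           # gradient_accumulation_steps
SAVE_STEPS = 25          # save_steps
RUNTIME_S = 1070.2861    # train_runtime

# One optimizer step consumes BATCH_SIZE * GRAD_ACCUM examples.
steps = math.ceil(NUM_EXAMPLES / (BATCH_SIZE * GRAD_ACCUM))
last_checkpoint = (steps // SAVE_STEPS) * SAVE_STEPS

print(steps)                               # 379
print(last_checkpoint)                     # 375
print(round(NUM_EXAMPLES / RUNTIME_S, 3))  # 2.826
print(round(steps / RUNTIME_S, 3))         # 0.354
```

The last saved checkpoint is therefore step 375, four steps before the end of the epoch.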

In [None]:
# Save trained model
trainer.model.save_pretrained(new_model)
tokenizer.save_pretrained(new_model)

In [13]:
title = "India Wants Apple to Help With Probe on iPhone Hacking Attempts"
prompt = f"Identify if this news pertains to Apple, Nvidia, or Microsoft. If it does, provide a brief analysis stating whether the sentiment is positive, neutral, or negative, focusing on the cause of the sentiment. The news title is: {title}"

In [14]:
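device = torch.device("cuda")
torch.cuda.empty_cache()

# Last checkpoint written during training (save_steps=25, 379 steps in total).
# Despite the variable name, the notebook does not compare losses across checkpoints.
lowest_loss = './checkpoint-375'
model = AutoModelForCausalLM.from_pretrained(lowest_loss).to(device)
tokenizer = AutoTokenizer.from_pretrained(lowest_loss)
# max_length counts the prompt tokens as well, so long answers are cut off mid-sentence.
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=300, truncation=True)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])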



<s>[INST] Identify if this news pertains to Apple, Nvidia, or Microsoft. If it does, provide a brief analysis stating whether the sentiment is positive, neutral, or negative, focusing on the cause of the sentiment. The news title is: India Wants Apple to Help With Probe on iPhone Hacking Attempts [/INST] This news pertains to Apple. The sentiment appears to be neutral to slightly negative. The cause of the sentiment is the investigation by India into alleged hacking attempts on iPhones, which could lead to potential legal or regulatory issues for Apple. However, the neutrality stems from the fact that the investigation is not necessarily a direct reflection on Apple's operations or performance, but rather a response to external circumstances.

Brief Analysis:
- **Sentiment:** Neutral to slightly negative
- **Cause:** Investigation by India into iPhone hacking attempts

Overall, the sentiment is leaning slightly negative due to the regulatory scrutiny and potential legal implications, b
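
The pipeline output echoes the prompt before the model's answer. A small sketch of separating the two by splitting on the `[/INST]` marker; `extract_answer` is a hypothetical helper and the sample string is a shortened placeholder, not real model output. The text-generation pipeline also accepts `return_full_text=False`, which may be the simpler route.

```python
def extract_answer(generated_text: str, marker: str = "[/INST]") -> str:
    """Return whatever follows the last occurrence of the marker."""
    _, found, answer = generated_text.rpartition(marker)
    # If the marker is missing, fall back to the whole text.
    return answer.strip() if found else generated_text.strip()


sample = "<s>[INST] Some instruction. The news title is: Example headline [/INST] This news pertains to Apple."
print(extract_answer(sample))  # This news pertains to Apple.
```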

In [15]:
torch.cuda.empty_cache()

pipebase = pipeline(task="text-generation", model=AutoModelForCausalLM.from_pretrained(base_model).to(device),
                    tokenizer=AutoTokenizer.from_pretrained(base_model), max_length=300, truncation=True)
result = pipebase(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])

<s>[INST] Identify if this news pertains to Apple, Nvidia, or Microsoft. If it does, provide a brief analysis stating whether the sentiment is positive, neutral, or negative, focusing on the cause of the sentiment. The news title is: India Wants Apple to Help With Probe on iPhone Hacking Attempts [/INST]  The news pertains to Apple, so here is a brief analysis:

Title: India Wants Apple to Help With Probe on iPhone Hacking Attempts

Sentiment: Neutral

Reason: The news article states that the Indian government has asked Apple to assist in an investigation into attempts to hack iPhones in the country. While this may raise concerns about the security of Apple's products, the fact that the government is taking steps to address the issue suggests that the sentiment is neutral rather than negative.

Analysis: The Indian government's request for Apple's assistance in the investigation suggests that the company is taking the issue seriously and is cooperating with the authorities to address t
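
To compare the fine-tuned and base answers on more than one headline, the sentiment labels can be pulled out of each answer with a simple keyword match. This is a rough sketch, not an evaluation method from the original notebook: `sentiment_labels` is a hypothetical helper, and it should be applied to the answer only, because the prompt itself contains all three label words.

```python
import re
from typing import List

LABELS = ("positive", "neutral", "negative")
_PATTERN = re.compile(r"\b(" + "|".join(LABELS) + r")\b", re.IGNORECASE)


def sentiment_labels(answer: str) -> List[str]:
    """Labels mentioned in the answer, in order of first appearance."""
    seen: List[str] = []
    for match in _PATTERN.finditer(answer):
        label = match.group(1).lower()
        if label not in seen:
            seen.append(label)
    return seen


print(sentiment_labels("The sentiment is likely neutral to negative."))          # ['neutral', 'negative']
print(sentiment_labels("This news likely has a negative sentiment for Apple."))  # ['negative']
```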

In [36]:
pipe(f"<s>[INST] {prompt} [/INST]")[0]['generated_text']

"<s>[INST] Identify if this news pertains to Apple, Nvidia, or Microsoft. If it does, provide a brief analysis stating whether the sentiment is positive, neutral, or negative, focusing on the cause of the sentiment. The news title is: India Wants Apple to Help With Probe on iPhone Hacking Attempts [/INST] This news pertains to Apple. The sentiment is likely neutral to negative. The cause of the sentiment is the involvement of Apple in a regulatory investigation related to hacking attempts on iPhones in India. While the investigation itself is a routine matter, it could lead to negative publicity and potential legal or regulatory consequences for Apple. However, without additional context on the specifics of the investigation and Apple's response, the sentiment remains neutral to negative."

In [37]:
pipebase(f"<s>[INST] {prompt} [/INST]")[0]['generated_text']

"<s>[INST] Identify if this news pertains to Apple, Nvidia, or Microsoft. If it does, provide a brief analysis stating whether the sentiment is positive, neutral, or negative, focusing on the cause of the sentiment. The news title is: India Wants Apple to Help With Probe on iPhone Hacking Attempts [/INST]  This news pertains to Apple, so I will provide a brief analysis of the sentiment.\n\nThe news title suggests that Indian authorities are investigating hacking attempts on iPhones in the country, and they have requested Apple's assistance in the probe. This news likely has a negative sentiment for Apple, as the company is being asked to cooperate with an investigation that may involve potential security vulnerabilities in its products.\n\nThe cause of the negative sentiment is likely the concern over the security of Apple's products and the potential for hacking attempts, which could damage the company's reputation and erode customer trust. This is particularly concerning for Apple, a