<a href="https://colab.research.google.com/github/fatemafaria142/Natural-Language-Understanding-in-English-with-MultiNLI-Corpus/blob/main/Natural_Language_Inference_using_Mistral_7B.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **Install Required Packages**

In [1]:
!pip install accelerate peft bitsandbytes transformers trl datasets torch

Collecting accelerate
  Downloading accelerate-0.26.1-py3-none-any.whl (270 kB)
Collecting peft
  Downloading peft-0.7.1-py3-none-any.whl (168 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.42.0-py3-none-any.whl (105.0 MB)
Collecting trl
  Downloading trl-0.7.9-py3-none-any.whl (141 kB)
Collecting datasets
  Downloading datasets-2.16.1-py3-none-any.whl (507 kB)

### **Dataset Link:** https://huggingface.co/datasets/multi_nli

In [3]:
from datasets import load_dataset

instruct_tune_dataset = load_dataset("multi_nli")

Downloading readme:   0%|          | 0.00/8.89k [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/214M [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/4.94M [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/5.10M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/392702 [00:00<?, ? examples/s]

Generating validation_matched split:   0%|          | 0/9815 [00:00<?, ? examples/s]

Generating validation_mismatched split:   0%|          | 0/9832 [00:00<?, ? examples/s]

### **Dataset structure**

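* The dataset contains three splits (`train`, `validation_matched`, and `validation_mismatched`), each with the same ten columns. Only `premise`, `hypothesis`, and `label` are used in this notebook.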

In [4]:
instruct_tune_dataset

DatasetDict({
    train: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 392702
    })
    validation_matched: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 9815
    })
    validation_mismatched: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 9832
    })
})

In [5]:
# Display information for 5 data points from the 'train' split
num_samples_to_show = 5
for i in range(num_samples_to_show):
    data = instruct_tune_dataset['train'][i]
    print(f"Data Point {i + 1}:")
    print("Premise:", data['premise'])
    print("Hypothesis:", data['hypothesis'])
    print("Label:", data['label'])
    print("\n-----------------------------\n")


Data Point 1:
Premise: Conceptually cream skimming has two basic dimensions - product and geography.
Hypothesis: Product and geography are what make cream skimming work. 
Label: 1

-----------------------------

Data Point 2:
Premise: you know during the season and i guess at at your level uh you lose them to the next level if if they decide to recall the the parent team the Braves decide to call to recall a guy from triple A then a double A guy goes up to replace him and a single A guy goes up to replace him
Hypothesis: You lose the things to the following level if the people recall.
Label: 0

-----------------------------

Data Point 3:
Premise: One of our number will carry out your instructions minutely.
Hypothesis: A member of my team will execute your orders with immense precision.
Label: 0

-----------------------------

Data Point 4:
Premise: How do you know? All this is their information again.
Hypothesis: This information belongs to them.
Label: 0

----------------------------
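
The integer labels printed above follow the same mapping that the prompt template in this notebook spells out: 0 is entailment, 1 is neutral, and 2 is contradiction. A minimal helper for making printed samples easier to read might look like this; `LABEL_NAMES` and `label_name` are illustrative names, not part of the `datasets` API.

```python
# Illustrative helper (not part of the datasets library): map MultiNLI
# integer labels to readable class names.
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def label_name(label):
    """Return the class name for an integer label, or 'unknown' if unmapped."""
    return LABEL_NAMES.get(label, "unknown")


# The first data point shown above has label 1.
print(label_name(1))
```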

### **We will use just a small subset of the data for this training example**

In [7]:
instruct_tune_dataset["train"] = instruct_tune_dataset["train"].select(range(5000))
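# Draw the held-out examples from a validation split so they do not overlap with the training subset
instruct_tune_dataset["test"] = instruct_tune_dataset["validation_matched"].select(range(1000))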

In [8]:
instruct_tune_dataset

DatasetDict({
    train: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 5000
    })
    validation_matched: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 9815
    })
    validation_mismatched: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 9832
    })
    test: Dataset({
        features: ['promptID', 'pairID', 'premise', 'premise_binary_parse', 'premise_parse', 'hypothesis', 'hypothesis_binary_parse', 'hypothesis_parse', 'genre', 'label'],
        num_rows: 1000
    })
})

In [11]:
def create_prompt(sample):
    """
    Update the prompt template:
    Combine both the prompt and input into a single column.
    """
    bos_token = "<s>"
    eos_token = "</s>"

    # Use a predefined template for instructions
    instructions_template = "Evaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION."

    full_prompt = ""
    full_prompt += bos_token
    full_prompt += "### Instructions:"
    full_prompt += "\n" + instructions_template
    full_prompt += "\n\n### Premise:"
    full_prompt += "\n" + sample["premise"]
    full_prompt += "\n\n### Hypothesis:"
    full_prompt += "\n" + sample["hypothesis"]
    full_prompt += "\n\n### Label:"
    full_prompt += "\n" + str(sample["label"])  # Convert label to string
    full_prompt += eos_token

    return full_prompt
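
At inference time the prompt has to stop at `### Label:` so that the model fills in the answer. A possible companion helper, sketched here under the assumption that the tokenizer adds `<s>` itself (as it does when called with `add_special_tokens=True`), mirrors the training layout but leaves out the label and `</s>`. The name `create_inference_prompt` is hypothetical.

```python
# Same instruction text as in create_prompt, split across lines for readability.
INSTRUCTIONS = (
    "Evaluate the relationship between the given premise and hypothesis. "
    "Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. "
    "The premise provides the information for the model to base its reasoning, "
    "and the hypothesis is the statement to be evaluated based on the provided premise. "
    "Assign a label to the relationship between the premise and hypothesis: "
    "1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION."
)


def create_inference_prompt(premise, hypothesis):
    """Build a prompt in the training layout that ends right before the label."""
    return (
        "### Instructions:\n" + INSTRUCTIONS
        + "\n\n### Premise:\n" + premise
        + "\n\n### Hypothesis:\n" + hypothesis
        + "\n\n### Label:"
    )


print(create_inference_prompt("i bet even my cats could do that",
                              "My cats could probably do that because they are brilliant."))
```

Building inference prompts from one function keeps their whitespace identical to what the model saw during training.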

In [12]:
create_prompt(instruct_tune_dataset["train"][0])

'<s>### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n\n### Premise:\nConceptually cream skimming has two basic dimensions - product and geography.\n\n### Hypothesis:\nProduct and geography are what make cream skimming work. \n\n### Label:\n1</s>'

In [13]:
create_prompt(instruct_tune_dataset["train"][1])

'<s>### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n\n### Premise:\nyou know during the season and i guess at at your level uh you lose them to the next level if if they decide to recall the the parent team the Braves decide to call to recall a guy from triple A then a double A guy goes up to replace him and a single A guy goes up to replace him\n\n### Hypothesis:\nYou lose the things to the following level if the people recall.\n\n### Label:\n0</s>'

In [14]:
create_prompt(instruct_tune_dataset["train"][2])

'<s>### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n\n### Premise:\nOne of our number will carry out your instructions minutely.\n\n### Hypothesis:\nA member of my team will execute your orders with immense precision.\n\n### Label:\n0</s>'

### **Initializing the Model**
* Load the model using a 4-bit configuration, employing double quantization, and set float16 as the compute data type.

* We use the instruction-tuned model here rather than the base model, since fine-tuning a base model requires substantially more data!

In [15]:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype="float16", bnb_4bit_use_double_quant=True
    )


* https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1

In [16]:
model_id = "mistralai/Mistral-7B-Instruct-v0.1"

In [17]:
model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto", use_cache=False
    )

config.json:   0%|          | 0.00/571 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/25.1k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/9.94G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/4.54G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]
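
The shard sizes above give a rough sense of why 4-bit loading matters. As a back-of-the-envelope sketch, assuming the checkpoint stores each weight in 2 bytes and ignoring quantization constants, activations, and any layers kept in higher precision:

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Sizes of the two safetensors shards reported in the download output.
checkpoint_gb = 9.94 + 4.54

# Assumption: 2 bytes per weight in the checkpoint.
num_params = checkpoint_gb * 1e9 / 2

fp16_gb = weight_memory_gb(num_params, 16)
nf4_gb = weight_memory_gb(num_params, 4)

print(f"~{num_params / 1e9:.2f}B parameters")
print(f"16-bit weights: ~{fp16_gb:.2f} GB, 4-bit weights: ~{nf4_gb:.2f} GB")
```

The real footprint is somewhat higher than the 4-bit figure, but the estimate shows why the quantized model fits on a single consumer or Colab GPU.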

In [18]:
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

tokenizer_config.json:   0%|          | 0.00/967 [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.80M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]

### **Let's examine how well the model does at this task currently:**
* `temperature=0.5` sets a moderate level of randomness: lower values make sampling more deterministic, higher values make it more varied. Experiment with different values to find the balance that suits your use case.
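
To see what the temperature does, recall that the logits are divided by it before the softmax. The sketch below uses three made-up logits (not model outputs) to show that a temperature below 1 concentrates probability on the most likely token.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]

p_default = softmax_with_temperature(logits, 1.0)
p_sharp = softmax_with_temperature(logits, 0.5)

print("temperature 1.0:", [round(p, 3) for p in p_default])
print("temperature 0.5:", [round(p, 3) for p in p_sharp])
```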

In [19]:
def generate_response(prompt, model):
  # Tokenize the prompt; add_special_tokens=True prepends the <s> token
  encoded_input = tokenizer(prompt,  return_tensors="pt", add_special_tokens=True)
  model_inputs = encoded_input.to('cuda')

  # Sample up to 1024 new tokens at temperature 0.5
  generated_ids = model.generate(**model_inputs, max_new_tokens=1024,temperature=0.5, do_sample=True, pad_token_id=tokenizer.eos_token_id)

  decoded_output = tokenizer.batch_decode(generated_ids)

  # Remove the prompt so that only the newly generated text is returned
  return decoded_output[0].replace(prompt, "")

In [20]:
prompt = "### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n\n### Premise:\nOne of our number will carry out your instructions minutely.\n\n### Hypothesis:\nA member of my team will execute your orders with immense precision.\n\n### Label:"

In [21]:
generate_response(prompt, model)



'<s>  1 (NEUTRAL)\n\nThe hypothesis does not logically follow from the premise. The premise states that one of the number will carry out the instructions, while the hypothesis states that a member of the team will execute the orders. These two statements are not related, as they do not refer to the same person or situation.</s>'
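
The base model answers with a label followed by a free-form explanation, so comparing its output with the gold label (0 for this example) needs a parsing step. One possible approach, sketched under the assumption that the first standalone `0`, `1`, or `2` in the text is the predicted label; `extract_label` is a hypothetical helper.

```python
import re


def extract_label(generated_text):
    """Return the first standalone 0, 1, or 2 in the text, or None if absent."""
    # Drop the special tokens that batch_decode leaves in the string.
    cleaned = generated_text.replace("<s>", " ").replace("</s>", " ")
    match = re.search(r"\b[012]\b", cleaned)
    return int(match.group()) if match else None


sample_output = "<s>  1 (NEUTRAL)\n\nThe hypothesis does not logically follow from the premise."
print(extract_label(sample_output))
```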

### **Setting up the Training**
We will use the Hugging Face `transformers` and `peft` libraries!

In [22]:
from peft import AutoPeftModelForCausalLM, LoraConfig, get_peft_model, prepare_model_for_kbit_training

peft_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")
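
To get a feel for what `r=8` and `lora_alpha=16` mean: LoRA leaves a weight matrix of shape `d_out x d_in` frozen and learns two small matrices, `B` (`d_out x r`) and `A` (`r x d_in`), whose product is scaled by `lora_alpha / r`. The sketch below counts the extra trainable parameters for a single adapted matrix; the 4096 x 4096 shape is an illustrative assumption, not a value read from the model.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has r * d_in entries and B has d_out * r entries.
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Factor applied to the low-rank update B @ A."""
    return lora_alpha / r


# Assumed (illustrative) shape of one projection matrix.
d_in, d_out = 4096, 4096

full = d_in * d_out
extra = lora_param_count(d_in, d_out, r=8)

print(f"frozen weights: {full:,}")
print(f"trainable LoRA weights: {extra:,} ({100 * extra / full:.2f}% of the matrix)")
print("scaling factor:", lora_scaling(lora_alpha=16, r=8))
```

Once the model has been wrapped with `get_peft_model`, calling `model.print_trainable_parameters()` reports the actual counts.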


* We need to prepare the model for 4-bit training, so we use the **`prepare_model_for_kbit_training`** function from `peft`.




In [23]:
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)

### **Training Hyperparameters**
The right hyperparameters depend on how long you can afford to train. Pay particular attention to the following:

* `num_train_epochs` / `max_steps`: Control how long training runs. When `max_steps` is positive it takes precedence over `num_train_epochs`. Too many passes over the data can lead to overfitting!

* `learning_rate`: Sets the size of each update and therefore how quickly the model converges. Too high a value can make training unstable; too low a value slows it down.

In [24]:
from transformers import TrainingArguments
output_model= "mistral_NLI_generation"
training_arguments = TrainingArguments(
        output_dir=output_model,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        optim="paged_adamw_32bit",
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        save_strategy="epoch",
        logging_steps=10,
        num_train_epochs=1,
        max_steps=100,
        fp16=True,
)
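
With `lr_scheduler_type="cosine"` and no warmup configured, the learning rate starts at `learning_rate` and decays towards zero over `max_steps`. A minimal sketch of that curve, assuming zero warmup steps and half a cosine cycle; the function name `cosine_lr` is illustrative.

```python
import math


def cosine_lr(step, total_steps, base_lr):
    """Learning rate after `step` optimizer steps of a half-cosine decay."""
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values taken from the training arguments above.
for step in (0, 25, 50, 75, 100):
    print(step, f"{cosine_lr(step, total_steps=100, base_lr=2e-4):.2e}")
```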


### **Setting up the trainer**

`max_seq_length`: the maximum number of tokens in each training sequence.


In [25]:
from trl import SFTTrainer

max_seq_length = 1024

trainer = SFTTrainer(
  model=model,
  peft_config=peft_config,
  max_seq_length=max_seq_length,
  tokenizer=tokenizer,
  packing=True,
  formatting_func=create_prompt, # this will apply the create_prompt mapping to the training and test datasets
  args=training_arguments,
  train_dataset=instruct_tune_dataset["train"],
  eval_dataset=instruct_tune_dataset["test"]
)

Generating train split: 0 examples [00:00, ? examples/s]

Generating train split: 0 examples [00:00, ? examples/s]
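
With `packing=True`, several short formatted prompts are concatenated into sequences of `max_seq_length` tokens instead of each prompt being padded separately. The sketch below is a simplified picture of that idea in plain Python, not the implementation used by `trl`; `pack_sequences` and the token ids are made up for illustration.

```python
def pack_sequences(tokenized_examples, block_size, separator_id):
    """Concatenate token lists (with a separator) and cut fixed-size blocks."""
    stream = []
    for token_ids in tokenized_examples:
        stream.extend(token_ids)
        stream.append(separator_id)  # marks the end of each example

    blocks = []
    for start in range(0, len(stream), block_size):
        block = stream[start:start + block_size]
        # Keep only full blocks; an incomplete tail is dropped.
        if len(block) == block_size:
            blocks.append(block)
    return blocks


# Three toy "tokenized prompts" of different lengths.
examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
blocks = pack_sequences(examples, block_size=4, separator_id=0)
print(blocks)
```

One consequence is that a training step consumes packed sequences rather than individual premise/hypothesis pairs, so the number of steps per epoch is smaller than the number of examples.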



### **Training starts here**

In [26]:
trainer.train()

You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


| Step | Training Loss |
|------|---------------|
| 10 | 1.0173 |
| 20 | 0.7622 |
| 30 | 0.7233 |
| 40 | 0.6774 |
| 50 | 0.6674 |
| 60 | 0.6602 |
| 70 | 0.6458 |
| 80 | 0.6909 |
| 90 | 0.6655 |
| 100 | 0.628 |


TrainOutput(global_step=100, training_loss=0.713795075416565, metrics={'train_runtime': 1491.5517, 'train_samples_per_second': 0.268, 'train_steps_per_second': 0.067, 'total_flos': 1.74835334381568e+16, 'train_loss': 0.713795075416565, 'epoch': 0.47})
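
The reported throughput can be cross-checked against the training arguments. A small sanity check, using only numbers from the cells above and assuming a single GPU:

```python
# Values copied from TrainingArguments and the TrainOutput above.
per_device_batch_size = 1
gradient_accumulation_steps = 4
global_steps = 100
train_runtime_s = 1491.5517
epoch_fraction = 0.47

# Each optimizer step consumes batch_size * accumulation_steps sequences.
sequences_per_step = per_device_batch_size * gradient_accumulation_steps
sequences_seen = sequences_per_step * global_steps

samples_per_second = sequences_seen / train_runtime_s
steps_per_second = global_steps / train_runtime_s

# If these sequences cover 0.47 of an epoch, a full epoch is roughly this large.
approx_sequences_per_epoch = sequences_seen / epoch_fraction

print(f"sequences seen: {sequences_seen}")
print(f"samples/s: {samples_per_second:.3f}, steps/s: {steps_per_second:.3f}")
print(f"approx. sequences per epoch: {approx_sequences_per_epoch:.0f}")
```

The per-epoch estimate is far below the 5,000 training examples, which is consistent with several prompts being packed into each sequence.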

### **Save the model**

In [27]:
trainer.save_model("mistral_NLI_generation")

In [28]:
merged_model = model.merge_and_unload()
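
`merge_and_unload()` folds the learned LoRA update into the base weights so that the adapter layers are no longer needed at inference time. Conceptually, the merged weight is `W + (lora_alpha / r) * B @ A`. A toy sketch in plain Python, with made-up 2 x 2 values and none of the quantization handling that `peft` performs:

```python
def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*right)]
            for row in left]


def merge_lora(weight, lora_a, lora_b, lora_alpha, r):
    """Return weight + (lora_alpha / r) * (lora_b @ lora_a)."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Made-up values: a 2 x 2 frozen weight and a rank-1 update.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [0.0]]   # shape d_out x r
lora_a = [[0.5, 0.5]]     # shape r x d_in

merged = merge_lora(weight, lora_a, lora_b, lora_alpha=2, r=1)
print(merged)
```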



In [29]:
# Same helper as before, except that the prompt is kept in the returned text
def generate_response(prompt, model):
  encoded_input = tokenizer(prompt,  return_tensors="pt", add_special_tokens=True)
  model_inputs = encoded_input.to('cuda')

  generated_ids = model.generate(**model_inputs, max_new_tokens=1024,temperature=0.5, do_sample=True, pad_token_id=tokenizer.eos_token_id)

  decoded_output = tokenizer.batch_decode(generated_ids)

  # Return the full decoded sequence: prompt followed by the generated label
  return decoded_output[0]

### **Example: 1**

In [36]:
# Example usage
prompt = "### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n"
prompt += "\n### Premise:\nit really is i heard something that their supposed to be starting a huge campaign in New York about um child abuse and stopping child abuse and it's supposed to be like it's starting there supposed to be like a big nationwide campaign and you know so hopefully that will take off and really do something i don't know there's just\n\n"
prompt += "\n### Hypothesis:\nIt's unfortunate that nobody is organizing a child abuse campaign."
prompt += "\n\n### Label:"
response = generate_response(prompt, model)

# Print the response with formatted output
print(response)

<s> ### Instructions:
Evaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.

### Premise:
it really is i heard something that their supposed to be starting a huge campaign in New York about um child abuse and stopping child abuse and it's supposed to be like it's starting there supposed to be like a big nationwide campaign and you know so hopefully that will take off and really do something i don't know there's just


### Hypothesis:
It's unfortunate that nobody is organizing a child abuse campaign.

### Label:
2</s>


### **Example: 2**

In [37]:
# Example usage
prompt = "### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n"
prompt += "\n### Premise:\nThese organizations invest the time and effort to understand their processes and how those processes contribute to or hamper mission accomplishment.\n\n"
prompt += "\n### Hypothesis:\nThese organizations invest lots of time to understand how some processes can contribute to or haampe"
prompt += "\n\n### Label:"
response = generate_response(prompt, model)

# Print the response with formatted output
print(response)

<s> ### Instructions:
Evaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.

### Premise:
These organizations invest the time and effort to understand their processes and how those processes contribute to or hamper mission accomplishment.


### Hypothesis:
These organizations invest lots of time to understand how some processes can contribute to or haampe

### Label:
0</s>


### **Example: 3**

In [38]:
# Example usage
prompt = "### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n"
prompt += "\n### Premise:\nThere are good road connections between Sant Antoni and both CaleT?¡rida and CaleBadella, with the result that both bays have now been developed.\n\n"
prompt += "\n### Hypothesis:\nWith the good road connections, both bays have been developed."
prompt += "\n\n### Label:"
response = generate_response(prompt, model)

# Print the response with formatted output
print(response)

<s> ### Instructions:
Evaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.

### Premise:
There are good road connections between Sant Antoni and both CaleT?¡rida and CaleBadella, with the result that both bays have now been developed.


### Hypothesis:
With the good road connections, both bays have been developed.

### Label:
0</s>


### **Example: 4**

In [39]:
# Example usage
prompt = "### Instructions:\nEvaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.\n"
prompt += "\n### Premise:\ni bet even my cats could do that\n\n"
prompt += "\n### Hypothesis:\nMy cats could probably do that because they are brilliant."
prompt += "\n\n### Label:"
response = generate_response(prompt, model)

# Print the response with formatted output
print(response)

<s> ### Instructions:
Evaluate the relationship between the given premise and hypothesis. Determine if the hypothesis logically follows from the premise, contradicts it, or is unrelated. The premise provides the information for the model to base its reasoning, and the hypothesis is the statement to be evaluated based on the provided premise. Assign a label to the relationship between the premise and hypothesis: 1 for NEUTRAL, 0 for ENTAILMENT, and 2 for CONTRADICTION.

### Premise:
i bet even my cats could do that


### Hypothesis:
My cats could probably do that because they are brilliant.

### Label:
1</s>
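
Four hand-picked prompts give only an impression of quality. To quantify it, predictions could be compared against gold labels over a held-out set. A minimal sketch of the scoring step follows; the prediction and gold lists are made-up placeholders, not results from this notebook, and `None` stands for an output from which no label could be parsed.

```python
def accuracy(predictions, gold_labels):
    """Fraction of predictions that equal the gold label (None counts as wrong)."""
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have the same length")
    if not gold_labels:
        return 0.0
    correct = sum(1 for pred, gold in zip(predictions, gold_labels) if pred == gold)
    return correct / len(gold_labels)


# Placeholder values for illustration only.
placeholder_predictions = [0, 1, 2, None, 1]
placeholder_gold = [0, 1, 1, 2, 1]

print(f"accuracy: {accuracy(placeholder_predictions, placeholder_gold):.2f}")
```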
