# ANIMAL HUSBANDRY LLM

# Fine-tuning Nous-Hermes-Llama2-13b on your own data for a domain-specific task

In this notebook and tutorial, we will fine-tune [Nous-Hermes-Llama2-13b](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b), which is a variant of Llama 2.

We fine-tune Nous-Hermes-Llama2-13b because it stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms.

This tutorial uses Gradient AI to handle our computational resources and host the model.

Gradient AI is a platform that lets you easily fine-tune an existing base model through an API.


You can check out the full project on our GitHub page: [animal_husbandry_LLM](https://github.com/ice-black/animal_husbandry_LLM)



## Let's begin!

The purpose is to fine-tune our base language model specifically for the domain of animal husbandry **(question answering)**

The packages used are:

In [None]:
!pip install gradientai --upgrade
!pip install langchain
!pip install -U gradient_haystack
!pip install regex
!pip install rouge
!pip install nltk
!pip install wandb
!pip install datasets


# **1. LOAD DATASET**

**Preparing data**

To prepare the dataset for fine-tuning, you'll need to structure it as a `.json` file with input-output pairs. Each pair should be formatted as follows:
```
{ "inputs": "<s>### Instruction:\n  {user_query} \n\n### Response:\n {response} </s>"}
```
Here's an example of what our dataset looks like:

```
{ "inputs": "<s>### Instruction:\nDiscuss the role of nutrition in animal husbandry. \n\n### Response:\nNutrition plays a crucial role in animal husbandry as it directly impacts the health, growth, and productivity of livestock. Properly balanced diets ensure optimal development and efficient utilization of nutrients for various purposes, such as milk and meat production.</s>"},
{ "inputs": "<s>### Instruction:\nElaborate on the challenges faced in modern animal husbandry practices. \n\n### Response:\nModern animal husbandry faces challenges such as disease management, ethical concerns, and environmental impact. Balancing productivity with animal welfare and sustainability is an ongoing challenge for practitioners in the field.</s>"},
```
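As a rough sketch of how such records could be produced, the helper below wraps raw question/answer pairs in the same template as the example records above and serializes them as a JSON list. The function name `format_sample` and the example pair are assumptions made for illustration; they are not part of the original project.

```python
import json

# Template following the example records shown above.
TEMPLATE = "<s>### Instruction:\n{user_query} \n\n### Response:\n{response}</s>"


def format_sample(user_query, response):
    """Wrap one question/answer pair in the fine-tuning template (hypothetical helper)."""
    return {"inputs": TEMPLATE.format(user_query=user_query.strip(), response=response.strip())}


# Illustrative pair only; replace it with your own domain data.
pairs = [
    ("What is animal husbandry?", "Animal husbandry is the practice of raising livestock."),
]

samples = [format_sample(question, answer) for question, answer in pairs]

# Serialize to a JSON list; writing this string to a file gives a dataset
# in the shape that the loading cell below expects.
serialized = json.dumps(samples, ensure_ascii=False, indent=2)
print(serialized)
```

Stripping whitespace before formatting keeps the spacing around the markers identical in every record, which matters later when the instruction is extracted again with a regular expression.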




In [None]:
import re


In [7]:
import json
with open("/content/data.json") as f:  # Load the dataset from a JSON file
    train_dataset = json.load(f)

print("Dataset Size: ", len(train_dataset))

Dataset Size:  974


In [None]:
def divide_into_Batches(number, chunk_size):  # Define a function to divide a number into chunks of a given size
    Batches = []
    while number > 0:
        if number >= chunk_size:
            Batches.append(chunk_size)
            number -= chunk_size
        else:
            Batches.append(number)
            break

    return Batches

Batches = divide_into_Batches(len(train_dataset), 100)  # Divide the dataset into chunks of 100 samples each
print("Batches size")
print(Batches)

Batches size
[100, 100, 100, 100, 100, 100, 100, 100, 100, 74]
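The batch sizes above are later turned into slice ranges by hand. A minimal alternative sketch, assuming a hypothetical helper named `iter_batches`, yields the slices directly so that sizes and offsets cannot drift apart:

```python
def iter_batches(samples, batch_size):
    """Yield (start, end, batch) tuples covering `samples` in order (hypothetical helper)."""
    for start in range(0, len(samples), batch_size):
        end = min(start + batch_size, len(samples))
        yield start, end, samples[start:end]


# Stand-in data with the same length as the dataset above (974 samples).
dummy_dataset = list(range(974))
batch_sizes = [len(batch) for _, _, batch in iter_batches(dummy_dataset, 100)]
print(batch_sizes)
```

The last batch is simply whatever remains, so no special case is needed for datasets whose size is not a multiple of the batch size.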


> # **2. LOAD BASE MODEL**

Let's now load the 13B base model, `NousResearch/Nous-Hermes-Llama2-13b`, which Gradient hosts for us.

In [2]:
import json
import os
from gradientai import Gradient
from langchain.chains import LLMChain
from langchain.llms import GradientLLM
from langchain.prompts import PromptTemplate


# Replace the placeholders with your own credentials; never commit real tokens.
os.environ['GRADIENT_ACCESS_TOKEN'] = "<YOUR_GRADIENT_ACCESS_TOKEN>"
os.environ['GRADIENT_WORKSPACE_ID'] = "<YOUR_GRADIENT_WORKSPACE_ID>"

gradient =  Gradient()

print("Available Base Models")
for i in gradient.list_models(only_base=True):
    print("\t",i)

base_model_id = "NousResearch/Nous-Hermes-Llama2-13b"
base_model_name = "nous-hermes2"
base_model = gradient.get_base_model(base_model_slug="nous-hermes2") # base model Nous-Hermes-Llama2-13b

print("\nBase Model Chosen :", base_model)


Available Base Models
	 <gradientai._base_model.BaseModel object at 0x7ddab568e710>
	 <gradientai._base_model.BaseModel object at 0x7ddab568e6e0>
	 <gradientai._base_model.BaseModel object at 0x7ddab568ee60>
	 <gradientai._base_model.BaseModel object at 0x7ddab568ef20>
	 <gradientai._base_model.BaseModel object at 0x7ddab568e9e0>

Base Model Chosen : <gradientai._base_model.BaseModel object at 0x7ddab568efb0>


> # **3. CREATING A MODEL ADAPTER**

* Adapters are small, lightweight modules inserted between existing layers of a pre-trained LLM. They act like "add-ons" that focus on learning task-specific information without modifying the core knowledge captured in the original model.

* In our case, an adapter is simply a set of unfrozen weights that we train during the fine-tuning process.

* The adapter serves as the object that we are going to fine-tune.

In [5]:
our_finetune_model_name="Llama2-13b/Animal_Husbandry"
Fine_Tune__adapter = base_model.create_model_adapter(
        name=our_finetune_model_name,
        learning_rate=0.00005,
        rank=8,

    )

# Default hyperparameters, listed for reference only (this dict is not passed to any call in this notebook)
hyperparameters = {
                  "block_size": 1024,
                  "model_max_length": 2048,
                  "padding": "right",
                  "use_flash_attention_2": False,
                  "disable_gradient_checkpointing": False,
                  "logging_steps": -1,
                  "evaluation_strategy": "epoch",
                  "save_total_limit": 1,
                  "save_strategy": "epoch",
                  "auto_find_batch_size": False,
                  "mixed_precision": "fp16",
                  "epochs": 3,
                  "batch_size": 100,
                  "warmup_ratio": 0.1,
                  "gradient_accumulation": 1,
                  "optimizer": "adamw_torch",
                  "scheduler": "linear",
                  "weight_decay": 0,
                  "max_grad_norm": 1,
                  "seed": 42,
                  "apply_chat_template": False,
                  "quantization": "int4",
                  "target_modules": "",
                  "merge_adapter": False,
                  "peft": True,
                  "lora_r": 16,
                  "lora_alpha": 32,
                  "lora_dropout": 0.05
  }


print(f"Base model id                : {Fine_Tune__adapter._base_model_id}")
print(f"Fine tune model Name         : { Fine_Tune__adapter.name}")
print(f"Fine tune model adapter id   : {Fine_Tune__adapter.id}")

print("\n\n")
print("Size of object in memory, in bytes.", Fine_Tune__adapter.__format__.__sizeof__()) # Size of object in memory, in bytes.
Fine_Tune__adapter.__dict__

Base model id                : cc2dafce-9e6e-4a23-a918-cad6ba89e42e_base_ml_model
Fine tune model Name         : Llama2-13b/Animal_Husbandry
Fine tune model adapter id   : c26d5bd9-aa0a-4e3c-a715-eb124bd71d29_model_adapter



Size of object in memory, in bytes. 56


{'_api_instance': <gradientai.openapi.client.api.models_api.ModelsApi at 0x7ddab568e380>,
 '_id': 'c26d5bd9-aa0a-4e3c-a715-eb124bd71d29_model_adapter',
 '_workspace_id': '345ce93a-40e9-4940-aa2e-fa76f1668fcd_workspace',
 '_async_semaphore': <asyncio.locks.Semaphore object at 0x7ddae82fa320 [unlocked, value:8]>,
 '_base_model_id': 'cc2dafce-9e6e-4a23-a918-cad6ba89e42e_base_ml_model',
 '_name': 'Llama2-13b/Animal_Husbandry'}

> # **4. FINE-TUNING OUR ADAPTER**

In our case we perform LoRA-based fine-tuning: we freeze roughly 99% of the model's weights and fine-tune an adapter on top of them.

LoRA (Low-Rank Adaptation of Large Language Models) is a technique introduced by Microsoft researchers to deal with the problem of fine-tuning large language models with billions of parameters.

LoRA proposes to freeze the pre-trained model weights and inject trainable layers (rank-decomposition matrices) into each transformer block. This greatly reduces the number of trainable parameters and the GPU memory requirements, since gradients don't need to be computed for most model weights.


Why LoRA fine-tuning:

1. Faster training: only the added task-specific layers are trained while the pre-trained model's parameters remain frozen, so fine-tuning is generally faster than training a model from scratch.
2. Lower computation requirements: the Hugging Face post linked below reports creating a full fine-tuned Stable Diffusion model on a 2080 Ti with 11 GB of VRAM.
3. Much smaller trained weights: the original model is frozen, so only the injected layers have to be trained and stored.

[For more info](https://huggingface.co/blog/lora)
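To make the parameter savings concrete, the sketch below counts trainable parameters for a single weight matrix. It assumes a square 5120 x 5120 projection (5120 is the hidden size of Llama 2 13B) and the rank of 8 used when creating the adapter above. Which matrices Gradient actually adapts is not stated in this notebook, so treat the numbers as illustrative only.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_out = d_in = 5120   # assumed: hidden size of Llama 2 13B
rank = 8              # same rank passed to create_model_adapter above

full_params = d_out * d_in                              # weights updated by full fine-tuning
lora_params = lora_trainable_params(d_out, d_in, rank)  # weights updated by LoRA
fraction = lora_params / full_params

print(f"Full matrix : {full_params:,} parameters")
print(f"LoRA factors: {lora_params:,} parameters")
print(f"Trainable fraction: {fraction:.4%}")
```

Under these assumptions the low-rank factors hold well under 1% of the parameters of the matrix they adapt, which is where the smaller checkpoints and lower memory use come from.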

In [None]:
print(f"Our Model id:  {Fine_Tune__adapter.id}")
num_epochs = 1  # num_epochs is the number of times you fine-tune the model # more epochs tends to get better results, but you also run the risk of "overfitting"
count = 0
print("================================================================\n")
print("Fine tuning . . .\n")
while count < num_epochs:
    print(f"Fine-tuning the model, iteration {count + 1}")
    s = 0
    n = 1
    for Batch in Batches:
        print(f"Batch {n} range: {s} : {(s + Batch)}")

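        # Try to fine-tune the model with this batch of samples and retry on
        # failure (for example, a transient API error).
        while True:
            try:
                metric = Fine_Tune__adapter.fine_tune(samples=train_dataset[s: s + Batch])
                print(f"\t Batch {n} Evaluation :", metric)
                break
            except Exception as error:
                # Report the failure instead of silently swallowing it
                print(f"\t Batch {n} failed, retrying:", error)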



        s += Batch
        n += 1
    count = count + 1

Our Model id:  d189f721-ae17-4545-a0ad-f95194e857f5_model_adapter

Fine tuning . . .

Fine-tuning the model, iteration 1
Batch 1 range: 0 : 100
	 Batch 1 Evaluation : number_of_trainable_tokens=8007 sum_loss=7168.5273
Batch 2 range: 100 : 200
	 Batch 2 Evaluation : number_of_trainable_tokens=8043 sum_loss=5564.6777
Batch 3 range: 200 : 300
	 Batch 3 Evaluation : number_of_trainable_tokens=9278 sum_loss=5753.655
Batch 4 range: 300 : 400
	 Batch 4 Evaluation : number_of_trainable_tokens=10454 sum_loss=7111.75
Batch 5 range: 400 : 500
	 Batch 5 Evaluation : number_of_trainable_tokens=13087 sum_loss=9291.248
Batch 6 range: 500 : 600
	 Batch 6 Evaluation : number_of_trainable_tokens=16332 sum_loss=12244.016
Batch 7 range: 600 : 700
	 Batch 7 Evaluation : number_of_trainable_tokens=18421 sum_loss=15847.841
Batch 8 range: 700 : 800
	 Batch 8 Evaluation : number_of_trainable_tokens=18931 sum_loss=17274.959
Batch 9 range: 800 : 900
	 Batch 9 Evaluation : number_of_trainable_tokens=13457 sum_los
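The raw `sum_loss` values are hard to compare across batches because each batch contains a different number of tokens. A small sketch, assuming that dividing `sum_loss` by `number_of_trainable_tokens` gives a mean per-token loss (an interpretation, not something the Gradient output states), normalizes the first few logged values:

```python
# (number_of_trainable_tokens, sum_loss) pairs copied from the log above
logged_metrics = [
    (8007, 7168.5273),
    (8043, 5564.6777),
    (9278, 5753.655),
    (10454, 7111.75),
]


def mean_token_loss(tokens, sum_loss):
    """Assumed normalization: total loss divided by the number of trainable tokens."""
    return sum_loss / tokens


for batch_number, (tokens, sum_loss) in enumerate(logged_metrics, start=1):
    print(f"Batch {batch_number}: {mean_token_loss(tokens, sum_loss):.4f} loss per token")
```

Comparing per-token values rather than raw sums avoids mistaking a batch with longer samples for a batch on which the model performs worse.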

> # **5. MODEL INFERENCE**

In [15]:
from langchain.chains import LLMChain    # Import the LLMChain class for building LLM-based workflows
from langchain.llms import GradientLLM   # Import the GradientLLM class for interacting with Gradient AI's API
from langchain.prompts import PromptTemplate # Import the PromptTemplate class for defining how to prompt the LLM
import gradientai
import os # Import the os module for potential file system interactions

In [20]:
Fine_Tune__adapter_ID = "d189f721-ae17-4545-a0ad-f95194e857f5_model_adapter"
#Fine_Tune__adapter_ID = Fine_Tune__adapter.id
#  creating a GradientLLM object
llm = GradientLLM(
    model=Fine_Tune__adapter_ID,
    model_kwargs=dict(
        max_generated_token_count=128,
        temperature = 0.7
    ),
)

**Formatting prompts**


In [21]:
template = """### Instruction: {Instruction} \n\n### Response:"""

prompt = PromptTemplate(template=template, input_variables=["Instruction"])

In [22]:
llm_chain = LLMChain(prompt=prompt, llm=llm)

In [23]:
Question = "What is animal husbandry?"

Answer = llm_chain.run(Instruction=f"{Question}")
print(Answer)

Animal husbandry is the practice of raising livestock for commercial purposes, including meat, dairy, and other animal products.


In [24]:
import re
def Find_Instruction(input_string):
    input_pattern = r'<s>### Instruction:\n(.*?) \n'
    matches = re.findall(input_pattern, input_string, re.DOTALL)

    # If there are matches, extract the first one
    extracted_string = None
    if matches:
        extracted_string = matches[0]

    return extracted_string

question = train_dataset[0]["inputs"]
question = Find_Instruction(question)
print("question :\n\t", question)
Answer = llm_chain.run(Instruction=f"{question}")
print("Answer : \n\t", Answer)

question :
 Discuss the role of nutrition in animal husbandry.
Answer : 
 Nutrition plays a vital role in animal husbandry, affecting growth rates, reproductive success, and overall health in livestock.


> # **6. MODEL EVALUATION**

Here we use two popular automatic evaluation metrics to assess the performance of our LLM:

* BLEU score: This metric calculates the n-gram precision between the generated response and the reference response.

* ROUGE score: This metric measures the overlap in word n-grams and longest common subsequences between the generated response and the reference response.

Reasons for Choosing These Metrics:
* Both BLEU and ROUGE are widely used in evaluating text generation tasks, making them well-established and understood metrics.
* Both scores offer numerical values that can be easily compared and analyzed.

BLEU and ROUGE scores are calculated to compare the generated response with the target response.
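To show what these metrics measure, here is a hand-rolled sketch of clipped unigram precision (the building block of BLEU) and ROUGE-1 F1 on a toy sentence pair. It is a simplified illustration under assumed whitespace tokenization, not a replacement for the `nltk` and `rouge` implementations used below, which add higher-order n-grams, a brevity penalty, and other details.

```python
from collections import Counter


def unigram_overlap(hypothesis, reference):
    """Count hypothesis tokens that also occur in the reference (clipped by reference counts)."""
    hyp_counts = Counter(hypothesis.lower().split())
    ref_counts = Counter(reference.lower().split())
    return sum(min(count, ref_counts[token]) for token, count in hyp_counts.items())


def unigram_precision(hypothesis, reference):
    """Clipped unigram precision: overlap divided by the number of hypothesis tokens."""
    total = len(hypothesis.split())
    return unigram_overlap(hypothesis, reference) / total if total else 0.0


def rouge1_f1(hypothesis, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    overlap = unigram_overlap(hypothesis, reference)
    hyp_len, ref_len = len(hypothesis.split()), len(reference.split())
    if overlap == 0 or hyp_len == 0 or ref_len == 0:
        return 0.0
    precision, recall = overlap / hyp_len, overlap / ref_len
    return 2 * precision * recall / (precision + recall)


# Toy sentences, not taken from the dataset
reference = "nutrition affects the health and growth of livestock"
hypothesis = "nutrition affects livestock health"

print("Unigram precision:", unigram_precision(hypothesis, reference))
print("ROUGE-1 F1:", round(rouge1_f1(hypothesis, reference), 3))
```

In this toy pair every hypothesis word appears in the reference, so precision is perfect, while recall is lower because the hypothesis covers only half of the reference words. This is why a short but accurate answer can score well on precision-oriented metrics and poorly on recall-oriented ones.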

In [None]:

import re
import json
from nltk.translate.bleu_score import corpus_bleu
from rouge import Rouge
from langchain.chains import LLMChain
from langchain.llms import GradientLLM
from langchain.prompts import PromptTemplate
import os

In [None]:


def compute_rouge_scores(hypotheses, references):
    rouge = Rouge()
    scores = rouge.get_scores(hypotheses, references, avg=True)
    return scores


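def compute_bleu_score(target_response, llm_responses):
    # corpus_bleu expects, for each hypothesis, a list of tokenized references
    references = [[target_response.split()]]
    hypotheses = [llm_responses.split()]
    bleu_score = corpus_bleu(references, hypotheses)  # Calculate BLEU score
    return bleu_score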


def Find_Instruction(input_pattern, input_string):
    matches = re.findall(input_pattern, input_string, re.DOTALL)

    # If there are matches, extract the first one
    extracted_string = None
    if matches:
        extracted_string = matches[0]

    return extracted_string


def Evaluate(Sample=None, count=0):
    print("\n =================================== Evaluation =================================== ")
    input_pattern = r'<s>### Instruction:\n(.*?) \n'
    response_pattern = r'Response:\n(.*?)</s>'
    bleu_scoreS = []
    rouge_scoreS = []

    if count != 0:
        iteration = count - 1
    else:
        iteration = count

    while iteration >= 0:

        input_query = Find_Instruction(input_pattern, Sample[iteration]["inputs"])
        target_response = Find_Instruction(response_pattern, Sample[iteration]["inputs"])

        if input_query and target_response is not None:
            print("\n ---------------------------------------------------------------")
            print("INPUT QUERY:\n", input_query)
            print("\nTARGET RESPONSE:\n", target_response)

            llm_responses = llm_chain.run(Instruction=f"{input_query}")
            print("\nLLM RESPONSE:\n", llm_responses)

            rouge_scores = compute_rouge_scores(llm_responses, target_response)

            bleu_score = compute_bleu_score(target_response, llm_responses)
            print("\nBLEU Score:", bleu_score)
            print("ROUGE Scores:")
            print("\tROUGE-1 F1 Score:", rouge_scores["rouge-1"]["f"])
            print("\tROUGE-2 F1 Score:", rouge_scores["rouge-2"]["f"])
            print("\tROUGE-L F1 Score:", rouge_scores["rouge-l"]["f"])
            rouge_scoreS.append((rouge_scores["rouge-1"]["f"], rouge_scores["rouge-2"]["f"], rouge_scores["rouge-l"]["f"]))
            bleu_scoreS.append(bleu_score)


        iteration -= 1

    if count > 0:
        rouge_scores1 = 0
        rouge_scores2 = 0
        rouge_scores3 = 0
        bleu_scoreA = 0

        for i in bleu_scoreS:
            bleu_scoreA += i
        for i in rouge_scoreS:
            rouge_scores1 += i[0]
            rouge_scores2 += i[1]
            rouge_scores3 += i[2]

        print("\nAverage BLEU Score:", bleu_scoreA / count)
        print(f"Average ROUGE Scores for {count} samples")
        print("\tAverage ROUGE-1 F1 Score:", rouge_scores1 / count)
        print("\tAverage ROUGE-2 F1 Score:", rouge_scores2 / count)
        print("\tAverage ROUGE-L F1 Score:", rouge_scores3 / count)

    print("\n ---------------------------------------------------------------")




In [None]:
Evaluate(Sample=train_dataset, count=3)  # evaluate the first 3 samples



 ---------------------------------------------------------------
INPUT QUERY:
 How does technology contribute to advancements in animal husbandry?

TARGET RESPONSE:
 Technology in animal husbandry includes innovations like automated feeding systems, precision breeding techniques, and health monitoring devices. These advancements enhance efficiency, reduce costs, and improve overall management practices.

LLM RESPONSE:
 Technology contributes to advancements in animal husbandry by enhancing monitoring, efficiency, and decision-making through tools such as sensors, data analytics, and automation.

BLEU Score: 0
ROUGE Scores:
	ROUGE-1 F1 Score: 0.2916666617447917
	ROUGE-2 F1 Score: 0.08333332841145863
	ROUGE-L F1 Score: 0.2916666617447917

 ---------------------------------------------------------------
INPUT QUERY:
 Elaborate on the challenges faced in modern animal husbandry practices.

TARGET RESPONSE:
 Modern animal husbandry faces challenges such as disease management, ethical conc

The hypothesis contains 0 counts of 2-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
The hypothesis contains 0 counts of 3-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
The hypothesis contains 0 counts of 4-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()


> # **7. INTEGRATING RETRIEVAL-AUGMENTED GENERATION INTO OUR FINE-TUNED LLM**






In [None]:
!pip install gradient_haystack==0.2.0

Collecting gradient_haystack==0.2.0
  Downloading gradient_haystack-0.2.0-py3-none-any.whl (11 kB)
Collecting gradientai>=1.4.0 (from gradient_haystack==0.2.0)
  Downloading gradientai-1.7.0-py3-none-any.whl (270 kB)
Collecting haystack-ai (from gradient_haystack==0.2.0)
  Downloading haystack_ai-2.0.0b7-py3-none-any.whl (239 kB)
Collecting aenum>=3.1.11 (from gradientai>=1.4.0->gradient_haystack==0.2.0)
  Downloading aenum-3.1.15-py3-none-any.whl (137 kB)
Collecting pydantic<2.0.0,>=1.10.5 (from gradientai>=1.4.0->gradient_haystack==0.2.0)
  Downloading pydantic-1.10.14-cp310-cp310-manylinux_2_17_x86_64.ma

In [None]:
from gradient_haystack.embedders.gradient_document_embedder import GradientDocumentEmbedder
from gradient_haystack.embedders.gradient_text_embedder import GradientTextEmbedder
from gradient_haystack.generator.base import GradientGenerator
from haystack import Document, Pipeline
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory.document_store import InMemoryDocumentStore
from haystack.components.retrievers.in_memory.embedding_retriever import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.builders.answer_builder import AnswerBuilder
import os

In [None]:
# Replace the placeholders with your own credentials; never commit real tokens.
os.environ['GRADIENT_ACCESS_TOKEN'] = "<YOUR_GRADIENT_ACCESS_TOKEN>"
os.environ['GRADIENT_WORKSPACE_ID'] = "<YOUR_GRADIENT_WORKSPACE_ID>"


fine_tuned_Model_Id = "d189f721-ae17-4545-a0ad-f95194e857f5_model_adapter"

In [None]:
document_store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=document_store)


document_embedder = GradientDocumentEmbedder(
    access_token=os.environ["GRADIENT_ACCESS_TOKEN"],
    workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
)

with open("/content/Raw_Text_Data.txt", encoding="utf-8") as file:
    text_data = file.read()

docs = [
    Document(content=text_data)
]

indexing_pipeline = Pipeline()
indexing_pipeline.add_component(instance=document_embedder, name="document_embedder")
indexing_pipeline.add_component(instance=writer, name="writer")
indexing_pipeline.connect("document_embedder", "writer")
indexing_pipeline.run({"document_embedder": {"documents": docs}})

text_embedder = GradientTextEmbedder(
    access_token=os.environ["GRADIENT_ACCESS_TOKEN"],
    workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
)

generator = GradientGenerator(
    access_token=os.environ["GRADIENT_ACCESS_TOKEN"],
    workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
    model_adapter_id=fine_tuned_Model_Id,
    max_generated_token_count=350,
)



100%|██████████| 1/1 [00:00<00:00,  3.70it/s]
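The indexing step above embeds the whole text file as a single `Document`, so the retriever can only ever return that one document. A common refinement is to split the text into smaller overlapping chunks before embedding. The sketch below is one possible way to do that in plain Python. The helper name `split_into_chunks`, the chunk size, and the overlap are assumptions and were not part of the original notebook.

```python
def split_into_chunks(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks of `chunk_size` words sharing `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Toy text standing in for the contents of Raw_Text_Data.txt
sample_text = " ".join(f"word{i}" for i in range(500))
chunks = split_into_chunks(sample_text, chunk_size=200, overlap=40)
print(len(chunks), "chunks")

# Each chunk could then be wrapped as Document(content=chunk) before indexing.
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, at the cost of embedding some words twice.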


In [None]:
prompt = """You are a helpful assistant meant to answer questions relating to animal husbandry. Answer the query based on the
content in the documents. If you don't know the answer, say you don't know.
{{documents}}
Query: {{query}}
\nAnswer:
"""

retriever = InMemoryEmbeddingRetriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt)

rag_pipeline = Pipeline()
rag_pipeline.add_component(instance=text_embedder, name="text_embedder")
rag_pipeline.add_component(instance=retriever, name="retriever")
rag_pipeline.add_component(instance=prompt_builder, name="prompt_builder")
rag_pipeline.add_component(instance=generator, name="generator")
rag_pipeline.add_component(instance=AnswerBuilder(), name="answer_builder")
rag_pipeline.connect("generator.replies", "answer_builder.replies")
rag_pipeline.connect("retriever", "answer_builder.documents")
rag_pipeline.connect("text_embedder", "retriever")
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")


def LLM_Run(question):
    result = rag_pipeline.run(
        {
            "text_embedder": {"text": question},
            "prompt_builder": {"query": question},
            "answer_builder": {"query": question}
        }
    )
    return result["answer_builder"]["answers"][0].data

In [None]:
Query = "When is diarrhoea very risky?"
print(LLM_Run(Query))

Diarrhoea is very risky when it is caused by a viral infection, as it can lead to severe dehydration and electrolyte imbalances in the affected animal.
