In [1]:
%%capture
%pip install -q bitsandbytes
%pip install -q transformers
%pip install -q peft
%pip install -q accelerate
%pip install -q trl
%pip install -q torch
%pip install -q qdrant-client langchain pypdf sentence-transformers

## **Load all libraries**

In [2]:
!pip install langchain_community



In [3]:
%%capture
import os, torch
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, AutoConfig, TrainingArguments, pipeline
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from trl import SFTTrainer
from datasets import Dataset
from IPython.display import Markdown, display
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.vectorstores import Qdrant
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFacePipeline

<h3><strong>Know More about <a href="https://www.kaggle.com/code/lorentzyeung/what-s-4-bit-quantization-how-does-it-help-llama2">4-bit quantization</a></strong></h3>

In [4]:
model = "/kaggle/input/m/google/gemma/transformers/2b-it/2"

bnbConfig = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model, quantization_config=bnbConfig, device_map="auto")

model = AutoModelForCausalLM.from_pretrained(
    model,
    device_map = "auto",
    quantization_config=bnbConfig
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [5]:
system =  "You are a skilled software engineer who consistently produces high-quality Python code."
user = "Write a Python code to display text in a star pattern."

prompt = f"System: {system} \n User: {user} \n AI: "
    
inputs = tokenizer(prompt, return_tensors='pt', padding=True, truncation=True).to("cuda")

outputs = model.generate(**inputs, num_return_sequences=1, max_new_tokens=1000)

text = tokenizer.decode(outputs[0], skip_special_tokens=True)
Markdown(text.split("AI:")[1])

 

```python
# This Python code displays a text in a star pattern.

# Define the length of the star.
length = 5

# Print the star pattern.
for i in range(length):
    print("*", end="")
    
# Print the center star.
print("*")
```

**Output:**

```
    *
   ***
  *****
 *******
*********
```

# **3. Fine Tune Model**

## **Load the dataset**

In [6]:
# Load dataset
data = pd.read_csv("/kaggle/input/dataset-python-question-answer/Dataset_Python_Question_Answer.csv")

# Split into three equal parts
split_ratio = len(data) // 3
data_1, data_2, data_3 = data[:split_ratio], data[split_ratio:2*split_ratio], data[2*split_ratio:]

# Convert to Hugging Face datasets
dataset_1 = Dataset.from_pandas(data_1)
dataset_2 = Dataset.from_pandas(data_2)
dataset_3 = Dataset.from_pandas(data_3)

## **Define a formatting function for the model output.**

In [7]:
def formatting_func(example):
    template = "Instruction:\n{instruction}\n\nResponse:\n{response}"
    line = template.format(instruction=example['Question'], response=example['Answer'])
    return [line]

In [8]:
import os
os.environ["WANDB_DISABLED"] = "true"

In [9]:
lora_config = LoraConfig(
    r = 8,
    target_modules = ["q_proj", "o_proj", "k_proj", "v_proj",
                      "gate_proj", "up_proj", "down_proj"],
    task_type = "CAUSAL_LM",
)

In [10]:
# Define training function
def fine_tune_model(model, dataset, output_dir):
    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset,
        args=TrainingArguments(
            per_device_train_batch_size=1,
            gradient_accumulation_steps=4,
            warmup_steps=2,
            max_steps=50,
            learning_rate=2e-4,
            fp16=True,
            logging_steps=1,
            output_dir=output_dir,
            optim="paged_adamw_8bit"
        ),
        peft_config=lora_config,
        formatting_func=formatting_func,
    )
    trainer.train()
    return trainer

# Fine-tune three separate models
fine_tune_model(model, dataset_1, "outputs_model_1")
fine_tune_model(model, dataset_2, "outputs_model_2")
fine_tune_model(model, dataset_3, "outputs_model_3")

Map:   0%|          | 0/139 [00:00<?, ? examples/s]



Step,Training Loss
1,1.1249
2,1.1249
3,1.0225
4,0.8471
5,0.7294
6,0.6353
7,0.544
8,0.4628
9,0.3938
10,0.3329


Map:   0%|          | 0/139 [00:00<?, ? examples/s]



Step,Training Loss
1,1.3086
2,1.3086
3,1.1949
4,1.0176
5,0.8762
6,0.764
7,0.6685
8,0.5793
9,0.4945
10,0.4148


Map:   0%|          | 0/141 [00:00<?, ? examples/s]



Step,Training Loss
1,1.1474
2,1.1474
3,1.0349
4,0.8384
5,0.7171
6,0.6387
7,0.5567
8,0.4781
9,0.4101
10,0.35


<trl.trainer.sft_trainer.SFTTrainer at 0x7c3447b31ba0>

## **Test the Fine-Tuned Model**

In [11]:
system =  "You are a skilled software engineer who consistently produces high-quality Python code."
question =system + "What is the difference between a variable and an object"

prompt = f"Question: {question} \n Answer: "
    
inputs = tokenizer(prompt, return_tensors='pt', padding=True, truncation=True).to("cuda")

outputs = model.generate(**inputs, num_return_sequences=1, max_new_tokens=512)

text = tokenizer.decode(outputs[0], skip_special_tokens=True)

Markdown(text.split("Answer:")[1])

 

A variable is a named memory location that stores a single value. An object is a collection of related variables that are associated with a single logical unit. 

**Variables:**

* Are declared with the `=` operator.
* Are assigned a single value.
* Are used to store a single value of a specific type.
* Variables are declared within functions, but they are not created until they are used.
* Variables are used to store values in memory.

**Objects:**

* Are declared with the `class` keyword.
* Are created by calling a constructor function.
* Can contain multiple variables, each of which is associated with a unique logical unit.
* Objects are created when we need to create an instance of a class.
* Objects are used to represent real-world entities.

Here is an example that illustrates the difference between a variable and an object:

```python
# Variable
name = "John"

# Object
person = {"name": "John", "age": 30}
```

In this example, the `name` variable is a variable that stores the string "John". The `person` object is an object that contains two variables, "name" and "age", both of which are strings.

Variables are created using the `=` operator, while objects are created by calling a constructor function. Variables are used to store a single value of a specific type, while objects can contain multiple values of different types.

Variables are declared within functions, but they are not created until they are used. Objects are created when we need to create an instance of a class. Objects are used to represent real-world entities.

## **Load documents for RAG**

In [12]:
# Instantiate a PyPDFDirectoryLoader object with the specified directory path
pdf_loader = PyPDFDirectoryLoader("/kaggle/input/knowledge-base")

# Load PDF documents from the specified directory
pdfs = pdf_loader.load()

In [13]:
# import the HuggingFaceEmbeddings class, 
embeddings = HuggingFaceEmbeddings(
    # This argument specifies the pre-trained model name to be used for generating embeddings.
    # Here, "sentence-transformers/all-mpnet-base-v2" is a pre-trained sentence transformer model 
    # from the Sentence Transformers library (not Transformers).
    # Sentence transformer models are specifically trained to generate meaningful representations 
    # of sentences that capture semantic similarity.
    model_name="sentence-transformers/all-mpnet-base-v2",

    # This argument is likely specific to the HuggingFaceEmbeddings class and might 
    # not be present in the base Transformers library.
    # It sets the device to "cuda" to leverage the GPU for faster processing if available.
    model_kwargs={"device": "cuda"}
)

  embeddings = HuggingFaceEmbeddings(


In [14]:
# Instantiate a RecursiveCharacterTextSplitter object with specified parameters
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

# Split documents into chunks using the RecursiveCharacterTextSplitter
all_splits = text_splitter.split_documents(pdfs)

In [15]:
# Create a Qdrant collection from the document splits
# For storing and searching document information we use a vector database called Qdrant. 

qdrant_collection = Qdrant.from_documents(
    all_splits,                # List of document splits
    embeddings,                # HuggingFaceEmbeddings object for generating embeddings
    location=":memory:",       # Location to store the collection (in memory)
    collection_name="all_documents"  # Name of the Qdrant collection
)

In [16]:
# Create a retriever
retriever = qdrant_collection.as_retriever()

In [17]:
# This code creates a pipeline for text generation using a pre-trained model (model) 
# and its tokenizer (tokenizer). It leverages mixed precision (torch.bfloat16) 
# for potentially faster inference and limits generated text to 512 tokens.
pipeline = pipeline(
    "text-generation", 
    model=model, 
    tokenizer=tokenizer,
    model_kwargs = {"torch.dtype": torch.bfloat16},
    max_new_tokens=512    
)

In [19]:
question = "What is the difference between a variable and an object"

message = [
    {"role": "user", "content": question},
]

prompt = pipeline.tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True)

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    add_special_tokens=True,
    do_sample=True,
    temperature=0.7,
    top_k=10,
    top_p=0.95
)
Markdown(outputs[0]["generated_text"][len(prompt):])

**Variable**

* A variable is a named memory location that stores a single value.
* It is declared using the `=` operator.
* Variables can be used to store different values, but they are associated with a specific memory address.
* Variables are declared and initialized during the compilation phase.
* Changes to a variable will not affect other parts of the program.

**Object**

* An object is a collection of zero or more variables and methods that are associated with a specific memory address.
* It is created using the `new` keyword.
* Objects can contain references to other objects, allowing them to interact with each other.
* Objects are dynamically allocated memory.
* Objects can be used to encapsulate data and code, making them reusable.

**Example**

```python
# Variable
name = "John"

# Object
person = {"name": "John", "age": 30}
```

**Key Differences:**

| Feature | Variable | Object |
|---|---|---|
| Storage | Memory address | Memory address |
| Declaration | `=` | `new` |
| Value type | Any | Objects |
| Reusability | No | Yes |
| Scope | Global | Local |
| Lifetime | As long as the program is running | As long as it is referenced |

**Conclusion**

Variables are used to store individual values, while objects are used to store collections of related values and methods. Variables are declared and initialized manually, while objects are created dynamically.

In [20]:
gemma_llm = HuggingFacePipeline(
    pipeline=pipeline,
    model_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 512,
        "add_special_tokens": True,
        "do_sample": True,
        "top_k": 10,
        "top_p": 0.95
    },
)
# Create a RetrievalQA object
qa = RetrievalQA.from_chain_type(
    llm=gemma_llm,  # Pass the text-generation pipeline object
    chain_type="stuff",
    retriever=retriever  # retriever object
)

  gemma_llm = HuggingFacePipeline(


In [21]:
question = "Write in detail about python"
message = [
    {"role": "user", "content": question},
]

prompt = pipeline.tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True, truncation=True)
result = qa.invoke(prompt)
Markdown(result['result'].split('Helpful Answer:')[1])

 The context does not provide any information about Python, so I cannot answer this question from the provided context.

# Federated Learning

In [22]:
# Fine-tune three separate models and save them
trainer_1 = fine_tune_model(model, dataset_1, "outputs_model_1")
trainer_1.model.save_pretrained("model_1")  # Save model_1

trainer_2 = fine_tune_model(model, dataset_2, "outputs_model_2")
trainer_2.model.save_pretrained("model_2")  # Save model_2

trainer_3 = fine_tune_model(model, dataset_3, "outputs_model_3")
trainer_3.model.save_pretrained("model_3")  # Save model_3

Map:   0%|          | 0/139 [00:00<?, ? examples/s]



Step,Training Loss
1,1.1249
2,1.1249
3,1.0234
4,0.8496
5,0.7317
6,0.6388
7,0.5494
8,0.4695
9,0.4008
10,0.3402


Map:   0%|          | 0/139 [00:00<?, ? examples/s]



Step,Training Loss
1,1.3086
2,1.3086
3,1.1952
4,1.0176
5,0.8764
6,0.764
7,0.6686
8,0.5797
9,0.4944
10,0.4147


Map:   0%|          | 0/141 [00:00<?, ? examples/s]



Step,Training Loss
1,1.1474
2,1.1474
3,1.0351
4,0.8382
5,0.7169
6,0.6386
7,0.5565
8,0.478
9,0.4101
10,0.3501


In [23]:
def load_model(model_path, dtype=torch.float16, device="cpu"):
    """Load model with reduced precision and on CPU to save RAM."""
    return AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)

In [24]:
def federated_averaging(model_paths):
    """Perform federated averaging with memory optimization."""
    global_model = load_model(model_paths[0])
    global_model_state = global_model.state_dict()

    for key in global_model_state.keys():
        global_model_state[key] = global_model_state[key].to(torch.float32)  # Convert to float32 for accurate averaging

    num_models = len(model_paths)

    # Iterate through remaining models one by one to avoid memory overhead
    for model_path in model_paths[1:]:
        model = load_model(model_path)
        model_state = model.state_dict()

        for key in global_model_state.keys():
            global_model_state[key] += model_state[key].to(torch.float32)  # Accumulate in float32

        del model  # Free memory
        torch.cuda.empty_cache()

    # Compute final averaged parameters
    for key in global_model_state.keys():
        global_model_state[key] /= num_models  # Average across models

    # Reload the averaged weights into a model
    final_model = load_model(model_paths[0])  # Initialize from first model's structure
    final_model.load_state_dict(global_model_state)

    return final_model

# Define model paths instead of loading them all at once
model_paths = ["model_1", "model_2"]
global_model = federated_averaging(model_paths)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [25]:
# Save the federated averaged model
save_path = "global_model"
global_model.save_pretrained(save_path)
print(f"Global model saved at: {save_path}")


Global model saved at: global_model


In [27]:
# Load the global model
global_model = AutoModelForCausalLM.from_pretrained("global_model")

# Inspect the parameters
for name, param in global_model.named_parameters():
    print(f"{name}: {param.data}")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

model.embed_tokens.weight: tensor([[ 5.2344e-01, -3.5889e-02,  5.9814e-02,  ...,  7.7637e-02,
          2.3535e-01,  3.8330e-02],
        [ 1.5137e-01, -1.4453e-01, -1.1719e-01,  ..., -1.9409e-02,
          4.9133e-03, -2.0508e-02],
        [ 1.0352e-01,  5.0354e-03, -3.2715e-02,  ..., -1.7334e-02,
         -8.9111e-03, -1.0254e-02],
        ...,
        [ 2.7344e-01,  1.2390e-02,  4.2236e-02,  ..., -4.8584e-02,
          1.9165e-02, -3.0151e-02],
        [ 2.9102e-01, -6.4453e-02,  6.2012e-02,  ..., -1.9653e-02,
          7.1289e-02, -1.6689e-04],
        [ 5.2344e-01, -3.5156e-02,  6.0791e-02,  ...,  7.6172e-02,
          2.3828e-01,  3.9795e-02]])
model.layers.0.self_attn.q_proj.base_layer.weight: tensor([[-0.0001, -0.0031, -0.0053,  ...,  0.0056,  0.0104, -0.0052],
        [-0.0002, -0.0087, -0.0012,  ...,  0.0034,  0.0041,  0.0063],
        [-0.0005,  0.0011,  0.0011,  ..., -0.0062, -0.0064, -0.0018],
        ...,
        [-0.0001,  0.0030, -0.0053,  ..., -0.0028, -0.0004, -0.0083

In [32]:
from transformers import pipeline

# Create the pipeline with a different variable name
text_generator = pipeline(
    "text-generation",
    model=global_model,
    tokenizer=tokenizer,
)

# Your example question
question = "What is the difference between a variable and an object"

# Create the message format
message = [
    {"role": "user", "content": question},
]

# Apply the chat template
prompt = text_generator.tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True)

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=512,
    add_special_tokens=True,
    do_sample=True,
    temperature=0.7,
    top_k=10,
    top_p=0.95
)

# Display the result
Markdown(outputs[0]["generated_text"][len(prompt):])

Sure, here's the difference between a variable and an object:

**Variable:**

* A variable is a storage location that holds a single value.
* It is identified by a name and has a specific scope within the program.
* Once a variable is initialized, its value cannot be changed.
* Variables are commonly used to store data and make it accessible throughout the program.

**Object:**

* An object is a complex data structure that contains multiple variables and methods.
* It is an instance of a class, which defines the structure and behavior of the object.
* Objects can have their own values and can interact with each other.
* Objects are created from classes and can be used to represent real-world entities.

**Here's an example to illustrate the difference:**

```python
# Variable
name = "John"

# Object
person = {"name": "John", "age": 30, "city": "New York"}
```

In this example, `name` is a variable that stores a string, while `person` is an object that contains multiple variables and a class definition.

**Key differences:**

| Feature | Variable | Object |
|---|---|---|
| Scope | Local | Global or local |
| Value | Single | Multiple |
| Type | Any data type | Class or object type |
| Creation | Defined at initialization | Created when an object is created |
| Data structure | Simple (string) | Complex (data structure) |
| Interactivity | Read-only | Read-write |
| Use case | Storing and accessing data, making it accessible throughout the program | Creating complex data structures, representing real-world entities |

**In summary:**

* A variable is a storage location for a single value.
* An object is a complex data structure that contains multiple variables and methods.
* Objects can interact with each other and have their own values.

# Evaluation 

In [38]:
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.tokenize import word_tokenize
from nltk.metrics import jaccard_distance
import numpy as np

In [39]:
# Download required NLTK data
nltk.download('punkt')

def calculate_metrics(generated_text, reference_text):
    """
    Calculate various metrics for text generation evaluation
    
    Args:
        generated_text (str): The model generated text
        reference_text (str): The ground truth reference text
    
    Returns:
        dict: Dictionary containing various metric scores
    """
    metrics = {}
    
    # Tokenize texts
    generated_tokens = word_tokenize(generated_text.lower())
    reference_tokens = [word_tokenize(reference_text.lower())]
    
    # BLEU Score
    smoother = SmoothingFunction().method1
    try:
        bleu_score = sentence_bleu(reference_tokens, generated_tokens, 
                                 smoothing_function=smoother)
        metrics['bleu'] = bleu_score
    except Exception as e:
        metrics['bleu'] = 0
        print(f"BLEU score calculation failed: {e}")

    # Jaccard Similarity (1 - distance)
    gen_set = set(generated_tokens)
    ref_set = set(reference_tokens[0])
    try:
        jaccard_sim = 1 - jaccard_distance(gen_set, ref_set)
        metrics['jaccard_similarity'] = jaccard_sim
    except Exception as e:
        metrics['jaccard_similarity'] = 0
        print(f"Jaccard calculation failed: {e}")
    
    # Token overlap ratio
    common_tokens = len(gen_set.intersection(ref_set))
    metrics['token_overlap'] = common_tokens / len(ref_set)
    
    # Length metrics
    metrics['generated_length'] = len(generated_tokens)
    metrics['reference_length'] = len(reference_tokens[0])
    metrics['length_ratio'] = len(generated_tokens) / len(reference_tokens[0])
    
    return metrics

def evaluate_model(text_generator, eval_data):
    """
    Evaluate the model on a set of test examples
    
    Args:
        text_generator: The pipeline instance
        eval_data: List of tuples containing (question, reference_answer)
    
    Returns:
        dict: Aggregated metrics across all examples
    """
    all_metrics = []
    
    for question, reference in eval_data:
        # Generate response
        message = [{"role": "user", "content": question}]
        prompt = text_generator.tokenizer.apply_chat_template(
            message, tokenize=False, add_generation_prompt=True
        )
        
        outputs = text_generator(
            prompt,
            max_new_tokens=512,
            add_special_tokens=True,
            do_sample=True,
            temperature=0.7,
            top_k=10,
            top_p=0.95
        )
        
        generated_text = outputs[0]["generated_text"][len(prompt):]
        
        # Calculate metrics
        metrics = calculate_metrics(generated_text, reference)
        all_metrics.append(metrics)
        
        # Print individual results
        print(f"\nQuestion: {question}")
        print(f"Generated: {generated_text[:200]}...")
        print(f"Reference: {reference[:200]}...")
        print("Metrics:", {k: f"{v:.4f}" for k, v in metrics.items()})
    
    # Aggregate metrics
    aggregated_metrics = {}
    for metric in all_metrics[0].keys():
        values = [m[metric] for m in all_metrics]
        aggregated_metrics[f'avg_{metric}'] = np.mean(values)
        aggregated_metrics[f'std_{metric}'] = np.std(values)
    
    return aggregated_metrics

# Example usage:
eval_data = [
    (
        "What is the difference between a variable and an object?",
        "A variable is a named storage location that holds a value, while an object is an instance of a class that contains both data and methods."
    ),
    (
        "Explain what is inheritance in programming?",
        "Inheritance is a fundamental concept in object-oriented programming where a class can inherit properties and methods from another class. This promotes code reuse and establishes a relationship between parent and child classes."
    )
]

# Run evaluation
print("Running evaluation...")
metrics = evaluate_model(text_generator, eval_data)

print("\nAggregated Metrics:")
for metric, value in metrics.items():
    print(f"{metric}: {value:.4f}")

[nltk_data] Downloading package punkt to /usr/share/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Running evaluation...

Question: What is the difference between a variable and an object?
Generated: Sure, here's the difference between a variable and an object:

**Variable:**

* A variable is a memory location that stores a single value.
* It is declared using a keyword (e.g., `int age;`) and assi...
Reference: A variable is a named storage location that holds a value, while an object is an instance of a class that contains both data and methods....
Metrics: {'bleu': '0.0199', 'jaccard_similarity': '0.1164', 'token_overlap': '0.7727', 'generated_length': '341.0000', 'reference_length': '28.0000', 'length_ratio': '12.1786'}

Question: Explain what is inheritance in programming?
Generated: Sure. Here's a detailed explanation of inheritance in programming:

**Inheritance** is a mechanism in object-oriented programming (OOP) where a new class is created that inherits prope