# **Deep Learning for NLP**
**-Aniruddha Banerjee**

# **Project Objectives:**
Industry Selection: Students must select one industry from the list provided. This industry will be the focus of their project, including data collection and model training.


Data Collection: Gather relevant data specific to the chosen industry. This data will be used to fine-tune the pre-trained model to ensure the LLM Bot is knowledgeable and contextually aware of industry-specific information.



Model Selection and Training: Utilize any pre-trained model from Hugging Face or similar platforms. Fine-tune the model on the collected data using resources like Google Colab with T4 GPUs, limiting the training to a maximum of 25 epochs to ensure feasibility.



Bot Development: Develop the LLM Bot that can interact with users, providing answers and engaging in meaningful conversations specific to the chosen industry. The bot should demonstrate the ability to understand and process industry-related queries effectively.



Demonstration: Create an explanatory video showcasing the working of the LLM Bot. The video should highlight the bot's ability to handle industry-specific questions, demonstrating its practical application.

# **Installing the Required Packages and Libraries**

In [None]:
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
kaggle-environments 1.14.11 requires transformers>=4.33.1, but you have transformers 4.31.0 which is incompatible.

This command installs pinned versions of five packages used for fine-tuning. The dependency conflict reported above concerns kaggle-environments, which this notebook does not import. The packages are:

accelerate (version 0.21.0): This library is part of the Hugging Face ecosystem and is designed to streamline the process of training and deploying machine learning models on various hardware accelerators (e.g., GPUs, TPUs).

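peft (version 0.4.0): PEFT stands for "Parameter-Efficient Fine-Tuning." It fine-tunes a pre-trained model by training only a small number of additional parameters (for example, LoRA adapters) while the original weights stay frozen, which makes the process far cheaper in memory and compute.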

bitsandbytes (version 0.40.2): This library is used for 8-bit optimizers and quantization of neural networks, which can significantly reduce memory usage and computational costs during training and inference.

transformers (version 4.31.0): This is the main library from Hugging Face that provides implementations of state-of-the-art transformer models for natural language understanding and generation tasks.

trl (version 0.4.7): TRL stands for "Transformer Reinforcement Learning." The library provides trainers for supervised fine-tuning (the SFTTrainer used in this notebook) as well as for reinforcement learning methods applied to language models.

Together, these packages facilitate the development, training, fine-tuning, and deployment of advanced machine learning models, particularly those based on transformer architectures.
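
As a quick sanity check after installation, the pinned versions can be compared with what is actually installed in the environment. The following is a minimal sketch that uses only the Python standard library; the helper name `check_pins` is hypothetical and is not part of any of the packages above.

```python
from importlib import metadata

# Versions pinned in the pip command above
PINNED = {
    "accelerate": "0.21.0",
    "peft": "0.4.0",
    "bitsandbytes": "0.40.2",
    "transformers": "4.31.0",
    "trl": "0.4.7",
}

def check_pins(pins):
    """Return {package: (wanted, found, matches)}; found is None if not installed."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report

for name, (wanted, found, ok) in check_pins(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: wanted {wanted}, found {found} [{status}]")
```

Running this in a fresh session shows whether another package has silently upgraded or downgraded one of the pins.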

In [None]:
!pip install huggingface_hub



The command !pip install huggingface_hub installs the Hugging Face Hub library, which is a crucial tool for interacting with the Hugging Face model and dataset repositories.

In [None]:
import os
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel
from trl import SFTTrainer

2024-07-08 07:47:49.926511: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-07-08 07:47:49.926646: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-07-08 07:47:50.061199: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


The provided import statements are bringing in various tools and libraries that are essential for machine learning, specifically for working with natural language processing (NLP) models. Here's a brief explanation of each import and its purpose:

Standard Library
os: Provides a way to interact with the operating system, including file and directory manipulation.


PyTorch
torch: The main library for tensor operations and building neural networks in PyTorch.


Hugging Face Datasets
load_dataset: A function from the datasets library that allows you to load various datasets directly from the Hugging Face Hub



Hugging Face Transformers
AutoModelForCausalLM: A class that automatically loads a pre-trained causal language model for tasks like text generation.


AutoTokenizer: A class that automatically loads the appropriate tokenizer for a given model.


BitsAndBytesConfig: A configuration class for loading a model with bitsandbytes quantization (8-bit or 4-bit).


HfArgumentParser: A parser for command-line arguments that integrates well with Hugging Face's training scripts.


TrainingArguments: A class that stores all the arguments needed to train a model, including learning rate, batch size, and more.


pipeline: A high-level API for running various NLP tasks (e.g., text generation, text classification) using pre-trained models.


logging: A module for setting up logging for the transformers library.


Parameter-Efficient Fine-Tuning (PEFT)
LoraConfig: A configuration class for LoRA (Low-Rank Adaptation), which is a technique for parameter-efficient fine-tuning.


PeftModel: A class that wraps a base model with PEFT adapters, such as LoRA.


TRL
SFTTrainer: A trainer class from the trl library for supervised fine-tuning (SFT) of language models on text datasets, with built-in support for PEFT configurations such as LoRA.


These imports collectively allow for building, training, and fine-tuning state-of-the-art NLP models efficiently, leveraging advanced techniques such as parameter-efficient fine-tuning and model quantization.

In [None]:
torch.cuda.is_available()

True

# **Loading and Preparing Datasets for NLP Model Training**
The following code snippet demonstrates how to load datasets from the Hugging Face Hub, which are then used for training a natural language processing (NLP) model. Specifically, it loads two datasets related to financial texts and prepares them for training.

In [None]:
from datasets import load_dataset, DatasetDict, concatenate_datasets
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments


# Load dataset
dataset1 = load_dataset("poornima9348/finance-alpaca-1k-test")
dataset2 = load_dataset("ssbuild/alpaca_finance_en")


In [None]:
dataset1

DatasetDict({
    test: Dataset({
        features: ['instruction', 'input', 'output', 'text'],
        num_rows: 1000
    })
})

In [None]:
dataset2

DatasetDict({
    train: Dataset({
        features: ['id', 'instruction', 'input', 'output'],
        num_rows: 68912
    })
})

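The output shows that dataset1 contains only a test split of 1,000 rows with the columns instruction, input, output, and text, while dataset2 contains only a train split of 68,912 rows with the columns id, instruction, input, and output.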

# **Transforming and Cleaning the Datasets for Model Training**
In this code snippet, we transform and clean the datasets to prepare them for training a language model. Specifically, we combine the instruction and output columns into a new text column (the input column is not used) and then remove every other column, so that both datasets end up with the same single text feature and can be concatenated later.

In [None]:
# Combine 'instruction' and 'output' columns into a new 'text' column
def combine_text_columns(example):
    return {'text': f"{example['instruction']} ### {example['output']}"}

# Apply the function to each example in the dataset
dataset1 = dataset1.map(combine_text_columns)
dataset2 = dataset2.map(combine_text_columns)

# Remove 'instruction', 'input' and 'output' columns
dataset1['test']=dataset1['test'].remove_columns(['instruction','input', 'output'])
dataset2['train']=dataset2['train'].remove_columns(['instruction','input', 'output','id'])

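
To make the effect of the mapping concrete, here is a small sketch that applies the same function to a single plain dictionary. The record is invented for illustration and does not come from either dataset.

```python
def combine_text_columns(example):
    # Same logic as the notebook cell: join instruction and output with ' ### '
    return {'text': f"{example['instruction']} ### {example['output']}"}

# Hypothetical record with the same columns as the datasets
record = {
    'instruction': "What is compound interest?",
    'input': "",
    'output': "Interest earned on both the principal and previously accumulated interest.",
}

combined = combine_text_columns(record)
print(combined['text'])
```

The function returns only the new `text` field. When used with `Dataset.map`, that field is added to (or overwrites) the existing columns, which is why the other columns still have to be removed explicitly afterwards.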

# **Performing Train-Test Split on the Datasets**
In this section, we perform a train-test split on the datasets to prepare separate training and testing subsets. This is a crucial step in machine learning to evaluate the performance of the model on unseen data.

In [None]:
# Perform the train-test split on the necessary dataset if required
split_dataset1 = dataset1['test'].train_test_split(train_size=0.8)
split_dataset2 = dataset2['train'].train_test_split(test_size=0.2)

In [None]:
split_dataset1

DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 800
    })
    test: Dataset({
        features: ['text'],
        num_rows: 200
    })
})

In [None]:
split_dataset2

DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 55129
    })
    test: Dataset({
        features: ['text'],
        num_rows: 13783
    })
})
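
The row counts above can be reproduced with a little arithmetic. The sketch below assumes the rounding convention used by scikit-learn style splits (round the test size up when `test_size` is given, round the train size down when `train_size` is given); this assumption matches the printed sizes, but the helper `split_sizes` itself is hypothetical.

```python
import math

def split_sizes(n_rows, test_size=None, train_size=None):
    """Return (n_train, n_test) for a fractional test_size or train_size."""
    if test_size is not None:
        n_test = math.ceil(test_size * n_rows)
        n_train = n_rows - n_test
    else:
        n_train = math.floor(train_size * n_rows)
        n_test = n_rows - n_train
    return n_train, n_test

print(split_sizes(1000, train_size=0.8))   # dataset1
print(split_sizes(68912, test_size=0.2))   # dataset2
```

Because no seed is passed to `train_test_split` in the notebook, the sizes are reproducible but the exact rows in each split will differ from run to run.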

# **Concatenating and Merging Datasets for Training**
In this section, we concatenate the training and testing splits from the two datasets to create a single, unified dataset. This step combines the data, ensuring that the model is trained and evaluated on a larger and more diverse dataset.

In [None]:
# Concatenate the datasets
merged_train = concatenate_datasets([split_dataset1['train'], split_dataset2['train']])
merged_test = concatenate_datasets([split_dataset1['test'], split_dataset2['test']])

# Create a new DatasetDict with the merged datasets
merged_dataset = DatasetDict({
    'train': merged_train,
    'test': merged_test
})

# Filter out None values in case some splits are missing
merged_dataset = DatasetDict({k: v for k, v in merged_dataset.items() if v is not None})

# Print the merged dataset to verify
print(merged_dataset)

DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 55929
    })
    test: Dataset({
        features: ['text'],
        num_rows: 13983
    })
})


# **Shuffling, Slicing, and Transforming the Dataset**
In this section, we shuffle and slice the training dataset, then transform the conversation text into a new format suitable for training an NLP model.

In [None]:
# Shuffle the dataset and slice it
merged_train_dataset = merged_dataset['train'].shuffle(seed=42).select(range(5000))

def transform_conversation(example):
    conversation_text1 = example['text']
    segments = conversation_text1.split('###')

    reformatted_segments = []

    # Iterate over the segments and ensure each segment has a prompt and answer
    for i in range(0, len(segments) - 1, 2):
        prompt = segments[i].strip()
        if i + 1 < len(segments):
            answer = segments[i + 1].strip()
            # Apply the new template
            reformatted_segments.append(f'<s>[INST] {prompt} [/INST] {answer} </s>')
        else:
            # Handle the case where there is no corresponding assistant segment
            reformatted_segments.append(f'<s>[INST] {prompt} [/INST] </s>')

    return {'text': ''.join(reformatted_segments)}

# Apply the transformation
transformed_dataset = merged_train_dataset.map(transform_conversation)

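
The behaviour of `transform_conversation` is easiest to see on a few plain dictionaries. The sketch below copies the function from the cell above and runs it on invented examples, including two edge cases worth knowing about: text without a `###` separator, and text with more than one separator.

```python
def transform_conversation(example):
    segments = example['text'].split('###')
    reformatted_segments = []
    # Walk over (prompt, answer) pairs
    for i in range(0, len(segments) - 1, 2):
        prompt = segments[i].strip()
        if i + 1 < len(segments):
            answer = segments[i + 1].strip()
            reformatted_segments.append(f'<s>[INST] {prompt} [/INST] {answer} </s>')
        else:
            reformatted_segments.append(f'<s>[INST] {prompt} [/INST] </s>')
    return {'text': ''.join(reformatted_segments)}

# Hypothetical examples, not taken from the datasets
normal = transform_conversation({'text': "What is a bond? ### A bond is a loan to an issuer."})
no_separator = transform_conversation({'text': "no separator here"})
extra_separator = transform_conversation({'text': "Q ### A ### extra"})

print(repr(normal['text']))
print(repr(no_separator['text']))
print(repr(extra_separator['text']))
```

Two things follow from the loop bounds: an example with no `###` produces an empty string, and an unpaired trailing segment (here `extra`) is dropped. The `else` branch is never reached, because `i + 1` is always a valid index inside the loop.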

We repeat the shuffle, slice, and transform steps on the test split, with one difference: the slice is much smaller.
Dataset Size: The training dataset is larger (5000 examples) compared to the test dataset (100 examples).
Purpose: Training data is prepared in larger quantities to help the model learn, while test data is smaller and used to evaluate model performance.

In [None]:
# Shuffle the dataset and slice it
merged_test_dataset = merged_dataset['test'].shuffle(seed=42).select(range(100))

# transform_conversation is the same function defined in the previous cell, so it is reused here

# Apply the transformation
transformed_test_dataset = merged_test_dataset.map(transform_conversation)


In [None]:
merged_test_dataset['text'][0]

'In USA, what circumstances (if any) make it illegal for a homeless person to “rent” an address? ### It depends on the rules in the specific places you stay.  Specific places being countries or states.   Some states may consider pension payments to be taxable income, others may not.  Some may consider presence for X days to constitute residency, X days may be 60 days in a calendar year whether or not those days are continuous.   It doesn\'t matter so much where your mailbox or mail handling service is located, it matters: You may owe taxes in more than one place.  Some states will allow you to offset other states\' taxes against theirs.  Some states in the US are really harsh on income taxes.  It\'s my understanding that if you own real estate in New York, all of your income, no matter the source, is taxable income in New York whether or not you were ever in the state that year. Ultimately, you can\'t just put up your hand and say, "that\'s my tax domicile so I\'m exempt from all your 

# **Training Configuration for Fine-Tuning a Model**
This code sets up the fine-tuning of a pre-trained model from the Hugging Face Hub. It specifies the QLoRA parameters (LoRA rank, scaling factor, and dropout), enables 4-bit loading of the base model with bitsandbytes to reduce memory use, and defines training arguments such as learning rate, batch size, and gradient accumulation. It also sets the maximum sequence length, packing, and checkpointing options. Training is configured for a single epoch, well within the 25-epoch limit stated in the project objectives.

In [None]:
# The model that you want to train from the Hugging Face hub
model_name = "NousResearch/Llama-2-7b-chat-hf"


# Fine-tuned model name
new_model = "Llama-2-7b-finance-chatbot-finetune"

################################################################################
# QLoRA parameters
################################################################################

# LoRA attention dimension
lora_r = 64

# Alpha parameter for LoRA scaling
lora_alpha = 16

# Dropout probability for LoRA layers
lora_dropout = 0.1

################################################################################
# bitsandbytes parameters
################################################################################

# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

################################################################################
# TrainingArguments parameters
################################################################################

# Output directory where the model predictions and checkpoints will be stored
output_dir = "./results"

# Number of training epochs
num_train_epochs = 1

# Enable fp16/bf16 training (set bf16 to True with an A100)
fp16 = False
bf16 = False

# Batch size per GPU for training
per_device_train_batch_size = 4

# Batch size per GPU for evaluation
per_device_eval_batch_size = 4

# Number of update steps to accumulate the gradients for
gradient_accumulation_steps = 1

# Enable gradient checkpointing (note: this variable is not passed to TrainingArguments in the next cell)
gradient_checkpointing = True

# Maximum gradient norm (gradient clipping)
max_grad_norm = 0.3

# Initial learning rate (AdamW optimizer)
learning_rate = 2e-4

# Weight decay to apply to all layers except bias/LayerNorm weights
weight_decay = 0.001

# Optimizer to use
optim = "paged_adamw_32bit"

# Learning rate schedule
lr_scheduler_type = "cosine"

# Number of training steps (overrides num_train_epochs)
max_steps = -1

# Ratio of steps for a linear warmup (from 0 to learning rate)
warmup_ratio = 0.03

# Group sequences into batches with same length
# Saves memory and speeds up training considerably
group_by_length = True

# Save checkpoint every X updates steps
save_steps = 0

# Log every X updates steps
logging_steps = 25

################################################################################
# SFT parameters
################################################################################

# Maximum sequence length to use
max_seq_length = 350

# Pack multiple short examples in the same input sequence to increase efficiency
packing = False

# Load the entire model on the GPU 0
device_map = {"": 0}
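
To give a feel for what `lora_r = 64` and `lora_alpha = 16` mean in practice, the following sketch works out the LoRA scaling factor and an estimate of the number of trainable parameters. It rests on assumptions that are not stated in the notebook: a hidden size of 4096 and 32 decoder layers for Llama-2-7B, and adapters on the `q_proj` and `v_proj` projections only (the LoraConfig used later does not set `target_modules`, so the library default is assumed).

```python
lora_r = 64       # rank, from the configuration cell
lora_alpha = 16   # scaling numerator, from the configuration cell

# LoRA scales the low-rank update by alpha / r
scaling = lora_alpha / lora_r

def lora_params(in_features, out_features, r):
    """Parameters added by one adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)

# Assumed Llama-2-7B shapes and adapted modules (see lead-in)
HIDDEN = 4096
N_LAYERS = 32
ADAPTED_MODULES_PER_LAYER = 2  # q_proj and v_proj

per_module = lora_params(HIDDEN, HIDDEN, lora_r)
trainable = per_module * ADAPTED_MODULES_PER_LAYER * N_LAYERS

print(f"scaling factor: {scaling}")
print(f"parameters per adapted module: {per_module:,}")
print(f"estimated trainable parameters: {trainable:,}")
```

Under these assumptions only a fraction of a percent of the roughly 7 billion base parameters is trained, which is what makes fine-tuning feasible on a single small GPU.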

# **Fine-Tuning Configuration for a Pre-Trained Model**

In [None]:
# Load Reformatted dataset
dataset = transformed_dataset

# Load tokenizer and model with QLoRA configuration
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

# Check GPU compatibility with bfloat16
if compute_dtype == torch.float16 and use_4bit:
    major, _ = torch.cuda.get_device_capability()
    if major >= 8:
        print("=" * 80)
        print("Your GPU supports bfloat16: accelerate training with bf16=True")
        print("=" * 80)

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map
)
model.config.use_cache = False
model.config.pretraining_tp = 1

# Load LLaMA tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right" # Fix weird overflow issue with fp16 training

# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

# Set training parameters
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard"
)

# Set supervised fine-tuning parameters
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)

# Train model
trainer.train()




You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


| Step | Training Loss |
|------|---------------|
| 25 | 2.4262 |
| 50 | 2.0102 |
| 75 | 1.7465 |
| 100 | 1.5233 |
| 125 | 1.7959 |
| 150 | 1.531 |
| 175 | 1.8101 |
| 200 | 1.4535 |
| 225 | 1.782 |
| 250 | 1.516 |


TrainOutput(global_step=625, training_loss=1.6522029724121094, metrics={'train_runtime': 2917.1263, 'train_samples_per_second': 1.714, 'train_steps_per_second': 0.214, 'total_flos': 1.263688331624448e+16, 'train_loss': 1.6522029724121094, 'epoch': 1.0})
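
The reported `global_step=625` can be checked against the configuration. With 5000 examples and a per-device batch size of 4, a single GPU would need 1250 steps for one epoch, so the logged value suggests that two GPUs were visible to the trainer. That is an inference from the numbers, not something the notebook states. The helper below is a simplified sketch and ignores corner cases in how the Trainer rounds with gradient accumulation.

```python
import math

def optimizer_steps(n_examples, per_device_batch, grad_accum, n_devices, epochs=1):
    """Approximate number of optimizer steps for a training run."""
    effective_batch = per_device_batch * grad_accum * n_devices
    return math.ceil(n_examples / effective_batch) * epochs

one_gpu = optimizer_steps(5000, 4, 1, n_devices=1)
two_gpus = optimizer_steps(5000, 4, 1, n_devices=2)  # assumed setup

# Cross-check the throughput figures from TrainOutput
train_runtime = 2917.1263
samples_per_second = round(5000 / train_runtime, 3)
steps_per_second = round(two_gpus / train_runtime, 3)

print(one_gpu, two_gpus, samples_per_second, steps_per_second)
```

Both throughput figures agree with the `train_samples_per_second` and `train_steps_per_second` values in the output above.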

**Steps in the Code:**

Model Selection: Sets the pre-trained model (NousResearch/Llama-2-7b-chat-hf) and specifies the name for the fine-tuned model (Llama-2-7b-finance-chatbot-finetune).

QLoRA Parameters: Configures Low-Rank Adaptation (LoRA) parameters including attention dimension, scaling factor, and dropout probability.

bitsandbytes Parameters: Enables 4-bit precision for efficient computation, with settings for data type, quantization type, and optional nested quantization.

TrainingArguments: Defines training parameters such as output directory, number of epochs, batch sizes, learning rate, optimizer, and checkpointing/logging settings.

SFT Parameters: Specifies sequence length, packing of short examples, and device mapping for model loading and training.
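
The benefit of 4-bit loading can be estimated with a back-of-the-envelope calculation. The sketch below assumes about 6.74 billion parameters for Llama-2-7B and counts weight storage only; it ignores quantization constants, layers that stay in higher precision, activations, and optimizer state, so real usage will be higher.

```python
def weight_gb(n_params, bits_per_param):
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 6.74e9  # assumed parameter count for Llama-2-7B

fp16_gb = weight_gb(N_PARAMS, 16)
nf4_gb = weight_gb(N_PARAMS, 4)

print(f"fp16 weights:  ~{fp16_gb:.2f} GB")
print(f"4-bit weights: ~{nf4_gb:.2f} GB")
```

On a T4 with 16 GB of memory, the fp16 weights alone would leave almost no room for training, whereas the 4-bit weights leave most of the memory free for activations and the LoRA adapter.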

# **Saving and Verifying the Trained Model**

In [None]:
# Save trained model
trainer.model.save_pretrained(new_model)

In [None]:
# List the contents to ensure files are saved
print("Contents of new_model directory:", os.listdir(new_model))

Contents of new_model directory: ['adapter_config.json', 'README.md', 'adapter_model.bin']


# **Running Text Generation with the Fine-Tuned Model**

In [None]:
# Ignore warnings
logging.set_verbosity(logging.CRITICAL)

# Run text generation pipeline with our fine-tuned model
prompt ="Generate a title for a blog about the Nobel Prize ceremony.'"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=100)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])



<s>[INST] Generate a title for a blog about the Nobel Prize ceremony.' [/INST] The Nobel Prize Ceremony: A Celebration of Excellence.

The Nobel Prize ceremony is an annual event that recognizes the achievements of individuals who have made significant contributions to their respective fields. It is a celebration of excellence and a testament to the power of human ingenuity. The ceremony is a time for the recipients to be recognized for


**Steps Walkthrough**
Ignore Warnings: Set the logging level to show only critical errors.

Initialize Text Generation Pipeline: Create a pipeline for text generation using the fine-tuned model and tokenizer, with max_length=100 limiting the total number of tokens (prompt plus generated text), which is why the output above stops mid-sentence.

Generate Text: Use the pipeline to generate text based on a formatted prompt.

Print Generated Text: Output the generated text to the console.
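
Since the pipeline returns the prompt together with the completion, a chatbot front end usually needs to wrap the user message in the training template and then strip the echoed prompt from the result. The two helpers below are a minimal sketch of that; the names `build_prompt` and `extract_answer` are hypothetical, and the sample string is the first line of the output shown above.

```python
def build_prompt(user_message):
    """Wrap a user message in the same template used for training."""
    return f"<s>[INST] {user_message.strip()} [/INST]"

def extract_answer(generated_text):
    """Return only the text after the first [/INST] marker."""
    return generated_text.split("[/INST]", 1)[-1].strip()

prompt = build_prompt("Generate a title for a blog about the Nobel Prize ceremony.")

# First line of the generated text shown in the cell output above
sample_output = (
    "<s>[INST] Generate a title for a blog about the Nobel Prize ceremony.' [/INST] "
    "The Nobel Prize Ceremony: A Celebration of Excellence."
)
answer = extract_answer(sample_output)

print(prompt)
print(answer)
```

In the notebook these helpers would sit around the `pipe(...)` call: `build_prompt` before it and `extract_answer` applied to `result[0]['generated_text']` after it.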

**Launch TensorBoard to Visualize Training Logs**

In [None]:
%load_ext tensorboard
%tensorboard --logdir results/runs

# **Model Reloading, Merging, and Saving**

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import os
import shutil

# Define model_name and new_model
model_name = "NousResearch/Llama-2-7b-chat-hf"
new_model = "Llama-2-7b-finance-chatbot-finetune"

# Clear GPU memory
torch.cuda.empty_cache()

# Ensure the directory exists
if not os.path.exists(new_model):
    os.makedirs(new_model)

# Define the offload directory
offload_dir = "/kaggle/working/"

# Ensure the offload directory exists
if not os.path.exists(offload_dir):
    os.makedirs(offload_dir)

try:
    # Reload model in FP16 and merge it with LoRA weights
    base_model = AutoModelForCausalLM.from_pretrained(
        model_name,
        low_cpu_mem_usage=True,
        return_dict=True,
        torch_dtype=torch.float16,
        device_map="auto",  # Automatically splits the model across available GPUs
        offload_folder=offload_dir  # Offload to the specified directory
    )

    model = PeftModel.from_pretrained(base_model, new_model)
    model = model.merge_and_unload()

    # Reload tokenizer to save it
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"

    # Save the model and tokenizer
    model.save_pretrained(new_model)
    tokenizer.save_pretrained(new_model)

    # List the contents to ensure files are saved
    print("Contents of new_model directory:", os.listdir(new_model))

    # Zip the new_model directory
    #shutil.make_archive(new_model, 'zip', new_model)

    # Download the zipped file
    #files.download(new_model + ".zip")

except RuntimeError as e:
    if "out of memory" in str(e):
        print("Out of memory error. Try using a smaller model or increasing GPU memory.")
        torch.cuda.empty_cache()
    else:
        raise e
except ValueError as e:
    print(f"ValueError: {e}")



Contents of new_model directory: ['adapter_config.json', 'README.md', 'pytorch_model-00002-of-00002.bin', 'adapter_model.bin', 'tokenizer.json', 'special_tokens_map.json', 'added_tokens.json', 'tokenizer_config.json', 'tokenizer.model', 'pytorch_model.bin.index.json', 'pytorch_model-00001-of-00002.bin', 'config.json', 'generation_config.json']


**Steps in the Code:**

Clear GPU Memory: Empty the GPU cache to free up memory.

Ensure Directory Exists: Create directories for saving the new model and offloading if they do not already exist.

Reload Model: Load the base model in FP16 precision, merge it with LoRA weights, and offload to a specified directory.

Reload and Configure Tokenizer: Load the tokenizer, configure padding settings, and save both the model and tokenizer.

Verify and Handle Errors: List the directory contents to confirm saving, handle out-of-memory errors, and raise other exceptions if they occur.
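
Merging means folding the low-rank update into the base weight, so that the merged weight is W0 + (alpha / r) * B * A and no adapter is needed at inference time. The toy example below illustrates this with plain Python lists. It is a sketch of the idea, not of the library's implementation: the matrices are invented, and a rank-1 adapter is used for readability even though the scaling factor is computed from the notebook's values.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    return [[s * x for x in row] for row in a]

lora_alpha, lora_r = 16, 64
scaling = lora_alpha / lora_r        # 0.25, as in the notebook's configuration

# Toy values (hypothetical): 2x2 base weight and a rank-1 adapter
W0 = [[1.0, 2.0], [3.0, 4.0]]
B = [[4.0], [8.0]]                   # shape: out_features x rank
A = [[1.0, -1.0]]                    # shape: rank x in_features
x = [[2.0], [1.0]]                   # input as a column vector

# Merge: fold the scaled low-rank update into the base weight
W_merged = add(W0, scale(matmul(B, A), scaling))

# Forward pass with a separate adapter vs. with the merged weight
y_adapter = add(matmul(W0, x), scale(matmul(B, matmul(A, x)), scaling))
y_merged = matmul(W_merged, x)

print(W_merged, y_adapter, y_merged)
```

Both forward passes give the same result, which is why the merged model can be saved and loaded like an ordinary checkpoint.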

# **Pushing Model and Tokenizer to Hugging Face Hub**

In [None]:
import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [None]:
from huggingface_hub import login, whoami
from kaggle_secrets import UserSecretsClient
import os

# Step 1: Retrieve the Hugging Face token from Kaggle Secrets
user_secrets = UserSecretsClient()
hf_token = user_secrets.get_secret("chatbot")

# Step 2: Login using the Hugging Face token
login(token=hf_token)


# Step 3: Push the model and tokenizer to the Hugging Face Hub
model_repo_name = "anirudh/finance_chatbot"
tokenizer_repo_name = "anirudh/finance_chatbot"

# Push the model to the hub
model.push_to_hub(model_repo_name, use_auth_token=hf_token)

# Push the tokenizer to the same repository
tokenizer.push_to_hub(tokenizer_repo_name, use_auth_token=hf_token)


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful



CommitInfo(commit_url='https://huggingface.co/tito92/finance_finetune_model/commit/b3e1572b6b3f267b09be970ef8bfb91300cf812f', commit_message='Upload tokenizer', commit_description='', oid='b3e1572b6b3f267b09be970ef8bfb91300cf812f', pr_url=None, pr_revision=None, pr_num=None)

**Steps in the Code:**

Set Preferred Encoding: Ensure the preferred encoding is set to "UTF-8" to handle text correctly.

Retrieve Hugging Face Token: Obtain the Hugging Face authentication token from Kaggle Secrets.

Login to Hugging Face: Authenticate with the Hugging Face Hub using the retrieved token.

Push Model and Tokenizer to Hub: Upload the fine-tuned model and tokenizer to the Hugging Face Hub under specified repository names.

In [None]:
# Clear GPU memory
torch.cuda.empty_cache()


# **Loading Model from Hugging Face Hub and Copying Files**

In [None]:
# Load the fine-tuned model from the Hugging Face Hub
fine_tuned_finance_model=AutoModelForCausalLM.from_pretrained('anirudh/finance_chatbot')

In [None]:
fine_tuned_tokenizer = AutoTokenizer.from_pretrained('anirudh/finance_chatbot', trust_remote_code=True)

In [None]:
import os
import shutil

# Define the source (output) directory and the new target directory
source_dir = "/kaggle/working/output_dir"
target_dir = "/kaggle/working/new_dir"

# Ensure the target directory exists, create it if it does not
if not os.path.exists(target_dir):
    os.makedirs(target_dir)

# Specify the files you want to copy (modify this list as needed)
files_to_copy = ["file1.txt", "file2.txt", "model.pth", "config.json"]

# Copy specified files from source directory to target directory
for file_name in files_to_copy:
    source_file_path = os.path.join(source_dir, file_name)
    target_file_path = os.path.join(target_dir, file_name)
    if os.path.exists(source_file_path):
        shutil.copy2(source_file_path, target_file_path)
        print(f"Copied {file_name} to {target_dir}")
    else:
        print(f"{file_name} not found in {source_dir}")

print("File copying complete.")


**Steps in the Code:**

Load Fine-Tuned Model and Tokenizer: Load the fine-tuned model and tokenizer from the Hugging Face Hub using their respective repository names.

Define Source and Target Directories: Specify the source directory (where files are currently located) and the target directory (where files will be copied).

Ensure Target Directory Exists: Check if the target directory exists and create it if it does not.

Copy Files: Copy specified files from the source directory to the target directory. Log the status of each file copy operation.

Complete File Copying: Print a message indicating the completion of the file copying process.
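
The copy steps can also be wrapped in a small reusable function that reports which files were copied and which were missing. This is a sketch using `pathlib` and temporary directories so that it runs anywhere; the function name `copy_selected` and the file names are hypothetical stand-ins for the placeholders in the cell above.

```python
import shutil
import tempfile
from pathlib import Path

def copy_selected(source_dir, target_dir, file_names):
    """Copy the named files if they exist; return (copied, missing) lists."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)  # create target if needed
    copied, missing = [], []
    for name in file_names:
        src = source_dir / name
        if src.is_file():
            shutil.copy2(src, target_dir / name)   # copy2 keeps file metadata
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing

# Demonstration in a temporary directory instead of /kaggle/working
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "output_dir"
    src.mkdir()
    (src / "config.json").write_text("{}")
    copied, missing = copy_selected(src, Path(tmp) / "new_dir", ["config.json", "model.pth"])
    target_has_config = (Path(tmp) / "new_dir" / "config.json").is_file()

print("copied:", copied)
print("missing:", missing)
```

Returning the two lists instead of only printing makes it easy to fail fast when an expected file, such as the model weights, is absent.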