## Finetuning Tutorial for Agent-based Crypto Trading Challenge 

In the [Agent-based Single Cryptocurrency Trading Challenge](https://coling2025cryptotrading.thefin.ai/), participants submit a pre-trained or fine-tuned model for a cryptocurrency trading scenario. The submitted model serves as the backbone of Finmem, an agent-based trading framework, and is evaluated on its trading performance within that framework. The goal is to explore how open-source LLMs perform as backbones of an agent framework on trading tasks. This tutorial uses the previous [LLM challenge @ IJCAI 2024](https://huggingface.co/docs/peft/en/conceptual_guides/adapter) as an example to show how to fine-tune your own model. <br>
**Note: You can use any data you find helpful for fine-tuning. If you have sufficient computing resources and data, you can also pre-train a model.**

<h3 align="center">
    <p>Pre-knowledge: Parameter-Efficient Fine-Tuning (PEFT) methods</p>
</h3>

Pre-training a large model, or fine-tuning all of its parameters at full precision, is often prohibitively costly because of the model's scale. Parameter-Efficient Fine-Tuning (PEFT) methods adapt large pretrained models to downstream applications by fine-tuning only a small number of (extra) parameters instead of all the model's parameters. This significantly decreases the computational and storage costs. PEFT is integrated with Transformers for easy model training and inference.
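
As a rough illustration of why this is cheap, consider LoRA, the PEFT method used later in this tutorial. It freezes a weight matrix of shape d × k and trains two low-rank factors of shapes d × r and r × k instead, so the trainable count for that layer drops from d·k to r·(d + k). The sketch below works through that arithmetic; the 4096 × 4096 layer size and the rank of 8 are illustrative assumptions, not values measured from any particular model.

```python
def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Parameters in the two LoRA factors A (r x k) and B (d x r)."""
    return r * (d + k)


def full_params(d: int, k: int) -> int:
    """Parameters in the frozen d x k weight matrix."""
    return d * k


# Illustrative (assumed) sizes: one square projection layer and a small rank.
d, k, r = 4096, 4096, 8

lora = lora_trainable_params(d, k, r)
full = full_params(d, k)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumptions the adapter holds 1/256 of the layer's parameters, which is where the savings in optimizer state and checkpoint size come from.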

> [!TIP]
> Visit the [PEFT GitHub repo](https://github.com/huggingface/peft) for more PEFT examples.

> [!TIP]
> Visit the [PEFT](https://huggingface.co/PEFT) organization to read about the PEFT methods implemented in the library and to see notebooks demonstrating how to apply these methods to a variety of downstream tasks. Click the "Watch repos" button on the organization page to be notified of newly implemented methods and notebooks!

If you use PEFT in your project, please cite it using the following BibTeX entry.

```bibtex
@Misc{peft,
  title =        {PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods},
  author =       {Sourab Mangrulkar and Sylvain Gugger and Lysandre Debut and Younes Belkada and Sayak Paul and Benjamin Bossan},
  howpublished = {\url{https://github.com/huggingface/peft}},
  year =         {2022}
}
```

## Installation 

Install PEFT from pip:

```bash
pip install peft
```

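Other required packages:
```bash
pip install transformers accelerate bitsandbytes flash-attn huggingface-hub
# Also imported by the example code in this tutorial
pip install trl datasets pandas python-dotenv
```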
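
Before running the fine-tuning code, it can help to confirm that the packages are visible to your Python environment. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the module list assumes the usual import names for these packages (which differ from the pip names in two cases).

```python
import importlib.util

# Import names differ from pip names for some packages
# (flash-attn -> flash_attn, huggingface-hub -> huggingface_hub).
REQUIRED_MODULES = [
    "peft",
    "transformers",
    "accelerate",
    "bitsandbytes",
    "flash_attn",
    "huggingface_hub",
]


def missing_modules(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing packages:", missing if missing else "none")
```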

## Finetuning Example
Import the necessary packages.

In [None]:
import time
from random import randrange, sample, seed

import torch
from datasets import load_dataset, concatenate_datasets
from peft import LoraConfig, prepare_model_for_kbit_training, get_peft_model, AutoPeftModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer
from huggingface_hub import login
from pathlib import Path
from dotenv import load_dotenv
import os

PEFT Configuration

In [None]:
class ModelConfigurator:
    def __init__(self, model_id, output_dir, train_dataset, use_flash_attention2):
        # Load environment variables
        self.notebook_path = Path().absolute()
        self.project_root = self.notebook_path.parent
        self.env_path = self.project_root / ".env"
        if not self.env_path.exists():
            raise FileNotFoundError(f"Environment file not found at {self.env_path}")
        load_dotenv(self.env_path)
        self.huggingface_token = os.getenv("HUGGINGFACE_TOKEN")
        
        self.model_id = model_id
        self.output_dir = output_dir
        self.train_data = train_dataset
        self.use_flash_attention2 = use_flash_attention2
        self.tokenizer = None
        self.model = None
        self.trainer = None
    
    # Quantization config: 4-bit (or 8-bit) quantization reduces GPU memory usage during fine-tuning and inference
    def bit_config(self):
        return BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16 if self.use_flash_attention2 else torch.float16
        )
    
    # Load model and tokenizer for base model
    def load_model_and_tokenizer(self):
        self.model = AutoModelForCausalLM.from_pretrained(
            self.model_id, 
            quantization_config=self.bit_config(), 
            use_cache=False, 
            device_map="auto",
            token = self.huggingface_token, 
            attn_implementation="flash_attention_2" if self.use_flash_attention2 else "sdpa"
        )
        self.model.config.pretraining_tp = 1

        self.tokenizer = AutoTokenizer.from_pretrained(
            self.model_id,
            token = self.huggingface_token
        )
        self.tokenizer.pad_token = self.tokenizer.eos_token
        self.tokenizer.padding_side = "right"
        return self.model, self.tokenizer

    # PEFT config
    def peft_config(self):
        peft_config = LoraConfig(
                lora_alpha=16,
                lora_dropout=0.1,
                r=8,
                bias="none",
                task_type="CAUSAL_LM",
                target_modules=[
                    "q_proj",
                    "k_proj",
                    "v_proj",
                    "o_proj",
                    "gate_proj", 
                    "up_proj", 
                    "down_proj",
                ]
        )
        return peft_config
    
    # Trainer config
    def trainer_config(self):
        args = TrainingArguments(
            output_dir=self.output_dir,
            num_train_epochs=1,
            per_device_train_batch_size=6 if self.use_flash_attention2 else 2, # you can play with the batch size depending on your hardware
            gradient_accumulation_steps=4,
            gradient_checkpointing=True,
            optim="paged_adamw_8bit",
            logging_steps=10,
            save_strategy="epoch",
            learning_rate=2e-4,
            bf16=self.use_flash_attention2,
            fp16=not self.use_flash_attention2,
            tf32=self.use_flash_attention2,
            max_grad_norm=0.3,
            warmup_steps=5,
            lr_scheduler_type="linear",
            disable_tqdm=False,
            report_to="none"
            )   
        return args
    
    # Train the model
    def train_model(self):
        model, tokenizer = self.load_model_and_tokenizer()
        model = prepare_model_for_kbit_training(model)
        # Attach the LoRA adapters once here; the model handed to SFTTrainer is
        # then already a PEFT model, so peft_config is not passed a second time
        model = get_peft_model(model, self.peft_config())

        self.trainer = SFTTrainer(
                model=model,
                train_dataset=self.train_data,
                dataset_text_field="text",
                max_seq_length=2048,
                tokenizer=tokenizer,
                packing=True,
                # formatting_func=format_instruction, 
                args=self.trainer_config(),
                )
        self.trainer.train() # Start fine-tuning
        self.trainer.save_model() # Save the model to local disk. You can also upload it to the Hugging Face Hub
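
To see how the batch settings in `trainer_config` interact, the sketch below computes the effective batch size per optimizer step and an approximate number of optimizer steps per epoch. The helper names, the single-GPU default, and the sequence count of 6,200 are assumptions for illustration. With `packing=True` the number of training sequences generally differs from the number of dataset rows, and the count the trainer reports may differ slightly depending on how it rounds partial batches.

```python
import math


def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Sequences that contribute to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def optimizer_steps_per_epoch(num_sequences: int, batch: int) -> int:
    """Optimizer updates needed to see every sequence once (last batch may be partial)."""
    return math.ceil(num_sequences / batch)


# Values mirroring trainer_config: batch size 6 with flash attention, 2 without,
# and gradient_accumulation_steps=4. The sequence count is a hypothetical example.
for per_device in (6, 2):
    batch = effective_batch_size(per_device, grad_accum_steps=4)
    steps = optimizer_steps_per_epoch(6200, batch)
    print(f"per_device={per_device}: effective batch {batch}, about {steps} steps per epoch")
```

If you lower `per_device_train_batch_size` to fit your GPU, raising `gradient_accumulation_steps` by the same factor keeps the effective batch size unchanged.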


Instruction-following dataset preparation

In [None]:
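# QADataset below is a map-style dataset (__len__/__getitem__), which matches the PyTorch Dataset interface
from torch.utils.data import Dataset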

class QADataset(Dataset):
    def __init__(self, df, system_prompt="You are Qwen, created by Alibaba Cloud. You are a helpful assistant."):
        """
        df: pandas dataframe containing question-answer pairs
        """
        self.df = df
        self.system_prompt = system_prompt

    def __len__(self):
        return len(self.df)
    
    def __getitem__(self, idx):
        # Extract question and answer
        question = self.df.iloc[idx]["question"]
        answer = self.df.iloc[idx]["answer"]
        
        # Format input text using the ChatML template for Qwen-2.5:
        # each turn ends with <|im_end|> directly after its content
        input_text = (
            f"<|im_start|>system\n"
            f"{self.system_prompt}<|im_end|>\n"
            f"<|im_start|>user\n"
            f"{question}<|im_end|>\n"
            f"<|im_start|>assistant\n"
            f"{answer}<|im_end|>"
        )
        
        return {
            "text": input_text
        }
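
Because a malformed template is easy to miss, a quick sanity check on a few formatted samples before training can be useful. The sketch below is one hypothetical way to do it: it extracts the role after each `<|im_start|>` marker and verifies that the markers are balanced and appear in system, user, assistant order. The function names and the sample text are assumptions for illustration.

```python
import re

# Matches the role name that follows each <|im_start|> marker.
ROLE_PATTERN = re.compile(r"<\|im_start\|>(\w+)\n")


def chatml_roles(text: str) -> list:
    """Return the roles in order of appearance, e.g. ['system', 'user', 'assistant']."""
    return ROLE_PATTERN.findall(text)


def looks_well_formed(text: str) -> bool:
    """Check that turn markers are balanced and follow the system/user/assistant order."""
    balanced = text.count("<|im_start|>") == text.count("<|im_end|>")
    return balanced and chatml_roles(text) == ["system", "user", "assistant"]


# Hypothetical sample with three ChatML turns.
sample_text = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSummarize today's BTC news.<|im_end|>\n"
    "<|im_start|>assistant\nNo news was provided.<|im_end|>"
)
print(chatml_roles(sample_text), looks_well_formed(sample_text))
```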

In [None]:
class TradingDataset(Dataset):
    pass
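
The main script below calls `concatenate_fields`, which this tutorial does not define. A minimal sketch of what such a helper might look like follows, assuming each record has a prompt column and an answer column and that the goal is a single `text` column for `SFTTrainer`. The column names `query` and `answer` are assumptions (check the actual columns of your dataset), and this is not the original implementation.

```python
def concatenate_record(record, prompt_field="query", answer_field="answer"):
    """Join the prompt and answer of one record into a single training string."""
    return {"text": f"{record[prompt_field]}\n{record[answer_field]}"}


def concatenate_fields(dataset, prompt_field="query", answer_field="answer"):
    """Add a 'text' column to a Hugging Face dataset via its map() method.

    The column names are assumptions; pass the ones your dataset actually uses.
    """
    return dataset.map(
        lambda record: concatenate_record(record, prompt_field, answer_field)
    )


# Quick check on a plain dict, no dataset download needed.
example = {"query": "Is this sentence a claim or a premise?", "answer": "claim"}
print(concatenate_record(example))
```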

Main function

In [None]:
# Hugging Face login: paste your access token between the quotes
login(token='')


""" parameters setting """
seed(42)
task_tune = "task1"
model_id = "meta-llama/Meta-Llama-3-8B-Instruct" # Fine-tuning on Llama-3-8B-Instruct
#model_id = "mistralai/Mistral-7B-Instruct-v0.2" # Alternative: fine-tuning on Mistral-7B-Instruct-v0.2
output_dir = f"llama3-8B-int4-{task_tune}"

use_flash_attention2 = False
# Replace attention with flash attention 
if torch.cuda.get_device_capability()[0] >= 8:
    use_flash_attention2 = True
print(f"Using flash attention 2: {use_flash_attention2}")


""" dataset prepare and train/val split """
if task_tune == "task1":
    dataset1 = load_dataset(
    "TheFinAI/finarg-ecc-auc_train", 
    split="train", 
    token=""
    )
    dataset = concatenate_fields(dataset1)

elif task_tune == "task2":
    dataset2 = load_dataset(
    "TheFinAI/edtsum_train", 
    split="train", 
    token=""
    )
    dataset = concatenate_fields(dataset2)

else: # data augmentation via data fusion
      # In the last challenge, we used a data fusion strategy: the task1 and task2 data were
      # combined for fine-tuning to improve performance.
    dataset1 = load_dataset(
    "TheFinAI/finarg-ecc-auc_train", 
    split="train", 
    token=""
    )
    dataset2 = load_dataset(
    "TheFinAI/edtsum_train", 
    split="train", 
    token=""
    )
    dataset1 = concatenate_fields(dataset1)
    dataset2 = concatenate_fields(dataset2)
    dataset = concatenate_datasets([dataset1, dataset2])

print(f"Dataset size: {len(dataset)}")
print(dataset[randrange(len(dataset))])
if task_tune == "task1":
    n_samples = sample(range(len(dataset)), k=6200)
elif task_tune == "task2":
    n_samples = sample(range(len(dataset)), k=6400)
else:
    n_samples = sample(range(len(dataset)), k=12500)
print(f"First 5 samples: {n_samples[:5]}")
train_dataset = dataset.select(n_samples)
print(f"Reduced dataset size: {len(train_dataset)}")
all_indices = set(range(len(dataset)))
validation_indices = sorted(all_indices - set(n_samples))
# Select from the same dataset the indices were drawn from (dataset1 is undefined for task2)
validation_dataset = dataset.select(validation_indices)
print(f"Validation dataset size: {len(validation_dataset)}")
validation_df = validation_dataset.to_pandas()
validation_df.to_csv(f'validation_set_{task_tune}_trial_1.csv', index=False)


""" fine-tuning """
configurator = ModelConfigurator(model_id, output_dir, train_dataset, use_flash_attention2)
configurator.train_model()

print("Finetuning completed!")

""" uploading your model to huggingface hub """
#configurator.trainer.model.push_to_hub("your-hf-username/my-awesome-model")
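
The train/validation split used in the main script can be checked in isolation. The sketch below reproduces the idea with plain index lists: draw `k` training indices with a seeded sampler and treat the remaining indices as the validation set. The function name and the toy sizes are assumptions for illustration.

```python
from random import Random


def split_indices(n_rows: int, k: int, seed_value: int = 42):
    """Return (train_indices, validation_indices) that are disjoint and cover range(n_rows)."""
    k = min(k, n_rows)  # sample() raises ValueError if k exceeds the population size
    rng = Random(seed_value)  # local generator, so the global random state is untouched
    train = rng.sample(range(n_rows), k=k)
    validation = sorted(set(range(n_rows)) - set(train))
    return train, validation


train_idx, val_idx = split_indices(n_rows=10, k=7)
print(len(train_idx), len(val_idx))
```

With a Hugging Face dataset, both lists can be passed to `dataset.select(...)` to obtain the two subsets.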

**Once you have uploaded your model to the Hugging Face Hub, submit the link to the challenge organizers, who will access your model through it.**
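
If you trained LoRA adapters as in this tutorial, one option before submitting is to merge them into the base model so that the uploaded repository loads as a regular causal LM. The sketch below is one possible way to do that, not a required step of the challenge. The function names, the repository id, and the choice of float16 are assumptions, and it presumes the tokenizer was saved alongside the adapter in `adapter_dir`.

```python
def hub_url(repo_id: str) -> str:
    """Link to a model repository on the Hugging Face Hub."""
    return f"https://huggingface.co/{repo_id}"


def merge_and_push(adapter_dir: str, repo_id: str) -> str:
    """Merge LoRA weights into the base model and upload the result.

    Heavy imports live inside the function so this file can be imported
    without a GPU or the training libraries installed.
    """
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # Loads the base model referenced by the adapter config, plus the adapter.
    model = AutoPeftModelForCausalLM.from_pretrained(adapter_dir, torch_dtype=torch.float16)
    merged = model.merge_and_unload()  # fold the LoRA weights into the base weights
    tokenizer = AutoTokenizer.from_pretrained(adapter_dir)

    merged.push_to_hub(repo_id)  # requires a prior login() with write access
    tokenizer.push_to_hub(repo_id)
    return hub_url(repo_id)


# Hypothetical repository id, shown without calling merge_and_push.
print(hub_url("your-hf-username/my-awesome-model"))
```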

## Citing 

If you find the tutorial useful, please cite:
```bibtex
@inproceedings{cao2024catmemo,
  title={CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications},
  author={Cao, Yupeng and Yao, Zhiyuan and Chen, Zhi and Deng, Zhiyang},
  booktitle={Joint Workshop of the 8th Financial Technology and Natural Language Processing (FinNLP) and the 1st Agent AI for Scenario Planning (AgentScen) in conjunction with IJCAI 2023},
  pages={174},
  year={2024}
}
```