### Installation

In [1]:
%%capture

!pip install --no-deps bitsandbytes accelerate xformers==0.0.29 peft trl triton
!pip install --no-deps cut_cross_entropy unsloth_zoo
!pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
!pip install --no-deps unsloth

### Loading model

In [2]:
from unsloth import FastLanguageModel
import torch
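max_seq_length = 2048  # maximum context length used for training and inference
dtype = None           # None lets Unsloth pick the dtype (bfloat16 is available on this GPU, per the log below)
load_in_4bit = True    # load the 4-bit quantized weights to reduce memory use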

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.1.8: Fast Llama patching. Transformers: 4.47.1.
   \\   /|    GPU: NVIDIA A100-SXM4-40GB. Max memory: 39.557 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 8.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors:   0%|          | 0.00/5.96G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/236 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/52.9k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

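We now add LoRA adapters so that only a small fraction of the model's parameters needs to be updated (41,943,040 trainable parameters in this run, as reported by the training log further down).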

In [3]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)

Unsloth 2025.1.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
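The trainable-parameter count can be sanity-checked by hand. The sketch below assumes the standard Llama-3.1-8B layer dimensions (hidden size 4096, MLP size 14336, key/value projection width 1024); these numbers are not printed anywhere in this notebook, so treat them as assumptions. Each LoRA adapter adds two matrices, `A` (rank x in) and `B` (out x rank), to a targeted projection.

```python
# Assumed Llama-3.1-8B dimensions (not stated in the notebook itself).
HIDDEN = 4096         # model width
INTERMEDIATE = 14336  # MLP width
KV_DIM = 1024         # 8 key/value heads x 128 dims (grouped-query attention)
NUM_LAYERS = 32       # matches "patched 32 layers" in the output above
RANK = 16             # r = 16 from get_peft_model

# (in_features, out_features) of each projection listed in target_modules
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """A is (rank x in) and B is (out x rank): rank * (in + out) parameters."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} in total")
# prints "1,310,720 per layer, 41,943,040 in total"
```

Under these assumptions the total agrees with the "Number of trainable parameters" reported by the trainer later in this notebook.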


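### Preparing the dataset

The dataset stores conversations in a ShareGPT-style layout (`from`/`value` keys with `human` and `gpt` speakers). We map those keys to ChatML roles and render each conversation into a single `text` field for training.

In [45]: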
from unsloth.chat_templates import get_chat_template
from datasets import load_dataset


tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"}
)

def apply_template(examples):
    messages = examples["conversations"]
    return {
        "text": [
            tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=False)
            for message in messages
        ]
    }

dataset = load_dataset("heladell/Finance_DeepSeek-R1-Distill-dataset", split = "train")

dataset = dataset.map(
    apply_template,
    batched=True,
    batch_size=100,
    remove_columns=dataset.column_names
)

print(f"Number of finance-related conversations: {len(dataset)}")

Unsloth: Will map <|im_end|> to EOS = <|im_end|>.


Map:   0%|          | 0/26952 [00:00<?, ? examples/s]

Number of finance-related conversations: 26952
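To make the effect of the `mapping` argument concrete, here is a hand-written approximation of the ChatML rendering. It is not Unsloth's implementation (the real template may handle system prompts, BOS tokens, or whitespace differently); `to_chatml`, `ROLE_MAP`, and the example conversation are hypothetical names and data used only for illustration.

```python
# Maps the dataset's "from" values to ChatML role names, mirroring the
# `mapping` argument passed to get_chat_template above.
ROLE_MAP = {"human": "user", "gpt": "assistant"}


def to_chatml(conversation, add_generation_prompt=False):
    """Render a list of {"from": ..., "value": ...} turns as ChatML text."""
    parts = []
    for turn in conversation:
        role = ROLE_MAP.get(turn["from"], turn["from"])  # unknown roles pass through
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical two-turn conversation in the dataset's layout.
example = [
    {"from": "human", "value": "What is RSI?"},
    {"from": "gpt", "value": "RSI is the Relative Strength Index."},
]
print(to_chatml(example))
```

During training the full conversation is rendered (`add_generation_prompt=False`); at inference time only the user turn is rendered and the assistant header is appended.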


### Train the model

In [36]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 60,
        # num_train_epochs = 1,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",
    ),
)

Map (num_proc=2):   0%|          | 0/26952 [00:00<?, ? examples/s]

In [37]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 26,952 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040


| Step | Training Loss |
|------|---------------|
| 1 | 0.7162 |
| 2 | 0.6844 |
| 3 | 0.5349 |
| 4 | 0.695 |
| 5 | 0.6873 |
| 6 | 0.5835 |
| 7 | 0.6021 |
| 8 | 0.4235 |
| 9 | 0.5426 |
| 10 | 0.6348 |
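Because `max_steps = 60` caps the run, it is worth checking how much of the dataset the model actually sees. The arithmetic below uses only values from the training arguments and the trainer banner; the variable names are my own.

```python
import math

per_device_batch_size = 2        # per_device_train_batch_size
gradient_accumulation_steps = 4
num_gpus = 1                     # "Num GPUs = 1" in the banner
max_steps = 60
num_examples = 26_952            # "Num examples" in the banner

# Examples consumed per optimizer step
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
examples_seen = effective_batch_size * max_steps
fraction_of_epoch = examples_seen / num_examples
steps_per_epoch = math.ceil(num_examples / effective_batch_size)

print(f"effective batch size: {effective_batch_size}")
print(f"examples seen: {examples_seen} ({fraction_of_epoch:.2%} of one epoch)")
print(f"steps for a full epoch: {steps_per_epoch}")
```

So this run touches roughly 1.8% of the data; a full pass would need 3,369 steps. Swapping `max_steps` for the commented-out `num_train_epochs = 1` would train on the whole dataset.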


<a name="Inference"></a>
### Inference
Let's run the model! You can change the user message in `messages`; `add_generation_prompt=True` appends the assistant header so that the model writes the reply.

In [38]:
from transformers import TextStreamer

# Load model for inference
model = FastLanguageModel.for_inference(model)

messages = [
    {"from": "human", "value": "how can you use technical analysis like MACD and RSI to predict the stock value"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
outputs = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=512, use_cache=True)
tokenizer.batch_decode(outputs)

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


<|im_start|>user
how can you use technical analysis like MACD and RSI to predict the stock value<|im_end|>
<|im_start|>assistant
<think>
Okay, so I'm trying to figure out how to use technical analysis tools like MACD and RSI to predict stock prices. I'm pretty new to investing, so I need to break this down step by step. 

First, I know that technical analysis is all about looking at charts and patterns to predict where the price might go. MACD and RSI are two different indicators, so I should understand each one separately.

Starting with MACD, I remember it's a moving average convergence divergence indicator. I think it involves two moving averages. Maybe one is for the stock price and the other is for the volume? Or is it something else? Wait, I think it's the MACD line, which is the difference between two moving averages, and the signal line. The MACD line crosses above or below the signal line to indicate overbought or oversold conditions. 

Then there's RSI, which stands for Relat
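The attention-mask warning printed at the top of this output appears because only `input_ids` is passed to `generate`. One possible fix, sketched below, is to ask `apply_chat_template` for a dictionary (`return_dict=True`) so the mask comes back alongside the token IDs. `generate_with_attention_mask` is a hypothetical helper, not part of Unsloth or Transformers, and it assumes the model exposes a `.device` attribute.

```python
def generate_with_attention_mask(model, tokenizer, messages,
                                 max_new_tokens=512, **generate_kwargs):
    """Tokenize chat messages and generate with an explicit attention mask."""
    encoded = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,  # return input_ids AND attention_mask
    ).to(model.device)
    return model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        max_new_tokens=max_new_tokens,
        use_cache=True,
        **generate_kwargs,  # e.g. streamer=text_streamer
    )
```

It would be called as `generate_with_attention_mask(model, tokenizer, messages, streamer=text_streamer)` in place of the `model.generate(...)` line.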


<a name="Save"></a>
### Saving, loading finetuned models
To save the final model as LoRA adapters, use either Hugging Face's `push_to_hub` for an online save or `save_pretrained` for a local save.

**[NOTE]** This saves ONLY the LoRA adapters, not the full model. GGUF export is covered further down.

In [39]:
model.save_pretrained("Finance_DeepSeek-R1-Distill-Llama-8B_QLoRA")
tokenizer.save_pretrained("Finance_DeepSeek-R1-Distill-Llama-8B_QLoRA")

# Push to the Hugging Face Hub (pass a write-access token; it is left blank here)
model.push_to_hub("heladell/Finance_DeepSeek-R1-Distill-Llama-8B_QLoRA", token = "")
tokenizer.push_to_hub("heladell/Finance_DeepSeek-R1-Distill-Llama-8B_QLoRA", token = "")

README.md:   0%|          | 0.00/256 [00:00<?, ?B/s]

  0%|          | 0/1 [00:00<?, ?it/s]

adapter_model.safetensors:   0%|          | 0.00/168M [00:00<?, ?B/s]

Saved model to https://huggingface.co/heladell/Finance_DeepSeek-R1-Distill-Llama-8B_QLoRA


No files have been modified since last commit. Skipping to prevent empty commit.
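The size of `adapter_model.safetensors` in the upload log is consistent with the number of trainable parameters. The check below assumes the adapter weights are stored as 32-bit floats (4 bytes each); the notebook does not state the storage dtype, so this is an inference from the numbers.

```python
trainable_params = 41_943_040  # from the training banner
bytes_per_param = 4            # assumption: adapters saved as float32

size_bytes = trainable_params * bytes_per_param
size_mb = size_bytes / 1e6     # decimal megabytes, as shown by the progress bar
print(f"{size_bytes:,} bytes = {size_mb:.1f} MB")
# prints "167,772,160 bytes = 167.8 MB"
```

That rounds to the 168M shown in the log, compared with 5.96G for the 4-bit base model downloaded earlier.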


To load the LoRA adapters we just saved for inference, run the cell below. The guard is already set to `if True:`; change it to `if False:` to keep using the model that is in memory.

In [40]:
if True:
    # from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "Finance_DeepSeek-R1-Distill-Llama-8B_QLoRA",
        max_seq_length = max_seq_length,
        dtype = dtype,
        load_in_4bit = load_in_4bit,
    )

# Enable Unsloth's inference mode once, whether or not the adapters were reloaded above
model = FastLanguageModel.for_inference(model)

messages = [
    {"from": "human", "value": "how can you use technical analysis like MACD and RSI to predict the stock value"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
outputs = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=512, use_cache=True)
tokenizer.batch_decode(outputs)

==((====))==  Unsloth 2025.1.8: Fast Llama patching. Transformers: 4.47.1.
   \\   /|    GPU: NVIDIA A100-SXM4-40GB. Max memory: 39.557 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 8.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
<|im_start|>user
how can you use technical analysis like MACD and RSI to predict the stock value<|im_end|>
<|im_start|>assistant
<think>
Okay, so I want to learn how to use technical analysis tools like MACD and RSI to predict stock prices. I've heard about these indicators before, but I'm not exactly sure how they work or how to use them. Let me start by breaking down what each of these tools is and then figure out how they can be used together.

First, MACD. I remember it stands for Moving Average Convergence Divergence. From what I

["<|im_start|>user\nhow can you use technical analysis like MACD and RSI to predict the stock value<|im_end|>\n<|im_start|>assistant\n<think>\nOkay, so I want to learn how to use technical analysis tools like MACD and RSI to predict stock prices. I've heard about these indicators before, but I'm not exactly sure how they work or how to use them. Let me start by breaking down what each of these tools is and then figure out how they can be used together.\n\nFirst, MACD. I remember it stands for Moving Average Convergence Divergence. From what I recall, it's a momentum indicator that helps determine when a stock is in an uptrend or a downtrend. It uses two lines: the MACD line and the signal line. The MACD line is a moving average of the stock's price, and the signal line is another moving average of the MACD line. When they cross, it's supposed to signal a potential change in trend.\n\nRSI, on the other hand, is the Relative Strength Index. It measures the strength of a stock's price mov

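### Exporting to GGUF

The next cell merges the LoRA weights into the base model, converts the result to GGUF with llama.cpp, and uploads it. As the log shows, the conversion runs once for the local save and again for the push.

In [41]:
# Merge the adapters and export locally as GGUF (q8_0 in this run, per the log below)
model.save_pretrained_gguf("Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0", tokenizer)
# Repeat the export and upload it to the Hugging Face Hub (token left blank here)
model.push_to_hub_gguf("heladell/Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0", tokenizer, token = "")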

Unsloth: Kaggle/Colab has limited disk space. We need to delete the downloaded
model which will save 4-16GB of disk space, allowing you to save on Kaggle/Colab.
Unsloth: Will remove a cached repo with size 6.0G


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 58.55 out of 83.48 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


100%|██████████| 32/32 [00:00<00:00, 50.65it/s]


Unsloth: Saving tokenizer... Done.
Done.


Unsloth: Converting llama model. Can use fast conversion = False.


==((====))==  Unsloth: Conversion from QLoRA to GGUF information
   \\   /|    [0] Installing llama.cpp might take 3 minutes.
O^O/ \_/ \    [1] Converting HF to GGUF 16bits might take 3 minutes.
\        /    [2] Converting GGUF 16bits to ['q8_0'] might take 10 minutes each.
 "-____-"     In total, you will have to wait at least 16 minutes.

Unsloth: Installing llama.cpp. This might take 3 minutes...
Unsloth: CMAKE detected. Finalizing some steps for installation.
Unsloth: [1] Converting model at Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0 into q8_0 GGUF format.
The output location will be /content/Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0/unsloth.Q8_0.gguf
This might take 3 minutes...
INFO:hf-to-gguf:Loading model: Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:rope_freqs.weight,           torch.float32 --> F32, shape = {64}
INFO:hf-to-gguf:gguf: loading mode

100%|██████████| 32/32 [00:00<00:00, 66.06it/s]


Unsloth: Saving tokenizer... Done.
Done.
==((====))==  Unsloth: Conversion from QLoRA to GGUF information
   \\   /|    [0] Installing llama.cpp might take 3 minutes.
O^O/ \_/ \    [1] Converting HF to GGUF 16bits might take 3 minutes.
\        /    [2] Converting GGUF 16bits to ['q8_0'] might take 10 minutes each.
 "-____-"     In total, you will have to wait at least 16 minutes.

Unsloth: Installing llama.cpp. This might take 3 minutes...
Unsloth: [1] Converting model at heladell/Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0 into q8_0 GGUF format.
The output location will be /content/heladell/Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0/unsloth.Q8_0.gguf
This might take 3 minutes...
INFO:hf-to-gguf:Loading model: Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:rope_freqs.weight,           torch.float32 --> F32, shape = {64}
INFO:hf-to-gguf:gguf: loading model weig

  0%|          | 0/1 [00:00<?, ?it/s]

unsloth.Q8_0.gguf:   0%|          | 0.00/8.54G [00:00<?, ?B/s]

No files have been modified since last commit. Skipping to prevent empty commit.


Saved GGUF to https://huggingface.co/heladell/Finance_DeepSeek-R1-Distill-Llama-8B_LoRA-Q8_0
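The 8.54G upload size can be roughly reproduced from the Q8_0 block layout, in which every 32 weights are stored as 32 signed 8-bit integers plus one 16-bit scale (34 bytes per block). The parameter count below is the commonly cited figure for Llama-3.1-8B and is an assumption, since the notebook never prints it; the estimate also ignores tensors kept in F32 and file metadata.

```python
total_params = 8_030_261_248  # assumption: Llama-3.1-8B parameter count
block_size = 32               # weights per Q8_0 block
bytes_per_block = 32 + 2      # 32 int8 weights + one fp16 scale

bits_per_weight = bytes_per_block * 8 / block_size
estimated_bytes = total_params * bytes_per_block // block_size
print(f"{bits_per_weight} bits per weight, about {estimated_bytes / 1e9:.2f} GB")
```

The estimate lands just under the reported file size, which is plausible given that a few small tensors (such as `rope_freqs.weight` in the log) stay in F32.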
