In [12]:
import torch
import torchvision

print("Torch version:", torch.__version__)
print("Torchvision version:", torchvision.__version__)
print(torch.version.cuda)

Torch version: 2.5.1+cu118
Torchvision version: 0.20.1+cu118
11.8


In [13]:
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('ggplot')

from trl import SFTTrainer
from transformers import TrainingArguments, TextStreamer
from unsloth.chat_templates import get_chat_template
from unsloth import FastLanguageModel
from datasets import Dataset
from unsloth import is_bfloat16_supported

# Saving model
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Warnings
import warnings
warnings.filterwarnings("ignore")

%matplotlib inline

In [3]:
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "model/1B_finetuned_llama3.2",
    max_seq_length = 5020,
    dtype = None,          # None lets Unsloth detect the dtype automatically
    load_in_4bit = True,   # load the weights with 4-bit quantization
)

==((====))==  Unsloth 2024.11.6: Fast Llama patching. Transformers = 4.46.2.
   \\   /|    GPU: NVIDIA GeForce RTX 4080 SUPER. Max memory: 15.992 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.5.1+cu118. CUDA = 8.9. CUDA Toolkit = 11.8.
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Unsloth 2024.11.6 patched 16 layers with 16 QKV layers, 16 O layers and 16 MLP layers.


In [4]:
text="I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?"

In [6]:
data_prompt = """Analyze the provided text from a mental health perspective. Identify any indicators of emotional distress, coping mechanisms, or psychological well-being. Highlight any potential concerns or positive aspects related to mental health, and provide a brief explanation for each observation.

### Input:
{}

### Response:
{}"""
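The template above has two positional `{}` slots: the first receives the user text and the second is left empty so that the model writes the response. A minimal sketch of that filling step, using a shortened stand-in template (`demo_prompt` and `build_prompt` are hypothetical names, not part of the notebook):

```python
# Shortened stand-in for data_prompt, keeping the same two-slot layout.
demo_prompt = """### Input:
{}

### Response:
{}"""


def build_prompt(template: str, user_text: str) -> str:
    """Fill the input slot and leave the response slot empty for generation."""
    return template.format(user_text, "")


filled = build_prompt(demo_prompt, "I barely sleep.")
print(filled)
```

Because the response slot is filled with an empty string, the prompt ends right after the `### Response:` line, which is where generation continues.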

In [7]:
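model = FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode
inputs = tokenizer(
    [
        data_prompt.format(
            text,  # fills the "### Input:" slot
            "",    # "### Response:" slot, left empty for the model to complete
        )
    ],
    return_tensors = "pt",
).to("cuda")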
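The model was loaded with `max_seq_length = 5020`, and the generation call in this notebook also requests `max_new_tokens = 5020`, so prompt plus continuation could in principle exceed the configured length. A small sketch of how the request might be capped, assuming the intent is to keep the total within `max_seq_length` (`clamp_new_tokens` is a hypothetical helper and the prompt length of 140 is made up for illustration):

```python
def clamp_new_tokens(prompt_len: int, max_seq_length: int, requested: int) -> int:
    """Return how many new tokens fit after the prompt, capped at the request."""
    remaining = max(max_seq_length - prompt_len, 0)
    return min(requested, remaining)


# With the real inputs, prompt_len would be inputs["input_ids"].shape[1].
budget = clamp_new_tokens(prompt_len=140, max_seq_length=5020, requested=5020)
print(budget)  # 4880
```

Whether exceeding the limit actually causes a problem depends on the model and library version, so treat this as a precaution rather than a requirement.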

In [8]:
inputs

{'input_ids': tensor([[128000,   2127,  56956,    279,   3984,   1495,    505,    264,  10723,
           2890,  13356,     13,  65647,    904,  34824,    315,  14604,  35104,
             11,  63082,  24717,     11,    477,  24064,   1664,  33851,     13,
          57094,    904,   4754,  10742,    477,   6928,  13878,   5552,    311,
          10723,   2890,     11,    323,   3493,    264,  10015,  16540,    369,
           1855,  22695,    382,  14711,   5688,    512,     40,   2846,   2133,
           1555,   1063,   2574,    449,    856,  16024,    323,   7182,     13,
            358,  20025,   6212,    323,    358,    656,   4400,    719,   1781,
            922,   1268,    358,   2846,  66121,    323,   1268,    358,  13434,
            956,    387,   1618,     13,    358,   3077,   2646,   6818,    477,
          93439,  18639,     13,    358,   3077,   2744,   4934,    311,   5155,
            856,   4819,     11,    719,    358,   2646,    636,   2212,    311,
            43

In [10]:
outputs = model.generate(**inputs, max_new_tokens = 5020, use_cache = True)
# batch_decode returns one string per sequence; the prompt and special tokens are included
answer = tokenizer.batch_decode(outputs)
print(answer)

['<|begin_of_text|>Analyze the provided text from a mental health perspective. Identify any indicators of emotional distress, coping mechanisms, or psychological well-being. Highlight any potential concerns or positive aspects related to mental health, and provide a brief explanation for each observation.\n\n### Input:\nI\'m going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I\'m worthless and how I shouldn\'t be here. I\'ve never tried or contemplated suicide. I\'ve always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?\n\n### Response:\nHello, and thank you for your question. First, I am so glad that you have reached out. If\xa0 "I\'m going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I\'m worthless and how I shouldn\'t be here. I\'ve never tried or contemplated suicide. I\'ve always wanted to fix my issues,
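As the output above shows, the decoded string repeats the whole prompt before the response. One way to avoid that is to drop the prompt positions before decoding. The sketch below shows the slicing on plain lists with made-up token IDs (`drop_prompt_tokens` and `fake_outputs` are hypothetical); the tensor version is given in the comments.

```python
def drop_prompt_tokens(sequences, prompt_len):
    """Keep only the tokens that follow the first prompt_len positions."""
    return [list(seq[prompt_len:]) for seq in sequences]


# Hypothetical IDs: the first three stand for the prompt, the rest for new tokens.
fake_outputs = [[128000, 2127, 56956, 9906, 11, 323]]
new_tokens = drop_prompt_tokens(fake_outputs, prompt_len=3)
print(new_tokens)  # [[9906, 11, 323]]

# With the real tensors this would be a slice followed by decoding, e.g.
#   prompt_len = inputs["input_ids"].shape[1]
#   tokenizer.batch_decode(outputs[:, prompt_len:], skip_special_tokens=True)
```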

In [11]:
answer = answer[0].split("### Response:")[-1]
print("Answer of the question is:", answer)

Answer of the question is: 
Hello, and thank you for your question. First, I am so glad that you have reached out. If  "I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?" There are so many questions being asked here. Some are questions about yourself, some are questions about your parents, and some are questions about your religion. I won't be able to answer them all. However, I think maybe I can make a beginning. I think maybe this is something that you need to get a check into. I need to know what my options are. I need to know if I'm good or if I'm bad. I need to put my trust in God. I need to stop the negative thoughts. I need to talk to someone. I need help. First, I think you need to talk to someone. You can tal
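The split on `### Response:` above can be wrapped in a small helper that also trims the leading newline seen in the printed answer. This is only a sketch; `extract_response` is a hypothetical name, and it assumes the marker text matches the prompt template exactly.

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Return the text after the last marker, without surrounding whitespace."""
    # rpartition leaves the whole string in the tail if the marker is missing,
    # which matches the behaviour of decoded.split(marker)[-1].
    _, _, tail = decoded.rpartition(marker)
    return tail.strip()


sample = "### Input:\nI barely sleep.\n\n### Response:\nHello, and thank you."
print(extract_response(sample))  # Hello, and thank you.
```

Taking the text after the last marker rather than the first keeps the helper working if the user text itself happens to contain `### Response:`.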

In [15]:
# This step is so that the fine-tuned model can be evaluated later
os.environ["HF_TOKEN"] = "replace with your own Hugging Face token"
model.push_to_hub("1B_finetuned_llama3.2", token=os.getenv("HF_TOKEN"))
tokenizer.push_to_hub("1B_finetuned_llama3.2", token=os.getenv("HF_TOKEN"))

README.md:   0%|          | 0.00/579 [00:00<?, ?B/s]

adapter_model.safetensors:   0%|          | 0.00/45.1M [00:00<?, ?B/s]

Saved model to https://huggingface.co/1B_finetuned_llama3.2


tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]
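The upload cell stores the token in `os.environ` from inside the notebook. An alternative, sketched below, is to set the variable outside the notebook and only read it here, failing early if it is missing. `require_hf_token` is a hypothetical helper, and `DEMO_HF_TOKEN` with its value is a placeholder used so the example runs without a real token.

```python
import os


def require_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read a token from the environment and fail early if it is not set."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before pushing to the Hub.")
    return token


# Placeholder variable and value so the sketch runs without a real token.
os.environ["DEMO_HF_TOKEN"] = "not-a-real-token"
demo_token = require_hf_token("DEMO_HF_TOKEN")
print(len(demo_token))
```

The returned value could then be passed to `push_to_hub`, which keeps the secret out of the saved notebook.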