<a href="https://colab.research.google.com/github/dojian/mental_health_chatbot/blob/main/test_Mistral_7b_finetuned_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -q -U bitsandbytes transformers peft accelerate

In [2]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

In [3]:
base_model = "mistralai/Mistral-7B-Instruct-v0.2"
adapter = "GRMenon/mental-health-mistral-7b-instructv0.2-finetuned-V2"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    base_model,
    add_bos_token=True,
    trust_remote_code=True,
    padding_side='left'
)

In [4]:
# Create the PEFT model from the base model and the fine-tuned adapter
config = PeftConfig.from_pretrained(adapter)

# Request 4-bit quantization through BitsAndBytesConfig; passing
# `load_in_4bit` directly to from_pretrained is deprecated.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map='auto',
    torch_dtype='auto',
)
model = PeftModel.from_pretrained(model, adapter)

# device_map='auto' already places the quantized weights, so the model is not
# moved again; `device` is only used to place the inputs.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.eval()


PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): MistralForCausalLM(
      (model): MistralModel(
        (embed_tokens): Embedding(32000, 4096)
        (layers): ModuleList(
          (0-31): 32 x MistralDecoderLayer(
            (self_attn): MistralSdpaAttention(
              (q_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4096, out_features=16, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=16, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
                (lora_magnitude_vector): ModuleDict()
              )
              (k
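The module tree above is cut off after `q_proj`, so the arithmetic below covers only that projection. It is a minimal sketch of how many trainable parameters the LoRA pair adds, using only the shapes printed in the repr (4096 features, rank 16, 32 decoder layers); `lora_param_count` is a hypothetical helper name, not part of PEFT.

```python
# Shapes read off the printed module tree (q_proj only; the rest is truncated)
in_features = 4096    # q_proj input width
out_features = 4096   # q_proj output width
rank = 16             # lora_A out_features == lora_B in_features
num_layers = 32       # "(0-31): 32 x MistralDecoderLayer"


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair: A maps in -> rank, B maps rank -> out."""
    return rank * in_features + out_features * rank


per_q_proj = lora_param_count(in_features, out_features, rank)
across_layers = per_q_proj * num_layers
frozen_q_proj = in_features * out_features  # size of the 4-bit base weight

print(per_q_proj)                   # 131072
print(across_layers)                # 4194304
print(per_q_proj / frozen_q_proj)   # 0.0078125
```

Under these shapes the adapter adds under 1% of the parameters of the frozen `q_proj` weight it wraps.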

In [None]:
# Prompt content:
messages = [
    {"role": "user", "content": "Hey Connor! I have been feeling a bit down lately.I could really use some advice on how to feel better?"}
]
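The `messages` list is turned into a single prompt string by the tokenizer's chat template. The sketch below approximates that layout in plain Python so the `[INST] ... [/INST]` wrapper seen in the generated outputs is easier to recognize. It assumes the usual Mistral-instruct conventions (`<s>` and `</s>` as BOS and EOS, strictly alternating user and assistant turns); `render_mistral_prompt` is a hypothetical helper, and the template shipped with the tokenizer remains the authoritative version.

```python
def render_mistral_prompt(messages, bos="<s>", eos="</s>"):
    """Approximate the Mistral-instruct layout for alternating chat turns."""
    parts = [bos]
    for index, message in enumerate(messages):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            # User turns are wrapped in instruction markers
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # Assistant turns are closed with the end-of-sequence token
            parts.append(f"{message['content']}{eos}")
    return "".join(parts)


example = [{"role": "user", "content": "Please help, I am feeling lost"}]
print(render_mistral_prompt(example))
# <s>[INST] Please help, I am feeling lost [/INST]
```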

In [11]:
# Set pad_token_id if it's not already defined
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id

# Now proceed with tokenization and generation
tokenized_input = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors='pt'
)

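# Move to device and build the attention mask. The prompt is a single unpadded
# sequence, so every position is a real token. Comparing against pad_token_id
# would mask any EOS in the prompt whenever pad falls back to EOS.
input_ids = tokenized_input.to(device)
attention_mask = torch.ones_like(input_ids)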

# Generate output
output_ids = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    max_new_tokens=512,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id
)

# Decode the output
response = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
print(response[0])

[INST] Hey Connor! I have been feeling a bit down lately.I could really use some advice on how to feel better? [/INST]

The most important thing you can do in these circumstances is get back into a healthy lifestyle before things get worse. There are many ways to feel better, which may include: A good workout. This is often the most effective way to start feeling better. Once you have done a good workout, you will feel so much more in control and in charge of your body, which in turn is the start to a more positive feeling. Meditation or another method of stress relief. This will help you de-stress and get your mind off of the problems you are dealing with. Call or write a few friends or family members, and share how you are feeling (if you think they will be a good listener). Sometimes just talking about what you are going through can help. Sometimes, just the reassurance and support from others may be the motivation you need to get through, or to seek out additional support from othe
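The tokenize, generate, and decode steps are the same for every prompt in this notebook, and the printed response repeats the prompt before the answer. The helper below is a sketch of one way to wrap those steps and return only the newly generated text. `generate_reply` is a hypothetical name; it assumes a transformers version whose `apply_chat_template` accepts `return_dict=True` (which also returns the attention mask), and it passes `eos_token_id` as the padding id, matching the fallback used in the cells.

```python
def generate_reply(model, tokenizer, messages, device, max_new_tokens=512):
    """Tokenize a chat, sample a continuation, and return only the new text."""
    inputs = tokenizer.apply_chat_template(
        conversation=messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,  # yields input_ids and attention_mask together
    ).to(device)

    # Number of prompt tokens, used to slice the prompt off the output
    prompt_length = len(inputs["input_ids"][0])

    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

    # Keep only the tokens produced after the prompt
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Usage, with the objects created in the earlier cells:
# print(generate_reply(model, tokenizer, messages, device))
```

Because `do_sample=True` draws tokens at random, repeated calls with the same `messages` will generally return different replies.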

In [12]:
# Prompt content:
messages = [
    {"role": "user", "content": "I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?"}
]

In [13]:
# Set pad_token_id if it's not already defined
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id

# Now proceed with tokenization and generation
tokenized_input = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors='pt'
)

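# Move to device and build the attention mask. The prompt is a single unpadded
# sequence, so every position is a real token. Comparing against pad_token_id
# would mask any EOS in the prompt whenever pad falls back to EOS.
input_ids = tokenized_input.to(device)
attention_mask = torch.ones_like(input_ids)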

# Generate output
output_ids = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    max_new_tokens=512,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id
)

# Decode the output
response = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
print(response[0])

[INST] I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone? [/INST]

1. The first step is in fact to notice the thoughts you have about yourself,    to become aware.    Noticing is the beginning to self-acceptance.    Self-acceptance is the ability to be in your own company!    If I were to walk up in the middle of the street and stop you, how would I feel?    Would I smile and say hi  or would you stare at me blankly?    What we'd be stopping for is acceptance,  or in your case to stop the thoughts of being worthless.    Accept that you're having them, but don't give into them!    That they are thoughts,  not facts!    If your thoughts were facts, you'd be worthless and since you're asking this question, they're not fac

In [14]:
# Prompt content:
messages = [
    {"role": "user", "content": "Please help, I am feeling lost"}
]

In [15]:
# Set pad_token_id if it's not already defined
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id

# Now proceed with tokenization and generation
tokenized_input = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors='pt'
)

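# Move to device and build the attention mask. The prompt is a single unpadded
# sequence, so every position is a real token. Comparing against pad_token_id
# would mask any EOS in the prompt whenever pad falls back to EOS.
input_ids = tokenized_input.to(device)
attention_mask = torch.ones_like(input_ids)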

# Generate output
output_ids = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    max_new_tokens=512,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id
)

# Decode the output
response = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
print(response[0])

[INST] Please help, I am feeling lost [/INST] I am here to assist you, so feel free to ask anything  that you are not happy about  in your life.  What the  solution would be  will  depending  on  the  the specific  problem.  The  first  thing  that  is  important  making  sure  that  the  problem  that  you  think  that you have  is  the  one  that  you  really  have.  Once  you  know  the  exact  problem  that  you  are  dealing  with  try  to  determine  a  few  potential solutions.  The  more detailed  and  well  you  are  able  to  state  the  problem  that  you  are  having  the  more  likely  you  are  to  come  up  with  good  ideas  for  the  solution  to  the  problem.  If  you  still  do  not  know  what  the  best  solution  would  be  to  a  particular  problem  ask  someone  with  more  knowledge  or  experience  than  you  on  the  particular  matter.  If  sometimes  it  may  seem  that  something  out  of  the  usual  that  is troubling  you  the  solution  could  be  th