# **Import Libraries**

```
import torch
```
This line of code imports PyTorch, a deep learning framework used for building and training neural networks. It provides tools for tensor computation (like NumPy) and automatic differentiation, which is essential for training deep learning models.

-----

```
from datasets import load_dataset
```
This part imports the load_dataset function from the Hugging Face datasets library. This allows us to easily load and manage datasets, either from local files or directly from the Hugging Face Hub.

-----

```
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast
```
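This line imports two components from the Hugging Face Transformers library: `RobertaForSequenceClassification`, a RoBERTa model with a classification head on top, and `RobertaTokenizerFast`, the matching fast tokenizer that converts text into token IDs.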

-----
```
from transformers import TrainingArguments, Trainer
```
This line imports two key classes used for model training: `TrainingArguments` and `Trainer`. `TrainingArguments` defines the training configuration, such as batch size, number of epochs, learning rate, output directory, and evaluation strategy. `Trainer` is a high-level training loop provided by Hugging Face. It handles training, evaluation, checkpoint saving, and logging so we don’t have to code them manually.

-----
```
import numpy as np
```
This line imports NumPy, a fundamental Python library for numerical computation, under the conventional alias `np`.

-----
```
from sklearn.metrics import accuracy_score, f1_score
```
This line imports two evaluation metrics from scikit-learn: `accuracy_score` and `f1_score`.

In [None]:
import torch
from datasets import load_dataset
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast
from transformers import TrainingArguments, Trainer
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

## **DEVICE SETUP (GPU / CPU)**

```
if torch.cuda.is_available():
```
This line checks whether a CUDA-capable GPU is available to PyTorch. GPUs are typically much faster than CPUs for deep learning tasks.

-----
```
    device = torch.device("cuda")
```
If a GPU is available, this sets the device to CUDA, which tells PyTorch to use the GPU for computations.

------
```
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
```
This prints the name of the GPU being used so the user knows which hardware is running the model.

-----
```
else:
    device = torch.device("cpu")
    print("GPU not available, using CPU.")
```
If no GPU is found, the code sets the device to CPU and prints a message saying that the training will run on the CPU instead.

-----
```
print("\n--- Loading and Preprocessing Data ---")
```
This line simply prints a message to show that the program is now starting the data loading and preprocessing stage.

-----
```
from google.colab import drive
drive.mount('/content/drive')
```
This code is used in Google Colab to mount Google Drive.
It connects Colab to my Drive so the program can access files saved there.

-----
```
from datasets import load_dataset
dataset = load_dataset("csv", data_files="/content/drive/MyDrive/OSMFinalProject/Mental-Health-Twitter.csv")
```
This code imports the `load_dataset` function again and uses it to load a CSV file from my Google Drive. The result is stored in the variable `dataset` and contains Twitter data about mental health. Because a single CSV file is loaded, all rows end up in one split named "train".

-----
```
import pandas as pd
from sklearn.model_selection import train_test_split
from datasets import Dataset
```
This imports three tools: pandas, to handle data in table form; `train_test_split` from scikit-learn, to split data into training and evaluation sets; and `Dataset` from Hugging Face, to convert DataFrames back into the dataset format used by Transformers.

-----
```
df = dataset["train"].to_pandas()
```
This part of code converts the dataset’s “train” part into a pandas DataFrame, making it easier to view and manipulate the data.

-----
```
train_df, eval_df = train_test_split(
    df,
    test_size=0.2,
    stratify=df["label"],
    random_state=42
)
```
This code splits the data into 80% for training and 20% for evaluation. It uses stratified sampling based on the “label” column so that both splits have a similar balance of classes. `random_state=42` is a seed that makes the split reproducible.
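To make the effect of `stratify` concrete, here is a minimal sketch on made-up labels. The 70/30 class mix and the `rows` list are assumptions for illustration only, not the project's data.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Hypothetical toy labels: 70 examples of class 0 and 30 of class 1.
labels = [0] * 70 + [1] * 30
rows = list(range(len(labels)))  # stand-in for the rows of a DataFrame

train_rows, eval_rows, train_labels, eval_labels = train_test_split(
    rows, labels, test_size=0.2, stratify=labels, random_state=42
)

# Count how many examples of each class landed in each split.
train_counts = Counter(train_labels)
eval_counts = Counter(eval_labels)
print("train:", dict(train_counts))
print("eval: ", dict(eval_counts))
```

Both splits keep the same 70/30 class ratio as the full toy set, which is the property the stratified split is meant to provide.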

-----
```
train_data = Dataset.from_pandas(train_df.reset_index(drop=True))
eval_data  = Dataset.from_pandas(eval_df.reset_index(drop=True))
```
This code converts the pandas DataFrames back into Hugging Face `Dataset` objects so they can be used with the Transformers library. `.reset_index(drop=True)` discards the shuffled row index left over from the split and replaces it with a fresh index starting at 0.

In [None]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
else:
    device = torch.device("cpu")
    print("GPU not available, using CPU.")

print("\n--- Loading and Preprocessing Data ---")

from google.colab import drive
drive.mount('/content/drive')

from datasets import load_dataset
dataset = load_dataset("csv", data_files="/content/drive/MyDrive/OSMFinalProject/Mental-Health-Twitter.csv")

# The CSV loads as a single 'train' split, so both the training and the
# evaluation data are carved out of it with a stratified split below.
import pandas as pd
from sklearn.model_selection import train_test_split
from datasets import Dataset

# Convert the dataset to a pandas DataFrame
df = dataset["train"].to_pandas()

# Stratified 80/20 split to keep class balance
train_df, eval_df = train_test_split(
    df,
    test_size=0.2,
    # Stratify by the 'label' column, assuming it exists and contains the class labels
    stratify=df["label"],
    random_state=42
)

# Convert back to Hugging Face Dataset objects
train_data = Dataset.from_pandas(train_df.reset_index(drop=True))
eval_data  = Dataset.from_pandas(eval_df.reset_index(drop=True))

Using GPU: Tesla T4

--- Loading and Preprocessing Data ---
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).



## **Tokenizer Setup**
```
MODEL_NAME = "margotwagner/roberta-psychotherapy-eval"
```
This line stores the Hugging Face Hub identifier of the pretrained checkpoint in `MODEL_NAME`. Judging by its name, it is a version of RoBERTa that has been fine-tuned for evaluating psychotherapy or mental health-related text.

-----
```
tokenizer = RobertaTokenizerFast.from_pretrained(MODEL_NAME)
```
This part of code loads the fast tokenizer that matches the selected RoBERTa model. The tokenizer’s job is to convert raw text into tokens that the model can process. Using `.from_pretrained()` automatically loads all the correct vocabulary and settings for that specific model.

-----
```
def tokenize_function(examples):
    # Converts text into token IDs
    return tokenizer(examples["post_text"], truncation=True, padding=True)
```
In this block, a function named `tokenize_function` is defined. It takes a batch of data (called `examples`) and applies the tokenizer to the “post_text” column, which contains the actual tweet or post text. `truncation=True` cuts off text that is longer than the model’s input limit. `padding=True` appends padding tokens so that every sequence in a batch is as long as the longest one in that batch.
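The idea behind truncation and padding can be sketched without the real tokenizer. The helper `pad_and_truncate`, the `pad_id` value, and the token IDs below are all hypothetical. The actual tokenizer uses the model's own padding token and maximum length.

```python
def pad_and_truncate(batch, max_length, pad_id=0):
    """Toy stand-in for truncation=True, padding=True (not the real tokenizer)."""
    # Truncation: drop everything past max_length.
    truncated = [ids[:max_length] for ids in batch]
    # Padding: extend every sequence to the longest one in this batch.
    longest = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    # The attention mask marks real tokens with 1 and padding with 0.
    attention_mask = [
        [1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = [[5, 6, 7], [8, 9], [1, 2, 3, 4, 5, 6]]  # made-up token IDs
encoded = pad_and_truncate(batch, max_length=4)
print(encoded["input_ids"])
print(encoded["attention_mask"])
```

The attention mask is what lets the model ignore the padding positions, which is why it is kept alongside `input_ids` later on.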

-----
```
tokenized_train = train_data.map(tokenize_function, batched=True)
tokenized_eval = eval_data.map(tokenize_function, batched=True)
```
This code applies `tokenize_function` to both the training and evaluation datasets using `.map()`. With `batched=True`, the function receives many rows per call instead of one, which is faster.

-----
```
tokenized_train = tokenized_train.rename_column("label", "labels")
tokenized_eval = tokenized_eval.rename_column("label", "labels")
```
The Hugging Face Trainer expects the column containing the correct answers to be named “labels”. Since the dataset calls it “label”, these lines rename it to match the expected format.

-----
```
tokenized_train.set_format("torch", columns=["input_ids", "attention_mask", "labels"])
tokenized_eval.set_format("torch", columns=["input_ids", "attention_mask", "labels"])
```
These lines make the tokenized datasets return PyTorch tensors, which is the data format PyTorch models use.
Only the columns needed for training (`input_ids`, `attention_mask`, and `labels`) are exposed.
This makes the data ready for the model training step.

In [None]:
# ==============================================
# 3. TOKENIZER SETUP
# ==============================================
MODEL_NAME = "margotwagner/roberta-psychotherapy-eval"
tokenizer = RobertaTokenizerFast.from_pretrained(MODEL_NAME)

def tokenize_function(examples):
    # Converts text into token IDs
    return tokenizer(examples["post_text"], truncation=True, padding=True)

# Tokenize the data
tokenized_train = train_data.map(tokenize_function, batched=True)
tokenized_eval = eval_data.map(tokenize_function, batched=True)

# Rename label column for Hugging Face Trainer
# **FIX**: Replace 'label' with the actual column name containing the labels if it's not 'label'
tokenized_train = tokenized_train.rename_column("label", "labels")
tokenized_eval = tokenized_eval.rename_column("label", "labels")

# Convert to PyTorch tensors
tokenized_train.set_format("torch", columns=["input_ids", "attention_mask", "labels"])
tokenized_eval.set_format("torch", columns=["input_ids", "attention_mask", "labels"])


## **Model Definition**

```
model = RobertaForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2).to(device)
```
This line loads the RoBERTa model for a sequence classification task.
- `RobertaForSequenceClassification` is a version of RoBERTa designed for classifying text into categories.
- `from_pretrained(MODEL_NAME)` loads the pretrained weights for the checkpoint named in `MODEL_NAME`, so the model starts from what it learned in previous training.
- `num_labels=2` tells the model that the classification task has two output classes.
- `.to(device)` moves the model to the device selected earlier (GPU or CPU), so that it runs on the intended hardware.

-----
```
print(f"Model loaded: {MODEL_NAME}")
```
This line prints a message confirming that the model has been loaded, along with the name of the model being used.

In [None]:
model = RobertaForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2).to(device)
print(f"Model loaded: {MODEL_NAME}")


Model loaded: margotwagner/roberta-psychotherapy-eval


## **METRICS AND TRAINING SETUP**

```
def compute_metrics(p):
```
This line defines a function called `compute_metrics`. The Hugging Face Trainer calls it to measure how well the model performs on the evaluation data.

-----
```
    preds = np.argmax(p.predictions, axis=1)
```
This line converts the model’s raw outputs into predicted class labels. `p.predictions` contains the model’s logits (unnormalized scores) for each label, and `np.argmax(p.predictions, axis=1)` picks the label with the highest score for each example, which becomes the model’s prediction.
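As a small illustration of this step, the sketch below applies `np.argmax` to made-up logits. The numbers are hypothetical and chosen only to show the shape of the input and output.

```python
import numpy as np

# Made-up logits for three examples and two classes
# (column 0 = class 0, column 1 = class 1).
logits = np.array([
    [2.1, -0.3],
    [-1.0, 0.5],
    [0.2, 0.1],
])

# axis=1 takes the argmax across the class columns, one result per row.
preds = np.argmax(logits, axis=1)
print(preds)
```

Each row yields the index of its largest score, so the result has one predicted class per example.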

-----
```
    acc = accuracy_score(p.label_ids, preds)
```
This line calculates the accuracy, which is the fraction of predictions that match the true labels.

-----
```
    f1 = f1_score(p.label_ids, preds, average="binary")
```
This line calculates the F1 score, a measure that combines precision and recall as their harmonic mean. The argument `average="binary"` reports the score for the positive class (label 1) only, which suits a binary classification task.
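To see how accuracy and binary F1 behave, the following sketch computes both by hand on toy labels. The helper `accuracy_and_f1` and the data are assumptions for illustration. The notebook itself uses scikit-learn's functions.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and binary F1 (positive class = 1), computed by hand."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); use 0 when the denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


y_true = [0, 0, 0, 1, 1, 1]        # balanced toy labels
always_one = [1] * len(y_true)     # a model that always predicts class 1
always_zero = [0] * len(y_true)    # a model that always predicts class 0

acc_one, f1_one = accuracy_and_f1(y_true, always_one)
acc_zero, f1_zero = accuracy_and_f1(y_true, always_zero)
print(f"always 1: accuracy={acc_one:.3f}, f1={f1_one:.3f}")
print(f"always 0: accuracy={acc_zero:.3f}, f1={f1_zero:.3f}")
```

On balanced data, a model that always predicts one class scores 0.5 accuracy either way, but its F1 is about 0.667 when it always predicts class 1 and 0 when it always predicts class 0. Accuracy near 0.5 together with one of these F1 values is therefore a sign that the model may be predicting a single class.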

-----
```
    return {"accuracy": acc, "f1": f1}
```
This returns both the accuracy and F1 score as a dictionary.
The Trainer will display these values after each evaluation round.

-----
```
training_args = TrainingArguments(
    output_dir="./results",
```
This creates an object called training_args that contains training configurations for the model.
The output_dir specifies where to save the training results.

-----
```
    adam_epsilon=1e-7,
```
This sets the epsilon parameter for the Adam optimizer, a small value to improve numerical stability during training.

-----
```
    logging_steps=200,
```
This tells the Trainer to log training progress every 200 steps.

-----
```
    save_total_limit=1,
```
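This limits the number of checkpoints kept in the output directory, deleting older ones as new ones are saved. Because `load_best_model_at_end=True` is also set, the best checkpoint is retained in addition to the most recent one.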

-----
```
    eval_strategy="epoch",
```
This means the model will be evaluated after every training epoch.

-----
```
    logging_dir="./logs",
```
Sets the directory where log files like training progress will be stored.

-----
```
    save_strategy="epoch",
```
This makes the Trainer save the model at the end of each epoch, similar to how evaluation is done.

-----
```
    load_best_model_at_end=True,
```
This tells the Trainer to reload the best checkpoint once training finishes. Since no `metric_for_best_model` is given, "best" means the lowest evaluation loss, not the highest accuracy or F1.
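The choice of metric matters for which checkpoint counts as "best". The sketch below uses the per-epoch numbers copied from this notebook's training output and assumes the Trainer's default of selecting by lowest evaluation loss when `metric_for_best_model` is not set. The selection logic here is a plain-Python illustration, not the Trainer's internal code.

```python
# Per-epoch evaluation results copied from the training output in this notebook.
epoch_results = [
    {"epoch": 1, "eval_loss": 0.727758, "accuracy": 0.5015, "f1": 0.667334},
    {"epoch": 2, "eval_loss": 0.692997, "accuracy": 0.5, "f1": 0.0},
    {"epoch": 3, "eval_loss": 0.693379, "accuracy": 0.5015, "f1": 0.667334},
]

# Assumed default criterion: the checkpoint with the lowest evaluation loss.
best_by_loss = min(epoch_results, key=lambda r: r["eval_loss"])
# Alternative criterion for comparison: the highest F1 (first one wins on ties).
best_by_f1 = max(epoch_results, key=lambda r: r["f1"])

print("best by eval_loss: epoch", best_by_loss["epoch"])
print("best by f1:        epoch", best_by_f1["epoch"])
```

Selecting by loss picks epoch 2, whose loss, accuracy, and F1 match the values reported by the final evaluation. To select by F1 instead, one option would be to pass `metric_for_best_model="f1"` to `TrainingArguments`.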

-----
```
    fp16=torch.cuda.is_available(),
```
If a GPU is available, this enables mixed-precision training (fp16), which usually makes training faster and uses less memory.

-----
```
    report_to=[]  # Disable W&B logging
)
```
This disables reporting to external tracking tools like Weights & Biases, which are often used for experiment tracking.

-----
```
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)
```
This block of code creates a Trainer object, which handles the full training and evaluation process automatically.
- `model=model`: uses the RoBERTa model defined earlier.
- `args=training_args`: uses the training settings.
- `train_dataset` and `eval_dataset`: provide the tokenized training and evaluation data.
- `compute_metrics`: uses the custom function to calculate accuracy and F1.
- `tokenizer`: gives the Trainer the tokenizer so it can pad batches and save the tokenizer together with the model.

In [None]:
def compute_metrics(p):
    preds = np.argmax(p.predictions, axis=1)
    acc = accuracy_score(p.label_ids, preds)
    f1 = f1_score(p.label_ids, preds, average="binary")
    return {"accuracy": acc, "f1": f1}

training_args = TrainingArguments(
    output_dir="./results",

    adam_epsilon=1e-7,
    logging_steps=200,
    save_total_limit=1,

    eval_strategy="epoch",
    logging_dir="./logs",
    save_strategy="epoch",
    load_best_model_at_end=True,
    fp16=torch.cuda.is_available(),  # Use 16-bit precision if GPU is available
    report_to=[]  # Disable W&B logging
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)



## **EXECUTION - TRAINING**

```
print("\n--- Starting Fine-Tuning (Expected Time: 1–4 hours on GPU) ---")
```
This line prints a message to the console to let the user know that the fine-tuning process is about to begin.

-----
```
trainer.train()
```
This line starts the actual training process using the Hugging Face Trainer.

In [None]:
print("\n--- Starting Fine-Tuning (Expected Time: 1–4 hours on GPU) ---")
trainer.train()


--- Starting Fine-Tuning (Expected Time: 1–4 hours on GPU) ---


| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.5232 | 0.727758 | 0.5015 | 0.667334 |
| 2 | 0.6957 | 0.692997 | 0.5 | 0.0 |
| 3 | 0.6989 | 0.693379 | 0.5015 | 0.667334 |


TrainOutput(global_step=6000, training_loss=0.6346596705118815, metrics={'train_runtime': 641.2146, 'train_samples_per_second': 74.858, 'train_steps_per_second': 9.357, 'total_flos': 3855839071867680.0, 'train_loss': 0.6346596705118815, 'epoch': 3.0})

## **FINAL EVALUATION**

```
print("\n--- Final Evaluation Results ---")
```
This line prints a message showing that the program is moving to the evaluation stage, where the trained model’s performance is measured on the evaluation dataset.

-----
```
eval_results = trainer.evaluate()
```
This command uses the Hugging Face Trainer to evaluate the model on the evaluation (validation) data.

-----
```
print(eval_results)
```
This prints the evaluation results to the console so you can see how well the model performed.

-----
```
trainer.save_model(training_args.output_dir + "/best_model")
```
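This line saves the model currently held by the Trainer into a folder called "best_model" inside the results directory. Because `load_best_model_at_end=True` was set, that model is the best checkpoint found during training.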

-----
```
print("\nFine-tuning process complete. The resulting model can now be used for inference.")
```
This part prints a confirmation message that the fine-tuning process is done and that the trained model is ready for inference, which means making predictions on new data.

In [None]:
print("\n--- Final Evaluation Results ---")
eval_results = trainer.evaluate()
print(eval_results)

# Save best model checkpoint for future inference
trainer.save_model(training_args.output_dir + "/best_model")

print("\nFine-tuning process complete. The resulting model can now be used for inference.")


--- Final Evaluation Results ---


{'eval_loss': 0.6929973363876343, 'eval_accuracy': 0.5, 'eval_f1': 0.0, 'eval_runtime': 7.8167, 'eval_samples_per_second': 511.725, 'eval_steps_per_second': 63.966, 'epoch': 3.0}

Fine-tuning process complete. The resulting model can now be used for inference.


## **INFERENCE PIPELINE (TESTING ON NEW DATA)**

```
from transformers import pipeline
```
This imports the pipeline feature from the Hugging Face Transformers library.

-----
```
sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model=model,            # model still in memory
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1
)
```
This creates a sentiment analysis pipeline using the fine-tuned model that was just trained.
- `"sentiment-analysis"` tells the pipeline what kind of task it is performing (text classification).
- `model=model` reuses the fine-tuned model that is still in memory.
- `tokenizer=tokenizer` ensures the text is tokenized the same way as during training.
- `device=0 if torch.cuda.is_available() else -1` uses the first GPU (`device=0`) if one is available; otherwise it uses the CPU (`-1`).

-----
```
new_data = [
    "This system is incredibly slow and completely useless for disaster management.",
    "The accuracy is amazing and the new dashboard makes resource allocation simple.",
    "The committee was very critical of the project's limited scope."
]
```
This block creates a list of sample text inputs (`new_data`).
These are short example sentences that simulate real-world reviews or feedback.
The model will analyze each one and assign it a positive or negative label.

-----
```
print("\n--- Running Inference on Unlabeled Data ---")
```
This prints a simple message to show that the inference phase is starting.

-----
```
results = sentiment_analyzer(new_data)
```
This line runs the sentiment analysis on the list of new sentences.
The pipeline processes each sentence, predicts whether it’s positive or negative, and returns the label and confidence score for each one.
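The confidence score is a probability derived from the model's logits. A rough sketch of that step, assuming a softmax over two made-up logits (the values are hypothetical, and the real pipeline performs this internally):

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


logits = np.array([0.0064, 0.0])  # made-up, nearly equal logits for two classes
probs = softmax(logits)

# The reported label is the most probable class; the score is its probability.
label = f"LABEL_{int(np.argmax(probs))}"
score = float(probs.max())
print(label, round(score, 4))
```

When the two logits are almost equal, the score sits just above 0.5, which means the model is close to undecided between the two classes.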

------
```
# Print results
for text, result in zip(new_data, results):
    sentiment = "Positive" if result["label"] == "LABEL_1" else "Negative"
    print(f"\nText: {text}")
    print(f"Prediction: {sentiment} (Score: {result['score']:.4f})")
```
This loop prints the predictions in a readable format. `zip` pairs each input text (`text`) with its corresponding result (`result`). If the model’s label is "LABEL_1", it is interpreted as Positive; otherwise, it is Negative.
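An alternative way to express the label mapping is a dictionary lookup, sketched below. The `results` list imitates the pipeline's output format with made-up scores, and the `label_names` mapping is an assumption that must match how the labels were encoded in the training data.

```python
# Hypothetical pipeline-style outputs; the scores are made up for illustration.
results = [
    {"label": "LABEL_1", "score": 0.91},
    {"label": "LABEL_0", "score": 0.77},
]

# Assumed mapping from generic label names to readable ones.
label_names = {"LABEL_0": "Negative", "LABEL_1": "Positive"}

# Look up each label; anything unexpected is reported as "Unknown".
readable = [
    f"{label_names.get(r['label'], 'Unknown')} (Score: {r['score']:.4f})"
    for r in results
]
for line in readable:
    print(line)
```

Unlike the `if`/`else` form, which treats every label other than "LABEL_1" as Negative, the lookup makes an unexpected label visible instead of silently mapping it.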

-----
```
print("\n--- Next Steps ---")
print("You may now apply this analyzer to your larger dataset for structured sentiment analysis.")
```
This code prints a closing message about the next step: the same model and pipeline can now be used to analyze a larger dataset, performing full sentiment analysis on real project data.

In [None]:
from transformers import pipeline

sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model=model,            # model still in memory
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1
)

# Example text data (mimicking real-world project reviews)
new_data = [
    "This system is incredibly slow and completely useless for disaster management.",
    "The accuracy is amazing and the new dashboard makes resource allocation simple.",
    "The committee was very critical of the project's limited scope."
]

print("\n--- Running Inference on Unlabeled Data ---")
results = sentiment_analyzer(new_data)

# Print results
for text, result in zip(new_data, results):
    sentiment = "Positive" if result["label"] == "LABEL_1" else "Negative"
    print(f"\nText: {text}")
    print(f"Prediction: {sentiment} (Score: {result['score']:.4f})")

print("\n--- Next Steps ---")
print("You may now apply this analyzer to your larger dataset for structured sentiment analysis.")

Device set to use cuda:0



--- Running Inference on Unlabeled Data ---

Text: This system is incredibly slow and completely useless for disaster management.
Prediction: Negative (Score: 0.5016)

Text: The accuracy is amazing and the new dashboard makes resource allocation simple.
Prediction: Negative (Score: 0.5016)

Text: The committee was very critical of the project's limited scope.
Prediction: Negative (Score: 0.5016)

--- Next Steps ---
You may now apply this analyzer to your larger dataset for structured sentiment analysis.


## **Experimentation using RoBERTa**

In [None]:
# ==============================================
# 1. IMPORT LIBRARIES
# ==============================================
import torch
import pandas as pd
from datasets import Dataset
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast
from transformers import TrainingArguments, Trainer
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# ==============================================
# 2. DEVICE SETUP (GPU / CPU)
# ==============================================
if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
else:
    device = torch.device("cpu")
    print("GPU not available, using CPU.")

print("\n--- Loading and Preprocessing Data ---")

# ==============================================
# 3. DATA LOADING AND INITIAL PREPROCESSING
# ==============================================
try:
    df = pd.read_csv("Mental-Health-Twitter.csv")
    print(f"Original dataset shape: {df.shape}")
    print(f"Columns in dataset: {df.columns.tolist()}")
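except FileNotFoundError as err:
    # Nothing below can run without the data, so stop with a clear error.
    # Raising behaves the same in a script and in a notebook, unlike exit().
    raise FileNotFoundError(
        "'Mental-Health-Twitter.csv' not found. Please ensure the file is in the correct path."
    ) from err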

# Filter out rows where 'post_text' or 'label' might be missing
df.dropna(subset=['post_text', 'label'], inplace=True)
print(f"Dataset shape after dropping NaNs: {df.shape}")

# Ensure 'label' column is of integer type for classification
df['label'] = df['label'].astype(int)

# Random (not stratified) 80/20 split, then create Hugging Face Dataset objects
train_df = df.sample(frac=0.8, random_state=42) # 80% for training
eval_df = df.drop(train_df.index)               # Remaining 20% for evaluation

train_dataset = Dataset.from_pandas(train_df[['post_text', 'label']])
eval_dataset = Dataset.from_pandas(eval_df[['post_text', 'label']])

print(f"Training dataset size: {len(train_dataset)}")
print(f"Evaluation dataset size: {len(eval_dataset)}")

# ==============================================
# 4. TOKENIZER SETUP
# ==============================================
MODEL_NAME = "margotwagner/roberta-psychotherapy-eval"
tokenizer = RobertaTokenizerFast.from_pretrained(MODEL_NAME)

def tokenize_function(examples):
    # Converts text into token IDs using 'post_text' column
    return tokenizer(examples["post_text"], truncation=True, padding=True)

# Tokenize the data
tokenized_train = train_dataset.map(tokenize_function, batched=True)
tokenized_eval = eval_dataset.map(tokenize_function, batched=True)

# Rename label column for Hugging Face Trainer
tokenized_train = tokenized_train.rename_column("label", "labels")
tokenized_eval = tokenized_eval.rename_column("label", "labels")

# Report the class balance of the evaluation split
print("\nValidation Set Label Distribution (before tokenization):")
print(eval_df['label'].value_counts())
print(f"Total Validation Samples: {len(eval_df)}")

# Convert to PyTorch tensors
tokenized_train.set_format("torch", columns=["input_ids", "attention_mask", "labels"])
tokenized_eval.set_format("torch", columns=["input_ids", "attention_mask", "labels"])

# ==============================================
# 5. MODEL DEFINITION
# ==============================================
# Get the number of unique labels from your dataset to pass to num_labels
num_labels_in_dataset = df['label'].nunique()
if num_labels_in_dataset != 2:
    print(f"Warning: Your dataset has {num_labels_in_dataset} unique labels, but the model is typically for binary classification. Ensure your 'label' column is correctly mapped to 0 and 1.")

model = RobertaForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=num_labels_in_dataset).to(device)
print(f"Model loaded: {MODEL_NAME} with {num_labels_in_dataset} labels.")

# ==============================================
# 6. METRICS AND TRAINING SETUP
# ==============================================
def compute_metrics(p):
    preds = np.argmax(p.predictions, axis=1)
    # Use 'binary' F1 for two classes; otherwise average across classes with 'weighted'
    if num_labels_in_dataset == 2:
        f1 = f1_score(p.label_ids, preds, average="binary")
    else:
        f1 = f1_score(p.label_ids, preds, average="weighted") # Use 'weighted' for multi-class
    acc = accuracy_score(p.label_ids, preds)
    return {"accuracy": acc, "f1": f1}

training_args = TrainingArguments(
    output_dir="./results_roberta_mental_health",
    adam_epsilon=1e-7,
    logging_steps=200,
    save_total_limit=1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)

# ==============================================
# 7. EXECUTION - TRAINING
# ==============================================
print(f"\n--- Starting Fine-Tuning RoBERTa on {len(train_dataset)} samples ---")
trainer.train()

# ==============================================
# 8. FINAL EVALUATION
# ==============================================
print("\n--- Final Evaluation Results for RoBERTa ---")
eval_results = trainer.evaluate()
print(eval_results)

# Save best model checkpoint for future inference
trainer.save_model("./sentiment_roberta_mental_health_best")

print("\nFine-tuning process complete. The resulting RoBERTa model can now be used for inference.")

# ==============================================
# 9. INFERENCE PIPELINE (TESTING ON NEW DATA)
# ==============================================
from transformers import pipeline

sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1
)

# Example text data (mimicking new tweets)
new_data = [
    "Today was a good day, felt a bit more positive and hopeful.",
    "Connecting with friends really helps lift my spirits.",
    "lovley day out side and nothing to do ",
    "Great night out with my favourite ladies. Much needed after the past few days. I love them so much",
    "Good to be home. Happy itâ€™s finally Friday and work is over... ðŸ˜Š"
]

print("\n--- Running Inference on Unlabeled Data with RoBERTa (Testing Inverted Logic) ---")
results = sentiment_analyzer(new_data)

for text, result in zip(new_data, results):
    predicted_status = "Depression" if result["label"] == "LABEL_1" else "No Depression"

    print(f"\nText: {text}")
    print(f"Prediction: {predicted_status} (Score: {result['score']:.4f})")

Using GPU: Tesla T4

--- Loading and Preprocessing Data ---
Original dataset shape: (20000, 11)
Columns in dataset: ['Unnamed: 0', 'post_id', 'post_created', 'post_text', 'user_id', 'followers', 'friends', 'favourites', 'statuses', 'retweets', 'label']
Dataset shape after dropping NaNs: (20000, 11)
Training dataset size: 16000
Evaluation dataset size: 4000




Validation Set Label Distribution (before tokenization):
label
0    2044
1    1956
Name: count, dtype: int64
Total Validation Samples: 4000



Model loaded: margotwagner/roberta-psychotherapy-eval with 2 labels.





--- Starting Fine-Tuning RoBERTa on 16000 samples ---




| Step | Training Loss |
|---|---|
| 200 | 0.6316 |
| 400 | 0.5109 |
| 600 | 0.4899 |
| 800 | 0.4538 |
| 1000 | 0.4299 |
| 1200 | 0.4637 |
| 1400 | 0.4198 |
| 1600 | 0.4252 |
| 1800 | 0.4396 |
| 2000 | 0.4625 |



--- Final Evaluation Results for RoBERTa ---


{'eval_loss': 0.3679755926132202, 'eval_accuracy': 0.90875, 'eval_f1': 0.9104294478527607, 'eval_runtime': 17.0435, 'eval_samples_per_second': 234.693, 'eval_steps_per_second': 29.337, 'epoch': 3.0}

Fine-tuning process complete. The resulting RoBERTa model can now be used for inference.


Device set to use cuda:0



--- Running Inference on Unlabeled Data with RoBERTa (Testing Inverted Logic) ---

Text: Today was a good day, felt a bit more positive and hopeful.
Prediction: No Depression (Score: 0.9851)

Text: Connecting with friends really helps lift my spirits.
Prediction: No Depression (Score: 0.9631)

Text: lovley day out side and nothing to do 
Prediction: No Depression (Score: 0.9881)

Text: Great night out with my favourite ladies. Much needed after the past few days. I love them so much
Prediction: No Depression (Score: 0.9887)

Text: Good to be home. Happy itâ€™s finally Friday and work is over... ðŸ˜Š
Prediction: No Depression (Score: 0.9917)
