# Japanese Politeness Classifier: Model Training Notebook (v1)
This notebook walks through fine-tuning a Japanese BERT model to classify sentences by politeness level. This first version trains a binary classifier, informal (0) versus formal (1); a three-way split into casual, neutral, and keigo is not implemented here.

## 1. Setup & Imports
Import required libraries including Hugging Face Transformers, Datasets, PyTorch, and other utilities.

In [15]:
import pandas as pd
import numpy as np
import torch
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

In [16]:
from transformers import (
    BertTokenizer,
    BertForSequenceClassification,
    TrainingArguments,
    Trainer,
    DataCollatorWithPadding,
    AutoTokenizer,
    pipeline
)
from datasets import Dataset
import evaluate

In [17]:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

In [18]:
import os
import random
import warnings
from dotenv import load_dotenv
warnings.filterwarnings("ignore")

## 2. Load and Inspect Preprocessed Data
Load the cleaned CSV file created in the preprocessing phase. Make sure the dataset contains the `text` and `label` columns.

In [19]:
df = pd.read_csv(r"G:\Python Projects\politeness-classifier-jp\data\processed\BunnyGirl800-Preprocessed-binary.csv")
df.head(3)

| Unnamed: 0 | text | label | length |
|------------|------|-------|--------|
| 0 | おい ムロ ちょっと来てくれ！ | 0 | 15 |
| 1 | 何か出てきやがった | 0 | 9 |
| 2 | あ… | 0 | 2 |


## 3. Prepare Dataset for Model Input
Tokenize the Japanese text using a tokenizer (e.g., BERT tokenizer pre-trained on Japanese). Convert the data into a Hugging Face Dataset object suitable for training.

In [20]:
load_dotenv()
token = os.getenv("HUGGINGFACE-TOKEN")
tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese", token=token)

In [21]:
# Split the DataFrame
df_train, df_test = train_test_split(df, test_size=0.2, random_state=123, stratify=df["label"])

# Convert to Hugging Face datasets
train_dataset = Dataset.from_pandas(df_train.reset_index(drop=True))
test_dataset = Dataset.from_pandas(df_test.reset_index(drop=True))

In [22]:
def preprocess_function(sentences):
    # Truncate only; DataCollatorWithPadding pads each batch dynamically at training time
    return tokenizer(sentences["text"], truncation=True)

In [23]:
# Apply the tokenizer to the datasets
train_dataset = train_dataset.map(preprocess_function, batched=True)
test_dataset = test_dataset.map(preprocess_function, batched=True)

Map: 100%|██████████| 645/645 [00:00<00:00, 6142.93 examples/s]
Map: 100%|██████████| 162/162 [00:00<00:00, 5684.29 examples/s]
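
The 645 and 162 row counts in the progress bars above are consistent with how scikit-learn resolves a fractional `test_size`: to our understanding, the test share is rounded up and the remainder goes to training. A minimal sketch of that arithmetic, assuming the CSV holds 807 rows (645 + 162; the total is inferred from the output, not stated in the notebook) and using a hypothetical helper `split_sizes`:

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size."""
    n_test = math.ceil(test_size * n_samples)  # test share is rounded up
    n_train = n_samples - n_test               # training gets the remainder
    return n_train, n_test

# 807 is an assumption: the sum of the two row counts shown by Dataset.map
n_train, n_test = split_sizes(807, 0.2)
print(n_train, n_test)
```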


## 4. Define Model Architecture
Load a pre-trained Japanese BERT model with a classification head for 2 classes (informal, formal).

In [24]:
model = BertForSequenceClassification.from_pretrained("cl-tohoku/bert-base-japanese", num_labels=2)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at cl-tohoku/bert-base-japanese and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


## 5. Training Configuration
Define training arguments like batch size, learning rate, epochs, evaluation strategy, logging, and checkpointing.

In [25]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
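
`DataCollatorWithPadding` pads each batch to the length of its longest sequence at collation time, rather than padding the whole dataset up front. The pure-Python sketch below illustrates that idea only; `pad_batch`, the `pad_id` default, and the token IDs are hypothetical, and this is not the library's implementation.

```python
def pad_batch(batch, pad_id=0):
    """Pad every sequence in a batch to the longest one and build attention masks."""
    max_len = max(len(ids) for ids in batch)
    input_ids, attention_mask = [], []
    for ids in batch:
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)  # 0 marks padding
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Made-up token IDs for two sentences of different length
padded = pad_batch([[2, 101, 102, 3], [2, 103, 3]])
print(padded["input_ids"])       # [[2, 101, 102, 3], [2, 103, 3, 0]]
print(padded["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 1, 0]]
```

Padding per batch keeps short sentences from being stretched to the longest sentence in the whole dataset, which matters here because many inputs are only a few characters long.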

In [26]:
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=r"G:\Python Projects\politeness-classifier-jp\models",          # where to save model
    eval_strategy="epoch",     # evaluate every epoch
    learning_rate=2e-5,              # small LR for fine-tuning
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=4,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss"
)

In [27]:
from transformers import Trainer
from transformers import EarlyStoppingCallback

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator
)
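
The `Trainer` above is built without a `compute_metrics` callback, so `trainer.evaluate()` reports only the loss. A minimal sketch of such a callback, reusing the scikit-learn helpers imported in section 1, might look like the following. It is an optional addition rather than part of the original run, macro averaging is an assumed choice, and it would be passed as `compute_metrics=compute_metrics` when constructing the `Trainer`.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Turn (logits, labels) from the Trainer into accuracy and macro P/R/F1."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy check with made-up logits for four examples
toy_logits = np.array([[2.0, -1.0], [0.1, 0.3], [1.5, 0.2], [-0.5, 0.9]])
toy_labels = np.array([0, 1, 1, 1])
toy_result = compute_metrics((toy_logits, toy_labels))
print(toy_result)
```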

## 6. Train the Model
Use the Hugging Face Trainer API to train the model on the prepared dataset.

In [28]:
trainer.train()

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1 | 0.1122 | 0.198949 |
| 2 | 0.1278 | 0.240335 |
| 3 | 0.0583 | 0.182705 |
| 4 | 0.1022 | 0.180564 |


TrainOutput(global_step=324, training_loss=0.10372026627423403, metrics={'train_runtime': 295.136, 'train_samples_per_second': 8.742, 'train_steps_per_second': 1.098, 'total_flos': 29168327152800.0, 'train_loss': 0.10372026627423403, 'epoch': 4.0})
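
The reported `global_step=324` can be checked by hand: with 645 training rows and a per-device batch size of 8, each epoch takes 81 optimizer steps, and 4 epochs give 324. The sketch below assumes a single device and no gradient accumulation, which matches the defaults but is not stated in the notebook; `total_steps` is a hypothetical helper.

```python
import math

def total_steps(n_train, batch_size, epochs):
    """Optimizer steps for a run on one device without gradient accumulation."""
    steps_per_epoch = math.ceil(n_train / batch_size)  # the last partial batch still counts
    return steps_per_epoch * epochs

print(total_steps(645, 8, 4))  # 324
```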

## 7. Evaluate the Model
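Compute the evaluation loss and the per-class precision, recall, and F1-score on the held-out split. Note that this same split is used for per-epoch evaluation and best-checkpoint selection during training, so the scores do not come from a fully independent test set.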

In [29]:
metrics = trainer.evaluate()
print(metrics)

{'eval_loss': 0.18056374788284302, 'eval_runtime': 3.1955, 'eval_samples_per_second': 50.696, 'eval_steps_per_second': 6.572, 'epoch': 4.0}


In [30]:
from sklearn.metrics import classification_report
pred_output = trainer.predict(test_dataset)
preds = np.argmax(pred_output.predictions, axis=1)
labels = pred_output.label_ids
print(classification_report(labels, preds))

              precision    recall  f1-score   support

           0       0.94      0.99      0.96        91
           1       0.98      0.92      0.95        71

    accuracy                           0.96       162
   macro avg       0.96      0.95      0.96       162
weighted avg       0.96      0.96      0.96       162



## Model Results Summary

After fine-tuning a Japanese BERT model (`cl-tohoku/bert-base-japanese`) on a binary classification task — distinguishing between **informal (0)** and **formal (1)** speech — we achieved the following evaluation metrics:

| Metric      | Informal (0) | Formal (1) |
|-------------|--------------|------------|
| Precision   | 0.94         | 0.98       |
| Recall      | 0.99         | 0.92       |
| F1-Score    | 0.96         | 0.95       |
| **Accuracy**| **0.96**     |            |

These results indicate strong performance on the 162-sentence evaluation split, with particularly high precision for the formal class (0.98).

---

## Interpretation

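- The model performs well on both classes of the evaluation split. Because that split was also used to select the best checkpoint and contains only 162 sentences, the scores are likely an optimistic estimate of generalization.
- Lower recall on formal speech (0.92) means that **some formal sentences are misclassified as informal**. Shorter or ambiguous phrasing is a plausible cause, but this has not been verified here.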
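
The per-class counts behind these figures can be approximately recovered from the classification report. The sketch below searches for integer counts consistent with the rounded precision and recall of the formal class (support 71, alongside 91 informal sentences). Because the report rounds to two decimals, the result is a reconstruction, not the saved confusion matrix; `recover_counts` is a hypothetical helper.

```python
def recover_counts(precision, recall, support_pos, support_neg):
    """Find (tp, fn, fp, tn) tuples whose rounded precision/recall match the report."""
    matches = []
    for tp in range(1, support_pos + 1):
        if round(tp / support_pos, 2) != recall:
            continue
        for fp in range(support_neg + 1):
            if round(tp / (tp + fp), 2) == precision:
                matches.append((tp, support_pos - tp, fp, support_neg - fp))
    return matches

# Formal class from the report: precision 0.98, recall 0.92, support 71; informal support 91
counts = recover_counts(0.98, 0.92, 71, 91)
print(counts)  # [(65, 6, 1, 90)]

tp, fn, fp, tn = counts[0]
print(round((tp + tn) / (tp + fn + fp + tn), 2))  # 0.96, matching the reported accuracy
```

Under that reconstruction, about 6 of the 71 formal sentences are predicted as informal, while only 1 informal sentence is predicted as formal.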

---

## Limitations

- **Context matters:** Very short inputs (e.g., 「はい」 or 「うん」) may be misclassified due to lack of syntactic or semantic context.
- **Confidence thresholds:** The model outputs logits for both classes, and always picks the more likely one — even if it’s only slightly higher. For improved reliability, especially in production, applying a **confidence threshold** is recommended to detect low-certainty predictions.
- **Ambiguity:** Speech that blends casual and formal elements can challenge the model, particularly if similar examples were underrepresented in training data.
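
To make the confidence-threshold point concrete: with two classes, the softmax probability of the winning class can never drop below 0.5, so a useful threshold has to sit above that. The sketch below uses plain Python and made-up logits; `classify` is a hypothetical helper and the 0.8 cutoff is an illustrative assumption, not a tuned value.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, threshold=0.8):
    """Return (label_index or None, confidence); None means the model abstains."""
    probs = softmax(logits)
    conf = max(probs)
    label = probs.index(conf)
    return (label if conf >= threshold else None), conf

print(classify([0.2, 0.4]))   # near-tie between the classes: abstains
print(classify([-2.0, 3.0]))  # clear margin: predicts label 1
```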

---

## Next Steps

- Conduct qualitative error analysis on misclassified samples
- Optionally introduce a third class for "ambiguous" or apply a **minimum confidence threshold**
- Use sentence-level context in inference for better reliability
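
For the error analysis, one simple starting point is to list every evaluation sentence whose prediction differs from its label. A minimal sketch, assuming parallel sequences of texts, gold labels, and predictions (in this notebook those would be `test_dataset["text"]`, `labels`, and `preds`); the helper name and the sample rows below are hypothetical:

```python
def misclassified(texts, labels, preds, label_names=("informal", "formal")):
    """Collect the rows where the predicted label differs from the gold label."""
    errors = []
    for text, gold, pred in zip(texts, labels, preds):
        if gold != pred:
            errors.append(
                {"text": text, "gold": label_names[gold], "predicted": label_names[pred]}
            )
    return errors

# Made-up rows for illustration only
sample_texts = ["ありがとうございます", "行くよ", "少々お待ちください"]
sample_labels = [1, 0, 1]
sample_preds = [1, 0, 0]

sample_errors = misclassified(sample_texts, sample_labels, sample_preds)
for row in sample_errors:
    print(row)
```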

## 8. Save the Trained Model
Save the model and tokenizer locally (here, in `models\bert-finetunedv2`) so they can be loaded later for inference.

In [31]:
trainer.save_model(r"G:\Python Projects\politeness-classifier-jp\models\bert-finetunedv2")
tokenizer.save_pretrained(r"G:\Python Projects\politeness-classifier-jp\models\bert-finetunedv2")

('G:\\Python Projects\\politeness-classifier-jp\\models\\bert-finetunedv2\\tokenizer_config.json',
 'G:\\Python Projects\\politeness-classifier-jp\\models\\bert-finetunedv2\\special_tokens_map.json',
 'G:\\Python Projects\\politeness-classifier-jp\\models\\bert-finetunedv2\\vocab.txt',
 'G:\\Python Projects\\politeness-classifier-jp\\models\\bert-finetunedv2\\added_tokens.json')

In [32]:
import json

output_dir = r"G:\Python Projects\politeness-classifier-jp\models\bert-finetunedv2"
os.makedirs(output_dir, exist_ok=True)

with open(os.path.join(output_dir, "metrics.json"), "w") as f:
    json.dump(metrics, f, indent=4)

In [33]:
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay

def save_results(output_dir, metrics, predictions, labels, class_names=None):
    os.makedirs(output_dir, exist_ok=True)

    # 1. Save metrics.json
    with open(os.path.join(output_dir, "metrics.json"), "w") as f:
        json.dump(metrics, f, indent=4)

    # 2. Save classification_report.txt
    report = classification_report(labels, predictions, target_names=class_names, digits=4)
    with open(os.path.join(output_dir, "classification_report.txt"), "w") as f:
        f.write(report)

    # 3. Save confusion_matrix.png
    cm = confusion_matrix(labels, predictions)
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
    fig, ax = plt.subplots(figsize=(6, 6))
    disp.plot(ax=ax, cmap="Blues", values_format="d")
    plt.title("Confusion Matrix")
    plt.savefig(os.path.join(output_dir, "confusion_matrix.png"))
    plt.close()

    print(f"✅ Results saved in {output_dir}")

In [35]:
# Step 1: Evaluate and predict
metrics = trainer.evaluate()
pred_output = trainer.predict(test_dataset)
preds = np.argmax(pred_output.predictions, axis=1)
labels = pred_output.label_ids

# Step 2: Save everything
save_results(
    output_dir=output_dir,
    metrics=metrics,
    predictions=preds,
    labels=labels,
    class_names=["informal", "formal"]  # index order matches label IDs 0 and 1
)

✅ Results saved in G:\Python Projects\politeness-classifier-jp\models\bert-finetunedv2


## 9. Test Inference on New Sentences
Try out the model on your own Japanese inputs using the pipeline or manual tokenization.

In [44]:
import torch
from torch.nn.functional import softmax

# Predict the formality of a new Japanese sentence and report the model's confidence
def predict_formality(text, model, tokenizer, threshold=0.5):
    # With two classes the top probability is always >= 0.5, so the default
    # threshold never rejects; pass a higher value to flag uncertain inputs.
    model.eval()  # disable dropout for inference

    # Tokenize input and move it to the same device as the model
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    # Get logits from model
    with torch.no_grad():
        logits = model(**inputs).logits

    # Convert logits to probabilities
    probs = softmax(logits, dim=-1)
    confidence, pred = torch.max(probs, dim=1)

    label_map = {0: "informal", 1: "formal"}
    predicted_label = label_map[pred.item()]

    if confidence.item() >= threshold:
        return f"{text} is **{predicted_label}** (confidence: {confidence.item():.2f})"
    else:
        return f"Not confident enough to classify (confidence: {confidence.item():.2f})"

# Example use
text = "それマジっすか？信じられないんですけど"
print(predict_formality(text, model, tokenizer))

それマジっすか？信じられないんですけど is **formal** (confidence: 1.00)
