**Step 1: Preparing the Data**

First, we need labeled examples to train the model.

📌 Training Data:
Here, we prepare a list of sample questions along with their corresponding question types.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split

# Training data for question type detection.
data = {
    "question": [
        "What is the age of Alice?",
        "What is the grade of Bob?",
        "Show all students.",
        "Who is older than 20?",
        "List all students with grade A.",
        "What is the name of the student with ID 3?",
        "List all students younger than 22."
    ],
    "label": [
        "AGE_QUERY",
        "GRADE_QUERY",
        "LIST_QUERY",
        "AGE_COMPARISON",
        "GRADE_FILTER",
        "NAME_QUERY",
        "AGE_COMPARISON"
    ]
}

# Creating a DataFrame and splitting the data into training and testing sets.
df = pd.DataFrame(data)
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# Displaying training data.
train_df
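
With a dataset this small, it helps to check how many examples each label has before relying on the split. The following is a minimal sketch using only the standard library; the `labels` list repeats the labels from the cell above, and `label_counts` and `singleton_labels` are names introduced here for illustration.

```python
from collections import Counter

# Labels copied from the training data defined above.
labels = [
    "AGE_QUERY", "GRADE_QUERY", "LIST_QUERY", "AGE_COMPARISON",
    "GRADE_FILTER", "NAME_QUERY", "AGE_COMPARISON",
]

# Count how many questions carry each label.
label_counts = Counter(labels)

# Hypothetical helper list: labels that occur exactly once.
singleton_labels = [label for label, count in label_counts.items() if count == 1]

for label, count in label_counts.items():
    print(f"{label}: {count}")
print(f"Labels with a single example: {len(singleton_labels)}")
```

Five of the six labels have only one example here, so any of them that lands in the test split is never seen during training.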


In [None]:
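# Run this cell first if the libraries are not installed yet.
# accelerate is included because recent transformers releases require it for Trainer.
%pip install pandas scikit-learn transformers torch accelerate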

**🚀 Step 2: Fine-tuning the BERT Model for Question Type Detection**

Now, we fine-tune the BERT model for these types of questions.
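
The label ids used for fine-tuning can also be derived from the label column instead of being written out by hand. The sketch below assumes the labels are available as a plain list (for example via `df["label"].tolist()`); `build_label_mapping` is a hypothetical helper, not part of any library.

```python
def build_label_mapping(labels):
    """Assign consecutive integer ids to labels in order of first appearance."""
    # dict.fromkeys removes duplicates while preserving insertion order.
    unique_labels = list(dict.fromkeys(labels))
    return {label: idx for idx, label in enumerate(unique_labels)}

# Labels copied from the training data in Step 1.
labels = [
    "AGE_QUERY", "GRADE_QUERY", "LIST_QUERY", "AGE_COMPARISON",
    "GRADE_FILTER", "NAME_QUERY", "AGE_COMPARISON",
]

label_mapping = build_label_mapping(labels)
reverse_mapping = {idx: label for label, idx in label_mapping.items()}
num_labels = len(label_mapping)  # Value to pass as num_labels when loading the model.

print(label_mapping)
print(num_labels)
```

Deriving the count this way keeps the size of the classification head in step with the data when labels are added or removed.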

In [None]:
from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments
import torch

# Defining the labels and their integer ids.
label_mapping = {
    "AGE_QUERY": 0,
    "GRADE_QUERY": 1,
    "LIST_QUERY": 2,
    "AGE_COMPARISON": 3,
    "GRADE_FILTER": 4,
    "NAME_QUERY": 5
}
reverse_mapping = {v: k for k, v in label_mapping.items()}

# Loading the tokenizer and the model; the classification head gets one output per label.
model_name = 'bert-base-uncased'
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=len(label_mapping))

# Tokenizing the questions.
def tokenize_function(examples):
    # Convert the pandas Series to a plain list before passing it to the tokenizer.
    # padding="max_length" pads every question to the model's maximum input length.
    return tokenizer(list(examples), padding="max_length", truncation=True)

train_encodings = tokenize_function(train_df["question"])
test_encodings = tokenize_function(test_df["question"])

# Converting the encodings to a PyTorch Dataset.
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

train_dataset = CustomDataset(train_encodings, train_df['label'].map(label_mapping).tolist())
test_dataset = CustomDataset(test_encodings, test_df['label'].map(label_mapping).tolist())

# Trainer settings.
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    logging_dir='./logs',
    logging_steps=10
)

# Creating a Trainer to train the model.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset
)

# Training the model.
trainer.train()
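
The `Trainer` above is given an `eval_dataset` but no metric function, so `trainer.evaluate()` would report the evaluation loss but not accuracy. A metric function can be supplied through the `compute_metrics` argument of `Trainer`. The following is a minimal sketch written in plain Python so that it runs without extra dependencies; it assumes the evaluation output unpacks into a `(logits, labels)` pair, and the sample values at the bottom are made up for illustration.

```python
def compute_metrics(eval_pred):
    """Return the fraction of examples whose highest-scoring class matches the label."""
    logits, labels = eval_pred
    # Pick the index of the largest logit in each row.
    predictions = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(pred == int(label)) for pred, label in zip(predictions, labels))
    return {"accuracy": correct / len(predictions)}

# Made-up logits for three examples and three classes, with their true labels.
sample_logits = [[2.0, 0.1, -1.0], [0.3, 0.2, 1.5], [0.0, 3.0, 0.5]]
sample_labels = [0, 2, 0]

result = compute_metrics((sample_logits, sample_labels))
print(result)
```

It would be passed as `Trainer(..., compute_metrics=compute_metrics)`, after which `trainer.evaluate()` reports the accuracy alongside the loss.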


**Step 3: Using the Model for Question Type Detection**

Now our model is ready to classify new questions!
Let's write a function that takes a new question and predicts its type:

In [None]:
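def predict_question_type(question):
    model.eval()  # Disable dropout for inference.
    inputs = tokenizer(question, return_tensors="pt", truncation=True, padding=True)
    # Move the inputs to the model's device; Trainer may have placed the model on a GPU.
    inputs = {key: val.to(model.device) for key, val in inputs.items()}
    with torch.no_grad():  # Gradients are not needed for prediction.
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_class = torch.argmax(logits, dim=1).item()
    return reverse_mapping[predicted_class]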

# Example:
questions = [
    "What is the age of Alice?",
    "What is the grade of Bob?",
    "Show all students.",
    "List all students with grade A."
]

for q in questions:
    print(f"Question: {q}")
    print(f"Predicted Type: {predict_question_type(q)}\n")
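
`torch.argmax` returns only the most likely class. To see how confident the model is, the logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python; `logits_to_prediction` is a hypothetical helper, and the logits are made-up numbers rather than real model output.

```python
import math

# Same id-to-label mapping as reverse_mapping in Step 2.
reverse_mapping = {
    0: "AGE_QUERY", 1: "GRADE_QUERY", 2: "LIST_QUERY",
    3: "AGE_COMPARISON", 4: "GRADE_FILTER", 5: "NAME_QUERY",
}

def logits_to_prediction(logits):
    """Return (label, probability) for the highest-scoring class."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return reverse_mapping[best], probabilities[best]

# Made-up logits for a single question (one value per class).
example_logits = [0.2, -0.5, 0.1, 2.3, 0.0, -1.2]
label, probability = logits_to_prediction(example_logits)
print(f"{label} ({probability:.2f})")
```

With the real model, the input list would come from `outputs.logits[0].tolist()` inside `predict_question_type`.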

**✨ Result:**

In [None]:
Question: What is the age of Alice?
Predicted Type: AGE_QUERY

Question: What is the grade of Bob?
Predicted Type: GRADE_QUERY

Question: Show all students.
Predicted Type: LIST_QUERY

Question: List all students with grade A.
Predicted Type: GRADE_FILTER
