# Hierarchical Classification Models Analysis

## Models Implemented
1. Deep Learning (BERT)
2. Shallow Learning (SVM)

## Why These Models Were Chosen

### BERT Model
- Selected as a single pre-trained encoder whose output can feed one classification head per hierarchy level
- Leverages pre-trained language understanding capabilities
- Can capture complex semantic relationships in text
- Implements a shared feature layer architecture to learn common representations across hierarchy levels

### SVM Model
- Chosen as a baseline traditional machine learning approach
- Computationally less intensive than deep learning
- Well suited to high-dimensional sparse features such as TF-IDF vectors
- Uses a top-down hierarchy: one Level 1 classifier, then a separate classifier per parent category at Levels 2 and 3

## Model Architecture & Features

### BERT Implementation
- Uses bert-base-uncased as base model
- Hierarchical structure with 3 classification levels
- A shared 768-unit linear layer with ReLU feeds three level-specific linear heads
- Dropout (0.3) for regularization
- AdamW optimizer with learning rate 2e-5
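To put the classification heads in proportion, the sketch below counts the trainable parameters that `HierarchicalBertClassifier` adds on top of the encoder. It assumes the 768-dimensional hidden size of `bert-base-uncased` and the label counts of 6, 64, and 234 printed in the notebook's dataset statistics; `linear_params` is a helper introduced here for illustration only.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

HIDDEN = 768                 # hidden size of bert-base-uncased
NUM_LABELS = (6, 64, 234)    # Cat1, Cat2, Cat3 class counts (assumed from the dataset statistics)

shared = linear_params(HIDDEN, HIDDEN)
heads = [linear_params(HIDDEN, n) for n in NUM_LABELS]
total_added = shared + sum(heads)

print(f"shared layer: {shared:,}")
for level, count in enumerate(heads, start=1):
    print(f"level {level} head: {count:,}")
print(f"total added on top of BERT: {total_added:,}")
```

Under these assumptions the added layers total well under a million parameters, small next to the roughly 110 million in the encoder, so most of the capacity and training cost sits in BERT itself.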

### SVM Implementation
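- TF-IDF vectorization (unigrams and bigrams, `min_df=2`) capped at 10,000 features
- Linear SVM with balanced class weights
- One Level 1 classifier, plus one Level 2 classifier per Cat1 value and one Level 3 classifier per Cat2 value
- When a parent category has only one child class, that class is stored as a constant prediction instead of training a classifier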
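The per-parent routing with a single-class fallback can be sketched without any ML library. In this minimal illustration every name (`route`, the keyword rules, the category strings) is hypothetical; trained classifiers would stand in for the lambdas.

```python
from typing import Callable, Dict, Tuple, Union

# A child model is either a callable classifier or a constant label
# (the fallback used when a parent category has only one child class).
ChildModel = Union[Callable[[str], str], str]

def route(text: str,
          level1: Callable[[str], str],
          level2: Dict[str, ChildModel],
          level3: Dict[str, ChildModel]) -> Tuple[str, str, str]:
    """Predict top-down: each level's prediction selects the next model."""
    def apply(models: Dict[str, ChildModel], parent: str) -> str:
        model = models.get(parent)
        if model is None:
            return "unknown"          # parent never seen during training
        return model(text) if callable(model) else model

    l1 = level1(text)
    l2 = apply(level2, l1)
    l3 = apply(level3, l2)
    return l1, l2, l3

# Toy stand-ins for trained classifiers (hypothetical keyword rules).
level1 = lambda t: "pets" if "dog" in t or "cat" in t else "toys"
level2 = {"pets": lambda t: "dogs" if "dog" in t else "cats",
          "toys": "games"}            # single child class: constant label
level3 = {"dogs": lambda t: "food" if "kibble" in t else "leashes",
          "cats": "litter",
          "games": "puzzles"}

print(route("kibble for a large dog", level1, level2, level3))
print(route("wooden jigsaw", level1, level2, level3))
```

One consequence of this design is that a Level 1 mistake cannot be corrected further down, since the wrong parent selects the wrong child classifier.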

## Performance Results

### BERT Model Results (After 10 epochs)
| Metric | Level 1 | Level 2 | Level 3 |
|--------|---------|---------|---------|
| Accuracy | 93.15% | 83.95% | 75.40% |
| F1-Score | 93.14% | 83.71% | 73.58% |
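
The F1 values in this notebook come from `f1_score(..., average='weighted')`, which averages per-class F1 weighted by each class's support. A dependency-free sketch of that calculation, with toy labels chosen only for illustration, might look like this:

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += f1 * n_true / total
    return score

# Toy example: class 0 is frequent, class 1 is rare.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow class frequency, weighted F1 can stay close to accuracy even when rare classes are predicted poorly; a macro average would expose that more directly.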

### SVM Model Results
| Metric | Level 1 | Level 2 | Level 3 |
|--------|---------|---------|---------|
| Accuracy | 88.90% | 73.65% | 65.65% |
| F1-Score | 89.00% | 73.00% | 64.00% |

### Comparative Analysis
```mermaid
graph TD
    A[Model Performance] --> B[BERT]
    A --> C[SVM]
    B --> D[Level 1: 93.15%]
    B --> E[Level 2: 83.95%]
    B --> F[Level 3: 75.40%]
    C --> G[Level 1: 88.90%]
    C --> H[Level 2: 73.65%]
    C --> I[Level 3: 65.65%]
```

## Key Findings
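1. BERT outperforms SVM on both accuracy and F1 at all three hierarchy levels
2. Both models lose accuracy as the hierarchy deepens and the number of classes grows (6, 64, and 234 categories at Levels 1 to 3)
3. BERT's lead is smallest at Level 1 and widens at Levels 2 and 3, where the categories are finer-grained
4. SVM remains a reasonable baseline at much lower computational cost: its per-category classifiers train in under a second, whereas each BERT training epoch took about 6.5 minutes in the logged run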
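
The size of the gap between the two models can be read off the results tables. The short sketch below recomputes it from the reported accuracies; the dictionary names are introduced here for illustration.

```python
# Accuracies (%) as reported in the two results tables.
bert_acc = {"Level 1": 93.15, "Level 2": 83.95, "Level 3": 75.40}
svm_acc = {"Level 1": 88.90, "Level 2": 73.65, "Level 3": 65.65}

# Difference in percentage points, rounded to hide float noise.
gaps = {level: round(bert_acc[level] - svm_acc[level], 2) for level in bert_acc}

for level, gap in gaps.items():
    print(f"{level}: BERT {bert_acc[level]:.2f}% vs SVM {svm_acc[level]:.2f}% "
          f"(+{gap:.2f} points)")
```

BERT's margin is about 4 points at Level 1 and roughly 10 points at Levels 2 and 3.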


In [None]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import LinearSVC
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm
from transformers import AutoModel, AutoTokenizer, get_linear_schedule_with_warmup

In [None]:
df = pd.read_csv('/Users/nbhagat/hierarchy classification/cleaned_categories_improved.csv')

## Deep Learning
### Model: BERT

In [None]:
class HierarchicalDataset(Dataset):
    def __init__(self, texts, labels1, labels2, labels3, tokenizer, max_length=512):
        self.texts = texts
        self.labels1 = labels1
        self.labels2 = labels2
        self.labels3 = labels3
        self.tokenizer = tokenizer
        self.max_length = max_length
    
    def __len__(self):
        return len(self.texts)
    
    def __getitem__(self, idx):
        text = str(self.texts[idx])
        
        encoding = self.tokenizer(
            text,
            max_length=self.max_length,
            padding='max_length',
            truncation=True,
            return_tensors='pt'
        )
        
        return {
            'input_ids': encoding['input_ids'].flatten(),
            'attention_mask': encoding['attention_mask'].flatten(),
            'label1': torch.tensor(self.labels1[idx], dtype=torch.long),
            'label2': torch.tensor(self.labels2[idx], dtype=torch.long),
            'label3': torch.tensor(self.labels3[idx], dtype=torch.long)
        }

In [None]:
class HierarchicalBertClassifier(nn.Module):
    def __init__(self, model_name='bert-base-uncased', num_labels1=6, num_labels2=64, num_labels3=377):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        
        # Dropout for regularization
        self.dropout = nn.Dropout(0.3)
        
        self.shared_layer = nn.Linear(768, 768)
        self.activation = nn.ReLU()
        
        self.classifier1 = nn.Linear(768, num_labels1)
        self.classifier2 = nn.Linear(768, num_labels2)
        self.classifier3 = nn.Linear(768, num_labels3)
    
    def forward(self, input_ids, attention_mask):
        outputs = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask
        )
        
        pooled_output = outputs.last_hidden_state[:, 0, :]
        
        shared_features = self.activation(self.shared_layer(pooled_output))
        shared_features = self.dropout(shared_features)
        
        level1_output = self.classifier1(shared_features)
        level2_output = self.classifier2(shared_features)
        level3_output = self.classifier3(shared_features)
        
        return level1_output, level2_output, level3_output



In [None]:
class HierarchicalTrainer:
    def __init__(self, model, learning_rate=2e-5):
        self.device = torch.device('cpu')  
        self.model = model.to(self.device)
        self.tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
        self.criterion = nn.CrossEntropyLoss()
        
        # Optimizer
        self.optimizer = torch.optim.AdamW(
            model.parameters(),
            lr=learning_rate,
            weight_decay=0.01
        )

    def prepare_data(self, df, max_length=512, batch_size=8):  # Reduced batch size for CPU
        # Prepare texts and labels
        texts = df['text_combined'].values
        labels1 = df['Cat1_encoded'].values
        labels2 = df['Cat2_encoded'].values
        labels3 = df['Cat3_encoded'].values
        
        # Split data
        train_texts, val_texts, train_l1, val_l1, train_l2, val_l2, train_l3, val_l3 = train_test_split(
            texts, labels1, labels2, labels3, test_size=0.2, random_state=42
        )
        
        train_dataset = HierarchicalDataset(
            train_texts, train_l1, train_l2, train_l3, 
            self.tokenizer, max_length
        )
        val_dataset = HierarchicalDataset(
            val_texts, val_l1, val_l2, val_l3, 
            self.tokenizer, max_length
        )
        
        # Create dataloaders with smaller batch size.
        train_loader = DataLoader(
            train_dataset, 
            batch_size=batch_size, 
            shuffle=True
        )
        val_loader = DataLoader(
            val_dataset, 
            batch_size=batch_size, 
            shuffle=False
        )
        
        return train_loader, val_loader

    
    def train_epoch(self, train_loader):
        self.model.train()
        total_loss = 0
        
        for batch in tqdm(train_loader, desc="Training"):

            input_ids = batch['input_ids'].to(self.device)
            attention_mask = batch['attention_mask'].to(self.device)
            labels1 = batch['label1'].to(self.device)
            labels2 = batch['label2'].to(self.device)
            labels3 = batch['label3'].to(self.device)
            
            # Forward pass
            outputs1, outputs2, outputs3 = self.model(
                input_ids=input_ids,
                attention_mask=attention_mask
            )
            
            # Calculate losses
            loss1 = self.criterion(outputs1, labels1)
            loss2 = self.criterion(outputs2, labels2)
            loss3 = self.criterion(outputs3, labels3)
            
            loss = loss1 + loss2 + loss3
            
            # Backward pass
            loss.backward()
            
            torch.nn.utils.clip_grad_norm_(self.model.parameters(), 1.0)
            self.optimizer.step()
            self.optimizer.zero_grad()
            
            total_loss += loss.item()
        
        return total_loss / len(train_loader)
    
    def evaluate(self, val_loader):
        self.model.eval()
        total_loss = 0
        predictions = []
        actuals = []
        
        with torch.no_grad():
            for batch in tqdm(val_loader, desc="Evaluating"):
                input_ids = batch['input_ids'].to(self.device)
                attention_mask = batch['attention_mask'].to(self.device)
                labels1 = batch['label1'].to(self.device)
                labels2 = batch['label2'].to(self.device)
                labels3 = batch['label3'].to(self.device)
                
                outputs1, outputs2, outputs3 = self.model(
                    input_ids=input_ids,
                    attention_mask=attention_mask
                )
                
                # Calculate losses
                loss1 = self.criterion(outputs1, labels1)
                loss2 = self.criterion(outputs2, labels2)
                loss3 = self.criterion(outputs3, labels3)
                loss = loss1 + loss2 + loss3
                
                total_loss += loss.item()
                
                # Get predictions
                preds1 = torch.argmax(outputs1, dim=1).cpu().numpy()
                preds2 = torch.argmax(outputs2, dim=1).cpu().numpy()
                preds3 = torch.argmax(outputs3, dim=1).cpu().numpy()
                
                # Store predictions and actuals
                predictions.extend(zip(preds1, preds2, preds3))
                actuals.extend(zip(
                    labels1.cpu().numpy(),
                    labels2.cpu().numpy(),
                    labels3.cpu().numpy()
                ))
        
        # Calculate metrics
        metrics = self.calculate_metrics(predictions, actuals)
        metrics['loss'] = total_loss / len(val_loader)
        
        return metrics
    
    def calculate_metrics(self, predictions, actuals):
        # Separate predictions and actuals by level
        preds1, preds2, preds3 = zip(*predictions)
        acts1, acts2, acts3 = zip(*actuals)
        

        metrics = {
            'level1_accuracy': accuracy_score(acts1, preds1),
            'level2_accuracy': accuracy_score(acts2, preds2),
            'level3_accuracy': accuracy_score(acts3, preds3),
            'level1_f1': f1_score(acts1, preds1, average='weighted'),
            'level2_f1': f1_score(acts2, preds2, average='weighted'),
            'level3_f1': f1_score(acts3, preds3, average='weighted'),
        }
        
        exact_matches = sum(1 for p, a in zip(predictions, actuals) if p == a)
        metrics['exact_match'] = exact_matches / len(predictions)
        
        return metrics
    
    def train(self, train_loader, val_loader, epochs=10, early_stopping_patience=3):
        best_val_loss = float('inf')
        early_stopping_counter = 0
        
        for epoch in range(epochs):
            print(f"\nEpoch {epoch+1}/{epochs}")
            
            # Training
            train_loss = self.train_epoch(train_loader)
            val_metrics = self.evaluate(val_loader)
            
            print(f"Train Loss: {train_loss:.4f}")
            print(f"Val Loss: {val_metrics['loss']:.4f}")
            print(f"Val Exact Match: {val_metrics['exact_match']:.4f}")
            print(f"Val Level 1 Accuracy: {val_metrics['level1_accuracy']:.4f}")
            print(f"Val Level 2 Accuracy: {val_metrics['level2_accuracy']:.4f}")
            print(f"Val Level 3 Accuracy: {val_metrics['level3_accuracy']:.4f}")
            
            # Early stopping
            if val_metrics['loss'] < best_val_loss:
                best_val_loss = val_metrics['loss']
                early_stopping_counter = 0
                torch.save(self.model.state_dict(), 'best_model.pt')
            else:
                early_stopping_counter += 1
            
            if early_stopping_counter >= early_stopping_patience:
                print("Early stopping triggered")
                break
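
`calculate_metrics` also reports an exact-match ratio: the share of samples whose predictions are correct at all three levels at once. The toy sketch below, using made-up label tuples and helper names introduced here, shows why this ratio can never exceed the lowest per-level accuracy.

```python
def per_level_accuracy(predictions, actuals, level):
    """Fraction of samples whose prediction is correct at one level."""
    hits = sum(1 for p, a in zip(predictions, actuals) if p[level] == a[level])
    return hits / len(actuals)

def exact_match(predictions, actuals):
    """Fraction of samples correct at every level simultaneously."""
    return sum(1 for p, a in zip(predictions, actuals) if p == a) / len(actuals)

# Made-up (level1, level2, level3) label tuples for four samples.
actuals     = [(0, 3, 7), (1, 4, 9), (0, 2, 5), (1, 4, 8)]
predictions = [(0, 3, 7), (1, 4, 8), (0, 3, 5), (1, 4, 8)]

levels = [per_level_accuracy(predictions, actuals, i) for i in range(3)]
print("per-level accuracy:", levels)
print("exact match:", exact_match(predictions, actuals))
```

In the reported run the exact-match ratio (0.7330) sits just below Level 3 accuracy (0.7540), which suggests that most correct Level 3 predictions also have correct parents.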


In [None]:

def prepare_data_for_model(cleaned_df):
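    # Treat missing titles or bodies as empty strings so the concatenation never yields NaN
    cleaned_df['text_combined'] = cleaned_df['Title'].fillna('') + ' ' + cleaned_df['Text'].fillna('')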
    le1 = LabelEncoder()
    le2 = LabelEncoder()
    le3 = LabelEncoder()
    
    # Fit and transform labels
    cleaned_df['Cat1_encoded'] = le1.fit_transform(cleaned_df['Cat1'])
    cleaned_df['Cat2_encoded'] = le2.fit_transform(cleaned_df['Cat2'])
    cleaned_df['Cat3_encoded'] = le3.fit_transform(cleaned_df['Cat3'])
    
    encoders = {
        'Cat1': le1,
        'Cat2': le2,
        'Cat3': le3
    }
    
    print("\nDataset Statistics:")
    print(f"Total samples: {len(cleaned_df)}")
    print("\nUnique categories per level:")
    print(f"Cat1: {len(le1.classes_)}")
    print(f"Cat2: {len(le2.classes_)}")
    print(f"Cat3: {len(le3.classes_)}")
    

    model_config = {
        'num_labels1': len(le1.classes_),
        'num_labels2': len(le2.classes_),
        'num_labels3': len(le3.classes_)
    }
    
    return cleaned_df, encoders, model_config


In [None]:

cleaned_df, encoders, model_config = prepare_data_for_model(df)
model = HierarchicalBertClassifier(
    model_name='bert-base-uncased',
    num_labels1=model_config['num_labels1'],
    num_labels2=model_config['num_labels2'],
    num_labels3=model_config['num_labels3']
)
trainer = HierarchicalTrainer(model)
train_loader, val_loader = trainer.prepare_data(cleaned_df)
trainer.train(train_loader, val_loader)
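
The training log for this run shows 500 training and 125 evaluation batches per epoch. As a rough consistency check, and assuming the same 10,000-row dataset and 80/20 split reported in the SVM output, the sketch below computes how many batches each batch size would produce; `num_batches` is a helper introduced here.

```python
import math

TRAIN_SAMPLES, VAL_SAMPLES = 8000, 2000   # assumed 80/20 split of 10,000 rows

def num_batches(n_samples: int, batch_size: int) -> int:
    """Batches per epoch for a DataLoader that keeps the last partial batch."""
    return math.ceil(n_samples / batch_size)

for batch_size in (8, 16):
    print(batch_size,
          num_batches(TRAIN_SAMPLES, batch_size),
          num_batches(VAL_SAMPLES, batch_size))
```

Under that assumption the logged counts match a batch size of 16 rather than the default of 8 in `prepare_data`, so the recorded run may have used different settings from the code shown.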


### Scores
```plaintext
F1 Scores:
Level 1 F1 Score: 0.9314
Level 2 F1 Score: 0.8371
Level 3 F1 Score: 0.7358
All Metrics:
level1_accuracy: 0.9315
level2_accuracy: 0.8395
level3_accuracy: 0.7540
level1_f1: 0.9314
level2_f1: 0.8371
level3_f1: 0.7358
exact_match: 0.7330
loss: 2.3248
```

### Training Progress

Epoch 1/10
Training: 100%|██████████| 500/500 [06:27<00:00,  1.29it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.45it/s]
Train Loss: 8.2012
Val Loss: 6.0403
Val Exact Match: 0.2340
Val Level 1 Accuracy: 0.8930
Val Level 2 Accuracy: 0.5345
Val Level 3 Accuracy: 0.2680

Epoch 2/10
Training: 100%|██████████| 500/500 [06:35<00:00,  1.26it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.45it/s]
Train Loss: 5.2607
Val Loss: 4.5996
Val Exact Match: 0.3005
Val Level 1 Accuracy: 0.9205
Val Level 2 Accuracy: 0.6465
Val Level 3 Accuracy: 0.3320

Epoch 3/10
Training: 100%|██████████| 500/500 [06:33<00:00,  1.27it/s]
Evaluating: 100%|██████████| 125/125 [00:35<00:00,  3.55it/s]
Train Loss: 3.9247
Val Loss: 3.7848
Val Exact Match: 0.4075
Val Level 1 Accuracy: 0.9280
Val Level 2 Accuracy: 0.7125
Val Level 3 Accuracy: 0.4485

Epoch 4/10
Training: 100%|██████████| 500/500 [06:35<00:00,  1.26it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.45it/s]
Train Loss: 3.0108
Val Loss: 3.2671
Val Exact Match: 0.5140
Val Level 1 Accuracy: 0.9275
Val Level 2 Accuracy: 0.7750
Val Level 3 Accuracy: 0.5505

Epoch 5/10
Training: 100%|██████████| 500/500 [06:35<00:00,  1.26it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.45it/s]
Train Loss: 2.3361
Val Loss: 2.9271
Val Exact Match: 0.5785
Val Level 1 Accuracy: 0.9260
Val Level 2 Accuracy: 0.7915
Val Level 3 Accuracy: 0.6095

Epoch 6/10
Training: 100%|██████████| 500/500 [06:33<00:00,  1.27it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.46it/s]
Train Loss: 1.7701
Val Loss: 2.6317
Val Exact Match: 0.6230
Val Level 1 Accuracy: 0.9345
Val Level 2 Accuracy: 0.8065
Val Level 3 Accuracy: 0.6510

Epoch 7/10
Training: 100%|██████████| 500/500 [06:35<00:00,  1.26it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.45it/s]
Train Loss: 1.3490
Val Loss: 2.4654
Val Exact Match: 0.6745
Val Level 1 Accuracy: 0.9340
Val Level 2 Accuracy: 0.8240
Val Level 3 Accuracy: 0.7035

Epoch 8/10
Training: 100%|██████████| 500/500 [06:35<00:00,  1.26it/s]
Evaluating: 100%|██████████| 125/125 [00:34<00:00,  3.57it/s]
Train Loss: 1.0143
Val Loss: 2.4604
Val Exact Match: 0.6985
Val Level 1 Accuracy: 0.9345
Val Level 2 Accuracy: 0.8285
Val Level 3 Accuracy: 0.7195

Epoch 9/10
Training: 100%|██████████| 500/500 [06:32<00:00,  1.27it/s]
Evaluating: 100%|██████████| 125/125 [00:36<00:00,  3.45it/s]
Train Loss: 0.7739
Val Loss: 2.3518
Val Exact Match: 0.7195
Val Level 1 Accuracy: 0.9365
Val Level 2 Accuracy: 0.8360
Val Level 3 Accuracy: 0.7440

Epoch 10/10
Training: 100%|██████████| 500/500 [06:34<00:00,  1.27it/s]
Evaluating: 100%|██████████| 125/125 [00:35<00:00,  3.54it/s]
Train Loss: 0.5850
Val Loss: 2.3248
Val Exact Match: 0.7330
Val Level 1 Accuracy: 0.9315
Val Level 2 Accuracy: 0.8395
Val Level 3 Accuracy: 0.7540
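
The validation losses logged for the ten epochs can be replayed through the same patience rule used in `HierarchicalTrainer.train` to see whether early stopping would have fired. This is a minimal sketch; `replay_early_stopping` is a name introduced here, and the loss values are copied from the log.

```python
# Validation losses per epoch, copied from the training log.
val_losses = [6.0403, 4.5996, 3.7848, 3.2671, 2.9271,
              2.6317, 2.4654, 2.4604, 2.3518, 2.3248]

def replay_early_stopping(losses, patience=3):
    """Return (best_epoch, stopped_epoch); stopped_epoch is None if never triggered."""
    best_loss = float("inf")
    best_epoch = None
    counter = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, counter = loss, epoch, 0
        else:
            counter += 1
        if counter >= patience:
            return best_epoch, epoch
    return best_epoch, None

best_epoch, stopped_epoch = replay_early_stopping(val_losses)
print(f"best checkpoint: epoch {best_epoch}, early stop at: {stopped_epoch}")
```

Validation loss improved in every epoch, so early stopping never fired and `best_model.pt` corresponds to epoch 10. The widening gap between training loss (0.5850) and validation loss (2.3248) is still worth watching for overfitting if training continues.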


## Shallow Learning
### Model: Support Vector Machine (SVM)

In [None]:
class HierarchicalSVMClassifier:
    def __init__(self):
        self.tfidf = TfidfVectorizer(
            max_features=10000,
            ngram_range=(1, 2),
            min_df=2
        )
        self.level1_clf = LinearSVC(
            C=1.0,
            class_weight='balanced',
            max_iter=1000
        )
        self.level2_clfs = {}
        self.level3_clfs = {}
        self.le1 = LabelEncoder()
        self.le2 = LabelEncoder()
        self.le3 = LabelEncoder()
    
    def clean_text(self, text):
        """Clean text data"""
        if pd.isna(text):
            return ''
        text = str(text)

        text = text.lower()

        text = ' '.join(text.split())
        return text
    
    def prepare_data(self, df):
        """Prepare text and labels"""
        print("Preparing data...")
        
        # Clean and combine title and text
        print("Cleaning text data...")
        df['Title_clean'] = df['Title'].apply(self.clean_text)
        df['Text_clean'] = df['Text'].apply(self.clean_text)
        df['text_combined'] = df['Title_clean'] + ' ' + df['Text_clean']
        
        print("Removing empty texts...")
        mask = df['text_combined'].str.strip() != ''
        df = df[mask].copy()
        

        print(f"\nTotal samples after cleaning: {len(df)}")
        print("Number of unique categories:")
        print(f"Cat1: {df['Cat1'].nunique()}")
        print(f"Cat2: {df['Cat2'].nunique()}")
        print(f"Cat3: {df['Cat3'].nunique()}")
        
        # Encode labels
        print("\nEncoding labels...")
        df['Cat1_encoded'] = self.le1.fit_transform(df['Cat1'])
        df['Cat2_encoded'] = self.le2.fit_transform(df['Cat2'])
        df['Cat3_encoded'] = self.le3.fit_transform(df['Cat3'])
        

        print("Creating TF-IDF features...")
        X = self.tfidf.fit_transform(df['text_combined'])
        print(f"Feature matrix shape: {X.shape}")
        
        # Split data
        print("\nSplitting data...")
        X_train, X_test, y_train, y_test = train_test_split(
            X,
            df[['Cat1_encoded', 'Cat2_encoded', 'Cat3_encoded']],
            test_size=0.2,
            random_state=42,
            stratify=df['Cat1_encoded']  # Stratify by Cat1 to maintain distribution
        )
        
        print(f"Training samples: {X_train.shape[0]}")
        print(f"Testing samples: {X_test.shape[0]}")
        
        return X_train, X_test, y_train, y_test
    
    def train(self, X_train, y_train):
        """Train hierarchical SVM with handling for single-class cases"""
        print("\nTraining Level 1 classifier...")
        self.level1_clf.fit(X_train, y_train['Cat1_encoded'])
        
        # Train Level 2 classifiers
        print("\nTraining Level 2 classifiers...")
        for cat1 in tqdm(y_train['Cat1_encoded'].unique()):
            mask = y_train['Cat1_encoded'] == cat1
            if mask.sum() > 0:
                X_cat = X_train[mask]
                y_cat = y_train.loc[mask, 'Cat2_encoded']
                
                # Check number of unique classes
                if len(np.unique(y_cat)) > 1:
                    # Train classifier
                    clf = LinearSVC(C=1.0, class_weight='balanced', max_iter=1000)
                    clf.fit(X_cat, y_cat)
                    self.level2_clfs[cat1] = clf
                else:
                    # Store the single class for direct prediction
                    self.level2_clfs[cat1] = np.unique(y_cat)[0]
        
        # Train Level 3 classifiers
        print("\nTraining Level 3 classifiers...")
        skipped_cats = []
        for cat2 in tqdm(y_train['Cat2_encoded'].unique()):
            mask = y_train['Cat2_encoded'] == cat2
            if mask.sum() > 0:
                X_cat = X_train[mask]
                y_cat = y_train.loc[mask, 'Cat3_encoded']
                
                if len(np.unique(y_cat)) > 1:

                    clf = LinearSVC(C=1.0, class_weight='balanced', max_iter=1000)
                    clf.fit(X_cat, y_cat)
                    self.level3_clfs[cat2] = clf
                else:
                    # Store the single class for direct prediction
                    self.level3_clfs[cat2] = np.unique(y_cat)[0]
                    skipped_cats.append((cat2, len(y_cat)))
        
        if skipped_cats:
            print("\nSkipped training for following categories (single class):")
            for cat, count in skipped_cats:
                print(f"Category {cat}: {count} samples")

    def predict(self, X):
        """Make hierarchical predictions with handling for single-class cases"""
        predictions = []
        
        # Level 1 predictions
        l1_preds = self.level1_clf.predict(X)
        
        # Level 2 and 3 predictions
        for i, l1_pred in enumerate(l1_preds):
            # Level 2 prediction
            if l1_pred in self.level2_clfs:
                if isinstance(self.level2_clfs[l1_pred], (int, np.integer)):
                    l2_pred = self.level2_clfs[l1_pred]
                else:
                    # LinearSVC accepts the sparse row directly; no dense conversion needed
                    l2_pred = self.level2_clfs[l1_pred].predict(X[i])[0]
            else:
                l2_pred = -1
            
            # Level 3 prediction
            if l2_pred in self.level3_clfs:
                if isinstance(self.level3_clfs[l2_pred], (int, np.integer)):
                    # Direct prediction for single-class case
                    l3_pred = self.level3_clfs[l2_pred]
                else:
                    # Normal prediction
                    l3_pred = self.level3_clfs[l2_pred].predict(X[i])[0]
            else:
                l3_pred = -1
            
            predictions.append((l1_pred, l2_pred, l3_pred))
        
        return np.array(predictions)
    
    def evaluate(self, X_test, y_test):
        """Evaluate the model"""
        print("\nEvaluating model...")
        predictions = self.predict(X_test)
        
        pred_df = pd.DataFrame({
            'Cat1_pred': self.le1.inverse_transform(predictions[:, 0]),
            'Cat2_pred': self.le2.inverse_transform(predictions[:, 1]),
            'Cat3_pred': self.le3.inverse_transform(predictions[:, 2]),
            'Cat1_true': self.le1.inverse_transform(y_test['Cat1_encoded']),
            'Cat2_true': self.le2.inverse_transform(y_test['Cat2_encoded']),
            'Cat3_true': self.le3.inverse_transform(y_test['Cat3_encoded'])
        })
        
        # Evaluate each level
        for level in range(3):
            print(f"\nLevel {level+1} Results:")
            y_true = y_test[f'Cat{level+1}_encoded']
            y_pred = predictions[:, level]
            
            # Calculate metrics
            accuracy = accuracy_score(y_true, y_pred)
            print(f"Accuracy: {accuracy:.4f}")
            
            print("\nClassification Report:")
            print(classification_report(y_true, y_pred))
        
        exact_matches = sum(1 for p, t in zip(predictions, y_test.values) if all(p == t))
        print(f"\nExact Match Ratio: {exact_matches/len(predictions):.4f}")
        
        pred_df.to_csv('svm_predictions_with_categories.csv', index=False)
        
        return predictions, pred_df


In [None]:

classifier = HierarchicalSVMClassifier()

X_train, X_test, y_train, y_test = classifier.prepare_data(df)
classifier.train(X_train, y_train)
predictions, pred_df = classifier.evaluate(X_test, y_test)


Preparing data...
Cleaning text data...
Removing empty texts...

Total samples after cleaning: 10000
Number of unique categories:
Cat1: 6
Cat2: 64
Cat3: 234

Encoding labels...
Creating TF-IDF features...
Feature matrix shape: (10000, 10000)

Splitting data...
Training samples: 8000
Testing samples: 2000

Training Level 1 classifier...

Training Level 2 classifiers...


100%|██████████| 6/6 [00:00<00:00, 24.04it/s]



Training Level 3 classifiers...


100%|██████████| 64/64 [00:00<00:00, 298.94it/s]



Skipped training for following categories (single class):
Category 62: 35 samples
Category 9: 14 samples
Category 16: 21 samples
Category 49: 12 samples
Category 59: 17 samples
Category 29: 21 samples
Category 17: 9 samples
Category 56: 6 samples
Category 41: 3 samples
Category 35: 17 samples
Category 30: 16 samples
Category 50: 6 samples
Category 40: 3 samples
Category 3: 1 samples

Evaluating model...

Level 1 Results:
Accuracy: 0.8890

Classification Report:
              precision    recall  f1-score   support

           0       0.88      0.82      0.85       140
           1       0.88      0.92      0.90       427
           2       0.85      0.91      0.88       168
           3       0.88      0.85      0.87       598
           4       0.94      0.93      0.93       315
           5       0.90      0.90      0.90       352

    accuracy                           0.89      2000
   macro avg       0.89      0.89      0.89      2000
weighted avg       0.89      0.89      0.89      2000

