# BK Classification Inference Notebook - Hugging Face Hub Version

This notebook loads the best-performing two-stage BART model from the Hugging Face Hub and provides an interface for classifying bibliographic records with BK (Basisklassifikation) codes.

**Model**: Two-Stage BART (25.7% subset accuracy, 0.498 MCC)  
**Hugging Face**: `mrehank209/bk-classification-bart-two-stage`  
**Performance**: Best of the four modeling strategies tested

## Model Performance Summary
- **Subset Accuracy**: 25.7%
- **MCC**: 0.498
- **F1-Micro**: 47.9%
- **F1-Macro**: 21.4%
- **Precision (Micro)**: 66.1%
- **Recall (Micro)**: 37.6%
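
For readers less familiar with multi-label metrics, the sketch below shows how subset accuracy and micro-averaged precision, recall, and F1 can be computed on a toy example. The label matrices are invented for illustration, and the evaluation code behind the figures above is not part of this notebook, so this assumes the standard definitions were used. The last lines check that the reported micro F1 is consistent with the reported micro precision and recall.

```python
# Toy multi-label example: 3 records x 4 labels (values invented for illustration).
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 0]]

# Subset accuracy: a record counts only if its full label set matches exactly.
subset_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Micro averaging pools true/false positives and false negatives over all labels.
pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred) for t, p in zip(row_t, row_p)]
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

precision_micro = tp / (tp + fp)
recall_micro = tp / (tp + fn)
f1_micro = 2 * precision_micro * recall_micro / (precision_micro + recall_micro)

print(f"Subset accuracy: {subset_accuracy:.3f}")
print(f"Micro P/R/F1: {precision_micro:.3f} / {recall_micro:.3f} / {f1_micro:.3f}")

# Consistency check on the reported figures: F1 is the harmonic mean of P and R.
reported_p, reported_r = 0.661, 0.376
implied_f1 = 2 * reported_p * reported_r / (reported_p + reported_r)
print(f"F1 implied by reported P and R: {implied_f1:.3f}")
```

The implied value rounds to 0.479, which agrees with the 47.9% micro F1 listed above.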


In [1]:
# Install required packages if not already installed
# !pip install torch transformers huggingface_hub numpy pandas

import torch
import torch.nn as nn
import pandas as pd
import numpy as np
import json
import os
from transformers import AutoTokenizer, BartModel
from huggingface_hub import hf_hub_download
from typing import List, Dict, Tuple
import warnings
warnings.filterwarnings('ignore')

# Check if GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
print(f"GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")


Using device: cpu
GPU available: False


In [2]:
class BartWithClassifier(nn.Module):
    """BART classifier for multi-label BK classification"""
    
    def __init__(self, num_labels=1884, model_name="facebook/bart-large", dropout=0.1):
        super(BartWithClassifier, self).__init__()
        
        self.num_labels = num_labels
        self.bart = BartModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(self.bart.config.hidden_size, num_labels)
        
    def forward(self, input_ids, attention_mask=None):
        outputs = self.bart(input_ids=input_ids, attention_mask=attention_mask)
        last_hidden_state = outputs.last_hidden_state
        cls_output = last_hidden_state[:, 0, :]  # Pool position 0 of last_hidden_state (BART has no [CLS] token)
        cls_output = self.dropout(cls_output)
        logits = self.classifier(cls_output)
        return logits

print("Model architecture defined.")


Model architecture defined.


In [3]:
def load_model_from_huggingface(model_name="mrehank209/bk-classification-bart-two-stage"):
    """
    Load the complete model from Hugging Face Hub
    
    Args:
        model_name: HuggingFace model repository name
    
    Returns:
        tuple: (model, tokenizer, label_map, idx_to_label, config)
    """
    print(f"Loading model from Hugging Face Hub: {model_name}")
    
    try:
        # Download required files from HF Hub
        print("Downloading model files...")
        classifier_path = hf_hub_download(repo_id=model_name, filename="classifier_head.pt")
        config_path = hf_hub_download(repo_id=model_name, filename="config.json")
        label_map_path = hf_hub_download(repo_id=model_name, filename="label_map.json")
        idx_to_label_path = hf_hub_download(repo_id=model_name, filename="idx_to_label.json")
        
        # Load configuration
        print("Loading configuration...")
        with open(config_path, 'r') as f:
            config = json.load(f)
        
        # Load label mappings
        print("Loading label mappings...")
        with open(label_map_path, 'r') as f:
            label_map = json.load(f)
        
        with open(idx_to_label_path, 'r') as f:
            idx_to_label = json.load(f)
            # Convert string keys back to integers
            idx_to_label = {int(k): v for k, v in idx_to_label.items()}
        
        print(f"Loaded {len(label_map)} BK labels")
        print(f"Sample labels: {list(label_map.keys())[:10]}")
        
        # Initialize model with correct number of labels
        print("Initializing model...")
        model = BartWithClassifier(
            num_labels=config["num_labels"],
            model_name=config["base_model"],
            dropout=config["dropout"]
        )
        
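        # Load classifier head weights. Only the linear head comes from this file;
        # the BART backbone is whatever config["base_model"] points to.
        print("Loading classifier weights...")
        classifier_state = torch.load(classifier_path, map_location='cpu')
        with torch.no_grad():
            # copy_ raises an error on a shape mismatch instead of silently replacing the tensors
            model.classifier.weight.copy_(classifier_state['weight'])
            model.classifier.bias.copy_(classifier_state['bias'])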
        
        # Load tokenizer
        print("Loading tokenizer...")
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        
        # Move model to device and set to evaluation mode
        model.to(device)
        model.eval()
        
        print("Model loaded successfully from Hugging Face Hub!")
        print(f"Model Performance:")
        print(f"   - Subset Accuracy: {config['performance']['subset_accuracy']:.1%}")
        print(f"   - MCC: {config['performance']['mcc']:.3f}")
        print(f"   - F1-Micro: {config['performance']['f1_micro']:.1%}")
        print(f"   - F1-Macro: {config['performance']['f1_macro']:.1%}")
        
        return model, tokenizer, label_map, idx_to_label, config
        
    except Exception as e:
        print(f"Error loading model from Hugging Face: {e}")
        print("""
🔧 Troubleshooting:
1. Make sure the model repository exists and is public
2. Check your internet connection
3. If using a private model, make sure you're authenticated:
   - Run: huggingface-cli login
   - Or set HF_TOKEN environment variable
        """)
        raise

# Load the model
print("Loading model from Hugging Face Hub...")
print("Repository: mrehank209/bk-classification-bart-two-stage")
print("This will work once you upload the model using upload_to_huggingface.py")

# Load the model from Hugging Face Hub
model, tokenizer, label_map, idx_to_label, config = load_model_from_huggingface()


Loading model from Hugging Face Hub...
Repository: mrehank209/bk-classification-bart-two-stage
This will work once you upload the model using upload_to_huggingface.py
Loading model from Hugging Face Hub: mrehank209/bk-classification-bart-two-stage
Downloading model files...
Loading configuration...
Loading label mappings...
Loaded 1884 BK labels
Sample labels: ['01.00', '01.20', '01.22', '01.29', '01.30', '01.40', '02.00', '02.01', '02.02', '02.10']
Initializing model...
Loading classifier weights...
Loading tokenizer...
Model loaded successfully from Hugging Face Hub!
Model Performance:
   - Subset Accuracy: 25.7%
   - MCC: 0.498
   - F1-Micro: 47.9%
   - F1-Macro: 21.4%
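
The integer conversion of the `idx_to_label` keys in the loader is needed because JSON object keys are always strings. A minimal sketch of that round trip follows; the three labels come from the sample output above, but their index positions are assumed for illustration.

```python
import json

# Hypothetical index-to-label mapping; the real file holds 1884 entries.
idx_to_label_demo = {0: "01.00", 1: "01.20", 2: "01.22"}

# Serializing to JSON turns the integer keys into strings.
restored = json.loads(json.dumps(idx_to_label_demo))
print(list(restored.keys()))

# Converting the keys back lets the mapping be indexed by a model output position.
restored_int = {int(k): v for k, v in restored.items()}
print(restored_int[1])
```

Without the conversion, a lookup such as `idx_to_label[idx]` with an integer `idx` would raise a `KeyError`.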


In [4]:
def preprocess_text(title="", summary="", keywords="", loc_keywords="", rvk="", author=""):
    """
    Preprocess bibliographic fields into the format expected by the model.
    
    Args:
        title: Book title
        summary: Book summary/abstract
        keywords: Subject keywords
        loc_keywords: Library of Congress keywords
        rvk: RVK classification codes
        author: Author information (optional, not used in current model)
    
    Returns:
        Formatted text string for model input
    """
    # Combine fields in the same format as training
    input_text = f"""Title: {title or ''}
Summary: {summary or ''}
Keywords: {keywords or ''}
LOC_Keywords: {loc_keywords or ''}
RVK: {rvk or ''}"""
    
    return input_text.strip()

# Test preprocessing with German example
test_text = preprocess_text(
    title="Künstliche Intelligenz in der Bibliothek",
    summary="",
    keywords="",
    loc_keywords="",
    rvk=""
)

print("Sample preprocessed text (German library book):")
print("=" * 50)
print(test_text)
print("=" * 50)


Sample preprocessed text (German library book):
Title: Künstliche Intelligenz in der Bibliothek
Summary: 
Keywords: 
LOC_Keywords: 
RVK:
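
The example above fills only the title. The sketch below applies the same template to a record with more fields populated, including a `None` value as might come from a missing database column. The function name `format_record` and all field values are hypothetical; only the five-line template mirrors `preprocess_text`.

```python
def format_record(record: dict) -> str:
    """Format one record with the same five-line template as preprocess_text."""
    fields = [
        ("Title", "title"),
        ("Summary", "summary"),
        ("Keywords", "keywords"),
        ("LOC_Keywords", "loc_keywords"),
        ("RVK", "rvk"),
    ]
    # Missing or None values become empty strings, as with `value or ''` above.
    lines = [f"{label}: {record.get(key) or ''}" for label, key in fields]
    return "\n".join(lines).strip()

# Illustrative record; the values are made up for this example.
record = {
    "title": "Künstliche Intelligenz in der Bibliothek",
    "summary": "Überblick über KI-Anwendungen in Bibliotheken.",
    "keywords": "Künstliche Intelligenz; Bibliothek",
    "loc_keywords": None,
    "rvk": "",
}
print(format_record(record))
```

Keeping the empty field lines in place, rather than dropping them, preserves the fixed layout the docstring says was used during training.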


In [5]:
def predict_bk_codes(text: str, 
                     threshold: float = 0.25, 
                     top_k: int = 10,
                     max_length: int = 768) -> Dict:
    """
    Predict BK classification codes for input text.
    
    Args:
        text: Preprocessed input text
        threshold: Probability threshold for positive predictions (default: 0.25, optimized for this model)
        top_k: Return top-k predictions regardless of threshold
        max_length: Maximum input sequence length
    
    Returns:
        Dictionary containing predictions and metadata
    """
    # Tokenize input
    inputs = tokenizer(
        text,
        truncation=True,
        padding=True,
        max_length=max_length,
        return_tensors='pt'
    )
    
    # Move to device
    input_ids = inputs['input_ids'].to(device)
    attention_mask = inputs['attention_mask'].to(device)
    
    # Make prediction
    with torch.no_grad():
        logits = model(input_ids=input_ids, attention_mask=attention_mask)
        probabilities = torch.sigmoid(logits).cpu().numpy()[0]  # Get probabilities
    
    # Get predictions above threshold
    threshold_predictions = []
    for idx, prob in enumerate(probabilities):
        if prob >= threshold:
            threshold_predictions.append({
                'label': idx_to_label[idx],
                'probability': float(prob),
                'confidence': 'High' if prob > 0.8 else 'Medium' if prob > 0.6 else 'Low'
            })
    
    # Sort by probability
    threshold_predictions.sort(key=lambda x: x['probability'], reverse=True)
    
    # Get top-k predictions (regardless of threshold)
    top_indices = np.argsort(probabilities)[-top_k:][::-1]
    top_k_predictions = []
    for idx in top_indices:
        top_k_predictions.append({
            'label': idx_to_label[idx],
            'probability': float(probabilities[idx]),
            'confidence': 'High' if probabilities[idx] > 0.8 else 'Medium' if probabilities[idx] > 0.6 else 'Low'
        })
    
    return {
        'threshold_predictions': threshold_predictions,
        'top_k_predictions': top_k_predictions,
        'num_above_threshold': len(threshold_predictions),
        'max_probability': float(np.max(probabilities)),
        'threshold_used': threshold,
        'input_length': len(input_ids[0]),
        'model_info': {
            'name': 'Two-Stage BART',
            'subset_accuracy': config['performance']['subset_accuracy'],
            'mcc': config['performance']['mcc']
        }
    }

print("Prediction functions ready.")
print(f"Optimized threshold: 0.25 (based on validation performance)")
print(f"Expected performance: 25.7% subset accuracy, 0.498 MCC")


Prediction functions ready.
Optimized threshold: 0.25 (based on validation performance)
Expected performance: 25.7% subset accuracy, 0.498 MCC
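
To make the two selection modes concrete, the sketch below applies a sigmoid to a handful of made-up logits, then selects labels by threshold and by top-k, mirroring the logic of `predict_bk_codes` without loading the model. The logits and the helper name `confidence_bucket` are illustrative assumptions; the bucket boundaries are the ones used in the function above.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def confidence_bucket(prob: float) -> str:
    # Same boundaries as in predict_bk_codes.
    return "High" if prob > 0.8 else "Medium" if prob > 0.6 else "Low"

# Made-up logits for five labels (not model output).
logits = [2.0, -1.0, 0.0, -3.0, 1.0]
probs = [sigmoid(z) for z in logits]

threshold, top_k = 0.25, 2

# Threshold mode: every label whose probability reaches the threshold.
above = [i for i, p in enumerate(probs) if p >= threshold]

# Top-k mode: the k most probable labels, whether or not they pass the threshold.
top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

for i in above:
    print(i, f"{probs[i]:.4f}", confidence_bucket(probs[i]))
print("top-k indices:", top)
```

Because each label gets an independent sigmoid, any number of labels can pass the threshold, from none to all of them. A logit of 0 corresponds to a probability of exactly 0.5.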


In [6]:
def display_predictions(predictions: Dict, show_top_k: int = 5):
    """
    Display prediction results in a formatted way.
    """
    print(f"\n{'='*60}")
    print("BK CLASSIFICATION RESULTS")
    print(f"{'='*60}")
    
    # Model info
    model_info = predictions['model_info']
    print(f"Model: {model_info['name']}")
    print(f"Expected Accuracy: {model_info['subset_accuracy']:.1%} | MCC: {model_info['mcc']:.3f}")
    print(f"Input length: {predictions['input_length']} tokens")
    print(f"Max probability: {predictions['max_probability']:.4f}")
    print(f"Threshold used: {predictions['threshold_used']}")
    print(f"Predictions above threshold: {predictions['num_above_threshold']}")
    
    # Show threshold-based predictions
    if predictions['threshold_predictions']:
        print(f"\nPREDICTIONS ABOVE THRESHOLD ({predictions['threshold_used']})")
        print("-" * 60)
        print(f"{'Rank':<4} {'BK Code':<15} {'Probability':<12} {'Confidence':<10}")
        print("-" * 60)
        for i, pred in enumerate(predictions['threshold_predictions'][:show_top_k], 1):
            confidence_emoji = "H" if pred['confidence'] == 'High' else "M" if pred['confidence'] == 'Medium' else "L"
            print(f"{i:<4} {pred['label']:<15} {pred['probability']:<12.4f} {confidence_emoji} {pred['confidence']}")
    else:
        print(f"\nNo predictions above threshold {predictions['threshold_used']}")
    
    # Always show top-k predictions
    print(f"\nTOP-{len(predictions['top_k_predictions'])} PREDICTIONS (Regardless of Threshold)")
    print("-" * 60)
    print(f"{'Rank':<4} {'BK Code':<15} {'Probability':<12} {'Confidence':<10}")
    print("-" * 60)
    for i, pred in enumerate(predictions['top_k_predictions'][:show_top_k], 1):
        confidence_emoji = "H" if pred['confidence'] == 'High' else "M" if pred['confidence'] == 'Medium' else "L"
        print(f"{i:<4} {pred['label']:<15} {pred['probability']:<12.4f} {confidence_emoji} {pred['confidence']}")
    
    print(f"\n{'='*60}")

# Test with the sample text
print("Testing with sample German library book...")
sample_predictions = predict_bk_codes(test_text, threshold=0.5)
display_predictions(sample_predictions, show_top_k=8)


Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.58.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.


Testing with sample German library book...

BK CLASSIFICATION RESULTS
Model: Two-Stage BART
Expected Accuracy: 25.7% | MCC: 0.498
Input length: 39 tokens
Max probability: 0.8759
Threshold used: 0.5
Predictions above threshold: 585

PREDICTIONS ABOVE THRESHOLD (0.5)
------------------------------------------------------------
Rank BK Code         Probability  Confidence
------------------------------------------------------------
1    48.58           0.8759       H High
2    56.11           0.8260       H High
3    31.45           0.7921       M Medium
4    24.30           0.7812       M Medium
5    05.37           0.7777       M Medium
6    86.78           0.7726       M Medium
7    42.60           0.7658       M Medium
8    21.19           0.7583       M Medium

TOP-10 PREDICTIONS (Regardless of Threshold)
------------------------------------------------------------
Rank BK Code         Probability  Confidence
------------------------------------------------------------
1    48.58    

In [8]:
def classify_bibliographic_record(
    title: str,
    summary: str = "",
    keywords: str = "",
    loc_keywords: str = "",
    rvk: str = "",
    threshold: float = 0.5,
    top_k: int = 10,
) -> Dict:
    """
    End-to-end classification function for bibliographic records.
    
    Args:
        title: Book title (required)
        summary: Book summary/abstract
        keywords: Subject keywords
        loc_keywords: Library of Congress keywords
        rvk: RVK classification codes
        threshold: Probability threshold for predictions
            (default here is 0.5; predict_bk_codes itself defaults to 0.25)
        top_k: Number of top predictions to return
    
    Returns:
        Dictionary with predictions and input information
    """
    # Preprocess the input
    processed_text = preprocess_text(
        title=title,
        summary=summary,
        keywords=keywords,
        loc_keywords=loc_keywords,
        rvk=rvk
    )
    
    # Make predictions
    predictions = predict_bk_codes(
        text=processed_text,
        threshold=threshold,
        top_k=top_k
    )
    
    return predictions

# Example usage - you can modify these fields for your own book
print("Example: Computer Science Book")
print("=" * 50)

cs_results = classify_bibliographic_record(
    title="Deep Learning: Grundlagen und praktische Anwendungen",
    summary="",
    keywords="",
    loc_keywords="",
    rvk="",
    threshold=0.2  # Lower threshold for more predictions
)

display_predictions(cs_results)


Example: Computer Science Book

BK CLASSIFICATION RESULTS
Model: Two-Stage BART
Expected Accuracy: 25.7% | MCC: 0.498
Input length: 40 tokens
Max probability: 0.8941
Threshold used: 0.2
Predictions above threshold: 1824

PREDICTIONS ABOVE THRESHOLD (0.2)
------------------------------------------------------------
Rank BK Code         Probability  Confidence
------------------------------------------------------------
1    48.58           0.8941       H High
2    56.11           0.8395       H High
3    05.37           0.7959       M Medium
4    86.78           0.7935       M Medium
5    31.45           0.7935       M Medium

TOP-10 PREDICTIONS (Regardless of Threshold)
------------------------------------------------------------
Rank BK Code         Probability  Confidence
------------------------------------------------------------
1    48.58           0.8941       H High
2    56.11           0.8395       H High
3    05.37           0.7959       M Medium
4    86.78           0.7935  

## 🔧 Test Your Own Books

Modify the cell below to classify your own bibliographic records:


In [9]:
# 📝 Modify these fields for your book:
your_book = {
    'title': "Einführung in die Quantenphysik",
    'summary': "",
    'keywords': "",
    'loc_keywords': "",
    'rvk': ""
}

print("🔬 Your Custom Classification:")
print("=" * 50)

# Run classification
your_results = classify_bibliographic_record(**your_book)
display_predictions(your_results)


🔬 Your Custom Classification:

BK CLASSIFICATION RESULTS
Model: Two-Stage BART
Expected Accuracy: 25.7% | MCC: 0.498
Input length: 35 tokens
Max probability: 0.8900
Threshold used: 0.5
Predictions above threshold: 614

PREDICTIONS ABOVE THRESHOLD (0.5)
------------------------------------------------------------
Rank BK Code         Probability  Confidence
------------------------------------------------------------
1    48.58           0.8900       H High
2    56.11           0.8325       H High
3    31.45           0.7917       M Medium
4    42.60           0.7898       M Medium
5    24.30           0.7872       M Medium

TOP-10 PREDICTIONS (Regardless of Threshold)
------------------------------------------------------------
Rank BK Code         Probability  Confidence
------------------------------------------------------------
1    48.58           0.8900       H High
2    56.11           0.8325       H High
3    31.45           0.7917       M Medium
4    42.60           0.7898    
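
If the results need to go into a spreadsheet or catalogue workflow rather than the console, the returned dictionary can be flattened into rows. The sketch below does this with the standard `csv` module; the helper name `predictions_to_csv` is hypothetical, and the sample dictionary is a hand-built stand-in that reuses the first values printed above instead of calling the model.

```python
import csv
import io

def predictions_to_csv(title: str, predictions: dict, limit: int = 5) -> str:
    """Flatten the top-k part of a predict_bk_codes-style result into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "rank", "bk_code", "probability", "confidence"])
    for rank, pred in enumerate(predictions["top_k_predictions"][:limit], 1):
        writer.writerow([title, rank, pred["label"],
                         f"{pred['probability']:.4f}", pred["confidence"]])
    return buffer.getvalue()

# Stand-in result with the same keys as the real return value (subset only).
demo_results = {
    "top_k_predictions": [
        {"label": "48.58", "probability": 0.8900, "confidence": "High"},
        {"label": "56.11", "probability": 0.8325, "confidence": "High"},
        {"label": "31.45", "probability": 0.7917, "confidence": "Medium"},
    ]
}
print(predictions_to_csv("Einführung in die Quantenphysik", demo_results))
```

In the notebook itself, `your_results` could be passed in place of `demo_results`, since it carries the same `top_k_predictions` key.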