# Jigsaw - Agile Community Rules Classification
### https://www.kaggle.com/competitions/jigsaw-agile-community-rules

## Constrained Generation for Reddit Content Moderation: Binary Classification Using Logit Probabilities
Technical Overview
This notebook implements a constrained generation approach for automated content moderation using a Llama 3.2-1B model. Instead of relying on traditional text parsing, we employ logits processors to restrict the model's output vocabulary to only "True" and "False" tokens, then extract probabilistic confidence scores directly from the model's logit distributions.
A reference notebook from  https://www.kaggle.com/code/xbar19/jigsaw-llama3-1-8b-instruct-fine-tuned is appreciated.

## Install packages on Kaggle: Add-ons > Install Dependencies 

```bash
pip install pip3-autoremove
pip install torch torchvision torchaudio xformers --index-url https://download.pytorch.org/whl/cu124
pip install unsloth vllm
pip install scikit-learn
```

In [1]:
import kagglehub
import pandas as pd
import os
import glob

# Check if running on Kaggle
if 'KAGGLE_KERNEL_RUN_TYPE' in os.environ:
    # Running on Kaggle
    base_path = "/kaggle/input/jigsaw-agile-community-rules/"
    df_train = pd.read_csv(f"{base_path}train.csv")
    df_test = pd.read_csv(f"{base_path}test.csv")
else:
    # Running locally
    base_path = "./data/synthetic_generation/batch_6/"
    
    # Find all train files
    train_files = glob.glob(f"{base_path}*batch*.csv")
    if train_files:
        train_dfs = [pd.read_csv(file) for file in train_files]
        df_train = pd.concat(train_dfs, ignore_index=True)
        print(f"Concatenated {len(train_files)} train files: {train_files}")
    else:
        raise FileNotFoundError(f"No train files found in {base_path}")
    
    # Find all test files
    test_files = glob.glob(f"{base_path}*batch*.csv")
    if test_files:
        test_dfs = [pd.read_csv(file) for file in test_files]
        df_test = pd.concat(test_dfs, ignore_index=True)
        print(f"Concatenated {len(test_files)} test files: {test_files}")
    else:
        raise FileNotFoundError(f"No test files found in {base_path}")

print(f"Using path: {base_path}")
print(f"Train shape: {df_train.shape}")
print(f"Test shape: {df_test.shape}")
df_train.head(2)

Concatenated 50 train files: ['./data/synthetic_generation/batch_6/realistic_rules_batch_19.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_49.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_43.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_5.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_32.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_36.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_47.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_22.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_48.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_27.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_10.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_4.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_39.csv', './data/synthetic_generation/batch_6/realistic_rules_batch_35.csv', './data/synthetic_ge

Unnamed: 0,Subreddit,Rule Name,Rule Description,Formatted Rule,Positive Example 1,Negative Example 1,Positive Example 2,Negative Example 2,Test Comment,Violates Rule,Raw Response,Error
0,MildlyBadDrivers,Crashes Need An Explanation,"Like Rule 2 says, _mildly_ bad driving does no...",Crash posts need OP explanation why it's *mild...,"Yeah, it was a crash. So what? Get 20% off you...",OP here: This was a super low-speed parking lo...,Lol this idiot deserved it. Who cares if it's ...,This is mild because the driver only scraped t...,OP here. It's mild cause the driver was playin...,True,Formatted Rule: Crash posts need OP explanatio...,
1,PhoenixSC,Adhere to the rules & TOS of Reddit,Please stick to the general Reddit rules and r...,"Adhere to Reddit's site-wide rules, User Agree...","Yo check out my new Minecraft server, it's lit...",That new snapshot looks wild! I wonder how the...,"OMG guys, I found a way to get FREE netherite!...","Haha, this meme perfectly captures the pain of...","Anyone else think the Warden is too OP? Like, ...",False,Formatted Rule: Adhere to Reddit's site-wide r...,


In [2]:
df_test["rule"]=df_test["Formatted Rule"]
df_test.columns=['subreddit', 'rule_name', 'rule_description', 'formatted_rule','positive_example_1', 'negative_example_1', 'positive_example_2', \
       'negative_example_2', 'test_comment', 'violates_rule', 'raw_response','error', 'rule']

### Kaggle-train set for test

In [3]:
# df_test = pd.read_csv(f"./data/train.csv")
# df_test["violates_rule"] = "Error"
# df_test["violates_rule"] = df_test["rule_violation"] == 1
# df_test["test_comment"] = df_test["body"] 
# df_test.head(2)

## Load LLM (llama-3.2-1B) model with vLLM (Suitable for batch inference) (logits output)

In [4]:
import multiprocessing as mp
mp.set_start_method('spawn', force=True)
import os
os.environ['VLLM_USE_V1'] = '0'  # Force V0 for logits processor support

import torch
import numpy as np
import pandas as pd
from vllm import LLM, SamplingParams
from transformers import LogitsProcessor
import math
from vllm.lora.request import LoRARequest


class TrueFalseLogitsProcessor(LogitsProcessor):
    """Forces model to only output True or False tokens"""
    def __init__(self, allowed_ids):
        self.allowed_ids = allowed_ids
        
    def __call__(self, input_ids, scores: torch.Tensor) -> torch.Tensor:
        # Create a mask that's -inf for all tokens except allowed ones
        mask = torch.full_like(scores, float('-inf'))
        mask[self.allowed_ids] = 0
        
        # Apply the mask to force only allowed tokens
        scores = scores + mask
        return scores

class LlamaClassifier:
    def __init__(self):
        # Model path selection
        if os.getenv('KAGGLE_KERNEL_RUN_TYPE'):
            self.model_path = "/kaggle/input/llama-3.2/transformers/1b-instruct/1"
        else:
            self.model_path = "unsloth/Qwen3-4B"
        
        # Initialize model with LoRA support
        self.model = LLM(
            model=self.model_path,
            max_model_len=2048,
            gpu_memory_utilization=0.9,
            dtype="half",
            seed=123,
            enable_lora=True,  # Enable LoRA support
            max_lora_rank=64,  # Adjust based on your LoRA configuration
            max_loras=1,       # Maximum number of LoRA adapters to load
        )
        
        self.tokenizer = self.model.get_tokenizer()
        self.setup_token_constraints()
        
        # Sampling with constrained output
        logits_processors = [TrueFalseLogitsProcessor(self.KEEP)]
        self.sampling_params = SamplingParams(
            n=1,
            temperature=0,
            seed=777,
            skip_special_tokens=True,
            max_tokens=1,
            logits_processors=logits_processors,
            logprobs=2
        )
    
    def setup_token_constraints(self):
        """Get token IDs for 'False' and 'True'"""
        choices = ["False", "True"]
        self.KEEP = []
        for x in choices:
            c = self.tokenizer.encode(x, add_special_tokens=False)[0]
            self.KEEP.append(c)
        
        self.false_token_id = self.KEEP[0]
        self.true_token_id = self.KEEP[1]
        print(f"Constrained to tokens: {self.KEEP} = {choices}")

    def create_lora_request(self, lora_adapter_path, adapter_name="custom_adapter"):
        """Create LoRA request object"""
        return LoRARequest(
            lora_name=adapter_name,
            lora_int_id=1,
            lora_local_path=lora_adapter_path
        )
    
    def create_prompt(self, input_data: pd.Series):
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request.

### Instruction:
You are a really experienced moderator for the subreddit /r/{input_data['subreddit']}. 
Your job is to determine if the following reported comment violates the given rule.
Answer with only "True" or "False".

### Input:
Rule: {input_data['rule']}

Example 1:
{self.format_comment(input_data['positive_example_1'])}
Rule violation: True

Example 2:
{self.format_comment(input_data['negative_example_1'])}
Rule violation: False

Example 3:
{self.format_comment(input_data['positive_example_2'])}
Rule violation: True

Example 4:
{self.format_comment(input_data['negative_example_2'])}
Rule violation: False

Test sentence:
{self.format_comment(input_data['test_comment'])}

### Response:
Rule violation:"""
    
    def format_comment(self, comment):
        return "\n".join(["| " + line for line in comment.split('\n')])
    
    def predict_classification(self, input_data: pd.Series, lora_adapter_path=None):
        """Single prediction with optional LoRA"""
        prompt = self.create_prompt(input_data)
        
        # Add LoRA request if path provided
        generate_kwargs = {"use_tqdm": False}
        if lora_adapter_path:
            generate_kwargs["lora_request"] = self.create_lora_request(lora_adapter_path)
        
        responses = self.model.generate([prompt], self.sampling_params, **generate_kwargs)
        
        response = responses[0]
        predicted_text = response.outputs[0].text.strip()
        
        try:
            x = response.outputs[0].logprobs[0]
            
            # Extract probabilities for True/False tokens
            logprobs = []
            for k in self.KEEP:
                if k in x:
                    logprobs.append(math.exp(x[k].logprob))
                else:
                    logprobs.append(0)
            
            logprobs = np.array(logprobs)
            logprobs /= (logprobs.sum() + 1e-15)
            
            violation_probability = logprobs[1]  # True probability
            confidence = max(logprobs)
            
        except Exception as e:
            print(f"Error: {e}")
            violation_probability = 0.5
            confidence = 0.5
        
        return {
            'prediction': predicted_text,
            'is_violation': violation_probability > 0.5,
            'violation_probability': violation_probability,
            'confidence': confidence
        }
    
    def predict_batch(self, input_data_list, verbose=False, lora_adapter_path=None):
        """Batch predictions with optional LoRA"""
        prompts = [self.create_prompt(data) for data in input_data_list]
        
        # Add LoRA request if path provided
        generate_kwargs = {"use_tqdm": True}
        if lora_adapter_path:
            generate_kwargs["lora_request"] = self.create_lora_request(lora_adapter_path)
        
        responses = self.model.generate(prompts, self.sampling_params, **generate_kwargs)
        
        results = []
        for i, response in enumerate(responses):
            try:
                predicted_text = response.outputs[0].text.strip()
                x = response.outputs[0].logprobs[0]
                
                # Extract probabilities
                logprobs = []
                for k in self.KEEP:
                    if k in x:
                        logprobs.append(math.exp(x[k].logprob))
                    else:
                        logprobs.append(0)
                
                logprobs = np.array(logprobs)
                logprobs /= (logprobs.sum() + 1e-15)
                
                violation_probability = logprobs[1]
                confidence = max(logprobs)
                
            except Exception as e:
                print(f"Error {i}: {e}")
                violation_probability = 0.5
                confidence = 0.5
                predicted_text = "Error"
            
            result = {
                'prediction': predicted_text,
                'is_violation': violation_probability > 0.5,
                'violation_probability': violation_probability,
                'confidence': confidence,
                'sample_index': i
            }
            
            # if not verbose:
            #     print(f"Sample {i+1}: {predicted_text} (prob: {violation_probability:.4f})")
            
            results.append(result)
        
        return results

    # Convenience methods for LoRA usage
    def predict_with_lora(self, input_data: pd.Series, lora_adapter_path):
        """Single prediction using LoRA adapter"""
        return self.predict_classification(input_data, lora_adapter_path)
    
    def predict_batch_with_lora(self, input_data_list, lora_adapter_path, verbose=False):
        """Batch predictions using LoRA adapter"""
        return self.predict_batch(input_data_list, verbose, lora_adapter_path)

INFO 08-10 10:56:38 [__init__.py:235] Automatically detected platform cuda.


## Instantiate the model=vLLM

In [5]:
model=LlamaClassifier()

INFO 08-10 10:56:46 [config.py:1604] Using max model len 2048
INFO 08-10 10:56:46 [llm_engine.py:228] Initializing a V0 LLM engine (v0.10.0) with config: model='unsloth/Qwen3-4B', speculative_config=None, tokenizer='unsloth/Qwen3-4B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config={}, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=2048, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto,  device_config=cuda, decoding_config=DecodingConfig(backend='xgrammar', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=123, served_model_name=unsloth/Qwen3-4B, num_scheduler_steps=1, mu

Loading safetensors checkpoint shards:   0% Completed | 0/2 [00:00<?, ?it/s]


INFO 08-10 10:56:57 [default_loader.py:262] Loading weights took 6.79 seconds
INFO 08-10 10:56:57 [punica_selector.py:19] Using PunicaWrapperGPU.
INFO 08-10 10:56:58 [model_runner.py:1115] Model loading took 7.8073 GiB and 8.209698 seconds
INFO 08-10 10:56:59 [worker.py:295] Memory profiling takes 1.59 seconds
INFO 08-10 10:56:59 [worker.py:295] the current vLLM instance can use total_gpu_memory (15.69GiB) x gpu_memory_utilization (0.90) = 14.12GiB
INFO 08-10 10:56:59 [worker.py:295] model weights take 7.81GiB; non_torch_memory takes 0.05GiB; PyTorch activation peak memory takes 1.40GiB; the rest of the memory reserved for KV Cache is 4.87GiB.
INFO 08-10 10:56:59 [executor_base.py:113] # cuda blocks: 2216, # CPU blocks: 1820
INFO 08-10 10:56:59 [executor_base.py:118] Maximum concurrency for 2048 tokens per request: 17.31x
INFO 08-10 10:57:01 [model_runner.py:1385] Capturing cudagraphs for decoding. This may lead to unexpected consequences if the model is not static. To run the model in

Capturing CUDA graph shapes:   0%|          | 0/35 [00:00<?, ?it/s]

INFO 08-10 10:57:17 [model_runner.py:1537] Graph capturing finished in 17 secs, took 0.43 GiB
INFO 08-10 10:57:17 [llm_engine.py:424] init engine (profile, create kv cache, warmup model) took 19.96 seconds
Constrained to tokens: [4049, 2514] = ['False', 'True']


## Prediction for test-dataset (batch)

In [6]:
from tqdm import tqdm
import numpy as np

def process_dataframe_in_batches(model, df, batch_size=12):
    """Process dataframe using batch predictions with progress bar and error handling"""
    
    # Calculate number of batches
    num_batches = len(df) // batch_size + (1 if len(df) % batch_size > 0 else 0)
    
    all_results = []
    failed_batches = []
    failed_indices = []
    
    # Process in batches with progress bar
    with tqdm(total=len(df), desc="Processing predictions") as pbar:
        for i in range(0, len(df), batch_size):
            # Get current batch
            batch_df = df.iloc[i:i+batch_size]
            current_batch_num = i//batch_size + 1
            
            try:
                # Convert batch to list of Series (input format for predict_batch)
                batch_list = [row for idx, row in batch_df.iterrows()]  # Fixed syntax
                
                # Get predictions for this batch
                batch_results = model.predict_batch(batch_list)
                
                # Add batch indices to results for tracking
                for j, result in enumerate(batch_results):
                    result['original_index'] = batch_df.index[j]
                
                # Add to results
                all_results.extend(batch_results)
                
            except Exception as e:
                # Log the error and continue with next batch
                print(f"\nError processing batch {current_batch_num}: {str(e)}")
                failed_batches.append(current_batch_num)
                batch_indices = batch_df.index.tolist()
                failed_indices.extend(batch_indices)
                
                # Create error results for all items in failed batch
                for idx in batch_indices:
                    error_result = {
                        'prediction': 'Error',
                        'is_violation': False,  # Default to False for errors
                        'violation_probability': 0.0,
                        'confidence': 0.0,
                        'original_index': idx,
                        'error': f"Batch {current_batch_num} failed: {str(e)}",
                        'batch_error': True
                    }
                    all_results.append(error_result)
            
            # Update progress bar
            pbar.update(len(batch_df))
            pbar.set_postfix({
                'Batch': f'{current_batch_num}/{num_batches}',
                'Failed': len(failed_batches)
            })
    
    # Print summary
    if failed_batches:
        print(f"\nProcessing completed with {len(failed_batches)} failed batches.")
        print(f"Failed batch numbers: {failed_batches}")
        print(f"Total failed rows: {len(failed_indices)}")
    else:
        print(f"\nAll {num_batches} batches processed successfully!")
    
    return all_results, failed_batches, failed_indices

# Process in batches with error handling
predictions, failed_batches, failed_indices = process_dataframe_in_batches(model, df_test, batch_size=12)

# Extract predictions and handle errors
df_test['predicted_rule_violation'] = [pred['prediction'] for pred in predictions]
df_test['prediction_error'] = [pred.get('error', '') for pred in predictions]
df_test['batch_failed'] = [pred.get('batch_error', False) for pred in predictions]

# Optional: Check results
print(f"Total predictions: {len(predictions)}")
print(f"Failed batches: {len(failed_batches)}")
print(f"Failed rows: {len(failed_indices)}")
if len(predictions) > 0:
    success_rate = ((len(predictions) - len(failed_indices)) / len(predictions)) * 100
    print(f"Success rate: {success_rate:.2f}%")

Processing predictions:   0%|                          | 0/1500 [00:00<?, ?it/s]

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   1%| | 12/1500 [00:00<01:02, 23.70it/s, Batch=1/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   2%| | 24/1500 [00:01<01:03, 23.38it/s, Batch=2/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   2%| | 36/1500 [00:01<01:01, 23.94it/s, Batch=3/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   3%| | 48/1500 [00:02<01:00, 23.87it/s, Batch=4/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   4%| | 60/1500 [00:02<01:00, 23.78it/s, Batch=5/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   5%| | 72/1500 [00:03<01:00, 23.77it/s, Batch=6/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   6%| | 84/1500 [00:03<01:01, 22.85it/s, Batch=7/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   6%| | 96/1500 [00:04<01:01, 22.80it/s, Batch=8/125, Fa

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   7%| | 108/1500 [00:04<01:01, 22.77it/s, Batch=9/125, F

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   8%| | 120/1500 [00:05<01:00, 22.64it/s, Batch=10/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:   9%| | 132/1500 [00:05<01:00, 22.74it/s, Batch=11/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  10%| | 144/1500 [00:06<00:58, 23.05it/s, Batch=12/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  10%| | 156/1500 [00:06<00:58, 22.96it/s, Batch=13/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  11%| | 168/1500 [00:07<00:57, 23.28it/s, Batch=14/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  12%| | 180/1500 [00:07<00:56, 23.45it/s, Batch=15/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  13%|▏| 192/1500 [00:08<00:54, 23.84it/s, Batch=16/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  14%|▏| 204/1500 [00:08<00:53, 24.21it/s, Batch=17/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  14%|▏| 216/1500 [00:09<00:52, 24.38it/s, Batch=18/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  15%|▏| 228/1500 [00:09<00:53, 23.91it/s, Batch=19/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  16%|▏| 240/1500 [00:10<00:53, 23.52it/s, Batch=20/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  17%|▏| 252/1500 [00:10<00:53, 23.34it/s, Batch=21/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  18%|▏| 264/1500 [00:11<00:51, 23.87it/s, Batch=22/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  18%|▏| 276/1500 [00:11<00:50, 24.30it/s, Batch=23/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  19%|▏| 288/1500 [00:12<00:49, 24.33it/s, Batch=24/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  21%|▏| 312/1500 [00:12<00:48, 24.59it/s, Batch=26/125, 


Error processing batch 26: 'float' object has no attribute 'split'


Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  22%|▏| 324/1500 [00:13<00:37, 31.62it/s, Batch=27/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  22%|▏| 336/1500 [00:13<00:39, 29.76it/s, Batch=28/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  23%|▏| 348/1500 [00:14<00:42, 27.35it/s, Batch=29/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  24%|▏| 360/1500 [00:14<00:43, 26.10it/s, Batch=30/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  25%|▏| 372/1500 [00:15<00:43, 25.70it/s, Batch=31/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  26%|▎| 384/1500 [00:15<00:45, 24.55it/s, Batch=32/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  26%|▎| 396/1500 [00:16<00:45, 24.11it/s, Batch=33/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  27%|▎| 408/1500 [00:16<00:45, 24.16it/s, Batch=34/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  28%|▎| 420/1500 [00:17<00:45, 23.87it/s, Batch=35/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  29%|▎| 432/1500 [00:17<00:44, 23.94it/s, Batch=36/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  30%|▎| 444/1500 [00:18<00:44, 23.88it/s, Batch=37/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  30%|▎| 456/1500 [00:18<00:43, 23.84it/s, Batch=38/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  31%|▎| 468/1500 [00:19<00:43, 23.74it/s, Batch=39/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  32%|▎| 480/1500 [00:19<00:42, 23.79it/s, Batch=40/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  34%|▎| 504/1500 [00:20<00:40, 24.31it/s, Batch=42/125, 


Error processing batch 42: 'float' object has no attribute 'split'


Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  34%|▎| 516/1500 [00:20<00:30, 31.79it/s, Batch=43/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  35%|▎| 528/1500 [00:21<00:32, 30.09it/s, Batch=44/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  36%|▎| 540/1500 [00:21<00:34, 28.21it/s, Batch=45/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  37%|▎| 552/1500 [00:22<00:34, 27.32it/s, Batch=46/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  38%|▍| 564/1500 [00:22<00:35, 26.59it/s, Batch=47/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  38%|▍| 576/1500 [00:23<00:34, 26.45it/s, Batch=48/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  39%|▍| 588/1500 [00:23<00:35, 25.73it/s, Batch=49/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  40%|▍| 600/1500 [00:24<00:36, 24.88it/s, Batch=50/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  41%|▍| 612/1500 [00:24<00:36, 24.65it/s, Batch=51/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  42%|▍| 624/1500 [00:25<00:35, 24.83it/s, Batch=52/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  42%|▍| 636/1500 [00:25<00:34, 24.98it/s, Batch=53/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  43%|▍| 648/1500 [00:26<00:33, 25.26it/s, Batch=54/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  44%|▍| 660/1500 [00:26<00:33, 24.75it/s, Batch=55/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  45%|▍| 672/1500 [00:27<00:33, 24.73it/s, Batch=56/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  46%|▍| 684/1500 [00:27<00:33, 24.57it/s, Batch=57/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  47%|▍| 708/1500 [00:28<00:32, 24.39it/s, Batch=59/125, 


Error processing batch 59: 'float' object has no attribute 'split'


Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  48%|▍| 720/1500 [00:28<00:24, 31.91it/s, Batch=60/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  49%|▍| 732/1500 [00:29<00:25, 29.71it/s, Batch=61/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  50%|▍| 744/1500 [00:29<00:26, 28.19it/s, Batch=62/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  50%|▌| 756/1500 [00:30<00:27, 26.71it/s, Batch=63/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  51%|▌| 768/1500 [00:30<00:28, 25.70it/s, Batch=64/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  52%|▌| 780/1500 [00:31<00:28, 24.97it/s, Batch=65/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  53%|▌| 792/1500 [00:31<00:29, 24.29it/s, Batch=66/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  54%|▌| 804/1500 [00:32<00:29, 23.77it/s, Batch=67/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  54%|▌| 816/1500 [00:32<00:28, 23.70it/s, Batch=68/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  55%|▌| 828/1500 [00:33<00:28, 23.44it/s, Batch=69/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  56%|▌| 840/1500 [00:33<00:27, 24.17it/s, Batch=70/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  57%|▌| 852/1500 [00:34<00:26, 24.28it/s, Batch=71/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  58%|▌| 864/1500 [00:34<00:26, 23.99it/s, Batch=72/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  58%|▌| 876/1500 [00:35<00:26, 23.65it/s, Batch=73/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  59%|▌| 888/1500 [00:35<00:25, 24.11it/s, Batch=74/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  60%|▌| 900/1500 [00:36<00:24, 24.41it/s, Batch=75/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  61%|▌| 912/1500 [00:36<00:24, 24.34it/s, Batch=76/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  62%|▌| 924/1500 [00:37<00:23, 24.16it/s, Batch=77/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  62%|▌| 936/1500 [00:37<00:24, 23.38it/s, Batch=78/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  63%|▋| 948/1500 [00:38<00:23, 23.51it/s, Batch=79/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  64%|▋| 960/1500 [00:38<00:22, 24.03it/s, Batch=80/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  65%|▋| 972/1500 [00:39<00:21, 24.13it/s, Batch=81/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  66%|▋| 984/1500 [00:39<00:21, 23.76it/s, Batch=82/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  66%|▋| 996/1500 [00:40<00:21, 23.70it/s, Batch=83/125, 

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  67%|▋| 1008/1500 [00:40<00:20, 23.86it/s, Batch=84/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  68%|▋| 1020/1500 [00:41<00:20, 23.72it/s, Batch=85/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  69%|▋| 1032/1500 [00:41<00:20, 23.21it/s, Batch=86/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  70%|▋| 1044/1500 [00:42<00:19, 23.44it/s, Batch=87/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  70%|▋| 1056/1500 [00:42<00:19, 23.32it/s, Batch=88/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  71%|▋| 1068/1500 [00:43<00:18, 23.53it/s, Batch=89/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  72%|▋| 1080/1500 [00:43<00:17, 24.15it/s, Batch=90/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  73%|▋| 1092/1500 [00:44<00:16, 24.11it/s, Batch=91/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  74%|▋| 1104/1500 [00:44<00:16, 23.89it/s, Batch=92/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  75%|▊| 1128/1500 [00:45<00:15, 23.75it/s, Batch=94/125,


Error processing batch 94: 'float' object has no attribute 'split'


Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  76%|▊| 1140/1500 [00:45<00:11, 31.17it/s, Batch=95/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  77%|▊| 1152/1500 [00:46<00:11, 29.29it/s, Batch=96/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  78%|▊| 1164/1500 [00:46<00:12, 27.90it/s, Batch=97/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  78%|▊| 1176/1500 [00:47<00:12, 26.86it/s, Batch=98/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  79%|▊| 1188/1500 [00:47<00:11, 26.11it/s, Batch=99/125,

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  81%|▊| 1212/1500 [00:48<00:11, 25.87it/s, Batch=101/125


Error processing batch 101: 'float' object has no attribute 'split'


Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  82%|▊| 1224/1500 [00:48<00:08, 32.32it/s, Batch=102/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  82%|▊| 1236/1500 [00:49<00:08, 30.33it/s, Batch=103/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  83%|▊| 1248/1500 [00:49<00:08, 28.46it/s, Batch=104/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  84%|▊| 1260/1500 [00:50<00:08, 26.80it/s, Batch=105/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  85%|▊| 1272/1500 [00:50<00:08, 26.15it/s, Batch=106/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  86%|▊| 1284/1500 [00:51<00:08, 25.51it/s, Batch=107/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  86%|▊| 1296/1500 [00:51<00:08, 25.20it/s, Batch=108/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  87%|▊| 1308/1500 [00:52<00:07, 24.89it/s, Batch=109/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  88%|▉| 1320/1500 [00:52<00:07, 24.62it/s, Batch=110/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  89%|▉| 1332/1500 [00:53<00:06, 24.81it/s, Batch=111/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  90%|▉| 1344/1500 [00:53<00:06, 24.39it/s, Batch=112/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  90%|▉| 1356/1500 [00:54<00:05, 24.32it/s, Batch=113/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  91%|▉| 1368/1500 [00:54<00:05, 24.24it/s, Batch=114/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  92%|▉| 1380/1500 [00:55<00:04, 24.24it/s, Batch=115/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  93%|▉| 1392/1500 [00:55<00:04, 24.33it/s, Batch=116/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  94%|▉| 1404/1500 [00:56<00:03, 24.79it/s, Batch=117/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  94%|▉| 1416/1500 [00:56<00:03, 24.70it/s, Batch=118/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  95%|▉| 1428/1500 [00:57<00:02, 24.65it/s, Batch=119/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  96%|▉| 1440/1500 [00:57<00:02, 24.88it/s, Batch=120/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  97%|▉| 1452/1500 [00:57<00:01, 24.46it/s, Batch=121/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  98%|▉| 1464/1500 [00:58<00:01, 24.87it/s, Batch=122/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions:  98%|▉| 1476/1500 [00:58<00:00, 24.92it/s, Batch=123/125

Adding requests:   0%|          | 0/12 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/12 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions: 100%|█| 1500/1500 [00:59<00:00, 25.24it/s, Batch=125/125


Error processing batch 125: 'float' object has no attribute 'split'

Processing completed with 6 failed batches.
Failed batch numbers: [26, 42, 59, 94, 101, 125]
Total failed rows: 72
Total predictions: 1500
Failed batches: 6
Failed rows: 72
Success rate: 95.20%





## Prediction for test-dataset-LoRA (batch)

In [7]:
# from tqdm import tqdm
# import numpy as np

# def process_dataframe_in_batches_lora(model, df, lora_adapter_path, batch_size=12):
#     """Process dataframe using LoRA batch predictions with progress bar and error handling"""
    
#     # Calculate number of batches
#     num_batches = len(df) // batch_size + (1 if len(df) % batch_size > 0 else 0)
    
#     all_results = []
#     failed_batches = []
#     failed_indices = []
    
#     # Process in batches with progress bar
#     with tqdm(total=len(df), desc="Processing LoRA predictions") as pbar:
#         for i in range(0, len(df), batch_size):
#             # Get current batch
#             batch_df = df.iloc[i:i+batch_size]
#             current_batch_num = i//batch_size + 1
            
#             try:
#                 # Convert batch to list of Series (input format for predict_batch_with_lora)
#                 batch_list = [row for idx, row in batch_df.iterrows()]
                
#                 # Get LoRA predictions for this batch
#                 batch_results = model.predict_batch_with_lora(
#                     batch_list, 
#                     lora_adapter_path=lora_adapter_path,
#                     verbose=False
#                 )
                
#                 # Add batch indices to results for tracking
#                 for j, result in enumerate(batch_results):
#                     result['original_index'] = batch_df.index[j]
                
#                 # Add to results
#                 all_results.extend(batch_results)
                
#             except Exception as e:
#                 # Log the error and continue with next batch
#                 print(f"\nError processing LoRA batch {current_batch_num}: {str(e)}")
#                 failed_batches.append(current_batch_num)
#                 batch_indices = batch_df.index.tolist()
#                 failed_indices.extend(batch_indices)
                
#                 # Create error results for all items in failed batch
#                 for idx in batch_indices:
#                     error_result = {
#                         'prediction': 'Error',
#                         'is_violation': False,
#                         'violation_probability': 0.0,
#                         'confidence': 0.0,
#                         'original_index': idx,
#                         'error': f"LoRA batch {current_batch_num} failed: {str(e)}",
#                         'batch_error': True
#                     }
#                     all_results.append(error_result)
            
#             # Update progress bar
#             pbar.update(len(batch_df))
#             pbar.set_postfix({
#                 'Batch': f'{current_batch_num}/{num_batches}',
#                 'Failed': len(failed_batches),
#                 'LoRA': 'Active'
#             })
    
#     # Print summary
#     if failed_batches:
#         print(f"\nLoRA processing completed with {len(failed_batches)} failed batches.")
#         print(f"Failed batch numbers: {failed_batches}")
#         print(f"Total failed rows: {len(failed_indices)}")
#     else:
#         print(f"\nAll {num_batches} LoRA batches processed successfully!")
    
#     return all_results, failed_batches, failed_indices

# # Usage with LoRA adapter
# lora_adapter_path = "./lora/Qwen3_4B_lora_fp16_r64_s10000_e_3_msl2048"  # Set your LoRA adapter path

# # Process in batches with LoRA and error handling
# predictions, failed_batches, failed_indices = process_dataframe_in_batches_lora(
#     model, 
#     df_test, 
#     lora_adapter_path=lora_adapter_path,
#     batch_size=12
# )

# # Extract predictions and handle errors
# df_test['predicted_rule_violation'] = [pred['prediction'] for pred in predictions]
# df_test['prediction_error'] = [pred.get('error', '') for pred in predictions]
# df_test['batch_failed'] = [pred.get('batch_error', False) for pred in predictions]

# # Check results
# print(f"Total predictions: {len(predictions)}")
# print(f"Failed batches: {len(failed_batches)}")
# print(f"Failed rows: {len(failed_indices)}")
# if len(predictions) > 0:
#     success_rate = ((len(predictions) - len(failed_indices)) / len(predictions)) * 100
#     print(f"LoRA Success rate: {success_rate:.2f}%")

## Summary

In [8]:
import pandas as pd
import numpy as np
from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score, classification_report

def convert_to_bool(series):
    """Simple boolean conversion"""
    if series.dtype == 'bool':
        return series
    
    # Convert to string then to boolean
    def to_bool(val):
        if pd.isna(val):
            return np.nan
        val_str = str(val).strip().lower()
        if val_str in ['true', '1', 'yes', 'y']:
            return True
        elif val_str in ['false', '0', 'no', 'n']:
            return False
        else:
            return np.nan
    
    return series.apply(to_bool)

# Convert both columns to boolean
df_test['violates_rule_bool'] = convert_to_bool(df_test['violates_rule'])
df_test['predicted_bool'] = convert_to_bool(df_test['predicted_rule_violation'])

# Filter valid data (no NaNs, no errors)
valid_mask = (
    df_test['violates_rule_bool'].notna() & 
    df_test['predicted_bool'].notna() &
    (df_test['predicted_rule_violation'] != 'Error')
)
df_clean = df_test[valid_mask]

print(f"Total rows: {len(df_test)}")
print(f"Valid rows: {len(df_clean)}")
print(f"Success rate: {len(df_clean)/len(df_test)*100:.2f}%\n")

if len(df_clean) > 0:
    # Convert to numpy boolean arrays for sklearn
    y_true = np.array(df_clean['violates_rule_bool'], dtype=bool)
    y_pred = np.array(df_clean['predicted_bool'], dtype=bool)
    
    # Calculate metrics
    f1 = f1_score(y_true, y_pred)
    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred, zero_division=0)
    recall = recall_score(y_true, y_pred, zero_division=0)
    
    print(f"F1 Score: {f1:.4f}")
    print(f"Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}\n")
    
    print("Classification Report:")
    print(classification_report(y_true, y_pred, target_names=['No Violation', 'Violation']))
else:
    print("No valid data for evaluation.")

Total rows: 1500
Valid rows: 1428
Success rate: 95.20%

F1 Score: 0.8126
Accuracy: 0.7157
Precision: 0.9832
Recall: 0.6924

Classification Report:
              precision    recall  f1-score   support

No Violation       0.27      0.90      0.41       157
   Violation       0.98      0.69      0.81      1271

    accuracy                           0.72      1428
   macro avg       0.62      0.80      0.61      1428
weighted avg       0.90      0.72      0.77      1428

