# Jigsaw - Agile Community Rules Classification
### https://www.kaggle.com/competitions/jigsaw-agile-community-rules

## Constrained Generation for Reddit Content Moderation: Binary Classification Using Logit Probabilities
### Technical Overview
This notebook implements a constrained-generation approach to automated content moderation with the Llama-3.2-1B-Instruct model. Instead of parsing free-form generated text, a logits processor boosts the "True" and "False" tokens so that the single generated token is effectively restricted to one of them. The log-probabilities of those two tokens are then renormalized into a violation probability, which also serves as the confidence score.

This notebook builds on the reference notebook at https://www.kaggle.com/code/xbar19/jigsaw-llama3-1-8b-instruct-fine-tuned, which is gratefully acknowledged.
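
As a minimal sketch of the scoring idea, assuming the model has already returned log-probabilities for the "False" and "True" tokens, the two values can be renormalized into a single violation probability. The function name and the numeric inputs below are illustrative assumptions, not values taken from the model.

```python
import math

def violation_probability(logprob_false: float, logprob_true: float) -> float:
    """Renormalize two token log-probabilities into P(True)."""
    p_false = math.exp(logprob_false)
    p_true = math.exp(logprob_true)
    return p_true / (p_true + p_false)

# Hypothetical log-probabilities for the "False" and "True" tokens
p = violation_probability(logprob_false=-1.5, logprob_true=-0.5)
print(f"{p:.6f}")  # 0.731059
```

Because only the two allowed tokens enter the normalization, the result is equivalent to a sigmoid of the difference between the "True" and "False" log-probabilities.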

## Install packages on Kaggle: Add-ons > Install Dependencies 

```bash
pip install pip3-autoremove
pip install torch torchvision torchaudio xformers --index-url https://download.pytorch.org/whl/cu124
pip install unsloth vllm
pip install scikit-learn
```

In [11]:
import os

import pandas as pd

# Pick the data directory depending on where the notebook runs
if 'KAGGLE_KERNEL_RUN_TYPE' in os.environ:
    # Running on Kaggle
    base_path = "/kaggle/input/jigsaw-agile-community-rules/"
else:
    # Running locally
    base_path = "./data/"

df_train = pd.read_csv(f"{base_path}train.csv")
df_test = pd.read_csv(f"{base_path}test.csv")

print(f"Using path: {base_path}")
df_train.head(2)

Using path: ./data/


Unnamed: 0,row_id,body,rule,subreddit,positive_example_1,positive_example_2,negative_example_1,negative_example_2,rule_violation
0,0,Banks don't want you to know this! Click here ...,"No Advertising: Spam, referral links, unsolici...",Futurology,If you could tell your younger self something ...,hunt for lady for jack off in neighbourhood ht...,Watch Golden Globe Awards 2017 Live Online in ...,"DOUBLE CEE x BANDS EPPS - ""BIRDS""\n\nDOWNLOAD/...",0
1,1,SD Stream [ ENG Link 1] (http://www.sportsstre...,"No Advertising: Spam, referral links, unsolici...",soccerstreams,[I wanna kiss you all over! Stunning!](http://...,LOLGA.COM is One of the First Professional Onl...,#Rapper \n🚨Straight Outta Cross Keys SC 🚨YouTu...,[15 Amazing Hidden Features Of Google Search Y...,0


## Load the Llama-3.2-1B-Instruct model with vLLM (batch inference with logprob output)

In [12]:
import multiprocessing as mp
mp.set_start_method('spawn', force=True)
import os
os.environ['VLLM_USE_V1'] = '0'  # Force V0 for logits processor support

import torch
import numpy as np
import pandas as pd
from vllm import LLM, SamplingParams
from transformers import LogitsProcessor
import math

class TrueFalseLogitsProcessor(LogitsProcessor):
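    """Boosts the 'True'/'False' token logits so that only those tokens can win."""
    def __init__(self, allowed_ids):
        self.allowed_ids = allowed_ids
        
    def __call__(self, input_ids, scores: torch.Tensor) -> torch.Tensor:
        # Adding the same large constant to both allowed tokens makes every other
        # token negligible while keeping the True-vs-False logit gap unchanged
        scores[self.allowed_ids] += 100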
        return scores

class LlamaClassifier:
    def __init__(self):
        # Model path selection
        if os.getenv('KAGGLE_KERNEL_RUN_TYPE'):
            self.model_path = "/kaggle/input/llama-3.2/transformers/1b-instruct/1"
        else:
            self.model_path = "unsloth/Llama-3.2-1B-Instruct"
        
        # Initialize model
        self.model = LLM(
            model=self.model_path,
            max_model_len=1024,
            gpu_memory_utilization=0.5,
            dtype="half",
            seed=123,
        )
        
        self.tokenizer = self.model.get_tokenizer()
        self.setup_token_constraints()
        
        # Sampling with constrained output
        logits_processors = [TrueFalseLogitsProcessor(self.KEEP)]
        self.sampling_params = SamplingParams(
            n=1,
            temperature=0,
            seed=777,
            skip_special_tokens=True,
            max_tokens=1,
            logits_processors=logits_processors,
            logprobs=2
        )
    
    def setup_token_constraints(self):
        """Get token IDs for 'False' and 'True'"""
        choices = ["False", "True"]
        self.KEEP = []
        for x in choices:
            c = self.tokenizer.encode(x, add_special_tokens=False)[0]
            self.KEEP.append(c)
        
        self.false_token_id = self.KEEP[0]
        self.true_token_id = self.KEEP[1]
        print(f"Constrained to tokens: {self.KEEP} = {choices}")
    
    def create_prompt(self, input_data: pd.Series):
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request.

### Instruction:
You are a really experienced moderator for the subreddit /r/{input_data['subreddit']}. 
Your job is to determine if the following reported comment violates the given rule.
Answer with only "True" or "False".

### Input:
Rule: {input_data['rule']}

Example 1:
{self.format_comment(input_data['positive_example_1'])}
Rule violation: True

Example 2:
{self.format_comment(input_data['negative_example_1'])}
Rule violation: False

Example 3:
{self.format_comment(input_data['positive_example_2'])}
Rule violation: True

Example 4:
{self.format_comment(input_data['negative_example_2'])}
Rule violation: False

Test sentence:
{self.format_comment(input_data['body'])}

### Response:
Rule violation:"""
    
    def format_comment(self, comment):
        return "\n".join(["| " + line for line in comment.split('\n')])
    
    def predict_classification(self, input_data: pd.Series):
        """Single prediction"""
        prompt = self.create_prompt(input_data)
        responses = self.model.generate([prompt], self.sampling_params, use_tqdm=False)
        
        response = responses[0]
        predicted_text = response.outputs[0].text.strip()
        
        try:
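            # Top-k logprobs for the single generated position: {token_id: Logprob}
            token_logprobs = response.outputs[0].logprobs[0]
            
            # Convert the False/True token logprobs to probabilities;
            # a token missing from the top-k gets probability 0
            probs = []
            for k in self.KEEP:
                if k in token_logprobs:
                    probs.append(math.exp(token_logprobs[k].logprob))
                else:
                    probs.append(0.0)
            
            # Renormalize so the two probabilities sum to 1
            probs = np.array(probs, dtype=float)
            probs /= (probs.sum() + 1e-15)
            
            violation_probability = probs[1]  # True probability
            confidence = max(probs)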
            
        except Exception as e:
            print(f"Error: {e}")
            violation_probability = 0.5
            confidence = 0.5
        
        return {
            'prediction': predicted_text,
            'is_violation': violation_probability > 0.5,
            'violation_probability': violation_probability,
            'confidence': confidence
        }
    
    def predict_batch(self, input_data_list, verbose=False):
        """Batch predictions"""
        prompts = [self.create_prompt(data) for data in input_data_list]
        responses = self.model.generate(prompts, self.sampling_params, use_tqdm=True)
        
        results = []
        for i, response in enumerate(responses):
            try:
                predicted_text = response.outputs[0].text.strip()
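                # Top-k logprobs for the single generated position
                token_logprobs = response.outputs[0].logprobs[0]
                
                # Convert the False/True token logprobs to probabilities;
                # a token missing from the top-k gets probability 0
                probs = []
                for k in self.KEEP:
                    if k in token_logprobs:
                        probs.append(math.exp(token_logprobs[k].logprob))
                    else:
                        probs.append(0.0)
                
                # Renormalize so the two probabilities sum to 1
                probs = np.array(probs, dtype=float)
                probs /= (probs.sum() + 1e-15)
                
                violation_probability = probs[1]  # True probability
                confidence = max(probs)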
                
            except Exception as e:
                print(f"Error {i}: {e}")
                violation_probability = 0.5
                confidence = 0.5
                predicted_text = "Error"
            
            result = {
                'prediction': predicted_text,
                'is_violation': violation_probability > 0.5,
                'violation_probability': violation_probability,
                'confidence': confidence,
                'sample_index': i
            }
            
            # if not verbose:
            #     print(f"Sample {i+1}: {predicted_text} (prob: {violation_probability:.4f})")
            
            results.append(result)
        
        return results
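
The constant boost used in `TrueFalseLogitsProcessor` can be checked on a toy example. The sketch below uses plain Python and a made-up four-token vocabulary (the logit values and token indices are assumptions for illustration only). It suggests why adding the same large bonus to both allowed tokens pushes nearly all probability mass onto them while leaving their relative odds unchanged.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def boost(logits, allowed_ids, bonus=100.0):
    """Add the same bonus to every allowed token, mirroring `scores[ids] += 100`."""
    return [v + bonus if i in allowed_ids else v for i, v in enumerate(logits)]

# Toy vocabulary of 4 tokens; indices 1 and 3 stand in for "False" and "True"
logits = [2.0, 0.5, 3.0, 1.5]
allowed = {1, 3}

probs = softmax(boost(logits, allowed))
allowed_mass = probs[1] + probs[3]
odds_true_vs_false = probs[3] / probs[1]

print(f"mass on allowed tokens: {allowed_mass:.6f}")
print(f"odds True vs False: {odds_true_vs_false:.6f}")
```

The odds equal `exp(1.5 - 0.5)`, the same value the unboosted logits would give for those two tokens. A hard mask that sets every other logit to negative infinity would be an alternative design with the same effect on the two allowed tokens.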

## Instantiate the vLLM-based classifier defined above

In [13]:
model=LlamaClassifier()

INFO 08-03 15:36:47 [config.py:1604] Using max model len 1024
INFO 08-03 15:36:47 [llm_engine.py:228] Initializing a V0 LLM engine (v0.10.0) with config: model='unsloth/Llama-3.2-1B-Instruct', speculative_config=None, tokenizer='unsloth/Llama-3.2-1B-Instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config={}, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=1024, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto,  device_config=cuda, decoding_config=DecodingConfig(backend='xgrammar', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=123, served_model_name=unsloth/Llama-3.2

Loading safetensors checkpoint shards:   0% Completed | 0/1 [00:00<?, ?it/s]


INFO 08-03 15:36:50 [default_loader.py:262] Loading weights took 0.78 seconds
INFO 08-03 15:36:50 [model_runner.py:1115] Model loading took 2.3029 GiB and 1.451490 seconds
INFO 08-03 15:36:51 [worker.py:295] Memory profiling takes 0.26 seconds
INFO 08-03 15:36:51 [worker.py:295] the current vLLM instance can use total_gpu_memory (15.70GiB) x gpu_memory_utilization (0.50) = 7.85GiB
INFO 08-03 15:36:51 [worker.py:295] model weights take 2.30GiB; non_torch_memory takes -0.01GiB; PyTorch activation peak memory takes 1.17GiB; the rest of the memory reserved for KV Cache is 4.39GiB.
INFO 08-03 15:36:51 [executor_base.py:113] # cuda blocks: 8982, # CPU blocks: 8192
INFO 08-03 15:36:51 [executor_base.py:118] Maximum concurrency for 1024 tokens per request: 140.34x
INFO 08-03 15:36:52 [model_runner.py:1385] Capturing cudagraphs for decoding. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in t

Capturing CUDA graph shapes:   0%|          | 0/35 [00:00<?, ?it/s]

INFO 08-03 15:37:05 [model_runner.py:1537] Graph capturing finished in 13 secs, took 0.16 GiB
INFO 08-03 15:37:05 [llm_engine.py:424] init engine (profile, create kv cache, warmup model) took 14.62 seconds
Constrained to tokens: [4139, 2575] = ['False', 'True']


## Prediction for train-dataset (non-batch)

In [14]:
# from tqdm import tqdm
# tqdm.pandas()

# # Apply the prompt function row-wise with progress bar
# result = df_train.iloc[0:1].progress_apply(model.predict_classification, axis=1)

# # Since result is a Series, access the first (and only) result
# first_result = result.iloc[0]
# print("CLASSIFICATION RESULTS:")
# print(f"Predicted: {first_result['prediction']}")
# print(f"Is Violation: {first_result['is_violation']}")
# print(f"Violation Probability: {first_result['violation_probability']:.3f}")
# print(f"Confidence: {first_result['confidence']:.3f}")
# # print(f"Raw LogProbs: {first_result['raw_logprobs']}")

## Prediction for train-dataset (batch)

In [15]:
# from tqdm import tqdm
# import numpy as np
# print(df_train.shape)
# def process_dataframe_in_batches(model, df, batch_size=12):
#     """Process dataframe using batch predictions with progress bar"""
    
#     # Calculate number of batches
#     num_batches = len(df) // batch_size + (1 if len(df) % batch_size > 0 else 0)
    
#     all_results = []
    
#     # Process in batches with progress bar
#     with tqdm(total=len(df), desc="Processing predictions") as pbar:
#         for i in range(0, len(df), batch_size):
#             # Get current batch
#             batch_df = df.iloc[i:i+batch_size]
            
#             # Convert batch to list of Series (input format for predict_batch)
#             batch_list = [row for _, row in batch_df.iterrows()]
            
#             # Get predictions for this batch
#             batch_results = model.predict_batch(batch_list)
            
#             # Add to results
#             all_results.extend(batch_results)
            
#             # Update progress bar
#             pbar.update(len(batch_df))
#             pbar.set_postfix({'Batch': f'{i//batch_size + 1}/{num_batches}'})
    
#     return all_results

# # Process in batches
# predictions = process_dataframe_in_batches(model, df_train, batch_size=12)

# # Add predictions to the dataframe: extract violation probabilities from the list of dictionaries
# df_train['predicted_violation'] = [pred['violation_probability'] for pred in predictions]

## Prediction for train-dataset - Summary

In [16]:
# import pandas as pd
# from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score, classification_report

# # Round probabilities to binary predictions (0 or 1) using 0.5 threshold
# df_train['predicted_violation'] = (df_train['predicted_violation'] >= 0.5).astype(int)

# f1 = f1_score(df_train['rule_violation'], df_train['predicted_violation'])
# print(f"F1 Score: {f1:.4f}")

# accuracy = accuracy_score(df_train['rule_violation'], df_train['predicted_violation'])
# precision = precision_score(df_train['rule_violation'], df_train['predicted_violation'])
# recall = recall_score(df_train['rule_violation'], df_train['predicted_violation'])

# print(f"Accuracy: {accuracy:.4f}")
# print(f"Precision: {precision:.4f}")
# print(f"Recall: {recall:.4f}")

# print("\nClassification Report:")
# print(classification_report(df_train['rule_violation'], df_train['predicted_violation']))

## Prediction for test-dataset (batch)

In [17]:
from tqdm import tqdm
import numpy as np
print(df_train.shape)
def process_dataframe_in_batches(model, df, batch_size=12):
    """Process dataframe using batch predictions with progress bar"""
    
    # Calculate number of batches
    num_batches = len(df) // batch_size + (1 if len(df) % batch_size > 0 else 0)
    
    all_results = []
    
    # Process in batches with progress bar
    with tqdm(total=len(df), desc="Processing predictions") as pbar:
        for i in range(0, len(df), batch_size):
            # Get current batch
            batch_df = df.iloc[i:i+batch_size]
            
            # Convert batch to a list of Series (input format for predict_batch)
            batch_list = [row for _, row in batch_df.iterrows()]
            
            # Get predictions for this batch
            batch_results = model.predict_batch(batch_list)
            
            # Add to results
            all_results.extend(batch_results)
            
            # Update progress bar
            pbar.update(len(batch_df))
            pbar.set_postfix({'Batch': f'{i//batch_size + 1}/{num_batches}'})
    
    return all_results

# Process in batches
predictions = process_dataframe_in_batches(model, df_test, batch_size=12)
df_test['rule_violation'] = [pred['violation_probability'] for pred in predictions] 

(2029, 9)


Processing predictions:   0%|                            | 0/10 [00:00<?, ?it/s]

Adding requests:   0%|          | 0/10 [00:00<?, ?it/s]

Processed prompts:   0%| | 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, 

Processing predictions: 100%|████████| 10/10 [00:00<00:00, 67.45it/s, Batch=1/1]
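
The batching arithmetic in `process_dataframe_in_batches` can be illustrated without a model. This is a minimal sketch with a hypothetical helper, `iter_batches`, operating on a plain list instead of a DataFrame. The batch count from the floor-division expression used above matches a ceiling division.

```python
import math

def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

batch_size = 12
rows = list(range(10))  # stand-in for the 10 test rows shown in the progress bar

batches = list(iter_batches(rows, batch_size))

# Same count as: len(rows) // batch_size + (1 if len(rows) % batch_size > 0 else 0)
num_batches = math.ceil(len(rows) / batch_size)

print(num_batches, [len(b) for b in batches])  # 1 [10]
```

With 10 rows and a batch size of 12 there is a single, partially filled batch, which agrees with the `Batch=1/1` postfix in the output above.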


In [18]:
df_test.head(2)

Unnamed: 0,row_id,body,rule,subreddit,positive_example_1,positive_example_2,negative_example_1,negative_example_2,rule_violation
0,2029,NEW RAP GROUP 17. CHECK US OUT https://soundcl...,"No Advertising: Spam, referral links, unsolici...",hiphopheads,"Hey, guys, just wanted to drop in and invite y...",Cum Swallowing Hottie Katrina Kaif Cartoon Xvi...,SD Stream Eng - [Chelsea TV USA](http://soccer...,HD Streams: |[ENG HD Stoke vs Manchester Unite...,0.731059
1,2030,Make your life comfortable. Get up to 15% Disc...,No legal advice: Do not offer or request legal...,AskReddit,Get a lawyer and get the security camera foota...,That isn't drastic. You tried reaching out to ...,So what are you going to do with the insurance...,It's just for Austria & Germany. If you still ...,0.7773


## Write to submission.csv

In [19]:
# Write row_id and the predicted violation probability to submission.csv
df_test[["row_id","rule_violation"]].to_csv("submission.csv",index=False)
print("wrote results to submission.csv")

wrote results to submission.csv
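
Before submitting, it may be worth sanity-checking the file. The sketch below uses a hypothetical helper, `check_submission`, and runs on an in-memory sample built from the two rows shown earlier instead of reading `submission.csv` from disk. It assumes the expected columns are exactly `row_id` and `rule_violation`, as written by the cell above.

```python
import csv
import io

def check_submission(csv_text, expected_columns=("row_id", "rule_violation")):
    """Verify the header and that every prediction is a probability in [0, 1]."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if tuple(reader.fieldnames or ()) != tuple(expected_columns):
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows = list(reader)
    for row in rows:
        p = float(row["rule_violation"])
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"row_id {row['row_id']}: {p} is not in [0, 1]")
    return len(rows)

# In-memory sample mirroring the first two test predictions
sample = "row_id,rule_violation\n2029,0.731059\n2030,0.7773\n"
n_rows = check_submission(sample)
print(f"{n_rows} rows passed the check")  # 2 rows passed the check
```

To check the real file, the same function could be called with the contents of `submission.csv` read as text.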
