# Jigsaw - Agile Community Rules Classification
### https://www.kaggle.com/competitions/jigsaw-agile-community-rules

## LLM-Based Content Moderation (Zero-Shot)

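This notebook evaluates a **Llama 3.2 1B Instruct** model for Reddit content moderation using prompting only. "Zero-shot" here means the model is not fine-tuned: each prompt follows the Alpaca format and includes the two positive and two negative example comments supplied with every row, so the model sees four labeled examples in context before it scores the target comment. The model is asked for a probability that the comment violates the rule, and performance on the first 100 training rows is measured with F1, accuracy, precision, and recall.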

### Install packages on Kaggle: Add-ons > Install Dependencies 

```bash
pip install pip3-autoremove
pip install torch torchvision torchaudio xformers --index-url https://download.pytorch.org/whl/cu124
pip install unsloth vllm
pip install scikit-learn
```

In [1]:
import os
import pandas as pd

# Use the Kaggle input directory inside a Kaggle kernel, otherwise a local folder
if 'KAGGLE_KERNEL_RUN_TYPE' in os.environ:
    base_path = "/kaggle/input/jigsaw-agile-community-rules-enforcement/"
else:
    base_path = "./data/"

df_train = pd.read_csv(f"{base_path}train.csv")
df_test = pd.read_csv(f"{base_path}test.csv")

print(f"Using path: {base_path}")
df_train.head(2)

Using path: ./data/


Unnamed: 0,row_id,body,rule,subreddit,positive_example_1,positive_example_2,negative_example_1,negative_example_2,rule_violation
0,0,Banks don't want you to know this! Click here ...,"No Advertising: Spam, referral links, unsolici...",Futurology,If you could tell your younger self something ...,hunt for lady for jack off in neighbourhood ht...,Watch Golden Globe Awards 2017 Live Online in ...,"DOUBLE CEE x BANDS EPPS - ""BIRDS""\n\nDOWNLOAD/...",0
1,1,SD Stream [ ENG Link 1] (http://www.sportsstre...,"No Advertising: Spam, referral links, unsolici...",soccerstreams,[I wanna kiss you all over! Stunning!](http://...,LOLGA.COM is One of the First Professional Onl...,#Rapper \n🚨Straight Outta Cross Keys SC 🚨YouTu...,[15 Amazing Hidden Features Of Google Search Y...,0


In [2]:
from unsloth import FastLanguageModel
import pandas as pd
import torch
import re
device = "cuda" if torch.cuda.is_available() else "cpu"
print('device', device)

class Model:
    def __init__(self):
        ###------- for Unsloth-------------------
        if 'KAGGLE_KERNEL_RUN_TYPE' in os.environ:
            self.model_path = "/kaggle/input/llama-3.2/transformers/1b-instruct/1"
        else:
            self.model_path = "unsloth/Llama-3.2-1B-Instruct"
        self.model, self.tokenizer = FastLanguageModel.from_pretrained(
            model_name=self.model_path,
            max_seq_length=2048,
            dtype=torch.float16,
            load_in_4bit=False
        )
        self.model = self.model.to(device)  # same device the tokenized inputs are moved to
        FastLanguageModel.for_inference(self.model)  # Enable native 2x faster inference
        #######------------------------------------
        
        # Check model dtype and device
        for name, param in self.model.named_parameters():
            print(f"{name}: {param.dtype} on {param.device}")
            break  # remove break to list all parameters

    def format_comment(self, comment):
        return "\n".join(["| " + line for line in comment.split('\n')])
    
    def create_prompt(self, input_data: pd.Series):
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. 
            Write a response that appropriately completes the request.

            ### Instruction:
            You are a really experienced moderator for the subreddit /r/{input_data['subreddit']}. 
            Your job is to determine if the following reported comment violates the given rule.
            Return results in {{violates rule: probability of violation between 0-1}} format as a float.
            
            ### Input:
            Rule: {input_data['rule']}
            
            Example 1:
            {self.format_comment(input_data['positive_example_1'])}
            Rule violation: True
            
            Example 2:
            {self.format_comment(input_data['negative_example_1'])}
            Rule violation: False
            
            Example 3:
            {self.format_comment(input_data['positive_example_2'])}
            Rule violation: True
            
            Example 4:
            {self.format_comment(input_data['negative_example_2'])}
            Rule violation: False
            
            Test sentence:
            {self.format_comment(input_data['body'])}
            
            ### Response:
            Violates rule: """

    def get_response(self, input_data: pd.Series):
        formatted_prompt = self.create_prompt(input_data)
        inputs = self.tokenizer([formatted_prompt], return_tensors="pt").to(device)
        
        outputs = self.model.generate(
            **inputs, 
            max_new_tokens=50,  # Shorter for classification task
            use_cache=True,
            temperature=0.5,  # Lower temperature for more consistent classification
            do_sample=True,
            pad_token_id=self.tokenizer.eos_token_id
        )
        return self.tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    
    def extract_probability(self, response_text):
        """Extract a probability from the model response."""
        # Parse only the text after the last "### Response:" marker, then take
        # the first number in it, e.g. "0.85" or "85" (treated as a percentage).
        prob_pattern = r"(?:Violates rule:\s*)?(\d+\.?\d*)"
        match = re.search(prob_pattern, response_text.split("### Response:")[-1])
        
        if match:
            try:
                prob = float(match.group(1))
                # Ensure probability is between 0 and 1
                if prob > 1.0:
                    prob = prob / 100.0  # Convert percentage to probability
                return min(max(prob, 0.0), 1.0)
            except ValueError:
                pass
        return 0.5  # Default neutral probability if parsing fails
    
    def predict(self, input_data: pd.Series) -> float:
        """Predict if a comment violates the rule and return probability"""
        output_decoded = self.get_response(input_data)
        probability = self.extract_probability(output_decoded)
        return probability


🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
INFO 07-30 21:19:03 [__init__.py:239] Automatically detected platform cuda.
device cuda
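
The parsing step in `extract_probability` can be checked in isolation, without loading the model. The sketch below copies its regex and clamping logic into a standalone function. The name `parse_probability` is introduced here for illustration and is not part of the notebook, and the sample responses are hand-written, not actual model outputs.

```python
import re

# Same pattern as in Model.extract_probability
PROB_PATTERN = re.compile(r"(?:Violates rule:\s*)?(\d+\.?\d*)")

def parse_probability(response_text: str, default: float = 0.5) -> float:
    """Standalone copy of the notebook's parsing logic (hypothetical helper)."""
    # Keep only the text generated after the last response marker
    generated = response_text.split("### Response:")[-1]
    match = PROB_PATTERN.search(generated)
    if match is None:
        return default  # no number found, fall back to a neutral score
    prob = float(match.group(1))
    if prob > 1.0:
        prob /= 100.0  # treat values above 1 as percentages
    return min(max(prob, 0.0), 1.0)

# Hand-written example responses (assumptions, not real model output)
samples = [
    "### Response:\nViolates rule: 0.85",
    "### Response:\nViolates rule: 85",
    "### Response:\nViolates rule: True",
    "### Response:\nViolates rule: .5",
]
for text in samples:
    print(repr(text), "->", parse_probability(text))
```

Two behaviors are worth noting. A purely textual answer such as `True` contains no digits and falls back to 0.5. An answer written as `.5` is read as `5`, then divided by 100 and returned as 0.05, because the pattern requires a digit before the decimal point.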


In [3]:
model=Model()

==((====))==  Unsloth 2025.4.7: Fast Llama patching. Transformers: 4.51.3. vLLM: 0.8.2.
   \\   /|    NVIDIA GeForce RTX 4070 Ti SUPER. Num GPUs = 1. Max memory: 15.693 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.9. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post2. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
model.embed_tokens.weight: torch.float16 on cuda:0


In [4]:
from tqdm import tqdm
tqdm.pandas()

# Score the first 100 training rows, showing a progress bar.
# .copy() avoids pandas' SettingWithCopyWarning when the new column is added.
df_train = df_train.iloc[0:100].copy()
df_train['prediction'] = df_train.progress_apply(model.predict, axis=1)

100%|██████████| 100/100 [01:01<00:00,  1.62it/s]


In [5]:
import pandas as pd
from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score, classification_report

# Round probabilities to binary predictions (0 or 1) using 0.5 threshold
df_train['predicted_violation'] = (df_train['prediction'] >= 0.5).astype(int)

f1 = f1_score(df_train['rule_violation'], df_train['predicted_violation'])
print(f"F1 Score: {f1:.4f}")

accuracy = accuracy_score(df_train['rule_violation'], df_train['predicted_violation'])
precision = precision_score(df_train['rule_violation'], df_train['predicted_violation'])
recall = recall_score(df_train['rule_violation'], df_train['predicted_violation'])

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")

print("\nClassification Report:")
print(classification_report(df_train['rule_violation'], df_train['predicted_violation']))

F1 Score: 0.5574
Accuracy: 0.4600
Precision: 0.5075
Recall: 0.6182

Classification Report:
              precision    recall  f1-score   support

           0       0.36      0.27      0.31        45
           1       0.51      0.62      0.56        55

    accuracy                           0.46       100
   macro avg       0.44      0.44      0.43       100
weighted avg       0.44      0.46      0.45       100
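
The headline metrics can be cross-checked against the classification report. The confusion-matrix counts below are not printed by the notebook. They are inferred from the reported supports (45 and 55) together with the precision and recall values, so treat them as a reconstruction.

```python
# Counts inferred from the report above (an assumption, not notebook output)
tp, fn = 34, 21  # the 55 rows labeled as violations
fp, tn = 33, 12  # the 45 rows labeled as non-violations

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"F1 Score: {f1:.4f}")         # F1 Score: 0.5574
print(f"Accuracy: {accuracy:.4f}")   # Accuracy: 0.4600
print(f"Precision: {precision:.4f}") # Precision: 0.5075
print(f"Recall: {recall:.4f}")       # Recall: 0.6182
```

Since 55 of the 100 rows are labeled as violations, a classifier that always predicts 1 would reach 0.55 accuracy, so the measured 0.46 sits below that trivial baseline.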



In [6]:
# prediction for test data
df_test['rule_violation'] = df_test.progress_apply(model.predict, axis=1)

100%|██████████| 10/10 [00:06<00:00,  1.66it/s]


In [7]:
df_test.head(2)

Unnamed: 0,row_id,body,rule,subreddit,positive_example_1,positive_example_2,negative_example_1,negative_example_2,rule_violation
0,2029,NEW RAP GROUP 17. CHECK US OUT https://soundcl...,"No Advertising: Spam, referral links, unsolici...",hiphopheads,"Hey, guys, just wanted to drop in and invite y...",Cum Swallowing Hottie Katrina Kaif Cartoon Xvi...,SD Stream Eng - [Chelsea TV USA](http://soccer...,HD Streams: |[ENG HD Stoke vs Manchester Unite...,0.99
1,2030,Make your life comfortable. Get up to 15% Disc...,No legal advice: Do not offer or request legal...,AskReddit,Get a lawyer and get the security camera foota...,That isn't drastic. You tried reaching out to ...,So what are you going to do with the insurance...,It's just for Austria & Germany. If you still ...,1.0


In [8]:
# write to submission.csv
df_test[["row_id","rule_violation"]].to_csv("submission.csv",index=False)
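
Before uploading, a quick sanity check of the written file can catch parsing fallbacks or malformed rows. The sketch below is a minimal example using only the standard library. It assumes the submission should contain exactly the `row_id` and `rule_violation` columns with scores in the range 0 to 1; the competition page is the authority on the actual format. `validate_submission` is a hypothetical helper, and the sample text reuses the two test predictions shown above.

```python
import csv
import io

def validate_submission(csv_text: str) -> int:
    """Check header and value ranges; return the number of data rows (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ["row_id", "rule_violation"]:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    count = 0
    for row in reader:
        int(row["row_id"])  # raises ValueError if the id is not an integer
        prob = float(row["rule_violation"])
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"score out of range for row_id {row['row_id']}: {prob}")
        count += 1
    return count

# Sample content mirroring the first two test predictions
sample = "row_id,rule_violation\n2029,0.99\n2030,1.0\n"
print(validate_submission(sample))  # 2
```

To check the real file, the same function can be called on the text of `submission.csv`, for example `validate_submission(open("submission.csv").read())`.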