# 📑 Rhetorical Role Labeling for Indian Judgments

This notebook sets up a model that segments legal judgments into 13 functional parts (rhetorical roles) and outlines its training pipeline:

- PREAMBLE, FACTS, ISSUE
- ARGUMENT_PETITIONER, ARGUMENT_RESPONDENT
- ANALYSIS, STATUTE, PRECEDENT_RELIED, PRECEDENT_NOT_RELIED
- RATIO, RULING_LOWER_COURT, RULING_PRESENT_COURT, NONE

In [None]:
# Setup
import sys
sys.path.insert(0, '..')

import torch
import json
from pathlib import Path

from src.models import RhetoricalRoleLabeler
from src.utils import set_seed

set_seed(42)

# Check device: prefer CUDA, then Apple MPS (absent in older torch builds), then CPU
if torch.cuda.is_available():
    device = "cuda"
elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"Using device: {device}")

## 1. Initialize RRL Model

In [None]:
# Initialize the Rhetorical Role Labeler
rrl_model = RhetoricalRoleLabeler(
    model_name="law-ai/InLegalBERT",
    use_crf=True,      # Use CRF for label constraints
    use_bilstm=True,   # Use BiLSTM for sequence modeling
    device=device
)

print(f"Model: {rrl_model.model_name}")
print(f"Number of roles: {len(rrl_model.RRL_LABELS)}")
print("\nRhetorical Roles:")
for i, label in enumerate(rrl_model.RRL_LABELS):
    print(f"  {i:2d}. {label}")
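
The `use_crf=True` flag adds a CRF layer, which scores transitions between consecutive labels so that implausible sequences are penalized. The sketch below is not the `RhetoricalRoleLabeler` implementation. It is a minimal pure-Python Viterbi decoder over made-up emission and transition scores for three labels (all numbers and the names `viterbi_decode`, `toy_emissions`, `toy_transitions` are assumptions for illustration), showing how transition scores can override a per-sentence argmax.

```python
# Illustrative only: toy scores, not taken from any trained model.
TOY_LABELS = ["PREAMBLE", "FACTS", "RULING_PRESENT_COURT"]

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring sequence of label indices.

    emissions[t][j]   : score of label j for sentence t
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(emissions[0])
    best = [list(emissions[0])]  # best[t][j]: best path score ending in label j
    backptr = []                 # backptr[t-1][j]: previous label on that path
    for t in range(1, len(emissions)):
        scores, pointers = [], []
        for j in range(n_labels):
            candidates = [best[-1][i] + transitions[i][j] for i in range(n_labels)]
            i_best = max(range(n_labels), key=candidates.__getitem__)
            scores.append(candidates[i_best] + emissions[t][j])
            pointers.append(i_best)
        best.append(scores)
        backptr.append(pointers)
    # Trace the best path backwards from the best final label
    path = [max(range(n_labels), key=best[-1].__getitem__)]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

toy_emissions = [
    [2.0, 0.0, 0.0],  # sentence 0 looks like PREAMBLE
    [0.0, 1.0, 1.2],  # sentence 1 slightly favours RULING_PRESENT_COURT
    [0.0, 2.0, 0.0],  # sentence 2 looks like FACTS
]
toy_transitions = [
    [0.0, 0.0, -2.0],   # PREAMBLE -> ruling is discouraged
    [0.0, 0.0, 0.0],    # FACTS -> anything is neutral
    [-5.0, -5.0, 0.0],  # ruling -> PREAMBLE/FACTS is strongly discouraged
]

greedy_labels = [TOY_LABELS[max(range(3), key=row.__getitem__)] for row in toy_emissions]
decoded_labels = [TOY_LABELS[i] for i in viterbi_decode(toy_emissions, toy_transitions)]
print("Greedy :", greedy_labels)
print("Viterbi:", decoded_labels)
```

With these toy scores, the greedy labels place a ruling between the preamble and the facts, while the Viterbi path avoids that transition.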

## 2. Sample Judgment for Testing

In [None]:
SAMPLE_JUDGMENT = """
IN THE SUPREME COURT OF INDIA
CIVIL APPEAL NO. 1234 OF 2023

ABC Company Ltd.                           ... Appellant
                    Versus
State of Maharashtra & Ors.                ... Respondents

JUDGMENT

1. This appeal challenges the judgment dated 15.01.2023 passed by the 
High Court of Bombay in WP No. 4567/2022.

2. The brief facts are that the appellant is a company engaged in 
manufacturing. The respondent State imposed a tax which the 
appellant claims is unconstitutional.

3. The issue for consideration is whether the impugned tax violates 
Article 14 of the Constitution.

4. Learned Senior Counsel for the appellant submitted that the tax 
creates unreasonable classification without intelligible differentia.

5. Per contra, the learned Additional Solicitor General argued that 
the classification is based on legitimate State interests.

6. We have carefully considered the submissions and perused the record.

7. In M/s. Kiran vs. State (2019) 5 SCC 123, this Court held that 
taxation must satisfy the test of reasonableness under Article 14.

8. The doctrine of proportionality requires that the means adopted 
must be proportionate to the ends sought to be achieved.

9. In view of the above analysis, the appeal is allowed. The impugned 
order is set aside.
"""

print(f"Sample judgment: {len(SAMPLE_JUDGMENT)} characters")
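
The labeler works on sentence-level units, so a judgment has to be split before labeling. How `predict_document` does this internally is not shown here. As a rough sketch under that caveat, numbered paragraphs like the ones above could be separated with a regular expression (`split_numbered_paragraphs` and `excerpt` are hypothetical names, and real judgments will need a more robust splitter):

```python
import re

excerpt = """
1. This appeal challenges the judgment dated 15.01.2023 passed by the
High Court of Bombay in WP No. 4567/2022.

2. The brief facts are that the appellant is a company engaged in
manufacturing. The respondent State imposed a tax which the
appellant claims is unconstitutional.
"""

def split_numbered_paragraphs(text):
    """Split on paragraph numbers ("1. ", "2. ", ...) at the start of a line."""
    parts = re.split(r"(?m)^\d+\.\s+", text)
    # Collapse internal line breaks and drop empty fragments
    return [" ".join(part.split()) for part in parts if part.strip()]

paragraphs = split_numbered_paragraphs(excerpt)
for number, paragraph in enumerate(paragraphs, start=1):
    print(f"{number}: {paragraph[:60]}")
```

Anchoring the pattern to the start of a line keeps dates such as 15.01.2023 from being treated as paragraph numbers.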

## 3. Manual Annotation Example

In [None]:
# Training data format for RRL
SAMPLE_RRL_DATA = [
    {"sentence": "IN THE SUPREME COURT OF INDIA", "label": "PREAMBLE"},
    {"sentence": "CIVIL APPEAL NO. 1234 OF 2023", "label": "PREAMBLE"},
    {"sentence": "This appeal challenges the judgment dated 15.01.2023 passed by the High Court.", "label": "PREAMBLE"},
    {"sentence": "The brief facts are that the appellant is a company engaged in manufacturing.", "label": "FACTS"},
    {"sentence": "The respondent State imposed a tax which the appellant claims is unconstitutional.", "label": "FACTS"},
    {"sentence": "The issue for consideration is whether the impugned tax violates Article 14.", "label": "ISSUE"},
    {"sentence": "Learned Senior Counsel for the appellant submitted that the tax creates unreasonable classification.", "label": "ARGUMENT_PETITIONER"},
    {"sentence": "Per contra, the ASG argued that the classification is based on legitimate State interests.", "label": "ARGUMENT_RESPONDENT"},
    {"sentence": "We have carefully considered the submissions and perused the record.", "label": "ANALYSIS"},
    {"sentence": "In M/s. Kiran vs. State (2019) 5 SCC 123, this Court held that taxation must satisfy reasonableness.", "label": "PRECEDENT_RELIED"},
    {"sentence": "The doctrine of proportionality requires proportionate means.", "label": "RATIO"},
    {"sentence": "In view of the above, the appeal is allowed.", "label": "RULING_PRESENT_COURT"},
]

# Display annotated data
print("Sample RRL Training Data:")
print("-" * 80)
for item in SAMPLE_RRL_DATA:
    print(f"[{item['label']:25}] {item['sentence'][:55]}...")
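
Before training, the string labels in this format usually need to be mapped to integer ids. A minimal sketch, assuming the 13 labels in the order listed at the top of this notebook (the actual order of `rrl_model.RRL_LABELS` may differ; `label2id`, `id2label`, and `encode_document` are hypothetical names):

```python
from collections import Counter

# Assumption: order copied from the list at the top of the notebook.
RRL_LABELS = [
    "PREAMBLE", "FACTS", "ISSUE",
    "ARGUMENT_PETITIONER", "ARGUMENT_RESPONDENT",
    "ANALYSIS", "STATUTE", "PRECEDENT_RELIED", "PRECEDENT_NOT_RELIED",
    "RATIO", "RULING_LOWER_COURT", "RULING_PRESENT_COURT", "NONE",
]
label2id = {label: i for i, label in enumerate(RRL_LABELS)}
id2label = {i: label for label, i in label2id.items()}

def encode_document(items):
    """Turn sentence-label dicts into parallel lists of sentences and label ids."""
    sentences = [item["sentence"] for item in items]
    label_ids = [label2id[item["label"]] for item in items]  # KeyError on unknown labels
    return sentences, label_ids

toy_items = [
    {"sentence": "IN THE SUPREME COURT OF INDIA", "label": "PREAMBLE"},
    {"sentence": "The brief facts are that the appellant is a company engaged in manufacturing.", "label": "FACTS"},
    {"sentence": "In view of the above, the appeal is allowed.", "label": "RULING_PRESENT_COURT"},
]
sentences, label_ids = encode_document(toy_items)
label_counts = Counter(item["label"] for item in toy_items)

print(label_ids)
print(label_counts.most_common())
```

Checking the label distribution this way is a quick sanity check for class imbalance before training.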

## 4. Load and Predict (Demo)

In [None]:
# ⚠️ Uncomment to load model and predict
# rrl_model.load_model()

# Make predictions
# predictions = rrl_model.predict_document(SAMPLE_JUDGMENT)

# Display predictions
# for pred in predictions[:5]:
#     print(f"[{pred['role']:25}] {pred['sentence'][:60]}...")

print("Prediction code ready.")
print("Uncomment the lines above to run after installing torchcrf.")

## 5. Generate Structured Summary

In [None]:
# After prediction, generate structured summary
# summary = rrl_model.generate_structured_summary(predictions)
# print(summary)

# Manual example of what the output looks like:
EXAMPLE_SUMMARY = """
**ISSUE:** Whether the impugned tax violates Article 14 of the Constitution.

**FACTS:** The appellant is a company engaged in manufacturing. The respondent 
State imposed a tax which the appellant claims is unconstitutional.

**PRECEDENTS RELIED:** In M/s. Kiran vs. State (2019) 5 SCC 123, this Court held 
that taxation must satisfy the test of reasonableness under Article 14.

**RATIO:** The doctrine of proportionality requires that the means adopted must 
be proportionate to the ends sought to be achieved.

**RULING:** The appeal is allowed. The impugned order is set aside.
"""

print("Example Structured Summary:")
print(EXAMPLE_SUMMARY)
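
The internals of `generate_structured_summary` are not shown in this notebook. One way such a summary could be assembled is to group predicted sentences by role and emit them under fixed headings. The sketch below assumes each prediction is a dict with `role` and `sentence` keys, as in the display loop of the prediction cell; `SUMMARY_SECTIONS` and `build_structured_summary` are hypothetical names.

```python
# Hypothetical (role, heading) pairs, in the order used by the example summary.
SUMMARY_SECTIONS = [
    ("ISSUE", "ISSUE"),
    ("FACTS", "FACTS"),
    ("PRECEDENT_RELIED", "PRECEDENTS RELIED"),
    ("RATIO", "RATIO"),
    ("RULING_PRESENT_COURT", "RULING"),
]

def build_structured_summary(predictions):
    """Group sentences by predicted role and format the selected roles."""
    by_role = {}
    for pred in predictions:
        by_role.setdefault(pred["role"], []).append(pred["sentence"])
    blocks = []
    for role, heading in SUMMARY_SECTIONS:
        if role in by_role:  # skip sections with no predicted sentences
            blocks.append(f"**{heading}:** " + " ".join(by_role[role]))
    return "\n\n".join(blocks)

toy_predictions = [
    {"role": "FACTS", "sentence": "The appellant is a company engaged in manufacturing."},
    {"role": "ISSUE", "sentence": "Whether the impugned tax violates Article 14 of the Constitution."},
    {"role": "ANALYSIS", "sentence": "We have carefully considered the submissions."},
    {"role": "RULING_PRESENT_COURT", "sentence": "The appeal is allowed."},
]
toy_summary = build_structured_summary(toy_predictions)
print(toy_summary)
```

Roles outside `SUMMARY_SECTIONS`, such as ANALYSIS here, are simply left out of the summary.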

## 6. Training Pipeline

In [None]:
# The full training pipeline is not implemented in this notebook. It would:
# 1. Load the BUILD dataset from OpenNyAI
# 2. Convert it to sentence-label pairs and tokenize the sentences
# 3. Train the BiLSTM-CRF architecture on top of InLegalBERT
# 4. Evaluate on the test set with weighted F1

print("Training Pipeline Steps:")
print("1. Download BUILD dataset from OpenNyAI GitHub")
print("2. Convert to sentence-label pairs and tokenize")
print("3. Train BiLSTM-CRF on top of InLegalBERT")
print("4. Evaluate on test set with weighted F1")
print("\nDataset: github.com/Legal-NLP-EkStep/rhetorical-role-baseline")
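
Weighted F1, the evaluation metric named in the pipeline steps, averages per-label F1 scores weighted by how often each label occurs in the gold data. A dependency-free sketch with toy labels follows; `weighted_f1` is a hypothetical helper, and in practice a library implementation such as scikit-learn's `f1_score(average="weighted")` would likely be used instead.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-label F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += f1 * n / total  # weight by the label's share of gold sentences
    return score

# Toy gold and predicted labels, for illustration only
toy_true = ["FACTS", "FACTS", "ISSUE", "RATIO"]
toy_pred = ["FACTS", "ISSUE", "ISSUE", "RATIO"]
toy_score = weighted_f1(toy_true, toy_pred)
print(f"Weighted F1: {toy_score:.2f}")
```

Because frequent labels dominate the weighted average, it can be worth reporting per-label F1 alongside it for rare roles.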

## Next Steps

1. **Get BUILD Dataset**: Download from OpenNyAI GitHub
2. **Train**: Fine-tune InLegalBERT + BiLSTM-CRF
3. **Integrate**: Use RRL for extractive summarization