# Ontology Verification: Co-occurrence Analysis

This notebook tests whether ontological relationships manifest as **label co-occurrence** in the data.

**Key Distinction:**
- **Ontological truth**: "MTD for Lack of SMJ" (Motion to Dismiss for Lack of Subject Matter Jurisdiction) **IS-A** "Motion to Dismiss" (legally true)
- **Labeling practice**: Do labelers redundantly tag BOTH parent and child, or only the most specific label?

| Relationship | If DENSE labeling | If SPARSE labeling |
|--------------|-------------------|-------------------|
| **IS-A** | Child + Parent co-occur (100%) | Only child labeled (0%) |
| **DEPENDS-ON** | Dependent + Prerequisite co-occur | Co-occur (context needed) |
| **Aggregation** | N/A | Categories I defined for aggregation don't exist as labels in the data |
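
To make the dense/sparse distinction concrete, here is a minimal sketch on two toy corpora. The documents are made up for illustration (they are not from the dataset), and `parent_given_child_rate` is a hypothetical helper name; it computes the same P(parent | child) quantity used later in this notebook.

```python
def parent_given_child_rate(docs, child, parent):
    """Fraction of documents tagged `child` that are also tagged `parent`."""
    with_child = [labels for labels in docs if child in labels]
    if not with_child:
        return float("nan")  # undefined when the child label never appears
    return sum(parent in labels for labels in with_child) / len(with_child)

# Hypothetical toy corpora (illustration only, not real data).
dense_docs = [
    ["MTD for Lack of SMJ", "Motion to Dismiss"],
    ["MTD for Lack of SMJ", "Motion to Dismiss", "On Appeal"],
    ["Motion to Dismiss"],
]
sparse_docs = [
    ["MTD for Lack of SMJ"],
    ["MTD for Lack of SMJ", "On Appeal"],
    ["Motion to Dismiss"],
]

dense_rate = parent_given_child_rate(dense_docs, "MTD for Lack of SMJ", "Motion to Dismiss")
sparse_rate = parent_given_child_rate(sparse_docs, "MTD for Lack of SMJ", "Motion to Dismiss")
print(f"dense:  {dense_rate:.0%}")
print(f"sparse: {sparse_rate:.0%}")
```

Under dense labeling the rate is at the top of the range, under sparse labeling at the bottom, even though the IS-A relationship holds in both corpora.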

In [4]:
import pandas as pd
import numpy as np
from collections import Counter, defaultdict
import json

# Load the data - it's JSON Lines format
data = []
with open('../data/TRDataChallenge2023.txt', 'r') as f:
    for line in f:
        data.append(json.loads(line.strip()))

df = pd.DataFrame(data)
print(f"Total documents: {len(df):,}")
print(f"Columns: {df.columns.tolist()}")
print(f"\nSample postures:")
for i in range(3):
    print(f"  Doc {i}: {df.iloc[i]['postures']}")

Total documents: 18,000
Columns: ['documentId', 'postures', 'sections']

Sample postures:
  Doc 0: ['On Appeal']
  Doc 1: ['Appellate Review', 'Sentencing or Penalty Phase Motion or Objection']
  Doc 2: ['Motion to Compel Arbitration', 'On Appeal']


In [5]:
# Postures are already parsed as lists in JSON
# Just ensure they're lists
df['postures'] = df['postures'].apply(lambda x: x if isinstance(x, list) else [])

# Check the parsing
print("Sample parsed postures:")
for i, row in df.head(5).iterrows():
    print(f"  Doc {i}: {row['postures']}")

Sample parsed postures:
  Doc 0: ['On Appeal']
  Doc 1: ['Appellate Review', 'Sentencing or Penalty Phase Motion or Objection']
  Doc 2: ['Motion to Compel Arbitration', 'On Appeal']
  Doc 3: ['On Appeal', 'Review of Administrative Decision']
  Doc 4: ['On Appeal']


In [6]:
# How many postures per document?
df['num_postures'] = df['postures'].apply(len)
print("Postures per document distribution:")
print(df['num_postures'].value_counts().sort_index().head(10))
print(f"\nMax postures in one doc: {df['num_postures'].max()}")
print(f"Docs with multiple postures: {(df['num_postures'] > 1).sum():,} ({(df['num_postures'] > 1).mean()*100:.1f}%)")

Postures per document distribution:
num_postures
0     923
1    8118
2    7604
3    1129
4     190
5      32
6       2
7       2
Name: count, dtype: int64

Max postures in one doc: 7
Docs with multiple postures: 8,959 (49.8%)


## 1. IS-A Relationship: Labeling Practice Test

**Question:** Do labelers apply BOTH parent and child labels (dense), or only the most specific (sparse)?

If "Motion to Dismiss for Lack of SMJ" IS-A "Motion to Dismiss":
- **Dense labeling** → Both labels appear together (~100%)
- **Sparse labeling** → Only child appears (~0%)

**Hypothesized IS-A pairs from ontology:**

In [7]:
# Define IS-A relationships from our ontology
IS_A_RELATIONSHIPS = [
    # Motion to Dismiss hierarchy
    ("Motion to Dismiss for Lack of Subject Matter Jurisdiction", "Motion to Dismiss"),
    ("Motion to Dismiss for Lack of Personal Jurisdiction", "Motion to Dismiss"),
    ("Motion to Dismiss for Lack of Standing", "Motion to Dismiss"),
    ("Motion to Dismiss for Lack of Jurisdiction", "Motion to Dismiss"),
]

print("IS-A relationships to verify:")
for child, parent in IS_A_RELATIONSHIPS:
    print(f"  {child} IS-A {parent}")

IS-A relationships to verify:
  Motion to Dismiss for Lack of Subject Matter Jurisdiction IS-A Motion to Dismiss
  Motion to Dismiss for Lack of Personal Jurisdiction IS-A Motion to Dismiss
  Motion to Dismiss for Lack of Standing IS-A Motion to Dismiss
  Motion to Dismiss for Lack of Jurisdiction IS-A Motion to Dismiss


In [8]:
def check_cooccurrence(df, child_posture, parent_posture):
    """
    Check co-occurrence between two postures.
    Returns: (child_count, both_count, parent_given_child_rate)
    """
    child_docs = df[df['postures'].apply(lambda x: child_posture in x)]
    both_docs = child_docs[child_docs['postures'].apply(lambda x: parent_posture in x)]
    
    child_count = len(child_docs)
    both_count = len(both_docs)
    
    if child_count > 0:
        rate = both_count / child_count
    else:
        rate = 0
    
    return child_count, both_count, rate

# Verify IS-A relationships
print("\n" + "="*80)
print("IS-A RELATIONSHIP VERIFICATION")
print("Expected: If Child IS-A Parent, then P(Parent | Child) should be HIGH (ideally 100%)")
print("="*80 + "\n")

isa_results = []
for child, parent in IS_A_RELATIONSHIPS:
    child_count, both_count, rate = check_cooccurrence(df, child, parent)
    isa_results.append({
        'child': child,
        'parent': parent,
        'child_count': child_count,
        'both_count': both_count,
        'P(parent|child)': rate
    })
    status = "✓ CONFIRMED" if rate > 0.9 else ("? PARTIAL" if rate > 0.5 else "✗ NOT CONFIRMED")
    print(f"{status}")
    print(f"  Child: {child}")
    print(f"  Parent: {parent}")
    print(f"  Docs with child: {child_count}")
    print(f"  Docs with BOTH: {both_count}")
    print(f"  P(parent | child): {rate:.1%}")
    print()


IS-A RELATIONSHIP VERIFICATION
Expected: If Child IS-A Parent, then P(Parent | Child) should be HIGH (ideally 100%)

✗ NOT CONFIRMED
  Child: Motion to Dismiss for Lack of Subject Matter Jurisdiction
  Parent: Motion to Dismiss
  Docs with child: 343
  Docs with BOTH: 11
  P(parent | child): 3.2%

✗ NOT CONFIRMED
  Child: Motion to Dismiss for Lack of Personal Jurisdiction
  Parent: Motion to Dismiss
  Docs with child: 204
  Docs with BOTH: 18
  P(parent | child): 8.8%

✗ NOT CONFIRMED
  Child: Motion to Dismiss for Lack of Standing
  Parent: Motion to Dismiss
  Docs with child: 137
  Docs with BOTH: 19
  P(parent | child): 13.9%

✗ NOT CONFIRMED
  Child: Motion to Dismiss for Lack of Jurisdiction
  Parent: Motion to Dismiss
  Docs with child: 124
  Docs with BOTH: 5
  P(parent | child): 4.0%



## 2. DEPENDS-ON Relationship Verification

If "Motion to Post Bond" DEPENDS-ON "On Appeal", then whenever we see the motion, we should also see the appellate stage.

In [9]:
# Define DEPENDS-ON relationships from our ontology
DEPENDS_ON_RELATIONSHIPS = [
    # Appellate motions depend on appellate stage
    ("Motion to Post Bond", "On Appeal"),
    ("Motion for Appeal Bond", "On Appeal"),
    ("Motion to Expand the Record", "On Appeal"),
    ("Motion to Supplement the Record", "On Appeal"),
    ("Motion for Rehearing", "On Appeal"),
    ("Motion to Reargue", "On Appeal"),
    ("Petition for Rehearing En Banc", "On Appeal"),
]

print("DEPENDS-ON relationships to verify:")
for dependent, prerequisite in DEPENDS_ON_RELATIONSHIPS:
    print(f"  {dependent} DEPENDS-ON {prerequisite}")

DEPENDS-ON relationships to verify:
  Motion to Post Bond DEPENDS-ON On Appeal
  Motion for Appeal Bond DEPENDS-ON On Appeal
  Motion to Expand the Record DEPENDS-ON On Appeal
  Motion to Supplement the Record DEPENDS-ON On Appeal
  Motion for Rehearing DEPENDS-ON On Appeal
  Motion to Reargue DEPENDS-ON On Appeal
  Petition for Rehearing En Banc DEPENDS-ON On Appeal


In [10]:
# Verify DEPENDS-ON relationships
print("\n" + "="*80)
print("DEPENDS-ON RELATIONSHIP VERIFICATION")
print("Expected: If A DEPENDS-ON B, then P(B | A) should be HIGH")
print("="*80 + "\n")

depends_results = []
for dependent, prerequisite in DEPENDS_ON_RELATIONSHIPS:
    dep_count, both_count, rate = check_cooccurrence(df, dependent, prerequisite)
    depends_results.append({
        'dependent': dependent,
        'prerequisite': prerequisite,
        'dependent_count': dep_count,
        'both_count': both_count,
        'P(prereq|dependent)': rate
    })
    status = "✓ CONFIRMED" if rate > 0.9 else ("? PARTIAL" if rate > 0.5 else "✗ NOT CONFIRMED")
    print(f"{status}")
    print(f"  Dependent: {dependent}")
    print(f"  Prerequisite: {prerequisite}")
    print(f"  Docs with dependent: {dep_count}")
    print(f"  Docs with BOTH: {both_count}")
    print(f"  P(prerequisite | dependent): {rate:.1%}")
    print()


DEPENDS-ON RELATIONSHIP VERIFICATION
Expected: If A DEPENDS-ON B, then P(B | A) should be HIGH

✓ CONFIRMED
  Dependent: Motion to Post Bond
  Prerequisite: On Appeal
  Docs with dependent: 2
  Docs with BOTH: 2
  P(prerequisite | dependent): 100.0%

✗ NOT CONFIRMED
  Dependent: Motion for Appeal Bond
  Prerequisite: On Appeal
  Docs with dependent: 2
  Docs with BOTH: 1
  P(prerequisite | dependent): 50.0%

✓ CONFIRMED
  Dependent: Motion to Expand the Record
  Prerequisite: On Appeal
  Docs with dependent: 2
  Docs with BOTH: 2
  P(prerequisite | dependent): 100.0%

? PARTIAL
  Dependent: Motion to Supplement the Record
  Prerequisite: On Appeal
  Docs with dependent: 20
  Docs with BOTH: 11
  P(prerequisite | dependent): 55.0%

? PARTIAL
  Dependent: Motion for Rehearing
  Prerequisite: On Appeal
  Docs with dependent: 48
  Docs with BOTH: 43
  P(prerequisite | dependent): 89.6%

? PARTIAL
  Dependent: Motion to Reargue
  Prerequisite: On Appeal
  Docs with dependent: 35
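  Docs with BOTH: 28
  P(prerequisite | dependent): 80.0%

? PARTIAL
  Dependent: Petition for Rehearing En Banc
  Prerequisite: On Appeal
  Docs with dependent: 6
  Docs with BOTH: 5
  P(prerequisite | dependent): 83.3%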

## 3. Summary Table

In [11]:
# Create summary DataFrames
print("IS-A Relationships Summary:")
isa_df = pd.DataFrame(isa_results)
isa_df['P(parent|child)'] = isa_df['P(parent|child)'].apply(lambda x: f"{x:.1%}")
display(isa_df)

print("\nDEPENDS-ON Relationships Summary:")
dep_df = pd.DataFrame(depends_results)
dep_df['P(prereq|dependent)'] = dep_df['P(prereq|dependent)'].apply(lambda x: f"{x:.1%}")
display(dep_df)

IS-A Relationships Summary:


| | child | parent | child_count | both_count | P(parent\|child) |
|---|---|---|---|---|---|
| 0 | Motion to Dismiss for Lack of Subject Matter J... | Motion to Dismiss | 343 | 11 | 3.2% |
| 1 | Motion to Dismiss for Lack of Personal Jurisdi... | Motion to Dismiss | 204 | 18 | 8.8% |
| 2 | Motion to Dismiss for Lack of Standing | Motion to Dismiss | 137 | 19 | 13.9% |
| 3 | Motion to Dismiss for Lack of Jurisdiction | Motion to Dismiss | 124 | 5 | 4.0% |



DEPENDS-ON Relationships Summary:


| | dependent | prerequisite | dependent_count | both_count | P(prereq\|dependent) |
|---|---|---|---|---|---|
| 0 | Motion to Post Bond | On Appeal | 2 | 2 | 100.0% |
| 1 | Motion for Appeal Bond | On Appeal | 2 | 1 | 50.0% |
| 2 | Motion to Expand the Record | On Appeal | 2 | 2 | 100.0% |
| 3 | Motion to Supplement the Record | On Appeal | 20 | 11 | 55.0% |
| 4 | Motion for Rehearing | On Appeal | 48 | 43 | 89.6% |
| 5 | Motion to Reargue | On Appeal | 35 | 28 | 80.0% |
| 6 | Petition for Rehearing En Banc | On Appeal | 6 | 5 | 83.3% |
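
Several of the counts above are tiny (for example, 2 of 2 documents for "Motion to Post Bond"), so a "100%" rate carries little evidence on its own. As a rough sketch, a Wilson score interval shows how wide the plausible range is for a given count. The helper name `wilson_interval` and the choice of a 95% level (`z=1.96`) are assumptions for illustration, not part of the original analysis.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))

# (both_count, dependent_count) pairs copied from the summary table above.
for name, both, n in [("Motion to Post Bond", 2, 2), ("Motion for Rehearing", 43, 48)]:
    low, high = wilson_interval(both, n)
    print(f"{name}: {both}/{n} -> [{low:.2f}, {high:.2f}]")
```

A wide interval suggests treating the ✓/?/✗ status for that row as provisional rather than as a confirmed relationship.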


## 4. Discover Other Co-occurrence Patterns

Let's explore what postures commonly co-occur to discover relationships we might have missed.
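
Conditional probability alone can be inflated by a very frequent label: if a stage label appears on a large share of documents, P(stage | X) will be high for almost any X. One possible complement is lift, P(A, B) / (P(A) · P(B)), which compares the observed co-occurrence with what independence would predict. The sketch below uses a made-up toy corpus, and `lift_table` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
from collections import Counter
from itertools import combinations

def lift_table(docs):
    """Return {(a, b): lift} with lift = P(a, b) / (P(a) * P(b)), pairs sorted alphabetically."""
    n = len(docs)
    single = Counter()
    pair = Counter()
    for labels in docs:
        unique = sorted(set(labels))  # ignore repeated labels within a document
        single.update(unique)
        pair.update(combinations(unique, 2))
    return {
        (a, b): (count / n) / ((single[a] / n) * (single[b] / n))
        for (a, b), count in pair.items()
    }

# Hypothetical toy corpus (illustration only).
toy_docs = [
    ["On Appeal", "Motion to Dismiss"],
    ["On Appeal", "Petition for Custody"],
    ["On Appeal"],
    ["Appellate Review", "Post-Trial Hearing Motion"],
]

lifts = lift_table(toy_docs)
for (a, b), value in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{value:5.2f}  {a} + {b}")
```

A lift near 1 means the pair co-occurs about as often as chance would suggest; values well above 1 point to a genuine association worth inspecting.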

In [12]:
# Build co-occurrence matrix for top postures
from itertools import combinations

# Count all posture pairs
pair_counts = Counter()
posture_counts = Counter()

for postures in df['postures']:
    for p in postures:
        posture_counts[p] += 1
    for p1, p2 in combinations(sorted(postures), 2):
        pair_counts[(p1, p2)] += 1

print(f"Total unique postures: {len(posture_counts)}")
print(f"Total unique pairs: {len(pair_counts)}")

Total unique postures: 224
Total unique pairs: 922


In [13]:
# Top co-occurring pairs (by raw count)
print("Top 20 co-occurring posture pairs (by count):")
print("="*80)
for (p1, p2), count in pair_counts.most_common(20):
    # Calculate conditional probabilities
    p_p2_given_p1 = count / posture_counts[p1]
    p_p1_given_p2 = count / posture_counts[p2]
    print(f"\n{count:4d} docs: {p1[:40]}..." if len(p1) > 40 else f"\n{count:4d} docs: {p1}")
    print(f"          + {p2[:40]}..." if len(p2) > 40 else f"          + {p2}")
    print(f"          P(2|1)={p_p2_given_p1:.1%}, P(1|2)={p_p1_given_p2:.1%}")

Top 20 co-occurring posture pairs (by count):

1396 docs: Motion to Dismiss
          + On Appeal
          P(2|1)=83.1%, P(1|2)=15.2%

1288 docs: On Appeal
          + Review of Administrative Decision
          P(2|1)=14.0%, P(1|2)=46.4%

1273 docs: Appellate Review
          + Sentencing or Penalty Phase Motion or Ob...
          P(2|1)=27.4%, P(1|2)=94.9%

1064 docs: Appellate Review
          + Trial or Guilt Phase Motion or Objection
          P(2|1)=22.9%, P(1|2)=97.0%

 483 docs: Motion for Attorney's Fees
          + On Appeal
          P(2|1)=78.9%, P(1|2)=5.3%

 476 docs: Appellate Review
          + Post-Trial Hearing Motion
          P(2|1)=10.2%, P(1|2)=93.0%

 218 docs: On Appeal
          + Petition to Terminate Parental Rights
          P(2|1)=2.4%, P(1|2)=99.5%

 192 docs: Motion for New Trial
          + On Appeal
          P(2|1)=85.0%, P(1|2)=2.1%

 186 docs: Motion to Compel Arbitration
          + On Appeal
          P(2|1)=72.9%, P(1|2)=2.0%

 178 docs: Motion t

In [14]:
# Find high conditional probability pairs (potential IS-A or DEPENDS-ON)
print("\nHigh conditional probability pairs (potential relationships):")
print("Looking for P(B|A) > 80% where A appears at least 10 times")
print("="*80)

high_cond_pairs = []
for (p1, p2), count in pair_counts.items():
    if posture_counts[p1] >= 10:
        p_p2_given_p1 = count / posture_counts[p1]
        if p_p2_given_p1 > 0.8:
            high_cond_pairs.append((p1, p2, posture_counts[p1], count, p_p2_given_p1))
    if posture_counts[p2] >= 10:
        p_p1_given_p2 = count / posture_counts[p2]
        if p_p1_given_p2 > 0.8:
            high_cond_pairs.append((p2, p1, posture_counts[p2], count, p_p1_given_p2))

# Sort by conditional probability
high_cond_pairs.sort(key=lambda x: x[4], reverse=True)

# Remove duplicates
seen = set()
for p1, p2, count_p1, count_both, prob in high_cond_pairs[:30]:
    if (p1, p2) not in seen:
        seen.add((p1, p2))
        print(f"\nP({p2[:35]}... | {p1[:35]}...) = {prob:.1%}" if len(p1) > 35 else 
              f"\nP({p2} | {p1}) = {prob:.1%}")
        print(f"   Count of '{p1[:40]}': {count_p1}")
        print(f"   Count of both: {count_both}")


High conditional probability pairs (potential relationships):
Looking for P(B|A) > 80% where A appears at least 10 times

P(On Appeal... | Motion to Set Aside or Vacate Dismi...) = 100.0%
   Count of 'Motion to Set Aside or Vacate Dismissal': 21
   Count of both: 21

P(On Appeal... | Motion for Restraining or Protectio...) = 100.0%
   Count of 'Motion for Restraining or Protection Ord': 59
   Count of both: 59

P(On Appeal... | Motion to Modify or Terminate Alimo...) = 100.0%
   Count of 'Motion to Modify or Terminate Alimony/Ma': 24
   Count of both: 24

P(On Appeal | Petition to Enforce Child Support) = 100.0%
   Count of 'Petition to Enforce Child Support': 12
   Count of both: 12

P(On Appeal... | Request for Award of Permanent Alim...) = 100.0%
   Count of 'Request for Award of Permanent Alimony/M': 21
   Count of both: 21

P(On Appeal... | Motion to Modify Visitation Rights ...) = 100.0%
   Count of 'Motion to Modify Visitation Rights or Pa': 18
   Count of both: 18

P(On Appeal

## 5. Conclusions

The cell below summarizes each hypothesized relationship by its co-occurrence rate: ✓ for above 90%, ? for above 50%, ✗ otherwise.

In [15]:
# Final summary
print("ONTOLOGY VERIFICATION SUMMARY")
print("="*80)

print("\n1. IS-A RELATIONSHIPS:")
for r in isa_results:
    prob = float(r['P(parent|child)'].strip('%')) / 100 if isinstance(r['P(parent|child)'], str) else r['P(parent|child)']
    if r['child_count'] > 0:
        status = "✓" if prob > 0.9 else ("?" if prob > 0.5 else "✗")
        print(f"   {status} {r['child'][:50]}... → {r['parent']}")

print("\n2. DEPENDS-ON RELATIONSHIPS:")
for r in depends_results:
    prob = float(r['P(prereq|dependent)'].strip('%')) / 100 if isinstance(r['P(prereq|dependent)'], str) else r['P(prereq|dependent)']
    if r['dependent_count'] > 0:
        status = "✓" if prob > 0.9 else ("?" if prob > 0.5 else "✗")
        print(f"   {status} {r['dependent'][:50]}... → {r['prerequisite']}")

ONTOLOGY VERIFICATION SUMMARY

1. IS-A RELATIONSHIPS:
   ✗ Motion to Dismiss for Lack of Subject Matter Juris... → Motion to Dismiss
   ✗ Motion to Dismiss for Lack of Personal Jurisdictio... → Motion to Dismiss
   ✗ Motion to Dismiss for Lack of Standing... → Motion to Dismiss
   ✗ Motion to Dismiss for Lack of Jurisdiction... → Motion to Dismiss

2. DEPENDS-ON RELATIONSHIPS:
   ✓ Motion to Post Bond... → On Appeal
   ✗ Motion for Appeal Bond... → On Appeal
   ✓ Motion to Expand the Record... → On Appeal
   ? Motion to Supplement the Record... → On Appeal
   ? Motion for Rehearing... → On Appeal
   ? Motion to Reargue... → On Appeal
   ? Petition for Rehearing En Banc... → On Appeal


In [19]:
# KEY FINDINGS - Labeling Pattern Discovery

print("""
╔══════════════════════════════════════════════════════════════════════════════╗
║              LABELING PRACTICE vs ONTOLOGICAL TRUTH                          ║
╠══════════════════════════════════════════════════════════════════════════════╣
║                                                                              ║
║  IMPORTANT DISTINCTION:                                                      ║
║  ─────────────────────                                                       ║
║  • Ontology: "MTD for Lack of SMJ" IS-A "Motion to Dismiss" ← TRUE           ║
║  • Labels:   Low co-occurrence (3.2%) ← SPARSE LABELING, not wrong ontology  ║
║                                                                              ║
╠══════════════════════════════════════════════════════════════════════════════╣
║                                                                              ║
║  1. SPARSE LABELING PATTERN DISCOVERED                                       ║
║     ────────────────────────────────────                                     ║
║     "Motion to Dismiss for Lack of SMJ" co-occurs with "Motion to Dismiss"   ║
║     only 3.2% of the time.                                                   ║
║                                                                              ║
║     This suggests: Labelers mostly use the MOST SPECIFIC label.              ║
║     They rarely add redundant parent labels (efficient annotation).          ║
║                                                                              ║
║     → The IS-A relationship is ONTOLOGICALLY TRUE                            ║
║     → But labels are SPARSE (most-specific only)                             ║
║                                                                              ║
║  2. STAGE LABELS = REQUIRED CONTEXT TAGS                                     ║
║     ─────────────────────────────────────                                    ║
║     Many postures ALMOST ALWAYS co-occur with stage labels:                  ║
║     - Family Law postures → 90%+ with "On Appeal"                            ║
║     - Criminal postures → 93%+ with "Appellate Review"                       ║
║                                                                              ║
║     → Stages ARE labeled (dense) because they add context                    ║
║     → Motion hierarchies are NOT labeled (sparse) - redundant info           ║
║                                                                              ║
║  3. MODELING IMPLICATIONS                                                    ║
║     ────────────────────────                                                 ║
║     a) Hierarchical loss IS APPROPRIATE for IS-A relationships               ║
║        - Predicting "Motion to Dismiss" when truth is "MTD for Lack of SMJ"  ║
║          is a PARTIAL match (less wrong than predicting something unrelated) ║
║                                                                              ║
║     b) Label EXPANSION at inference time                                     ║
║        - If model predicts "MTD for Lack of SMJ", we can INFER               ║
║          it's also a "Motion to Dismiss" (even though not labeled)           ║
║                                                                              ║
║     c) Multi-task learning still valuable:                                   ║
║        - Stage classifier (On Appeal, Appellate Review, Trial, etc.)         ║
║        - Motion/Proceeding classifier (multi-label, hierarchical)            ║
║                                                                              ║
║     d) Training labels need EXPANSION for hierarchical models                ║
║        - Add implicit parent labels during training                          ║
║        - Or use hierarchical loss that understands the tree                  ║
║                                                                              ║
╚══════════════════════════════════════════════════════════════════════════════╝
""")



In [17]:
# List: which postures almost always (>90%) come with "On Appeal" or "Appellate Review"?

stage_labels = ["On Appeal", "Appellate Review"]

print("Postures that ALMOST ALWAYS co-occur with stage labels (>90%):")
print("="*80)

for stage in stage_labels:
    print(f"\n{stage}:")
    print("-"*40)
    
    # Find all postures that highly co-occur with this stage
    high_cooccur = []
    for posture in posture_counts.keys():
        if posture == stage:
            continue
        pair = tuple(sorted([posture, stage]))
        if pair in pair_counts:
            count_posture = posture_counts[posture]
            count_both = pair_counts[pair]
            if count_posture >= 10:  # At least 10 occurrences
                prob = count_both / count_posture
                if prob > 0.9:
                    high_cooccur.append((posture, count_posture, count_both, prob))
    
    # Sort by probability
    high_cooccur.sort(key=lambda x: x[3], reverse=True)
    
    for posture, count, both, prob in high_cooccur[:15]:
        print(f"  {prob:5.1%} | {posture[:50]}... (n={count})")

Postures that ALMOST ALWAYS co-occur with stage labels (>90%):

On Appeal:
----------------------------------------
  100.0% | Motion to Set Aside or Vacate Dismissal... (n=21)
  100.0% | Motion for Restraining or Protection Order... (n=59)
  100.0% | Motion to Modify or Terminate Alimony/Maintenance... (n=24)
  100.0% | Petition to Enforce Child Support... (n=12)
  100.0% | Request for Award of Permanent Alimony/Maintenance... (n=21)
  100.0% | Motion to Modify Visitation Rights or Parenting Ti... (n=18)
  99.5% | Petition to Terminate Parental Rights... (n=219)
  99.2% | Petition for Divorce or Dissolution... (n=123)
  98.6% | Motion to Renew... (n=74)
  97.6% | Special Motion to Strike... (n=41)
  96.3% | Petition for Adoption... (n=27)
  95.7% | Motion to Modify Property Division Portions of Div... (n=23)
  94.9% | Petition for Custody... (n=59)
  94.1% | Petition for Visitation Rights or Parenting Time... (n=17)
  91.7% | Petition to Set Child Support... (n=36)

Appellate Review:


In [18]:
# Check how "Motion to Dismiss" variants behave in the LABELS: are they rarely
# co-labeled (like alternatives), or always co-labeled with the parent (dense IS-A)?

mtd_variants = [
    "Motion to Dismiss",
    "Motion to Dismiss for Lack of Subject Matter Jurisdiction",
    "Motion to Dismiss for Lack of Personal Jurisdiction", 
    "Motion to Dismiss for Lack of Standing",
    "Motion to Dismiss for Lack of Jurisdiction",
    "Motion to Dismiss for Failure to State a Claim",
    "Motion to Dismiss for Failure to Prosecute",
]

print("Motion to Dismiss Family - Co-occurrence Matrix")
print("="*80)
print("If IS-A: specific should ALWAYS co-occur with general (100%)")
print("If ALTERNATIVES: they should RARELY co-occur (<10%)")
print()

# Build co-occurrence for MTD family
for v1 in mtd_variants:
    if v1 not in posture_counts:
        continue
    print(f"\n{v1} (n={posture_counts[v1]}):")
    for v2 in mtd_variants:
        if v1 == v2 or v2 not in posture_counts:
            continue
        pair = tuple(sorted([v1, v2]))
        both = pair_counts.get(pair, 0)
        prob = both / posture_counts[v1] if posture_counts[v1] > 0 else 0
        if both > 0:
            print(f"  + {v2[:45]}...: {both} docs ({prob:.1%})")

Motion to Dismiss Family - Co-occurrence Matrix
If IS-A: specific should ALWAYS co-occur with general (100%)
If ALTERNATIVES: they should RARELY co-occur (<10%)


Motion to Dismiss (n=1679):
  + Motion to Dismiss for Lack of Subject Matter ...: 11 docs (0.7%)
  + Motion to Dismiss for Lack of Personal Jurisd...: 18 docs (1.1%)
  + Motion to Dismiss for Lack of Standing...: 19 docs (1.1%)
  + Motion to Dismiss for Lack of Jurisdiction...: 5 docs (0.3%)

Motion to Dismiss for Lack of Subject Matter Jurisdiction (n=343):
  + Motion to Dismiss...: 11 docs (3.2%)
  + Motion to Dismiss for Lack of Personal Jurisd...: 8 docs (2.3%)
  + Motion to Dismiss for Lack of Standing...: 12 docs (3.5%)
  + Motion to Dismiss for Lack of Jurisdiction...: 1 docs (0.3%)

Motion to Dismiss for Lack of Personal Jurisdiction (n=204):
  + Motion to Dismiss...: 18 docs (8.8%)
  + Motion to Dismiss for Lack of Subject Matter ...: 8 docs (3.9%)
  + Motion to Dismiss for Lack of Standing...: 2 docs (1.0%)
  + Moti

# Conclusion

## Key Finding: SPARSE vs DENSE Labeling

| Aspect | Finding |
|--------|---------|
| **Ontology** | IS-A relationships are **TRUE** (legally correct) |
| **Labels** | **SPARSE**: the parent is co-labeled in only 3.2% to 13.9% of child documents |
| **Implication** | Need to EXPAND labels or use hierarchical loss |

## Labeling Patterns Discovered:

### 1. Hierarchies → SPARSE (most specific only)
```
"MTD for Lack of SMJ" labeled, but NOT "Motion to Dismiss"
   ↓
Ontologically: MTD for Lack of SMJ IS-A Motion to Dismiss ✓
In labels: Only child labeled, parent omitted (efficient)
```
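
One way to recover the omitted parents is to expand each label list using the IS-A map. The sketch below is an assumption-laden illustration: `PARENT_OF` is a hypothetical child → parent dictionary built from the four IS-A pairs tested in this notebook, and `expand_labels` is a made-up helper name.

```python
# Hypothetical child -> parent map, built from the IS-A pairs tested above.
PARENT_OF = {
    "Motion to Dismiss for Lack of Subject Matter Jurisdiction": "Motion to Dismiss",
    "Motion to Dismiss for Lack of Personal Jurisdiction": "Motion to Dismiss",
    "Motion to Dismiss for Lack of Standing": "Motion to Dismiss",
    "Motion to Dismiss for Lack of Jurisdiction": "Motion to Dismiss",
}

def expand_labels(labels, parent_of):
    """Return the labels plus every ancestor reachable via parent_of (original order kept)."""
    expanded = list(labels)
    seen = set(labels)
    for label in labels:
        current = label
        while current in parent_of:
            current = parent_of[current]
            if current in seen:
                break  # already present (this also guards against cycles in the map)
            seen.add(current)
            expanded.append(current)
    return expanded

example = expand_labels(["Motion to Dismiss for Lack of Standing", "On Appeal"], PARENT_OF)
print(example)
```

The same function could be applied to training labels (label expansion) or to model predictions (inference expansion).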

### 2. Stages → DENSE (always co-labeled)
```
Family Law → 90%+ co-occur with "On Appeal" (many at 99-100%)
Criminal   → 93%+ co-occur with "Appellate Review"
   ↓
Stage labels ADD information, so they're included
```
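
If stage labels and motion labels are to be modeled separately, each posture list first has to be split into the two groups. A minimal sketch follows; it assumes the stage vocabulary is just the two stage labels examined in this notebook (a real stage list would likely be longer), and `split_postures` is a hypothetical helper name.

```python
# Assumption: only the two stage labels examined in this notebook.
STAGE_LABELS = frozenset({"On Appeal", "Appellate Review"})

def split_postures(postures, stage_labels=STAGE_LABELS):
    """Split one document's postures into (stage labels, motion/proceeding labels)."""
    stages = [p for p in postures if p in stage_labels]
    motions = [p for p in postures if p not in stage_labels]
    return stages, motions

# Doc 2 from the sample output at the top of the notebook.
stages, motions = split_postures(["Motion to Compel Arbitration", "On Appeal"])
print("stages: ", stages)
print("motions:", motions)
```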

## Modeling Recommendations:

| Strategy | Description |
|----------|-------------|
| **Hierarchical Loss** | Treat parent predictions as partial credit |
| **Label Expansion** | Add parent labels to training data |
| **Inference Expansion** | If the model predicts a child → also output its parent |
| **Multi-task** | Separate heads for Stage vs Motion |
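
The "partial credit" idea can be illustrated with a hierarchy-aware score. The sketch below computes a set-based hierarchical F1 (predicted and true labels are each extended with their ancestors before comparing). It is an evaluation metric rather than a differentiable training loss, and the names `PARENT_OF`, `with_ancestors`, and `hierarchical_f1` are hypothetical.

```python
# Hypothetical child -> parent map, using one of the IS-A pairs from this notebook.
PARENT_OF = {
    "Motion to Dismiss for Lack of Subject Matter Jurisdiction": "Motion to Dismiss",
}

def with_ancestors(labels, parent_of):
    """Return the set of labels together with all of their ancestors."""
    out = set()
    for label in labels:
        current = label
        while current is not None and current not in out:
            out.add(current)
            current = parent_of.get(current)
    return out

def hierarchical_f1(predicted, truth, parent_of):
    """F1 over ancestor-extended label sets, so a parent prediction earns partial credit."""
    pred = with_ancestors(predicted, parent_of)
    gold = with_ancestors(truth, parent_of)
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

truth = ["Motion to Dismiss for Lack of Subject Matter Jurisdiction"]
exact = hierarchical_f1(truth, truth, PARENT_OF)
parent_only = hierarchical_f1(["Motion to Dismiss"], truth, PARENT_OF)
unrelated = hierarchical_f1(["Motion for New Trial"], truth, PARENT_OF)
print(f"exact={exact:.2f} parent_only={parent_only:.2f} unrelated={unrelated:.2f}")
```

Predicting the parent scores between an exact match and an unrelated label, which is the ordering the hierarchical-loss recommendation asks for.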