# Legislation Comparison Analysis

This notebook compares the results between JSONL output and CSV files with legislation, and calculates precision, recall, and F-measure.

In [27]:
import os
import pandas as pd
import numpy as np
import json
from sklearn.metrics import precision_recall_fscore_support, confusion_matrix

## Parse the JSONL File

Parse the JSONL file using the provided code to extract the nested JSON content

In [28]:
# Read and parse each line in the output file
parsed_responses = []

with open("eat_2025_29_openai_batch_output.jsonl", "r") as f:
    for line in f.readlines():
        data = json.loads(line)
        try:
            raw_content = data["response"]["body"]["choices"][0]["message"]["content"]
            response_json = json.loads(raw_content)  # Parse the stringified JSON
            para_id = response_json.get("para_id")
            contains_law_application = response_json.get("contains_application")
            law_application_matches = response_json.get("matches")
            parsed_responses.append({
                "para_id": para_id,
                "contains_law_application": contains_law_application,
                "law_application_matches": law_application_matches,
                "response": response_json
            })
        except Exception as e:
            print(f"Error parsing line: {e}")

# Convert to DataFrame
jsonl_df = pd.DataFrame(parsed_responses)
print(f"Parsed {len(jsonl_df)} records from the JSONL file")

# Display the first few rows
display(jsonl_df.head())

Parsed 57 records from the JSONL file


Unnamed: 0,para_id,contains_law_application,law_application_matches,response
0,eat_2025_29#para_1,False,,"{'para_id': 'eat_2025_29#para_1', 'contains_ap..."
1,eat_2025_29#para_10,True,[{'caselaw_excerpt': 'the cause of the appella...,"{'para_id': 'eat_2025_29#para_10', 'contains_a..."
2,eat_2025_29#para_11,False,[],"{'para_id': 'eat_2025_29#para_11', 'contains_a..."
3,eat_2025_29#para_12,True,[{'caselaw_excerpt': 'the failure to make reas...,"{'para_id': 'eat_2025_29#para_12', 'contains_a..."
4,eat_2025_29#para_13,True,"[{'caselaw_excerpt': 'Mr Gourley's condition, ...","{'para_id': 'eat_2025_29#para_13', 'contains_a..."


## Load the CSV File

Load the CSV file with legislation for comparison

In [29]:
# Load the CSV file
csv_path = '../data/test2/csv_cases/csv_with_legislation/eat_2025_29.csv'
try:
    csv_df = pd.read_csv(csv_path)
    print(f"Loaded CSV file with {len(csv_df)} rows")
    
    # Display the first few rows to understand the structure
    display(csv_df.head())
except Exception as e:
    print(f"Error loading CSV file: {str(e)}")

Loaded CSV file with 171 rows


Unnamed: 0,case_uri,para_id,paragraphs,references,application_of_law_phrases_actual,if_law_applied_actual,reason(optional),application_of_law_phrases.1,applied provision,act,legislative term,if_law_applied_gpt-4o-mini,application_of_law_phrases_gpt-4o-mini,reason_gpt-4o-mini,if_law_applied_gpt-4o,application_of_law_phrases_gpt-4o,reason_gpt-4o,section_id,section_text
0,https://caselaw.nationalarchives.gov.uk/eat/20...,para_1,\n\t \n\t 1. \n\t \n\t \n\t \n\t The...,[],[],no,,,,,,0,[],The paragraph provides procedural background i...,0,[],The paragraph provides procedural information ...,ukpga/2010/15_section_133,133 Remedies in pensions cases (1) This sectio...
1,https://caselaw.nationalarchives.gov.uk/eat/20...,para_1,\n\t \n\t 1. \n\t \n\t \n\t \n\t The...,[],[],no,,,,,,0,[],The paragraph provides procedural background i...,0,[],The paragraph provides procedural information ...,ukpga/2010/15_section_172,172 Appeals (1) A person who is aggrieved by t...
2,https://caselaw.nationalarchives.gov.uk/eat/20...,para_1,\n\t \n\t 1. \n\t \n\t \n\t \n\t The...,[],[],no,,,,,,0,[],The paragraph provides procedural background i...,0,[],The paragraph provides procedural information ...,ukpga/2010/15_section_116,116 Education cases (1) A claim is within this...
3,https://caselaw.nationalarchives.gov.uk/eat/20...,para_2,\n\t \n\t 2. \n\t \n\t \n\t \n\t For...,[],[],no,,,,,,0,[],The paragraph provides background information ...,0,[],The paragraph provides definitions and simplif...,ukpga/2010/15_section_172,172 Appeals (1) A person who is aggrieved by t...
4,https://caselaw.nationalarchives.gov.uk/eat/20...,para_2,\n\t \n\t 2. \n\t \n\t \n\t \n\t For...,[],[],no,,,,,,0,[],The paragraph provides background information ...,0,[],The paragraph provides definitions and simplif...,ukpga/2010/15_section_136,136 Burden of proof (1) This section applies t...


## Update Law Application Flag

If case_law_excerpt is not from a paragraph, update the law application flag to False/No

In [30]:
# Check if CSV dataframe is loaded
if 'csv_df' in locals():
    # Make a copy to avoid modifying the original
    updated_csv_df = csv_df.copy()
    
    # Check if required columns exist
    if 'paragraphs' in updated_csv_df.columns and 'if_law_applied_actual' in updated_csv_df.columns:
        # Identify rows where paragraphs is not from a paragraph
        not_from_para = updated_csv_df['paragraphs'].isna() | \
                       (~updated_csv_df['paragraphs'].str.contains('para', case=False, na=False))
        
        # Update if_law_applied_actual to 'no' for these rows
        updated_csv_df.loc[not_from_para, 'if_law_applied_actual'] = 'no'
        
        print(f"Updated {sum(not_from_para)} rows where paragraphs is not from paragraph")
    else:
        print("Required columns not found in CSV file")
else:
    print("CSV dataframe not loaded")

Updated 135 rows where paragraphs is not from paragraph


## Prepare Data for Comparison

Prepare both dataframes for comparison by aligning columns and data types

In [31]:
# Prepare the JSONL dataframe
if 'jsonl_df' in locals():
    # Convert the application of law column to boolean
    if 'contains_law_application' in jsonl_df.columns:
        # Create a boolean version of the column
        jsonl_df['law_application_bool'] = jsonl_df['contains_law_application'].map(
            {True: True, False: False, 'yes': True, 'no': False, 'true': True, 'false': False}
        )
        print("Converted JSONL law application column to boolean")
    else:
        print("Law application column not found in JSONL dataframe")

# Prepare the CSV dataframe
if 'updated_csv_df' in locals():
    # Convert the application of law column to boolean
    if 'if_law_applied_actual' in updated_csv_df.columns:
        # Create a boolean version of the column
        updated_csv_df['law_application_bool'] = updated_csv_df['if_law_applied_actual'].map(
            {'yes': True, 'no': False, True: True, False: False}
        )
        print("Converted CSV law application column to boolean")
    else:
        print("Law application column not found in CSV dataframe")

Converted JSONL law application column to boolean
Converted CSV law application column to boolean


## Merge Dataframes

Merge the JSONL and CSV dataframes on para_id for comparison

In [32]:
# Embed 'eat_2025_29#' in the para_id column of CSV dataframe
updated_csv_df['para_id'] = 'eat_2025_29#' + updated_csv_df['para_id'].astype(str)

# Merge on para_id
try:
    merged_df = pd.merge(jsonl_df, updated_csv_df, on='para_id', suffixes=('_jsonl', '_csv'))
    print(f"Merged dataframes on para_id. Result has {len(merged_df)} rows.")
    
    # Display the first few rows
    display(merged_df.head())
except Exception as e:
    print(f"Error merging dataframes: {str(e)}")

Merged dataframes on para_id. Result has 171 rows.


Unnamed: 0,para_id,contains_law_application,law_application_matches,response,law_application_bool_jsonl,case_uri,paragraphs,references,application_of_law_phrases_actual,if_law_applied_actual,...,legislative term,if_law_applied_gpt-4o-mini,application_of_law_phrases_gpt-4o-mini,reason_gpt-4o-mini,if_law_applied_gpt-4o,application_of_law_phrases_gpt-4o,reason_gpt-4o,section_id,section_text,law_application_bool_csv
0,eat_2025_29#para_1,False,,"{'para_id': 'eat_2025_29#para_1', 'contains_ap...",False,https://caselaw.nationalarchives.gov.uk/eat/20...,\n\t \n\t 1. \n\t \n\t \n\t \n\t The...,[],[],no,...,,0,[],The paragraph provides procedural background i...,0,[],The paragraph provides procedural information ...,ukpga/2010/15_section_133,133 Remedies in pensions cases (1) This sectio...,False
1,eat_2025_29#para_1,False,,"{'para_id': 'eat_2025_29#para_1', 'contains_ap...",False,https://caselaw.nationalarchives.gov.uk/eat/20...,\n\t \n\t 1. \n\t \n\t \n\t \n\t The...,[],[],no,...,,0,[],The paragraph provides procedural background i...,0,[],The paragraph provides procedural information ...,ukpga/2010/15_section_172,172 Appeals (1) A person who is aggrieved by t...,False
2,eat_2025_29#para_1,False,,"{'para_id': 'eat_2025_29#para_1', 'contains_ap...",False,https://caselaw.nationalarchives.gov.uk/eat/20...,\n\t \n\t 1. \n\t \n\t \n\t \n\t The...,[],[],no,...,,0,[],The paragraph provides procedural background i...,0,[],The paragraph provides procedural information ...,ukpga/2010/15_section_116,116 Education cases (1) A claim is within this...,False
3,eat_2025_29#para_10,True,[{'caselaw_excerpt': 'the cause of the appella...,"{'para_id': 'eat_2025_29#para_10', 'contains_a...",True,https://caselaw.nationalarchives.gov.uk/eat/20...,\n\t \n\t 10. \n\t \n\t \n\t \n\t Th...,[],[],no,...,,1,['the appellant’s position was that his psychi...,The paragraph discusses the application of leg...,0,[],The paragraph discusses the disputed issues at...,ukpga/2010/15_section_133,133 Remedies in pensions cases (1) This sectio...,False
4,eat_2025_29#para_10,True,[{'caselaw_excerpt': 'the cause of the appella...,"{'para_id': 'eat_2025_29#para_10', 'contains_a...",True,https://caselaw.nationalarchives.gov.uk/eat/20...,\n\t \n\t 10. \n\t \n\t \n\t \n\t Th...,[],[],no,...,,1,['the appellant’s position was that his psychi...,The paragraph discusses the application of leg...,0,[],The paragraph discusses the disputed issues at...,ukpga/2010/15_section_122,"122 References by court to tribunal, etc. (1) ...",False


## Calculate Metrics

Calculate precision, recall, and F1 score for the law application detection

In [33]:
# Check if merged dataframe exists
if 'merged_df' in locals():
    # Check if the boolean columns exist
    jsonl_bool_col = 'law_application_bool_jsonl' if 'law_application_bool_jsonl' in merged_df.columns else None
    csv_bool_col = 'law_application_bool_csv' if 'law_application_bool_csv' in merged_df.columns else None
    
    # If the columns don't exist with suffixes, check without suffixes
    if not jsonl_bool_col and 'law_application_bool' in merged_df.columns:
        jsonl_bool_col = 'law_application_bool'
    if not csv_bool_col and 'law_application_bool' in merged_df.columns:
        csv_bool_col = 'law_application_bool'
    
    # If still not found, check the original columns
    if not jsonl_bool_col:
        potential_cols = ['contains_law_application_jsonl', 'contains_law_application']
        for col in potential_cols:
            if col in merged_df.columns:
                jsonl_bool_col = col
                break
    
    if not csv_bool_col:
        potential_cols = ['if_law_applied_actual_csv', 'if_law_applied_actual']
        for col in potential_cols:
            if col in merged_df.columns:
                csv_bool_col = col
                break
    
    if jsonl_bool_col and csv_bool_col:
        # Convert to boolean if needed
        if merged_df[jsonl_bool_col].dtype != bool:
            merged_df[f"{jsonl_bool_col}_bool"] = merged_df[jsonl_bool_col].map(
                {True: True, False: False, 'yes': True, 'no': False, 'true': True, 'false': False}
            )
            jsonl_bool_col = f"{jsonl_bool_col}_bool"
        
        if merged_df[csv_bool_col].dtype != bool:
            merged_df[f"{csv_bool_col}_bool"] = merged_df[csv_bool_col].map(
                {True: True, False: False, 'yes': True, 'no': False, 'true': True, 'false': False}
            )
            csv_bool_col = f"{csv_bool_col}_bool"
        
        # Calculate metrics
        try:
            y_true = merged_df[csv_bool_col]
            y_pred = merged_df[jsonl_bool_col]
            
            precision, recall, f1, _ = precision_recall_fscore_support(
                y_true, y_pred, average='binary', zero_division=0
            )
            
            print("\nMetrics for Law Application Detection:")
            print(f"Precision: {precision:.4f}")
            print(f"Recall: {recall:.4f}")
            print(f"F1 Score: {f1:.4f}")
            
            # Create a confusion matrix
            cm = confusion_matrix(y_true, y_pred)
            
            print("\nConfusion Matrix:")
            print(f"True Negatives: {cm[0][0]}")
            print(f"False Positives: {cm[0][1]}")
            print(f"False Negatives: {cm[1][0]}")
            print(f"True Positives: {cm[1][1]}")
            
            # Calculate additional metrics
            accuracy = (cm[0][0] + cm[1][1]) / cm.sum()
            specificity = cm[0][0] / (cm[0][0] + cm[0][1]) if (cm[0][0] + cm[0][1]) > 0 else 0
            
            print(f"Accuracy: {accuracy:.4f}")
            print(f"Specificity: {specificity:.4f}")
        except Exception as e:
            print(f"Error calculating metrics: {str(e)}")
    else:
        print("Boolean columns for comparison not found in merged dataframe")
else:
    print("Merged dataframe not available for metric calculation")


Metrics for Law Application Detection:
Precision: 0.0476
Recall: 1.0000
F1 Score: 0.0909

Confusion Matrix:
True Negatives: 45
False Positives: 120
False Negatives: 0
True Positives: 6
Accuracy: 0.2982
Specificity: 0.2727


## Validate Law Application Phrases

Check if phrases are legitimately found in the text

In [34]:
def check_if_phrase_in_text(phrase, text):
    """Check if a phrase is contained in the text"""
    try:
        if phrase in text:
            return True
        else:
            return False
    except:
        return False

def get_legit_phrases(df):
    """Extract legitimate phrases from the dataframe"""
    # Make a copy to avoid modifying the original
    df_copy = df.copy()
    
    # Check if required columns exist
    if 'law_application_matches' in df_copy.columns and 'paragraphs' in df_copy.columns:
        # Create a new column for legitimate phrases
        df_copy['legit_phrases'] = None
        df_copy['has_legit_phrases'] = False
        
        # Process each row
        for idx, row in df_copy.iterrows():
            matches = row['law_application_matches']
            text = str(row['paragraphs'])
            
            if isinstance(matches, list) and len(matches) > 0:
                legit_phrases = []
                
                for match in matches:
                    if isinstance(match, dict) and 'caselaw_excerpt' in match:
                        phrase = match['caselaw_excerpt']
                        if check_if_phrase_in_text(phrase, text):
                            legit_phrases.append(phrase)
                
                df_copy.at[idx, 'legit_phrases'] = legit_phrases
                df_copy.at[idx, 'has_legit_phrases'] = len(legit_phrases) > 0
        
        print(f"Processed {len(df_copy)} rows for legitimate phrases")
        print(f"Found {df_copy['has_legit_phrases'].sum()} rows with legitimate phrases")
    else:
        print("Required columns not found in dataframe")
    
    return df_copy

# Process the merged dataframe if it exists
if 'merged_df' in locals():
    validated_df = get_legit_phrases(merged_df)
    
    # Update the law application flag based on legitimate phrases
    validated_df['validated_law_application'] = validated_df['has_legit_phrases']
    
    # Display statistics
    print("\nValidated Law Application Statistics:")
    print(f"Total rows: {len(validated_df)}")
    print(f"Rows with legitimate phrases: {validated_df['has_legit_phrases'].sum()}")
    print(f"Rows with law application (original): {validated_df[jsonl_bool_col].sum()}")
    print(f"Rows with law application (validated): {validated_df['validated_law_application'].sum()}")
else:
    print("Merged dataframe not available for validation")

Processed 171 rows for legitimate phrases
Found 87 rows with legitimate phrases

Validated Law Application Statistics:
Total rows: 171
Rows with legitimate phrases: 87
Rows with law application (original): 126
Rows with law application (validated): 87


In [45]:
merged_df_Selected = merged_df[['para_id', 'contains_law_application', 'law_application_matches', 'paragraphs','law_application_bool_jsonl','if_law_applied_actual','validated_law_application']]

KeyError: "['validated_law_application'] not in index"

In [43]:
#drop na all
merged_df_Selected = merged_df_Selected.dropna()
# Drop duplicates using only hashable columns (exclude 'law_application_matches' which is a list)
hashable_cols = [col for col in merged_df_Selected.columns if merged_df_Selected[col].apply(lambda x: isinstance(x, (str, int, float, bool, type(None)))).all()]
merged_df_Selected = merged_df_Selected.drop_duplicates(subset=hashable_cols)

In [44]:
merged_df_Selected

Unnamed: 0,para_id,contains_law_application,law_application_matches,paragraphs,law_application_bool_jsonl,if_law_applied_actual
3,eat_2025_29#para_10,True,[{'caselaw_excerpt': 'the cause of the appella...,\n\t \n\t 10. \n\t \n\t \n\t \n\t Th...,True,no
6,eat_2025_29#para_11,False,[],\n\t \n\t 11. \n\t \n\t \n\t \n\t On...,False,no
9,eat_2025_29#para_12,True,[{'caselaw_excerpt': 'the failure to make reas...,\n\t \n\t 12. \n\t \n\t \n\t \n\t ...,True,no
12,eat_2025_29#para_13,True,"[{'caselaw_excerpt': 'Mr Gourley's condition, ...",\n\t \n\t 13. \n\t \n\t \n\t \n\t ...,True,no
15,eat_2025_29#para_14,True,[{'caselaw_excerpt': 'the failure to make reas...,\n\t \n\t 14. \n\t \n\t \n\t \n\t ...,True,no
18,eat_2025_29#para_15,True,[{'caselaw_excerpt': 'the tribunal is satisfie...,\n\t \n\t 15. \n\t \n\t \n\t \n\t ...,True,no
21,eat_2025_29#para_16,True,[{'caselaw_excerpt': 'the failure to make adju...,\n\t \n\t 16. \n\t \n\t \n\t \n\t Wh...,True,no
24,eat_2025_29#para_17,True,[{'caselaw_excerpt': 'the respondents have not...,\n\t \n\t 17. \n\t \n\t \n\t \n\t ...,True,no
27,eat_2025_29#para_18,True,[{'caselaw_excerpt': 'the claimant would not h...,\n\t \n\t 18. \n\t \n\t \n\t \n\t ...,True,no
30,eat_2025_29#para_19,True,[{'caselaw_excerpt': 'The tribunal has assesse...,\n\t \n\t 19. \n\t \n\t \n\t \n\t ...,True,no


## Analyze Law Application Matches

Analyze the law application matches extracted from the JSONL file

In [35]:
# Check if JSONL dataframe exists
if 'jsonl_df' in locals() and 'law_application_matches' in jsonl_df.columns:
    # Count the number of matches per paragraph
    jsonl_df['match_count'] = jsonl_df['law_application_matches'].apply(lambda x: len(x) if isinstance(x, list) else 0)
    
    # Display statistics
    print("\nLaw Application Matches Statistics:")
    print(f"Total paragraphs: {len(jsonl_df)}")
    print(f"Paragraphs with matches: {(jsonl_df['match_count'] > 0).sum()}")
    print(f"Total matches: {jsonl_df['match_count'].sum()}")
    print(f"Average matches per paragraph: {jsonl_df['match_count'].mean():.2f}")
    print(f"Max matches in a paragraph: {jsonl_df['match_count'].max()}")
else:
    print("JSONL dataframe or law_application_matches column not available for analysis")


Law Application Matches Statistics:
Total paragraphs: 57
Paragraphs with matches: 42
Total matches: 46
Average matches per paragraph: 0.81
Max matches in a paragraph: 2


## Conclusion

This analysis compares the results between the JSONL output file and the CSV file with legislation. We've calculated precision, recall, and F-measure for the application of law detection.

The comparison helps evaluate the accuracy of the application of law detection and identify areas for improvement.