### Citation anonymization test

In this notebook, we apply the "citation anonymization" test. Modern LLMs are trained on very large text corpora, which raises the risk that the model has seen the cited passage during training and has "memorized" the correct answer.

For the anonymization test, we first ask an LLM to rewrite the citing text so that case names and citations are replaced with substitutes. The rewritten prompt is then unlikely to have appeared verbatim in training, and the cases it mentions are unlikely to exist. We then ask an LLM to identify the correct **holding** from the rewritten prompt, as before.

In [1]:
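import pandas as pd
from datasets import load_dataset
from langchain_aws import ChatBedrockConverse
import boto3
from botocore.config import Config
from sklearn.metrics import f1_score
import numpy as np
import re

# Allow long-running Bedrock calls before the read times out (seconds).
config = Config(read_timeout=1000)

# Bedrock runtime client shared by the LLM wrapper below.
# Calling boto3.client avoids rebinding an imported `client` function.
client = boto3.client(service_name='bedrock-runtime',
                      config=config, region_name="us-east-1")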

In [2]:
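def create_new_citation_prompt(r):
    # Prompt asking the model to swap case names and citation numbers, leaving the rest of the text unchanged.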
    return """Input: "{}"

    Task: Please rewrite the input, anonymizing any case references. Substitute other names when legal cases are referenced, and other numbers for case citations. Make no other changes.

    Output: """.format(r['citing_prompt'])

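def create_prompt(r):
    # Multiple-choice prompt: the anonymized citing text followed by the five candidate holdings (A-E).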
    return """Input: "{}"
    Question: Consider the appropriate legal grammar and reasoning. Select which of the following clauses is the best replacement for the <HOLDING> tag above? 
    A: {}
    B: {}
    C: {}
    D: {}
    E: {}
    
    First think it through. Then provide your answer formatted in backticks like ANSWER:`A` or ANSWER:`C`.
    
    REASONING: """.format(r['anonymized_citation_llama_90b'], r['holding_0'], r['holding_1'], r['holding_2'], r['holding_3'], r['holding_4'])

In [None]:
test = pd.read_csv('casehold_test.csv')
test['initial_prompt'] = test.apply(create_new_citation_prompt, axis = 1)

In [None]:
#test['anonymized_citation_llama_90b'] = ''
#test['response_llama_90b_anonymized'] = ''

In [None]:
llm_llama32_90b = ChatBedrockConverse(model="us.meta.llama3-2-90b-instruct-v1:0", region_name="us-east-1", temperature = 0, client = client)

In [None]:
for i in range(test.shape[0]):
    print(i)  # progress indicator

    # Step 1: ask the model to anonymize the citing text.
    messages = [("user", test.loc[i, 'initial_prompt'])]
    e = llm_llama32_90b.invoke(messages)
    test.loc[i, 'anonymized_citation_llama_90b'] = e.content

    # Step 2: build the multiple-choice prompt from the anonymized text.
    row = test.loc[i]
    final_prompt = create_prompt(row)
    messages2 = [("user", final_prompt)]

    # Step 3: record the model's answer for the anonymized prompt.
    f = llm_llama32_90b.invoke(messages2)
    test.loc[i, 'response_llama_90b_anonymized'] = f.content

In [3]:
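# The commented-out line below was used to save the results of the loop above; the saved file is reloaded instead of re-running the model.
# Saving without index=False is what produces the `Unnamed: 0` columns when the file is read back.
#test[['example_id', 'citing_prompt', 'holding_0', 'holding_1', 'holding_2', 'holding_3', 'holding_4', 'label', 'response_llama_90b', 'initial_prompt', 'anonymized_citation_llama_90b', 'response_llama_90b_anonymized']].to_csv('casehold_test_with_anonymization.csv')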

test = pd.read_csv('casehold_test_with_anonymization.csv')

### Examples of anonymized prompts

In [12]:
for i in [7, 5270, 816, 5308]:
    print('***')
    print(test.loc[i]['citing_prompt'])
    print(test.loc[i]['anonymized_citation_llama_90b'])

***
she argues that that affidavit demonstrates that Mandeville’s testimony regarding DMV complaints about Bucci is not worthy of belief. It is important to note, however, that San Bento said in his affidavit that he did not recall making any complaints about Bucci and that he found her to be “professional, efficient and competent.” But Mandeville never said that it had been San Bento who had made a complaint to her about Bucci. Indeed, other employees from the DMV, or employees from motor-vehicle departments from other states, may well have complained about Bucci’s performance. When a plaintiff attempts to counter a claim by an employer that it fired an employee for poor performance, it is simply not sufficient for a plaintiff to present evidence that her performance wa (7th Cir.2007) (<HOLDING>). Further, the policy states that there are
She argues that that affidavit demonstrates that Johnson's testimony regarding DMV complaints about Lee is not worthy of belief. It is important to 

### Answer analysis

In [5]:
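def extract_answers(text):
    # Find the first option letter (A-E) that is preceded by a colon, space, or
    # backtick and followed by a backtick or the end of the text, e.g. ANSWER:`C`.
    pattern = r"(?::| |\`)([A-E])(?:\`|$)"
    match = re.search(pattern, text)
    letter = match.group(1) if match else ''

    if letter:
        # Map the letter to its option index (A -> 0, ..., E -> 4).
        return ord(letter) - ord('A')
    else:
        # No parsable answer was found.
        return -1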
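
Because `re.search` returns the first match, a response whose reasoning mentions a backticked option letter before the final answer is parsed as that earlier letter. The sketch below runs the notebook's pattern on a few made-up responses and contrasts it with a stricter parser. The function `extract_final_answer` and the sample strings are hypothetical and not part of the original notebook; the stricter parser assumes the model follows the requested ANSWER:`X` format.

```python
import re

# Same pattern as the notebook's extract_answers.
NOTEBOOK_PATTERN = r"(?::| |\`)([A-E])(?:\`|$)"

def extract_answers(text):
    """Notebook behaviour: index of the first matching letter, or -1."""
    match = re.search(NOTEBOOK_PATTERN, text)
    return ord(match.group(1)) - ord('A') if match else -1

def extract_final_answer(text):
    """Hypothetical stricter parser: read only the last ANSWER:`X` marker."""
    letters = re.findall(r"ANSWER:\s*`([A-E])`", text)
    return ord(letters[-1]) - ord('A') if letters else -1

# Made-up responses for illustration only.
samples = [
    "The court reasons ... ANSWER:`C`",
    "Option `A` misstates the rule, so ANSWER:`C`",
    "I cannot decide.",
]
for s in samples:
    print(extract_answers(s), extract_final_answer(s))
```

On the second sample the two parsers disagree: the notebook's pattern picks up the `A` mentioned in the reasoning, while the stricter version reads the final `C`. Whether this affects many rows in practice would need to be checked against the saved responses.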

In [6]:
test['answer_llama_90b'] = test['response_llama_90b'].apply(lambda x: extract_answers(x))
test['answer_llama_90b_anonymized'] = test['response_llama_90b_anonymized'].apply(lambda x: extract_answers(x))

In [7]:
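# Replace unparsed answers (-1) with uniform random guesses over the five options.
# No random seed is set, so the scores below can vary slightly between runs.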
test.loc[test['answer_llama_90b'] == -1, 'answer_llama_90b'] = np.random.choice(5, size = test.query('answer_llama_90b == -1').shape[0])
test.loc[test['answer_llama_90b_anonymized'] == -1, 'answer_llama_90b_anonymized'] = np.random.choice(5, size = test.query('answer_llama_90b_anonymized == -1').shape[0])
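
The random fill above can be made repeatable by drawing from a seeded generator. This is a minimal sketch on a toy frame, assuming a fixed seed is acceptable for the analysis. The helper `fill_unparsed`, the seed value, and the toy data are illustrative and not from the notebook, and a seeded run would not reproduce the exact scores reported below.

```python
import numpy as np
import pandas as pd

def fill_unparsed(df, column, rng, n_choices=5):
    """Replace -1 (unparsed) entries in `column` with uniform random guesses."""
    mask = df[column] == -1
    df.loc[mask, column] = rng.integers(0, n_choices, size=int(mask.sum()))
    return df

rng = np.random.default_rng(0)  # fixed seed: an assumption, not in the notebook
toy = pd.DataFrame({'answer': [2, -1, 4, -1, 0]})  # toy data, -1 marks unparsed
toy = fill_unparsed(toy, 'answer', rng)
print(toy['answer'].tolist())
```

Parsed answers are left untouched; only the `-1` placeholders are replaced with values in `0..4`.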

In [8]:
(f1_score(test['label'], test['answer_llama_90b'], average='macro'), 
f1_score(test['label'], test['answer_llama_90b_anonymized'], average='macro'))


(0.7138425651864675, 0.7002595152899526)
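
The two macro-F1 scores differ by a little over one point, and a paired bootstrap is one way to gauge whether a gap of that size exceeds resampling noise. The sketch below is dependency-free and runs on toy data, not on the notebook's results. The functions `macro_f1` and `paired_bootstrap_gap` are hypothetical helpers. This `macro_f1` always averages over all five classes, whereas scikit-learn's `f1_score(..., average='macro')` averages over the classes present in the data, so the two can differ when a class is absent.

```python
import random

def macro_f1(labels, preds, n_classes=5):
    """Macro-averaged F1 over classes 0..n_classes-1 (a class with no
    true or predicted instances contributes 0)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for y, p in zip(labels, preds) if y == c and p == c)
        fp = sum(1 for y, p in zip(labels, preds) if y != c and p == c)
        fn = sum(1 for y, p in zip(labels, preds) if y == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes

def paired_bootstrap_gap(labels, preds_a, preds_b, n_resamples=1000, seed=0):
    """Resample rows with replacement, keeping each row's two predictions
    paired; return the sorted list of F1(a) - F1(b) gaps."""
    rng = random.Random(seed)
    n = len(labels)
    gaps = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        a = [preds_a[i] for i in idx]
        b = [preds_b[i] for i in idx]
        gaps.append(macro_f1(y, a) - macro_f1(y, b))
    return sorted(gaps)

# Toy data: "anonymized" predictions are wrong on every tenth row.
labels = [0, 1, 2, 3, 4] * 20
original = list(labels)
anonymized = [(y + 1) % 5 if i % 10 == 0 else y for i, y in enumerate(labels)]

gaps = paired_bootstrap_gap(labels, original, anonymized, n_resamples=200)
lo = gaps[int(0.025 * len(gaps))]
hi = gaps[int(0.975 * len(gaps))]
print(f"approximate 95% interval for the F1 gap: [{lo:.3f}, {hi:.3f}]")
```

Applied to the notebook's columns, `labels`, `original`, and `anonymized` would be `test['label']`, `test['answer_llama_90b']`, and `test['answer_llama_90b_anonymized']` converted to lists. An interval that includes zero would suggest the gap is within noise.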

In [9]:
# This checks only whether the <HOLDING> tag survived the rewrite; it does not verify that names or citations were changed.
test['anonymized_well'] = test['anonymized_citation_llama_90b'].apply(lambda x: '<HOLDING>' in x)

test['anonymized_well'].mean()

np.float64(0.931501693639443)
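
The `<HOLDING>` check confirms that the tag survived, but not that the rewrite changed anything. The sketch below is a slightly stricter check that also flags rewrites identical to the original text. The helper `anonymization_status` and the example strings are hypothetical. Stripping surrounding double quotes is an assumption based on some anonymized outputs being wrapped in quotes.

```python
HOLDING_TAG = '<HOLDING>'

def _normalize(text):
    """Trim whitespace and any wrapping double quotes."""
    return text.strip().strip('"').strip()

def anonymization_status(original, rewritten):
    """Classify a rewrite as 'tag_missing', 'unchanged', or 'rewritten'."""
    if not isinstance(rewritten, str) or HOLDING_TAG not in rewritten:
        return 'tag_missing'
    if _normalize(rewritten) == _normalize(original):
        return 'unchanged'
    return 'rewritten'

# Made-up examples for illustration only.
original = 'See Smith v. Jones, 1 F.3d 2 (<HOLDING>). Further,'
print(anonymization_status(original, '"See Doe v. Roe, 9 F.3d 8 (<HOLDING>). Further,"'))
print(anonymization_status(original, '"See Smith v. Jones, 1 F.3d 2 (<HOLDING>). Further,"'))
print(anonymization_status(original, 'See Doe v. Roe, 9 F.3d 8. Further,'))
```

In the notebook this could be applied row-wise to `citing_prompt` and `anonymized_citation_llama_90b`. A rewrite that differs from the original may still leave some case names intact, so this remains a coarse check.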

In [10]:
test.query('anonymized_well == True & answer_llama_90b != answer_llama_90b_anonymized')

Unnamed: 0.1,Unnamed: 0,example_id,citing_prompt,holding_0,holding_1,holding_2,holding_3,holding_4,label,response_llama_90b,initial_prompt,anonymized_citation_llama_90b,response_llama_90b_anonymized,answer_llama_90b,answer_llama_90b_anonymized,anonymized_well
6,6,47827,the difference between it and the previous pro...,holding that payments made to a debtor from in...,holding inheritance received by debtor more th...,holding noncompete payments were not exempt fr...,holding that income payment debtor received fr...,holding that distributions from an inter vivos...,3.0,To determine the best replacement for the <HOL...,"Input: ""the difference between it and the prev...","""the difference between it and the previous pr...",To determine the best replacement for the <HOL...,3,4,True
12,12,47833,waterways.” It is well established that “ ‘an ...,holding that animal studies can be a proper fo...,holding that a reliable differential diagnosis...,holding inadmissible auto reconstruction exper...,holding that an experts opinion must be based ...,holding that an expert opinion on a question o...,2.0,To determine the best replacement for the <HOL...,"Input: ""waterways.” It is well established tha...","""waterways.” It is well established that “ ‘an...",To determine the best replacement for the <HOL...,0,3,True
25,25,47846,the power and the intention at a given time to...,holding that the mandatory nature of the unite...,recognizing that under the booker remedial reg...,holding that sentencing under the mandatory gu...,holding that the fact that a sentence imposed ...,holding that even in the absence of a sixth am...,2.0,To determine the best replacement for the <HOL...,"Input: ""the power and the intention at a given...","""the power and the intention at a given time t...",To determine the best replacement for the <HOL...,1,0,True
34,34,47855,"Mindful of these words, it should be noted tha...",holding that an insured has substantially comp...,holding that the claimant committed willful mi...,holding that regardless of the subjective beli...,holding that insured had not substantially com...,holding that insured substantially complied wi...,3.0,To determine the best replacement for the <HOL...,"Input: ""Mindful of these words, it should be n...","Mindful of these words, it should be noted tha...",To determine the best replacement for the <HOL...,0,4,True
38,38,47859,"under the ordinance.” Nevertheless, she claims...",holding that defendant could not assert statut...,holding that state rights are equivalent to fe...,holding that contraceptive providers could ass...,holding that a litigant may not claim standing...,holding that it is well settled that while one...,2.0,To determine the best replacement for the <HOL...,"Input: ""under the ordinance.” Nevertheless, sh...","""under the ordinance.” Nevertheless, she claim...",To determine the best replacement for the <HOL...,3,2,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5268,5268,53089,the position actually did become available dur...,holding that a plaintiff in a reverse discrimi...,holding that plaintiff could not show that he ...,holding that a plaintiff would have to show th...,holding that because there was no evidence bef...,holding that the plaintiff was replaced when a...,2.0,To determine the best replacement for the <HOL...,"Input: ""the position actually did become avail...","""the position actually did become available du...",To determine the best replacement for the <HOL...,3,2,True
5281,5281,53102,a Congressional Act may pass the substantial e...,holding that there is a rational basis for the...,holding that the statute as applied violates t...,holding rational basis as standard for commerc...,recognizing heightened rational basis scrutiny,holding that a law survives rational basis rev...,2.0,To determine the best replacement for the <HOL...,"Input: ""a Congressional Act may pass the subst...","""a Congressional Act may pass the substantial ...",To determine the best replacement for the <HOL...,2,4,True
5294,5294,53115,volume of post-conviction litigation. Moreover...,recognizing the cause of action,recognizing district courts inherent authority...,recognizing a trial courts inherent authority ...,recognizing cause of action,holding that a district court has discretion t...,2.0,To determine the best replacement for the <HOL...,"Input: ""volume of post-conviction litigation. ...","""volume of post-conviction litigation. Moreove...",To determine the best replacement for the <HOL...,1,2,True
5309,5309,53130,court’s subject-matter jurisdiction and is pro...,holding that even where taxpayer did not have ...,holding that in a proceeding to recover taxes ...,holding that a taxpayer cannot bring a suit fo...,holding that the county had no standing to sue...,holding a taxpayer has three years from the da...,2.0,To determine the best replacement for the <HOL...,"Input: ""court’s subject-matter jurisdiction an...",court’s subject-matter jurisdiction and is pro...,To determine the best replacement for the <HOL...,2,3,True
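
The rows above differ between the original and anonymized runs, but a change can go in either direction. The sketch below tallies the direction of each change, using the `label`, `answer_llama_90b`, and `answer_llama_90b_anonymized` values from the nine rows visible in the truncated output above. The helper `transition` is hypothetical, and nine rows are far too few to draw conclusions from.

```python
from collections import Counter

# (label, answer_llama_90b, answer_llama_90b_anonymized) for the nine rows
# visible in the truncated output above.
visible_rows = [
    (3, 3, 4), (2, 0, 3), (2, 1, 0), (3, 0, 4), (2, 3, 2),
    (2, 3, 2), (2, 2, 4), (2, 1, 2), (2, 2, 3),
]

def transition(label, original, anonymized):
    """Describe how correctness changed between the two runs."""
    before = 'correct' if original == label else 'wrong'
    after = 'correct' if anonymized == label else 'wrong'
    return f'{before}->{after}'

counts = Counter(transition(*row) for row in visible_rows)
print(dict(counts))
```

Among these nine rows, three answers flip from correct to wrong, three flip from wrong to correct, and three are wrong in both runs. Running the same tally over the full query result would show whether anonymization mostly hurts or mostly reshuffles answers. Some of these answers may be random fills for unparsed responses.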
