# LLaMA-2 Base
This notebook loads LLaMA-2 Chat and prompts it with each case from the Oyez dataset, predicting which party will win or what the court will rule on the legal question at hand.

COMPUTE REQUIREMENTS: Google Colab A100 High-RAM (40GB) GPU

## Results From This Notebook:
### Predicting Winning Party
**When court justices were included in prompt:**

Whole dataset accuracy: 0.5209246711837385

Accuracy for cases after knowledge cutoff: 0.555

**When court justices were not included in prompt:**

Whole dataset accuracy: 0.5105619768832204

Accuracy for cases after knowledge cutoff: 0.444

### Predicting Answer to Legal Question

Accuracy for cases after knowledge cutoff: 0.2


In [None]:
!pip install transformers



In [None]:
from google.colab import drive

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tqdm import tqdm

import torch
from torch import nn
from torch.optim import Adam

from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

In [None]:
drive.mount('/content/drive/')

#change this to the directory you have the files stored in
%cd /content/drive/My Drive/CPSC-477-Project/

df = pd.read_csv('2024-05-07-oyez-scrape.csv')#scrape that includes the party names
dfq = pd.read_csv('KBJFQC.csv') #scrape which includes justices
df.head()

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/.shortcut-targets-by-id/1ygGGGOVkhqy-8CG8UUS16DYOOWKzLRFx/CPSC-477-Project


Unnamed: 0,Case Key,Case Name,First Party Label,First Party,Second Party Label,Second Party,Winning Party,Justices,Facts,Question,Conclusion
0,1971/70-18,Roe v. Wade,Appellant,Jane Roe,Appellee,Henry Wade,Jane Roe,"William O. Douglas, Potter Stewart, Thurgood M...","In 1970, Jane Roe (a fictional name used in co...",Does the Constitution recognize a woman's righ...,Inherent in the Due Process Clause of the Four...
1,1971/70-5014,Stanley v. Illinois,Petitioner,"Peter Stanley, Sr.",Respondent,Illinois,Stanley,"William O. Douglas, Potter Stewart, Thurgood M...",Joan Stanley had three children with Peter Sta...,Does the Illinois statutory scheme that assume...,"Yes. Justice Byron R. White, writing for a 5-..."
2,1971/70-29,Giglio v. United States,Petitioner,John Giglio,Respondent,United States,Giglio,"William O. Douglas, Potter Stewart, Thurgood M...",John Giglio was convicted of passing forged mo...,Is the prosecution’s failure to disclose a pro...,"Yes. Chief Justice Warren E. Burger, writing ..."
3,1971/70-4,Reed v. Reed,Appellant,Sally Reed,Appellee,Cecil Reed,Sally Reed,"William O. Douglas, Potter Stewart, Thurgood M...","The Idaho Probate Code specified that ""males m...",Did the Idaho Probate Code violate the Equal P...,"In a unanimous decision, the Court held that t..."
4,1971/70-73,Miller v. California,Appellant,Marvin Miller,Appellee,California,Marvin Miller,"Warren E. Burger, William O. Douglas, William ...","Miller, after conducting a mass mailing campai...",Is the sale and distribution of obscene materi...,"In a 5-to-4 decision, the Court held that obsc..."


## Load Model
The model is the 7-billion-parameter version of LLaMA 2, tuned for chat applications. NousResearch adapted it for Hugging Face.
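Because this is the chat-tuned variant, it expects each user turn wrapped in `[INST] ... [/INST]` tags, optionally with a system prompt inside `<<SYS>>` tags. The sketch below shows one way to build such a prompt. `build_chat_prompt` and its arguments are hypothetical names that do not appear in this notebook, and the BOS token is assumed to be added by the tokenizer rather than by this helper.

```python
def build_chat_prompt(user_message, system_prompt=None):
    """Wrap a single user turn in the Llama-2 chat instruction tags.

    Hypothetical helper for illustration; the BOS token (<s>) is left
    for the tokenizer to add.
    """
    user_message = user_message.strip()
    if system_prompt:
        # An optional system prompt goes inside <<SYS>> tags within the first turn.
        user_message = f"<<SYS>>\n{system_prompt.strip()}\n<</SYS>>\n\n{user_message}"
    return f"[INST] {user_message} [/INST]"


print(build_chat_prompt("Which party would the Court rule in favor of?"))
```

Keeping the tags in one helper makes it harder for the opening and closing markers to drift apart between prompts.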

In [None]:
model_name = "NousResearch/Llama-2-7b-chat-hf" #using chat so we can ask it questions
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
)
model.eval()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model.to(device)

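generation_config = GenerationConfig(
    max_new_tokens=30,  # short generation since we only need one-sentence responses
    decoder_start_token_id=1,
    eos_token_id=model.config.eos_token_id,
    # Pad with EOS; generate() otherwise falls back to this with a warning
    pad_token_id=model.config.eos_token_id,
)

def get_model_response(prompt):
  """Generate a reply to `prompt` and return only the newly generated text."""
  torch.cuda.empty_cache()
  tokenized_input = tokenizer(prompt, add_special_tokens=True, return_tensors="pt").to(device)
  # Count prompt tokens (not characters) so the slice below drops exactly the prompt
  input_len = tokenized_input["input_ids"].shape[1]
  outputs = model.generate(**tokenized_input, generation_config=generation_config)[0]
  # Decode only the tokens produced after the prompt, without <s> or </s> markers
  model_response = tokenizer.decode(outputs[input_len:], skip_special_tokens=True)
  return model_response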


In [None]:
#Test generation
input = "What is the function of the Supreme Court of the United States?"
print(get_model_response(input))


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


tes?

The Supreme Court of the United States is the highest court in the federal judiciary of
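The leading `tes?` in the recorded output above is the tail of the prompt rather than generated text. A plausible cause, assuming the decoded sequence begins with the BOS marker `<s> `, is that a slice computed from the prompt's character count lands four characters early. The strings below are illustrative stand-ins, not model output.

```python
prompt = "What is the function of the Supreme Court of the United States?"
generated = "\n\nThe Supreme Court of the United States is the highest court"

# Assumption: decoding the full sequence yields "<s> " + prompt + generated text.
decoded = "<s> " + prompt + generated

# Slicing by the prompt's character count ignores the 4-character "<s> " prefix,
# so the last 4 characters of the prompt leak into the response.
by_chars = decoded[len(prompt):]
print(repr(by_chars[:4]))

# Accounting for the prefix removes the leak. With a real tokenizer, slicing the
# token ids before decoding achieves the same thing more robustly.
prefix_len = len("<s> ")
clean = decoded[prefix_len + len(prompt):]
print(clean == generated)
```

Since the accuracy checks in this notebook only test whether a name or word appears in the reply, a few leaked prompt characters are unlikely to change the scores, but they do clutter printed responses.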


In [None]:
def get_model_prediction(case_info):
  input = f"""
  [INST]
  The United States Supreme Court is hearing a legal case centered around the legal question of {case_info["question"]}.
  Given these case facts: {case_info["facts"]}
  Where the parties in question are {case_info["party_1"]} and {case_info["party_2"]}

  Responding with only one party, which party would the United States Supreme Court rule in favor of?
  [/INST]

    """

  return get_model_response(input)



def get_case_info(case_num):
  # Read every field from a single row instead of rebuilding a list per column
  row = df.iloc[case_num]

  return {"facts": row["Facts"], "question": row["Question"],
          "party_1": row["First Party"], "party_2": row["Second Party"],
          "winning_party": row["Winning Party"], "key": row["Case Key"],
          "justices": row["Justices"]
          }

## Winning Party Prediction
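The following code predicts the winning party for each case and scores a prediction as correct when the winning party's name appears in the model's reply. As run below, the loop covers only the first 100 cases; change `range(100)` to `range(len(df))` to evaluate the whole dataset.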

In [None]:
import logging
from tqdm import tqdm

# Set logging level to suppress warnings
logging.getLogger("transformers").setLevel(logging.ERROR)

num_correct = 0
case_count = 0
for case_num in tqdm(range(100)):
  case_info = get_case_info(case_num)
  case_count += 1
  prediction = get_model_prediction(case_info)
  correct = case_info["winning_party"].lower() in prediction.lower()
  if correct:
    num_correct += 1
  if case_count % 500 == 0:
    print(num_correct/case_count)

print(" ")
accuracy = num_correct/case_count
print(f"Accuracy: {accuracy}")

100%|██████████| 100/100 [02:19<00:00,  1.40s/it]

 
Accuracy: 0.56
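The loop above counts a prediction as correct whenever the winning party's name is a substring of the reply, so a reply that names both parties also scores as correct. The sketch below contrasts that check with a stricter variant. `strict_match` is a hypothetical alternative, not something this notebook uses, and the example reply is made up.

```python
def lenient_match(prediction, winning_party):
    """Substring check equivalent to the one used in the loop above."""
    return winning_party.lower() in prediction.lower()


def strict_match(prediction, winning_party, losing_party):
    """Hypothetical stricter variant: reject replies that also name the loser."""
    text = prediction.lower()
    return winning_party.lower() in text and losing_party.lower() not in text


hedged_reply = "It could be either Stanley or Illinois."  # made-up reply
print(lenient_match(hedged_reply, "Stanley"))
print(strict_match(hedged_reply, "Stanley", "Illinois"))
```

The stricter check has its own failure mode: it would reject a correct reply that mentions the losing party while explaining the outcome, so neither variant is a substitute for inspecting a sample of replies by hand.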





## Knowledge Cutoff Prediction
The next cell predicts the winning party for every case decided after 2017, our testing set.


In [None]:
num_correct = 0
num_cases = len(df)
case_count = 0
for case_num in range(num_cases):
  case_info = get_case_info(case_num)
  if int(case_info["key"][:4]) > 2017:
    case_count += 1
    prediction = get_model_prediction(case_info)
    correct = case_info["winning_party"].lower() in prediction.lower()
    if correct:
      num_correct += 1

print(num_correct/case_count)

0.4444444444444444
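The cutoff filter relies on each case key starting with a four-digit year, as in `1971/70-18`. A minimal sketch of that filter on plain strings follows. `term_year` and `after_cutoff` are hypothetical helper names, and the last two keys in the example are made up for illustration.

```python
def term_year(case_key):
    """Return the leading four-digit year of a key such as '1971/70-18'."""
    return int(case_key[:4])


def after_cutoff(case_keys, cutoff_year=2017):
    """Keep keys whose year is strictly greater than cutoff_year."""
    return [key for key in case_keys if term_year(key) > cutoff_year]


keys = ["1971/70-18", "2017/16-111", "2018/17-204"]  # last two keys are made up
print(after_cutoff(keys))
```

Note that the comparison is strict, so cases keyed to 2017 itself are excluded from the test set.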


## Legal Question
The following cells predict the answer to the legal question before the court, using only cases past the knowledge cutoff. LLaMA always predicts yes here, so there is no need to run over the entire dataset.

In [None]:
def get_case_info_legal_question(case_num):
  facts = list(dfq["Facts"])[case_num]
  question = list(dfq["Question"])[case_num]
  answer = list(dfq["Binary"])[case_num] == 1
  justices = list(dfq["Justices"])[case_num]
  key = list(dfq["Case Key"])[case_num]
  return {"facts" : facts, "question": question,
           "answer": answer, "justices": justices,
           "key": key}

def get_model_prediction_legal_question(case_info):
  input = f"""
  [INST]
  The United States Supreme Court is hearing a legal case.
  Given these case facts: {case_info["facts"]}
  Considering the legal question: {case_info["question"]}

  Respond with a single word, yes or no only, representing your
  prediction for the overall ruling of the court for that legal question.

  [/INST]
    """
  model_reply = get_model_response(input).lower()
  return "yes" in model_reply

In [None]:
num_correct = 0
case_count = 0
for case_num in range(10):
  case_info = get_case_info_legal_question(case_num)
  if int(case_info["key"][:4]) > 2022:
    case_count += 1
    prediction = get_model_prediction_legal_question(case_info)
    if case_info["answer"] == prediction:
      num_correct += 1
print(num_correct/case_count)

0.2
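The check `"yes" in model_reply` treats every reply without the substring "yes" as a "no" prediction, including replies that answer neither way. The sketch below shows one possible stricter parser that separates the three cases. `parse_yes_no` is a hypothetical helper, not part of this notebook, and the example replies are made up.

```python
import re


def parse_yes_no(reply):
    """Return True for 'yes', False for 'no', None if neither word appears.

    Hypothetical helper: uses the first whole-word match, so 'cannot'
    or a later 'no' in the explanation does not decide the answer.
    """
    match = re.search(r"\b(yes|no)\b", reply.lower())
    if match is None:
        return None
    return match.group(1) == "yes"


print(parse_yes_no("No. The statute does not apply."))
print(parse_yes_no("Yes, the Court would likely agree."))
print(parse_yes_no("I cannot determine the outcome."))
```

Returning `None` for unparseable replies lets them be counted separately instead of being silently scored as "no".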


## Sources:

https://huggingface.co/docs/transformers/en/main_classes/text_generation

https://huggingface.co/NousResearch/Llama-2-7b-chat-hf/discussions/5

https://huggingface.co/docs/transformers/main/en/model_doc/llama2

https://mlabonne.github.io/blog/posts/Fine_Tune_Your_Own_Llama_2_Model_in_a_Colab_Notebook.html

