### To do

1. Make jury trial results easier to extract and visualize, then do the same for the other variables
2. Make sure the retrieved context plus the system prompt fits in the model's context window
3. Add a helper function for manual labeling
4. Add a reranking algorithm
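
For item 2, a minimal sketch of a budget check before querying the model. Everything here is an assumption for illustration: the helper names `estimate_tokens` and `fit_chunks` are hypothetical, the 4-characters-per-token ratio is a rough heuristic rather than the model's tokenizer, and the window and reserve sizes are placeholders to replace with the real values.

```python
from typing import List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4 chars/token ratio is a heuristic, not a tokenizer."""
    return int(len(text) / chars_per_token) + 1


def fit_chunks(
    system_prompt: str,
    chunks: List[str],
    context_window: int = 8192,      # placeholder, set to the model's real window
    reserve_for_answer: int = 512,   # placeholder, tokens left for the response
) -> List[str]:
    """Keep chunks in their given (ranked) order until the token budget runs out."""
    budget = context_window - reserve_for_answer - estimate_tokens(system_prompt)
    kept = []
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if cost > budget:
            break  # stop at the first chunk that does not fit
        kept.append(chunk)
        budget -= cost
    return kept
```

Stopping at the first chunk that does not fit keeps the best-ranked chunks together; skipping it and trying smaller chunks further down the list would be an alternative design.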

Reranking
- https://adasci.org/a-hands-on-guide-to-enhance-rag-with-re-ranking/
- https://techcommunity.microsoft.com/t5/microsoft-developer-community/doing-rag-vector-search-is-not-enough/ba-p/4161073
- https://community.openai.com/t/bad-formats-for-semantic-search-of-rag-implementing-internal-chatbot-for-troubleshooting-an-sdk/848715
- https://learn.microsoft.com/en-us/azure/search/index-similarity-and-scoring
- https://cohere.com/blog/rerank-3
- https://www.reddit.com/r/LocalLLaMA/comments/1d9h2pg/doing_rag_vector_search_is_not_enough/
- https://www.datacamp.com/tutorial/boost-llm-accuracy-retrieval-augmented-generation-rag-reranking
- https://python.langchain.com/v0.2/docs/integrations/retrievers/flashrank-reranker/
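
A rough sketch of where a reranking step would sit: score each retrieved chunk against the query, then keep the best few. The scorer below is a plain term-overlap measure used as a stand-in for a real reranking model (a cross-encoder, Cohere Rerank, FlashRank); it is not what those libraries do internally. The names `tokenize`, `overlap_score`, and `rerank` are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, chunk: str) -> float:
    """Fraction of query terms that also appear in the chunk (stand-in scorer)."""
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokenize(chunk)) / len(query_terms)


def rerank(query: str, chunks: List[str], top_n: int = 3) -> List[Tuple[float, str]]:
    """Return the top_n chunks as (score, chunk) pairs, best first."""
    scored = [(overlap_score(query, chunk), chunk) for chunk in chunks]
    # sort() is stable, so ties keep their first-stage retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

Swapping in a real model would only change `overlap_score`; the sort-and-truncate step stays the same.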

RAG
- https://ollama.com/blog/embedding-models
- https://huggingface.co/learn/nlp-course/chapter5/6
- https://docs.mistral.ai/guides/rag/
- https://docs.trychroma.com/guides
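
For reference, the retrieval step these guides describe comes down to comparing a query embedding with chunk embeddings and keeping the closest ones. A minimal sketch in plain Python, assuming the embeddings have already been computed elsewhere; `cosine` and `top_k` are hypothetical helper names, and a vector store such as Chroma would normally do this step.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: List[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (chunk index, similarity) for the k most similar chunks, best first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```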

### Code

In [1]:
import numpy as np
import pandas as pd
import os
from utils.case_directory import CaseDirectory
from utils.case_metadata import CaseMetadata
from extractors.jury_ruling_classifier import JuryRulingClassifier

In [11]:
df = pd.read_csv("labeled_cases.csv")
df[df.trial_type == "jury"].metadata_path.tolist()

['workdata/100_random_sample/New_York_State_Suffolk_County_Supreme_Court/602235---2016/metadata.json',
 'workdata/100_random_sample/Delaware_District_Court/1--21-cv-01238/metadata.json',
 'workdata/100_random_sample/Massachusetts_State_Superior_Court_Essex_County/1777CV00789/metadata.json',
 'workdata/100_random_sample/Connecticut_State_Superior_Court/HHD-CV17-6080452-S/metadata.json',
 'workdata/100_random_sample/Connecticut_State_Superior_Court/UWY-CV22-6068059-S/metadata.json']

In [12]:
df[df.trial_type == "bench"].metadata_path.tolist()

['workdata/100_random_sample/Florida_State_Broward_County_Seventeenth_Circuit_Court/CACE15005896/metadata.json',
 'workdata/100_random_sample/New_York_Southern_District_Court/1--05-cv-06677/metadata.json',
 'workdata/100_random_sample/Texas_Northern_District_Court/2--07-cv-00142/metadata.json',
 'workdata/100_random_sample/Massachusetts_District_Court/1--14-cv-14176/metadata.json',
 'workdata/100_random_sample/California_State_Court_of_Appeals_Second_District/B232339/metadata.json',
 'workdata/100_random_sample/North_Carolina_Western_District_Court/2--12-cr-00007/metadata.json',
 'workdata/100_random_sample/California_State_San_Francisco_County_Superior_Court/CGC-05-439929/metadata.json',
 'workdata/100_random_sample/Washington_State_Pierce_County_Superior_Court/09-2-16353-2/metadata.json',
 'workdata/100_random_sample/Illinois_Northern_District_Court/1--21-cv-05336/metadata.json',
 'workdata/100_random_sample/US_Court_of_Appeals_Ninth_Circuit_BAP/22-1214/metadata.json']
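
The two cells above repeat the same filter once per trial type. A small sketch that collects the paths for every label in one pass, assuming `labeled_cases.csv` has the `trial_type` and `metadata_path` columns used above; `paths_by_trial_type` is a hypothetical helper name.

```python
import pandas as pd


def paths_by_trial_type(df: pd.DataFrame) -> dict:
    """Map each trial_type label to the list of metadata paths carrying that label."""
    # Rows with a missing trial_type are dropped by groupby's default behavior.
    return df.groupby("trial_type")["metadata_path"].apply(list).to_dict()
```

With the labeled data loaded, `paths_by_trial_type(df)["jury"]` should give the same list as the `df.trial_type == "jury"` filter.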

In [2]:
path = '100_random_sample/New_York_State_Suffolk_County_Supreme_Court/602235---2016/metadata.json'
classifier = JuryRulingClassifier(path)

In [3]:
classifier.extract()

Extracting from metadata...
- Getting relevant chunks...
- Querying llm...
- Response: {'reasoning': 'According to the documents, the plaintiff has submitted proposed verdict sheets and jury instructions. This indicates that a decision was made by the plaintiff about how they would like the case to be decided, which suggests that the jury verdict was in favor of the plaintiff.', 'category': 'plaintiff'}


{'reasoning': 'According to the documents, the plaintiff has submitted proposed verdict sheets and jury instructions. This indicates that a decision was made by the plaintiff about how they would like the case to be decided, which suggests that the jury verdict was in favor of the plaintiff.',
 'category': 'plaintiff'}
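
Since the model returns free-form JSON, it may be worth validating the response before storing it. A hedged sketch: the allowed label set below is an assumption (only `'plaintiff'` appears in the output above), and `parse_ruling` is a hypothetical helper, not part of `JuryRulingClassifier`.

```python
import json
from typing import Union

# Assumed label set; only 'plaintiff' is confirmed by the output above.
ALLOWED_CATEGORIES = {"plaintiff", "defendant", "unknown"}


def parse_ruling(response: Union[str, dict]) -> dict:
    """Parse a response (raw JSON string or dict) and check its keys and category."""
    data = json.loads(response) if isinstance(response, str) else dict(response)
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")
    missing = {"reasoning", "category"} - data.keys()
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    # Normalize case and whitespace before comparing against the label set.
    category = str(data["category"]).strip().lower()
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {data['category']!r}")
    return {"reasoning": data["reasoning"], "category": category}
```

The same function accepts the raw JSON string stored under `'response'` in `classifier.log`, so the log can be re-validated later without re-querying the model.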

In [4]:
classifier.log

{'metadata_response': {'model': 'mistral',
  'created_at': '2024-07-22T20:36:59.275455Z',
  'response': '{"reasoning": "According to the documents, the plaintiff has submitted proposed verdict sheets and jury instructions. This indicates that a decision was made by the plaintiff about how they would like the case to be decided, which suggests that the jury verdict was in favor of the plaintiff.", "category": "plaintiff"}',
  'done': True,
  'done_reason': 'stop',
  'context': [3, 1027, 781, 6158, 1763, 1228, 1164, 8351, 6416, 23043, ...

In [9]:
classifier.metadata.get_docket_report()


| Unnamed: 0 | date | contents | link | link_viewer | number | document_path |
|---|---|---|---|---|---|---|
| 0 | 2024-12-05 | Jury Selection / Trial - Proceeding | | | | |
| 1 | 2024-11-19 | Trial Management Conference - Proceeding | | | | |
| 2 | 2024-02-05 | NOTICE OF COMPLIANCE Supplemental Compliance a... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 3 | 2024-01-08 | OFFER OF COMPROMISE As to Kayson Medina PPA Gi... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 4 | 2023-07-19 | OFFER OF COMPROMISE As to Plaintiff Girelys An... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 5 | 2023-07-18 | NOTICE OF COMPLIANCE Def. Maya Kaiser NOC | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 6 | 2023-06-26 | ORDER RESULT: Sustained 6/26/2023 HON ROBERT D... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 7 | 2023-06-08 | OBJECTION TO MOTION as to Defendants Motion # ... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 8 | 2023-06-08 | LIST OF DOCUMENTS IN LIEU OF THE LIVE TESTIMON... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
| 9 | 2023-06-08 | NOTICE OF COMPLIANCE Supplemental as to Plaint... | https://www.docketalarm.com/cases/Connecticut_... | https://www.docketalarm.com/cases/Connecticut_... | | |
