Maximbrg/legalAI


Criminal Sentence Classification

This project classifies key aspects of criminal cases within the Israeli legal framework. It uses few-shot learning to classify the sentences of a verdict that are relevant to sentencing decisions.

Key Features

  • Code: Implements few-shot learning approaches for sentence classification.
  • Data: Utilizes two datasets—one developed in collaboration with criminal law experts from the Israeli Ministry of Justice, focusing on key aspects of criminal cases, and another generated by ChatGPT and refined for this specific task.
  • Results: Presents performance evaluations of the classification methods on weapon-related and drug-related cases, along with sample outputs from ChatGPT's automated sentence tagging.

Execution

The project provides two methodologies located in the code directory. Each methodology includes its own README file with detailed execution instructions.

Data

The data directory includes the following subdirectories:

  1. tagged_data_manually

    This folder contains two subdirectories: drugs and weapons. Each of these directories includes:

    • train.csv / test.csv / eval.csv – Predefined data splits of verdict sentences for training, testing, and evaluation.
    • agreement.csv – Contains statistics on inter-annotator agreement for the manual tagging process.
  2. tagged_data_auto

    This folder contains two subdirectories: drugs and weapons. Each of these directories includes:

    • results_batch_LABEL_fewshot.csv – A dataset of 10,000 sentences automatically tagged by ChatGPT. LABEL in the file name stands for the label the file covers.
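The layout above can be addressed programmatically. The following is a minimal sketch, not code from this repository: the helper names (`manual_split_path`, `auto_tagged_path`, `read_rows`) are hypothetical, and it assumes the CSV files are UTF-8 encoded and start with a header row. The column names are not documented here, so the sketch does not depend on them.

```python
import csv
from pathlib import Path

# Directory and file names taken from the layout described above.
DOMAINS = ("drugs", "weapons")
SPLITS = ("train", "test", "eval")


def manual_split_path(data_root, domain, split):
    """Return the path of one manually tagged split (hypothetical helper)."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return Path(data_root) / "tagged_data_manually" / domain / f"{split}.csv"


def auto_tagged_path(data_root, domain, label):
    """Return the path of the ChatGPT-tagged file for one label (hypothetical helper)."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    return (
        Path(data_root)
        / "tagged_data_auto"
        / domain
        / f"results_batch_{label}_fewshot.csv"
    )


def read_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by its header row.

    Assumption: the file is UTF-8 (utf-8-sig also tolerates a leading BOM)
    and its first row holds the column names.
    """
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))
```

For example, `read_rows(manual_split_path("data", "drugs", "eval"))` would load the drug-related evaluation split, assuming the script runs from the repository root.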

Results

The result files are located in the results directory:

  • drugs_setfit&gpt_experiments.csv
    Contains evaluation results on eval.csv located at
    data/tagged_data_manually/drugs/eval.csv.

  • verification_drugs_auto.csv
    Contains results for a sample of sentences from
    data/tagged_data_auto/drugs/results_batch_LABEL_fewshot.csv.

  • verification_weapon_auto.csv
    Contains results for a sample of sentences from
    data/tagged_data_auto/weapons/results_batch_LABEL_fewshot.csv.

  • wep_setfit&gpt_experiments.csv
    Contains evaluation results on eval.csv located at
    data/tagged_data_manually/weapons/eval.csv.
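Because the result file names do not follow a single pattern (for example, `wep_` versus `drugs_`, and `weapon` versus `weapons`), a small lookup table can be more reliable than building the names from a template. This is an illustrative sketch only: the `RESULT_FILES` table, the `result_path` helper, and the "experiments" / "verification" keys are hypothetical names introduced here, while the file names are the ones listed above.

```python
from pathlib import Path

# File names exactly as listed above. The (domain, kind) keys are an
# assumption introduced for illustration, not part of the repository.
RESULT_FILES = {
    ("drugs", "experiments"): "drugs_setfit&gpt_experiments.csv",
    ("weapons", "experiments"): "wep_setfit&gpt_experiments.csv",
    ("drugs", "verification"): "verification_drugs_auto.csv",
    ("weapons", "verification"): "verification_weapon_auto.csv",
}


def result_path(results_root, domain, kind):
    """Return the path of a result file for a domain and result kind."""
    try:
        file_name = RESULT_FILES[(domain, kind)]
    except KeyError:
        raise ValueError(f"no result file for {domain!r}, {kind!r}") from None
    return Path(results_root) / file_name
```

Note that two of the file names contain `&`, so they need quoting when passed to a shell command.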

Meta Information

  • Language: Hebrew
  • License Type: OpenRAIL
  • Resource Type: Corpora, Models and Tools
  • Model Sub Category: Pre-trained language models
  • Corpora Sub Category: Annotated Dataset
  • Task: Text Classification
