# a. Bibliography and SOA

## Main Objective of Legal Judgment Prediction (LJP)

The **main objective** of Legal Judgment Prediction (LJP) is to **predict the outcome (verdict) of a legal decision given its facts**. This typically involves transforming complex, multi-label judicial decisions into a more manageable classification task. In this dataset, the task is a **simplified binary classification: predicting either "approval" or "dismissal"** of a plaintiff's request, based solely on the facts of the case. Decisions from the Federal Supreme Court of Switzerland (FSCS) are particularly challenging because they often focus on small parts of previous decisions and discuss possibly flawed reasoning by lower courts. The models use only the "facts" section of the court decision as input, which is a more realistic and more challenging scenario: the prediction is made before the extensive "considerations" (legal reasoning) are fully drafted.

## Potential Business Case for LJP

The development of LJP models and related supportive tasks offers several significant **potential business cases** and benefits for the legal sector and beyond:
- **Assisting Legal Professionals**: Predictive AI models can **assist lawyers** in preparing arguments by identifying strengths and weaknesses. They can also help **judges and clerks** review or prioritize cases, thereby **enhancing and speeding up judicial processes and improving their quality**. This is particularly relevant in jurisdictions facing **excessive workloads and high delays**.
- **Legal Research and Scholarship**: LJP models can aid **legal scholars** in studying case law.
- **Ethical Oversight and Democratization of Law**: These models can help **sociologists and research ethicists** expose irresponsible use of AI in the justice system. Fundamentally, this research aims to **improve legal services and democratize law**, while also highlighting multi-aspect shortcomings to ensure responsible and ethical deployment of technology. The goal is not to produce a "robot lawyer" but to build **assisting technology for human experts**.
- **Resource Optimization**: Drafting the "considerations" (legal reasoning) takes up most of a judge's or clerk's time (85%), compared to 10% for the "facts". Predicting outcomes from the facts early in the process could therefore let legal professionals allocate their time more efficiently and focus their effort where it is most needed.

## Current State of the Art for the Swiss-Judgment-Prediction Dataset's Task

The Swiss-Judgment-Prediction (SJP) dataset is presented as a **new multilingual LJP benchmark** designed to accelerate future research and ensure reproducibility. It comprises **85,000 cases** from the Federal Supreme Court of Switzerland (FSCS), covering the years 2000 to 2020, in **German (50K), French (31K), and Italian (4K)**. It also includes **metadata such as publication years, legal areas, and cantons of origin** for robustness and fairness studies.
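
To make the language imbalance concrete, the short sketch below computes each language's share of the corpus. It uses only the rounded counts quoted above, so the percentages are approximate; the names `CASES_PER_LANGUAGE` and `language_shares` are illustrative and not part of the dataset's tooling.

```python
# Approximate case counts per language, as quoted above (rounded to thousands).
CASES_PER_LANGUAGE = {"German": 50_000, "French": 31_000, "Italian": 4_000}


def language_shares(counts):
    """Return each language's share of the corpus as a fraction of the total."""
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


shares = language_shares(CASES_PER_LANGUAGE)
for lang, share in shares.items():
    # Prints one line per language, e.g. the Italian share is under 5%.
    print(f"{lang}: {share:.1%}")
```

With these rounded counts, Italian accounts for under 5% of the cases, which is why it is treated as the low-resource language of the benchmark.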

The "state of the art" for this specific dataset and task has evolved through several stages:

- **Initial Benchmarks and Methodologies (2021 Study)**:
  - The task is fundamentally a **challenging text classification task**.
  - Initial experiments used **state-of-the-art BERT-based methods**.
  - Given that the average length of case facts (up to 850 tokens in French) often exceeds BERT's 512-token limit, **special methods were required to handle long textual input**. These included:
    - **Long BERT**: An extension of standard BERT models with additional positional embeddings to process up to 2048 tokens.
    - **Hierarchical BERT**: Uses a shared standard BERT encoder to process segments (e.g., 4x512 tokens) independently, with an additional BiLSTM encoder to aggregate the segment encodings for classification.
  - **Hierarchical BERT showed the best initial performance**, achieving approximately **68-70% Macro-F1-Score in German and French**. Macro-F1 is the preferred metric due to the dataset's high class imbalance (over 75% dismissal cases).
  - **Native (monolingually pre-trained) BERT models generally outperformed multilingual BERT** counterparts.
  - **Performance was found to deteriorate as input text length increased** and varied significantly across legal areas (e.g., better in penal law) and cantons of origin. No significant performance fluctuation was observed across years.
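
The segmentation step behind Hierarchical BERT can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it works on already-tokenized ids, omits special tokens such as `[CLS]` and `[SEP]`, and does not show the shared BERT encoder or the BiLSTM aggregation. The function name `split_into_segments` and the padding id `PAD_ID = 0` are hypothetical.

```python
from typing import List

PAD_ID = 0  # hypothetical padding token id, chosen for illustration only


def split_into_segments(
    token_ids: List[int],
    segment_len: int = 512,
    max_segments: int = 4,
    pad_id: int = PAD_ID,
) -> List[List[int]]:
    """Split a token sequence into fixed-length segments.

    The input is truncated to max_segments * segment_len tokens, cut into
    consecutive segments of segment_len tokens, and the last segment is
    padded so that every segment has the same length.
    """
    budget = segment_len * max_segments
    truncated = token_ids[:budget]
    segments = [
        truncated[i : i + segment_len]
        for i in range(0, len(truncated), segment_len)
    ]
    if not segments:
        segments = [[]]  # keep one (fully padded) segment for empty input
    missing = segment_len - len(segments[-1])
    segments[-1] = segments[-1] + [pad_id] * missing
    return segments


# An 850-token document (roughly the French length quoted above) needs two segments.
facts = list(range(1, 851))
segments = split_into_segments(facts)
print(len(segments), [len(s) for s in segments])
```

In a hierarchical model, each segment would then be encoded independently by the shared encoder, and the resulting segment vectors would be aggregated for classification.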

- **Advanced Transfer Learning and Augmentation (2022 Study - Current State of the Art)**:
  The 2022 study advanced the state of the art on the SJP dataset by exploring several transfer learning techniques:
  - **Cross-Lingual Transfer (CLT)**:
    - **Multilingually pre-trained models (e.g., XLM-R) fine-tuned across all languages performed better** than monolingual counterparts, especially with **adapter-based fine-tuning**. Adapters improve overall performance and lead to higher performance parity across languages.
    - **Neural Machine Translation (NMT)-based Data Augmentation** was a key enhancement. Translating each document into the other two languages created a **3x larger training corpus (approx. 180K documents)**. This augmentation led to a significant improvement, particularly for the **low-resource Italian subset**, making Italian the best-performing language after augmentation in some cases.
  - **Cross-Domain and Cross-Regional Transfer**:
    - Models trained **across all legal areas (domains) and origin regions performed better overall and more robustly**, with improved results in worst-case scenarios. This indicates that models benefit from exposure to a wider variety of cases and a larger data volume, even if legal systems or case characteristics vary.
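
Macro-F1 is named above as the preferred metric because of the class imbalance. A minimal sketch of why: on a toy label set with 75% dismissals (an illustrative split, not the dataset's exact distribution), a classifier that always predicts "dismissal" reaches 75% accuracy but a much lower Macro-F1, because the "approval" class contributes an F1 of zero. The helper names below are hypothetical and written from the standard definitions of precision, recall, and F1.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score of a single class, treating it as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: 75 dismissals and 25 approvals (assumed split for illustration).
y_true = ["dismissal"] * 75 + ["approval"] * 25
# Majority-class baseline: always predict "dismissal".
y_pred = ["dismissal"] * 100

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")  # accuracy=0.750 macro_f1=0.429
```

The gap between the two numbers shows how accuracy rewards ignoring the minority class, while Macro-F1 weighs both classes equally.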

Sources: 

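[1] Joel Niklaus, Ilias Chalkidis, and Matthias Stürmer. 2021. Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark. In Proceedings of the Natural Legal Language Processing Workshop 2021. Association for Computational Linguistics. [10.18653/v1/2021.nllp-1.3](https://aclanthology.org/2021.nllp-1.3/)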

[2] Joel Niklaus, Matthias Stürmer, and Ilias Chalkidis. 2022. An Empirical Study on Cross-X Transfer for Legal Judgment Prediction. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 32–46, Online only. Association for Computational Linguistics. [10.18653/v1/2022.aacl-main.3](https://aclanthology.org/2022.aacl-main.3/)