Analyzing Legal Texts from the Bulgarian Constitutional Court

This project applies natural language processing and deep learning methods to text and sentence classification tasks on legal texts from the Bulgarian Constitutional Court.

Requirements

The Bulgarian Constitutional Court (BCC) project is managed in a virtual environment using pipenv. All packages and their dependencies are listed in Pipfile and Pipfile.lock. To create a pipenv environment and install all the packages needed to run the code in the repository, run the following in a terminal:

# install pipenv
pip install pipenv

# navigate to the repository directory
cd ~/path/to/bulgarian-constitutional-court-decisions

# install virtual environment and dependencies
pipenv install

All models currently in development are contained in the models folder. Text data and annotated documents can be found in the models/data folder, along with a guide on converting documents from PDF to text and a Jupyter notebook tutorial on how to do this in Python.
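
For readers who want a feel for the PDF-to-text step before opening the guide, here is a minimal sketch. It assumes the pypdf package for extraction, which may not be the tool the repository's guide or notebook uses, and the function names pdf_to_text and clean_extracted_text are hypothetical. The cleanup rules are generic and have not been tuned on BCC decisions.

```python
import re


def pdf_to_text(path):
    """Extract raw text from every page of a PDF (assumes pypdf is installed)."""
    # Imported lazily so the cleanup helper below works without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() can return an empty string for pages with no text layer.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def clean_extracted_text(raw):
    """Undo common PDF extraction artifacts in a block of text."""
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Turn single line breaks inside a paragraph into spaces,
    # keeping blank lines as paragraph separators.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Squeeze runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()
```

A typical call would be clean_extracted_text(pdf_to_text("decision.pdf")), where decision.pdf is a placeholder file name. Note that the hyphen rule also joins genuine hyphenated compounds that happen to break at the end of a line, so the output is worth spot-checking.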

Current Results

The baseline models so far achieve the following accuracy:

Baseline Model	Test Accuracy
Logistic Regression	80%
Naive Bayes	84%
Support Vector Machines (SVM)	81%
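
As a rough illustration of how baselines like those in the table above could be trained and scored, here is a minimal sketch using scikit-learn. It assumes TF-IDF features and default hyperparameters, which may differ from what the code in the models folder does. The toy sentences, the labels, and the evaluate_baselines helper are all hypothetical, so the numbers it prints are unrelated to the results reported here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: invented sentences and labels, not the BCC corpus.
texts = [
    "the court finds the provision unconstitutional",
    "the request is dismissed as inadmissible",
    "the provision violates the constitution",
    "the court rejects the request",
    "the law contradicts the constitution",
    "the petition is dismissed",
    "the act is declared unconstitutional",
    "the court dismisses the petition as inadmissible",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]


def evaluate_baselines(texts, labels, test_size=0.25, seed=0):
    """Fit each baseline on a training split and return held-out accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    classifiers = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Naive Bayes": MultinomialNB(),
        "Support Vector Machines (SVM)": LinearSVC(),
    }
    scores = {}
    for name, clf in classifiers.items():
        # Each pipeline fits its own vectorizer on the training split only,
        # so no vocabulary leaks in from the held-out documents.
        model = make_pipeline(TfidfVectorizer(), clf)
        model.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(x_test))
    return scores


if __name__ == "__main__":
    for name, acc in evaluate_baselines(texts, labels).items():
        print(f"{name}: {acc:.0%}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the feature extraction inside the training split, which matters once the data set is large enough for the split to be meaningful.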

The deep learning models so far achieve the following accuracy:

Deep Learning Model	Test Accuracy	Validation Accuracy
Convolutional Neural Network (CNN)	89%	80%
Long Short-Term Memory Neural Network (LSTM)	89%	80%

Project Plans

Status

This project is still in progress. Current models are in the early stages of development.

TODOs

Current TODOs for future development:

Tune baseline model hyperparameters to improve performance
Improve deep learning models
Visualize model performance
Further model testing
Add more annotated data to improve training process
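
The first item on this list, tuning baseline hyperparameters, could be approached with a cross-validated grid search. The sketch below is one possible way to do it with scikit-learn, not the project's planned method. The parameter grid, the toy data, and the tune_logistic_regression helper are hypothetical choices made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy data: invented sentences and labels, not the BCC corpus.
texts = [
    "the court finds the provision unconstitutional",
    "the request is dismissed as inadmissible",
    "the provision violates the constitution",
    "the court rejects the request",
    "the law contradicts the constitution",
    "the petition is dismissed",
    "the act is declared unconstitutional",
    "the court dismisses the petition as inadmissible",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]


def tune_logistic_regression(texts, labels, folds=2):
    """Grid-search a TF-IDF + logistic regression pipeline by accuracy."""
    pipeline = Pipeline(
        [
            ("tfidf", TfidfVectorizer()),
            ("clf", LogisticRegression(max_iter=1000)),
        ]
    )
    # Illustrative grid: unigrams vs. unigrams plus bigrams, and three
    # regularization strengths (smaller C means stronger regularization).
    param_grid = {
        "tfidf__ngram_range": [(1, 1), (1, 2)],
        "clf__C": [0.1, 1.0, 10.0],
    }
    search = GridSearchCV(pipeline, param_grid, cv=folds, scoring="accuracy")
    search.fit(texts, labels)
    return search.best_params_, search.best_score_


if __name__ == "__main__":
    best_params, best_score = tune_logistic_regression(texts, labels)
    print(best_params, f"{best_score:.0%}")
```

With a real corpus, a larger number of folds (for example 5) would give a more stable estimate than the 2 used here to fit the toy data.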

Resources

If you are interested in using NLP or deep learning methods for analyzing legal texts, the following resources may be useful.

Legal Corpora

Research

McCarty (2007) - Deep Semantic Interpretations of Legal Texts

Other Resources

License

The data for this project is licensed under the Creative Commons Attribution 3.0 Unported license, and the code used to train the models is licensed under the MIT license.

Contact

If you have any questions or comments, feel free to contact me by email, on Twitter, or in the repository discussions.
