nllegalcit is a Python library to find citations to Dutch legal and parliamentary documents in natural language text.
This library is partially based on the linkextractor developed by KOOP to implement the Linked Data Overheid (LiDO). The aim of nllegalcit is to provide a more generally accessible way to recognize Dutch legal citations in natural language text.
Please note that this library is currently under development. It will probably not work reliably.
nllegalcit is currently built and tested for Python 3.10, using either the default CPython or the (much faster) PyPy runtime.
nllegalcit can be installed using pip:

.. code-block:: console

   $ pip install nllegalcit

To find all supported citations in some string, we use the following general function:
.. autofunction:: nllegalcit.parse_citations
This gives us a list of all recognized citations in the input:

>>> from nllegalcit import parse_citations
>>> parse_citations("Kamerstukken I 1979/80, 15 516, nr. 42e, blz. 7")
[Kamerstukken I 1979-1980, 15516, nr. 42e p. 7]

Other specific parse functions also exist, for example to find citations in a PDF file, or to only find citations to Kamerstukken. For more information, please refer to :ref:`the API reference <nllegalcit_package>`.
nllegalcit aims to provide a simple interface to find all supported citations in a text. At its core, nllegalcit consists of functions to parse a type of text (plaintext, PDF files, online PDF files, etc.), and specific Citation classes to describe the recognized citations.
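The split between parse functions and Citation classes can be pictured with a small, self-contained sketch. Everything in it is hypothetical: ``KamerstukCitationSketch``, ``find_kamerstukken_sketch`` and the regular expression are illustrative names that do not reflect nllegalcit's actual classes or grammar, and the pattern only covers the single notation used in the example above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KamerstukCitationSketch:
    """Hypothetical citation record; not nllegalcit's real Citation class."""

    chamber: str          # "I" (Eerste Kamer) or "II" (Tweede Kamer)
    session_year: str     # normalised session year, e.g. "1979-1980"
    dossier: str          # dossier number without spaces, e.g. "15516"
    document: str         # document number, e.g. "42e"
    page: Optional[str]   # page number, or None if no page was cited


# Assumed, simplified pattern for citations shaped like
# "Kamerstukken I 1979/80, 15 516, nr. 42e, blz. 7".
_PATTERN = re.compile(
    r"Kamerstukken\s+(?P<chamber>II|I)\s+"
    r"(?P<start>\d{4})/(?P<end>\d{2}),\s*"
    r"(?P<dossier>\d{2}\s?\d{3}),\s*"
    r"nr\.\s*(?P<document>\d+[a-z]?)"
    r"(?:,\s*(?:blz|p)\.\s*(?P<page>\d+))?"
)


def find_kamerstukken_sketch(text: str) -> list[KamerstukCitationSketch]:
    """Parse function: turn plain text into a list of citation objects."""
    citations = []
    for m in _PATTERN.finditer(text):
        start = int(m["start"])
        # Expand the two-digit end year, allowing for a century rollover
        # such as 1999/00.
        end = (start // 100) * 100 + int(m["end"])
        if end < start:
            end += 100
        citations.append(
            KamerstukCitationSketch(
                chamber=m["chamber"],
                session_year=f"{start}-{end}",
                dossier=re.sub(r"\s", "", m["dossier"]),
                document=m["document"],
                page=m["page"],
            )
        )
    return citations


found = find_kamerstukken_sketch(
    "Kamerstukken I 1979/80, 15 516, nr. 42e, blz. 7"
)
print(found[0].session_year, found[0].dossier, found[0].page)
```

Keeping the text-handling function separate from the citation record means that other input types (such as PDF files) would only need a different front end that produces text, while the citation objects stay the same.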
Only absolute citations are recognized. Detecting relative citations in a text is currently out of scope for nllegalcit.
The following types of citations are implemented:
- Kamerstukken (Dutch parliamentary documents) [⚠️ Work in progress]:

  - Works reasonably well for modern (>1995) citations following the guidelines. Older citations may work. Simple page number notations work.

- Handelingen (Dutch parliamentary minutes) [⚠️ Work in progress]:

  - Initial implementation exists, but needs more thorough testing.

- ECLI case law [⚠️ Work in progress]:

  - Seems to work, but more testing is needed. Paragraph information is not parsed.

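Of the citation types listed above, ECLI is the most regular: an identifier has five colon-separated parts (``ECLI``, country code, court code, year, ordinal number). The following is a rough, self-contained sketch of recognising that shape. It is not nllegalcit's implementation; ``ECLI_SKETCH`` and ``find_eclis_sketch`` are hypothetical names, and the pattern is looser than the full ECLI specification (for example, it does not enforce the maximum length of the ordinal number and only matches upper-case identifiers).

```python
import re

# Assumed, simplified ECLI pattern: ECLI:<country>:<court>:<year>:<ordinal>.
# The ordinal may contain dots, but a trailing sentence-ending dot is excluded.
ECLI_SKETCH = re.compile(
    r"\bECLI:(?P<country>[A-Z]{2}):(?P<court>[A-Z][A-Z0-9]{0,6}):"
    r"(?P<year>\d{4}):(?P<ordinal>[A-Z0-9]+(?:\.[A-Z0-9]+)*)"
)


def find_eclis_sketch(text: str) -> list[dict[str, str]]:
    """Return the named parts of every ECLI-shaped identifier in the text."""
    return [m.groupdict() for m in ECLI_SKETCH.finditer(text)]


for parts in find_eclis_sketch("Zie ECLI:NL:HR:2019:2006."):
    print(parts["country"], parts["court"], parts["year"], parts["ordinal"])
```

As in the library's current state, this sketch ignores any paragraph reference that follows the identifier.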
The following types of citations are not yet implemented, but are planned:
- Dutch case law other than ECLI
- Dutch national laws
- Dutch treaties
- Dutch local laws
- EU laws
Copyright (c) 2023-2024 Martijn Staal <nllegalcit [a t ] martijn-staal.nl>
Available under the European Union Public License v1.2 (EUPL-1.2), or, at your option, any later version.
Please file any issues you may have on GitHub.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   performance.rst
   kamerstukken.rst
   caselaw.rst
   nllegalcit API reference <nllegalcit.rst>