acl-anthology-py

This package provides Python access to data from the ACL Anthology.

How to use

Install via pip:

$ pip install acl-anthology-py

Instantiate the library, automatically fetching data files from the ACL Anthology repo (requires git to be installed on your system):

from acl_anthology import Anthology
anthology = Anthology.from_repo()

Some brief usage examples:

>>> paper = anthology.get("C92-1025")
>>> str(paper.title)
'Two-Level Morphology with Composition'
>>> [author.name for author in paper.authors]
[
    Name(first='Lauri', last='Karttunen'),
    Name(first='Ronald M.', last='Kaplan'),
    Name(first='Annie', last='Zaenen')
]
>>> anthology.find_people("Karttunen, Lauri")
[
    Person(
        id='lauri-karttunen', names=[Name(first='Lauri', last='Karttunen')],
        item_ids=<set of 30 AnthologyIDTuple objects>, comment=None
    )
]
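
As a rough sketch of how the calls above might be combined, the helper below collects display names for the authors of one paper. The function author_names is hypothetical and not part of acl-anthology-py; it uses only anthology.get(), paper.authors, author.name, and the first/last fields visible in the output above. It also assumes that get() returns None for an unknown ID. The stand-in objects exist only so that the example runs without fetching the Anthology data.

```python
from types import SimpleNamespace


def author_names(anthology, paper_id):
    """Return "First Last" strings for the authors of one paper.

    Hypothetical helper, not part of acl-anthology-py.
    """
    paper = anthology.get(paper_id)
    if paper is None:  # assumption: unknown IDs yield None
        return []
    names = []
    for author in paper.authors:
        # Skip empty parts so that single-part names still format cleanly.
        parts = (author.name.first, author.name.last)
        names.append(" ".join(p for p in parts if p))
    return names


# Stand-in objects that mimic only the attributes used above.
_paper = SimpleNamespace(
    authors=[
        SimpleNamespace(name=SimpleNamespace(first="Lauri", last="Karttunen")),
        SimpleNamespace(name=SimpleNamespace(first="Ronald M.", last="Kaplan")),
        SimpleNamespace(name=SimpleNamespace(first="Annie", last="Zaenen")),
    ]
)
fake_anthology = SimpleNamespace(
    get=lambda paper_id: _paper if paper_id == "C92-1025" else None
)

print(author_names(fake_anthology, "C92-1025"))
# ['Lauri Karttunen', 'Ronald M. Kaplan', 'Annie Zaenen']
```

With the real library, the same helper should work when given the object returned by Anthology.from_repo() in place of fake_anthology.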

Find more examples and details on the API in the official documentation.

Developing

This package uses the Poetry packaging system. Development is easiest with the just command runner: just -l lists all available recipes, and just -n <recipe> prints the commands that a recipe would run without executing them.

Running checks, pre-commit hooks, and tests

  • just check runs black, ruff, mypy, and some other pre-commit hooks on all files in the repo.
    • just install-hooks installs the pre-commit hooks so they run on every attempted commit.
  • just test-all runs all tests except those that run on the full Anthology data.
    • just test NAME runs only test functions with NAME in their names.
    • just test-integration runs the tests on the full Anthology data.
  • just fix-and-test (or just ft for short) runs all checks and tests, re-running the checks if they fail, so that checking and testing continue even if some hooks have modified files.
  • The justfile defines several more useful recipes; list them with just -l.

Running benchmarks

There are some benchmark scripts intended to be run with richbench:

poetry run richbench benchmarks/

Generating and writing documentation

  • just docs generates the documentation in the site/ folder.
  • just docs-serve serves the documentation for local browsing.

Docstrings are written in Google style, as this supports the most features with the mkdocstrings handler (particularly compared to Sphinx/reST).
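
For contributors unfamiliar with the convention, here is a minimal sketch of a Google-style docstring with Args, Returns, and Raises sections. The function format_citation is a hypothetical example and does not exist in this package.

```python
from typing import Optional, Sequence


def format_citation(
    title: str, authors: Sequence[str], year: Optional[int] = None
) -> str:
    """Format a short, single-line citation string.

    Hypothetical example function; not part of acl-anthology-py.

    Args:
        title: The title of the paper.
        authors: Author last names in display order.
        year: The publication year, if known.

    Returns:
        A citation string that names the first author, adds "et al." when
        there are several authors, and includes the year when given.

    Raises:
        ValueError: If `authors` is empty.
    """
    if not authors:
        raise ValueError("authors must not be empty")
    lead = authors[0] if len(authors) == 1 else f"{authors[0]} et al."
    suffix = f" ({year})" if year is not None else ""
    return f"{lead}{suffix}: {title}"


print(format_citation("Two-Level Morphology with Composition",
                      ["Karttunen", "Kaplan", "Zaenen"], 1992))
# Karttunen et al. (1992): Two-Level Morphology with Composition
```

Each section header is followed by indented entries, with one entry per parameter or exception.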