cldfcatalog

Utilities to use git repository clones and reference catalogs.

Research data - and in particular CLDF data - is often curated using git repositories for version control. cldfcatalog.Repository provides a wrapper around GitPython's git.Repo class, exposing relevant functionality in this context.

Reference catalogs are a particularly important kind of data for CLDF; they are consulted during CLDF data creation. Such catalogs are also often available as git repositories hosted on GitHub, such as Glottolog or Concepticon.

The typical usage scenario for these catalogs is as follows:

  • To follow upstream development of the catalogs, a user keeps a local clone of the repository, which is periodically synced by running git pull origin.
  • When creating a CLDF dataset, a particular released version of a catalog is consulted.

Thus, we want to

  • check out a particular version of the catalog,
  • run the CLDF creation,
  • restore the previous state of the repository clone.

This is exactly the functionality of cldfcatalog.Catalog:

>>> from cldfcatalog import Catalog
>>> glottolog = Catalog('../../glottolog/glottolog', 'v4.0')
>>> glottolog.active_branch
'master'
>>> with glottolog:
...     print(glottolog.describe())
...     
v4.0
>>> glottolog.describe()
'v4.0-52-ga4cfc90'
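The check-out, run, restore sequence above can be sketched as a plain Python context manager. This is only an illustration of the pattern, not cldfcatalog's actual implementation: FakeRepo and pinned are hypothetical names, and FakeRepo is a stand-in that merely records what is checked out instead of touching a real git clone.

```python
from contextlib import contextmanager


class FakeRepo:
    """Hypothetical stand-in for a git clone; it only tracks what is checked out."""

    def __init__(self, head):
        self.head = head

    def checkout(self, ref):
        self.head = ref


@contextmanager
def pinned(repo, tag):
    """Check out `tag`, yield the repo, then restore the previous state."""
    previous = repo.head  # remember what was checked out before
    repo.checkout(tag)
    try:
        yield repo
    finally:
        # Runs even if the body raises, so the clone is not left on the tag.
        repo.checkout(previous)


repo = FakeRepo('master')
with pinned(repo, 'v4.0') as r:
    inside = r.head  # state seen while creating the dataset
after = repo.head    # state after leaving the with block
```

Putting the restore step in a finally clause is the point of the design: a failing CLDF creation run should not leave the clone pinned to the release tag.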

Configuration

cldfcatalog supports discovery of local paths to catalog clones via a configuration file. If a file catalog.ini is found at appdirs.user_config_dir('cldf') (see appdirs), its clones section is used as a mapping from Catalog.cli_name() to clone path. Thus, with a configuration

[clones]
clts = /home/forkel/.config/cldf/clts

a catalog can be initialized as

with Catalog.from_config('clts', tag='v1.0'):
    ...
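To make the mapping concrete, the following sketch reads a clones section with the standard library's configparser. It assumes nothing about how cldfcatalog parses the file internally; read_clones is a hypothetical helper, and a temporary file stands in for the real catalog.ini.

```python
import configparser
import tempfile
from pathlib import Path


def read_clones(config_path):
    """Return the [clones] section of an INI file as a dict of name -> path string."""
    parser = configparser.ConfigParser()
    parser.read(config_path, encoding='utf-8')  # a missing file is silently skipped
    if not parser.has_section('clones'):
        return {}
    return dict(parser.items('clones'))


with tempfile.TemporaryDirectory() as tmp:
    # Temporary stand-in for the catalog.ini in the user config directory.
    cfg_path = Path(tmp) / 'catalog.ini'
    cfg_path.write_text(
        '[clones]\nclts = /home/forkel/.config/cldf/clts\n', encoding='utf-8')
    clones = read_clones(cfg_path)
```

Here clones maps the name 'clts' to the configured clone path, which is the lookup that Catalog.from_config('clts', ...) relies on conceptually.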

When cloning a catalog by running Catalog.clone, appdirs.user_config_dir('cldf') will be used as the directory for the clone, and the path will be written to the config file.

To add paths to a config file, use it as a context manager:

from cldfcatalog import Config

with Config.from_file() as cfg:
    cfg.add_clone(key, path)