Plot values for a parameter of a CLDF dataset on a map linked to a tree.
The core functionality is implemented in Florian Matter's lingtreemaps package, and all of its configuration options are available from cldfviz.treemap as options prefixed with --ltm-. cldfviz.treemap adds the ability to run lingtreemaps with data from CLDF datasets, exploiting additional data from LanguageTables to allow for configurable matching between data and tree.
As an example, we plot values of WALS feature 88A for languages in a couple of Indo-European genera against the Glottolog classification for Indo-European.
$ cldfbench cldfviz.treemap wals-2020.3/ 88A --tree indo1319 \
--output tm.svg --open --glottolog-cldf glottolog-cldf-4.7/ \
--language-filters '{"Genus":"Germanic|Romance|Celtic"}' --ltm-text-x-offset 0.5
The tree can be specified in various ways:
- Using the --tree option, which accepts
  - a Glottocode, in which case the Glottolog sub-classification tree for the languoid specified by --tree is used (see above),
  - a Newick-formatted string,
  - a path name to an existing file containing the Newick-formatted tree.
- Using the --tree-dataset (and --tree-id) options to select a tree provided in the TreeTable of a CLDF dataset.
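As a rough sketch of the file-based variant, the snippet below writes a three-leaf Newick tree to disk and passes its path to --tree. The file name germanic.nwk and the tree itself are made up for illustration. It assumes that tree labels are matched against Glottocodes when no other matching option is given, and that a Genus filter of "Germanic" is acceptable even though the tree covers only three of those languages; neither assumption is confirmed here. The command only runs if cldfbench and the WALS data directory are present.

```shell
# Write a small, hand-made Newick tree to a file (hypothetical example tree).
# The leaf labels are the Glottocodes for English, Dutch and German.
printf '%s\n' '((stan1293,dutc1256),stan1295);' > germanic.nwk

# Only attempt the plot if the tool and the dataset directory are available.
if command -v cldfbench >/dev/null 2>&1 && [ -d wals-2020.3 ]; then
    # Same parameter as in the example above, but with the tree read from a file.
    cldfbench cldfviz.treemap wals-2020.3/ 88A --tree germanic.nwk \
        --output tm.svg --language-filters '{"Genus":"Germanic"}'
fi
```

Passing the Newick string itself to --tree instead of the file name should be equivalent, since both forms are accepted.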
Since the WALS CLDF data contains the (shallow) trees of its Genealogical Language List, we can plot the same data again as follows:
$ cldfbench cldfviz.treemap wals-2020.3/ 88A --output tm.svg --open \
--language-filters '{"Genus":"Germanic|Romance|Celtic"}' --ltm-text-x-offset 0.5 \
--tree-dataset wals-2020.3/ --tree-id family-indoeuropean \
--tree-label-property ID
Since WALS contains coordinates for all its languages, we do not have to access Glottolog data anymore.
Linking the languages in the parameter's dataset to languages in the tree can be configured via the --glottocodes-as-tree-labels and --tree-label-property options. The former renames tree labels to Glottocodes using the mapping given in the --tree-dataset's LanguageTable. The latter specifies a column in the parameter's LanguageTable whose values are used to match languages to tree node labels.
In the second example above, we used WALS family trees, which use the ID values of the LanguageTable as node labels. Thus, we had to specify --tree-label-property ID to make sure datapoints can be matched with nodes in the tree.
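The effect of choosing the right label column can be mimicked with standard shell tools. The following is a toy sketch, not how cldfviz implements matching: the file names, the three table rows, and the count_matches helper are all invented for illustration. It counts how many leaf labels of a small tree occur in a given column of a LanguageTable-like CSV file.

```shell
# Toy LanguageTable; the rows are illustrative, not copied from the WALS files.
cat > toy_languages.csv <<'EOF'
ID,Name,Glottocode
eng,English,stan1293
ger,German,stan1295
dut,Dutch,dutc1256
EOF

# Toy tree whose leaf labels are ID values, as in the WALS family trees.
newick='((eng,dut),ger);'

# Split the Newick string into one label per line, dropping empty lines.
printf '%s\n' "$newick" | tr '(),;' '\n\n\n\n' | sed '/^$/d' > toy_labels.txt

# Count how many tree labels occur in the given (1-based) CSV column.
count_matches() {
    tail -n +2 toy_languages.csv | cut -d, -f"$1" > toy_column.txt
    # grep -c exits non-zero when nothing matches, hence the "|| true".
    grep -c -x -F -f toy_column.txt toy_labels.txt || true
}

id_matches=$(count_matches 1)
glottocode_matches=$(count_matches 3)
echo "ID column: $id_matches matches"
echo "Glottocode column: $glottocode_matches matches"
```

With this toy data, all three labels are found in the ID column and none in the Glottocode column, which mirrors why the WALS trees need --tree-label-property ID.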