Metadata-free conformance and additional columns #50

Anaphory · 2017-11-13T11:45:16Z

The standard says

A dataset can be CLDF conformant without providing a separate metadata description file. To do so, the dataset must follow the default specification for the appropriate module regarding

I had assumed I could add additional columns, which would simply have no well-defined semantics and only string as a possible datatype. But when I tried to load

ID,Language_ID,Parameter_ID,Form,Segments,Comment,Source,Cognate_Set
0,abai1240,feature1,form,,,,0
1,afad1236,feature1,form,,,,1
2,ambu1247,feature1,form,,,,0

I can get the Cognate_Set column by using iterdicts, but I cannot ask the Dataset object whether that column exists.

Clarify in the CLDF specs whether additional columns are permitted under metadata-free conformance (I'll raise a separate issue there)
If they are permitted, sniff the table header to add them to the table spec
In any case, unify column existence between tableSpec and iterdicts
Fix cldf validate to enforce the specs

(Or convince me why the current state is as it should be – as has happened often enough – and add a bit of documentation about it somewhere.)


xrotwang · 2017-11-13T12:05:25Z

The scenario you describe isn't a problem of "metadata-free conformance". Even with a description file, iterdicts may return dictionaries with more keys than listed in the tableSchema. So I guess there are two levels of support for additional columns:

Implicit: You'd have to inspect the first dictionary returned by iterdicts
Explicit: Whatever is listed as non-virtual column in the description
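To make the two levels concrete, here is a minimal sketch using only the Python standard library, not pycldf. The `DECLARED` list is an assumption standing in for whatever columns a description lists, and `implicit_columns` is a hypothetical helper; `csv.DictReader` plays the role of iterdicts.

```python
import csv
import io

# The sample rows from the report above, inlined so the sketch is self-contained.
CSV_TEXT = """\
ID,Language_ID,Parameter_ID,Form,Segments,Comment,Source,Cognate_Set
0,abai1240,feature1,form,,,,0
1,afad1236,feature1,form,,,,1
2,ambu1247,feature1,form,,,,0
"""

# Assumption: the columns a description declares (the "explicit" level).
# This is a stand-in list, not something read from pycldf or the CLDF defaults.
DECLARED = ["ID", "Language_ID", "Parameter_ID", "Form",
            "Segments", "Comment", "Source"]


def implicit_columns(csv_text, declared):
    """Return keys of the first row dictionary that are not declared."""
    rows = csv.DictReader(io.StringIO(csv_text))
    first = next(rows, None)  # inspect only the first dictionary
    if first is None:
        return []
    return [name for name in first if name not in declared]


extra = implicit_columns(CSV_TEXT, DECLARED)
print(extra)
```

The row dictionaries carry every header field, while the declared list knows nothing about `Cognate_Set`; that gap is the mismatch described in the report.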

I wouldn't want to force-add all columns to the description, because this would require reading (parts of) the data file right away and because this would interfere with the explicitness of the description. So, in terms of the ZEN of CLDF I'd say:

Whatever isn't listed in the description shouldn't be accessed by CLDF-aware software.
If the default descriptions for metadata-free conformance don't list what you want to access, create an explicit more inclusive description first, which can subsequently also serve as documentation for your code.
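As a rough illustration of "a more inclusive description", the sketch below builds a CSVW-style table description fragment that lists the extra column explicitly. The helper `describe_table` and the file name `forms.csv` are hypothetical, and the fragment is deliberately incomplete: a real CLDF description file needs more than this (for example the module declaration and property URLs).

```python
import json


def describe_table(url, declared, extra):
    """Build a CSVW-style table description listing all columns by name."""
    columns = [{"name": name} for name in declared]
    # Additional columns carry no defined semantics, so treat them as strings.
    columns += [{"name": name, "datatype": "string"} for name in extra]
    return {"url": url, "tableSchema": {"columns": columns}}


table = describe_table(
    "forms.csv",  # hypothetical file name
    ["ID", "Language_ID", "Parameter_ID", "Form",
     "Segments", "Comment", "Source"],
    ["Cognate_Set"],
)
print(json.dumps(table, indent=2))
```

Once the column is listed this way, code can check for it against the description instead of sniffing the data file, and the description doubles as documentation.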

Anaphory · 2017-11-13T14:13:48Z

In that case, I think it's good enough to document this behaviour in the iterdicts docstring (which is completely missing at the moment: clld/clldutils#60) and in the CLDF specs.

Anaphory mentioned this issue Nov 13, 2017

It is unclear (also from pycldf implementation) whether metadatafree datasets may have additional columns cldf/cldf#46

Closed

xrotwang closed this as completed Nov 28, 2017
