Attributes vs. Datatypes and Constraints #33

Open
UnclePoole opened this issue Sep 16, 2020 · 0 comments

One open design question for an attribute data model is how to relate attributes to the datatypes that implement them.

A particular question is whether datatypes should be limited to the set of primitives (real, integer, boolean, text, codelist) or should form a more detailed datatype model, similar to XSD, in which logical types are built up from primitives via range constraints, measurement units, text patterns, and so on.

Many attributes may share the same logical constraints, e.g. a text pattern for date strings, or a real-valued angle constrained to [0, 360) with measurement unit degree.

However, attributes may have semantic range constraints that are stricter than the physical range constraints of the measurement unit. For example, distance as a datatype may have a range of [0, inf), but the WID attribute would typically have a narrower range such as (0, 100].
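
To make the distinction concrete, here is a minimal Python sketch of a reusable datatype with a physical range, plus an attribute that layers a narrower semantic range on top. The class names (`Range`, `Datatype`, `Attribute`) and the `metre` unit for distance are assumptions made for illustration; they are not taken from GGDM, NAS, or CDB.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    """Numeric interval with independently open or closed ends."""
    low: float
    high: float
    low_inclusive: bool = True
    high_inclusive: bool = True

    def contains(self, x: float) -> bool:
        above = x >= self.low if self.low_inclusive else x > self.low
        below = x <= self.high if self.high_inclusive else x < self.high
        return above and below


@dataclass(frozen=True)
class Datatype:
    """Logical datatype: a primitive plus optional unit and physical range."""
    name: str
    primitive: str
    unit: Optional[str] = None
    value_range: Optional[Range] = None


@dataclass(frozen=True)
class Attribute:
    """Named use of a datatype, optionally with a stricter semantic range."""
    code: str
    datatype: Datatype
    semantic_range: Optional[Range] = None

    def is_valid(self, value: float) -> bool:
        # A value must satisfy both the datatype range and the attribute range.
        for r in (self.datatype.value_range, self.semantic_range):
            if r is not None and not r.contains(value):
                return False
        return True


# Angle: [0, 360) in degrees, as in the example above.
ANGLE = Datatype("angle", "real", unit="degree",
                 value_range=Range(0.0, 360.0, True, False))
# Distance: [0, inf); the unit is an assumption.
DISTANCE = Datatype("distance", "real", unit="metre",
                    value_range=Range(0.0, math.inf, True, False))
# WID reuses the distance datatype but narrows the range to (0, 100].
WID = Attribute("WID", DISTANCE, semantic_range=Range(0.0, 100.0, False, True))
```

With this layout, `WID.is_valid(0.0)` is rejected by the attribute-level range even though the distance datatype allows zero, which is exactly the case a primitives-only datatype model has to handle at the attribute level.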

This is one area where the GGDM and NAS parent standards diverge.

GGDM flattens datatypes into attributes: each attribute bound to a particular entity type carries the full constraint information at the binding, rather than referencing a separate datatype specification. So each distinct instance of WID has its own duplicate copy of the constraint information.

NAS has a separate datatype table with range, pattern, and measurement constraints; attributes just bind datatypes to particular names, so the same datatype can be reused by many different attributes.

CDB 1.2 places constraints at the attribute level only and datatypes are only the primitives.
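
The three layouts can be compared side by side with a rough Python sketch. Everything here is hypothetical: the entity types `Road` and `Bridge`, the datatype name `WidthMeasure`, the `metre` unit, and the dictionary shapes are illustrative stand-ins, not the actual schemas of GGDM, NAS, or CDB 1.2.

```python
# Shared constraint used in all three layouts (values are illustrative).
WIDTH = {"primitive": "real", "unit": "metre", "min": 0.0, "max": 100.0}

# GGDM-style: constraints are duplicated at every (entity type, attribute) binding.
flattened = {
    ("Road", "WID"): dict(WIDTH),
    ("Bridge", "WID"): dict(WIDTH),
}

# NAS-style: constraints live once in a datatype table;
# attributes bind a name to a datatype, and entities bind to attributes.
datatypes = {"WidthMeasure": dict(WIDTH)}
attributes = {"WID": "WidthMeasure"}
bindings = {("Road", "WID"), ("Bridge", "WID")}

# CDB 1.2-style: datatypes are only primitives; constraints sit on the attribute.
attribute_level = {"WID": dict(WIDTH)}


def resolve_flattened(entity: str, attribute: str) -> dict:
    """Look up constraints directly at the binding."""
    return flattened[(entity, attribute)]


def resolve_normalized(entity: str, attribute: str) -> dict:
    """Follow binding -> attribute -> datatype to reach the constraints."""
    if (entity, attribute) not in bindings:
        raise KeyError((entity, attribute))
    return datatypes[attributes[attribute]]


def resolve_attribute_level(attribute: str) -> dict:
    """Constraints depend only on the attribute, not on the entity type."""
    return attribute_level[attribute]
```

All three lookups return the same constraints for `WID` on `Road`, but the flattened layout stores one copy per binding, while the other two store a single copy. The flattened layout is also the only one of the three that can vary constraints per entity type without extra machinery.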

This gets particularly complex when dealing with enumerated qualitative attributes (codelists) where the list of valid vocabulary terms may vary both per attribute and per entity type (even NAS doesn't normalize duplicate terms across different attributes).
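
One possible way to model this, sketched in Python under assumptions: each codelist attribute has a default set of terms, and an entity type may override that set. The attribute codes `MATERIAL` and `SURFACE`, the entity types, and the terms are all made up for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# Default vocabulary per attribute. Note that "concrete" and "asphalt" appear
# under both attributes: terms are not normalized across attributes.
ATTRIBUTE_TERMS: Dict[str, FrozenSet[str]] = {
    "MATERIAL": frozenset({"concrete", "steel", "wood", "asphalt"}),
    "SURFACE": frozenset({"asphalt", "gravel", "concrete"}),
}

# Per-entity-type overrides that narrow the attribute's default vocabulary.
ENTITY_OVERRIDES: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("Bridge", "MATERIAL"): frozenset({"concrete", "steel", "wood"}),
}


def allowed_terms(entity: str, attribute: str) -> FrozenSet[str]:
    """Return the entity-specific term set if one exists, else the default."""
    return ENTITY_OVERRIDES.get((entity, attribute), ATTRIBUTE_TERMS[attribute])


def is_valid_term(entity: str, attribute: str, term: str) -> bool:
    """Check a term against the vocabulary that applies to this binding."""
    return term in allowed_terms(entity, attribute)
```

Here `is_valid_term("Road", "MATERIAL", "asphalt")` passes on the attribute default, while the same term is rejected for `Bridge` because of the override. Whether overrides should be allowed to add terms, or only to restrict the default set, is a separate design decision this sketch does not settle.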

Thoughts?
