The meaning, history, and future of data_type
#704
Replies: 3 comments
this is what codex thinks: I would lean toward deprecating The clean model is:
Right now My recommendation:
So for the listed options, my answer would be closest to:
The principle I would use is: wrong metadata is worse than missing metadata.
I am inclined to agree, but would like to hear others' thoughts.
When I started with DASCore, the data_type was a useful hint. Given all the complexity and the transforms possible, I would never expect it to stay trustworthy all along a processing chain. Following your questions:
First, some history.
`data_type` was one of DASCore’s earliest fixed-name attributes. Its original purpose was to distinguish between strain, strain-rate, and deformation-rate DAS data. Importantly, it predates the addition of unit support in DASCore.

The main question now is: in current DASCore, what is the semantic meaning of `data_type`, and how should it interact with `data_units`?

The issue is well illustrated by #511. After integrating `strain_rate` data, the `data_type` remains `strain_rate`, even though the units are updated correctly. Should integration inspect the free-form `data_type` string and decide whether to transform it, for example by dropping "rate"? That approach quickly becomes messy. Each function would need its own logic for interpreting and transforming arbitrary input strings.

Another approach is for each patch function to explicitly declare its output `data_type`, without trying to infer meaning from the input `data_type`. This is the approach taken in #693 and #681. It also motivated #702, which makes setting a `data_type` more ergonomic by allowing it to be specified in the `patch_function` decorator when a function has a single expected output type.

This approach is simple, but it can also cause problems. For example, with inverse operations such as integration and differentiation, setting integration output to something like "integrated_quantity" would mean differentiation should not blindly replace it with "differentiated_quantity".

More broadly, `data_type` may now be somewhat redundant. Patches already have units and history, which can convey the same information, although less directly.

So, going forward, should we:

- Allow each function to set whatever `data_type` it wants?
- If a function does not explicitly set `data_type`, should DASCore clear it to an empty string to avoid misleading metadata, as in #511 (patch integrate in TIME does not update data_type)?
- Should `data_type` be a strictly controlled vocabulary with documented semantics (e.g., the tuple in constants) or just free-form text?
- Should we just deprecate `data_type` and rely on `data_units` and history to convey the same meaning?
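To make the "declare it or clear it" option concrete, here is a minimal sketch in plain Python. It does not use DASCore at all: `ToyPatch`, `toy_patch_function`, `integrate_time`, `to_strain_rate`, and the `VALID_DATA_TYPES` tuple are all hypothetical stand-ins invented for illustration, and the real `patch_function` decorator and constants may look quite different. The only point is the rule itself: a function never inspects the incoming `data_type` string; the output gets the declared value, or an empty string if nothing was declared.

```python
from dataclasses import dataclass, replace
from functools import wraps

# Hypothetical controlled vocabulary; the actual tuple in DASCore may differ.
VALID_DATA_TYPES = ("", "strain", "strain_rate", "velocity")


@dataclass(frozen=True)
class ToyPatch:
    """Stand-in for a patch, holding only the two attributes discussed."""

    data_type: str = ""
    data_units: str = ""


def toy_patch_function(data_type=None):
    """Hypothetical decorator: declare the output data_type, or clear it."""
    if data_type is not None and data_type not in VALID_DATA_TYPES:
        raise ValueError(f"unknown data_type: {data_type!r}")

    def decorator(func):
        @wraps(func)
        def wrapper(patch, *args, **kwargs):
            out = func(patch, *args, **kwargs)
            # The input data_type string is never interpreted:
            # use the declared value, otherwise fall back to "".
            return replace(out, data_type=data_type or "")

        return wrapper

    return decorator


@toy_patch_function()  # nothing declared, so data_type is cleared
def integrate_time(patch):
    return replace(patch, data_units=f"({patch.data_units}) * s")


@toy_patch_function(data_type="strain_rate")  # single known output type
def to_strain_rate(patch):
    return replace(patch, data_units="1/s")


patch = ToyPatch(data_type="strain_rate", data_units="1/s")
integrated = integrate_time(patch)
print(integrated)
```

With this rule the situation in #511 would end with an empty `data_type` rather than a stale `strain_rate`, at the cost of losing the label whenever a function has no single obvious output type.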