Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cif_core.dic: _atom_site.label is linked to itself #42

Closed
vaitkus opened this issue Jun 15, 2017 · 7 comments
Closed

cif_core.dic: _atom_site.label is linked to itself #42

vaitkus opened this issue Jun 15, 2017 · 7 comments

Comments

@vaitkus
Copy link
Collaborator

vaitkus commented Jun 15, 2017

The _atom_site.label data item definition in the cif_core.dic dictionary states that this data item is linked to itself using the _name.linked_item_id data item (this is done indirectly by importing the _atom_site_labeldata block from the templ_attr.cif file). Is this done on purpose?

@jamesrhester
Copy link
Contributor

jamesrhester commented Jul 14, 2017

This has been done purely for convenience, so that the attributes for all of the atom_site.label data names are easily harmonised. I suggest that it is not an error for a data name to link to itself as this is trivially correct. If there are other considerations that make such linking undesirable, we can either (1) fully provide the attributes for _atom_site.label, which are then duplicated in templ_attr.cif, or else (2) remove _name.linked_item_id from the templ_attr file and add it in to all of the linked data names explicitly. Solution (1) is quite appealing, as then the details of how an atomic label is constructed are immediately available in the main file at least once, even if they are then duplicated in the template file.

@vaitkus
Copy link
Collaborator Author

vaitkus commented Jul 14, 2017

The formal definition of the _name.linked_item_id data item states that it is a "Dataname of an equivalent item in *another category*..." so, technically, linking a data item to itself is currently not allowed (even though it seems to be used in this way; see discussion at the end of this post).

I would also lean more towards the (1) solution. However, I would recommend moving the _description.text data item from the template to the _atom_site.label definition and providing each importing data item with its own succinct description. This would provide the following benefits:

  1. The template could be imported to the definition of the _atom_site_aniso.label;
  2. The template could potentially be imported to the _atom_site.calc_attached_atom and _model_site.label data items. However, currently the _type.purpose of the two data items differs from the template (Link vs Encode). If solution (1) is implemented maybe the _atom_site.label could retain the original Encode purpose and the template could be changed to contain the Link purpose?
  3. If the _description.text in the save_restr_label template is also distributed over the importing data items, the save_restr_label could be replaced by the save_atom_site_label template altogether;
  4. If the _description.text in the save_atom_site_id template is also distributed over the importing data items, the save_atom_site_id could be replaced by the save_atom_site_label template altogether.

On a slightly related note, some definitions of the data items seems to contradict their actual usage:

  1. As mentioned, the _name.linked_item_id data item states that the linked item must be from a different category, however, all the linked *_su data items belong to the same category as their Measurement counterparts. Maybe this restriction should be removed from the definition?
  2. The _type.purpose data item value Link description states that it is "Used to type an item with a value that is unique within a looped list of items *belonging to another category*...". However, the _atom_site.calc_attached_atom is linked to an element in the same category (_atom_site.label). Maybe this restriction should be removed from the definition?

@jamesrhester
Copy link
Contributor

jamesrhester commented Jul 16, 2017

I agree that we should remove the "belonging to another category" condition on the Link attribute. I do not know of any reason for restricting the nature of a Link in this way. I suggest changing it to:
"Used to type an item that has values that are drawn from a set of unique values for another item". This still excludes the current usage in _atom_site.label ("another item").
I agree that it would be reasonable to move _description.text for _atom_site.label into the main dictionary, and provide individual descriptions for the linked items. The templates can then be simplified as suggested. I will make this change shortly.

@jamesrhester
Copy link
Contributor

jamesrhester commented Jul 24, 2017

The latest update to ddl.dic includes edited text for _name.linked_item_id and the Link state for _type.purpose. Please check.

@vaitkus
Copy link
Collaborator Author

vaitkus commented Jul 24, 2017

The _name.linked_item_id description seems good in the scope of this discussion. I do, however, dislike that the SU values are involved in the definition. It is a separate discussion all together and probably not one of a high priority so I will just outline the general problems:

  1. The interpretation of one data item values becomes dependent on the values of the other data item (_type.purpose value SU forces the _name.linked_item_id item to be interpreted as the measurement data of the standard uncertainty and not as a foreign key);
  2. The SU value in the Measurement data item (a separately defined item with the same name as the measurand item but with an additional suffix '_su') is referenced differently than the Measurement value in the SU data item (the name is provided via the _name.linked_item_id data item).

The _type.purpose description is much better than before. However, I would suggest changing a few things:

  1. Maybe the first sentence should reflect the overall purpose of the type and detailed explanation could follow in other sentences;
  2. The relational database definition of the foreign key also encompasses self-referencing tables. The "... usually (but not always) ... "" part naturally raises a question what are the use cases when this type of value does not serve as a foreign key;
  3. The word packet is used, however, it does not seem to be previously explained. I think that the foreign key implies that the reference applies to the whole row for the relational table (loop line).

A possible rewording of the definition could be as follows:

Used to type an item that acts as a foreign key between two categories.
The definition of this item must contain the attribute "_name.linked_item_id"
specifying the data item with unique values in the linked category. The values
of the defined item are drawn from the set of values in the referenced item.
Cross referencing items from the same category is allowed.

@jamesrhester
Copy link
Contributor

jamesrhester commented Jan 7, 2019

Will update _type.purpose Link text in next commit. Leaving this issue open as a placeholder for SU discussion.

@jamesrhester
Copy link
Contributor

jamesrhester commented Mar 17, 2020

I will open a separate issue for the SU discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants