Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change caseless data values to cased for consistency with DDL1. #249

Closed
wants to merge 2 commits into from

Conversation

jamesrhester
Copy link
Contributor

In the transition to DDLm it appears that some data names were
classed as 'Code' type (caseless) when in DDL1 they were 'char'
(always case-sensitive). This patch attempts to identify those
data names for which this difference is significant. In some
situations, such as data names referring to external
database identifiers, the identifiers should be caseless, even
if there was no mechanism in DDL1 to indicate this.

In the transition to DDLm it appears that some data names were
classed as 'Code' type (caseless) when in DDL1 they were 'char'
(always case-sensitive). This patch attempts to identify those
data names for which this difference is significant. In some
situations, such as data names referring to external
database identifiers, the identifiers should be caseless, even
if there was no mechanism in DDL1 to indicate this.
@jamesrhester
Copy link
Contributor Author

This pull request is aimed at fixing #245. I have also attempted to catch any other items that were case-sensitive in DDL1 but not in DDLm. I have assumed that references to external identifiers (e.g. databases) are case-insensitive.

@vaitkus
Copy link
Collaborator

vaitkus commented Sep 16, 2021

Firstly, I wanted to note, that this also reintroduces ASCII whitespaces into the set of symbols that are allowed in values of the changed data items. This increases the compatibility with DDL1 even more, however, personally I somewhat liked the additional restriction of not having spaces in atom labels, etc. Not sure if this is much of an issue for other users, though.

Secondly, if I am not mistaken, one of the new desired DDLm features was that enumeration values would be case-insensitive. The current PR version would break this for _diffrn_source.target, _exptl_crystal_appearance.hue and _exptl_crystal.colour data items (the last one being a composition of 3 enumeration set items). Maybe these items could be left unchanged since other enumeration set items were not modified?

The previous commit accidentally created case-sensitivity for enumerated values, which should not be case-sensitive.
@vaitkus
Copy link
Collaborator

vaitkus commented Sep 17, 2021

Thank you for the changes. So just to confirm, the reintroduction of ASCII whitespaces into the set of permitted characters is not viewed as a problem?

@jamesrhester
Copy link
Contributor Author

jamesrhester commented Sep 24, 2021

Strictly speaking including whitespace is consistent with DDL1, but I note that mmCIF does not allow whitespace in atom labels. To handle this we would need to invent a new _type.contents that was the equivalent of Code but case-sensitive. I'll ask on the cif-developers list to see what the feeling is.

@jamesrhester
Copy link
Contributor Author

I will update this PR to use the new 'Word' type for compatibility with DDL1.

@jamesrhester jamesrhester deleted the fix_case branch October 19, 2021 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants