[OPTIMADE API] Several validation issues reported by the OPTIMADE provider dashboard #128

ml-evs · 2021-09-29T15:04:52Z

Hi qmpy devs (I guess mostly @tachyontraveler 😉), as I don't think any of you were in the last OPTIMADE meeting, I just thought I would give you a heads-up that the OPTIMADE provider dashboard now lists the errors associated with your implementation. OQMD has quite a few validation errors that you may want to sort out (that others have also raised at recent OPTIMADE tutorials, that make it impossible to use OQMD with the new pymatgen OPTIMADE client). Here is the dashboard link: https://www.optimade.org/providers-dashboard/providers/oqmd.html

The main errors reported by that dashboard can be clustered vaguely into:

Issues with the OptimadeStructureSerializer, e.g. missing values (that should be null or present), incorrect chemical formula formats (additional redundant "1"s, wrong element orderings etc) and some fields having the wrong type, e.g. elements should be a list and not a string.
Incorrect reporting of errors/not implemented filters, e.g. the filter ?filter=id < 12345 should return a proper status code and error format that indicates to automated clients that this filter is not available.
The only mandatory filter (structure_features) is not supported - in this case it, I don't think OQMD hosts any "weird" structures that would have values for structure_features, so it should probably just return all-or-nothing.
A minor issue with the /info/structures endpoint, you need to report that your prefix is _oqmd (even if it is obvious) so that clients know to look for your prefixed properties (which are perhaps the most useful ones I've seen - would love it if all "stability" dbs could agree on a format of hull distance for filtering).

Most of the reported errors should only require minor changes, I'm happy to help out and make some of the tweaks if that would help! If you could handle adding optimade-python-tools as a dependency (only a minimal install would be needed), you could also validate the models produced by your serializer with Python code before it gets deployed.

Let me know what you think!

The text was updated successfully, but these errors were encountered:

tachyontraveler · 2021-10-05T14:42:03Z

Hi @ml-evs , Thanks for reporting the issues. I will work on them this week and they all should be fixed by this weekend - unless I ran into any totally unexpected issues with the Django REST API framework that we use in OQMD.

(and I will be there at the October OPTIMADE meeting)

I'd be happy to look into adding the optimade-python-tools to qmpy once the current errors are fixed, and will let you know if I come across any significant hurdles.

Abhi

tachyontraveler · 2021-11-16T03:15:39Z

I've fixed every issue except for the "CONSTANTFIRST" queries since they're optional in OPTIMADE v1.0.0 and v1.1.0

I will close this issue in a few days unless @ml-evs has any comments/suggestions.

ml-evs · 2021-11-16T10:12:20Z

Thanks @tachyontraveler, this looks great! The dashboard confirms that the only issue is now the constant first comparison.

One issue is that I don't think constant first queries are optional... the spec says:

6.2.2   Numeric and String comparisons

Comparisons involving Numeric and String properties can be expressed using the usual comparison operators: '<', '>', '<=', '>=', '=', '!='. Implementations MUST support comparisons in the forms:

identifier <operator> constant
constant <operator> identifier

Where identifier is a property name and constant is either a numerical or string type constant.

Implementations MAY also support comparisons with identifiers on both sides, and comparisons with numerical type constants on both sides, i.e., in the forms:

identifier <operator> identifier
constant <operator> constant

so while I think identifier <operator> identifier is optional (and whatever constant <operator> constant is meant to do...), it seems that constant <operator> identifier is a "MUST".

tachyontraveler · 2021-11-16T20:26:55Z

Thanks for clarifying that, @ml-evs . I have patched all the issues and now the local optimade validator installed my machine finds no test failures. Also passes 2/4 optional tests as well. I can work on the remaining optional fixes sometime later - probably after this quarter.

tachyontraveler · 2021-11-17T14:08:49Z

Closing this issue since the OQMD implementation now passes all the validation tests, as indicated on the dashboard.

ml-evs · 2021-11-17T14:35:46Z

Awesome, nice one!

ml-evs · 2021-11-25T10:13:41Z

Hey @tachyontraveler, @JPBergsma spotted one final issue with the OQMD OPTIMADE (that also revealed a bug in the validator). The anonymous chemical formulae are ordered incorrectly, e.g., you serve ABCD4 rather than A4BCD. We had a bug that would only enforce the correct order on proportions != 1...

I will quickly throw together a PR for this now.

tachyontraveler · 2021-12-16T17:04:03Z

I will close this issue after the anonymous formula issue is resolved in the scripts and then implemented at oqmd.org

tachyontraveler · 2021-12-17T00:06:41Z

@ml-evs

I tested a few ways to solve this problem and hit a wall at a consistency issue in the filtering of chemical_formula_anonymous.

There's an issue with implementing support for STARTS operation for chemical_formula_anonymous because of the disparity in the ordering between OQMD and OPTIMADE.

For example,

The OPTIMADE query chemical_formula_anonymous STARTS "A2B" AND nelements < 4 requires to return all materials whose anonymous formula is one of
"A2B", "A2B2", "A2BC", "A2B2C2"

But internally, I have to convert the query to look for chemical_formula_anonymous to start with "AB2", not "A2B", as the order in which the anonymous formula is saved in OQMD is ascending. In that case, the internal Django query would return the materials whose anonymous formula is one of
"AB2", "AB2C", "AB2C2", "AB2C3", "AB2C4", "AB2C5", ...

Writing a script to parse the OPTIMADE query and finding relevant matches before the Django SQL query is impractical when nelements filter is not applied.

Now, I can just change the format of retrieved data and disable the support for the STARTS query with chemical_formula_anonymous. Lmk if that's a solution for now with OPTIMADE?

There's a very low chance that I will change the OQMD's generic formula convention anytime soon as it'll break many of the internal users' data generation and retrieval scripts. I will keep this issue open until it's fully resolved.

ml-evs · 2021-12-17T01:22:40Z

Tricky, and completely understand that you don't want to change the qmpy representation!

It actually looks like supporting substring comparisons for chemical_formula_anonymous is not mandatory (see https://github.com/Materials-Consortia/OPTiMaDe/blob/develop/optimade.rst#chemical-formula-anonymous), so you can just spit out a 501 Not Implemented for CONTAINS, STARTS, ENDS etc. Do you have a filter mapper that already allows you to check for equality? e.g. if they ask for A2B in the filter, the query gets converted to AB2 for Django? If so, that should be enough!

ml-evs · 2021-12-17T01:24:26Z

I see that the dashboard was having a go at you for this (which I guess prompted your comment) - I can't check right now, but if it still moans when you return a 501, we can fix that in the validator itself.

tachyontraveler · 2021-12-17T04:29:21Z

Tricky, and completely understand that you don't want to change the qmpy representation!

It actually looks like supporting substring comparisons for chemical_formula_anonymous is not mandatory (see https://github.com/Materials-Consortia/OPTiMaDe/blob/develop/optimade.rst#chemical-formula-anonymous), so you can just spit out a 501 Not Implemented for CONTAINS, STARTS, ENDS etc. Do you have a filter mapper that already allows you to check for equality? e.g. if they ask for A2B in the filter, the query gets converted to AB2 for Django? If so, that should be enough!

Yes, I can easily implement the equality comparison after conversion to OQMD format. I will proceed in this way in that case.

tachyontraveler · 2021-12-17T07:17:31Z

Now the query with START operation results in a 501 error while a query with equality operation returns a successful response.

The anonymous formulae are returned in OPTIMADE's preferred format

ml-evs · 2021-12-17T11:32:08Z

Looks good to me! I notice that the validator does incorrectly flag this, so I raised an issue to track this in the Python tools.

tachyontraveler · 2021-12-17T17:52:35Z

Sounds good. I will close this issue in that case.

tachyontraveler pinned this issue Oct 14, 2021

ml-evs mentioned this issue Nov 4, 2021

Fixes for OPTIMADE structures and info #129

Merged

5 tasks

tachyontraveler unpinned this issue Nov 16, 2021

tachyontraveler closed this as completed Nov 17, 2021

ml-evs mentioned this issue Nov 25, 2021

Reverse anonymous chemical formulae for OPTIMADE ('ABCD4' -> 'A4BCD') #134

Merged

tachyontraveler reopened this Dec 16, 2021

tachyontraveler mentioned this issue Dec 17, 2021

OPTIMADE-related fix (chemical_formula_anonymous vs generic) #135

Merged

ml-evs mentioned this issue Dec 17, 2021

Overzealous validation of substring comparisons for chemical formula fields Materials-Consortia/optimade-python-tools#1024

Closed

tachyontraveler closed this as completed Dec 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[OPTIMADE API] Several validation issues reported by the OPTIMADE provider dashboard #128

[OPTIMADE API] Several validation issues reported by the OPTIMADE provider dashboard #128

ml-evs commented Sep 29, 2021

tachyontraveler commented Oct 5, 2021

tachyontraveler commented Nov 16, 2021

ml-evs commented Nov 16, 2021

tachyontraveler commented Nov 16, 2021

tachyontraveler commented Nov 17, 2021

ml-evs commented Nov 17, 2021

ml-evs commented Nov 25, 2021

tachyontraveler commented Dec 16, 2021

tachyontraveler commented Dec 17, 2021 •

edited

ml-evs commented Dec 17, 2021

ml-evs commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021

ml-evs commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021

[OPTIMADE API] Several validation issues reported by the OPTIMADE provider dashboard #128

[OPTIMADE API] Several validation issues reported by the OPTIMADE provider dashboard #128

Comments

ml-evs commented Sep 29, 2021

tachyontraveler commented Oct 5, 2021

tachyontraveler commented Nov 16, 2021

ml-evs commented Nov 16, 2021

tachyontraveler commented Nov 16, 2021

tachyontraveler commented Nov 17, 2021

ml-evs commented Nov 17, 2021

ml-evs commented Nov 25, 2021

tachyontraveler commented Dec 16, 2021

tachyontraveler commented Dec 17, 2021 • edited

ml-evs commented Dec 17, 2021

ml-evs commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021

ml-evs commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021

tachyontraveler commented Dec 17, 2021 •

edited