Overhaul of filter transformers, mappers and response fields #797

Merged
ml-evs merged 9 commits into master from ml-evs/transformers_overhaul on May 31, 2021


ml-evs
Member

@ml-evs ml-evs commented May 10, 2021

This PR lays the groundwork for dealing with some outstanding issues:

  1. Returning None for "unknown unknown" properties (e.g. _other_implementation_band_gap), which closes #516 ("Missing optional fields are not returned as null when requested with response_fields"), and raising errors for "known unknown" properties in filters (e.g. unprefixed_optimade_field), which closes #263 ("Respect responses for unknown properties").
  2. Abstraction of the aliasing code so that "Quantities" can be defined independently of the backend, which closes #743 ("Move aliasing code to base transformer").
  3. Definition of provider fields via pydantic models so that they can be filtered appropriately.
  4. MongoDB $size: 1 queries match non-existent fields, tracked in #807 ("mongomock $size queries match all non-array fields for {$size: 1}, even nulls"). Removed from this PR, as this is actually a mongomock bug.
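
The filter-side behaviour in item 1 can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `check_filter_field`, `UnknownFieldError`, the field set and the prefix set are all hypothetical names, and treating every registered prefix (including the server's own) as a silent passthrough is an assumption.

```python
import warnings

# Hypothetical data for illustration only; not the optimade-python-tools API.
KNOWN_FIELDS = {"elements", "nelements", "_exmpl1_band_gap"}
REGISTERED_PROVIDER_PREFIXES = {"_exmpl1_", "_exmpl2_"}


class UnknownFieldError(ValueError):
    """Raised when a filter uses an unprefixed field missing from the schema."""


def check_filter_field(field: str) -> bool:
    """Return True if the field can be queried on this server.

    Return False if the field should be passed through as an unknown (null)
    property, and raise if it is an unknown, unprefixed field.
    """
    if field in KNOWN_FIELDS:
        return True
    if not field.startswith("_"):
        # "Known unknown": looks like a standard field but is not one.
        raise UnknownFieldError(f"Unknown field {field!r} used in filter")
    if not any(field.startswith(p) for p in REGISTERED_PROVIDER_PREFIXES):
        # Prefix does not match any registered provider: warn, but do not fail.
        warnings.warn(f"Field {field!r} has an unrecognised provider prefix")
    # "Unknown unknown": another provider's field, treated as missing.
    return False
```

With these assumptions, a filter on `_exmpl2_band_gap` passes through silently on a server run by `_exmpl1_`, while a filter on `unprefixed_optimade_field` raises.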

It does so by:

  • Moving aliasing code from individual transformers out to the base transformer, by making use of the Quantity classes defined for Elasticsearch.
  • Adding an ENTRY_RESOURCE_ATTRIBUTE_CLASS class attribute to the mappers, so that model schemas can be accessed.
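
As a rough sketch of how these two changes fit together (under assumptions, not the actual implementation): the `Quantity`, `BaseTransformer` and `MongoLikeTransformer` classes below are simplified stand-ins, `StructureAttributes` uses a dataclass where the real code uses pydantic models, and `ALIASES`, `quantities()` and `resolve()` are hypothetical names. Only `ENTRY_RESOURCE_ATTRIBUTE_CLASS` is taken from the description above.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional


@dataclass
class Quantity:
    """Backend-independent description of a queryable field (simplified)."""

    name: str                            # name used in OPTIMADE filters
    backend_field: Optional[str] = None  # name in the database, if different

    def __post_init__(self) -> None:
        if self.backend_field is None:
            self.backend_field = self.name


@dataclass
class StructureAttributes:
    """Stand-in for the model that defines the entry's attributes."""

    elements: List[str] = field(default_factory=list)
    immutable_id: Optional[str] = None


class StructureMapper:
    # The mapper points at the attribute model so its schema can be inspected.
    ENTRY_RESOURCE_ATTRIBUTE_CLASS = StructureAttributes
    ALIASES = {"immutable_id": "_id"}  # hypothetical alias table

    @classmethod
    def quantities(cls) -> Dict[str, Quantity]:
        """Build one Quantity per attribute found on the model."""
        return {
            f.name: Quantity(f.name, cls.ALIASES.get(f.name))
            for f in fields(cls.ENTRY_RESOURCE_ATTRIBUTE_CLASS)
        }


class BaseTransformer:
    """Resolves aliases once, so backend subclasses do not repeat the logic."""

    def __init__(self, quantities: Dict[str, Quantity]) -> None:
        self.quantities = quantities

    def resolve(self, name: str) -> str:
        quantity = self.quantities.get(name)
        return quantity.backend_field if quantity else name


class MongoLikeTransformer(BaseTransformer):
    def equals(self, name: str, value: Any) -> Dict[str, Any]:
        return {self.resolve(name): {"$eq": value}}


transformer = MongoLikeTransformer(StructureMapper.quantities())
print(transformer.equals("immutable_id", "abc"))  # {'_id': {'$eq': 'abc'}}
```

The point of the design is that the backend-specific subclass only builds query fragments, while the alias lookup lives in one place and is driven by the schema exposed through the mapper.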

Still to-do:

  • Add passthrough for "other provider" fields so that querying for _exmpl2_band_gap does not fail for provider _exmpl1_.
  • Emit a warning if the provider prefix is unknown, and allow the field through if the provider is registered with OPTIMADE.
    • This is done via the providers submodule (rather than the get_providers request). We could consider making this a configuration option instead.
  • Handle unknown response_fields
    • Return any requested, missing field as None if present in response_fields
    • Raise an error if an unknown, unprefixed field is requested in response_fields (e.g. fractional_coordinates)
    • Emit a warning if a provider-specific field is requested that does not exist (e.g. _exmpl_test)
    • Passthrough other-provider-specific fields as if they were missing from the entry (i.e. replace them with null)
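
The response_fields rules listed above could look something like the sketch below. It is only an illustration under stated assumptions: `select_response_fields`, its parameters and the `_exmpl_` prefix are hypothetical, and returning null alongside the warning for a non-existent own-provider field is assumed rather than confirmed by the list.

```python
import warnings
from typing import Any, Dict, Iterable, Set


def select_response_fields(
    entry: Dict[str, Any],
    requested: Iterable[str],
    known_fields: Set[str],
    own_prefix: str = "_exmpl_",  # hypothetical prefix of this provider
) -> Dict[str, Any]:
    """Build the attributes to return for one entry (illustrative only)."""
    result: Dict[str, Any] = {}
    for name in requested:
        if name in entry:
            result[name] = entry[name]
        elif name in known_fields:
            # Requested and known, but missing from this entry: return null.
            result[name] = None
        elif not name.startswith("_"):
            # Unknown, unprefixed field: treated as a client error.
            raise ValueError(f"Unknown field {name!r} in response_fields")
        else:
            if name.startswith(own_prefix):
                # Our own prefix, but no such field is defined: warn.
                warnings.warn(f"Provider field {name!r} does not exist")
            # Other providers' fields are treated as missing from the entry.
            result[name] = None
    return result
```

Under these assumptions, requesting `fractional_coordinates` raises, `_exmpl_test` yields null with a warning, and a field from another provider yields null silently.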

@codecov

codecov bot commented May 10, 2021

Codecov Report

Merging #797 (ec8318f) into master (0f11919) will decrease coverage by 0.00%.
The diff coverage is 96.52%.

❗ Current head ec8318f differs from pull request most recent head eb69f62. Consider uploading reports for the commit eb69f62 to get more accurate results

@@            Coverage Diff             @@
##           master     #797      +/-   ##
==========================================
- Coverage   92.68%   92.68%   -0.01%     
==========================================
  Files          68       68              
  Lines        3677     3759      +82     
==========================================
+ Hits         3408     3484      +76     
- Misses        269      275       +6     
Flag       Coverage Δ
project    92.68% <96.52%> (-0.01%) ⬇️
validator  92.68% <96.52%> (-0.01%) ⬇️


Impacted Files                                          Coverage Δ
optimade/validator/validator.py                         82.49% <93.75%> (+0.34%) ⬆️
optimade/filtertransformers/elasticsearch.py            84.57% <94.59%> (+0.79%) ⬆️
optimade/server/mappers/entries.py                      97.11% <95.34%> (-1.42%) ⬇️
optimade/filtertransformers/mongo.py                    97.00% <96.42%> (-0.60%) ⬇️
optimade/filtertransformers/base_transformer.py         97.45% <96.82%> (-1.28%) ⬇️
optimade/filtertransformers/__init__.py                 100.00% <100.00%> (ø)
optimade/server/entry_collections/elasticsearch.py      97.40% <100.00%> (-0.52%) ⬇️
...made/server/entry_collections/entry_collections.py   97.63% <100.00%> (+0.22%) ⬆️
optimade/server/mappers/links.py                        100.00% <100.00%> (ø)
optimade/server/mappers/references.py                   100.00% <100.00%> (ø)
... and 6 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0f11919...eb69f62.

@ml-evs ml-evs added the labels blocked (for issues/PRs that are blocked by required changes/clarifications to the specification), priority/medium (issue or PR with a consensus of medium priority) and transformers (related to all filter transformers) May 11, 2021
@ml-evs ml-evs force-pushed the ml-evs/transformers_overhaul branch from 6d13f3e to 805d914 Compare May 14, 2021 11:11
@ml-evs ml-evs removed the blocked label May 14, 2021
@ml-evs ml-evs force-pushed the ml-evs/transformers_overhaul branch 6 times, most recently from 8158e2e to 9ab9858 Compare May 18, 2021 11:47
@ml-evs ml-evs changed the title Overhaul of filter transformers Overhaul of filter transformers, mappers and response fields May 18, 2021
@ml-evs ml-evs force-pushed the ml-evs/transformers_overhaul branch from 0a7d14a to b40d191 Compare May 18, 2021 18:23
@ml-evs ml-evs mentioned this pull request May 18, 2021
4 tasks
@ml-evs
Member Author

ml-evs commented May 18, 2021

  • Handle unknown response_fields
    * Return any requested, missing field as None if present in response_fields
    * Raise an error if an unknown, unprefixed field is requested in response_fields (e.g. fractional_coordinates)
    * Emit a warning if a provider-specific field is requested that does not exist (e.g. _exmpl_test)
    * Passthrough other-provider-specific fields as if they were missing from the entry (i.e. replace them with null)

@CasperWA: this might be an over-interpretation of the spec. Looks like other implementations just return null for any requested response field, but above I have described (and implemented) something that more closely follows the recommendations on filters.

@ml-evs ml-evs marked this pull request as ready for review May 19, 2021 08:08
Contributor

@markus1978 markus1978 left a comment

I can only provide some fairly superficial comments. I think the biggest issues are:

  • The increased number of terms for the same thing (quantity, property, attribute, field). I know I contributed to that.
  • I feel that the quantity-related functionality belongs in the ResourceMapper classes.

optimade/filtertransformers/base_transformer.py (outdated, resolved)
optimade/filtertransformers/mongo.py (resolved)
optimade/filtertransformers/base_transformer.py (outdated, resolved)
@ml-evs ml-evs force-pushed the ml-evs/transformers_overhaul branch from f2e96f0 to 94bf368 Compare May 19, 2021 11:40
@ml-evs ml-evs requested a review from CasperWA May 20, 2021 18:44
ml-evs and others added 6 commits May 27, 2021 12:57
…oses #743)

- Add entry resource class attribute to mappers, use that to scrape attributes

- Use new BaseTransformer to handle aliasing in MongoTransformer

- Move definitions of ES quantities away from server and into mapper

- Improve MongoFilterTransformer tests by using OPTIMADE queries directly

- Add more ES FilterTransformer tests for KNOWN/UNKNOWN and fix weird dimension_types test

- Extract as much info from schemas to construct mappers

- Tweak and use retrieve_queryable_properties in mapper

- Provide clearer error message for ES relationship filtering

- Docstrings and annotations for transformers

- Make reversed_operator_map and quantity_type private
…x does not match a known provider (closes #263)

- Elasticsearch fix for provider fields
- Add warning checks to tests
- Fallback to no providers
- Passthrough for 'other provider' fields
@ml-evs ml-evs force-pushed the ml-evs/transformers_overhaul branch 3 times, most recently from 26d4a57 to 444b1b6 Compare May 27, 2021 17:52
ml-evs and others added 2 commits May 27, 2021 18:58
- Switch to single underscore for quantity methods, update docs for Quantity/ElasticsearchQuantity
- Add docstrings for BaseTransformer attributes
- Apply suggestions from code review
- Docstring tweaks and less verbose mkdocs in CI
- Remove seemingly defunct REQUIRED_FIELDS member of mapper
- Use base EntryCollection init in ES collection
- Docstrings; add a '.deserialize()' method to mapper
- Add classpropertys to mapper docstring for mkdocs

Co-authored-by: Markus Scheidgen <markus1978@users.noreply.github.com>
Co-authored-by: Johan Bergsma <JPBergsma@users.noreply.github.com>
Co-authored-by: Casper Welzel Andersen <CasperWA@users.noreply.github.com>
- Skip unknown length test for mongomock, temporarily
tests/server/conftest.py (outdated, resolved)
Co-authored-by: Johan Bergsma <JPBergsma@users.noreply.github.com>
@ml-evs ml-evs force-pushed the ml-evs/transformers_overhaul branch from ec8318f to eb69f62 Compare May 31, 2021 18:08
@ml-evs
Member Author

ml-evs commented May 31, 2021

Hi @JPBergsma, I have squashed your final suggestions into one commit, eb69f62. Thanks again for the thorough review. It would be great to merge this with enough time to prepare tutorials for the workshop!

Contributor

@JPBergsma JPBergsma left a comment

You have handled all my comments, so I think this can be merged into the main branch.

@ml-evs ml-evs merged commit 2e9507e into master May 31, 2021
@ml-evs ml-evs deleted the ml-evs/transformers_overhaul branch May 31, 2021 21:43
Labels
priority/medium: Issue or PR with a consensus of medium priority
transformers: Related to all filter transformers
4 participants