Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of MongoDB ObjectIDs as OPTIMADE immutable_id #1142

Merged
merged 4 commits into from May 10, 2022

Conversation

ml-evs
Copy link
Member

@ml-evs ml-evs commented May 8, 2022

Currently, we loop over the entire cursor and check whether the _id projection was requested, which is very inefficient. This PR makes some fixes around that:

  • Use native MongoDB toString projection if _id is requested (and only do the inefficient loop if using mongomock).
  • Tidy up the Mongo-specific param handling into a method
  • Loosen the validator to not require comparison operators on the immutable_id field (except equality).
  • Automatically cast the immutable_id field to a string in the models.
  • Use the generated MongoDB ObjectID as the immutable_id in our test cases.

@ml-evs ml-evs added priority/medium Issue or PR with a consensus of medium priority server Issues pertaining to the example server implementation labels May 8, 2022
@ml-evs ml-evs force-pushed the ml-evs/improve_objectid_handling branch from 576ebbf to 8ff8615 Compare May 9, 2022 10:14
@codecov
Copy link

codecov bot commented May 9, 2022

Codecov Report

Merging #1142 (3a03ca3) into master (d9f0ab0) will increase coverage by 0.09%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1142      +/-   ##
==========================================
+ Coverage   92.32%   92.41%   +0.09%     
==========================================
  Files          68       68              
  Lines        3894     3904      +10     
==========================================
+ Hits         3595     3608      +13     
+ Misses        299      296       -3     
Flag Coverage Δ
project 92.41% <100.00%> (+0.09%) ⬆️
validator 92.41% <100.00%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...made/server/entry_collections/entry_collections.py 97.60% <ø> (-0.04%) ⬇️
optimade/validator/config.py 100.00% <ø> (ø)
optimade/models/entries.py 97.77% <100.00%> (+0.27%) ⬆️
optimade/server/entry_collections/mongo.py 96.77% <100.00%> (+2.22%) ⬆️
optimade/validator/validator.py 84.03% <0.00%> (+0.36%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d9f0ab0...3a03ca3. Read the comment docs.

@ml-evs
Copy link
Member Author

ml-evs commented May 9, 2022

@JPBergsma would you be able to take a look at this at some point today? I think we should try to get it into v0.17 as it could be quite significant speed-up in certain scenarios

@ml-evs ml-evs force-pushed the ml-evs/improve_objectid_handling branch 3 times, most recently from 73e197e to 723ff87 Compare May 9, 2022 18:56
JPBergsma
JPBergsma previously approved these changes May 10, 2022
Copy link
Contributor

@JPBergsma JPBergsma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I did not see your request to review it sooner.
It got lost in the flood of github messages in my inbox.
I did not see anything wrong with it.

@@ -74,7 +74,7 @@
DataType.STRING: inclusive_ops + substring_operators,
DataType.TIMESTAMP: (
# N.B. "=" and "<=" are disabled due to issue with microseconds stored in database vs API response (see Materials-Consortia/optimade-python-tools/#606)
# ">=" is fine as all microsecond trimming will round times down
# ">=" is fine as all microsecond trimming will round times down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you accidentally added a tab here

Suggested change
# ">=" is fine as all microsecond trimming will round times down
# ">=" is fine as all microsecond trimming will round times down

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was actually intentional to stop the operator in the second line flowing with the inline comments underneath (which are just commented out functionality). Maybe a linebreak/moving the comment above would be clearer

@ml-evs
Copy link
Member Author

ml-evs commented May 10, 2022

Thanks @JPBergsma and np, lots of notifications yesterday 🙃 I'll address your comment and merge without a futher review.

@ml-evs ml-evs force-pushed the ml-evs/improve_objectid_handling branch from 80350a2 to 3a03ca3 Compare May 10, 2022 11:26
@ml-evs ml-evs changed the title Improve handling of MongoDB ObjectIDs Improve handling of MongoDB ObjectIDs as OPTIMADE immutable_id May 10, 2022
@ml-evs ml-evs changed the title Improve handling of MongoDB ObjectIDs as OPTIMADE immutable_id Improve handling of MongoDB ObjectIDs as OPTIMADE immutable_id May 10, 2022
@ml-evs ml-evs merged commit 0ef02ce into master May 10, 2022
@ml-evs ml-evs deleted the ml-evs/improve_objectid_handling branch May 10, 2022 11:43
@ml-evs ml-evs added the enhancement New feature or request label May 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority/medium Issue or PR with a consensus of medium priority server Issues pertaining to the example server implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants