Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalising collections and adding ElasticsearchCollection #660

Merged
merged 5 commits into from Mar 15, 2021

Conversation

ml-evs
Copy link
Member

@ml-evs ml-evs commented Jan 5, 2021

This PR is the spiritual successor of #339 and includes a lot of @markus1978's original work.

Closes #742.

To-do

Known issues (to be raised pending review)

  • Had to disable some of the more esoteric elasticsearch features that we already had, e.g. HAS ANY querying
  • When querying a field not in the index, it returns an empty cursor, rather than an error
  • Testing across all backends is slightly fragile, relying on some hardcoded checks for backend-dependent error messages

Some future to-do:

This PR opens up the opportunity to do some work, which we can track with new issues:

],
)

def find(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that much of the parameter handling (especially all the guard checks) could be done in the base class, as they are basically independent of the underlying db. I would suggest to implement find(params) in the base class and have an abstract method _run_db_query(filter_query, page_limit, page_offset, sort_field, order).

@codecov
Copy link

codecov bot commented Jan 12, 2021

Codecov Report

Merging #660 (f2d97a8) into master (9306cb8) will decrease coverage by 0.10%.
The diff coverage is 93.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #660      +/-   ##
==========================================
- Coverage   92.78%   92.67%   -0.11%     
==========================================
  Files          67       68       +1     
  Lines        3533     3645     +112     
==========================================
+ Hits         3278     3378     +100     
- Misses        255      267      +12     
Flag Coverage Δ
project 92.67% <93.71%> (-0.11%) ⬇️
validator 92.67% <93.71%> (+29.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
optimade/filtertransformers/django.py 91.52% <ø> (ø)
optimade/filtertransformers/mongo.py 97.57% <25.00%> (ø)
optimade/filtertransformers/elasticsearch.py 83.78% <60.00%> (-5.44%) ⬇️
optimade/server/config.py 86.74% <81.25%> (-1.49%) ⬇️
...made/server/entry_collections/entry_collections.py 97.41% <94.11%> (-1.44%) ⬇️
optimade/server/entry_collections/elasticsearch.py 97.91% <97.91%> (ø)
optimade/server/data/__init__.py 75.00% <100.00%> (+5.00%) ⬆️
optimade/server/entry_collections/__init__.py 100.00% <100.00%> (ø)
optimade/server/entry_collections/mongo.py 96.15% <100.00%> (+0.84%) ⬆️
optimade/server/main.py 94.23% <100.00%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9306cb8...f2d97a8. Read the comment docs.

@ml-evs
Copy link
Member Author

ml-evs commented Feb 21, 2021

I think this is now ready for you @markus1978... perhaps it makes sense for you to test your server with this backend as part of the review? We can then go through and improve the overall integration in future PRs (see e.g. future to-do list in top comment)

@ml-evs
Copy link
Member Author

ml-evs commented Feb 21, 2021

Requesting @CasperWA too in case he wants to look at the more general stuff.

@ml-evs ml-evs force-pushed the ml-evs/elastic branch 2 times, most recently from d718bcf to 5ab64c1 Compare February 22, 2021 00:06
@ml-evs ml-evs marked this pull request as ready for review February 24, 2021 22:08
@ml-evs ml-evs requested a review from shyamd as a code owner February 24, 2021 22:08
@ml-evs
Copy link
Member Author

ml-evs commented Feb 24, 2021

Weird pydantic failures regarding the config with Python 3.9 that I can't shake off...

@CasperWA CasperWA added the blocking For issues/PRs that are blocking other PRs. label Mar 3, 2021
Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several different things to touch on. But generally, I'm really optimistic about these changes and this implementation/extension to the entry collections module.

See all suggested changes and comments below.

.github/workflows/deps_lint.yml Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
optimade/filtertransformers/django.py Outdated Show resolved Hide resolved
optimade/filtertransformers/elasticsearch.py Show resolved Hide resolved
tests/server/query_params/test_filter.py Show resolved Hide resolved
import pytest
from optimade.server.config import CONFIG, SupportedBackend
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have issues when importing from the package outside specific test functions.
I see you're using it in decorators, but maybe we can get around this in another way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, when reviewing this file, you're already using several different ways to handle the different backends. Choose one, and don't choose the one that uses decorators (because we then need to import from the package outside the test functions). And then import from the package where needed within each function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the problems that you're alluding to are; each pytest session is going to be testing one fixed backend so it shouldn't break anything. Importing inside each test will have the same side effect anyway, as the client will only be created once.

The different ways of handling it correspond to different outcomes: either needing to check for different error messages, or marking the tests as xfails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure about the import order, but I just remember there were issues with importing configuration settings and similar outside tests as opposed to inside. But if it works, it's fine, I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works fine with this current layout, i.e. testing each backend indepedently. It would be nice if we could wrap a lot of these tests with some kind of parameterized context manager that can switch between the backends (this will also be more extensible if we add more backends) but that is beyond the scope of this PR

tests/server/query_params/test_sort.py Outdated Show resolved Hide resolved
tests/server/test_server_validation.py Outdated Show resolved Hide resolved
@ml-evs ml-evs force-pushed the ml-evs/elastic branch 2 times, most recently from f225114 to 41abcae Compare March 9, 2021 01:34
@ml-evs ml-evs mentioned this pull request Mar 9, 2021
@ml-evs ml-evs force-pushed the ml-evs/elastic branch 5 times, most recently from b972b88 to a7526b6 Compare March 11, 2021 20:03
- Renamed 'use_real_mongo' option to reflect general usage
- Tweaks to elasticsearch indexes and naming conventions
- Explicitly remove HAS ONLY code
- Remove CI_FORCE_<x> config values and promote mongomock to a supported backend
- Reintroduce 'use_real_mongo' as a deprecated field
- Generalised some tests to work with elastic
- Fix sorting and aliases
- Fix 'more_data_available' response from elasticsearch
- Fix query in structure router tests
- Fix typo in elastic aliases
- Add 'id' to elastic index and sort on it by default
- Properly handle aliases in elastic index
- Fixed elasticsearch sorting and tests
- Added default sort over ID for structure data
- Updated sorting tests to reflect that elasticsearch does not store milliseconds
- Comment out HAS ONLY and correlated list tests, pending removal
- Add quotes and properly decode URLs when testing elastic
- Use 'date' type for last_modified in elastic index and add species_at_sites
- Formatted elastic index
- Use CONFIG rather than env var for elastic testing
- Skip mongo mapper test for non-mongo backends
- Run validator coverage tests with all backends
@ml-evs ml-evs force-pushed the ml-evs/elastic branch 2 times, most recently from c9534d3 to 3d57a79 Compare March 11, 2021 23:59
optimade/server/config.py Outdated Show resolved Hide resolved
optimade/server/entry_collections/elasticsearch.py Outdated Show resolved Hide resolved
optimade/server/entry_collections/elasticsearch.py Outdated Show resolved Hide resolved
optimade/server/entry_collections/entry_collections.py Outdated Show resolved Hide resolved
optimade/server/main.py Outdated Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
CasperWA
CasperWA previously approved these changes Mar 15, 2021
Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @ml-evs ! As well as the initial work @markus1978. Thanks !

@CasperWA CasperWA merged commit fdc65db into master Mar 15, 2021
@CasperWA CasperWA deleted the ml-evs/elastic branch March 15, 2021 11:49
@markus1978
Copy link
Contributor

Thanks guys for all this work. I hope I can get around taking it for a spin soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocking For issues/PRs that are blocking other PRs. enhancement New feature or request priority/medium Issue or PR with a consensus of medium priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rename config variable use_real_mongo to something more general
3 participants