Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix errors parsing IDs that contain slashes #183

Merged
merged 6 commits into from Mar 4, 2020
Merged

Conversation

ml-evs
Copy link
Member

@ml-evs ml-evs commented Feb 18, 2020

Hopefully this will eventually close #181...

This PR so far just adds tests that should pass and currently don't. When filtering on included data (which is done by default) any "/" in the ID (totally allowed under spec) will throw a lark parser error, e.g.

parser.parse("id=dummy/2019")
E           optimade.filterparser.lark_parser.ParserError: No terminal defined for '/' at line 1 col 9
E
E           id=dummy/2019
E                   ^
E
E           Expecting: {'ESCAPED_STRING', 'SIGNED_INT', 'DOT', 'SIGNED_FLOAT', '_OR', '_AND', 'IDENTIFIER', 'NOT', 'LPAR'}

@CasperWA
Copy link
Member

CasperWA commented Feb 27, 2020

So looking into this issue a bit, I've found that the issue is that we are not enclosing the value in quotes ("value").
But that then amounts to the question of whether id's should be queryable without adding quotes to the value?

Edit: Tried to make this comment long time ago - GitHub finally worked.

@codecov
Copy link

codecov bot commented Feb 27, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@23159c7). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #183   +/-   ##
=========================================
  Coverage          ?   86.62%           
=========================================
  Files             ?       39           
  Lines             ?     1854           
  Branches          ?        0           
=========================================
  Hits              ?     1606           
  Misses            ?      248           
  Partials          ?        0           
Flag Coverage Δ
#unittests 86.62% <ø> (?)
Impacted Files Coverage Δ
optimade/models/entries.py 100.00% <0.00%> (ø)
optimade/validator/validator.py 62.33% <0.00%> (ø)
optimade/models/references.py 97.61% <0.00%> (ø)
optimade/filtertransformers/mongo.py 95.29% <0.00%> (ø)
optimade/models/baseinfo.py 89.13% <0.00%> (ø)
optimade/server/mappers/__init__.py 100.00% <0.00%> (ø)
optimade/server/routers/landing.py 66.66% <0.00%> (ø)
optimade/models/jsonapi.py 90.00% <0.00%> (ø)
optimade/server/routers/structures.py 100.00% <0.00%> (ø)
optimade/server/routers/references.py 100.00% <0.00%> (ø)
... and 29 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23159c7...5ff98d5. Read the comment docs.

@CasperWA
Copy link
Member

So I've "hacked" the grammar to allow for un-quoted id values ... it would be better if just said ids should be in quotes as every other string value, but for ids this maybe doesn't make sense?

@CasperWA CasperWA marked this pull request as ready for review February 27, 2020 16:55
@CasperWA CasperWA requested a review from fekad February 27, 2020 16:55
@CasperWA
Copy link
Member

From Skype conversation with @ml-evs (all from me):

it seems the issue is (simply) that to be recognized as an escaped string, it needs to be in quotations
i.e., id="example/test" will work fine.
This then raises the question if the value for id is special, and it has to be treated as an escaped string even if there are no quotes around it?

I've been testing it further, implementing a separate grammar that adds an ID_ESCAPED_STRING, implementing this to the extent it's needed. However, this also means that it doesn't enclose .............. never mind - just fixed it

@ml-evs
Copy link
Member Author

ml-evs commented Feb 27, 2020

Thanks for this fix @CasperWA, it seems sensible to me! I'd like to test that this works on my own server later on, as the initial problem I had was an internal server issue where it could not return documents with this ID in included, which lead me to writing the initial tests...

@ml-evs ml-evs changed the title Filter lark parser issues Fix errors parsing IDs that contain slashes Feb 28, 2020
@ml-evs
Copy link
Member Author

ml-evs commented Feb 28, 2020

This does indeed fix the error I was getting @CasperWA, but I see what you mean about treating ID's on a separate footing. A small change to router.utils.get_included_relationships will have the same effect, so I guess we should decide which fix to make... I'll push the fix here too, and we can decide whether to revert the grammar after.

@CasperWA
Copy link
Member

CasperWA commented Mar 1, 2020

This does indeed fix the error I was getting @CasperWA, but I see what you mean about treating ID's on a separate footing. A small change to router.utils.get_included_relationships will have the same effect, so I guess we should decide which fix to make... I'll push the fix here too, and we can decide whether to revert the grammar after.

The fix in the router is quite straightforward, but for me it doesn't solve the underlying question: Should id values be enclosed in quotation marks or not? If yes, then we fix it in the router, if no, then we need to fix it in the grammar (unfortunately). To me, the grammar fix is quite ugly.

@ml-evs
Copy link
Member Author

ml-evs commented Mar 3, 2020

I'd rather fix it in the router, so I'll remove the grammar changes.

@CasperWA
Copy link
Member

CasperWA commented Mar 3, 2020

I'd rather fix it in the router, so I'll remove the grammar changes.

Sure. Following the discussion in Materials-Consortia/OPTIMADE#259 we should however be able to not have to enclose the value in quotation marks ...

But this of course shouldn't have to actually result in a grammar change for the filter query parameter input.

Edit: I think I mixed up some issues with this comment - never mind, you do you 😅

@CasperWA CasperWA force-pushed the ml-evs/parser_issues branch 3 times, most recently from 95da19f to 4c85295 Compare March 4, 2020 17:41
@CasperWA CasperWA requested review from CasperWA and removed request for fekad March 4, 2020 17:42
ml-evs and others added 5 commits March 4, 2020 18:54
Since id values may be given without being an escaped string (i.e.,
without quotation marks `"`) this needs to be taken into account in the
grammar.

Up to this point, the only allowed values for id (without quotation
marks) were as a "property", i.e., r"\._[a-z][0-9]", so no capital
letters, nor any special characters, e.g., like `/`.
This commit effectively reverts 1546bd.

Co-Authored-By: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
@CasperWA
Copy link
Member

CasperWA commented Mar 4, 2020

I have just added tests (they should currently fail) for the include parameter to align with the results of the discussion in Materials-Consortia/OPTIMADE#259

@CasperWA
Copy link
Member

CasperWA commented Mar 4, 2020

I have just added tests (they should currently fail) for the include parameter to align with the results of the discussion in Materials-Consortia/OPTiMaDe#259

Maybe this is for a different PR - I will revert and create a new PR

@ml-evs
Copy link
Member Author

ml-evs commented Mar 4, 2020

I'm happy with this when you are @CasperWA, we can make the required changes later based on the spec-level decision.

@ml-evs ml-evs merged commit 98c9b4b into master Mar 4, 2020
@ml-evs ml-evs deleted the ml-evs/parser_issues branch March 4, 2020 18:58
@CasperWA CasperWA mentioned this pull request Mar 5, 2020
CasperWA added a commit that referenced this pull request Mar 6, 2020
Up to v0.6.0.

**New features**:
- GitHub Action validator that runs `optimade_validator` for a locally running OPTiMaDe server (#191, @CasperWA, tested by @ml-evs)
- Support filter queries for `HAS ALL`, `HAS ANY` and `HAS ONLY` and value lists on MongoDB backends (#173, @ml-evs)
  Note: `OPERATOR` use in value lists are still _not_ supported.
- Debug mode. Start the server in debug mode to enable `debug` log-level in `uvicorn` and get a full Python traceback in the JSON response (#190, @CasperWA)
- Add testing of mandatory `filter` queries to `optimade_validator` (#205, @ml-evs)

**Updates**:
- Allow Cross-Origin requests from anywhere (`*`), i.e., enable CORS for both servers (#194, @CasperWA)
- Updates to models (correct misspelling, more transparent model class naming, streamline models with respect to python class inheritance, update field descriptions) (#195, @CasperWA)
- API change: Rename `optimade.models.toplevel.py` to `optimade.models.responses.py` (#195, @CasperWA)
- Update dependencies to newest versions (as of 02.03.2020) (#202, #196, @CasperWA)
- Move imports from `starlette` to `fastapi`, where possible (#202, @ml-evs)
- Remove custom middleware to redirect slashed URLs in favor of `starlette` implementation (#202, @ml-evs)
- CI tests are now performed with a real MongoDB in the backend. CI tests are also performed with a `mongomock` backend for the tests in `server/test_middleware.py`, `server/test_server_validation.py`, and `server/test_config.py` (#196, @ml-evs, additional testing by @CasperWA )
- Remove `/optimade` from base URLs. This was especially important for the OpenAPI schema (#201, #216, @CasperWA, @ml-evs)
- Check integrity of query part of the raw URL using a custom middleware (#209, @CasperWA)
- New `optimade/server/exceptions.py` to contain custom `HttpException`s, moved `BadRequest` here (#209, @CasperWA)
- Pattern and regex testing for `data.available_api_versions` parts in `/info` endpoint and fix tests for the same (#211, @CasperWA)
- Restructure import of test data for regular server (#212, @shyamd)

**Bug fixes**:
- New retrieval URL for Materials-Consortia's list of providers (#187, @CasperWA)
- Skip local `HAS ONLY` tests with a `mongomock` backend, since v3.19.0 does not support these (#206, @ml-evs)
- Resource ID's can now contain slashes (`/`) (#183, @ml-evs, @CasperWA)
- Remove _valid_ version part of base URL in `meta.query.representation` (#201, @CasperWA)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Relationships don't work when "/" present in id
2 participants