Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add server check for typos in query parameters #1120

Closed
JPBergsma opened this issue Apr 11, 2022 · 9 comments · Fixed by #1122
Closed

Add server check for typos in query parameters #1120

JPBergsma opened this issue Apr 11, 2022 · 9 comments · Fixed by #1122
Labels
enhancement New feature or request ergonomics Features that improve the usability of the package server Issues pertaining to the example server implementation suggestions

Comments

@JPBergsma
Copy link
Contributor

I have already had several times that I made a type in the URL, and it took me quite some time to figure out what I did wrong.
See for example issue #1076
Some other wrong URLs that I have "tried":
https://optimade.herokuapp.com/v1/structures/filter=nelements=2
https://optimade.herokuapp.com/v1/structures?filter=nelements=2&response_field=cartesian_site_positions
Neither of these URLs will give you an error, but the results are definitively not what you want.
Therefore, I think it would be good to add stricter checking of the input URLs.

@JPBergsma JPBergsma changed the title Add Checks for typos in url's Add Checks for typos in URLs Apr 11, 2022
@CasperWA
Copy link
Member

How exactly do you intend to achieve this checking? I think the main point of the API is to facilitate machine-to-machine interaction, where typos would be a thing of a badly written client (or similar).
Furthermore, I think we have to draw the line of "hand-holding" users at some point?

However, if it's simply a check for the tests within our code, I guess that makes sense - but otherwise, I think we should respect the specification, which states to ignore query parameters that are not OPTIMADE-specific. If you really want, we could return a nicer 404 for the misspelled paths, but it should already be included in the error response's metadata.

In other words, I don't think this is an issue of the package, but rather of the users, and not something we should spend energy on - but I'm open for suggestions on how you intend to improve error responses.

@ml-evs
Copy link
Member

ml-evs commented Apr 14, 2022

How exactly do you intend to achieve this checking? I think the main point of the API is to facilitate machine-to-machine interaction, where typos would be a thing of a badly written client (or similar). Furthermore, I think we have to draw the line of "hand-holding" users at some point?

I don't agree that adding a warning (or an error) on invalid query parameters is hand-holding. We should be able to do this easily with a validator on the query params FastAPI model.

However, if it's simply a check for the tests within our code, I guess that makes sense - but otherwise, I think we should respect the specification, which states to ignore query parameters that are not OPTIMADE-specific. If you really want, we could return a nicer 404 for the misspelled paths, but it should already be included in the error response's metadata.

Not sure this is true? This actually seems missing from the spec. The only mention I can find is specifically for the single entry listing endpoint:

[5.4.1 Single Entry URL Query Parameters]
The client MAY provide a set of additional URL query parameters for this endpoint type. URL query parameters not recognized MUST be ignored.

Elsewhere it says:

Additional OPTIONAL URL query parameters not described above are not considered to be part of this standard, and are instead considered to be "custom URL query parameters". These custom URL query parameters MUST be of the format "<url_query_parameter_name>". These names adhere to the requirements on implementation-specific query parameters of JSON API v1.0 since the database-provider-specific prefixes contain at least two underscores (a LOW LINE character '_').

I think we can emit a warning and still be spec compliant. The spec could even be strengthened to error in this scenario, IMO (so the handling of custom fields and params is unified).

In other words, I don't think this is an issue of the package, but rather of the users, and not something we should spend energy on - but I'm open for suggestions on how you intend to improve error responses.

I think I simply disagree! Maybe I am biased by having to run the API tutorials, many people write code that calls the APIs from scripts etc.

@JPBergsma
Copy link
Contributor Author

According to the JSON API specification using the "filer" query parameter, as mentioned in #1076, MUST result in a 400 BAD REQUEST message, as any custom query parameter should have at least one character that is not a lower case alphabetical character (a-z).

I'll try to create a quick check over the weekend. I think I can extract the query parameter keys from the request and then check whether they are in a list of supported query parameters.

Do you think the following options are good behaviour for unknown parameters?

If the parameter does not have a database_specific prefix or the prefix of the current database, throw a 400 Bad Request error.
If the prefix is unknown a warning will be given.
And If the parameter has a known database prefix, it will be ignored.

@ml-evs
Copy link
Member

ml-evs commented Apr 14, 2022

If the parameter does not have a database_specific prefix or the prefix of the current database, throw a 400 Bad Request error. If the prefix is unknown a warning will be given. And If the parameter has a known database prefix, it will be ignored.

I wouldn't worry about the unknown prefix part (if it has any prefix just ignore it), but otherwise this sounds good.

I had a quick play over lunch to do this via the query param classes themselves, but the way FastAPI does dependency injection, this was a bit tricky. Instead you can go via the raw request and compare with the fields of the query param class (might be a bit tricky). Surprised there isn't a supported way of doing this easily really.

@ml-evs ml-evs added enhancement New feature or request server Issues pertaining to the example server implementation suggestions ergonomics Features that improve the usability of the package labels Apr 18, 2022
@CasperWA
Copy link
Member

I think I simply disagree! Maybe I am biased by having to run the API tutorials, many people write code that calls the APIs from scripts etc.

Fair enough :)

I just think this is more of a hint that a proper "nice" client is needed. Either a CLI or a simple Python class client.

We should most likely not start erroring if a query parameter given is badly formed, since it may be that another sub-server is attached to the current FastAPI server with some middleware that uses this supposedly badly formed query parameter or similar? Hmm, however, a warning could be all right, although that may end up being a noisy, incorrect warning in the same case that would then have to be ignored. Versus other warnings, which might shouldn't be ignored - confusing users.

I'd just simply ignore it and try to focus more on making nicer clients that provides users with better tools to query OPTIMADE APIs with less risk of typing errors in the URLs.
But again, if the fix is simple - and non-disturbing for advanced setups - I guess it's fine. This is, after all, just an example server, not something that should be used 1:1 (even though it is and it'd be nice if it could stay like that for all sorts of weird use cases) :)

@JPBergsma
Copy link
Contributor Author

I just think this is more of a hint that a proper "nice" client is needed. Either a CLI or a simple Python class client.

Good point, if there is a good library for creating OPTIMADE queries, far fewer people would need to write their own client software to do this.

Do you know if there are databases that use database specific query parameters that are not declared in the SingleEntryQueryParams or the EntryListingQueryParams classes? And could it be a problem to define them there ?

It would be elegant to have all the query params declared in one place. (Now we are still handling the api_hint elsewhere)

@ml-evs
Copy link
Member

ml-evs commented Apr 19, 2022

I just think this is more of a hint that a proper "nice" client is needed. Either a CLI or a simple Python class client.

Not going to disagree there :)

We should most likely not start erroring if a query parameter given is badly formed, since it may be that another sub-server is attached to the current FastAPI server with some middleware that uses this supposedly badly formed query parameter or similar? Hmm, however, a warning could be all right, although that may end up being a noisy, incorrect warning in the same case that would then have to be ignored. Versus other warnings, which might shouldn't be ignored - confusing users.

This is a good point (especially re:NOMAD). In my review of #1122 I suggested that this feature is optional and off by default.

I'd just simply ignore it and try to focus more on making nicer clients that provides users with better tools to query OPTIMADE APIs with less risk of typing errors in the URLs.

Sure, I'm hoping to find time to add one to this package before the next workshop, but that time is quickly dissipating... I am involved with quite a few use cases where people are directly querying OPTIMADE URLs themselves (as code developers) where using even a Python API might not be possible. Examples:

  • Jmol is adding support for OPTIMADE directly (fancy writing a Java client 😅 ?)
  • Xerus is an automatic XRD refinement library that looks up existing crystal structures, this is one case where a Python client might be useful, but it has to be compatible with the GSAS-II Python library which has various awkward dependencies
  • Materials Resource Registry demo

I'm thinking the best idea would be a very minimal client library with only requests or httpx as a dependency, separate to this package (or namespaced), and then this package can provide additional routines for deserializing/validating the JSON responses.

@CasperWA
Copy link
Member

I'm thinking the best idea would be a very minimal client library with only requests or httpx as a dependency, separate to this package (or namespaced), and then this package can provide additional routines for deserializing/validating the JSON responses.

This is what I've been thinking as well.

I've already managed to create a couple of different clients (and other related software in the same definition-sphere). But a simple requests/httpx-based Python class client would be great - it should in theory utilize the pydantic models, but other than that it shouldn't have any other dependencies.

Another alternative is to use the Typer framework for a CLI or similar, which should be an "easy" way to utilize the pydantic models for a CLI?

Personally I'd prefer the simple Python class client for now, though.

@ml-evs
Copy link
Member

ml-evs commented Apr 19, 2022

I'll see what I can come up with in May before the workshop. (Are you coming this year?)

@ml-evs ml-evs linked a pull request Apr 30, 2022 that will close this issue
@ml-evs ml-evs changed the title Add Checks for typos in URLs Add server check for typos in query parameters Apr 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request ergonomics Features that improve the usability of the package server Issues pertaining to the example server implementation suggestions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants