Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter query parameters of old AiiDA REST API incompatible with OpenAPI/fastapi standard #16

Open
ltalirz opened this issue Jun 2, 2021 · 9 comments

Comments

@ltalirz
Copy link
Member

ltalirz commented Jun 2, 2021

It seems that, in general, the exact structure of the query string is not standardized.
For example, RFC3986 only mentions that

query components are often used to carry identifying information in the form of "key=value" pairs

Similar the latest RFC8820.

However, there is a W3C recommendation for encoding form data (application/x-www-form-urlencoded):

The name is separated from the value by = and name/value pairs are separated from each other by &

While the OpenAPI standard is not explicit on the name-value separator, all of the examples in the spec use =.

As pointed out by @chrisjsewell, fastapi relies on starlette which internally relies on urllib's parse_qsl function for parsing the query parameters. parse_qsl (part of the python standard library) also hardcodes the = separator.

This clashes with the filter operators from the aiida-core rest api, which allow e.g. for queries like

http://localhost:5000/api/v4/nodes?mtime>=2019-04-23
@ltalirz
Copy link
Member Author

ltalirz commented Jun 2, 2021

One way around it could be to introduce a filter parameter (similar to how the optimade API works), e.g.

http://localhost:5000/api/v4/nodes?filter="mtime>=2019-04-23"

It seems to me that this should work in principle, as the openapi standard states

Additionally, the allowReserved keyword specifies whether the reserved characters :/?#[]@!$&'()*+,;= in parameter values are allowed to be sent as they are, or should be percent-encoded. By default, allowReserved is false, and reserved characters are percent-encoded.

However, looking at the string that is fed into parse_qsl, I notice that only the quotation marks are encoded [1], not the values:

# GET http://127.0.0.1:8000/users?id="1eee/4"
# string arriving at parse_qsl: id=%221eee/4%22
INFO:     127.0.0.1:56380 - "GET /users?id=%221eee/4%22 HTTP/1.1" 200 OK
# GET http://127.0.0.1:8000/users?id="123&23"
# string arriving at parse_qsl: id=%22123&23%22
INFO:     127.0.0.1:56308 - "GET /users?id=%22123&23%22 HTTP/1.1" 200 OK

This means applying parse_qsl on that string leads to an unwanted split at the & symbol.

This seems incorrect to me, as the docs of parse_qsl explicitly state:

qs: percent-encoded query string to be parsed

I wonder whether there is a way to let the encoded values arrive there...

traceback

  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 398, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/fastapi/applications.py", line 199, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/routing.py", line 580, in __call__
    await route.handle(scope, receive, send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/routing.py", line 241, in handle
    await self.app(scope, receive, send)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/routing.py", line 52, in app
    response = await func(request)
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/fastapi/routing.py", line 191, in app
    solved_result = await solve_dependencies(
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/fastapi/dependencies/utils.py", line 559, in solve_dependencies
    dependant.query_params, request.query_params
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/requests.py", line 107, in query_params
    self._query_params = QueryParams(self.scope["query_string"])
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/datastructures.py", line 404, in __init__
    parse_qsl(value.decode("latin-1"), keep_blank_values=True), **kwargs
  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/urllib/parse.py", line 746, in parse_qsl

Ok, I see that at

  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/fastapi/routing.py", line 191, in app
    solved_result = await solve_dependencies(

the query parameter is actually still percent-encoded.
But no longer in

  File "/Users/leopold/Applications/miniconda3/envs/aiida/lib/python3.9/site-packages/starlette/requests.py", line 107, in query_params
    self._query_params = QueryParams(self.scope["query_string"])

[1] The quotation marks are actually already encoded when I just copy the URL from the URL field of the web browser into this text box, so perhaps that happens on the browser level.

@chrisjsewell
Copy link
Member

Oh yeh indeed, its a bit of a pain to use as a "client", you can see here where I have to do "partial" encoding to get it to work:

https://github.com/chrisjsewell/react-aiida-app/blob/6722e8e4dea489315064fe06f94c8cf3015ebde0/src/clients/aiidaClient.ts#L88-L91

So yeh, in terms of not breaking too much from the existing API, I can see why we want to implement this filter parameter, but... from a top-level / client perspective, I feel I would just generally avoid it, and use instead the "more structured" querybuilder/graphql endpoint

@ltalirz
Copy link
Member Author

ltalirz commented Jun 3, 2021

I just discovered that fastapi has built-in support for parsing the starlette request object manually: https://fastapi.tiangolo.com/advanced/using-request-directly/

If I understand correctly, one can even still have automatic parsing & docs for all of the fields that follow the key=value logic.

@chrisjsewell
Copy link
Member

It would also mean that if you get data from the Request object directly (for example, read the body) it won't be validated, converted or documented (with OpenAPI, for the automatic API user interface) by FastAPI.

Interesting cheers, although... it sounds like that removes a lot of the features of fastapi

@ltalirz
Copy link
Member Author

ltalirz commented Jun 3, 2021

If I understand correctly, then this is a limitation of the OpenAPI standard: query parameters can only have a value, there is no concept of different relations (smaller than, ...) between query parameters and their value other than equality; i.e. there is nothing fastapi can do about it.

This would mean that the filter parameters would be missing e.g. from the interactive client built into the docs; on the other hand it would potentially allow us to be fully backward compatible, which is worth something.

@chrisjsewell
Copy link
Member

Well it would not just be the filter parameters, it would be every parameter, wouldn't it?
i.e once you have take the onus to parse the query string then fastapi will have no idea what any of it is

@chrisjsewell
Copy link
Member

On a high-level note, it would be good to have an idea of who is already using the REST API, and so who this would affect (just Mat Cloud or lots of others). It does clearly state when you start it presently that it should not be used for production use, but then it also has been around for many years.

@ltalirz
Copy link
Member Author

ltalirz commented Jun 3, 2021

Well it would not just be the filter parameters, it would be every parameter, wouldn't it?

I don't think so - as mentioned in the page I linked, you can simply declare any of the parameters that use the = as usual, and then do the manual parsing of the Request to extract the rest.
(one would need to check that the parsing works for the "mixed" query strings, but I'm optimistic since it should split on & which should work nevertheless)

On a high-level note, it would be good to have an idea of who is already using the REST API, and so who this would affect (just Mat Cloud or lots of others).

That is a good point. So far I know of

There may be a little more, and we should ask around in a few days once we have a clearer idea of what we could realistically do. I do suspect that the migration of the aiida-explorer would be the largest task.

It does clearly state when you start it presently that it should not be used for production use, but then it also has been around for many years.

Well, that is just for the server that ships with flask.
On Materials Cloud we run it with apache via mod_wsgi.

@chrisjsewell
Copy link
Member

one would need to check that the parsing works for the "mixed" query strings, but I'm optimistic since it should split on & which should work nevertheless

well I will leave you to figure that out for now, because yeh, from a selfish viewpoint, if I can get the graphql working nicely, I will just look to switch to that for all client-side querying 😝

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants