Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for POST + other improvements of the server #657

Merged
merged 8 commits into from Jul 4, 2022
Merged

Conversation

hannahbast
Copy link
Member

Currently, just a TODO, but this should be feasible with relatively few lines of code.

@joka921 joka921 changed the title Implement support for POST requests Implement support for POST requests #Qlever May 15, 2022
Hannah Bast added 3 commits June 30, 2022 15:07
1. There is now proper support for GET and POST queries. The POST
queries now also allow both variants specified in the SPARQL 1.1
standard: with a URL-encoded query string ("query=...") and with content
type "application/sparql-query" and just the unencoded SPARQL query in
the body of the POST request.

2. Improved the URL parsing helper functions along the way. Eventually,
this should be done with a propery library, but until we are there,
let's improve what we have.

3. Used the opportunity to add a command-line argument for specifying
the access token to be used for restricted API calls like
clear-cache-complete. This was simple enough and completes work from an
earlier PR.
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This already looks great, I only have a few smaller comments.

We need a Mocking framework for the Server code to write unit tests for it, but that's a story for another day:)

@@ -62,6 +63,9 @@ int main(int argc, char** argv) {
"The basename of the index files (required).");
add("port,p", po::value<int>(&port)->required(),
"The port on which HTTP requests are served (required).");
add("access-token,a",
po::value<std::string>(&accessToken)->default_value("CHANGE_ME"),
"Access token for restricted API calls.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really ok to have the "restricted" API calls via the default "CHANGE_ME" token, when somebody forgets to set the token?

auto filenameAndParams = [&request]() {
if (request.method() == http::verb::get) {
// For a GET request, `request.target()` yields the part after the domain,
// which is a concatenation of the path and the query string (the query
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me as a human the usage of the term query (or query string) for
the GET parameters with ? and & connections is a little bit confusing, because we use query typically as a synonym for SPARQL query`

Comment on lines 120 to 121
// auto filenameAndParams =
// ad_utility::UrlParser::parseTarget(request.target());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be deleted.

LOG(DEBUG) << "Request method: \"" << request.method() << "\""
<< ", target: \"" << request.target() << "\""
<< ", body: \"" << request.body() << "\"" << std::endl;
auto filenameAndParams = [&request]() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lambda (logic + error handling + Comments) has become so long that it can become a separate (static) function of the Server class,
As it also takes only one parameter (the request).

1. If something goes wrong during query processing as far the URL
parameters is concerned, the error message in the log and the error
message returned to the client are now in sync.

2. Better logging also at other places, in particular, when processing
the commands.

3. If the server is started without --access-token, there is now no
access to the restricted API calls at all (no matter which value is
provided in the URL parameters for access-token).

4. When an access token is provided and it doesn't match or the server
was started without --access-toknen, the whole request is now ignored.
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine, as soon as the catch-co_await thing is fixed, we can merge it.

src/engine/Server.cpp Show resolved Hide resolved
1. Now avoid co_await in catch block because it's against the C++
standard.
2. Also log the query processing time because it was simple enough and
somewhat connected (request logging was improved at various places)
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!
One line which I still think can be removed, but otherwise this looks good to go.
Does this now yield the expected responses when sending the wrong Content-type?

Comment on lines 64 to 65
std::optional<std::string> exceptionErrorMsg;
std::optional<http::response<http::string_body>> badRequestResponse;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the badRequestResponse variable unused? If so, remove it!

co_return co_await send(std::move(response));
};
// Process the request using the `process` method and if it throws an
// exception, log the error message and send a HTTP/1.1 404 Bad Request
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 400 is Bad Request and 404 is not found.

Note: The log was quite inconsistent regarding what it logs and what it
sends to the server. Made that more consistent now. Also logging some
more relevant information to the INFO log, like the total time for
processing a query or the basic parameters of a request (method, content
type, accept hearders). I tested it for a number of queries and it
worked alright.
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my standpoint, this may now be merged!

1. When access to a URL parameter is denied or when there are URL
parameters that were not processed, now abort the query processing with
a 400 error. In particular, this has the consequence that a file is only
served if there are no URL parameters (which makes sense).

2. In processQuery, now do the query planning *after* determining the
media type. Before it could happen that the query planning took some
time and then the processing aborted because of an invalid media type,
which didnt't make sense. Log the time required for the query planning.

3. Support an explicit "ping" action with an optional message. This is
useful for the qlever script and beyond. For example, the ping from the
qlever script is now logged in a way that we can see that it came from
the qlever script.

4. Make sparql-results+json the default media type, as required by the
standard. This is possible now because the latest version of the QLever
UI explicitly asks for qlever-results+json for every SPARQL query it
sends.

5. ALso log the size of the result when it has been computed. This
useful information was missing from the INFO log so far.

6. Minor reformatting of a few strangely formatted comments that I
encountered on the way.
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please merge it, but don't add additional features.
In the future please split different issues into different PRs, I identify at least:

  • Implementation of POST
  • Implementation of access tokens
  • Improved logging (in various places unrelated to the above)
  • more consistent error handling in the Server.

@hannahbast hannahbast changed the title Implement support for POST requests #Qlever Support for POST + various improvements of the server Jul 4, 2022
@hannahbast hannahbast changed the title Support for POST + various improvements of the server Support for POST + other improvements of the server Jul 4, 2022
@hannahbast hannahbast merged commit 4466921 into master Jul 4, 2022
@hannahbast
Copy link
Member Author

Please merge it, but don't add additional features. In the future please split different issues into different PRs, I identify at least:

  • Implementation of POST
  • Implementation of access tokens
  • Improved logging (in various places unrelated to the above)
  • more consistent error handling in the Server.

Thanks, but I am not totally convinced, let's talk about it later. The size of this PR is average for our PRs. I know that it started out as only adding POST. Then it turned out to become a more general improve "process" and "processQuery" because the issues I eventually dealt with (which you mention in your list) are relatively tightly connected. The --access-token was just a small leftover from a previous PR.

@hannahbast hannahbast deleted the http-post branch August 26, 2022 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants