Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent errors when using PSR-18 clients that do not automatically decode GZIP responses #38

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jasongill
Copy link

This commit removes the forced Accept-Encoding: gzip header, as not all PSR-18 clients automatically decode/gunzip the responses. With the forced header in place, the error Not a valid Json: Control character error, possibly incorrectly encoded will appear when calling $request->asString(), which is a core part of decoding JSON responses.

If you use Guzzle or another more fully-featured client, you likely won't see this error, but simple PSR-18 clients that don't automatically handle the responses like Guzzle does could run into issues with the gzip'ed responses.

It would be better to handle this Accept-Encoding logic in elastic/elastic-transport-php by adding some detection if the client automatically decodes responses (and if so, automatically adding the header there). Perhaps the PHP-HTTP Decoder Plugin would be a good place to look as well.

In any case, this package version 8.0+ do not work when httplug picks a simpler client that can't handle gzip automatically, so this change resolves that. This is not the perfect solution, as most clients probably do handle gzip'ed responses, but it is a "safer default" that allows all clients to work without special configuration.

If you are finding this PR because you are getting the Not a valid Json... error, it's related to this issue: elastic/elasticsearch-php#1213 which was resolved in the main elasticsearch-php package but hasn't been resolved in this package for Enterprise Search. The quick fix, if this PR is not merged, is to specify Guzzle as the default client which is currently documented here: https://github.com/elastic/enterprise-search-php#psr-18-http-library, as your install must be detecting another HTTP client library that doesn't have all of the features that Guzzle does, so setting that manually will force Guzzle to be used.

@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@cla-checker-service
Copy link

cla-checker-service bot commented Jun 20, 2023

💚 CLA has been signed

@ezimuel
Copy link
Contributor

ezimuel commented Aug 17, 2023

@jasongill thanks for the PR and sorry for my late reply. Instead of removing the gzip encodng that is the recommended way to connect to Enterprise Search I would suggest to add a Client::setGzipCompression(bool $set = true) function. By default, the compression is true and you can disable if your are using a PSR-18 client that does not support it. WDYT? Thanks.

@jasongill
Copy link
Author

@ezimuel That is not a bad idea, but it would also require adding an error handler to give a better warning (right now, incompatible clients just get "Not a valid Json: Control character error, possibly incorrectly encoded" error which doesn't really make much sense); and that might be a bit of work just to determine if the invalid JSON response is due to actually being invalid JSON, or if the response was gzip mangled. That could add more documentation, possible confusion, etc.

The "best" solution IMHO would be to rewrite the client code to use PHP-HTTP's decoder plugin which will decouple the gzip decoding from the PSR-18 client itself, which will guarantee that there's always a valid response.

An alternative solution might be to try to do the gzip decoding in this package itself; something like (pseudocode):

if ( ! valid_json($response) ) {
    $response = gunzip($response)
    if ( ! valid_json($response) ) throw new Exception("Invalid Json.....")
}

Or finally, a solution similar to #37 but made more extensible to have a "whitelist" of compatible clients would work, but would have its own possible maintenance issues (which I believe the PHP-HTTP project tries to resolve)

@ezimuel
Copy link
Contributor

ezimuel commented Aug 17, 2023

@jasongill thanks for the suggestions. The usage of PHP-HTTP's decoder plugin sounds interesting but this will require some time for refactoring and testing. Maybe this can be a long term investment for the future. In the meantime, I've found a potential solution to improve the error message for invalid gzip response.
I'm thinking to check the Content-Encoding: gzip response header and in this case of InvaldJsonException throw a more informative error message, something like this change in Response:

        try {
            $contentType = $this->response->getHeaderLine('Content-Type');
            if (strpos($contentType, 'application/json') !== false) {
                $this->asArray = JsonSerializer::unserialize($this->asString());
                return $this->asArray;
            }
            if (strpos($contentType, 'application/x-ndjson') !== false) {
                $this->asArray = NDJsonSerializer::unserialize($this->asString());
                return $this->asArray;
            }
            if (strpos($contentType, 'text/csv') !== false) {
                $this->asArray = CsvSerializer::unserialize($this->asString());
                return $this->asArray;
            }
        } catch (InvalidJsonException $e) {
            $contentEncoding = $this->response->getHeaderLine('Content-Encoding') ?? '';
            if (strpos($contentEncoding, 'gzip')  !== false) {
                throw new UnserializeException(
                    "The response is gzipped and it seems the HTTP client cannot handle it. " .
                    "Try to disable the gzip using Client::setGzipCompression(false)"
                );
            }
            throw $e;
        }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants