Extract query and big literals parameters HTTP 414 #794

Closed
AndreaDiPietro-ADP opened this issue May 31, 2020 · 15 comments


@AndreaDiPietro-ADP
Contributor

An Extract query puts its literals in the query string, which can cause an HTTP 414 (URI Too Long) in some real-world use cases.
I made this Solarium plugin to solve that. If you want, you can include it in the Solarium code base.
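
To illustrate the problem: Solr's extracting handler takes each literal as a `literal.<fieldname>` request parameter, so large field values grow the URL directly. The sketch below uses plain PHP only, no Solarium classes. The field names, the text, and the 8192-byte limit are assumptions for illustration; the real limit depends on how the servlet container or proxy is configured.

```php
<?php
// Hypothetical literals for an extract request; names and sizes are made up.
$params = [
    'literal.id' => 'doc-1',
    'literal.description' => str_repeat('lorem ipsum ', 1000), // 12,000 bytes before encoding
    'commit' => 'true',
];

// Encode the parameters the way they would appear in the URL.
$queryString = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

// Assumed limit, for illustration only.
$assumedLimit = 8192;

$tooLong = strlen($queryString) > $assumedLimit;
echo $tooLong
    ? "query string exceeds the assumed limit\n"
    : "query string fits within the assumed limit\n";
```

With a single 12 KB literal, the encoded query string is already well past the assumed limit, which is the situation that leads to a 414.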

@mkalkbrenner
Member

It looks good :-)

If you adapt it for the master branch (upcoming 6.0.0) and add a test, we can merge it.

@thomascorthals
Member

thomascorthals commented Jun 1, 2020

@mkalkbrenner Should this be integrated into the existing PostBigRequest plugin, as another branch alongside this if statement?

if (Request::METHOD_GET == $request->getMethod() &&
    strlen($queryString) > $this->getMaxQueryStringLength()) {

Maybe move the actual heavy lifting to two separate protected functions. That allows one to extend the plugin and override either of those (but not necessarily both) if that happens to suit one's needs.
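
A minimal sketch of that idea, assuming PHP 7.4 or later. Everything here is hypothetical: `SketchRequest`, `SketchPostBigPlugin`, and both protected method names are stand-ins invented for illustration, not Solarium's classes or the eventual implementation. It only shows the shape: one length check, two branches, and one overridable method per branch.

```php
<?php
// Stand-in for a request object; NOT Solarium's Request class.
final class SketchRequest
{
    public string $method = 'GET';
    public array $params = [];      // parameters destined for the query string
    public array $bodyParams = [];  // parameters moved into the request body
    public ?string $rawBody = null;
    public string $contentType = '';
    public bool $isExtract = false; // set by the caller, not guessed from the handler name
}

class SketchPostBigPlugin
{
    private int $maxQueryStringLength;

    public function __construct(int $maxQueryStringLength = 1024)
    {
        $this->maxQueryStringLength = $maxQueryStringLength;
    }

    public function adapt(SketchRequest $request): void
    {
        $queryString = http_build_query($request->params);
        if (strlen($queryString) <= $this->maxQueryStringLength) {
            return; // short enough, leave the request untouched
        }

        if ('GET' === $request->method) {
            $this->convertGetToPost($request, $queryString);
        } elseif ($request->isExtract) {
            $this->moveExtractParamsToBody($request);
        }
    }

    // Override point 1: a long GET becomes a form-encoded POST.
    protected function convertGetToPost(SketchRequest $request, string $queryString): void
    {
        $request->method = 'POST';
        $request->rawBody = $queryString;
        $request->contentType = 'application/x-www-form-urlencoded';
        $request->params = [];
    }

    // Override point 2: an extract request is already a POST, so only the
    // parameters move. How they are encoded in the body is left open here.
    protected function moveExtractParamsToBody(SketchRequest $request): void
    {
        $request->bodyParams = $request->params;
        $request->params = [];
    }
}

// Usage: a long GET is converted, a long extract POST keeps its method.
$get = new SketchRequest();
$get->params = ['q' => str_repeat('a', 2000)];

$extract = new SketchRequest();
$extract->method = 'POST';
$extract->isExtract = true;
$extract->params = ['literal.description' => str_repeat('b', 2000)];

$plugin = new SketchPostBigPlugin(1024);
$plugin->adapt($get);
$plugin->adapt($extract);

echo $get->method, ' ', count($extract->bodyParams), "\n";
```

A subclass could then override only `moveExtractParamsToBody()` and inherit the GET handling unchanged, or the other way around.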


Detecting an extract query by the handler isn't 100% reliable. You can configure Solr with a different requestHandler for extracting.


If you need something other than UTF-8, you can use ->setInputEncoding() on the query object. PostBigRequest retrieves it like this with a default fallback:

$charset = $request->getParam('ie') ?? 'utf-8';
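
As a small illustration of that fallback, the sketch below mimics a `getParam()` that returns null for an unset parameter. The two functions are hypothetical helpers written for this example, and using the charset in a `Content-Type` header is an assumption about where the value ends up.

```php
<?php
// Hypothetical helper: returns null when the parameter is not set.
function getParam(array $params, string $name): ?string
{
    return $params[$name] ?? null;
}

// Assumed use of the charset: building a Content-Type header value.
function contentTypeFor(array $params): string
{
    // ?? only falls back when the left side is null (or unset).
    $charset = getParam($params, 'ie') ?? 'utf-8';

    return 'application/x-www-form-urlencoded; charset=' . $charset;
}

$default = contentTypeFor([]);
$latin1 = contentTypeFor(['ie' => 'iso-8859-1']);

echo $default, "\n", $latin1, "\n";
```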

@thomascorthals
Member

If you need something other than UTF-8, you can use ->setInputEncoding() on the query object.

That's new for the upcoming 6.0.0, by the way. It won't work if you need to backport to a 5.x release.

@wickedOne
Collaborator

Is there a specific reason PostBigRequest is a separate plugin and not a feature of the request itself?

@mkalkbrenner
Member

@wickedOne I assume that a "plugin" forces people to think about its configuration. To avoid the plugin, you could also modify the configuration of your servlet container, which is Jetty in most Solr installations.
Also, POST could be blocked by some setups.
In general, switching to POST should be avoided wherever possible because you bypass all caches.

@wickedOne
Collaborator

@mkalkbrenner Thanks for explaining. I wasn't aware Solr caches were sensitive to the request method.

@mkalkbrenner
Member

From the Solr documentation:

Solr only emits cache header elements for GET and HEAD requests. The HTTP standard does not allow cache related headers for POST requests.

@thomascorthals
Member

Solr's documentation isn't entirely accurate. (You might have noticed I have a penchant for exact docs.)

For HTTP/1.0, RFC 1945 states:

Applications must not cache responses to a POST request because the
application has no way of knowing that the server would return an
equivalent response on some future request.

For HTTP/1.1, RFC 7231 states:

Responses to POST requests are only cacheable when they include
explicit freshness information (see Section 4.2.1 of [RFC7234]).
However, POST caching is not widely implemented.

It would be more precise to say that the HTTP/1.0 standard doesn't allow it. And because "Solr does everything to avoid such problems because it emits HTTP 1.0 and HTTP 1.1 compliant HTTP headers", it doesn't emit cache headers for POST requests.

@mkalkbrenner
Member

@thomascorthals Thanks for the clarification.
But I think that my statement remains valid for Solarium:

In general switching to POST should be avoided wherever possible because you (might) bypass all caches.

@thomascorthals
Member

@mkalkbrenner I agree with your statement. We shouldn't switch to POST automatically if the behaviour isn't identical to GET.

I just wanted to provide some context for the way Solr does things.

@thomascorthals
Member

Just a thought: the reason why we don't switch to POST automatically without PostBigRequest is because it changes caching behaviour. However, an Extract query is already a POST. Couldn't we always put all parameters in the request body instead of the query string?

@mkalkbrenner
Member

I don't think so. It is also common to run Solr behind a caching reverse proxy. Removing the GET parameters would lead to false cache hits.
I assume that Solr's HTTP cache (which is not enabled by default) only looks at the GET parameters to distinguish between different searches.
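
To make the false-hit concern concrete, here is a toy sketch of a cache that derives its key from the path and query string only. The `cacheKey()` function and the paths are hypothetical; real proxies and Solr's own HTTP caching may build keys differently, and many caches do not store POST responses at all.

```php
<?php
// Toy cache key: path plus sorted query parameters, ignoring the request body.
function cacheKey(string $path, array $queryParams): string
{
    ksort($queryParams);

    return $path . '?' . http_build_query($queryParams);
}

// Parameters in the query string: different searches get different keys.
$fooKey = cacheKey('/solr/core/select', ['q' => 'title:foo']);
$barKey = cacheKey('/solr/core/select', ['q' => 'title:bar']);

// Same searches with the parameters moved into the body: the keys collide.
$fooBodyKey = cacheKey('/solr/core/select', []);
$barBodyKey = cacheKey('/solr/core/select', []);

var_dump($fooKey === $barKey);         // bool(false)
var_dump($fooBodyKey === $barBodyKey); // bool(true)
```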

@mkalkbrenner
Member

@AndreaDiPietro-ADP Could you open a PR?

Within the PR a test and some documentation (and an example) should be added.

@AndreaDiPietro-ADP
Contributor Author

@mkalkbrenner
PR opened.

@thomascorthals
Member

The PostBigExtractQuery plugin was released as part of Solarium 6.1.5.
