Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make URI query decoding more robust #608

Closed
wants to merge 3 commits into from

Conversation

aferreira
Copy link
Contributor

With the current ES code, exceptions or infinite loops can be generated while parsing the URI query of a request.

This can be seen with the following examples:

curl -XGET localhost:9200/test/_analyze?t&text=this+is+a+test
# the exception stack trace shows up in logs

curl -XGET localhost:9200/test/_analyze?t1&t2&text=this+is+a+test
# never returns, never ends

The changes suggested take care of these issues.

Adriano Ferreira added 3 commits January 6, 2011 20:31
Testing

    RestUtils.decodeQueryString("something", "something".indexOf('?') + 1, params);

is not really checking decoding of an empty query.
Instead, it is testing decoding of "something"
as a query (because "something.index('?')+1" evaluates to 0).
The parameter map is left empty because
"malformed" pairs like "something" are currently skipped.

Instead, this change modify this test to check the edge cases:

+ fromIndex >= queryString.length()
+ fromIndex < 0
The code of decodeQueryString() had some trouble with weird URLs:

(1) an input like "uri?param&p=v" causes an exception to be thrown
(2) an input like "uri?param1&param2" causes an infinite loop

This could be verified against an ES server with requests like

    curl -XGET localhost:9200/test/_analyze?t&text=this+is+a+test
    # the exception stack trace shows up in logs

    curl -XGET localhost:9200/test/_analyze?t1&t2&text=this+is+a+test
    # never returns, never ends

This change fixes these issues.
When writing tests for the fix of decodeQueryString() to
handle gracefully edge cases like:

    &a
    &a&b

it arises the question of what behavior is desirable
for these pathological cases (vs the regular p=v pairs).
This change just skips them which is consonant
to the preexisting code.

To be thorough, we add tests for a bunch of edge cases, like:

    QUERY           PARSED PARAMS

    ?               {}
    ?&              {}
    ?=              { "": "" }
    ?a              {}
    ?p=v&a          { "p": "v" }
    ?p=v&a&p1=v1    { "p": "v", "p1": "v1" }
    ?a&b            {}
@kimchy
Copy link
Member

kimchy commented Jan 7, 2011

Heya,

Thanks for the work!. I added it, and also improved a bit the logic to handle properly cases of empty names with no "=", as well as improve the perf for query component decoding (inspired by netty)

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Oct 2, 2023
…lastic#608)

Two changes:

 *  Increase the bulk_indexing_clients to 100 in order to better maximize our throughput on a 3-node benchmark.
  * Scope time ranges in the solutions/security track to overlap with the presence of our static data. We have data in all 5 Beats indices dated from 2020-01-01 00:00:00 to at least 01:00:00. Currently, the query_max_date is 2020-01-02, so it would be very rarely that our random query duration lengths would overlap with the presence of actual data on the cluster.
emilykmarx pushed a commit to emilykmarx/elasticsearch that referenced this pull request Dec 26, 2023
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants