Make URI query decoding more robust #608

aferreira · 2011-01-07T00:16:56Z

With the current ES code, exceptions or infinite loops can be generated while parsing the URI query of a request.

This can be seen with the following examples:

curl -XGET localhost:9200/test/_analyze?t&text=this+is+a+test
# the exception stack trace shows up in logs

curl -XGET localhost:9200/test/_analyze?t1&t2&text=this+is+a+test
# never returns, never ends

The changes suggested take care of these issues.

Testing RestUtils.decodeQueryString("something", "something".indexOf('?') + 1, params); is not really checking decoding of an empty query. Instead, it is testing decoding of "something" as a query (because "something.index('?')+1" evaluates to 0). The parameter map is left empty because "malformed" pairs like "something" are currently skipped. Instead, this change modify this test to check the edge cases: + fromIndex >= queryString.length() + fromIndex < 0

The code of decodeQueryString() had some trouble with weird URLs: (1) an input like "uri?param&p=v" causes an exception to be thrown (2) an input like "uri?param1&param2" causes an infinite loop This could be verified against an ES server with requests like curl -XGET localhost:9200/test/_analyze?t&text=this+is+a+test # the exception stack trace shows up in logs curl -XGET localhost:9200/test/_analyze?t1&t2&text=this+is+a+test # never returns, never ends This change fixes these issues.

When writing tests for the fix of decodeQueryString() to handle gracefully edge cases like: &a &a&b it arises the question of what behavior is desirable for these pathological cases (vs the regular p=v pairs). This change just skips them which is consonant to the preexisting code. To be thorough, we add tests for a bunch of edge cases, like: QUERY PARSED PARAMS ? {} ?& {} ?= { "": "" } ?a {} ?p=v&a { "p": "v" } ?p=v&a&p1=v1 { "p": "v", "p1": "v1" } ?a&b {}

kimchy · 2011-01-07T12:35:48Z

Heya,

Thanks for the work!. I added it, and also improved a bit the logic to handle properly cases of empty names with no "=", as well as improve the perf for query component decoding (inspired by netty)

…lastic#608) Two changes: * Increase the bulk_indexing_clients to 100 in order to better maximize our throughput on a 3-node benchmark. * Scope time ranges in the solutions/security track to overlap with the presence of our static data. We have data in all 5 Beats indices dated from 2020-01-01 00:00:00 to at least 01:00:00. Currently, the query_max_date is 2020-01-02, so it would be very rarely that our random query duration lengths would overlap with the presence of actual data on the cluster.

Fixes elastic#608

Adriano Ferreira added 3 commits January 6, 2011 20:31

emilykmarx pushed a commit to emilykmarx/elasticsearch that referenced this pull request Dec 26, 2023

Ignore failures on master branch

9a7f1b5

Fixes elastic#608

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make URI query decoding more robust #608

Make URI query decoding more robust #608

aferreira commented Jan 7, 2011

kimchy commented Jan 7, 2011

Make URI query decoding more robust #608

Make URI query decoding more robust #608

Conversation

aferreira commented Jan 7, 2011

kimchy commented Jan 7, 2011