Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct warning header to be compliant #23275

Merged
merged 16 commits into from
Feb 27, 2017

Conversation

jasontedor
Copy link
Member

@jasontedor jasontedor commented Feb 20, 2017

The warning header used by Elasticsearch for delivering deprecation warnings has a specific format (RFC 7234, section 5.5). The format specifies that the warning header should be of the form

warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various meanings. The warn-agent is a string used to identify the source of the warning (either a host:port combination, or some other identifier). The warn-text is quoted string which conveys the semantic meaning of the warning. The warn-date is an optional quoted date that can be in a few different formats.

This commit corrects the warning header within Elasticsearch to follow this specification. We use the warn-code 299 which means a "miscellaneous persistent warning." For the warn-agent, we use the version of Elasticsearch that produced the warning. The warn-text is unchanged from what we deliver today, but is wrapped in quotes as specified (this is important as a problem that exists today is that multiple warnings can not be split by comma to obtain the individual warnings as the warnings might themselves contain commas). For the warn-date, we use the RFC 1123 format.

Closes #22986

The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form

    warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.

This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.
WARNING_FORMAT,
formattedMessage,
DateTimeFormatter.RFC_1123_DATE_TIME.format(ZonedDateTime.now(GMT)));
assert WARNING_HEADER_PATTERN.matcher(warningValue).matches();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also test that the format is correct in our test infra, maybe this assertion is not needed then?

Copy link
Member Author

@jasontedor jasontedor Feb 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that it caught an issue I fixed in 0d5310e, I'd prefer that it remain. The issue was the double quotes in the deprecation message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious on how that came up. I guess through an IT test rather than a unit test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to keep it as we rely on it for deduping. stronger yet - I think we should verify we extract the value

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the basics are good (and it's always fun to go read the spec of HTTP related stuff). I think we missed encoding / decoding slashes and quotes in the message text.

@@ -112,6 +112,38 @@ public void deprecated(String msg, Object... params) {
deprecated(THREAD_CONTEXT, msg, params);
}

/*
* RFC7234 specifies the warning format as warn-code <space> warn-agent <space> "warn-text" [<space> "warn-date"]. Here, warn-code is a
* three-digit number with various standard warn codes specified. The warn code 299 is apt for our purposes as it represents a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you choose 299 over 199?

Copy link
Member Author

@jasontedor jasontedor Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To indicate that this is a persistent warning as opposed to a transient warning (e.g., a warning that could expire after time elapses, or on cache invalidation).

public static Pattern WARNING_HEADER_PATTERN = Pattern.compile(
"299 " + // warn code
"Elasticsearch-\\d+\\.\\d+\\.\\d+(?:-(?:alpha|beta|rc)\\d+)?(?:-SNAPSHOT)?/(?:[a-f0-9]{7}|Unknown) " + // warn agent
"\"([^\"]*)\" " + // quoted warning value, captured
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't account for slash escaping.

WARNING_FORMAT,
formattedMessage,
DateTimeFormatter.RFC_1123_DATE_TIME.format(ZonedDateTime.now(GMT)));
assert WARNING_HEADER_PATTERN.matcher(warningValue).matches();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to keep it as we rely on it for deduping. stronger yet - I think we should verify we extract the value

void deprecated(Set<ThreadContext> threadContexts, String message, Object... params) {
Iterator<ThreadContext> iterator = threadContexts.iterator();
void deprecated(final Set<ThreadContext> threadContexts, final String message, final Object... params) {
final Iterator<ThreadContext> iterator = threadContexts.iterator();

if (iterator.hasNext()) {
final String formattedMessage = LoggerMessageFormat.format(message, params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also need to escape quotes and slashes in the text, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit that does this.

assert value != null;

final Map<String, List<String>> newResponseHeaders = new HashMap<>(this.responseHeaders);
final List<String> existingValues = newResponseHeaders.get(key);

if (existingValues != null) {
if (existingValues.contains(value)) {
final Set<String> existingUniqueValues = existingValues.stream().map(uniqueValue).collect(Collectors.toSet());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert that existingUniqueValues have the same length as existingValues? we should use deduping consistently...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit that does this.

@@ -178,6 +184,22 @@ public void testResponseHeaders() {
threadContext.addResponseHeader("foo", "bar");
}

final String now = DateTimeFormatter.RFC_1123_DATE_TIME.format(ZonedDateTime.now(ZoneId.of("GMT")));
final Function<String, String> deduplicator = v -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we make this a static value somewhere in DeprecationLogger to make sure we test it as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit that does this, and added tests.

@@ -309,7 +309,7 @@ protected final void assertWarnings(String... expectedWarnings) {
throw new IllegalStateException("unable to check warning headers if the test is not set to do so");
}
try {
final List<String> actualWarnings = threadContext.getResponseHeaders().get(DeprecationLogger.WARNING_HEADER);
final List<String> actualWarnings = threadContext.getResponseHeaders().get("Warning");
for (String msg : expectedWarnings) {
assertThat(actualWarnings, hasItem(containsString(msg)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should do the extraction song and dance (+encoding/decoding) here too, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I pushed a commit that does this.

* master: (54 commits)
  Keep the pipeline handler queue small initially
  Do not create String instances in 'Strings' methods accepting StringBuilder (elastic#22907)
  Tests: fix AwsS3ServiceImplTests
  Remove abstract InternalMetricsAggregation class (elastic#23326)
  Add BulkRequest support to High Level Rest client (elastic#23312)
  Wrap getCredentials() in a doPrivileged() block (elastic#23297)
  Respect promises on pipelined responses
  Align REST specs for HEAD requests
  Remove unnecessary result sorting in SearchPhaseController (elastic#23321)
  Fix SamplerAggregatorTests to have stable and predictable docIds
  Tests: Ensure multi node integ tests wait on first node
  Relocate a comment in HttpPipeliningHandler
  Add comments to HttpPipeliningHandler
  [TEST] Fix incorrect test cluster name in cluster health doc tests
  Build: Change location in zip of license and notice inclusion for plugins (elastic#23316)
  Script: Fix value of `ctx._now` to be current epoch time in milliseconds (elastic#23175)
  Build: Rework integ test setup and shutdown to ensure stop runs when desired (elastic#23304)
  Handle long overflow when adding paths' totals
  Don't set local node on cluster state used for node join validation (elastic#23311)
  Ensure that releasing listener is called
  ...
* master: (26 commits)
  CLI: Fix prompting for yes/no to handle console returning null (elastic#23320)
  Tests: Fix reproduce line for packagingTest (elastic#23365)
  Build: Remove extra copies of netty license (elastic#23361)
  [TEST] Removes timeout based wait_for_active_shards REST test (elastic#23360)
  [TEST] increase timeout slightly in wait_for_active_shards test to allow for index creation cluster state update to be processed before ensuring the wait times out
  Handle snapshot repository's missing index.latest
  Adding equals/hashCode to MainResponse (elastic#23352)
  Always restore the ThreadContext for operations delayed due to a block (elastic#23349)
  Add support for named xcontent parsers to high level REST client (elastic#23328)
  Add unit tests for ParentToChildAggregator (elastic#23305)
  Fix after last merge with master and apply last comments
  [INGEST] Lazy load the geoip databases.
  disable BWC tests for the highlighters, need a new 5.x build to make it work
  Expose WordDelimiterGraphTokenFilter (elastic#23327)
  Test that buildCredentials returns correct clazz (elastic#23334)
  Add BreakIteratorBoundaryScanner support for FVH (elastic#23248)
  Prioritize listing index-N blobs over index.latest in reading snapshots (elastic#23333)
  Test: Fix hdfs test fixture setup on windows
  delete and index tests can share some part of the code
  Remove createSampleDocument method and use the sync'ed index method
  ...
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some minor comments that I don't feel require another cycle.

public static Pattern WARNING_HEADER_PATTERN = Pattern.compile(
"299 " + // warn code
"Elasticsearch-\\d+\\.\\d+\\.\\d+(?:-(?:alpha|beta|rc)\\d+)?(?:-SNAPSHOT)?-(?:[a-f0-9]{7}|Unknown) " + // warn agent
"\"((?:\t| |!|[\\x23-\\x5b]|[\\x5d-\\x7e]|[\\x80-\\xff]|\\\\|\")*)\" " + // quoted warning value, captured
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is right (although it doesn't matter since we only parse our own stuff, for which it works). Currently it will match 299 Elasticsearch-6.0.0-alpha1-SNAPSHOT-Unknown """ "Sun, 26 Feb 2017 20:13:14 GMT" . I think |\")*) should be |\\\")*)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is right

You're correct that it's not right.

I think |\")*) should be |\\\")*)

I think it should be |\\\\\")*).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 2228554.


if (iterator.hasNext()) {
final String formattedMessage = LoggerMessageFormat.format(message, params);

final String warningValue = formatWarning(formattedMessage);
assert WARNING_HEADER_PATTERN.matcher(warningValue).matches();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we extractWarningValueFromWarningHeader to see we got formattedMessage back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 2228554.

* @return the escaped string
*/
public static String escape(String s) {
return s.replaceAll("(\\\\|\")", "\\\\$1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java really does need literal strings..

final String s = randomAsciiOfLength(16);
final String first = DeprecationLogger.formatWarning(s);
// force the clock forward by a second so the dates on the warning values are different
Thread.sleep(1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:( . Maybe add a variant of formatWarning where you can supply the time? alternatively we can make sure what we extract is the value of s which is probably a better test anyway (we want the original, not that we consistently break things). It seems we don't use escaped chars anyway

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 2228554.

assertThat(DeprecationLogger.escape("\\\""), equalTo("\\\\\\\""));
assertThat(DeprecationLogger.escape("\"foo\\bar\""),equalTo("\\\"foo\\\\bar\\\""));
// test that characters other than '\' and '"' are left unchanged
final String s = DeprecationLogger.escape(randomAsciiOfLength(16).replace("\\", "").replace("\"", ""));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't seem that randomAsciiOfLength will give back those chars, ever.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it's poorly named then. 😦

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 2228554.

assertNull("unexpected warning headers", warnings);
} finally {
resetDeprecationLogger();
}
}

protected final void assertSettingDeprecations(Setting... settings) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

if (false == expected.isEmpty()) {
if (failureMessage == null) {
failureMessage = new StringBuilder();
final Set<String> expected = new LinkedHashSet<>(expectedWarningHeaders);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should escape expectedWarningHeaders, or alternatively unescape when we read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 5ac1b2b.

* master:
  [TEST]  make headers case-insensitive when running yaml tests
  [TEST] randomize request content_type between all of the supported formats
  [TEST] add support for binary responses to REST tests infra
  [TEST] don't check exact size in mapper-size yaml test
  [TEST] move test for binary field to specific test file that sets Content-Type header explicitly
  [TEST] move filters aggs wrapper query builder rewriting test to integ tests
  [TEST] create HttpEntity earlier in REST tests
  [TEST] Remove content type auto-detection while parsing request body in REST tests
  Factor out filling of TopDocs in SearchPhaseController (elastic#23380)
  Add info method to High Level Rest client (elastic#23350)
  Prevent negative `from` parameter in SearchSourceBuilder (elastic#23358)
  reduce the number of iterations in testPrimaryRelocationWhileIndexing and flush every 5
  rollback unneeded change in testNotifyOnDisconnect
  disable sampling in testNotifyOnDisconnect
@jasontedor jasontedor merged commit 577e6a5 into elastic:master Feb 27, 2017
jasontedor added a commit that referenced this pull request Feb 27, 2017
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form

    warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.

This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.

Relates #23275
jasontedor added a commit that referenced this pull request Feb 27, 2017
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form

    warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.

This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.

Relates #23275
@jasontedor
Copy link
Member Author

Thanks for the thorough review @bleskes.

@jasontedor jasontedor deleted the warning-header branch March 1, 2017 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants