Commit 40b9d22, Oct 20, 2017. Contributors: @lmolkova, @MrDesjardins

Hierarchical Request-Id

This document describes the hierarchical Request-Id schema of the HTTP Correlation Protocol, which is used for telemetry correlation.

Overview

The main requirement for Request-Id is uniqueness: any two requests processed by the cluster must not collide. GUIDs or large random numbers achieve this, but they require additional identifiers to query all requests related to an operation.

A hierarchical Request-Id looks like |<root-id>.<local-id1>.<local-id2>. (e.g. |9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1.) and holds all the information needed to trace both the whole operation and a particular request. The root-id serves as a common identifier for all requests involved in processing the operation, and the local-ids represent internal activities (and requests) performed within the scope of this operation.

The upstream service or client application may be instrumented with another tracing system, so an implementation MAY have a compatibility layer that parses another set of trace headers. An implementation SHOULD therefore tolerate other trace identifier formats and make a best effort to keep the root-id equivalent to the one used in that tracing system.
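The structure described above can be illustrated with a minimal parsing sketch. The function name `parse_request_id` is hypothetical and not part of the protocol; the sketch assumes that nodes are delimited by "." or "_" and that an id without the leading "|" is treated as an opaque, non-hierarchical value.

```python
import re
from typing import List, Tuple


def parse_request_id(request_id: str) -> Tuple[str, List[str]]:
    """Split a Request-Id into (root_id, local_ids).

    Assumption: ids that do not start with "|" are not hierarchical,
    so the whole value is returned as the root and there are no local ids.
    """
    if not request_id.startswith("|"):
        return request_id, []
    # Drop the leading "|" and split on the node delimiters "." and "_".
    nodes = [node for node in re.split(r"[._]", request_id[1:]) if node]
    if not nodes:
        return "", []
    return nodes[0], nodes[1:]


root, local_ids = parse_request_id(
    "|9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1."
)
print(root)       # the root-id shared by the whole operation
print(local_ids)  # the local-ids added along the call chain
```

A logging system could use the returned root-id to group records of one operation, while the full Request-Id still identifies a single request.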

Formatting Hierarchical Request-Id

If a Request-Id was not provided by the upstream service and the implementation decides to trace the request, it MUST generate a new Request-Id (see Root Request Id Generation) to represent the incoming request.

In a heterogeneous environment, implementations of this protocol with hierarchical Request-Id may interact with services that do not implement this protocol but still have a notion of a request Id. An implementation or logging system should be able to unambiguously determine whether a given Request-Id follows the hierarchical schema.

Therefore, every implementation that supports the hierarchical structure MUST prepend "|" (vertical bar) to a generated Request-Id.

It also MUST append "." (dot) to the end of a generated Request-Id to unambiguously mark its end (e.g. a search for |123 may also return |1234, but a search for |123. is exact).
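The two markers can be sketched as follows. The helper names `format_root_request_id` and `is_hierarchical` are illustrative assumptions, not names defined by the protocol; the last lines show why the trailing dot matters for prefix searches.

```python
def format_root_request_id(root_id: str) -> str:
    """Wrap a root id with the "|" prefix and the "." end marker."""
    return f"|{root_id}."


def is_hierarchical(request_id: str) -> bool:
    """An id is treated as hierarchical only if it starts with "|"."""
    return request_id.startswith("|")


first = format_root_request_id("123")
second = format_root_request_id("1234")

# Without the trailing dot, a prefix search for "|123" also hits "|1234.".
print(second.startswith("|123"))   # True: ambiguous match
# With the trailing dot, the search only matches the intended id.
print(second.startswith(first))    # False: "|123." is exact
```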

Root Request Id Generation

The root Request-Id is the topmost Request-Id, generated by the first instrumented service. In a hierarchical Request-Id it is the root node, common to all requests involved in processing the operation. It MUST be unique to every high-level operation in the system, so for every traced operation the implementation MUST generate a sufficiently large identifier, e.g. a GUID or a 64-bit or 128-bit random number. Note that random numbers can be encoded as a string to decrease the Request-Id length.

The root Request-Id MUST contain only Base64 and "-" (hyphen) characters.

The same considerations apply to client applications that make HTTP requests and generate a root Request-Id.

Note that in addition to the unique part, it may be useful to include meaningful information such as a host name, device or process id. An implementation is free to do so, as long as the root id stays relatively short.
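As a rough sketch of root id generation, the snippet below shows two options mentioned above: a GUID and a 128-bit random number encoded as Base64 to shorten it. All function names are hypothetical, and stripping the Base64 "=" padding is an assumption made here to keep the id short.

```python
import base64
import secrets
import uuid


def new_root_id_from_guid() -> str:
    """A GUID in its usual form: hex digits and hyphens (36 characters)."""
    return str(uuid.uuid4())


def new_root_id_compact(num_bytes: int = 16) -> str:
    """Random bytes encoded as Base64, with "=" padding removed (assumption)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def new_root_request_id(compact: bool = False) -> str:
    """Build a full root Request-Id with the "|" prefix and "." end marker."""
    root_id = new_root_id_compact() if compact else new_root_id_from_guid()
    return f"|{root_id}."


print(new_root_request_id())              # GUID-based root Request-Id
print(new_root_request_id(compact=True))  # shorter Base64-based variant
```

The compact variant encodes the same 128 bits in 22 characters instead of 36, which leaves more room for local-ids before the length limit is reached.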

Incoming Request

When a Request-Id is provided by the upstream service, there is no guarantee that it is unique within the entire system.

An implementation SHOULD make it unique by adding a small suffix to the incoming Request-Id to represent the internal activity, and use the result for outgoing requests. If the implementation does not trust the incoming Request-Id at all, the suffix may be as long as a root Request-Id. We recommend appending a random string of 8 characters (e.g. a hex-encoded 32-bit random integer).

The suffix MUST contain only Base64 and "-" (hyphen) characters.

The implementation MUST append "_" (underscore) to mark the end of the generated incoming Request-Id.
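A minimal sketch of the incoming-request step, assuming the recommended 8-character hex suffix. The function name `extend_incoming_request_id` is hypothetical.

```python
import secrets


def extend_incoming_request_id(incoming_request_id: str) -> str:
    """Append a random 32-bit suffix (8 hex characters) and the "_" end marker."""
    suffix = secrets.token_hex(4)  # 4 random bytes -> 8 hex characters
    return f"{incoming_request_id}{suffix}_"


# The suffix is random, so the exact value differs on every call.
print(extend_incoming_request_id("|Guid.1."))
```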

Outgoing Request

When making a request to a downstream service, the implementation MUST append a small id to the incoming Request-Id and pass the new Request-Id to the downstream service.

  • The suffix MUST be unique for every outgoing HTTP request sent while processing the incoming request; a monotonically incremented number of outgoing requests within the scope of this incoming operation is a good candidate.
  • The suffix MUST contain only Base64 and "-" (hyphen) characters.

The implementation MUST append "." (dot) to mark the end of the generated outgoing Request-Id.

It may be useful to split the processing of an incoming request into multiple logical sub-operations and assign different identifiers to them, as is done for outgoing requests, except that a sub-operation is processed within the same service.
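The outgoing-request rules could be sketched with a per-operation counter. The class name `IncomingOperation` and its methods are illustrative assumptions; the lock is added here only so that concurrent outgoing requests never receive the same number.

```python
import threading


class IncomingOperation:
    """Tracks one incoming request and hands out ids for its outgoing calls."""

    def __init__(self, request_id: str) -> None:
        self.request_id = request_id
        self._counter = 0
        self._lock = threading.Lock()

    def next_outgoing_request_id(self) -> str:
        """Append a monotonically increasing number and the "." end marker."""
        with self._lock:
            self._counter += 1
            number = self._counter
        return f"{self.request_id}{number}."


operation = IncomingOperation("|Guid.")
print(operation.next_outgoing_request_id())  # |Guid.1.
print(operation.next_outgoing_request_id())  # |Guid.2.
```

The same counter could also be used to label in-process sub-operations, since they follow the same suffix rules.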

Request-Id Overflow

Extending a Request-Id may cause it to exceed the length limit. To handle overflow, the implementation:

  • MUST generate a suffix that keeps the probability of collision with any previous or future Request-Id within the same operation negligible.
  • MUST append the "#" symbol to the suffix to indicate that overflow happened.
  • MUST trim the end of the existing Request-Id to make room for the generated LocalId. The implementation MUST trim whole nodes (separated by "." or "_") while keeping the delimiter that precedes them, i.e. it is invalid to trim only part of a node.
  • MUST generate a suffix that contains only Base64 and "-" (hyphen) characters.

As a result, the Request-Id will look like:

Beginning-Of-Incoming-Request-Id.LocalId#

Thus, to the extent possible, the Request-Id keeps the valid part of the hierarchical Id.

The overflow suffix should be large enough to ensure that the new Request-Id does not collide with any previous or future Request-Id within the same operation. A random 32-bit integer (hex-encoded as an 8-character string) is a good candidate. Note that applications may start multiple outgoing requests asynchronously at almost the same time, which makes a timestamp, even with tick precision, a bad candidate for the overflow suffix.
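The trimming rule can be sketched as below. The function name `make_overflow_request_id` and the `max_length` parameter are assumptions for illustration; this section does not state the actual length limit, so the caller has to supply it and decide when overflow handling is needed.

```python
import secrets

DELIMITERS = "._"


def make_overflow_request_id(request_id: str, max_length: int) -> str:
    """Trim whole nodes from the end, then append a random suffix and "#".

    Assumption: the caller has already determined that extending
    request_id in the normal way would exceed max_length.
    """
    suffix = secrets.token_hex(4) + "#"  # 8 hex characters plus overflow marker
    budget = max_length - len(suffix)
    trimmed = request_id
    while len(trimmed) > budget:
        # Ignore the node's own trailing delimiter, then find the delimiter
        # that precedes the last node; that delimiter is kept.
        body = trimmed[:-1] if trimmed[-1] in DELIMITERS else trimmed
        cut = max(body.rfind("."), body.rfind("_"))
        if cut < 0:
            raise ValueError("Request-Id cannot be trimmed to fit max_length")
        trimmed = trimmed[: cut + 1]
    return trimmed + suffix


# With a deliberately tiny limit of 20, the nodes "abcd1234_" and "2." are
# dropped, leaving "|root.1." followed by the random suffix and "#".
print(make_overflow_request_id("|root.1.abcd1234_2.", max_length=20))
```

Only whole nodes are removed, so the remaining prefix is still a valid beginning of the original hierarchical id.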

Example

Let's consider two services, service-a and service-b. A user calls service-a, which calls service-b to fulfill the user request:

User -> service-a -> service-b

  1. A: service-a receives the request
     • does not find a Request-Id and generates a new root Request-Id: |Guid.
     • traces that the incoming request was started, along with Request-Id: |Guid.
  2. A: service-a makes a request to service-b
     • generates a new Request-Id by appending the request number to the parent Request-Id: |Guid.1.
     • logs that the outgoing request is about to be sent, with all the available context: Request-Id: |Guid.1.
     • sends the request to service-b
  3. B: service-b receives the request
     • scans through its headers and finds Request-Id: |Guid.1.
     • generates a new Request-Id: |Guid.1.da4e9679_ to uniquely describe the operation within service-b
     • logs that the operation was started, along with all the available context: Request-Id: |Guid.1.da4e9679_
     • processes the request and responds to service-a
  4. A: service-a receives the response from service-b
     • logs the response with context: Request-Id: |Guid.1.
     • processes the request and responds to the caller

As a result, log records may look like:

Message                            Component name   Context
user starts request to service-a   user
incoming request                   service-a        Request-Id=|Guid.
request to service-b               service-a        Request-Id=|Guid.1.
incoming request                   service-b        Request-Id=|Guid.1.da4e9679_
response                           service-b        Request-Id=|Guid.1.da4e9679_
response from service-b            service-a        Request-Id=|Guid.1.
response                           service-a        Request-Id=|Guid.
response from service-a            user

Remarks

  • All logs of the operation may be queried by the Request-Id prefix |Guid.; logs for a particular request may be queried by an exact Request-Id match.
  • When service-a generates a new Request-Id, it does not append a suffix, since it generates the root Request-Id itself and ensures its uniqueness.
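The two query styles from the remarks can be sketched over an in-memory copy of the example log. The `LOG` list and both function names are hypothetical; a real logging system would run equivalent prefix and equality filters in its own query language.

```python
from typing import List, Tuple

# (message, component, request_id) records taken from the example table.
LOG: List[Tuple[str, str, str]] = [
    ("incoming request", "service-a", "|Guid."),
    ("request to service-b", "service-a", "|Guid.1."),
    ("incoming request", "service-b", "|Guid.1.da4e9679_"),
    ("response", "service-b", "|Guid.1.da4e9679_"),
    ("response from service-b", "service-a", "|Guid.1."),
    ("response", "service-a", "|Guid."),
]


def operation_logs(records, root_prefix: str):
    """All records of the operation: Request-Id starts with the root prefix."""
    return [r for r in records if r[2].startswith(root_prefix)]


def request_logs(records, request_id: str):
    """Records of one particular request: exact Request-Id match."""
    return [r for r in records if r[2] == request_id]


print(len(operation_logs(LOG, "|Guid.")))  # 6: every record of the operation
print(len(request_logs(LOG, "|Guid.1.")))  # 2: the outgoing call to service-b
```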

See also