Add clarification about OT scope and Span logging (#46)

* Add clarification about OT scope and Span logging
* Rewrite guidance about logging redirects
* Clarify error semantics
opentracing · Apr 19, 2020 · 11dd7f8
1 parent 4ceff7b
commit 11dd7f8
Showing 2 changed files with 24 additions and 2 deletions.
diff --git a/semantic_conventions.md b/semantic_conventions.md
@@ -74,9 +74,15 @@ The following Span tags combine to model database calls:
 - `peer.address`, `peer.hostname`, `peer.ipv4`, `peer.ipv6`, `peer.port`, `peer.service`: optional tags that describe the database peer
 - `span.kind`: `"client"`
 
-### Captured errors
+### Span and log errors
 
-Errors may be described by OpenTracing in different ways, largely depending on the language. Some of these descriptive fields are specific to errors; others are not (e.g., the `event` or `message` fields).
+It is important to distinguish between **error Spans** and **errors logged during Span execution**.
+
+Every Span either finishes in an error state or does not: the `"error"=true` tag distinguishes between those two cases. (If the `"error"` tag is missing altogether, that implies `"error"=false`.) Tools that consume OpenTracing instrumentation should not need to consider any other information to determine whether a Span is in an error state.
+
+#### Logged application-level errors
+
+It can also be useful to record application-level errors that crop up during a Span's lifetime. For those situations, Span logs are more appropriate since errors have a specific timestamp (and Spans in general represent a time interval, not a specific moment). Logged errors may be described by OpenTracing in different ways, largely depending on the language. Some of these descriptive fields are specific to errors; others are not (e.g., the `event` or `message` fields).
 
 For languages where an error object encapsulates a stack trace and type information, log the following fields:
 
@@ -91,3 +97,5 @@ For other languages, or when above is not feasible:
 - error.kind=`"..."` (optional)
 
 This scheme allows Tracer implementations to extract what information they need from the actual error object when it's available.
+
+**Note:** a Span may be in an error state (i.e., have an `"error"=true` tag) and have no error *logs*, and vice versa.
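The distinction this hunk draws between error Spans and logged errors can be sketched in a few lines of Python. This is an illustration only: the dictionary layout, `is_error_span`, and `logged_errors` are hypothetical names and not part of any OpenTracing API. It assumes that error logs carry `event="error"`, as in the conventions' log field list.

```python
import time


def is_error_span(tags):
    """Return True only when the "error" tag is the boolean true.

    A missing tag is treated as "error"=false, as the text above states.
    """
    return tags.get("error", False) is True


def logged_errors(logs):
    """Return the field maps of log records that describe an error.

    Assumption: each log is a (timestamp, fields) pair, and error logs
    are marked with event="error".
    """
    return [fields for _, fields in logs if fields.get("event") == "error"]


# Hypothetical in-memory Spans: a tag map plus a list of timestamped logs.
# span_a finished in an error state but logged nothing.
span_a = {"tags": {"error": True}, "logs": []}

# span_b logged an error at a specific moment (say, a failed first attempt)
# but did not finish in an error state.
span_b = {
    "tags": {},
    "logs": [
        (time.time(), {"event": "error",
                       "message": "first attempt timed out",
                       "error.kind": "TimeoutError"}),
    ],
}
```

Under these assumptions a consumer decides the Span's state from the tag alone and treats the logs as supplementary detail, which is why the two can disagree.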
diff --git a/specification.md b/specification.md
@@ -10,6 +10,12 @@ This is the "formal" OpenTracing semantic specification. Since OpenTracing must
 
 The OpenTracing specification uses a `Major.Minor` version number but has no `.Patch` component. The major version increments when backwards-incompatible changes are made to the specification. The minor version increments for non-breaking changes like the introduction of new standard tags, log fields, or SpanContext reference types. (You can read more about the motivation for this versioning scheme at Issue [specification#2](https://github.com/opentracing/specification/issues/2#issuecomment-261740811))
 
+## The Big Picture: OpenTracing's Scope
+
+OpenTracing's core specification (i.e., this document) is intentionally agnostic about the specifics of particular downstream tracing or monitoring systems. This is because **OpenTracing exists to describe the semantics of transactions in distributed systems.** Describing those transactions should not be influenced by how — or how not — any particular backend likes to process or represent data. For instance, detailed OpenTracing instrumentation can be used to simply measure latencies and apply tags in a timeseries monitoring system (e.g., Prometheus); or Span start+finish times along with Span logs may be redirected to a central logging service (e.g., Kibana).
+
+As such, the OpenTracing specification and [data modelling conventions](./data_conventions.md) have a wider scope than some tracing systems, "and that's okay." If certain semantic behavior is out-of-scope for a particular tracing or monitoring system, said system can summarize or simply ignore the respective data flowing from OpenTracing instrumentation.
+
 ## The OpenTracing Data Model
 
 **Traces** in OpenTracing are defined implicitly by their **Spans**. In
@@ -219,6 +225,14 @@ Optional parameters
 
 Note that the OpenTracing project documents certain **["standard log keys"](./semantic_conventions.md#log-fields-table)** which have prescribed semantic meanings.
 
+##### An aside: "Logging" in general, and what it means in OpenTracing
+
+"Logging" is an overloaded term in our industry; one could reasonably argue that all tracing is just a particularly organized form of logging. OpenTracing "logs" are really just key:value maps that describe a particular moment within the context of a Span.
+
+While it's possible to redirect general-purpose process-level logging into OpenTracing, doing so requires care. For instance, logging statements that aren't anchored in specific transactions or traces may not make sense within a tracing system. That said, in environments where an overwhelming fraction of conventional logging statements already refer to distributed transactions, `tee`ing that logging data into OpenTracing is reasonable and often beneficial.
+
+The granularity of Span logs is intended to be finer than typical "info"-style logging in process-level logging frameworks. Since tracing systems usually have a smart, all-or-nothing per-trace sampling mechanism, the verbosity within a single trace can be higher than what would be appropriate for a process as a whole — especially when that process contends with high concurrency.
+
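The description of OpenTracing logs as key:value maps tied to a particular moment within a Span can be sketched as follows. `SketchSpan` is a hypothetical stand-in written for this note, not a Tracer implementation. Its `log_kv` method only loosely mirrors the shape of the Python API's method of the same name.

```python
import time


class SketchSpan:
    """Hypothetical minimal Span that only records logs."""

    def __init__(self, operation_name):
        self.operation_name = operation_name
        self.logs = []  # list of (timestamp, fields) pairs

    def log_kv(self, key_values, timestamp=None):
        # Each log is a key:value map pinned to one moment in the Span.
        # When no timestamp is given, the current wall-clock time is used.
        ts = time.time() if timestamp is None else timestamp
        self.logs.append((ts, dict(key_values)))
        return self


span = SketchSpan("get_user")
span.log_kv({"event": "cache_miss", "key": "user:42"}, timestamp=1.5)
span.log_kv({"event": "db_query", "rows": 1})
```

Because each record belongs to one Span, fine-grained events like these stay attached to the transaction they describe rather than to the process as a whole.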
 #### Set a **baggage** item
 
 Baggage items are key:value string pairs that apply to the given `Span`, its `SpanContext`, and **all `Spans` which directly or transitively _reference_ the local `Span`.** That is, baggage items propagate in-band along with the trace itself.
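A rough sketch of that propagation, under simplifying assumptions: `BaggageSpan` is a hypothetical class, the "reference" is modelled as a single parent, and a child copies its parent's baggage at start time. A real Tracer carries baggage in the `SpanContext` and across process boundaries, which this sketch does not attempt.

```python
class BaggageSpan:
    """Hypothetical Span that models only in-process baggage propagation."""

    def __init__(self, name, parent=None):
        self.name = name
        # Copy the parent's baggage so items flow to every descendant
        # started after the item was set.
        self.baggage = dict(parent.baggage) if parent is not None else {}

    def set_baggage_item(self, key, value):
        # Baggage items are key:value string pairs.
        self.baggage[str(key)] = str(value)
        return self

    def get_baggage_item(self, key):
        return self.baggage.get(key)


root = BaggageSpan("root").set_baggage_item("tenant", "acme")
child = BaggageSpan("child", parent=root)
grandchild = BaggageSpan("grandchild", parent=child)

# An item set on a descendant does not travel back up to its ancestors.
child.set_baggage_item("shard", 7)
```

In this sketch the grandchild sees `tenant` because it transitively references `root`, while `root` never sees `shard`.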