Adding propagators API and b3 SDK implementation (#51, #52) #78

Merged
merged 5 commits into from
Aug 15, 2019
@@ -0,0 +1,4 @@
from .binaryformat import BinaryFormat
from .httptextformat import HTTPTextFormat

__all__ = ["BinaryFormat", "HTTPTextFormat"]
@@ -0,0 +1,58 @@
# Copyright 2019, OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import abc
import typing

from opentelemetry.trace import SpanContext


class BinaryFormat(abc.ABC):
Member

I think this class could use some "performance-oriented" APIs: Since creating a slice of bytes copies it, I suggest from_bytes(byte_representation:bytes, offset: int = 0, length: int = -1), and maybe append_bytes(dst: bytearray).

Member Author

I can see the value here, but I may wait on adding these until I add an implementation (e.g. the standard binary format). That might better illustrate the right API, unless you already have an example in mind.

Alternatively it looks like memoryview would be a great way to read values:

https://docs.python.org/3/library/stdtypes.html#memoryview

Although I could always construct a memoryview from the bytes object I received. For to_bytes I could use bytearray internally to construct the buffer.
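
The memoryview idea discussed in this thread could look roughly like the sketch below. The 25-byte layout (16-byte trace id, 8-byte span id, 1-byte options), the `offset` and `length` parameters, and the function names are assumptions for illustration only; they are not the standard binary format and not part of this PR.

```python
import typing

# Hypothetical fixed layout, for illustration only:
# 16-byte trace id, 8-byte span id, 1-byte options, all big-endian.
_RECORD_LENGTH = 16 + 8 + 1


def append_bytes(
    dst: bytearray, trace_id: int, span_id: int, options: int
) -> None:
    """Append the encoded fields to an existing buffer."""
    dst += trace_id.to_bytes(16, "big")
    dst += span_id.to_bytes(8, "big")
    dst.append(options)  # must be in range(256)


def from_bytes(
    byte_representation: bytes, offset: int = 0, length: int = -1
) -> typing.Optional[typing.Tuple[int, int, int]]:
    """Decode (trace_id, span_id, options) from a window of a buffer."""
    view = memoryview(byte_representation)
    end = len(view) if length < 0 else offset + length
    window = view[offset:end]  # slicing a memoryview does not copy the data
    if len(window) != _RECORD_LENGTH:
        return None
    trace_id = int.from_bytes(window[:16], "big")
    span_id = int.from_bytes(window[16:24], "big")
    return trace_id, span_id, window[24]
```

A caller holding a larger buffer can then pass `offset` to decode a record in place, for example `from_bytes(payload, offset=2)`, rather than slicing the `bytes` object first.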

Member

@bogdandrutu Aug 14, 2019

In order to not duplicate the same class for correlation-context (distributedcontext) I would suggest to make this a template and have BinaryFormat<SpanContext> and BinaryFormat<DistributedContext>

Member Author

Thanks! Mypy typing does support generics, but I'm planning on filing another PR that aligns more closely with the tickets I've mentioned above. This will include creating a unified composed object for SpanContext and DistributedContext and using the composed object as the context for propagators, thereby eliminating the need for the generic.
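
For reference, the generic variant suggested above might be sketched with `typing.Generic` as follows. `GenericBinaryFormat`, `ContextT` and `IntBinaryFormat` are hypothetical names, an `int` stands in for a real context type, and the sketch uses instance methods rather than the static methods in this PR to keep the typing simple.

```python
import abc
import typing

ContextT = typing.TypeVar("ContextT")


class GenericBinaryFormat(abc.ABC, typing.Generic[ContextT]):
    """Hypothetical generic variant of BinaryFormat (not part of this PR)."""

    @abc.abstractmethod
    def to_bytes(self, context: ContextT) -> bytes:
        """Serialize a context of type ContextT."""

    @abc.abstractmethod
    def from_bytes(
        self, byte_representation: bytes
    ) -> typing.Optional[ContextT]:
        """Deserialize a context, or return None if the bytes are invalid."""


class IntBinaryFormat(GenericBinaryFormat[int]):
    """Toy specialization; an int stands in for a real context type."""

    def to_bytes(self, context: int) -> bytes:
        return context.to_bytes(8, "big")

    def from_bytes(self, byte_representation: bytes) -> typing.Optional[int]:
        if len(byte_representation) != 8:
            return None
        return int.from_bytes(byte_representation, "big")
```

The same base class could then be specialized once per context type, which is the duplication the suggestion aims to avoid.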

Member

From an offline conversation with @bogdandrutu: a bigger problem is that referencing SpanContext here would make context depend on trace.

This creates a circular dependency, and even if we remove the propagators from the tracer, trace will depend on context in other places. If we think of the context package as representing the base context propagation layer, it's surprising to see it depend on a layer higher in the stack.

This isn't a blocking comment for this PR, just something to keep in mind when you start composing context objects.

    """API for serialization of span context into binary formats.

    This class provides an interface that enables converting span contexts
    to and from a binary format.
    """
    @staticmethod
    @abc.abstractmethod
    def to_bytes(context: SpanContext) -> bytes:
        """Creates a byte representation of a SpanContext.

        to_bytes should read values from a SpanContext and return a data
        format to represent it, in bytes.

        Args:
            context: the SpanContext to serialize

        Returns:
            A bytes representation of the SpanContext.

        """
    @staticmethod
    @abc.abstractmethod
    def from_bytes(byte_representation: bytes) -> typing.Optional[SpanContext]:
        """Return a SpanContext that was represented by bytes.
Member

@reyang Aug 14, 2019

As we discussed today, we might want to use the same propagator for both SpanContext and DistributedContext.
No need to be blocked though, we can address this in another PR.

Member Author

yes, coming up in the next PR. want to get this merged in first.

Member

Roger that.


        from_bytes should return a SpanContext that was constructed from
        the data serialized in the byte_representation passed. If it is not
        possible to read in a proper SpanContext, return None.

        Args:
            byte_representation: the bytes to deserialize

        Returns:
            A SpanContext if the byte_representation is valid.
            Otherwise return None.

        """
@@ -0,0 +1,109 @@
# Copyright 2019, OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import abc
import typing

from opentelemetry.trace import SpanContext

Setter = typing.Callable[[object, str, str], None]
Getter = typing.Callable[[object, str], typing.List[str]]


class HTTPTextFormat(abc.ABC):
    """API for propagation of span context via headers.

    This class provides an interface that enables extracting and injecting
    span context into headers of HTTP requests. HTTP frameworks and clients
    can integrate with HTTPTextFormat by providing the object containing the
    headers, and a getter and setter function for the extraction and
    injection of values, respectively.

    Example::

        import flask
        import requests
        from opentelemetry.context.propagation import HTTPTextFormat

        PROPAGATOR = HTTPTextFormat()



Member

Why all the blank lines?

        def get_header_from_flask_request(request, key):
            return request.headers.get_all(key)

        def set_header_into_requests_request(request: requests.Request,
                                             key: str, value: str):
            request.headers[key] = value

        def example_route():
            span_context = PROPAGATOR.extract(
                get_header_from_flask_request,
                flask.request
            )
            request_to_downstream = requests.Request(
                "GET", "http://httpbin.org/get"
            )
            PROPAGATOR.inject(
                span_context,
                set_header_into_requests_request,
                request_to_downstream
            )
            session = requests.Session()
            session.send(request_to_downstream.prepare())


    .. _Propagation API Specification:
       https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-propagators.md
    """
    @abc.abstractmethod
    def extract(self, get_from_carrier: Getter,
                carrier: object) -> SpanContext:
        """Create a SpanContext from values in the carrier.

        The extract function should retrieve values from the carrier
        object using get_from_carrier, and use values to populate a
Member

Suggested change
object using get_from_carrier, and use values to populate a
object using `get_from_carrier`, and use values to populate a

        SpanContext value and return it.
Member

Suggested change
SpanContext value and return it.
`SpanContext` value and return it.

And the same changes to other docstrings. This is to make the sphinx docs render nicely, but feel free to omit the sphinx markup where it hurts code readability.

I haven't been including sphinx markup in the first line of the docstrings since we need them to fit on a single line, but since we changed the sphinx default role most markdown is just backticks now, and it may be worth the extra characters to do this consistently.


        Args:
            get_from_carrier: a function that can retrieve zero
                or more values from the carrier. In the case that
                the value does not exist, return an empty list.
            carrier: and object which contains values that are
Member

Suggested change
carrier: and object which contains values that are
carrier: an object which contains values that are

                used to construct a SpanContext. This object
                must be paired with an appropriate get_from_carrier
                which understands how to extract a value from it.
        Returns:
            A SpanContext with configuration found in the carrier.
Member

If the carrier doesn't provide a SpanContext, do we expect to generate one or return None?
This could become tricky since we might receive something between valid and invalid. For example
https://github.com/w3c/trace-context/blob/b145878f5618fccbf4926bc181ecb25b709b111d/test/test.py#L536.

Contributor

According to the Specification (https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-propagators.md#frombytes):

If the value could not be parsed, the underlying implementation SHOULD decide to return either an empty value, an invalid value, or a valid value.

In Java we do not return null ever for this (makes me wonder if we should tune the Specification for this), and we instead will be returning SpanContext.getInvalid() ;)

Member Author

As the API is currently organized, the consequences of this choice fall to whoever hands the SpanContext back to the context. That is, error handling is up to whoever calls the extract / inject method.

I think as a convention the formats should return either a valid value or nothing at all. But honestly I think we'll see changes here as we discover more use cases around things like needing to support multiple propagators.
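
One way a caller could cope with the outcomes discussed in this thread (nothing, an invalid value, or a valid value) is to normalize them before use. In the sketch below, the `SpanContext` stand-in, the zero-id validity rule and `normalize_extracted` are assumptions for illustration; the real `opentelemetry.trace.SpanContext` is not used.

```python
import typing


class SpanContext(typing.NamedTuple):
    """Minimal stand-in for illustration; not the real SpanContext."""

    trace_id: int
    span_id: int


def normalize_extracted(
    context: typing.Optional[SpanContext],
) -> typing.Optional[SpanContext]:
    """Collapse 'nothing' and 'invalid' into None so callers check one case."""
    if context is None:
        return None
    # Assumed validity rule: an all-zero trace id or span id is invalid.
    if context.trace_id == 0 or context.span_id == 0:
        return None
    return context
```

With this kind of helper, a propagator may return either `None` or an invalid context, and the calling code still has a single branch to handle.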


        """
    @abc.abstractmethod
    def inject(self, context: SpanContext, set_in_carrier: Setter,
               carrier: object) -> None:
        """Inject values from a SpanContext into a carrier.

        inject enables the propagation of values into HTTP clients or
        other objects which perform an HTTP request. Implementations
        should use the set_in_carrier method to set values on the
        carrier.

        Args:
            context: The SpanContext to read values from.
            set_in_carrier: A setter function that can set values
                on the carrier.
            carrier: An object that provides a place to define HTTP headers.
                Should be paired with set_in_carrier, which should
                know how to set header values on the carrier.

        """
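
For carriers that are plain dictionaries, the getter and setter contract described in these docstrings could be sketched as below. `get_from_dict`, `set_in_dict`, the `Headers` alias and the `x-example-id` header are hypothetical names chosen for illustration, not part of the API.

```python
import typing

Headers = typing.Dict[str, typing.List[str]]


def get_from_dict(carrier: Headers, key: str) -> typing.List[str]:
    """Getter: return every value for key, or an empty list if absent."""
    return carrier.get(key, [])


def set_in_dict(carrier: Headers, key: str, value: str) -> None:
    """Setter: replace any existing values for key with a single value."""
    carrier[key] = [value]


incoming = {"x-example-id": ["abc123"]}  # type: Headers
outgoing = {}  # type: Headers

# Copy the first value, if any, from the incoming to the outgoing carrier.
values = get_from_dict(incoming, "x-example-id")
if values:
    set_in_dict(outgoing, "x-example-id", values[0])
```

Both functions match the `Getter` and `Setter` shapes (carrier first, then key), so a propagator never needs to know the carrier's concrete type.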
Empty file.
Empty file.
@@ -0,0 +1,109 @@
# Copyright 2019, OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import typing

from opentelemetry.context.propagation.httptextformat import HTTPTextFormat
import opentelemetry.trace as trace


class B3Format(HTTPTextFormat):
    """Propagator for the B3 HTTP header format.

    See: https://github.com/openzipkin/b3-propagation
    """

    SINGLE_HEADER_KEY = "b3"
    TRACE_ID_KEY = "x-b3-traceid"
    SPAN_ID_KEY = "x-b3-spanid"
    SAMPLED_KEY = "x-b3-sampled"
    FLAGS_KEY = "x-b3-flags"
    _SAMPLE_PROPAGATE_VALUES = set(["1", "True", "true", "d"])

    @classmethod
    def extract(cls, get_from_carrier, carrier):
        trace_id = format_trace_id(trace.INVALID_TRACE_ID)
        span_id = format_span_id(trace.INVALID_SPAN_ID)
        sampled = 0
        flags = None

        single_header = _extract_first_element(
            get_from_carrier(carrier, cls.SINGLE_HEADER_KEY))
        if single_header:
            # The b3 spec calls for the sampling state to be
            # "deferred", which is unspecified. This concept does not
            # translate to SpanContext, so we set it as recorded.
            sampled = "1"
            fields = single_header.split("-", 4)

            if len(fields) == 1:
                sampled = fields[0]
            elif len(fields) == 2:
                trace_id, span_id = fields
            elif len(fields) == 3:
                trace_id, span_id, sampled = fields
            elif len(fields) == 4:
                trace_id, span_id, sampled, _parent_span_id = fields
            else:
                return trace.INVALID_SPAN_CONTEXT
Contributor

👍

        else:
            trace_id = _extract_first_element(
                get_from_carrier(carrier, cls.TRACE_ID_KEY)) or trace_id
            span_id = _extract_first_element(
                get_from_carrier(carrier, cls.SPAN_ID_KEY)) or span_id
            sampled = _extract_first_element(
                get_from_carrier(carrier, cls.SAMPLED_KEY)) or sampled
            flags = _extract_first_element(
                get_from_carrier(carrier, cls.FLAGS_KEY)) or flags

        options = 0
        # The b3 spec provides no defined behavior for both sample and
        # flag values set. Since the setting of at least one implies
        # the desire for some form of sampling, propagate if either
        # header is set to allow.
        if sampled in cls._SAMPLE_PROPAGATE_VALUES or flags == "1":
            options |= trace.TraceOptions.RECORDED

        return trace.SpanContext(
            # trace and span ids are encoded in hex, so must be converted
            trace_id=int(trace_id, 16),
            span_id=int(span_id, 16),
            trace_options=options,
            trace_state={},
        )

    @classmethod
    def inject(cls, context, set_in_carrier, carrier):
        sampled = (trace.TraceOptions.RECORDED & context.trace_options) != 0
        set_in_carrier(carrier, cls.TRACE_ID_KEY,
                       format_trace_id(context.trace_id))
        set_in_carrier(carrier, cls.SPAN_ID_KEY,
                       format_span_id(context.span_id))
        set_in_carrier(carrier, cls.SAMPLED_KEY, "1" if sampled else "0")


def format_trace_id(trace_id: int):
    """Format the trace id according to b3 specification."""
    return format(trace_id, "032x")


def format_span_id(span_id: int):
    """Format the span id according to b3 specification."""
    return format(span_id, "016x")


def _extract_first_element(list_object: list) -> typing.Optional[object]:
    if list_object:
        return list_object[0]
    return None
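
The hex formatting and the single-header branch of this file can be exercised in isolation. The sketch below re-declares the two format helpers and adds a hypothetical `split_b3_single_header` that loosely mirrors the field-count logic of `extract`; it does not import `opentelemetry`, and unlike `extract` it returns `None` for ids that are absent instead of the formatted invalid ids.

```python
import typing

B3Fields = typing.Tuple[typing.Optional[str], typing.Optional[str], str]


def format_trace_id(trace_id: int) -> str:
    """Zero-padded, 32-character lowercase hex."""
    return format(trace_id, "032x")


def format_span_id(span_id: int) -> str:
    """Zero-padded, 16-character lowercase hex."""
    return format(span_id, "016x")


def split_b3_single_header(header: str) -> typing.Optional[B3Fields]:
    """Return (trace_id, span_id, sampled), or None for a malformed header."""
    fields = header.split("-", 4)
    sampled = "1"  # same default as the single-header branch of extract
    if len(fields) == 1:
        return None, None, fields[0]
    if len(fields) == 2:
        return fields[0], fields[1], sampled
    if len(fields) in (3, 4):
        # A fourth field, if present, is the parent span id and is dropped.
        return fields[0], fields[1], fields[2]
    return None


trace_id = format_trace_id(0xD4CDA95B652F4A1592B449D5929FDA1B)
span_id = format_span_id(0x6E0C63257DE34C92)
header = "{}-{}-1".format(trace_id, span_id)
fields = split_b3_single_header(header)
```

Because the ids travel as hex strings, converting them back with `int(trace_id, 16)` recovers the original integers, which is what `extract` does before building the `SpanContext`.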
Empty file.
Empty file.