feat(instrumentation): add OpenTelemetry tracing and metrics with basic configurations #5175

Merged 90 commits on Oct 11, 2022
Changes from 10 commits
a5a7f42
feat(instrumentation): create basic tracer and meter with console exp…
Sep 15, 2022
9e1b2d0
style: fix overload and cli autocomplete
jina-bot Sep 15, 2022
514792a
feat(instrumentation): move the instrumentation package to the serve …
Sep 16, 2022
c3b0c37
feat(instrumentation): provide options to enable tracing and metrics …
Sep 16, 2022
2269b57
feat(instrumentation): add the correct grpc opentelmetery insturmenta…
Sep 19, 2022
14cb744
feat(serve): instrument grpc server and channel with interceptors
Sep 19, 2022
f53be22
style: fix overload and cli autocomplete
jina-bot Sep 19, 2022
a4a4621
feat(instrumentation): provide opentelemety context from the grpc cli…
Sep 20, 2022
78efb44
feat(instrumentation): check for opentelemetry environment variables …
Sep 20, 2022
7116e9f
feat(instrumentation): create InstrumentationMixin for server and cli…
Sep 20, 2022
92d3679
chore(instrumentation): use absolute module import
Sep 21, 2022
eb0ccd3
feat(instrumentation): trace http and websocket server and clients
Sep 21, 2022
38cae61
chore(instrumentation): update/add new opentelemetry arguments
Sep 21, 2022
45d1794
feat(instrumentation): globally disable tracing health check requests
Sep 21, 2022
b107f80
feat(instrumentation): add InstrumentationMixIn for Head and Worker r…
Sep 22, 2022
cd17588
feat(instrumentation): disable tracing of ServerReflection and endpoi…
Sep 26, 2022
2e44270
test(instrumentation): add basic tracing and metrics tests for HTTP G…
Sep 26, 2022
a083146
test(instrumentation): move test common code for tracing and metrics …
Sep 26, 2022
30ee9e3
feat(instrumentation): enable tracing of flow internal and start up r…
Sep 26, 2022
3998e2f
test(instrumentation): move test common code to new base class
Sep 26, 2022
30409c2
test(instrumentation): test grpc gateway opentelemety instrumentation
Sep 26, 2022
e2ee862
feat(instrumentation): add Jaeger export agent and required configura…
Sep 27, 2022
a0bfaf8
chore(instrumentation): remove print statement
Sep 27, 2022
60be044
test(instrumentation): document spans in the grpc and http gateway in…
Sep 27, 2022
9da9eaf
Merge branch 'master' into feat-instrumentation-5155
Sep 27, 2022
adb96ba
style: fix overload and cli autocomplete
jina-bot Sep 27, 2022
0af8ffb
chore: remove print statement
Sep 27, 2022
a241f62
test(instrumentation): add instrumentaiton tests for websocket gateway
Sep 27, 2022
528e38b
fix: import openetelmetry api globally and the other dependencies onl…
Sep 27, 2022
47ed0a8
fix: use class name as default name when creating Executor instrument…
Sep 27, 2022
3f436da
fix: provide argparse arguments to AlternativeGateway
Sep 27, 2022
578e882
style: fix overload and cli autocomplete
jina-bot Sep 27, 2022
aa5a34a
style: fix overload and cli autocomplete
Sep 28, 2022
87c15f5
Merge branch 'master' into feat-instrumentation-5155
Sep 28, 2022
f7b4af4
style: fix overload and cli autocomplete
jina-bot Sep 28, 2022
3a2e1de
style: fix overload and cli autocomplete
Sep 28, 2022
82dad9c
style: fix overload and cli autocomplete
jina-bot Sep 28, 2022
42d00e6
fix: revert changes for Gateway implementation
Sep 29, 2022
9ade3b6
Merge branch 'master' into feat-instrumentation-5155
Sep 29, 2022
4132396
feat(instrumentation): remove init method from InstrumentationMixin
Sep 29, 2022
4efbbd7
feat(instrumentation): create vendor neutral opentelemetry export arg…
Sep 29, 2022
8e9abcb
style: fix overload and cli autocomplete
Sep 29, 2022
8eed211
feat(instrumentation): inject tracing variables from AsyncLoopRuntime…
Sep 30, 2022
175a399
style: fix overload and cli autocomplete
jina-bot Sep 30, 2022
030b980
feat(instrumentation): configure a OTLP collector for exporting trace…
Sep 30, 2022
c686498
style: fix overload and cli autocomplete
jina-bot Sep 30, 2022
6d21a3a
feat(instrumentation): return None for aio server interceptors if tra…
Oct 4, 2022
00c6c12
test: fix handling of optional args
Oct 5, 2022
92c0e1f
Merge branch 'master' into feat-instrumentation-5155
Oct 5, 2022
6e27829
fix: remove print debug statement
Oct 5, 2022
366a20e
fix: fix gateway class loading
alaeddine-13 Oct 5, 2022
822b541
Merge branch 'feat-instrumentation-5155' of github.com:jina-ai/jina i…
alaeddine-13 Oct 5, 2022
963b82d
feat(instrumentation): fix BaseGateway telemetry dependency injection
Oct 5, 2022
6433930
fix: fix WebsocketGateway loading
alaeddine-13 Oct 5, 2022
ffadb73
fix(instrumentation): correctly handle default executor runtime_args
Oct 5, 2022
3f6eeff
test(instrumentation): add integration tests for grpc, http and webso…
Oct 5, 2022
6b35909
test(instrumentation): parameterize instrumentation tests
Oct 5, 2022
2906369
test(instrumentation): remove outdated tests replaced by parametrized…
Oct 6, 2022
f1ad7a2
fix(instrumentation): fix executor instrumentation setup
Oct 6, 2022
d7bb8d9
fix(instrumentation): force spawn process when running flows in param…
Oct 6, 2022
5e31dca
feat(instrumentation): omit opentelemetry from cli args
Oct 6, 2022
c23f30a
style: fix overload and cli autocomplete
jina-bot Oct 6, 2022
bcc39a8
test: small test refactoring
JoanFM Oct 6, 2022
c540628
Merge branch 'master' into feat-instrumentation-5155
Oct 6, 2022
2ce9c67
style: fix overload and cli autocomplete
Oct 6, 2022
adcb457
style: fix overload and cli autocomplete
jina-bot Oct 6, 2022
0ae5f99
Merge branch 'master' into feat-instrumentation-5155
Oct 6, 2022
b45de43
test: dont set multiprocessing start method to spawn
Oct 6, 2022
bbd2fb8
fix: hide opentelemetry imports
Oct 6, 2022
222cfb9
Merge branch 'master' into feat-instrumentation-5155
JoanFM Oct 6, 2022
dcf7296
fix(runtimes): shutdown instrumentation exporters during teardown
Oct 7, 2022
57be55e
test: spawn processes by default in tests
Oct 7, 2022
7266abc
Merge branch 'feat-instrumentation-5155' of github.com:jina-ai/jina i…
Oct 7, 2022
e9e78ae
fix: provide client and server interceptors only when tracing is ena…
Oct 7, 2022
3656afc
Merge branch 'master' into feat-instrumentation-5155
Oct 7, 2022
4f83c47
fix(serve): correctly handle default instrumentation runtime_args
Oct 7, 2022
a9d5b1b
chore: hide opentelemetry imports under TYPE_CHECKING
Oct 7, 2022
a706480
test: avoid using spawn
JoanFM Oct 7, 2022
ef4a232
fix: add explicit type info and hide imports
Oct 7, 2022
1c0aedd
fix(executors): handle optional runtime_args correctly
Oct 7, 2022
c292234
chore: rename otel_context to tracing_context
Oct 7, 2022
01d543b
feat: use None instead of NoOp tracer and meter implementations
Oct 10, 2022
4afc51b
fix: remove unused import
Oct 10, 2022
70146e4
feat: add default tracing span for DataRequestHandler handle invocation
Oct 10, 2022
7f20c06
test: add test case to verify exception recording in a span
Oct 10, 2022
550a975
fix: use continue_on_error instead of try-except-pass
Oct 10, 2022
b644004
Merge branch 'master' into feat-instrumentation-5155
girishc13 Oct 10, 2022
d55d86c
chore: rename method name to match returning a list
Oct 11, 2022
132a932
fix: rename span_exporter args to traces_exporter
Oct 11, 2022
bb0b003
style: fix overload and cli autocomplete
jina-bot Oct 11, 2022
6 changes: 6 additions & 0 deletions extra-requirements.txt
@@ -35,8 +35,14 @@ packaging>=20.0: core
docarray>=0.16.4: core
jina-hubble-sdk>=0.15.1: core
jcloud>=0.0.35: core
opentelemetry-api>=1.12.0: core
uvloop: perf,standard,devel
prometheus_client: perf,standard,devel
opentelemetry-sdk>=1.12.0: perf,standard,devel
opentelemetry-exporter-otlp>=1.12.0: perf,standard,devel
opentelemetry-exporter-prometheus>=1.12.0rc1: perf,standard,devel
opentelemetry-semantic-conventions>=0.33b0: perf,standard,devel
opentelemetry-instrumentation-grpc>=0.33b0: perf,standard,devel
fastapi>=0.76.0: standard,devel
uvicorn[standard]: standard,devel
docarray[common]>=0.16.3: standard,devel
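The requirements change above places `opentelemetry-api` in `core`, while the SDK, the exporters and the gRPC instrumentation are only listed under `perf`, `standard` and `devel`. A minimal sketch, using only the standard library, of how code might check for the optional packages before enabling tracing. The helper name `has_module` is hypothetical and is not part of Jina.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be imported in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec imports the parent packages of a dotted name and raises
        # ModuleNotFoundError (an ImportError subclass) if one is missing.
        return False


# Module names taken from the imports used in this PR.
sdk_available = has_module('opentelemetry.sdk')
grpc_instrumentation_available = has_module('opentelemetry.instrumentation.grpc')
```

With a `core`-only install both flags would be expected to be `False`, which is the case where the SDK imports need to stay lazy.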
1 change: 1 addition & 0 deletions jina/__init__.py
@@ -102,6 +102,7 @@ def _warning_on_one_line(message, category, filename, lineno, *args, **kwargs):
'JINA_OPTOUT_TELEMETRY',
'JINA_RANDOM_PORT_MAX',
'JINA_RANDOM_PORT_MIN',
'JINA_ENABLE_OTEL_TRACING',
)

__default_host__ = _os.environ.get(
2 changes: 2 additions & 0 deletions jina/clients/__init__.py
@@ -24,6 +24,8 @@ def Client(
protocol: Optional[str] = 'GRPC',
proxy: Optional[bool] = False,
tls: Optional[bool] = False,
opentelemetry_tracing: Optional[bool] = False,
opentelemetry_metrics: Optional[bool] = False,
**kwargs
) -> Union[
'AsyncWebSocketClient',
4 changes: 3 additions & 1 deletion jina/clients/base/__init__.py
@@ -17,9 +17,10 @@

InputType = Union[GeneratorSourceType, Callable[..., GeneratorSourceType]]
CallbackFnType = Optional[Callable[[Response], None]]
from jina.serve.instrumentation import InstrumentationMixin


-class BaseClient(ABC):
+class BaseClient(InstrumentationMixin, ABC):
"""A base client for connecting to the Flow Gateway.

:param args: the Namespace from argparse
@@ -47,6 +48,7 @@ def __init__(
os.unsetenv('https_proxy')
self._inputs = None
send_telemetry_event(event='start', obj=self)
self._setup_instrumentation()

@staticmethod
def check_input(inputs: Optional['InputType'] = None, **kwargs) -> None:
16 changes: 16 additions & 0 deletions jina/parsers/client.py
@@ -30,3 +30,19 @@ def mixin_client_features_parser(parser):
default=False,
help='If set, then the input and output of this Client work in an asynchronous manner. ',
)

parser.add_argument(
'--opentelemetry-tracing',
[Review comment, Member] Just tracing?
[Review comment, Member] I agree here, I am not sure people are interested in the OpenTelemetry detail?

action='store_true',
default=False,
help='If set, a real tracer implementation is available and enabled for automatic request tracing and custom span creation. '
'Otherwise a no-op implementation will be provided.',
)

parser.add_argument(
'--opentelemetry-metrics',
action='store_true',
default=False,
help='If set, a real metrics implementation is available for default monitoring and custom measurements. '
'Otherwise a no-op implementation will be provided.',
)
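Both flags added above are plain `store_true` switches, so they default to `False` and argparse exposes them with underscores instead of dashes. A minimal standalone sketch, using a bare `ArgumentParser` rather than Jina's actual parser setup, with the help strings omitted:

```python
import argparse

# Bare parser for illustration only; Jina builds its parsers through mixin functions.
parser = argparse.ArgumentParser()
parser.add_argument('--opentelemetry-tracing', action='store_true', default=False)
parser.add_argument('--opentelemetry-metrics', action='store_true', default=False)

# No flags given: both features stay off.
default_args = parser.parse_args([])

# Only tracing requested: metrics stays off.
tracing_only = parser.parse_args(['--opentelemetry-tracing'])
```

The resulting attributes, `args.opentelemetry_tracing` and `args.opentelemetry_metrics`, are the ones read by `_setup_instrumentation` in this PR.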
16 changes: 16 additions & 0 deletions jina/parsers/orchestrate/pod.py
@@ -119,3 +119,19 @@ def mixin_pod_parser(parser):
help='If set, the current Pod/Deployment can not be further chained, '
'and the next `.add()` will chain after the last Pod/Deployment not this current one.',
)

gp.add_argument(
'--opentelemetry-tracing',
[Review comment, Member] Just tracing?

action='store_true',
default=False,
help='If set, a real tracer implementation is available and enabled for automatic request tracing and custom span creation. '
'Otherwise a no-op implementation will be provided.',
)

gp.add_argument(
'--opentelemetry-metrics',
action='store_true',
default=False,
help='If set, a real metrics implementation is available for default monitoring and custom measurements. '
'Otherwise a no-op implementation will be provided.',
)
6 changes: 6 additions & 0 deletions jina/resources/extra-requirements.txt
@@ -35,8 +35,14 @@ packaging>=20.0: core
docarray>=0.16.4: core
jina-hubble-sdk>=0.15.1: core
jcloud>=0.0.35: core
opentelemetry-api>=1.12.0: core
uvloop: perf,standard,devel
prometheus_client: perf,standard,devel
opentelemetry-sdk>=1.12.0: perf,standard,devel
opentelemetry-exporter-otlp>=1.12.0: perf,standard,devel
opentelemetry-exporter-prometheus>=1.12.0rc1: perf,standard,devel
opentelemetry-semantic-conventions>=0.33b0: perf,standard,devel
opentelemetry-instrumentation-grpc>=0.33b0: perf,standard,devel
fastapi>=0.76.0: standard,devel
uvicorn[standard]: standard,devel
docarray[common]>=0.16.3: standard,devel
7 changes: 7 additions & 0 deletions jina/serve/executors/__init__.py
@@ -7,6 +7,8 @@
from types import SimpleNamespace
from typing import TYPE_CHECKING, Any, Dict, Optional, Type, Union

from opentelemetry import metrics, trace

from jina import __args_executor_init__, __cache_path__, __default_endpoint__
from jina.enums import BetterEnum
from jina.helper import (
@@ -140,6 +142,7 @@ def __init__(
self._add_requests(requests)
self._add_runtime_args(runtime_args)
self._init_monitoring()
self._init_instrumentation()
self._init_workspace = workspace
self.logger = JinaLogger(self.__class__.__name__)
if __dry_run_endpoint__ not in self.requests:
@@ -186,6 +189,10 @@ def _init_monitoring(self):
self._summary_method = None
self._metrics_buffer = None

def _init_instrumentation(self):
self.tracer = trace.get_tracer(self.runtime_args.name)
self.meter = metrics.get_meter(self.runtime_args.name)

def _add_requests(self, _requests: Optional[Dict]):
if not hasattr(self, 'requests'):
self.requests = {}
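In this hunk `_init_instrumentation` reads `self.runtime_args.name` directly, which raises `AttributeError` when `runtime_args` carries no `name`. A later commit in this PR mentions using the class name as the default. A minimal sketch of that fallback follows; the helper `instrumentation_name` and the `MyExecutor` class are hypothetical illustrations, not Jina code.

```python
from types import SimpleNamespace


def instrumentation_name(obj) -> str:
    """Pick the name to hand to get_tracer/get_meter for `obj`."""
    runtime_args = getattr(obj, 'runtime_args', None)
    # getattr on None simply returns the default, so a missing
    # runtime_args and a missing name are handled the same way.
    name = getattr(runtime_args, 'name', None)
    return name or obj.__class__.__name__


class MyExecutor:
    def __init__(self, runtime_args=None):
        self.runtime_args = runtime_args


named = instrumentation_name(MyExecutor(SimpleNamespace(name='encoder')))
unnamed = instrumentation_name(MyExecutor(SimpleNamespace()))
missing = instrumentation_name(MyExecutor())
```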
77 changes: 77 additions & 0 deletions jina/serve/instrumentation/__init__.py
@@ -0,0 +1,77 @@
from opentelemetry import metrics, trace
from opentelemetry.instrumentation.grpc import (
    client_interceptor as grpc_client_interceptor,
)
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

from jina.serve.instrumentation._aio_client import (
    StreamStreamAioClientInterceptor,
    StreamUnaryAioClientInterceptor,
    UnaryStreamAioClientInterceptor,
    UnaryUnaryAioClientInterceptor,
)


class InstrumentationMixin:
    '''Instrumentation mixin for OpenTelemetry Tracing and Metrics handling'''

    def __init__(self) -> None:
        self.tracer = trace.NoOpTracer()
        self.meter = metrics.NoOpMeter(name='no-op')

    def _setup_instrumentation(self) -> None:
        name = self.__class__.__name__
        if hasattr(self, 'name') and self.name:
            name = self.name
        resource = Resource(attributes={SERVICE_NAME: name})

        if self.args.opentelemetry_tracing:
            provider = TracerProvider(resource=resource)
            processor = BatchSpanProcessor(ConsoleSpanExporter())
            provider.add_span_processor(processor)
            trace.set_tracer_provider(provider)
            self.tracer = trace.get_tracer(name)

        if self.args.opentelemetry_metrics:
            metric_reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
            meter_provider = MeterProvider(
                metric_readers=[metric_reader], resource=resource
            )
            metrics.set_meter_provider(meter_provider)
            self.meter = metrics.get_meter(name)

    def aio_tracing_server_interceptor(self):
        '''Create a gRPC aio server interceptor.
        :returns: A service-side aio interceptor object.
        '''
        from . import _aio_server

        return _aio_server.OpenTelemetryAioServerInterceptor(self.tracer)

    @staticmethod
    def aio_tracing_client_interceptors():
        '''Create a gRPC client aio channel interceptor.
        :returns: An invocation-side list of aio interceptor objects.
        '''
        tracer = trace.get_tracer(__name__)

        return [
            UnaryUnaryAioClientInterceptor(tracer),
            UnaryStreamAioClientInterceptor(tracer),
            StreamUnaryAioClientInterceptor(tracer),
            StreamStreamAioClientInterceptor(tracer),
        ]

    @staticmethod
    def tracing_client_interceptor():
        '''
        :returns: a gRPC client interceptor with the global tracing provider.
        '''
        return grpc_client_interceptor(trace.get_tracer_provider())
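The mixin above follows a simple pattern: resolve a service name, then install a real tracer only when the flag is set, so call sites can use `self.tracer` either way. A minimal standalone sketch of that pattern, using only the standard library. `NoOpTracer`, `RecordingTracer`, `InstrumentationSketchMixin` and `DemoClient` are hypothetical stand-ins written for this illustration; they are not the OpenTelemetry SDK classes and not Jina's implementation.

```python
from contextlib import contextmanager
from types import SimpleNamespace


class NoOpTracer:
    """Stand-in tracer that records nothing."""

    @contextmanager
    def start_as_current_span(self, name):
        yield None


class RecordingTracer:
    """Stand-in for a real tracer: remembers the names of finished spans."""

    def __init__(self, service_name):
        self.service_name = service_name
        self.finished_spans = []

    @contextmanager
    def start_as_current_span(self, name):
        try:
            yield name
        finally:
            self.finished_spans.append(name)


class InstrumentationSketchMixin:
    """Chooses a tracer based on `args.opentelemetry_tracing`."""

    def _setup_instrumentation(self):
        # Same name resolution as in the diff: prefer `self.name`,
        # fall back to the class name.
        name = getattr(self, 'name', None) or self.__class__.__name__
        if self.args.opentelemetry_tracing:
            self.tracer = RecordingTracer(name)
        else:
            self.tracer = NoOpTracer()


class DemoClient(InstrumentationSketchMixin):
    def __init__(self, args):
        self.args = args
        self._setup_instrumentation()

    def post(self, endpoint):
        # The call site is identical whether or not tracing is enabled.
        with self.tracer.start_as_current_span(f'post {endpoint}'):
            return f'sent to {endpoint}'


traced = DemoClient(SimpleNamespace(opentelemetry_tracing=True))
untraced = DemoClient(SimpleNamespace(opentelemetry_tracing=False))
traced_result = traced.post('/index')
untraced_result = untraced.post('/index')
```

The sketch assigns the tracer in both branches of `_setup_instrumentation`, so it does not depend on a mixin `__init__` having run first.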