Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support logging #725

Merged
merged 27 commits into from
May 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
dc32644
Add logging feature to the distro
harelmo-lumigo Apr 7, 2024
7294ebd
Fix indentation when emitting logs
harelmo-lumigo Apr 8, 2024
534cec5
Pre-CR changes
harelmo-lumigo Apr 8, 2024
8b37423
Test span correlation
harelmo-lumigo Apr 9, 2024
13fdb9e
Test disabled logging
harelmo-lumigo Apr 9, 2024
9ad0e32
Reuse endpoint
harelmo-lumigo Apr 9, 2024
e92a2fc
Test bad dump-file fallback
harelmo-lumigo Apr 10, 2024
402b629
Secret scrubbing and documentation
harelmo-lumigo Apr 11, 2024
8b1e826
Update docs
harelmo-lumigo Apr 11, 2024
f5d7051
Cleaup
harelmo-lumigo Apr 11, 2024
79a450a
Handle both logged messages and objects
harelmo-lumigo Apr 11, 2024
d08ddf9
Implement scrubbing using custom instrumentation code
harelmo-lumigo Apr 16, 2024
f2ff8a3
Scrub using LumigoLogRecordProcessor
harelmo-lumigo Apr 16, 2024
4528dea
revert .pre-commit-config.yaml
harelmo-lumigo Apr 16, 2024
f640f1c
Note about LUMIGO_ENABLE_LOGS default value
harelmo-lumigo Apr 18, 2024
3f3b4e0
Add missing test-app dependency
harelmo-lumigo Apr 18, 2024
6a6ff1f
Try resolve CI-only test issue
harelmo-lumigo Apr 18, 2024
e646156
Rename test env-var
harelmo-lumigo Apr 18, 2024
2f45cef
Fix import issue
harelmo-lumigo Apr 18, 2024
047d9f6
Move logs process to a dedicated folder
harelmo-lumigo Apr 18, 2024
a36e672
Deps
harelmo-lumigo Apr 18, 2024
573ae66
Add missing __init__ file
harelmo-lumigo Apr 18, 2024
4845cd8
Docs
harelmo-lumigo Apr 21, 2024
ff4e84d
Simplify scrubbing tests
harelmo-lumigo Apr 21, 2024
14ad455
Try fix issue with CI
harelmo-lumigo Apr 21, 2024
aff4898
Try fix issue with CI - take II
harelmo-lumigo Apr 21, 2024
060ddd7
Merge branch 'main' into RD-12474-support-logging-in-python-distro
harelmo-lumigo May 1, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
14 changes: 12 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,10 @@ The Lumigo OpenTelemetry Distribution for Python is a package that provides no-c
The Lumigo OpenTelemetry Distribution for Python is made of several upstream OpenTelemetry packages, with additional automated quality-assurance and customizations that optimize for no-code injection, meaning that you should need to update exactly zero lines of code in your application in order to make use of the Lumigo OpenTelemetry Distribution.
(See the [No-code activation](#no-code-activation) section for auto-instrumentation instructions)

## Logging support
The Lumigo OpenTelemetry Distribution also allows logging span-correlated records. See the [configuration](#logging-instrumentation) section for details on how to enable this feature.
When using the logging feature, the same set of rules for [secret masking](#lumigo-specific-configurations) applies on the content of the log message, with only `LUMIGO_SECRET_MASKING_REGEX` being considered.

**Note:** If you are looking for the Lumigo Python tracer for AWS Lambda functions, [`lumigo-tracer`](https://pypi.org/project/lumigo-tracer/) is the package you should use instead.

## Setup
Expand Down Expand Up @@ -102,9 +106,15 @@ The `lumigo_opentelemetry` package additionally supports the following configura
* `LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX='["regex1", "regex2"]'`: This option enables the filtering of client and server endpoints through regular expression searches. Fine-tune your settings via the following environment variables, which work in conjunction with `LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX` for a specific span type:
* `LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER` applies the regular expression search exclusively to server spans. Searching is performed against the following attributes on a span: `url.path` and `http.target`.
* `LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT` applies the regular expression search exclusively to client spans. Searching is performed against the following attributes on a span: `url.full` and `http.url`.

For more information check out [Filtering http endpoints](#filtering-http-endpoints).

#### Logging instrumentation

* `LUMIGO_ENABLE_LOGS` - Default: `false`. When set to `true`, turns on the `logging` instrumentation to capture log-records logged by Python's `logging` builtin library and send them to Lumigo. Emitted logs will also get injected with the active span context (see [list](https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/logging/logging.html#module-opentelemetry.instrumentation.logging)).

* `LUMIGO_DEBUG_LOGDUMP` - similar to `LUMIGO_DEBUG_SPANDUMP`, only for logs instead of spans. Effective only when `LUMIGO_ENABLE_LOGS` is set to `true`.

### Execution Tags

[Execution Tags](https://docs.lumigo.io/docs/execution-tags) allow you to dynamically add dimensions to your invocations so that they can be identified, searched for, and filtered in Lumigo.
Expand Down Expand Up @@ -410,7 +420,7 @@ For exclusive server (inbound) or client (outbound) span filtering, use the envi

Notes:
* the environment variable must be a valid JSON array of strings, so if you want to match endpoint with the hostname `google.com` the environment variable value should be `["google\\.com"]`.
* If we are filtering out an HTTP call to an opentelemetry traced component, every subsequent invocation made by that
* If we are filtering out an HTTP call to an opentelemetry traced component, every subsequent invocation made by that
component won't be traced either.

Examples:
Expand Down
33 changes: 33 additions & 0 deletions noxfile.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import os
import platform
from posixpath import abspath
import re
import sys
import tempfile
Expand Down Expand Up @@ -1007,6 +1008,38 @@ def integration_tests_redis(
kill_process_and_clean_outputs(temp_file, "test_redis", session)


@nox.session()
@nox.parametrize(
"python",
[(python) for python in python_versions()],
)
def integration_tests_logging(session):
session.install(".")
temp_file = create_it_tempfile("logging")
site_packages_path = abspath(
f"{session.virtualenv.location_name}/lib/python{session.python}/site-packages"
)
with session.chdir("src/test/integration/logging"):
try:
session.run(
"pytest",
"--tb=native",
"--log-cli-level=INFO",
"--color=yes",
"-v",
"-s",
"./tests/test_logging.py",
env={
"LUMIGO_DEBUG_LOGDUMP": temp_file,
"LUMIGO_DEBUG": "true",
"NOX_SITE_PACKAGES_PATH": site_packages_path,
},
)
finally:
# No need to run kill_process_and_clean_outputs() here, as the logging tests are a script that runs and exits
clean_outputs(temp_file, session)


def kill_process_and_clean_outputs(full_path: str, process_name: str, session) -> None:
kill_process(process_name)
clean_outputs(full_path, session)
Expand Down
37 changes: 19 additions & 18 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,25 +17,26 @@
"autowrapt>=1.0",
"wrapt>=1.11.0",
"lumigo_core==0.0.10",
"opentelemetry-api==1.20.0",
"opentelemetry-sdk==1.20.0",
"opentelemetry-api==1.22.0",
"opentelemetry-sdk==1.22.0",
"opentelemetry-sdk-extension-aws==2.0.1",
"opentelemetry-exporter-otlp-proto-http==1.20.0",
"opentelemetry-semantic-conventions==0.41b0",
"opentelemetry-instrumentation==0.41b0",
"opentelemetry-instrumentation-asgi==0.41b0",
"opentelemetry-instrumentation-boto==0.41b0",
"opentelemetry-instrumentation-fastapi==0.41b0",
"opentelemetry-instrumentation-flask==0.41b0",
"opentelemetry-instrumentation-grpc==0.41b0",
"opentelemetry-instrumentation-kafka-python==0.41b0",
"opentelemetry-instrumentation-pika==0.41b0",
"opentelemetry-instrumentation-psycopg2==0.41b0",
"opentelemetry-instrumentation-pymongo==0.41b0",
"opentelemetry-instrumentation-pymysql==0.41b0",
"opentelemetry-instrumentation-requests==0.41b0",
"opentelemetry-instrumentation-redis==0.41b0",
"opentelemetry-instrumentation-django==0.41b0",
"opentelemetry-exporter-otlp-proto-http==1.22.0",
"opentelemetry-semantic-conventions==0.43b0",
"opentelemetry-instrumentation==0.43b0",
"opentelemetry-instrumentation-asgi==0.43b0",
"opentelemetry-instrumentation-boto==0.43b0",
"opentelemetry-instrumentation-fastapi==0.43b0",
"opentelemetry-instrumentation-flask==0.43b0",
"opentelemetry-instrumentation-grpc==0.43b0",
"opentelemetry-instrumentation-kafka-python==0.43b0",
"opentelemetry-instrumentation-pika==0.43b0",
"opentelemetry-instrumentation-psycopg2==0.43b0",
"opentelemetry-instrumentation-pymongo==0.43b0",
"opentelemetry-instrumentation-pymysql==0.43b0",
"opentelemetry-instrumentation-requests==0.43b0",
"opentelemetry-instrumentation-redis==0.43b0",
"opentelemetry-instrumentation-django==0.43b0",
"opentelemetry-instrumentation-logging==0.43b0",
# v4.7.1 is the last version that supports python 3.7
"typing_extensions==4.7.1; python_version<'3.8'",
# v6.7.0 is the last version that supports python 3.7
Expand Down
102 changes: 86 additions & 16 deletions src/lumigo_opentelemetry/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import logging
import os
import sys
from typing import Any, Callable, Dict, List, TypeVar

LOG_FORMAT = "#LUMIGO# - %(asctime)s - %(levelname)s - %(message)s"
Expand Down Expand Up @@ -94,25 +95,38 @@ def init() -> Dict[str, Any]:
)

from opentelemetry import trace
from opentelemetry import _logs
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import SpanLimits, TracerProvider

from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
from opentelemetry.instrumentation.logging import LoggingInstrumentor
from lumigo_opentelemetry.resources.span_processor import LumigoSpanProcessor

DEFAULT_LUMIGO_ENDPOINT = (
"https://ga-otlp.lumigo-tracer-edge.golumigo.com/v1/traces"
)
DEFAULT_DEPENDENCIES_ENDPOINT = (
"https://ga-otlp.lumigo-tracer-edge.golumigo.com/v1/dependencies"
LUMIGO_ENDPOINT_BASE_URL = "https://ga-otlp.lumigo-tracer-edge.golumigo.com/v1"

DEFAULT_LUMIGO_ENDPOINT = f"{LUMIGO_ENDPOINT_BASE_URL}/traces"
DEFAULT_LUMIGO_LOGS_ENDPOINT = f"{LUMIGO_ENDPOINT_BASE_URL}/logs"
DEFAULT_DEPENDENCIES_ENDPOINT = f"{LUMIGO_ENDPOINT_BASE_URL}/dependencies"

lumigo_traces_endpoint = os.getenv("LUMIGO_ENDPOINT", DEFAULT_LUMIGO_ENDPOINT)
lumigo_logs_endpoint = os.getenv(
"LUMIGO_LOGS_ENDPOINT", DEFAULT_LUMIGO_LOGS_ENDPOINT
)
lumigo_endpoint = os.getenv("LUMIGO_ENDPOINT", DEFAULT_LUMIGO_ENDPOINT)
lumigo_token = os.getenv("LUMIGO_TRACER_TOKEN")
lumigo_report_dependencies = os.getenv("LUMIGO_REPORT_DEPENDENCIES", "true").lower()
lumigo_report_dependencies = (
os.getenv("LUMIGO_REPORT_DEPENDENCIES", "true").lower() == "true"
)
logging_enabled = os.getenv("LUMIGO_ENABLE_LOGS", "").lower() == "true"
spandump_file = os.getenv("LUMIGO_DEBUG_SPANDUMP")
logdump_file = os.getenv("LUMIGO_DEBUG_LOGDUMP")

# Activate instrumentations
from lumigo_opentelemetry.instrumentations import instrumentations # noqa
from lumigo_opentelemetry.instrumentations.instrumentations import framework
from lumigo_opentelemetry.libs.general_utils import get_max_size
from lumigo_opentelemetry.processors.logs_processor import LumigoLogRecordProcessor
from lumigo_opentelemetry.resources.detectors import (
get_infrastructure_resource,
get_process_resource,
Expand All @@ -122,27 +136,40 @@ def init() -> Dict[str, Any]:
infrastructure_resource = get_infrastructure_resource()
process_resource = get_process_resource()

tracer_resource = get_resource(
resource = get_resource(
infrastructure_resource, process_resource, {"framework": framework}
)

tracer_provider = TracerProvider(
resource=tracer_resource,
resource=resource,
sampler=_get_lumigo_sampler(),
span_limits=(SpanLimits(max_span_attribute_length=(get_max_size()))),
)

logger_provider = LoggerProvider(resource=resource)
logger_provider.add_log_record_processor(LumigoLogRecordProcessor())

if lumigo_token:
tracer_provider.add_span_processor(
LumigoSpanProcessor(
OTLPSpanExporter(
endpoint=lumigo_endpoint,
endpoint=lumigo_traces_endpoint,
headers={"Authorization": f"LumigoToken {lumigo_token}"},
),
)
)

if lumigo_report_dependencies == "true":
if logging_enabled:
logger_provider.add_log_record_processor(
BatchLogRecordProcessor(
OTLPLogExporter(
endpoint=lumigo_logs_endpoint,
headers={"Authorization": f"LumigoToken {lumigo_token}"},
)
)
)

if lumigo_report_dependencies:
from lumigo_opentelemetry.dependencies import report

try:
Expand All @@ -161,7 +188,6 @@ def init() -> Dict[str, Any]:
"no data will be sent to Lumigo"
)

spandump_file = os.getenv("LUMIGO_DEBUG_SPANDUMP")
if spandump_file:
from opentelemetry.sdk.trace.export import (
ConsoleSpanExporter,
Expand All @@ -184,7 +210,43 @@ def init() -> Dict[str, Any]:

trace.set_tracer_provider(tracer_provider)

return {"tracer_provider": tracer_provider}
if logging_enabled:
harelmo-lumigo marked this conversation as resolved.
Show resolved Hide resolved
_logs.set_logger_provider(logger_provider)

# Add the handler to the root logger, hence affecting all loggers created by the app from now on
logging.getLogger().addHandler(LoggingHandler(logger_provider=logger_provider))

# Inject the span context into logs
LoggingInstrumentor().instrument(set_logging_format=True)

if logdump_file:
harelmo-lumigo marked this conversation as resolved.
Show resolved Hide resolved
from opentelemetry.sdk._logs.export import (
ConsoleLogExporter,
SimpleLogRecordProcessor,
)

try:
output = open(logdump_file, "w")
except Exception:
logger.error(
f"Cannot open the log dump file for writing: {logdump_file}"
)
output = sys.stdout # type: ignore

logger_provider.add_log_record_processor(
SimpleLogRecordProcessor(
ConsoleLogExporter(
out=output,
# Override the default formatter to remove indentation, so one log record will be printed per line
formatter=lambda log_record: log_record.to_json(indent=None)
+ "\n",
)
)
)

logger.debug("Storing a copy of the log data under: %s", logdump_file)

return {"tracer_provider": tracer_provider, "logger_provider": logger_provider}


def lumigo_wrapped(func: Callable[..., T]) -> Callable[..., T]:
Expand All @@ -211,5 +273,13 @@ def wrapper(*args: List[Any], **kwargs: Dict[Any, Any]) -> T:
init_data = init()

tracer_provider = init_data.get("tracer_provider")

__all__ = ["auto_load", "init", "lumigo_wrapped", "logger", "tracer_provider"]
logger_provider = init_data.get("logger_provider")

__all__ = [
"auto_load",
"init",
"lumigo_wrapped",
"logger",
"tracer_provider",
"logger_provider",
]
Empty file.
15 changes: 15 additions & 0 deletions src/lumigo_opentelemetry/processors/logs_processor.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
from opentelemetry.sdk._logs import LogRecordProcessor, LogData


class LumigoLogRecordProcessor(LogRecordProcessor):
def emit(self, log_data: LogData) -> None:
# This class is used by __init__, so moving this to the enclosing scope will result in a circular import
from lumigo_opentelemetry.libs.json_utils import dump

log_data.log_record.body = dump(log_data.log_record.body)

def shutdown(self) -> None:
pass

def force_flush(self, timeout_millis: int = 30000) -> None:
pass
2 changes: 1 addition & 1 deletion src/test/integration/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,6 @@
import pytest


@pytest.fixture(autouse=True)
@pytest.fixture()
def increment_spans_counter():
SpansContainer.update_span_offset()
Empty file.
Empty file.
24 changes: 24 additions & 0 deletions src/test/integration/logging/app/logging_app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
from lumigo_opentelemetry import logger_provider, tracer_provider
import logging

tracer = tracer_provider.get_tracer(__file__)
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)


def write_log(message):
log = {
"some-thing": message,
"some-super-sekret-stuff": "this is a secret",
}

logger.debug(log)


write_log(message="with no recording span")

with tracer.start_as_current_span("some-span") as span:
write_log("with recording span")

logger_provider.force_flush()
logger_provider.shutdown()
Empty file.