Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance logging #2490

Merged
merged 11 commits into from
Feb 5, 2024
Merged

Enhance logging #2490

merged 11 commits into from
Feb 5, 2024

Conversation

bwoebi
Copy link
Collaborator

@bwoebi bwoebi commented Jan 26, 2024

Description

This adds span serialization, span creation and hook invocation logging as well as sidecar logging.

It also features two new config knobs: DD_TRACE_LOG_FILE and DD_TRACE_LOG_LEVEL.

If a log file is given (i.e. not stderr/syslog), sidecar logs will be included.

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

@codecov-commenter
Copy link

codecov-commenter commented Jan 26, 2024

Codecov Report

Merging #2490 (6ae58e7) into master (679da8e) will decrease coverage by 0.42%.
Report is 3 commits behind head on master.
The diff coverage is 72.38%.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2490      +/-   ##
============================================
- Coverage     78.71%   78.29%   -0.42%     
+ Complexity      267      216      -51     
============================================
  Files           110      105       -5     
  Lines         13191    13109      -82     
============================================
- Hits          10383    10264     -119     
- Misses         2808     2845      +37     
Flag Coverage Δ
tracer-extension 78.22% <72.38%> (-0.45%) ⬇️
tracer-integrations 79.86% <ø> (+0.37%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
ext/configuration.h 100.00% <ø> (ø)
ext/ddtrace_arginfo.h 100.00% <ø> (ø)
ext/priority_sampling/priority_sampling.c 95.00% <100.00%> (+0.08%) ⬆️
ext/random.c 92.94% <ø> (ø)
ext/serializer.c 79.04% <100.00%> (+0.02%) ⬆️
ext/sidecar.c 83.92% <100.00%> (+1.57%) ⬆️
ext/span.c 92.82% <100.00%> (+0.30%) ⬆️
ext/compatibility.h 74.28% <83.33%> (-0.72%) ⬇️
ext/configuration.c 78.46% <50.00%> (-1.23%) ⬇️
ext/hook/uhook_legacy.c 87.94% <83.33%> (-0.13%) ⬇️
... and 5 more

... and 20 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 679da8e...6ae58e7. Read the comment docs.

@pr-commenter
Copy link

pr-commenter bot commented Jan 26, 2024

Benchmarks

Benchmark execution time: 2024-02-05 14:31:13

Comparing candidate commit 9a595b2 in PR branch bob/log with baseline commit 679da8e in branch master.

Found 0 performance improvements and 4 performance regressions! Performance is the same for 36 metrics, 50 unstable metrics.

scenario:HookBench/benchHookOverheadInstallHookOnFunction

  • 🟥 execution_time [+190.759µs; +238.705µs] or [+24.801%; +31.035%]

scenario:HookBench/benchHookOverheadInstallHookOnMethod

  • 🟥 execution_time [+195.461µs; +236.355µs] or [+25.169%; +30.435%]

scenario:SpanBench/benchDatadogAPI

  • 🟥 execution_time [+3.171µs; +3.666µs] or [+18.691%; +21.609%]

scenario:TraceSerializationBench/benchSerializeTrace

  • 🟥 execution_time [+13.205µs; +29.595µs] or [+7.585%; +16.999%]

@bwoebi bwoebi force-pushed the bob/log branch 8 times, most recently from 2a66f94 to 222a50e Compare January 29, 2024 18:02
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
@bwoebi bwoebi marked this pull request as ready for review January 29, 2024 20:14
@bwoebi bwoebi requested review from a team as code owners January 29, 2024 20:14
Copy link
Collaborator

@pierotibou pierotibou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have time for a big review yet, so my first concern is about the performance impact. Do the benchmarks run with debug logs? If yes, they shouldn't and we should rerun the benchmarks without debug logs.

If they don't, I wonder if we have opportunities to avoid decreasing as much performances. Maybe we need a lot of: "if debugEnabled then ..."

@@ -37,6 +37,7 @@ x-aliases:
- PHP_IDE_CONFIG=serverName=docker
- DD_TRACE_DOCKER_DEBUG
- DATADOG_HAVE_DEV_ENV=1
- DD_SPAWN_WORKER_USE_EXEC=1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it, it's for convenience to be able to easily attach gdb to the sidecar.

@@ -238,6 +238,17 @@ static void dd_zend_interrupt_function(zend_execute_data *ex)
}
#endif

void dd_uhook_log_invocation(void (*log)(const char *, ...), zend_execute_data *execute_data, const char *type, zend_object *closure) {
const zend_function *func = zend_get_closure_method_def(closure);
log("Running a %s hook function from %s:%d on %s %s%s%s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using log here instead of LOG? Does it matter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the log variable has the log level included and is spawned into your scope when using LOGEV.

ext/logging.c Show resolved Hide resolved
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Comment on lines +30 to +32
void ddtrace_log_ginit(void) {
dd_log_set_level(get_global_DD_TRACE_DEBUG());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are we calling this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ensure the log level is initialized per thread (adding comment)

Comment on lines +20 to +32
const Error = 1;
const Warn = 2;
const Info = 3;
const Debug = 4;
const Trace = 5;
const Once = 1 << 3; // I.e. once per request
const _Deprecated = 3 | (1 << 4);
const Deprecated = 3 | (1 << 4) | (1 << 3) /* Once */;
const Startup = 3 | (2 << 4);
const Startup_Warn = 1 | (2 << 4);
const Span = 4 | (3 << 4);
const Span_Trace = 5 | (3 << 4);
const Hook_Trace = 5 | (4 << 4);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, there was a None flag (0). Why isn't it needed anymore? Are we constantly logging something (e.g., Info)?

If we are constantly logging something, could this be the origin of the decrease in micro performance? It could make, therefore, sense that it doesn't appear in macrobenchmarks, considering the relative overhead is negligible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are only error levels of error messages, not the error levels you can configure.

The default error level is Error. To turn all errors off you could even set it to Off.

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Comment on lines +23 to +27
} else if (zend_string_equals_literal_ci(Z_STR(zai_config_memoized_entries[DDTRACE_CONFIG_DD_TRACE_LOG_LEVEL].decoded_value), "error")) {
ddog_set_error_log_level(once); // optimized handling without parsing
} else {
ddog_set_log_level(dd_zend_string_to_CharSlice(get_global_DD_TRACE_LOG_LEVEL()), once);
}
Copy link
Contributor

@PROFeNoM PROFeNoM Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand this correctly, the "least" amount of logs one can set is error, right? Even if DD_TRACE_DEBUG is disabled?

I'd have thought that if DD_TRACE_DEBUG is disabled, then we have no logs at all.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If DD_TRACE_DEBUG is disabled, error is still present (current state). DD_TRACE_LOG_LEVEL=off would fully disable it.

@bwoebi bwoebi force-pushed the bob/log branch 2 times, most recently from 597d994 to 1e85eaf Compare February 2, 2024 20:11
ext/ddtrace.c Outdated Show resolved Hide resolved
ext/ddtrace.c Outdated
glibc__cxa_thread_atexit_impl = DL_FETCH_SYMBOL(NULL, "__cxa_thread_atexit_impl");
if (glibc__cxa_thread_atexit_impl == NULL) {
glibc__cxa_thread_atexit_impl = CXA_THREAD_ATEXIT_UNAVAILABLE;
pthread_key_create(&dd_cxa_thread_atexit_key, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to use this rather than a thread-local variable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stupid mistake. The point was having the destructor used, but obviously I set it to NULL :-)

ext/ddtrace.c Outdated

// Note: this symbol is not public
int __cxa_thread_atexit_impl(void (*func)(void *), void *obj, void *dso_symbol) {
while (glibc__cxa_thread_atexit_impl != CXA_THREAD_ATEXIT_PHP && glibc__cxa_thread_atexit_impl != CXA_THREAD_ATEXIT_UNAVAILABLE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about this. glibc__cxa_thread_atexit_impl is a global variable that's read and written to from several threads without synchronization. the race doesn't seem benign if two threads enter the function with glibc__cxa_thread_atexit_impl == CXA_THREAD_ATEXIT_UNINITIALIZED at the same time (pthread_key_create is called twice, and one or both could use the first value written for dd_cxa_thread_atexit_key)

Also the loop seems to obscure more than to help readability. From what I can tell it would be the same to have

if (glibc__cxa_thread_atexit_impl == CXA_THREAD_ATEXIT_UNINITIALIZED) {
  // set to symbol address or CXA_THREAD_ATEXIT_UNAVAILABLE
  if (glibc__cxa_thread_atexit_impl == CXA_THREAD_ATEXIT_UNAVAILABLE) {
    // call pthread_key_create
  }
}

if ((uintptr_t)glibc__cxa_thread_atexit_impl > 2) {
   // call the real thing
  return;
} 

if (glibc__cxa_thread_atexit_impl == CXA_THREAD_ATEXIT_UNAVAILABLE) {
   // pthread_setspecific ... 
}

// the rest

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but the function will be called the first time before any threads are forked. I may add a note about that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And agree, simplified now.

@bwoebi bwoebi force-pushed the bob/log branch 2 times, most recently from c48375d to 9a595b2 Compare February 5, 2024 14:01
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Copy link
Contributor

@PROFeNoM PROFeNoM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, as the performance degradations, as discussed internally, have no macro impact.

@bwoebi bwoebi merged commit daed7ad into master Feb 5, 2024
617 of 624 checks passed
@bwoebi bwoebi deleted the bob/log branch February 5, 2024 15:34
@github-actions github-actions bot added this to the 0.98.0 milestone Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants