New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: apply LogLevelMax to messages about units #19050
core: apply LogLevelMax to messages about units #19050
Conversation
src/core/unit.h
Outdated
@@ -879,6 +879,11 @@ static inline bool unit_has_job_type(Unit *u, JobType type) { | |||
return u && u->job && u->job->type == type; | |||
} | |||
|
|||
static inline bool unit_filters_log_level(const Unit *u, int level) { | |||
ExecContext *ec = unit_get_exec_context((Unit *)u); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
casting the const away looks wrong. Please move that to unit_get_exec_context()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks pretty good, but please move the testing of the log level into some log_unit_struct() macro, i.e. move it into the log concept, so that we don't accidentally forget it.
src/core/cgroup.c
Outdated
@@ -2785,7 +2785,7 @@ int unit_check_oomd_kill(Unit *u) { | |||
if (!increased) | |||
return 0; | |||
|
|||
if (n > 0) | |||
if (n > 0 && !unit_filters_log_level(u, LOG_NOTICE)) | |||
log_struct(LOG_NOTICE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, please instead add log_unit_struct() or so, that adds LOG_UNIT_ID(u) and so on implicitly, and does this check internally, too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG_UNIT_ID(u)
(etc.) doesn't appear in exactly the same place in every such call. There's a pattern I can abstract that covers the majority of the cases; should I abstract that and leave the remainder as a combination of unit_log_level_test
and log_struct
, or have log_unit_struct
not include LOG_UNIT_ID(u)
at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the order of the fields doesn't matter, journald might reorder anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, thanks!
src/core/execute.c
Outdated
command->path), | ||
"EXECUTABLE=%s", command->path); | ||
if (!unit_filters_log_level(unit, LOG_INFO)) | ||
log_struct_errno(LOG_INFO, r, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, please add log_unit_struct_errno(), wich encapsulates this internally
static inline bool unit_filters_log_level(const Unit *u, int level) { | ||
ExecContext *ec = unit_get_exec_context((Unit *)u); | ||
return ec && ec->log_level_max >= 0 && ec->log_level_max < LOG_PRI(level); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd invert the function result btw. i.e. i'd always focus on the positive case, i.e. letting the logging go though. i.e. invert and rename unit_log_level_test(…) or so?
src/core/job.c
Outdated
if (!unit_log_level_test(u, LOG_NOTICE)) | ||
log_struct(LOG_NOTICE, | ||
"JOB_TYPE=%s", job_type_to_string(t), | ||
"JOB_RESULT=%s", job_result_to_string(result), | ||
LOG_UNIT_ID(u), | ||
LOG_UNIT_MESSAGE(u, "Job %s/%s failed with result '%s'.", | ||
u->id, | ||
job_type_to_string(t), | ||
job_result_to_string(result))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@poettering This looks like it should be a log_unit_struct
call as well; the reason it isn't is that, unlike all the other calls to log_struct
that I modified, this one doesn't contain a LOG_UNIT_INVOCATION_ID
field. I don't know if that was an oversight (and thus that this should be a log_unit_struct
call like the others) or if there's a reason to exclude that field from this call; and if the latter, is this an acceptable solution, or should I remove LOG_UNIT_INVOCATION_ID
from log_unit_struct
and restore it to all the other call sites?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the invocation ID only makes sense when we log about units that are currently active (or at least were just active). but a job can can be enqueued/failed long before a unit is started, i.e. long bevor the invocation ID is assigned to it... Hence, we don't add it here, becuse the job is not directly associated to a specific invocation of a unit, it's usually preparation for a state change, and that's quite often to start something which will then, but only then result in an invocatin ID allocation.
hence, I'd suggest LOG_UNIT_INVOCATION_ID() is something that should be passed manually to log_unit_strruct(), even if we'll pass it in 90% of the cases
This commit applies the filtering imposed by LogLevelMax on a unit's processes to messages logged by PID1 about the unit as well. The target use case for this feature is a service that runs on a timer many times an hour, where the system administrator decides that writing a generic success message to the journal every few minutes or seconds adds no diagnostic value and isn't worth the clutter or disk I/O.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good. The log_unit_struct() changes is a big improvement, even beyond the log level cutoff.
THe man page should get some update too, saying this also applies about the logging from PID 1 about the unit now.
@@ -557,6 +557,9 @@ static void job_log_begin_status_message(Unit *u, uint32_t job_id, JobType t) { | |||
if (!IN_SET(t, JOB_START, JOB_STOP, JOB_RELOAD)) | |||
return; | |||
|
|||
if (!unit_log_level_test(u, LOG_INFO)) | |||
return; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the log_struct() call in this function further down should also use log_unit_struct(), i.e please port all calls over systematically, including those where you add an early check like above
@@ -2220,7 +2223,7 @@ static int unit_log_resources(Unit *u) { | |||
t = strjoina(u->manager->invocation_log_field, u->invocation_id_string); | |||
iovec[n_iovec + 3] = IOVEC_MAKE_STRING(t); | |||
|
|||
log_struct_iovec(log_level, iovec, n_iovec + 4); | |||
log_unit_struct_iovec(u, log_level, iovec, n_iovec + 4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given this whole function is relatively expensive, it might make sense to add an early-exit here too if the log level would suppress the output anyway
@@ -4492,6 +4495,9 @@ void unit_warn_if_dir_nonempty(Unit *u, const char* where) { | |||
assert(u); | |||
assert(where); | |||
|
|||
if (!unit_log_level_test(u, LOG_NOTICE)) | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, too, please change the log_struct() call below to use log_unit_struct(), too
@@ -957,6 +962,22 @@ int unit_thaw_vtable_common(Unit *u); | |||
#define log_unit_warning_errno(unit, error, ...) log_unit_full_errno(unit, LOG_WARNING, error, __VA_ARGS__) | |||
#define log_unit_error_errno(unit, error, ...) log_unit_full_errno(unit, LOG_ERR, error, __VA_ARGS__) | |||
|
|||
#define log_unit_struct_errno(unit, level, error, ...) \ | |||
({ \ | |||
const Unit *_u = (unit); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add local variables like this for level + error here too? You use both twice here, and we'd rather avoid double evaluation. adding local variables and only assigning them once means we avoid the double eval
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error
appears in multiple places in these macros but is only eval'd once. You're right about level
, though.
src/core/unit.h
Outdated
#define log_unit_struct_iovec_errno(unit, level, error, iovec, n_iovec) \ | ||
(unit_log_level_test(unit, level) ? \ | ||
log_struct_iovec_errno(level, error, iovec, n_iovec) : \ | ||
-ERRNO_VALUE(error)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar here
look excellent! thanks! lgtm |
- instead of noisy systemd timer (systemd/systemd#4919) (fixed a year ago with systemd/systemd#19050) - also lives at https://github.com/eMPee584/night-mode-daemon now - milestone: my first full python app : D
.. is this supposed to work with user units? because it doesn't seem to.. a user service with |
Works for me. Are you sure you've written the unit file correctly? |
doesn't work with old kernels since unpriv userspace cannot set xattrs on cgroups there, which we need for this to work. |
Thanks to both of you for replying.. kernel is
I will have a sterner look at this at some later point. |
This commit applies the filtering imposed by LogLevelMax on a unit's
processes to messages logged by PID1 about the unit as well.
The target use case for this feature is a service that runs on a timer
many times an hour, where the system administrator decides that writing
a generic success message to the journal every few minutes or seconds
adds no diagnostic value and isn't worth the clutter or disk I/O.
Feature design based on #14279 (comment). Closes #4919. The functionality added here is rather under-tested, but covering every possible case where filtering could happen is probably overboard? Let me know if there are other tests you think should be present.