
Patch release 1.44.1 #16580

Merged (12 commits, Dec 12, 2023)

1 change: 0 additions & 1 deletion .github/workflows/monitor-releases.yml
@@ -19,7 +19,6 @@ concurrency: # This keeps multiple instances of the job from running concurrentl
jobs:
update-stable-agents-metadata:
name: update-stable-agents-metadata
if: ${{ github.ref == 'refs/heads/master' }}
runs-on: ubuntu-latest
steps:
- name: Checkout
7 changes: 7 additions & 0 deletions collectors/debugfs.plugin/debugfs_plugin.c
@@ -235,6 +235,13 @@ int main(int argc, char **argv)
netdata_log_info("all modules are disabled, exiting...");
return 1;
}

fprintf(stdout, "\n");
fflush(stdout);
if (ferror(stdout) && errno == EPIPE) {
netdata_log_error("error writing to stdout: EPIPE. Exiting...");
return 1;
}
}

fprintf(stdout, "EXIT\n");
100 changes: 23 additions & 77 deletions collectors/log2journal/README.md
@@ -77,6 +77,8 @@ We have an nginx server logging in this standard combined log format:
'"$http_referer" "$http_user_agent"';
```

### Extracting fields with a pattern

First, let's find the right pattern for `log2journal`. We ask ChatGPT:

```
@@ -170,6 +172,8 @@ TIME_LOCAL=19/Nov/2023:00:24:43 +0000

As you can see, it extracted all the fields and uppercased their names, as systemd-journal expects.

### Prefixing field names

To make sure the fields are unique for nginx and do not interfere with other applications, we should prefix them with `NGINX_`:

```yaml
@@ -214,6 +218,8 @@ NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000

```

### Renaming fields

Now all fields start with `NGINX_`, but we want `NGINX_REQUEST` to be the `MESSAGE` of the log line, since that is the field shown by default in `journalctl` and the Netdata dashboard. Let's rename it:

```yaml
@@ -262,7 +268,11 @@ NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000

```

Ideally, we would want the 5xx errors to be red in our `journalctl` output and the dashboard. To achieve that we need to add a PRIORITY field to set the log level. Log priorities are numeric and follow the `syslog` priorities. Checking `/usr/include/sys/syslog.h` we can see these:
### Injecting new fields

To have a complete message in journals we need 3 fields: `MESSAGE`, `PRIORITY` and `SYSLOG_IDENTIFIER`. We have already added `MESSAGE` by renaming `NGINX_REQUEST`. We can also inject a `SYSLOG_IDENTIFIER` and `PRIORITY`.
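
A minimal complete entry would therefore look like this (the values are illustrative, taken from the example we are building):

```
MESSAGE=GET /index.html HTTP/1.1
PRIORITY=6
SYSLOG_IDENTIFIER=nginx-log
```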

Ideally, we would want the 5xx errors to be red in our `journalctl` output and the dashboard. To achieve that we need to set the `PRIORITY` field to the right log level. Log priorities are numeric and follow the `syslog` priorities. Checking `/usr/include/sys/syslog.h` we can see these:

```c
#define LOG_EMERG 0 /* system is unusable */
@@ -279,7 +289,7 @@ Avoid setting priority to 0 (`LOG_EMERG`), because these will be on your termina

To set the PRIORITY field in the output, we can use `NGINX_STATUS`. We will do this in 2 steps: a) inject the priority field as a copy of `NGINX_STATUS` and then b) use a pattern on its value to rewrite it to the priority level we want.

First, let's inject it:
First, let's inject `SYSLOG_IDENTIFIER` and `PRIORITY`:

```yaml
pattern: |
@@ -311,6 +321,9 @@ rename:
inject: # <<< we added this
- key: PRIORITY # <<< we added this
value: '${NGINX_STATUS}' # <<< we added this

- key: SYSLOG_IDENTIFIER # <<< we added this
value: 'nginx-log' # <<< we added this
```

Let's see what this does:
@@ -328,11 +341,14 @@ NGINX_REQUEST_URI=/index.html
NGINX_SERVER_PROTOCOL=HTTP/1.1
NGINX_STATUS=200
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
PRIORITY=200 # <<< PRIORITY added
PRIORITY=200 # <<< PRIORITY added
SYSLOG_IDENTIFIER=nginx-log # <<< SYSLOG_IDENTIFIER added

```

Now we need to rewrite it to the right priority based on its value. We will assign the priority 6 (info) when the status is 1xx, 2xx, 3xx, priority 5 (notice) when status is 4xx, priority 3 (error) when status is 5xx and anything else will go to priority 4 (warning). Let's do it:
### Rewriting field values

Now we need to rewrite `PRIORITY` to the right syslog level based on its value (`NGINX_STATUS`). We will assign the priority 6 (info) when the status is 1xx, 2xx, 3xx, priority 5 (notice) when status is 4xx, priority 3 (error) when status is 5xx and anything else will go to priority 4 (warning). Let's do it:

```yaml
pattern: |
@@ -400,84 +416,14 @@ NGINX_REQUEST_URI=/index.html
NGINX_SERVER_PROTOCOL=HTTP/1.1
NGINX_STATUS=200
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
PRIORITY=6 # <<< PRIORITY rewritten here
PRIORITY=6 # <<< PRIORITY rewritten here
SYSLOG_IDENTIFIER=nginx-log

```

Rewrite rules are powerful. You can have named groups in them, like in the main pattern, to extract sub-fields, which you can then use in variable substitution. You can also use rewrite rules to anonymize URLs, e.g. to remove customer IDs or transaction details from them.
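
For instance, a rewrite rule with named groups could mask a numeric customer ID inside the request URI. The sketch below is hypothetical: the field name, URL layout and pattern are made up for illustration, and it assumes `${...}` substitution of named groups works in rewrite values the same way it does in `inject`:

```yaml
rewrite:
  # mask the numeric customer ID, keeping the rest of the URI intact
  - key: NGINX_REQUEST_URI
    match: '(?<PREFIX>^/customers/)\d+(?<REST>.*)$'
    value: '${PREFIX}REDACTED${REST}'
```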

To complete the example, we can also inject a `SYSLOG_IDENTIFIER`. Generally your journal logs should always have 3 fields: `MESSAGE`, `PRIORITY` and `SYSLOG_IDENTIFIER`. These 3 fields make it a complete entry. Then you can add as many fields as required for your use case.

```yaml
pattern: |
(?x) # Enable PCRE2 extended mode
^
(?<remote_addr>[^ ]+) \s - \s
(?<remote_user>[^ ]+) \s
\[
(?<time_local>[^\]]+)
\]
\s+ "
(?<request>
(?<request_method>[A-Z]+) \s+
(?<request_uri>[^ ]+) \s+
(?<server_protocol>[^"]+)
)
" \s+
(?<status>\d+) \s+
(?<body_bytes_sent>\d+) \s+
"(?<http_referer>[^"]*)" \s+
"(?<http_user_agent>[^"]*)"

prefix: 'NGINX_'

rename:
- new_key: MESSAGE
old_key: NGINX_REQUEST

inject:
- key: PRIORITY
value: '${NGINX_STATUS}'
- key: SYSLOG_IDENTIFIER # <<< we added this
value: 'nginx-log' # <<< we added this

rewrite:
- key: PRIORITY
match: '^[123]'
value: 6

- key: PRIORITY
match: '^4'
value: 5

- key: PRIORITY
match: '^5'
value: 3

- key: PRIORITY
match: '.*'
value: 4
```

Let's see it:

```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 "-" "Go-http-client/1.1"' | log2journal -f nginx.yaml
MESSAGE=GET /index.html HTTP/1.1
NGINX_BODY_BYTES_SENT=4172
NGINX_HTTP_REFERER=-
NGINX_HTTP_USER_AGENT=Go-http-client/1.1
NGINX_REMOTE_ADDR=1.2.3.4
NGINX_REMOTE_USER=-
NGINX_REQUEST_METHOD=GET
NGINX_REQUEST_URI=/index.html
NGINX_SERVER_PROTOCOL=HTTP/1.1
NGINX_STATUS=200
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
PRIORITY=6
SYSLOG_IDENTIFIER=nginx-log # <<< SYSLOG_IDENTIFIER added

```
### Sending logs to systemd-journal

Now the message is ready to be sent to systemd-journal. For this we use `systemd-cat-native`. This command can send such messages to a journal running on localhost, a local journal namespace, or a `systemd-journal-remote` running on another server. By just appending `| systemd-cat-native` to the command, the message will be sent to the local journal.
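
For example, assuming the nginx access log lives at `/var/log/nginx/access.log` and the configuration above is saved as `nginx.yaml` (both paths are illustrative), a pipeline like the following would tail the log and stream every new entry to the local journal:

```bash
# illustrative pipeline; the log path and the YAML file name are assumptions
tail -F /var/log/nginx/access.log | log2journal -f nginx.yaml | systemd-cat-native
```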

10 changes: 7 additions & 3 deletions collectors/proc.plugin/sys_class_drm.c
@@ -648,13 +648,17 @@ static int read_clk_freq_file(procfile **p_ff, const char *const pathname, colle
*p_ff = procfile_open(pathname, NULL, PROCFILE_FLAG_NO_ERROR_ON_FILE_IO);
if(unlikely(!*p_ff)) return -2;
}

if(unlikely(NULL == (*p_ff = procfile_readall(*p_ff)))) return -3;

for(size_t l = 0; l < procfile_lines(*p_ff) ; l++) {
char *str_with_units = NULL;
if((*p_ff)->lines->lines[l].words >= 3 && !strcmp(procfile_lineword((*p_ff), l, 2), "*")) //format: X: collected_number *
str_with_units = procfile_lineword((*p_ff), l, 1);
else if ((*p_ff)->lines->lines[l].words == 2 && !strcmp(procfile_lineword((*p_ff), l, 1), "*")) //format: collected_number *
str_with_units = procfile_lineword((*p_ff), l, 0);

if((*p_ff)->lines->lines[l].words >= 3 && !strcmp(procfile_lineword((*p_ff), l, 2), "*")){
char *str_with_units = procfile_lineword((*p_ff), l, 1);
if (str_with_units) {
char *delim = strchr(str_with_units, 'M');
char str_without_units[10];
memcpy(str_without_units, str_with_units, delim - str_with_units);
14 changes: 4 additions & 10 deletions contrib/debian/netdata-plugin-perf.postinst
@@ -7,16 +7,10 @@ case "$1" in
chown root:netdata /usr/libexec/netdata/plugins.d/perf.plugin
chmod 0750 /usr/libexec/netdata/plugins.d/perf.plugin

if capsh --supports=cap_perfmon 2>/dev/null; then
setcap cap_perfmon+ep /usr/libexec/netdata/plugins.d/perf.plugin
ret="$?"
else
setcap cap_sys_admin+ep /usr/libexec/netdata/plugins.d/perf.plugin
ret="$?"
fi

if [ "${ret}" -ne 0 ]; then
chmod -f 4750 /usr/libexec/netdata/plugins.d/perf.plugin
if ! setcap cap_perfmon+ep /usr/libexec/netdata/plugins.d/perf.plugin 2>/dev/null; then
if ! setcap cap_sys_admin+ep /usr/libexec/netdata/plugins.d/perf.plugin 2>/dev/null; then
chmod -f 4750 /usr/libexec/netdata/plugins.d/perf.plugin
fi
fi
;;
esac
1 change: 0 additions & 1 deletion daemon/analytics.c
@@ -842,7 +842,6 @@ void set_global_environment() {
setenv("NETDATA_LIB_DIR", verify_or_create_required_directory(netdata_configured_varlib_dir), 1);
setenv("NETDATA_LOCK_DIR", verify_or_create_required_directory(netdata_configured_lock_dir), 1);
setenv("NETDATA_LOG_DIR", verify_or_create_required_directory(netdata_configured_log_dir), 1);
setenv("HOME", verify_or_create_required_directory(netdata_configured_home_dir), 1);
setenv("NETDATA_HOST_PREFIX", netdata_configured_host_prefix, 1);

{
6 changes: 3 additions & 3 deletions daemon/buildinfo.c
@@ -343,23 +343,23 @@ static struct {
.json = "cpu_frequency",
.value = "unknown",
},
[BIB_HW_RAM_SIZE] = {
[BIB_HW_ARCHITECTURE] = {
.category = BIC_HARDWARE,
.type = BIT_STRING,
.analytics = NULL,
.print = "CPU Architecture",
.json = "cpu_architecture",
.value = "unknown",
},
[BIB_HW_DISK_SPACE] = {
[BIB_HW_RAM_SIZE] = {
.category = BIC_HARDWARE,
.type = BIT_STRING,
.analytics = NULL,
.print = "RAM Bytes",
.json = "ram",
.value = "unknown",
},
[BIB_HW_ARCHITECTURE] = {
[BIB_HW_DISK_SPACE] = {
.category = BIC_HARDWARE,
.type = BIT_STRING,
.analytics = NULL,
12 changes: 10 additions & 2 deletions daemon/main.c
@@ -1167,8 +1167,6 @@ static void get_netdata_configured_variables() {
netdata_configured_web_dir = config_get(CONFIG_SECTION_DIRECTORIES, "web", netdata_configured_web_dir);
netdata_configured_cache_dir = config_get(CONFIG_SECTION_DIRECTORIES, "cache", netdata_configured_cache_dir);
netdata_configured_varlib_dir = config_get(CONFIG_SECTION_DIRECTORIES, "lib", netdata_configured_varlib_dir);
char *env_home=getenv("HOME");
netdata_configured_home_dir = config_get(CONFIG_SECTION_DIRECTORIES, "home", env_home?env_home:netdata_configured_home_dir);

netdata_configured_lock_dir = initialize_lock_directory_path(netdata_configured_varlib_dir);

@@ -2080,6 +2078,16 @@ int main(int argc, char **argv) {
if(become_daemon(dont_fork, user) == -1)
fatal("Cannot daemonize myself.");

// The "HOME" env var points to the root's home dir because Netdata starts as root. Can't use "HOME".
struct passwd *pw = getpwuid(getuid());
if (config_exists(CONFIG_SECTION_DIRECTORIES, "home") || !pw || !pw->pw_dir) {
netdata_configured_home_dir = config_get(CONFIG_SECTION_DIRECTORIES, "home", netdata_configured_home_dir);
} else {
netdata_configured_home_dir = config_get(CONFIG_SECTION_DIRECTORIES, "home", pw->pw_dir);
}

setenv("HOME", netdata_configured_home_dir, 1);

dyn_conf_init();

netdata_log_info("netdata started on pid %d.", getpid());
7 changes: 6 additions & 1 deletion database/sqlite/sqlite_aclk.c
@@ -80,7 +80,12 @@ static int create_host_callback(void *data, int argc, char **argv, char **column
UNUSED(argc);
UNUSED(column);

time_t last_connected = (time_t) (argv[IDX_LAST_CONNECTED] ? str2uint64_t(argv[IDX_LAST_CONNECTED], NULL) : 0);
time_t last_connected =
(time_t)(argv[IDX_LAST_CONNECTED] ? str2uint64_t(argv[IDX_LAST_CONNECTED], NULL) : 0);

if (!last_connected)
last_connected = now_realtime_sec();

time_t age = now_realtime_sec() - last_connected;
int is_ephemeral = 0;

13 changes: 7 additions & 6 deletions database/sqlite/sqlite_metadata.c
@@ -1288,6 +1288,8 @@ static void start_all_host_load_context(uv_work_t *req __maybe_unused)
RRDHOST *host;

size_t max_threads = MIN(get_netdata_cpus() / 2, 6);
if (max_threads < 1)
max_threads = 1;
nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Using %zu threads for context loading", max_threads);
struct host_context_load_thread *hclt = callocz(max_threads, sizeof(*hclt));

@@ -1472,12 +1474,11 @@ static void start_metadata_hosts(uv_work_t *req __maybe_unused)
char *machine_guid = *PValue;

host = rrdhost_find_by_guid(machine_guid);
if (unlikely(host))
continue;

uuid_t host_uuid;
if (!uuid_parse(machine_guid, host_uuid))
delete_host_chart_labels(&host_uuid);
if (likely(!host)) {
uuid_t host_uuid;
if (!uuid_parse(machine_guid, host_uuid))
delete_host_chart_labels(&host_uuid);
}

freez(machine_guid);
}
1 change: 1 addition & 0 deletions libnetdata/Makefile.am
@@ -41,4 +41,5 @@ SUBDIRS = \

dist_noinst_DATA = \
README.md \
gorilla/README.md \
$(NULL)
39 changes: 39 additions & 0 deletions libnetdata/gorilla/README.md
@@ -0,0 +1,39 @@
# Gorilla compression and decompression

This provides an alternative way of representing values stored in database
pages. Instead of allocating and using a page of fixed size, i.e. 4096 bytes,
the Gorilla implementation adds support for dynamically sized pages that
contain a variable number of Gorilla buffers.

Each buffer takes 512 bytes and compresses incoming data using Gorilla
compression (a minimal sketch of the encoding follows the list below):

- The very first value is stored as it is.
- For each new value, Gorilla compression doesn't store the value itself. Instead,
it computes the difference (XOR) between the new value and the previous value.
- If the XOR result is zero (meaning the new value is identical to the previous
value), we store just a single bit set to `1`.
- If the XOR result is not zero (meaning the new value differs from the previous):
- We store a `0` bit to indicate the change.
- We compute the leading-zero count (LZC) of the XOR result, and compare it
with the previous LZC. If the two LZCs are equal we store a `1` bit.
- If the LZCs are different we use 5 bits to store the new LZC, and we store
the rest of the value (ie. without its LZC) in the buffer.
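
The list above can be turned into a small sketch of the bit accounting. This is an illustration only, not the agent's Gorilla buffer implementation: the function names are made up, it only counts bits instead of writing them into a 512-byte buffer, it relies on the GCC/Clang `__builtin_clz` builtin, and the exact layout around the "same LZC" case is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Pretend to append 'nbits' of 'value' to a bit stream; only the bit count
 * is tracked here. A real encoder would write into a 512-byte buffer. */
static size_t emit_bits(uint32_t value, size_t nbits, size_t used_bits) {
    (void) value;
    return used_bits + nbits;
}

/* Return how many bits the scheme described above would use for 'count' samples. */
static size_t gorilla_encoded_bits(const uint32_t *samples, size_t count) {
    size_t used = 0;
    uint32_t prev_value = 0;
    uint32_t prev_lzc = 0;                         /* assumption: no LZC before the first XOR */

    for (size_t i = 0; i < count; i++) {
        if (i == 0) {
            used = emit_bits(samples[0], 32, used); /* the very first value is stored as-is */
            prev_value = samples[0];
            continue;
        }

        uint32_t x = samples[i] ^ prev_value;
        if (x == 0) {
            used = emit_bits(1, 1, used);           /* identical value: a single '1' bit */
        } else {
            used = emit_bits(0, 1, used);           /* '0' bit: the value changed */
            uint32_t lzc = (uint32_t) __builtin_clz(x);
            if (lzc == prev_lzc)
                used = emit_bits(1, 1, used);       /* same LZC as the previous XOR */
            else {
                used = emit_bits(0, 1, used);       /* assumption: '0' bit marks a new LZC */
                used = emit_bits(lzc, 5, used);     /* 5 bits for the new LZC */
            }
            used = emit_bits(x, 32 - lzc, used);    /* the XOR result without its leading zeros */
            prev_lzc = lzc;
        }
        prev_value = samples[i];
    }
    return used;
}

int main(void) {
    uint32_t samples[] = { 1000, 1000, 1001, 1003, 1003, 900 };
    size_t n = sizeof(samples) / sizeof(samples[0]);
    printf("%zu samples -> %zu bits (vs %zu bits uncompressed)\n",
           n, gorilla_encoded_bits(samples, n), n * 32);
    return 0;
}
```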

A Gorilla page can have multiple Gorilla buffers. If the values of a metric
are highly compressible, just one Gorilla buffer can store all the values
that would otherwise require a regular 4096-byte page, i.e. we can use just 512
bytes instead. In the worst-case scenario (for metrics whose values are not
compressible at all), a Gorilla page might end up having `9` Gorilla buffers,
consuming 4608 bytes. In practice this is pretty rare and does not negate
the effect of compression for the metrics.

When a Gorilla page is full, i.e. it contains 1024 slots/values, we serialize
the linked list of Gorilla buffers directly to disk. During deserialization,
e.g. when performing a DBEngine query, the Gorilla page is loaded from disk and
its linked-list entries are patched to point to the new memory allocated for
serving the query results.

Overall, on a real agent the Gorilla compression scheme reduces memory
consumption by approximately 30%, which can amount to several GiB of RAM for parents
with hundreds or even thousands of children streaming to them.
4 changes: 3 additions & 1 deletion netdata.spec.in
@@ -168,10 +168,12 @@ Requires: %{name}-plugin-nfacct = %{version}
%if 0%{?_have_freeipmi} && 0%{?centos_ver} != 6 && 0%{?centos_ver} != 7 && 0%{?amazon_linux} != 2
Suggests: %{name}-plugin-freeipmi = %{version}
%endif
%if 0%{?centos_ver} != 6 && 0%{?centos_ver} != 7 && 0%{?amazon_linux} != 2
%if 0%{?centos_ver} != 7 && 0%{?amazon_linux} != 2
Suggests: %{name}-plugin-cups = %{version}
Recommends: %{name}-plugin-systemd-journal = %{version}
Recommends: %{name}-plugin-logs-management = %{version}
%else
Requires: %{name}-plugin-systemd-journal = %{version}
%endif

