in_calyptia_fleet: fix init error mem leaks. #10502

Status: Open (2 commits to merge into base: master)

@pwhelan (Contributor) commented Jun 23, 2025

Summary

Fix memory leaks in in_calyptia_fleet and out_calyptia that could occur during initialization.
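
As a rough illustration of the kind of leak described above, the sketch below shows one common way to make an init function release everything it has allocated when a later step fails: a single `error:` label that calls the same destroy routine used at shutdown. This is not the code from this PR. `demo_ctx`, `demo_ctx_create`, `demo_ctx_destroy`, and the `live_allocs` counter are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical plugin context; NOT the real in_calyptia_fleet struct. */
struct demo_ctx {
    char *api_key;
    char *config_dir;
};

/* Counts live allocations so the demo can check for leaks without Valgrind. */
static int live_allocs = 0;

static char *demo_strdup(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy != NULL) {
        strcpy(copy, s);
        live_allocs++;
    }
    return copy;
}

static void demo_free(void *p)
{
    if (p != NULL) {
        assert(live_allocs > 0);
        live_allocs--;
        free(p);
    }
}

/* Safe to call on a partially initialized context: unset fields are NULL. */
static void demo_ctx_destroy(struct demo_ctx *ctx)
{
    if (ctx == NULL) {
        return;
    }
    demo_free(ctx->api_key);
    demo_free(ctx->config_dir);
    demo_free(ctx);
}

/* Every failure after the first allocation goes through the error label. */
static struct demo_ctx *demo_ctx_create(const char *api_key,
                                        const char *config_dir)
{
    struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

    if (ctx == NULL) {
        return NULL;
    }
    live_allocs++;

    if (api_key == NULL) {
        goto error;
    }
    ctx->api_key = demo_strdup(api_key);
    if (ctx->api_key == NULL) {
        goto error;
    }

    /* A missing config_dir stands in for any mid-init failure. */
    if (config_dir == NULL) {
        goto error;
    }
    ctx->config_dir = demo_strdup(config_dir);
    if (ctx->config_dir == NULL) {
        goto error;
    }

    return ctx;

error:
    demo_ctx_destroy(ctx);
    return NULL;
}
```

Because the context is zero-filled by `calloc`, the destroy routine can be reused on the error path without tracking which fields were already set.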


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • [N/A] Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

pwhelan added 2 commits June 23, 2025 17:18
… init.

Signed-off-by: Phillip Whelan <phillip.whelan@chronosphere.io>
Signed-off-by: Phillip Whelan <phillip.whelan@chronosphere.io>

@pwhelan (Contributor, Author) commented Jun 23, 2025

Here is a valgrind log:

╭─pwhelan@hydra /home/pwhelan/Projects/work/fluent-bit.git/pwhelan-fleet-eduardo-mem-leaks/build  ‹system›  <pwhelan-fleet-eduardo-mem-leaks*>
╰─$ valgrind --leak-check=full ./bin/fluent-bit -c pipeline.conf
==185581== Memcheck, a memory error detector
==185581== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==185581== Using Valgrind-3.25.1 and LibVEX; rerun with -h for copyright info
==185581== Command: ./bin/fluent-bit -c pipeline.conf
==185581==
Fluent Bit v4.0.4
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___  _____
|  ___| |                | |   | ___ (_) |           /   ||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| || |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| ||  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/


[2025/06/23 17:23:05] [ info] [fluent bit] version=4.0.4, commit=54594067cc, pid=185581
[2025/06/23 17:23:05] [ info] [custom:calyptia:calyptia.0] read UUID (ee3fb62a-a91c-45ab-8590-97868d55c303) from file: /tmp/calyptia-fleet/machine-id.conf
[2025/06/23 17:23:05] [ info] [custom:calyptia:calyptia.0] custom initialized!
[2025/06/23 17:23:05] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/06/23 17:23:05] [ info] [simd    ] disabled
[2025/06/23 17:23:05] [ info] [cmetrics] version=1.0.3
[2025/06/23 17:23:05] [ info] [ctraces ] version=0.6.6
[2025/06/23 17:23:05] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] initializing
[2025/06/23 17:23:05] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] storage_strategy='memory' (memory only)
[2025/06/23 17:23:06] [ info] [input:calyptia_fleet:calyptia_fleet.1] initializing
[2025/06/23 17:23:06] [ info] [input:calyptia_fleet:calyptia_fleet.1] storage_strategy='memory' (memory only)
[2025/06/23 17:23:06] [ info] [input:calyptia_fleet:calyptia_fleet.1] initializing calyptia fleet input.
[2025/06/23 17:23:06] [ info] [input:calyptia_fleet:calyptia_fleet.1] invalid interval settings, using defaults: sec=15 nsec=0
[2025/06/23 17:23:06] [ warn] [input:calyptia_fleet:calyptia_fleet.1] unable to find latest configuration file
[2025/06/23 17:23:06] [ info] [input:calyptia_fleet:calyptia_fleet.1] fleet collector initialized with interval: 15 sec 0 nsec
[2025/06/23 17:23:06] [ warn] [output:calyptia:calyptia.0] agent registration failed
[2025/06/23 17:23:06] [ info] [sp] stream processor started
[2025/06/23 17:23:06] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
==185581== Thread 2 flb-pipeline:
==185581== Conditional jump or move depends on uninitialised value(s)
==185581==    at 0x421E257: output_pre_cb_flush (flb_output.h:675)
==185581==    by 0x54F2646: co_init (amd64.c:117)
==185581==
[2025/06/23 17:23:12] [ info] [output:calyptia:calyptia.0] missing agent_id or agent_token, attempting re-registration register_retry_on_flush=true
[2025/06/23 17:23:12] [ warn] [output:calyptia:calyptia.0] agent registration failed
[2025/06/23 17:23:12] [ warn] [engine] failed to flush chunk '185581-1750713791.78062833.flb', retry in 7 seconds: task_id=0, input=fluentbit_metrics.0 > output=calyptia.0 (out_id=0)
^C[2025/06/23 17:23:15] [engine] caught signal (SIGINT)
[2025/06/23 17:23:15] [ warn] [engine] service will shutdown in max 5 seconds
[2025/06/23 17:23:15] [ info] [input] pausing fluentbit_metrics.0
[2025/06/23 17:23:15] [ info] [input] pausing calyptia_fleet.1
[2025/06/23 17:23:15] [ info] [output:calyptia:calyptia.0] missing agent_id or agent_token, attempting re-registration register_retry_on_flush=true
[2025/06/23 17:23:15] [ warn] [output:calyptia:calyptia.0] agent registration failed
[2025/06/23 17:23:15] [error] [engine] chunk '185581-1750713791.78062833.flb' cannot be retried: task_id=0, input=fluentbit_metrics.0 > output=calyptia.0
[2025/06/23 17:23:16] [ info] [engine] service has stopped (0 pending tasks)
[2025/06/23 17:23:16] [ info] [input] pausing fluentbit_metrics.0
[2025/06/23 17:23:16] [ info] [input] pausing calyptia_fleet.1
==185581==
==185581== HEAP SUMMARY:
==185581==     in use at exit: 0 bytes in 0 blocks
==185581==   total heap usage: 90,217 allocs, 90,217 frees, 8,377,270 bytes allocated
==185581==
==185581== All heap blocks were freed -- no leaks are possible
==185581==
==185581== Use --track-origins=yes to see where uninitialised values come from
==185581== For lists of detected and suppressed errors, rerun with: -s
==185581== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
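
The one remaining Memcheck error above is an uninitialised-value warning, not a leak. The log alone does not say which value at `flb_output.h:675` is involved, so the following is only a generic sketch of how this class of warning is usually avoided: allocate with `calloc` (or set every field explicitly) so that no branch reads indeterminate memory. `demo_flush_params`, `demo_params_create`, and `demo_should_process` are hypothetical names, not Fluent Bit APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parameter block; NOT a real Fluent Bit structure. */
struct demo_flush_params {
    int use_processor;   /* flag later read in a conditional */
    void *payload;
};

/*
 * calloc zero-fills the block, so use_processor has a defined value (0)
 * before the first read. With malloc and no explicit assignment, the
 * branch in demo_should_process() would depend on indeterminate memory,
 * which is the situation Memcheck reports as "Conditional jump or move
 * depends on uninitialised value(s)".
 */
static struct demo_flush_params *demo_params_create(void)
{
    return calloc(1, sizeof(struct demo_flush_params));
}

/* Branches on a field that must have been initialized by the creator. */
static int demo_should_process(const struct demo_flush_params *p)
{
    assert(p != NULL);
    if (p->use_processor) {
        return 1;
    }
    return 0;
}
```

Rerunning with `--track-origins=yes`, as the Valgrind output suggests, would show where the uninitialised value was actually created.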

@pwhelan (Contributor, Author) commented Jun 23, 2025

Here is a debug log:

╭─pwhelan@hydra /home/pwhelan/Projects/work/fluent-bit.git/pwhelan-fleet-eduardo-mem-leaks/build  ‹system›  <pwhelan-fleet-eduardo-mem-leaks*>
╰─$ ./bin/fluent-bit -v -c pipeline.conf
Fluent Bit v4.0.4
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___  _____
|  ___| |                | |   | ___ (_) |           /   ||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| || |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| ||  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/


[2025/06/23 18:09:37] [ info] Configuration:
[2025/06/23 18:09:37] [ info]  flush time     | 1.000000 seconds
[2025/06/23 18:09:37] [ info]  grace          | 5 seconds
[2025/06/23 18:09:37] [ info]  daemon         | 0
[2025/06/23 18:09:37] [ info] ___________
[2025/06/23 18:09:37] [ info]  inputs:
[2025/06/23 18:09:37] [ info] ___________
[2025/06/23 18:09:37] [ info]  filters:
[2025/06/23 18:09:37] [ info] ___________
[2025/06/23 18:09:37] [ info]  outputs:
[2025/06/23 18:09:37] [ info] ___________
[2025/06/23 18:09:37] [ info]  collectors:
[2025/06/23 18:09:37] [ info] [fluent bit] version=4.0.4, commit=54594067cc, pid=189231
[2025/06/23 18:09:37] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2025/06/23 18:09:37] [ info] [custom:calyptia:calyptia.0] read UUID (ee3fb62a-a91c-45ab-8590-97868d55c303) from file: /tmp/calyptia-fleet/machine-id.conf
[2025/06/23 18:09:37] [ info] [custom:calyptia:calyptia.0] custom initialized!
[2025/06/23 18:09:37] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/06/23 18:09:37] [ info] [simd    ] disabled
[2025/06/23 18:09:37] [ info] [cmetrics] version=1.0.3
[2025/06/23 18:09:37] [ info] [ctraces ] version=0.6.6
[2025/06/23 18:09:37] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] initializing
[2025/06/23 18:09:37] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] storage_strategy='memory' (memory only)
[2025/06/23 18:09:37] [debug] [fluentbit_metrics:fluentbit_metrics.0] created event channels: read=25 write=26
[2025/06/23 18:09:37] [ info] [input:calyptia_fleet:calyptia_fleet.1] initializing
[2025/06/23 18:09:37] [ info] [input:calyptia_fleet:calyptia_fleet.1] storage_strategy='memory' (memory only)
[2025/06/23 18:09:37] [debug] [calyptia_fleet:calyptia_fleet.1] created event channels: read=27 write=28
[2025/06/23 18:09:37] [ info] [input:calyptia_fleet:calyptia_fleet.1] initializing calyptia fleet input.
[2025/06/23 18:09:37] [debug] [input:calyptia_fleet:calyptia_fleet.1] initial collector interval: sec=-1 nsec=-1
[2025/06/23 18:09:37] [ info] [input:calyptia_fleet:calyptia_fleet.1] invalid interval settings, using defaults: sec=15 nsec=0
[2025/06/23 18:09:37] [debug] [input:calyptia_fleet:calyptia_fleet.1] loading configuration file
[2025/06/23 18:09:37] [ warn] [input:calyptia_fleet:calyptia_fleet.1] unable to find latest configuration file
[2025/06/23 18:09:37] [ info] [input:calyptia_fleet:calyptia_fleet.1] fleet collector initialized with interval: 15 sec 0 nsec
[2025/06/23 18:09:37] [debug] [calyptia:calyptia.0] created event channels: read=29 write=30
[2025/06/23 18:09:37] [debug] [output:calyptia:calyptia.0] no session file was found
[2025/06/23 18:09:37] [debug] [output:calyptia:calyptia.0] machine_id=5ca91d07f13d92d52e235a1bec54fa1bb049a384ad68aceeb7580856a9762b51
[2025/06/23 18:09:37] [debug] [net] socket #31 could not connect to ::1:443
[2025/06/23 18:09:37] [debug] [net] socket #31 could not connect to 127.0.0.1:443
[2025/06/23 18:09:37] [debug] [net] could not connect to localhost:443
[2025/06/23 18:09:37] [debug] [upstream] connection #-1 failed to localhost:443
[2025/06/23 18:09:37] [ warn] [output:calyptia:calyptia.0] agent registration failed
[2025/06/23 18:09:37] [debug] [router] match rule fluentbit_metrics.0:calyptia.0
[2025/06/23 18:09:37] [ info] [sp] stream processor started
[2025/06/23 18:09:37] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2025/06/23 18:09:43] [debug] [task] created direct task=0x7ffff49a4920 id=0 OK
[2025/06/23 18:09:43] [ info] [output:calyptia:calyptia.0] missing agent_id or agent_token, attempting re-registration register_retry_on_flush=true
[2025/06/23 18:09:43] [debug] [net] socket #34 could not connect to ::1:443
[2025/06/23 18:09:43] [debug] [net] socket #34 could not connect to 127.0.0.1:443
[2025/06/23 18:09:43] [debug] [net] could not connect to localhost:443
[2025/06/23 18:09:43] [debug] [upstream] connection #-1 failed to localhost:443
[2025/06/23 18:09:43] [ warn] [output:calyptia:calyptia.0] agent registration failed
[2025/06/23 18:09:43] [debug] [out flush] cb_destroy coro_id=0
[2025/06/23 18:09:43] [debug] [retry] new retry created for task_id=0 attempts=1
[2025/06/23 18:09:43] [ warn] [engine] failed to flush chunk '189231-1750716582.54521371.flb', retry in 6 seconds: task_id=0, input=fluentbit_metrics.0 > output=calyptia.0 (out_id=0)
^C[2025/06/23 18:09:45] [engine] caught signal (SIGINT)
[2025/06/23 18:09:45] [ warn] [engine] service will shutdown in max 5 seconds
[2025/06/23 18:09:45] [debug] [engine] re-scheduled retry=0x7ffff49e6b80 for task 0
[2025/06/23 18:09:45] [ info] [input] pausing fluentbit_metrics.0
[2025/06/23 18:09:45] [ info] [input] pausing calyptia_fleet.1
[2025/06/23 18:09:45] [ info] [output:calyptia:calyptia.0] missing agent_id or agent_token, attempting re-registration register_retry_on_flush=true
[2025/06/23 18:09:45] [debug] [net] socket #34 could not connect to ::1:443
[2025/06/23 18:09:45] [debug] [net] socket #34 could not connect to 127.0.0.1:443
[2025/06/23 18:09:45] [debug] [net] could not connect to localhost:443
[2025/06/23 18:09:45] [debug] [upstream] connection #-1 failed to localhost:443
[2025/06/23 18:09:45] [ warn] [output:calyptia:calyptia.0] agent registration failed
[2025/06/23 18:09:45] [debug] [out flush] cb_destroy coro_id=1
[2025/06/23 18:09:45] [debug] [task] task_id=0 reached retry-attempts limit 1/1
[2025/06/23 18:09:45] [error] [engine] chunk '189231-1750716582.54521371.flb' cannot be retried: task_id=0, input=fluentbit_metrics.0 > output=calyptia.0
[2025/06/23 18:09:45] [debug] [task] destroy task=0x7ffff49a4920 (task_id=0)
[2025/06/23 18:09:46] [ info] [engine] service has stopped (0 pending tasks)
[2025/06/23 18:09:46] [ info] [input] pausing fluentbit_metrics.0
[2025/06/23 18:09:46] [ info] [input] pausing calyptia_fleet.1

@patrick-stephens (Contributor) left a comment

Looks like unit tests are failing - not sure if it is that flake I raised before but we should sort.

@pwhelan (Contributor, Author) commented Jun 26, 2025

> Looks like unit tests are failing - not sure if it is that flake I raised before but we should sort.

I have been working on a draft which should speed up tests but it still needs quite a bit of work: #10503.

The other alternative is to rewrite that massive test so multiple checks are done in a single function, while also reducing all the sleep calls it makes.
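
One way the sleep reduction could look, sketched under assumptions: instead of a fixed `sleep()` before each check, the test polls a condition with a short step and an overall deadline, so it returns as soon as the condition holds. `wait_until`, `check_fn`, and the two demo conditions are hypothetical helpers, not part of the existing test suite, and the sketch relies on POSIX `clock_gettime` and `nanosleep`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Condition callback: returns nonzero once the awaited state is reached. */
typedef int (*check_fn)(void *data);

/* Milliseconds elapsed on the monotonic clock since start. */
static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long) (now.tv_sec - start->tv_sec) * 1000L +
           (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Polls check() every step_ms until it succeeds or timeout_ms passes.
 * Returns 1 on success, 0 on timeout.
 */
static int wait_until(check_fn check, void *data, long timeout_ms, long step_ms)
{
    struct timespec start;
    struct timespec step;

    assert(check != NULL && timeout_ms >= 0 && step_ms > 0);

    step.tv_sec = step_ms / 1000;
    step.tv_nsec = (step_ms % 1000) * 1000000L;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        if (check(data)) {
            return 1;
        }
        if (elapsed_ms(&start) >= timeout_ms) {
            return 0;
        }
        nanosleep(&step, NULL);
    }
}

/* Demo condition: becomes true on the third poll. */
static int third_poll(void *data)
{
    int *calls = data;

    (*calls)++;
    return *calls >= 3;
}

/* Demo condition that never becomes true, to exercise the timeout path. */
static int never_ready(void *data)
{
    (void) data;
    return 0;
}
```

With a helper like this, several checks can share one test function and each waits only as long as it needs to, bounded by its timeout.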
