New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
memleak hunting: cleanup based on valgrind report #1649
Conversation
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
lib/dnscache.c
Outdated
|
||
if (main_loop_is_terminating(main_loop_get_instance())) | ||
dns_cache_free(dns_cache); | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I don't really like this coupling between dnscache and mainloop. Why isn't the dns_caching_global_deinit() enough? That should cover the exit case properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mainloop does not wait threads to terminate, it can happen that unused_dns_caches
is freed by the mainloop in dns_caching_global_deinit
by the time dns_caching_thread_deinit
is called. This results in an invalid read in the thread:
unused_dns_caches = g_list_prepend(unused_dns_caches, dns_cache);
as unused_dns_caches
is already freed.
The leak here is only a side effect, due that dns_cache
is appended to a non existing list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugh, I understand. Still I would leave in this as a bug and then fix the termination issue in mainloop perhaps later.
As far as I understand, the mainloop does have a grab on how many threads are executing, and we could wait for that to drop to 0, perhaps even using a timeout, to ensure that it does not take all eternity to exit.
If the timeout is reached, we would exit drastically (e.g. exit() without calling the shutdown functions), but that should only happen in extreme cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fine for me. I drop this patch and try to create a separate one based on your guidance.
lib/gprocess.c
Outdated
@@ -1465,6 +1465,7 @@ g_process_finish(void) | |||
#endif | |||
|
|||
g_process_remove_pidfile(); | |||
reloc_deinit(); /* path_cache needed for calculating pidfile location above */ | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate this one too. I understand that the pidfile location is actually managed by reloc, which then can't be deinitialized in app_shutdown() where it should be deinitialized, or otherwise we'd risk use-after-free, but adding this kind of coupling is worse than that minor, one-time memory leak.
I would either:
- leave the leak as is
- put reloc_deinit() into main()
- use some kind of destructor trick to make sure that hash table gets freed
The easiest is probably 2), which plugs the leak and is acceptable code/coupling wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll go with 2).
lib/mainloop.c
Outdated
@@ -195,7 +195,7 @@ main_loop_initialize_state(GlobalConfig *cfg, const gchar *persist_filename) | |||
return success; | |||
} | |||
|
|||
static inline gboolean | |||
gboolean | |||
main_loop_is_terminating(MainLoop *self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not be needed.
lib/stats/stats-control.c
Outdated
@@ -49,3 +49,11 @@ stats_register_control_commands(void) | |||
control_register_command("STATS", NULL, control_connection_send_stats, NULL); | |||
control_register_command("RESET_STATS", NULL, control_connection_reset_stats, NULL); | |||
} | |||
|
|||
#include <stdio.h> | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the stdio.h inclusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few comments that should be changed/dropped but the rest of it is good stuff that should go in.
Signed-off-by: Antal Nemes <antal.nemes@balabit.com>
8c5a990
to
3aae4b1
Compare
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
Signed-off-by: Antal Nemes <antal.nemes@balabit.com> Signed-off-by: bkil-syslogng <tamas.nagy@balabit.com>
Signed-off-by: Antal Nemes <antal.nemes@balabit.com>
Signed-off-by: bkil-syslogng <tamas.nagy@balabit.com>
3aae4b1
to
2b0a5ea
Compare
This cache can only be freed at a very late stage, because the location of the pidfile is calculated from path_cache. To avoid uneasy coupling of gprocess and reloc, we call reloc_deinit in main(). Signed-off-by: bkil-syslogng <tamas.nagy@balabit.com> Signed-off-by: Antal Nemes <antal.nemes@balabit.com>
Signed-off-by: bkil-syslogng <tamas.nagy@balabit.com>
Signed-off-by: Antal Nemes <antal.nemes@balabit.com>
2b0a5ea
to
d8a478c
Compare
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
1 similar comment
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
This patchet intends to fix some entries in valgrind report. All of these issues are minors, i.e non of them accumulate over time. All of these changes intended to be cosmetic, explicitely calling deinits/frees, instead of just adding new exceptions to a valgrind suppress file.
Credits for @bkil-syslogng for #975, which gave much inspiration for this pull request.
Only there are two commits in 975 that were not ported. I list them here for documentation reason.
Edit:
The dns_cache part is reported in: #1400dns_cache will be handled in a separate pull request.