Skip to content

Fix memory leaks found by valgrind in cts-cli tests#4083

Merged
clumens merged 37 commits intoClusterLabs:mainfrom
nrwahl2:nrwahl2-mem_alt
Apr 29, 2026
Merged

Fix memory leaks found by valgrind in cts-cli tests#4083
clumens merged 37 commits intoClusterLabs:mainfrom
nrwahl2:nrwahl2-mem_alt

Conversation

@nrwahl2
Copy link
Copy Markdown
Contributor

@nrwahl2 nrwahl2 commented Apr 21, 2026

Valgrind found leaks in the crm_resource, crm_simulate, crm_verify, and upgrade tests.

If you don't mind, indulge me on the refactor commits. They came up while working on the fix commits. Taking them out of this PR makes conflict resolution a pain in the fix commits. I could do a separate PR first for the refactors though, if you want. Just trying to get the fixes in ASAP.

@nrwahl2 nrwahl2 requested a review from clumens April 21, 2026 17:43
Comment thread daemons/controld/controld_execd_state.c Outdated
Comment thread daemons/controld/controld_remote_ra.c
Comment thread daemons/fenced/fenced_remote.c
Comment thread daemons/fenced/fenced_remote.c
Comment thread daemons/controld/controld_messages.c
Comment thread daemons/based/based_remote.c
@clumens clumens added review: in progress PRs that are currently being reviewed 3.0.2 Things to merge for 3.0.2 labels Apr 22, 2026
@nrwahl2 nrwahl2 marked this pull request as draft April 22, 2026 17:25
@nrwahl2
Copy link
Copy Markdown
Contributor Author

nrwahl2 commented Apr 22, 2026

Updating to fix a small handful of issues.

One scheduler test (nvpair-id-ref) still fails. The get_meta_attributes() call chain includes a call to pcmk__xe_resolve_idref(nvpair, NULL). This searches nvpair's document for an element matching its ID reference. To work, this relied on nvpair being part of scheduler->input's doc. Which, prior to this PR, was the case for some resources, but not for others (e.g., clone instances?).

Sigh. I can break out everything before the fix commits if you'd like.

@nrwahl2
Copy link
Copy Markdown
Contributor Author

nrwahl2 commented Apr 22, 2026

The previous push took care of all the fixes besides the nvpair ID ref issue. This next push is a rebase on current main to resolve a conflict.

nrwahl2 added 20 commits April 22, 2026 16:36
We already use pcmk__assert_asprintf() there.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
pcmk__strkey_table() is guaranteed to return non-NULL.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And use const where appropriate.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This is public API, so don't use pcmk__assert().

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
It was already called when unpacking the resource.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We already use crm_meta_name(), which calls pcmk__assert_asprintf().

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
For clarity.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This is more explicit and doesn't seem any less convenient.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Previously, we were leaking the list. We moved the head until it became
NULL, and then we freed the head.

The elements of the list will be freed as part of the XML document that
they belong to.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
If multiple elements with IDREFs resolved to the same element, the hash
table would collapse them into a single entry. The order doesn't matter,
but we need to know that if a value occurs twice in the list of resolved
IDs, then it will occur twice in the list of expected IDs.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We will call it again in an upcoming commit.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And pe_eval_nvpairs(), to keep them aligned.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Nothing uses this meaningfully yet.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We're just parsing the XML. Newlines are ignored.

Also drop a redundant lookup of "name3". We've already checked that the
hash table size is 2 and that "name1" and "name2" are in the table. So
"name3" can't be.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
nrwahl2 added 4 commits April 23, 2026 23:27
To replace pcmk_unpack_nvpair_blocks().

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There does not seem to be a realistic use case for external callers.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
@nrwahl2 nrwahl2 marked this pull request as ready for review April 25, 2026 05:43
@nrwahl2
Copy link
Copy Markdown
Contributor Author

nrwahl2 commented Apr 25, 2026

retest this please

nrwahl2 added 10 commits April 24, 2026 22:56
Use scheduler->input->doc to evaluate IDREFs. This way, xml_obj doesn't
have to be part of scheduler->input. In particular, it can be a copy of
something in scheduler->input (the CIB).

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There are only three legitimate values, including UNKNOWN, for
agent_type. We return early if agent_type isn't DOCKER or PODMAN.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We already call pcmk__strkey_table() there, which terminates the program
on memory allocation error. So we might as well replace calloc() with
pcmk__assert_alloc().

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The early return block was added by 1a83b58, before pcmk_node_t:priv
existed. So if the details field was NULL, it made sense to return
early.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Other tests already do this. Dynamic allocation would make more sense if
we had a generic constructor for pcmk_node_t objects. But all we have is
pe__create_node(), which requires a pcmk_scheduler_t argument. That
seems like overkill.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This way the node->priv object owns them and doesn't have to worry about
whether something else has freed them.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Comment thread tools/crm_resource.c Outdated
Comment thread lib/common/nvpair.c Outdated
Comment thread lib/pengine/bundle.c
@nrwahl2
Copy link
Copy Markdown
Contributor Author

nrwahl2 commented Apr 28, 2026

Added two commits on top

nrwahl2 added 3 commits April 28, 2026 17:24
* rsc->priv->xml and rsc->priv->orig_xml now always belong to the
  pcmk_resource_t. They are in their own document(s) and do not have
  parents or siblings. No existing code accesses the parents or
  siblings.

* rsc->priv->orig_xml is now non-NULL and usable for every
  pcmk_resource_t, regardless of whether template expansion occurred.
  - This simplifies formatted_xml_buf() to the point that it's not worth
    keeping as a separate function, for example.

* rsc->priv->orig_xml is set to a copy of the xml_obj passed into
  pe__unpack_resource() (in a new XML document).

* rsc->priv->xml is initialized to rsc->priv->orig_xml.
  - In the common case, without template expansion, they remain set to
    the same value.
  - If template expansion occurs, rsc->priv->xml is set to a newly
    allocated copy of the template (in a new XML document). Copies of
    the children of rsc->priv->orig_xml are added as children to
    rsc->priv->xml. In case of conflict, values set in
    rsc->priv->orig_xml override those set in the template.

* pcmk__free_resource() now unconditionally frees rsc->priv->xml, and it
  frees rsc->priv->orig_xml if this doesn't point to the same memory as
  rsc->priv->xml.
  - This avoids the need for complicated free functions for clones and
    groups.

* pe__create_clone_child() frees child_copy now.

* Functions in bundle.c free the resource XML that they create.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
@nrwahl2
Copy link
Copy Markdown
Contributor Author

nrwahl2 commented Apr 29, 2026

Fixed typo found by Coverity

@clumens clumens merged commit c780acd into ClusterLabs:main Apr 29, 2026
1 check passed
@clumens
Copy link
Copy Markdown
Contributor

clumens commented Apr 29, 2026

This will also need to be backported to the 3.0 branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3.0.2 Things to merge for 3.0.2 review: in progress PRs that are currently being reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants