Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(userspace/libscap): fixed loading of ebpf probe with offline CPUs #721

Merged
merged 2 commits into from
Nov 22, 2022

Conversation

FedeDP
Copy link
Contributor

@FedeDP FedeDP commented Nov 17, 2022

What type of PR is this?

/kind bug

Any specific area of the project related to this PR?

/area libscap-engine-bpf

Does this PR require a change in the driver versions?

NOPE

What this PR does / why we need it:

Revert a bug introduced in #374: use same code used by kmod engine to properly load all CPUs and online CPUs, to be later able to properly count online CPUs.

Which issue(s) this PR fixes:

Fixes #720

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix(libscap): fixed loading of ebpf probe with offline CPUs

Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
@@ -1522,7 +1522,7 @@ int32_t scap_bpf_load(

if(online_cpu >= handle->m_dev_set.m_ndevs)
{
snprintf(handle->m_lasterr, SCAP_LASTERR_SIZE, "processors online: %d, expected: %d", online_cpu, handle->m_dev_set.m_ndevs);
snprintf(handle->m_lasterr, SCAP_LASTERR_SIZE, "too many online processors: %d, expected: %d", online_cpu, handle->m_dev_set.m_ndevs);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distinguish this error from the below one (there is another check at the end of the loop).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other check (L1565) was the one triggering the error. Let's make an example:
we were looping on num_cpus == num_ndevs; let's say we had 3/4 online CPUs, like this: 1 0 1 1.

Basically, our loop would loop 3 times (online CPUs -> remember that num_cpus == num_ndevs); BUT the second element would be skipped because offline.
At the end of the for loop, we had counted 2 online CPUs instead of 3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes a lot of sense, great catch! I'm missing just one piece of the puzzle why do we skip the check on the first CPU? 👇

		if(j > 0)
		{
			char filename[SCAP_MAX_PATH_SIZE];
			int online;
			FILE *fp;
                        ...........

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because first CPU has not online flag; it cannot be disabled!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh right, thank you!

@@ -1788,16 +1788,23 @@ static int32_t init(scap_t* handle, scap_open_args *oargs)
//
// Find out how many devices we have to open, which equals to the number of CPUs
//
ssize_t num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
ssize_t num_cpus = sysconf(_SC_NPROCESSORS_CONF);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like in kmod, and like it was before #374 : load all cpus, but only alloc space for online CPUs.

@FedeDP
Copy link
Contributor Author

FedeDP commented Nov 17, 2022

/cc @gnosek @Andreagit97

@FedeDP
Copy link
Contributor Author

FedeDP commented Nov 17, 2022

/milestone 0.10.0

@poiana poiana added this to the 0.10.0 milestone Nov 17, 2022
@@ -1788,16 +1788,23 @@ static int32_t init(scap_t* handle, scap_open_args *oargs)
//
// Find out how many devices we have to open, which equals to the number of CPUs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Find out how many devices we have to open, which equals to the number of online CPUs
I would specify that the device are equals to the online CPUs

@Andreagit97
Copy link
Member

Andreagit97 commented Nov 17, 2022

The if after the for loop is still wrong https://github.com/falcosecurity/libs/pull/721/files#diff-12833abd4271488260dae0ba178c6ad3f0bc63642f793a20b06ab4eb10d02cf9R1565-R1568 and it is the original cause of the strange log in the Falco issue, we should print online_cpu not j 👇

	if(online_cpu != handle->m_dev_set.m_ndevs)
	{
		snprintf(handle->m_lasterr, SCAP_LASTERR_SIZE, "processors online: %d, expected: %d", online_cpu, handle->m_dev_set.m_ndevs);
		return SCAP_FAILURE;
	}

@FedeDP
Copy link
Contributor Author

FedeDP commented Nov 17, 2022

Great catch!

Signed-off-by: Federico Di Pierro <nierro92@gmail.com>

Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it>
@FedeDP
Copy link
Contributor Author

FedeDP commented Nov 17, 2022

Fixed!

Copy link
Member

@Andreagit97 Andreagit97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Nov 17, 2022

LGTM label has been added.

Git tree hash: daebbeb87d639c7edac3be197313b9da9df85777

@poiana
Copy link
Contributor

poiana commented Nov 22, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP, leogr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Andreagit97,FedeDP,leogr]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit 332c130 into master Nov 22, 2022
@poiana poiana deleted the fix/bpf_offline_proc branch November 22, 2022 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Libinsp fails when CPUs are disabled
4 participants