fix(userspace/libscap): fixed loading of ebpf probe with offline CPUs #721
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
@@ -1522,7 +1522,7 @@ int32_t scap_bpf_load(

 if(online_cpu >= handle->m_dev_set.m_ndevs)
 {
-	snprintf(handle->m_lasterr, SCAP_LASTERR_SIZE, "processors online: %d, expected: %d", online_cpu, handle->m_dev_set.m_ndevs);
+	snprintf(handle->m_lasterr, SCAP_LASTERR_SIZE, "too many online processors: %d, expected: %d", online_cpu, handle->m_dev_set.m_ndevs);
Distinguish this error from the below one (there is another check at the end of the loop).
The other check (L1565) was the one triggering the error. Let's make an example:
we were looping up to num_cpus == num_ndevs, i.e. the number of online CPUs; let's say 3 of our 4 CPUs were online, like this: 1 0 1 1.
Our loop would iterate 3 times (remember that num_cpus == num_ndevs), BUT the second element would be skipped because it is offline, and the fourth would never be visited.
At the end of the for loop, we had counted 2 online CPUs instead of 3.
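The miscount described above can be reproduced in isolation. The following is a minimal sketch, not the libscap code: `count_online` and the integer mask are hypothetical stand-ins for the loader loop and for the per-CPU online state.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libscap): walk the first `limit` CPUs
 * and count the ones marked online in `online_mask`.
 */
static int count_online(const int *online_mask, size_t limit)
{
	int online = 0;

	for(size_t j = 0; j < limit; j++)
	{
		if(!online_mask[j])
		{
			continue; /* offline CPU: skipped, as in the loader loop */
		}
		online++;
	}
	return online;
}
```

With the mask 1 0 1 1, calling `count_online(mask, 3)` (loop bound taken from the number of online CPUs) yields 2, while `count_online(mask, 4)` (loop bound taken from the number of configured CPUs) yields the expected 3.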
Makes a lot of sense, great catch! I'm missing just one piece of the puzzle: why do we skip the check on the first CPU? 👇
if(j > 0)
{
char filename[SCAP_MAX_PATH_SIZE];
int online;
FILE *fp;
...........
Because the first CPU has no online flag; it cannot be disabled!
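A rough sketch of such a per-CPU check follows. It is not the elided libscap code: `cpu_is_online` is a hypothetical helper, and it assumes the flag is the sysfs file `<dir>/cpuN/online`, where `<dir>` would be `/sys/devices/system/cpu` on a real system.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not the libscap implementation.
 * Returns 1 if the CPU is online, 0 if it is offline,
 * and -1 if the flag cannot be read or parsed.
 */
static int cpu_is_online(const char *sysfs_cpu_dir, int cpu)
{
	char filename[4096];
	int online = 0;
	FILE *fp;

	if(cpu == 0)
	{
		return 1; /* first CPU: no flag to read, treated as always online */
	}

	snprintf(filename, sizeof(filename), "%s/cpu%d/online", sysfs_cpu_dir, cpu);
	fp = fopen(filename, "r");
	if(fp == NULL)
	{
		return -1; /* flag file missing or unreadable */
	}
	if(fscanf(fp, "%d", &online) != 1)
	{
		fclose(fp);
		return -1; /* unexpected file content */
	}
	fclose(fp);
	return online ? 1 : 0;
}
```

How a caller should treat the -1 case (fail, or fall back to considering the CPU online) is a policy decision that this sketch deliberately leaves open.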
uh right, thank you!
@@ -1788,16 +1788,23 @@ static int32_t init(scap_t* handle, scap_open_args *oargs)
 //
 // Find out how many devices we have to open, which equals to the number of CPUs
 //
-ssize_t num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
+ssize_t num_cpus = sysconf(_SC_NPROCESSORS_CONF);
Like in kmod, and like it was before #374: loop over all CPUs, but only allocate space for online CPUs.
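For reference, the two counts involved can be queried side by side. This is only an illustrative sketch: `struct cpu_counts` and `get_cpu_counts` are hypothetical names, while `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN` are the `sysconf` names used in the diff.

```c
#include <unistd.h>

/* Hypothetical container for the two CPU counts. */
struct cpu_counts
{
	long configured; /* loop bound: every CPU the system is configured with */
	long online;     /* allocation size: only the CPUs currently online */
};

static struct cpu_counts get_cpu_counts(void)
{
	struct cpu_counts c;

	c.configured = sysconf(_SC_NPROCESSORS_CONF);
	c.online = sysconf(_SC_NPROCESSORS_ONLN);
	return c;
}
```

When some CPUs are offline, `configured` is larger than `online`, which is why iterating up to `online` can stop before the last online CPU is reached.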
/cc @gnosek @Andreagit97
/milestone 0.10.0
@@ -1788,16 +1788,23 @@ static int32_t init(scap_t* handle, scap_open_args *oargs)
 //
 // Find out how many devices we have to open, which equals to the number of CPUs
Find out how many devices we have to open, which equals to the number of online CPUs
I would specify that the number of devices equals the number of online CPUs.
The if after the loop should be:
if(online_cpu != handle->m_dev_set.m_ndevs)
{
	snprintf(handle->m_lasterr, SCAP_LASTERR_SIZE, "processors online: %d, expected: %d", online_cpu, handle->m_dev_set.m_ndevs);
	return SCAP_FAILURE;
}
Great catch!
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it>
Fixed!
/approve
LGTM label has been added. Git tree hash: daebbeb87d639c7edac3be197313b9da9df85777
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Andreagit97, FedeDP, leogr
What type of PR is this?
/kind bug
Any specific area of the project related to this PR?
/area libscap-engine-bpf
Does this PR require a change in the driver versions?
NOPE
What this PR does / why we need it:
Fix a bug introduced in #374: use the same code as the kmod engine to loop over all CPUs while allocating space only for the online ones, so that online CPUs are counted correctly.
Which issue(s) this PR fixes:
Fixes #720
Special notes for your reviewer:
Does this PR introduce a user-facing change?: