Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix kernel memory leak in pNotifShare #655

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Cwndmiao
Copy link

@Cwndmiao Cwndmiao commented Jun 5, 2024

380 if (pNotifierClient != NULL)
381 {
382 ¦ RsResourceRef *pNotifierRef;
383 ¦ INotifier *pNotifier;
384 ¦ if (clientGetResourceRef(pNotifierClient, hNotifierResource, &pNotifierRef) != NV_OK)
385 ¦ ¦ return NV_ERR_INVALID_OBJECT;
386
387 ¦ pNotifier = dynamicCast(pNotifierRef->pResource, INotifier);
388 ¦ if (pNotifier == NULL)
389 ¦ ¦ return NV_ERR_INVALID_OBJECT;
390
391 ¦ rmStatus = inotifyGetOrAllocNotifShare(pNotifier, hNotifierClient, hNotifierResource, &pNotifierShare);
392 ¦ if (rmStatus != NV_OK)
393 ¦ ¦ return rmStatus;
394
395 ¦ *pppEventNotification = inotifyGetNotificationListPtr(pNotifierShare->pNotifier);
396 }
397
398 serverRefShare(&g_resServ, staticCast(pNotifierShare, RsShared)); <-------
399 pEvent->pNotifierShare = pNotifierShare;

line 391 inotifyGetOrAllocNotifShare() initializes pNotifierShare and sets pNotifierShare->refCount to 1.
line 398 serverRefShare() accidentally increments pNotifierShare0->refCount by 1, and it's 2 now.
So pNotifier->refCount would never reach zero and leak forever. This issue can be fixed by my commit.

@Cwndmiao Cwndmiao marked this pull request as ready for review June 5, 2024 07:58
@Qubitium
Copy link

Qubitium commented Jun 5, 2024

@Cwndmiao Just curious from end-user perspective. How severe or often does this leak occur? Trying to figure the severity to see if we want to apply this patch internally pre merge/review. Thanks.

@Cwndmiao
Copy link
Author

Cwndmiao commented Jun 5, 2024

@Cwndmiao Just curious from end-user perspective. How severe or often does this leak occur? Trying to figure the severity to see if we want to apply this patch internally pre merge/review. Thanks.

AFAIK, one NotifierShare binds one/many Event/OsEvent to one Subdevice, so that Event gets notified when something is triggered on one Subdevice/GPU, known as the Observer Design Pattern. To conclude, one NotifierShare leaks once Subdevice gets constructed/destructed, or more concisely, one app linked to libcuda.so spawns/exits.

@Cwndmiao
Copy link
Author

Cwndmiao commented Jun 5, 2024

image

And this can be detected by the SMART NVPORT memory allocator.

@CLAassistant
Copy link

CLAassistant commented Jun 6, 2024

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants