-
Notifications
You must be signed in to change notification settings - Fork 7
Fix OverBudget tests from never terminating. #512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bjjones
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Note that I did find the the reason tests were looping forever on my machine: In D3D12ResidencyManagerTests.cpp L91, the return statement should be an OR, not an AND. If I'm remembering correctly - I think we may be seeing (valid) behavior differences between hardware. On my Nvidia dGPU, evicted local memory does not get counted towards the non-local memory budget. I think this is different from how Intel integrated GPUs do it and I think this might be why you used an AND here.
|
Question -
We only evict using the same segment the budget uses, though. GPGMM/src/gpgmm/d3d12/HeapD3D12.cpp Line 48 in 7c8eec9
This check
The AND was because I don't care which segment goes over budget, so long all of them are under. Does that sound correct to you? |
|
...nevermind. What I first said was wrong - I misread the code and jumped to a conclusion. Whats happening on my machine is that local->Budget is exactly equal to local->CurrentUsage - so we're stiil technically not exceeding the budget, and returning false. I'm getting this in a loop: The issue may be that we're calling Evict, and it doesn't immediately evict (as expected). Then in SlabMemoryAllocator.cpp L104 we bail out of allocating more because still |
|
Where do you see an error?
When it falls-back, you get a heap exactly sized to the resource, increasing
In D3D12ResidencyTests.cpp,
I suspect either the segment wasn't being updated or we are allocating from the wrong segment. See if
|
|
I think what I may be seeing is that we're doing this: We'll never exceed the budget when doing this - usage will always be equal to the budget when we check (and I think this is correct behavior). Is using |
|
The test only fails for notifications, correct? If so, I suspect this is due to a race condition. If the notification occurred sometime after a CreateHeap [going over budget] but before Evict is called for the next CreateHeap then this loop cannot terminate. You can test this theory by setting a breakpoint here:
If true, we can make the test become deterministic by changing the termination condition to use the same check in GPGMM_SKIP_TEST_IF_NOT_CREATED_IN_BUDGET. |
|
I've been referring to D3D12ResidencyManagerTests.OverBudget, which appears to be an infinite loop on my machine. D3D12ResidencyManagerTests.OverBudgetUsingBudgetNotifications seems to loop forever too, but I am getting the additional message that is not present in the other test: |
|
@bjjones Thanks, I see the issue now. Looks like I got lucky with my heap sizes. PTAL again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that D3D12ResidencyManagerTests.OverBudgetUsingBudgetNotifications fails for me due to allocations not being resident. I think it may just be that we're going very close to the budget and normal budget fluctuations are causing things to be evicted before we expect them to. I do see the test passing when I increased the allocation size to 50MB and made the change on the condition I suggested in this review. edit: the test is still flaky after making these changes.
|
I was able to make D3D12ResidencyManagerTests.OverBudgetUsingBudgetNotifications pass reliably by changing |
869668c to
cb18b5e
Compare
|
Yup, good catch! I believe the underlying issue was the created heap sizes would not exactly fit the budget left and spilled a bit which caused the test to fail. Thanks again. |
|
@bjjones Could you PTAL again? Thanks! |
bjjones
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still seeing flaky failures on the notification test. Everything else LGTM.
bjjones
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Creating a resource which made the current usage equal to budget would result in IsOverBudget never terminating. This fix replaces IsOverBudget with GetBudgetLeft so created resources never exceed the budget.
|
@bjjones Merging, thanks! |
Creating a resource which made the current usage equal to budget would result in IsOverBudget never terminating. This fix replaces IsOverBudget with GetBudgetLeft so created resources never exceed the budget.