Possible memory leak #37401
Comments
@cailafinn would you be happy to double check this on IDAaaS and check if you see the same behaviour on your Linux machine (and maybe Mac)? I would recommend using the first set of testing instructions (i.e. the time slicing example). While the script I've provided for the second way of causing the memory to fill up gets to over 90%, I can't seem to get it to the point where it crashes Mantid, so I think the first set of instructions is the most reliable way to demonstrate the problem.
Reproduced on Ubuntu 22.04 (Nightly conda package). Took two repeats of clicking to trigger.

Breakdown:
- Nightly: reproduced
- v6.9.1 and earlier: not reproduced

This would appear to be a regression.
As Caila has suggested, we should see if this PR to address issues with jemalloc has any impact on this issue. We've not made any changes to the Reflectometry code since 6.9.1 that seem like they should be causing this, and it isn't happening on Windows. I'll test on IDAaaS when that PR has gone in and see where we are then.
This doesn't appear to have been helped by the jemalloc change. We're still waiting on results from a Valgrind scan. In the meantime I've done some more testing on IDAaaS and have found an alternative way to replicate the behaviour, which shows the same behaviour as the Reflectometry GUI.
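For reference, a script-based replication along the lines discussed in this thread might look like the sketch below. This is an illustrative assumption, not the original script: the algorithm name `ReflectometryISISLoadAndProcess` comes from the issue itself, but the parameter names (`InputRunList`, `SliceWorkspace`, `TimeInterval`, `OutputWorkspace`) and the run number are assumptions that may not match the real API or data.

```python
def replicate_sliced_reduction(n_repeats=20):
    # Imports are inside the function so this sketch only needs Mantid when run.
    # All parameter names here are assumptions; check the algorithm docs.
    from mantid.simpleapi import ReflectometryISISLoadAndProcess
    from mantid.api import AnalysisDataService

    for _ in range(n_repeats):
        ReflectometryISISLoadAndProcess(
            InputRunList="77064",       # assumed run from the GUI steps
            SliceWorkspace=True,        # assumed time-slicing switch
            TimeInterval=5,             # 5 second slices, as in the GUI steps
            OutputWorkspace="IvsQ_77064",
        )
        AnalysisDataService.clear()     # clear workspaces between repeats
```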
Depending on the results of the Valgrind scan, if further investigation is needed it will need to be done by someone with access to a Linux machine, as I still can't replicate this on Windows.
I've done a bit more testing on IDAaaS and this is starting to look like a more general problem. I've found that if you run the following script then the memory doesn't build up:

However, if you do the above in any of the following ways then the memory does build up:

This doesn't seem to happen in version 6.9.1 of Mantid, and again doesn't happen on Windows. When the memory fills up with this test it seems to eventually cause Mantid to freeze and become unresponsive on IDAaaS rather than crash.
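To put numbers on this kind of build-up, peak resident memory can be sampled between repeats. The sketch below is generic and uses only the Python standard library; the `workload` lambda is a placeholder for whichever load/clear combination is being tested, not anything from Mantid.

```python
import resource
import sys

def peak_rss_mib():
    """Peak resident set size of this process, in MiB.

    ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss / 1024 if sys.platform != "darwin" else rss / (1024 * 1024)

def measure(workload, repeats=5):
    """Run `workload` repeatedly and return peak RSS (MiB) after each repeat."""
    readings = []
    for _ in range(repeats):
        workload()
        readings.append(peak_rss_mib())
    return readings

# Placeholder workload: allocate and drop a large list.
readings = measure(lambda: [0] * 1_000_000, repeats=3)
```

If the workspace clearing were working as expected, the readings should plateau after the first repeat rather than climb steadily.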
The steps were followed in IDAaaS using
Valgrind log file for reference
One billion permutations later, here are my results. Testing on macOS to follow.

**Testing on Ubuntu 22.04. 32GiB RAM.**

**Run Script Once: Loop Loading Call**
```python
for i in range(15):
    AnalysisDataService.clear()
    LoadISISNexus(Filename="WISH57086", LoadMonitors="Include", OutputWorkspace="test")
```

**Run Script Multiple Times: Single Loading Call**
```python
# Repeatedly ran 3 times.
AnalysisDataService.clear()
LoadISISNexus(Filename="WISH57086", LoadMonitors="Include", OutputWorkspace="test")
```

**Run Script Multiple Times: Swap Clear Order**
```python
# Repeatedly ran 4 times.
LoadISISNexus(Filename="WISH57086", LoadMonitors="Include", OutputWorkspace="test")
AnalysisDataService.clear()
```

**No Script**
```python
# Click Load and load WISH57086 from the algorithm dialog.
# Clear the ADS using the clear button.
```

**Load with Algorithm Dialog, Clear From Script**
```python
# Click Load and load WISH57086 from the algorithm dialog,
# then run the below from a script.
AnalysisDataService.clear()
```

**Load with a Script, Clear from the Button**
```python
# Run the below script to load WISH57086,
# then clear using the button on the workbench GUI.
LoadISISNexus(Filename="WISH57086", LoadMonitors="Include", OutputWorkspace="test")
```
**Testing on macOS 14.4.1 (Intel). 32GB RAM.**

**No Script**
```python
# Click Load and load WISH57086 from the algorithm dialog.
# Clear the ADS using the clear button.
```
@cailafinn Are you launching with the `workbench` entry point?
We've noticed that the version of the Nightly from 24th May (which is the first to include the jemalloc pin) is looking much better on IDAaaS now, when started with the launcher from the Applications menu (FYI @sf1919). Tom has explained that launching the Nightly using the IDAaaS launcher sets up jemalloc, whereas using the workbench entry point does not use the script that sets up jemalloc.
Re-tested launching using the IDAaaS launcher. Seems to fix the issue. Do we need to alter the launch instructions for Linux? We currently recommend starting it using the `workbench` entry point.
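Since the launch method apparently matters because of jemalloc preloading, it can be useful to confirm whether jemalloc is actually mapped into the running process. The helper below is a generic, Linux-only sketch (it reads `/proc/self/maps`); it is not part of Mantid.

```python
def jemalloc_loaded():
    """Return True if a jemalloc shared library is mapped into this process.

    Linux only: /proc/self/maps lists every mapped file, including
    anything pulled in via LD_PRELOAD.
    """
    try:
        with open("/proc/self/maps") as maps:
            return "jemalloc" in maps.read()
    except FileNotFoundError:
        return False  # no procfs (e.g. macOS/Windows)
```

Running this from the Workbench script window under each launch method would show whether the `workbench` entry point really skips the jemalloc setup.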
Some of my findings when loading and clearing all via the GUI (all nightly versions here are 6.9.20240524.1804):

On IDAaaS:
With the further changes to IDAaaS last week, how are things looking @thomashampson?
Please read all comments to get the full picture of the issue, as the problem appears to be more general than first described here.
Describe the bug
We've had reports of Mantid crashing due to memory use when users are auto-processing INTER experiment data while time slicing. I've done a bit of investigating and it looks to me like there may be a memory leak somewhere. This is most noticeable when we time slice in the GUI.
To Reproduce
The easiest way to replicate is as follows:

1. Open `Interfaces` -> `Reflectometry` -> `ISIS Reflectometry`.
2. Enter run `77064` and angle `0.8`.
3. Select the `(sec) slices` radio box and enter `5` to slice the data into 5 second slices.
4. Use the ranges `3-258` and `3001-3256,4001-4256,5001-5256,6001-6256,7001-7256,8001-8256,9001-9256,10001-10256`.
5. Process repeatedly and watch the memory usage grow.
If I do the above without the GUI (i.e. calling `ReflectometryISISLoadAndProcess` from a script while time-slicing and clearing workspaces between each run) then the memory doesn't seem to build up very much at all. If I run the algorithm directly from the algorithm dialog then the memory does build up more quickly and eventually causes a crash.

Expected behavior
Repeated processing of the data should not cause the memory to fill up and cause a crash if the workspace list is being cleared in between.
Platform/Version (please complete the following information):
Additional Information
Another way I've tested is to run this script repeatedly, which does not perform the time slicing, clearing the workspaces in between:
However, I haven't been able to crash Mantid using this script, even though the memory fills up to over 90% when it is run repeatedly. The memory builds up fairly slowly when tested in this way.
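The repeated non-slicing test described above might look something like the sketch below. This is an assumption, not the original script: the algorithm name comes from this issue, but the parameter names and run number are illustrative and should be checked against the algorithm documentation.

```python
def repeat_without_slicing(n_repeats=10):
    # Imports are inside the function so this sketch only needs Mantid when run.
    # Parameter names are assumptions; check the algorithm docs.
    from mantid.simpleapi import ReflectometryISISLoadAndProcess
    from mantid.api import AnalysisDataService

    for _ in range(n_repeats):
        ReflectometryISISLoadAndProcess(
            InputRunList="77064",       # assumed run from the steps above
            OutputWorkspace="IvsQ_77064",
        )
        AnalysisDataService.clear()     # clear workspaces between repeats
```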