fft_method pyfftw causes unexpected noise additions when using multi-threading #337
Comments
Update on this possible bug: I see that it still comes back in the STEPS nowcasts (not in the blending anymore), with both pyfftw and numpy as fft_methods. It seems that in the newer Dask versions we have to pin the activities to specific cores more explicitly and keep track of it. The nowcast loop still works multi-threaded, provided that numpy is used as the fft method, but all other multi-threaded processes (cascade decomposition, noise initialization, etc.) seem to go wrong. We should either fix those operations to one worker or find a different solution, I'm afraid.
Hi @RubenImhoff, thanks for the update. Would it be possible to get a better idea of the changes you are suggesting? If I understand correctly, one option would be to explicitly set some arguments for dask, right?
Hi @dnerini, of course. The simple solution is to only use one worker (thread) for the parts where it goes wrong. I have tested it by fixing the number of workers to one in the places where num_workers can be > 1. Ideally, I think we would make full use of dask. In that case, we would have to pin the work to specific cores. I believe that is possible in Dask too, but I have no experience with it. Maybe you do, or @pulkkins?
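As a minimal sketch of the simple workaround described above (not actual pysteps code; `decompose_chunk` is a hypothetical stand-in for e.g. a cascade-decomposition task), dask's threaded scheduler can be restricted to a single worker for one `compute` call while the rest of the nowcast keeps its own `num_workers` setting:

```python
# Hedged sketch, not pysteps code: force dask to one worker for the
# problematic section only. "decompose_chunk" is a hypothetical placeholder.
import dask

def decompose_chunk(i):
    # stand-in for a task that misbehaves when run on several threads
    return i * i

tasks = [dask.delayed(decompose_chunk)(i) for i in range(4)]

# num_workers=1 pins this call to a single thread, even if other parts
# of the pipeline use more workers.
results = dask.compute(*tasks, scheduler="threads", num_workers=1)
print(results)  # (0, 1, 4, 9)
```

The same `num_workers` keyword could then be left at its usual value for the nowcast loop itself, which reportedly still works multi-threaded.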
Hi everyone, the problem appears to be in pysteps/pysteps/nowcasts/steps.py (lines 708 to 709 in be8eea4). The workers inside the main loop (lines 822 to 831 in be8eea4) append their results to a list. Finally, the list is converted to a numpy array (lines 833 to 846 in be8eea4). When dask is not used, there is no concern, since the workers will always be triggered in the same order. But when dask is used, the order in which the workers are triggered is quite random, so the ensemble members end up at random positions in the resulting array. I have proposed a fix in PR #347. @RubenImhoff, could you check if this solves your issues?
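The ordering problem described above can be illustrated with a hedged, self-contained sketch (plain `concurrent.futures` threads standing in for dask workers; not the actual pysteps code): appending results in completion order scrambles the members, whereas writing into a preallocated array by member index keeps them in place.

```python
# Hedged illustration of the bug and the fix; thread pool stands in for dask.
import random
import time
import numpy as np
from concurrent.futures import ThreadPoolExecutor, as_completed

n_members = 8

def worker(j):
    # simulate a nowcast member whose compute time varies per thread
    time.sleep(random.uniform(0, 0.01))
    return j  # stand-in for the forecast field of ensemble member j

# Buggy pattern: results are appended in *completion* order, which is
# effectively random under multi-threading.
out_list = []
with ThreadPoolExecutor(max_workers=4) as ex:
    futures = [ex.submit(worker, j) for j in range(n_members)]
    for fut in as_completed(futures):
        out_list.append(fut.result())
# np.array(out_list) may have members in scrambled positions.

# Fixed pattern: preallocate and write each result at its member index,
# so completion order no longer matters.
out_arr = np.empty(n_members, dtype=int)
with ThreadPoolExecutor(max_workers=4) as ex:
    future_to_member = {ex.submit(worker, j): j for j in range(n_members)}
    for fut in as_completed(future_to_member):
        out_arr[future_to_member[fut]] = fut.result()

assert (out_arr == np.arange(n_members)).all()
```

The indexed-write pattern is the general shape of the fix; the actual change is in PR #347.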
* Bugfix: fix random placement of ensemble members in numpy array due to dask multi-threading (#337)
* Bugfix: make STEPS (blending) nowcast reproducible when the seed argument is given (#346)
* Bugfix: make STEPS (blending) nowcast reproducible, independent of number of workers (#346)
* Formatting with black

Co-authored-by: ned <daniele.nerini@meteoswiss.ch>
After some testing, I can confirm that #347 fixes the issue. :)
If I run an ensemble using multiple cores (so dask will parallelize the ensemble members over the cores), it seems that the ensemble order is lost, resulting in weird transitions (due to noise from a different member ending up in that member). See for instance (these are three 15-min instances in the forecast):
Or maybe even clearer:
If I run on 1 core, this problem does not occur, which gives the impression that it has to do with parallelizing using dask. After having contact with @mpvginde, I could not reproduce the error with the Belgian test case that we have (the figures above are with our Dutch data and setup). The only difference between our setups turned out to be that fft_method was set to pyfftw instead of numpy in the Belgian setup. After changing this, the problems disappeared when running multi-threaded. This gives the impression that pyfftw should not be used for nowcasting and blended forecasting when using more than 1 worker/thread.
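A small hedged check of the kind that distinguishes the two backends: compute FFTs of the same fields serially and concurrently and compare. With numpy's FFT the two agree; the report above suggests that swapping in pyfftw's interface (e.g. `pyfftw.interfaces.numpy_fft.rfft2`, an assumption about where to plug it in, not tested here) may not, when more than one thread is used.

```python
# Hedged sketch: verify an fft function returns identical results when
# called concurrently from several threads versus serially.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(42)
fields = [rng.standard_normal((64, 64)) for _ in range(16)]

# Serial ground truth.
reference = [np.fft.rfft2(f) for f in fields]

# Same transforms from a thread pool; map() preserves input order,
# so element-wise comparison with the reference is valid.
with ThreadPoolExecutor(max_workers=4) as ex:
    concurrent = list(ex.map(np.fft.rfft2, fields))

assert all(np.allclose(r, c) for r, c in zip(reference, concurrent))
```

Until the root cause in pyfftw's multi-threaded use is understood, keeping `fft_method="numpy"` whenever `num_workers > 1` looks like the safe configuration.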
Is this a familiar issue to you and is there anything we can do about it?