opencl backend only supports a number of harmonics (k) in range(3,11) #41

Open
mirt001 opened this issue Sep 1, 2021 · 4 comments
Labels
question Further information is requested

Comments

mirt001 commented Sep 1, 2021

Probably nobody needs k > 10, but k < 3 (that is, k = 1 and k = 2) is needed.

What surprised me even more is that the limitation exists only in the opencl backend.
The only place where k is used in an unsafe way, imo, is on line 88 in the futhark code, but I couldn't work out what ns is, or why it can't take the values I assume it would have, 3 or 5.

Interestingly enough, the corresponding code in the "python" backend is the same, except that it doesn't differentiate between trend and no trend for the value of k2p2, and there is no limitation on k there.

I would also like to understand why the python and opencl implementations calculate sigma slightly differently. Pointing me to some literature would probably suffice.

But the core issue is still for the opencl implementation to work with k=1 or k=2. If this doesn't make mathematical or technical sense, it should probably be explained in the documentation.

mirt001 added a commit to mirt001/bfast that referenced this issue Sep 1, 2021
Author

mirt001 commented Sep 1, 2021

Moreover, I just tested the mirt001-testing-k-1-2 branch on my fork, installed with !pip install git+https://github.com/mirt001/bfast.git@mirt001-testing-k-1-2#egg=bfast. That branch comments out the lines that check whether k is in k_valid.

It works with seemingly correct results on Google Colab, and I assume nothing exploded on Google's side of things.

This is far from proper testing, but whatever prompted the k > 2 limitation might not be in the code anymore.
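
For illustration, a relaxed check that warns for k = 1 and k = 2 instead of rejecting them could look roughly like the sketch below. The names K_TUNED, K_MAX and check_k are hypothetical and are not taken from the bfast code base. Only the range(3, 11) restriction comes from the title of this issue.

```python
import warnings

# Assumption: the OpenCL kernels were tuned for k in range(3, 11),
# as stated in the issue title. Both names below are hypothetical.
K_TUNED = range(3, 11)
K_MAX = 10


def check_k(k):
    """Validate the number of harmonics, warning (not failing) for small k."""
    if isinstance(k, bool) or not isinstance(k, int):
        raise TypeError("k must be an integer")
    if k < 1 or k > K_MAX:
        raise ValueError(f"k must be between 1 and {K_MAX}, got {k}")
    if k not in K_TUNED:
        # k = 1 or k = 2: allowed, but outside the range the kernels were tuned for.
        warnings.warn(
            f"k={k} is outside the tuned range {K_TUNED.start}..{K_TUNED.stop - 1}; "
            "GPU performance may be suboptimal",
            RuntimeWarning,
        )
    return k
```

With a check like this, the possible slowdown is reported to the user rather than turned into a hard error.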

Collaborator

mortvest commented Sep 1, 2021

This limitation has been there since before I joined the project, so I don't know exactly why it exists. I think it has something to do with the performance of the GPU version: 2k + 1 is the inner dimension of multiple vector/matrix operations in the code, and the GPU kernels are tuned with this assumption in mind. The code should still run for k=1 and k=2, but the performance would probably be suboptimal. More testing needs to be done.
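
To make the 2k + 1 dimension concrete, here is a minimal sketch of one row of a harmonic regression design matrix. It assumes the usual layout of an intercept, an optional linear trend, and one sine/cosine pair per harmonic. The function names and the column order are illustrative assumptions and may differ from what the bfast backends actually do.

```python
import math


def harmonic_design_row(t, freq, k, trend=True):
    """Return one design-matrix row for time index t with k harmonics."""
    row = [1.0]                    # intercept
    if trend:
        row.append(float(t))       # linear trend term
    for j in range(1, k + 1):      # one sine/cosine pair per harmonic
        angle = 2.0 * math.pi * j * t / freq
        row.append(math.sin(angle))
        row.append(math.cos(angle))
    return row


def n_coefficients(k, trend=True):
    """Row length: 2k + 2 with a trend term, 2k + 1 without."""
    return 2 * k + 2 if trend else 2 * k + 1
```

Under these assumptions, rows without a trend term have 3 entries for k = 1 and 5 entries for k = 2, against 7 for k = 3, the smallest value the opencl backend currently accepts.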

mortvest added the question label Sep 1, 2021
Author

mirt001 commented Sep 4, 2021

@mortvest I understand that performance might be suboptimal compared to k=3, but it is still better than the R bfast. Also, the python and opencl backends are at least comparable with k=1, and I believe opencl is still faster.
At most, the performance decrease for k=1 and k=2 on opencl should be documented and the choice left to the user, especially since, imo, most users need specifically k=1 and k=2.

What do you have in mind for testing? I have to run bfastmonitor quite a few times these days. If I can watch out for something extra that would help with development, that would be great. I am currently running my fork that allows me to run with k=1 and k=2. No issues so far.

Collaborator

mortvest commented Sep 4, 2021

Thanks for your input @mirt001, I didn't know that it was a popular setup. By testing, I meant measuring how large the performance decrease is and checking whether the results are actually correct. The latter is probably true. Regarding the former, we would probably need to retune some parameters for the GPU version. I'll look into it next week.
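
Both checks could be run together with a small harness along the lines of the sketch below. It is only an outline: compare_backends is a hypothetical helper, and run_reference and run_candidate stand for user-supplied callables that wrap the two backends and return a flat sequence of floats for a given k. No bfast API is assumed here.

```python
import math
import time


def _close(expected, actual, rel_tol, abs_tol):
    # Treat NaN on both sides as agreement, e.g. for masked pixels.
    if math.isnan(expected) and math.isnan(actual):
        return True
    return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol)


def compare_backends(run_reference, run_candidate, k_values,
                     rel_tol=1e-6, abs_tol=1e-9):
    """Run both callables for each k; record agreement and wall-clock time."""
    report = []
    for k in k_values:
        start = time.perf_counter()
        expected = run_reference(k)
        t_reference = time.perf_counter() - start

        start = time.perf_counter()
        actual = run_candidate(k)
        t_candidate = time.perf_counter() - start

        match = len(expected) == len(actual) and all(
            _close(e, a, rel_tol, abs_tol) for e, a in zip(expected, actual)
        )
        report.append({
            "k": k,
            "match": match,
            "t_reference": t_reference,
            "t_candidate": t_candidate,
        })
    return report
```

Running it over k = 1, 2 and 3 would show, for each k, whether the two outputs agree within tolerance and how the timings compare. A single timed call is noisy, so in practice each measurement should be repeated.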
