I'm not very familiar with this topic, so I was wondering whether someone could confirm that my assumption is correct, given that in the c_code of the function in Theano it is defined as:
DEVICE double _psi(ga_double x)
How would it be possible to make the code generic so that it could also run in float32?
Also, is there a good reason not to do that?
It is true that the code will always run in double precision the way it is written. We might want to make the function a template so that it can also use float32 for speed. However, the constants in the code were computed for float64, and I don't know how well they would carry over to float32.
So at this stage, I'll first ask: how exactly is the function's C code inserted when compiling a composite Elementwise operation or a CUDA kernel? Is it emitted as a separate function, which could take the dtype as a template parameter, or does it need %(dtype)s substitution as in some of the other code? I have no idea how this conversion works, and knowing that would give me a better perspective on the problem.
From the code of the two operators, it seems that the most unstable computations are of the form 1/x, 1/(x+1), ... and 1/x^2, 1/(x+1)^2, ..., and these are turned off when x is below a predefined value. As far as numerical stability is concerned, the only issue arises when x is very small, which is why the computation is truncated by default. Lower-precision types might need a slightly higher truncation threshold, but other than that I don't see why the rest of the computation cannot be done in other types.
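To make the template idea concrete, here is a rough sketch of what a dtype-generic digamma could look like. It is not Theano's code: the names psi_generic and psi_cutoff are hypothetical, the constants are written as exact fractions from the standard asymptotic expansion of psi (so each type rounds them itself), the 1e-4 cutoff for float is only a guess at the "slightly higher threshold" mentioned above, and returning 0 for non-positive input is just a placeholder convention. DEVICE is assumed to expand to __device__ under CUDA and to nothing on the host; I only reasoned about the host case.

```cpp
#include <cassert>
#include <cmath>

// Assumption: DEVICE is __device__ when compiled for CUDA, empty on the host.
#ifndef DEVICE
#define DEVICE
#endif

// Hypothetical per-type cutoff: below it, psi(x) is approximated by -gamma - 1/x.
template <typename T> struct psi_cutoff;
template <> struct psi_cutoff<double> {
    static DEVICE double value() { return 1.0e-5; }
};
template <> struct psi_cutoff<float> {
    static DEVICE float value() { return 1.0e-4f; }  // assumed, not tuned
};

// Digamma for x > 0: recurrence psi(x) = psi(x + 1) - 1/x until the
// argument reaches 8.5, then a truncated asymptotic series.
template <typename T>
DEVICE T psi_generic(T x) {
    const T euler = T(0.57721566490153286);  // Euler-Mascheroni constant
    const T shift_to = T(8.5);

    if (x <= T(0)) return T(0);  // placeholder: poles and negatives not handled
    if (x <= psi_cutoff<T>::value()) return -euler - T(1) / x;

    T result = T(0);
    T y = x;
    while (y < shift_to) {  // shift the argument upward
        result -= T(1) / y;
        y += T(1);
    }

    // psi(y) ~ log(y) - 1/(2y) - 1/(12 y^2) + 1/(120 y^4) - 1/(252 y^6)
    T r = T(1) / y;
    result += std::log(y) - T(0.5) * r;
    r *= r;
    result -= r * (T(1) / T(12) - r * (T(1) / T(120) - r * (T(1) / T(252))));
    return result;
}
```

Calling psi_generic(1.0f) stays entirely in float, while psi_generic(1.0) uses double; only the cutoff differs between the two instantiations. Whether this fits depends on the open question above about how the C code is actually inserted into the kernel.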