Make CPU reduction more precise in float32 #1092

Closed
nouiz opened this issue Nov 23, 2012 · 2 comments

nouiz (Member) commented Nov 23, 2012

See: https://groups.google.com/d/topic/theano-dev/g2w3LhJMSH8/discussion

Possible fixes:

  • Do the reduction in float64. This may not slow things down, since the reduction should be memory bound, but that needs testing (a minimal sketch follows this list).
  • Do multiple levels of summation. This would also help for other dtypes like int and float64. In the int case, it would help prevent overflow when there is a mix of positive and negative values.
    • We could do a sum per row. Probably an easy code change, but it won't help for big rows.
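
A minimal sketch of the first fix, assuming a plain NumPy setting rather than Theano's actual C implementation (the array contents and size are made up for illustration). It contrasts a sequential float32 accumulator, which is how a naive C reduction loop behaves, with accumulation in float64:

```python
import numpy as np

x = np.full(10**6, 0.1, dtype=np.float32)  # true sum is ~1e5

# Sequential float32 accumulation, mimicking a naive C reduction loop.
acc32 = np.float32(0.0)
for v in x:
    acc32 += v

# The same reduction, but with a float64 accumulator (fix 1 above).
acc64 = x.sum(dtype=np.float64)

print(acc32)  # drifts noticeably away from 100000
print(acc64)  # ~100000.0015 (only the float32 rounding of 0.1 remains)
```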

I am making this high priority because, when running on the GPU in DebugMode, this causes errors: the GPU sum already does this, so it is more precise.

But fixing that spurious GPU sum error will probably create a new one, between Theano's c_code and the NumPy code, since the NumPy code doesn't do this.

nouiz (Member, Author) commented Jan 14, 2013

See numpy/numpy#2448, which aims to be more general. Some of the suggestions seem quick to implement, so why not fix this upstream? A sketch of the multi-level summation idea follows.
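
A hedged sketch of the "multiple levels of summation" idea from the list above, written here as pairwise (cascade) summation, where rounding error grows roughly as O(log n) instead of O(n). The function name `pairwise_sum` and the `block` cutoff are illustrative, not NumPy's actual API or implementation:

```python
import numpy as np

def pairwise_sum(a, block=128):
    # Illustrative pairwise summation; `pairwise_sum` and `block`
    # are made-up names, not part of NumPy's API.
    n = len(a)
    if n <= block:
        s = a.dtype.type(0)  # accumulate in the input dtype
        for v in a:
            s += v
        return s
    mid = n // 2  # split in half, sum each half, then combine
    return pairwise_sum(a[:mid], block) + pairwise_sum(a[mid:], block)

x = np.full(10**6, 0.1, dtype=np.float32)
print(pairwise_sum(x))  # stays close to 1e5 even in pure float32
```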

lamblin (Member) commented Mar 26, 2013

Fixed by gh-1226.

lamblin closed this as completed Mar 26, 2013