
Integer convolutions have rounding errors #410

Open
rpigott opened this issue Mar 14, 2021 · 4 comments · May be fixed by #545

Comments

@rpigott

rpigott commented Mar 14, 2021

DSP contains the following definition:

conv(u::AbstractArray{<:Integer, N}, v::AbstractArray{<:Integer, N}) where {N} =
    round.(Int, conv(float(u), float(v)))

The float conversion and subsequent rounding introduce rounding errors whenever the floating-point epsilon near the computed values is greater than 1, even though an array of Int64 is returned. For example, the following result is unexpected to me:

using DSP
n = 314159265 # n < isqrt(typemax(Int)), no overflow
conv([n], [n]) == [n^2] # false, rounding error!
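The size of the error is easy to predict from the floating-point spacing alone (a quick check, independent of DSP):

```julia
n = 314159265
n^2                       # 98696043785340225, well within Int64 range
eps(Float64(n^2))         # 16.0: adjacent Float64s are 16 apart at this magnitude
Int(Float64(n^2)) == n^2  # false: n^2 is odd, so it cannot be a multiple of 16
```

So any algorithm that routes the result through Float64 must lose the low bits here, regardless of how accurate the FFT itself is.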

With a "naive" direct convolution, there is no error:

function dconv(a, b)
  # Zero-pad both inputs to the full output length, then evaluate the
  # convolution sum directly in the input eltype (exact integer arithmetic).
  na, nb = length(a), length(b)
  b = [b; zeros(eltype(b), na - 1)]
  a = [a; zeros(eltype(a), nb - 1)]
  [sum(a[i] * b[k - i + 1] for i in 1:k) for k in 1:(na + nb - 1)]
end
n = 314159265
dconv([n], [n]) == [n^2] # true

The Julia FAQ identifies (correctly, imo) many reasons why integer overflow is an acceptable error mode where rounding/truncation is not; overflow may even be desirable. For Julia, I also think this definition is bad for type stability, considering the following:

a = UInt32[1, 2, 3]
typeof(a) # Array{UInt32, 1}
typeof(conv(a, a)) # Array{Int64, 1}

since it always casts to Int, regardless of the input integer element type. FWIW, scipy uses machine integer arithmetic:

>>> import numpy as np
>>> import scipy.signal as ss
>>> n = 314159265
>>> ss.convolve([n], [n]) == [n**2]
array([ True])
>>> n = 2**32
>>> ss.convolve([n], [n])
array([0])  # int64 overflow

There are a couple ways to tackle this:

  1. Remove the above definition of conv for integer arrays. Users who are OK with rounding errors can perform the cast themselves, and doing so makes them more likely to avoid unexpected type-stability issues if they use an integer type other than Int64.
  2. Add a conv method for integer arrays that uses a direct algorithm like the one above (this is what scipy does). It has the expected overflow behavior for integer types. Again, users who can accept rounding errors can cast their data themselves if they need a faster algorithm.
  3. If we want to be cool, include a fast exact integer convolution method. I guess this would use an FFT over a ring of integers, as in Schönhage–Strassen. It would probably only make sense for very large arrays.
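For option 2, here is a minimal sketch of what an eltype-preserving direct method could look like (`dconv_int` is a hypothetical name, not an existing DSP.jl function): it accumulates in the promoted input type, so UInt32 inputs stay UInt32 and overflow wraps like ordinary machine arithmetic, matching scipy's behavior.

```julia
# Hypothetical sketch for option 2: direct integer convolution that keeps
# the promoted input eltype instead of casting everything to Int64.
function dconv_int(a::AbstractVector{<:Integer}, b::AbstractVector{<:Integer})
  T = promote_type(eltype(a), eltype(b))
  out = zeros(T, length(a) + length(b) - 1)
  for i in eachindex(a), j in eachindex(b)
    out[i + j - 1] += T(a[i]) * T(b[j])  # wraps on overflow, never rounds
  end
  out
end

dconv_int([314159265], [314159265]) == [314159265^2]  # true, exact
eltype(dconv_int(UInt32[1, 2, 3], UInt32[1, 2, 3]))   # UInt32, not Int64
```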
@galenlynch
Member

Currently DSP.jl only has FFT-based convolution algorithms (overlap-save and plain FFT) built on FFTW. The method you quote exists because FFTW only works with floats, so conversion and rounding are necessary for the available algorithms. I have a PR with a number of fast direct convolution kernels, but it has stalled due to my temporary lack of bandwidth and uncertainty about how to select which algorithm to use when there are a decent number to pick from (as is the case in my PR).

@rpigott
Author

rpigott commented Mar 14, 2021

Thanks, I understand. Maybe conv should just always return a float result for now, then? At least until an integer convolution method is added. To me that makes more sense than rounding the data: the error is still there, but it is obvious why. It also avoids the cast to Int64.

@galenlynch
Member

Sorry, I don't think I read your proposal carefully enough the first time. I think your points 2 and 3 are clearly the best way forward, and they are indeed what I have attempted to do. If I understand your first suggestion, it is to not support integer convolution at all, and instead ask users to write their own methods, because rounding is involved. I'm not sure I agree that stripping away that functionality for the sake of absolute correctness is worth it, but it is an interesting suggestion; I would be curious to hear whether others would prefer no answer to an approximate one. You're right, though, that this rounding behavior could be documented more clearly, since it is likely surprising.

@galenlynch
Member

I like your suggestion of just returning the floating point result.
