New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added discrete module, transforms #14725

Merged
merged 11 commits into from May 24, 2018

Conversation

Projects
None yet
4 participants
@sidhantnagpal
Member

sidhantnagpal commented May 18, 2018

Brief description of what is fixed or changed

The following changes will be covered in this PR:

  • Add discrete module
  • Fast Fourier Transform
  • Number Theoretic Transform

Other comments

Fast Walsh Hadamard Transform and remaining transforms will be a part of separate PR.

sidhantnagpal added some commits May 18, 2018

a power of 2.
"""
if not hasattr(seq, '__iter__'):

This comment has been minimized.

@jksuom

jksuom May 18, 2018

Member

SymPy iterable would probably be better:

When SymPy is working with iterables, it is almost always assuming
that the iterable is not a string or a mapping, so those are excluded
by default.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 18, 2018

Member

Okay, will change it to iterable(seq).

raise TypeError("Expected a sequence of numeric coefficients " +
"for Fourier Transform")
a = sympify(seq)

This comment has been minimized.

@jksuom

jksuom May 18, 2018

Member

This could be made a list to begin with: a = [sympify(arg) for arg in seq]. (a = list(map(sympify, seq)) is also often used.)

return
while n&(n - 1):
n += n&-n

This comment has been minimized.

@jksuom

jksuom May 18, 2018

Member

I find this difficult to follow. I would rather use something like this:

if n & (n-1):  # not a power of 2
    n = 2**n.bit_length()

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 18, 2018

Member

Actually n&-n masks all bits but the LSB of n. So, the carry is propogated until there is only 1 bit set. It is O(k) - k is the set bits in n.
It is an optimized version of the following:

while n&(n - 1):
   n += 1 

I haven't run tests but I think it will be faster than calculating n.bit_length() which will always be linear in the position of MSB. If you want I can benchmark them both.

This comment has been minimized.

@jksuom

jksuom May 18, 2018

Member

I don't think it would be necessary to optimize this. The time will be nanoseconds anyway.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 18, 2018

Member

Okay, will make the changes if you think it is more readable.

@jksuom

This comment has been minimized.

Member

jksuom commented May 18, 2018

Can you add a reference for this transform?

@sidhantnagpal

This comment has been minimized.

Member

sidhantnagpal commented May 18, 2018

Sure. Are doctests also required?

@jksuom

This comment has been minimized.

Member

jksuom commented May 18, 2018

I am not sure about doctests. Perhaps some simple illustrative one could be added, maybe n = 8.

"""
Performs the DFT in complex field, by padding zeros as
radix 2 FFT, requires the number of sample points to be
a power of 2.

This comment has been minimized.

@asmeurer

asmeurer May 18, 2018

Member

This is unclear. Does the function do the padding itself or is the user required to do so?

This comment has been minimized.

@asmeurer

asmeurer May 18, 2018

Member

I see, you are saying here that it does do the padding. The incorrect comma after "FFT" confused me.

I think this would be clearer

Performs the discrete Fourier transform (DFT) in the complex domain

The sequence is automatically padded to the right with zeros, as the radix 2 FFT requires the number of sample points to be a power of 2.

Also this would be better put in the public API functions.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 19, 2018

Member

Sure. Will update it.

return a
def fft(seq):

This comment has been minimized.

@asmeurer

asmeurer May 18, 2018

Member

These need docstrings with doctests.

This comment has been minimized.

@asmeurer

asmeurer May 18, 2018

Member

Also if there is a reference you used for the implementation add it here.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 19, 2018

Member

Yes, as discussed, I will be adding them.

a[i], a[j] = a[j], a[i]
ang = -2*pi/n if inverse else 2*pi/n
w = [exp(I*ang*i).expand(complex=True) for i in range(n // 2)]

This comment has been minimized.

@asmeurer

asmeurer May 18, 2018

Member

expand does a lot of unnecessary work. I would just write cos(ang*i) + I*sin(ang*i) directly here.

n = len(a)
if n < 2:
return

This comment has been minimized.

@asmeurer

asmeurer May 18, 2018

Member

Is this correct? This will return None.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 19, 2018

Member

Thanks for pointing out. This should be changed to return a.

sidhantnagpal added some commits May 19, 2018

if symbolic:
w = [cos(ang*i) + I*sin(ang*i) for i in range(n // 2)]
else:
w = [exp(I*ang*i).evalf() for i in range(n // 2)]

This comment has been minimized.

@asmeurer

asmeurer May 21, 2018

Member

As I noted on gitter, I think this will already just do the right thing if the input is evalf'd first. Is this not the case?

This comment has been minimized.

@asmeurer

asmeurer May 21, 2018

Member

I guess it doesn't for exp, but it does for sin and cos. Maybe exp should be fixed here. Or you could just detect if the input is floating point (actually Float + Float*I) automatically.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 21, 2018

Member

As I noted on gitter, I think this will already just do the right thing if the input is evalf'd first. Is this not the case?

Yes, apparently it does. So is it suggested to convert argument to Float explicity (to handle the integer case, where it remains unevaluated).

Maybe exp should be fixed here

Does this mean that its behaviour should be similar to sin and cos for floating point inputs?
(Currently both sin/cos yield a float if input is floating (real/complex), but this is not the case with exp which as you mentioned evaluates only for real floats)

This comment has been minimized.

@jksuom

jksuom May 21, 2018

Member

Is it necessary to use exp at all if sin and cos can be converted to Floats (by making ang a Float of suitable precision)?

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 21, 2018

Member

Fair point. In that case, we might just need to devise a way for deciding a suitable precision.

This comment has been minimized.

@asmeurer

asmeurer May 23, 2018

Member

It's probably more efficient to use exp. But also not really worth separate code paths IMO, since it should just do the right thing on floating point inputs.

Fair point. In that case, we might just need to devise a way for deciding a suitable precision.

I think if you let exp evaluate itself (fix the issue I noted on gitter), then the precision won't be an issue. Because the exp will just evaluate to whatever precision the input is, just like sin and cos do now.

@asmeurer

This comment has been minimized.

Member

asmeurer commented May 21, 2018

Yeah exp should be made to auto evaluate here (perhaps after an expand_mul).

sidhantnagpal added some commits May 23, 2018

ang = -2*pi/n if inverse else 2*pi/n
if not symbolic:
ang = ang.evalf()

This comment has been minimized.

@jksuom

jksuom May 23, 2018

Member

The precision should probably be respected here, too.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 23, 2018

Member

What changes are expected here, for deciding the precision?

This comment has been minimized.

@jksuom

jksuom May 23, 2018

Member

I think that there are several possibilities. The precision could be given as a keyword parameter or derived from the argument values. The precision for ang should probably be higher than the expected output precision to minimize rounding errors. Probably some experimentation is necessary.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 23, 2018

Member

Right, will try them out. Thanks.

def test_ntt_intt():
# prime modulo of the form (m*2**k + 1), sequence length should

This comment has been minimized.

@jksuom

jksuom May 23, 2018

Member

I think that the word is 'modulus'. 'modulo' is its ablative case that means 'by modulus'.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 23, 2018

Member

Okay, will make the changes. Thanks.

q = 2*500000003 + 1 # prime modulo only for (length = 1)
r = 2*3*5*7 # composite modulo
assert all(tf(ls, p) == ls for tf in (ntt, intt) \

This comment has been minimized.

@jksuom

jksuom May 23, 2018

Member

Backslash is not needed inside parentheses, etc. PEP 8:

The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.

return a
def ntt(seq, q):

This comment has been minimized.

@asmeurer

asmeurer May 23, 2018

Member

Should the second argument here be p, to match the docstring?

This comment has been minimized.

@sidhantnagpal

sidhantnagpal added some commits May 23, 2018

@asmeurer

This comment has been minimized.

Member

asmeurer commented May 23, 2018

Can you explain why the symbolic and dps flags are there? I think neither flag should be necessary. Functions like sin evaluate floats automatically and they do so with the correct precision:

>>> sin(Float('0.1', 100))
0.09983341664682815230681419841062202698991538801798225999276686156165174428329242760966244380406303627

So if the code just computes sin(ang*i) + I*cos(ang*i) it should just work without having to care if ang is symbolic or not. It seems like you are overengineering stuff with this symbolic flag and it isn't clear why. No other function in SymPy has a flag symbolic.

@jksuom

This comment has been minimized.

Member

jksuom commented May 23, 2018

ang is derived from the length of input data. I think that it has to be explicitly converted to a Float if sin and cos are to be Floats. Otherwise they are either unevaluated symbols or expressions with nested square roots.

@sidhantnagpal

This comment has been minimized.

Member

sidhantnagpal commented May 23, 2018

I think neither flag should be necessary

If you are suggesting to just perform the DFT (using unevaluated symbols) and then doing evalf on the coefficients obtained, it would be painfully slow for larger values n.

@sidhantnagpal

This comment has been minimized.

Member

sidhantnagpal commented May 23, 2018

No other function in SymPy has a flag symbolic

As discussed earlier, it might be named better, maybe in relevance with existing methods like evalf(), N().

@asmeurer

This comment has been minimized.

Member

asmeurer commented May 23, 2018

I mean if the function has float inputs then the result should be float, and if it has exact/symbolic inputs, then the result should be exact.

@sidhantnagpal

This comment has been minimized.

Member

sidhantnagpal commented May 24, 2018

Complex roots of unity are independent of inputs (float in your case). So, we would want floating complex roots of unity (will otherwise contain unevaluated symbols - radicals or sin/cos), for the result to be floating.

if not symbolic:
if dps is None:
dps = 15
if dps is not None:
ang = ang.evalf(dps + 10)

This comment has been minimized.

@jksuom

jksuom May 24, 2018

Member

dps + 2 would probably suffice. 10 extra decimals feel like overkill.

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 24, 2018

Member

Okay. As a follow up to efficiency, currently we have n calls to sin/cos.
It can be reduced to log(n) calls, by using w[i] = w[i&(i - 1)] * w[i&-i] (given all powers of 2 are initialized in w), at the cost of error accumulation (which can be accounted for by increasing the precision).

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 24, 2018

Member

The above method has O(log(n)) error accumulation and O(n + c*log(n)) complexity (tradeoff).
The current method has error accumulation as O(1), complexity is O(n * c), c is the constant for calculating sin/cos.
The problem with w[i]=w[i-1] * w[1] is that error accumulation is O(n) significant, complexity is O(n).

# prime modulus of the form (m*2**k + 1), sequence length should
# be a divisor of 2**k
p = 7*17*2**23 + 1
q = 2*500000003 + 1 # prime modulus (valid only for unity length)

This comment has been minimized.

@jksuom

jksuom May 24, 2018

Member

What does valid only for unity length mean?

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 24, 2018

Member

The prime modulus (q) can be used only for sequences of length 1 or 2, I will remove the comment.

This comment has been minimized.

@jksuom

jksuom May 24, 2018

Member

It is not necessary to remove all of the comment. only for sequences of length 1 or 2 would clarify.

This comment has been minimized.

@sidhantnagpal
.. [1] http://www.apfloat.org/ntt.html
.. [2] http://mathworld.wolfram.com/NumberTheoreticTransform.html

This comment has been minimized.

@jksuom

jksuom May 24, 2018

Member

A reference to Wikipedia could also be included: https://en.wikipedia.org/wiki/Discrete_Fourier_transform_(general)

This comment has been minimized.

@sidhantnagpal

sidhantnagpal May 24, 2018

Member

Okay, will do.

@jksuom

This comment has been minimized.

Member

jksuom commented May 24, 2018

Thanks, this can be merged.

@jksuom jksuom merged commit 0c967ae into sympy:master May 24, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment