Output scaling factor #15

Open
KlepD-SAL opened this issue Jan 12, 2023 · 2 comments

Comments

@KlepD-SAL

KlepD-SAL commented Jan 12, 2023

Hello!

In the spec, it is mentioned that the core calculates the regular DFT "to within a scale factor".

After trying out the FFT core with several different configurations, it is not quite clear to me what this scaling factor is or where it comes from.

As a reference, I compared the FFT core's results to a Python FFT (NumPy, as well as a manual DFT loop) and observed the following:

| Input bitwidth (N) | Output bitwidth | FFT length (F) | Scale factor |
|---|---|---|---|
| 8 | 10 | 8 | 2 |
| 8 | 11 | 16 | 1 |
| 8 | 11 | 32 | 4 |
| 8 | 12 | 64 | 2 |
| 8 | 12 | 128 | 8 |
| 8 | 13 | 256 | 4 |
| 8 | 13 | 512 | 16 |

(Command for core generation: `fft-gen -f F -n N -p 1024 -x 4`)

The output bitwidth was not manually capped, so the generator decided how many bits are required. The same test was performed with an input bitwidth of 16, leading to the same scale factors. So I assume that these factors are deterministic, and since they are all powers of two, I also assume they are the result of certain implementation details (shifting, truncating, etc.).
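For context, the comparison looked roughly like this (a simplified sketch, not my actual test script; `reference_dft`, `estimate_scale`, and the test tone are just illustrative, and `core_output` would have to come from a simulation of the generated core):

```python
import numpy as np

def reference_dft(x):
    """Plain O(N^2) DFT loop, used as a cross-check against np.fft.fft."""
    n = len(x)
    k = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return w @ x

def estimate_scale(core_output, x):
    """Ratio between the ideal DFT and the core's output, taken at the
    strongest bin to avoid dividing near-zero bins."""
    ref = np.fft.fft(x)
    b = np.argmax(np.abs(ref))
    return np.abs(ref[b]) / np.abs(core_output[b])

# 8-bit test tone, as in the runs above; the core output itself comes
# from simulating the generated Verilog and is not reproduced here.
n = 64
x = np.round(100 * np.cos(2 * np.pi * 5 * np.arange(n) / n)).astype(int)
assert np.allclose(reference_dft(x), np.fft.fft(x), atol=1e-6)
```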

Unfortunately, I didn't find more information on that topic in other issues, the blog post, spec, or README.

Would it be possible to elaborate on that?

I'm also a little surprised that no one has mentioned this before, since knowing the correct scaling factor seems essential for getting "correct" results (i.e. the same results as in a co-simulation, for example).

Thanks a lot!

@ZipCPU
Owner

ZipCPU commented Jan 12, 2023

The output scales at one additional bit for every two stages. This is based upon two principles:

  1. Any incoming noise (zero mean, any standard deviation, any distribution) grows at half a bit per stage. That is, the noise variance doubles with each stage, so the standard deviation of the output noise grows with the square root of the FFT length.
  2. Any tones (i.e. non-zero-mean content) grow at one bit per stage. That is, the mean output will be at most N times the mean input, where N is the FFT length.

Therefore, the FFT drops one bit every other stage. This minimizes the computational power required, at no loss of relevant precision.
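To illustrate those two principles numerically (just a NumPy sketch, not the core's arithmetic): adding samples in pairs, as a transform stage effectively does, grows zero-mean noise by roughly sqrt(2) (half a bit), while a coherent tone doubles (a full bit).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1 << 16
noise = rng.normal(size=n)      # zero-mean noise
tone  = np.ones(n)              # a "tone": fully coherent content

def stage(x):
    # One butterfly-like step: add adjacent pairs of samples.
    return x[0::2] + x[1::2]

print(np.std(stage(noise)) / np.std(noise))   # ~1.414: half a bit of growth
print(stage(tone)[0] / tone[0])               # 2.0: one full bit of growth
```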

Dan

@KlepD-SAL
Author

Hello Dan,
thanks for your explanation!

Following that logic, the output scale should be `2 ** (log2(FFT_LEN) / 2)`, right?
`log2(FFT_LEN)` gives the number of stages, and the division by 2 accounts for the bit dropped in every other stage.

Now, if I understand the source code correctly, the first stage never drops a bit, so only the 3rd, 5th, etc. do.
This would lead to `2 ** ((log2(FFT_LEN) - 1) / 2)` for the scale factor.
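To make that concrete, here is that formula checked against the scale factors from the table above (just a quick sketch):

```python
import math

# Scale factors measured earlier (FFT length -> factor)
measured = {8: 2, 16: 1, 32: 4, 64: 2, 128: 8, 256: 4, 512: 16}

for fft_len, observed in measured.items():
    predicted = 2 ** ((math.log2(fft_len) - 1) / 2)
    print(f"F={fft_len:4d}  predicted={predicted:6.2f}  measured={observed}")
```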

What is still unclear to me is that, according to the scale factors I found, the core sometimes "un-drops" a bit.
F=16, for example, has a scale factor of 1 (2^0), so it apparently never dropped a bit. But the generated Verilog actually shows that the qtrstage does drop a bit (its input width equals its output width).

So I guess there is some other condition that affects the scale factor. Do you know what I am missing here?

Thanks and best regards,
Daniel
