New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixup underperforming GENERIC kernel for volk_8u_x4_conv_k7_r2_8u #475
Conversation
Signed-off-by: Aang23 <qwerty15@gmx.fr>
|
For reference, here is once again the exact same data (METEOR-M 2 LRPT, which is a weather satellite downlinking a QPSK CCSDS r=1/2 k=7 signal). SSE Kernel : https://ibb.co/cYFTFHs |
Can you point us to the approximate code for comparison's sake? I did a quick look through & didn't see anything obvious (one way or the other). Maybe those other programs are more obvious as to what's going on Volk's?! |
Sure! - I do not know if I am really right on any of this, as I said my understanding of this algorithm isn't really good enough for that. In libcorrect (https://github.com/quiet/libcorrect/blob/f5a28c74fba7a99736fe49d3a5243eca29517ae9/src/convolutional/decode.c#L60) what appears to play the same role is set to conv->rate (so 2 in this instance). And it seems to be identical in this other algorithm I am using (slightly patched for CCSDS codes from an old GNU Radio version, but what's in the main repo currently is identical otherwise), https://github.com/altillimity/SatDump/blob/00e2dfadd7ced52d07d208a30b517eeb0d20a772/src-core/modules/common/viterbi_lib/viterbi.cpp#L65. (Shift applied twice, |
|
Mh, well, the generic and the SSE Spiral code kind of look like code from Phil Karn’s FEC library. Karn’s SSE code uses 5 bits for the branch metric, whereas the Spiral code uses 6 (see usage of @Aang23 Please test whether decreasing |
I established the SSE Spiral kernel was initially generated by http://spiral.ece.cmu.edu/vitgen, which also explains why it is named "spiral" in the first place. (by the way, thanks for the in-depth explanation)
I will try this in a few minutes. |
|
@rear1019 I just tested your theory, with |
|
Any new on this? |
|
I can't tell if things improved now. Did you guys come to a conclusion? It seems like this is not an issue we can simply test in our CI. This kind of FEC performance regression is terribly hard to detect. Anyways, @Aang23 do you have some test code? Ideally, it is as simple as possible but allows some kind of BER/FER over SNR simulation to test performance. I could imagine adding this code to VOLK. We wouldn't run it as part of our normal CI but use it whenever we try to fix an issue such as this one. Besides, we started the process to migrate VOLK to LGPL. It'd be great to receive your support. Ideally with a simple PR against the |
I just realized that Smaller values indicate a better path metric due to the way the branch metrics are calculated. The bad paths of the trellis start off with a higher metric. Moreover, the metric of the correct path will grow slower than the metric of all other paths (assuming a low or moderate error rate). At some point a bad path may become a good one due to overflow. @Aang23 Are you able to compile GNU Radio? If so, please test whether decreasing the value |
|
I can generate some performance test code for this. I'm also having the same issue with this kernel and it's forced me to use libcorrect instead when volk alone could do the job (and do it much faster) |
I do sort of have some, (based on GNU Radio's CC Decoder), but due to the nature of the issue it is not that simple, the issue only really shows when feeding in a large (noisy) dataset. My method for testing this has been feeding real-world data from a CCSDS-compliant satellite and comparing the output quality and BER. The best I could provide to reliably show the issue would be a pre-processed (ready to be fed to the kernel) dataset to feed some basic decoding code. Sorry for the delay in replying, for some reason all mail notifications from this repository ended up in my spam folder...
Yes I am. I have tested this both in my software (SatDump, which is using an implementation based on GNU Radio's) and in GNU Radio. This did not help at all unfortunately. (With METRICSHIFT = 1)
I tried several times using shorter datasets but it really doesn't demonstrate the issue... Though, maybe you can do like I'm doing currently and provide your own fix if Spiral's not available? See https://github.com/altillimity/SatDump/blob/a739b3d9c52b30978452e52d213c7c2c023f3578/src-core/common/codings/viterbi/cc_decoder.cpp#L108 |
|
I'm curious, does this change improve decoding performance now or not? I'd be happy to merge this PR. Could you add some more info here which configurations seem to work well and which don't? Which configuration does CCSDS use? |
On my end, I have been using this PR's contents in SatDump for months as a fallback if the Spiral kernel is not available. I've had plenty of confirmations things worked as expected since then. As far as tests I have done on actual CCSDS signals go, it definitely does in every case.
Well, I have only tested with Voyager/CCSDS codes which are by far the most common for r=2,k=7 codes, so that was with G1 & G2 being 79, 109.
I can only agree on this! The issue will only really show over a decent amount of data at low SNR. It does show decently in GNU Radio by chaining an encoder, noise source (Add + Gaussian noise source) and the decoder, but that's pretty much as good as it gets for reproducing the issue without straight up testing on actual signals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. These changes work in https://github.com/altillimity/SatDump I expect to work them as least as well for everyone. I hope we can introduce some more extensive tests for our FEC kernels at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Yes wishing we had a test for this, but for now let's just get it fixed |
|
Thanks! |
|
@Aang23 if you like, you can add your name etc. to https://github.com/gnuradio/volk/blob/main/.zenodo.json |
This addresses #473
I ended up comparing this implementation with others open-source implementations such as libcorrect and viterbi_lib from GNU Radio, and accordingly tuned
volk/kernels/volk/volk_8u_x4_conv_k7_r2_8u.h
Line 103 in 79a111b
To my surprise, this did drastically improve performances to match the SSE / Spiral kernel ignoring a few details. Output from this "fixed?" generic kernel does slightly differ from the spiral implementation, but to a level that is entirely tolerable (down to a few bits on an entire 100Mb file) in my opinion.
Honestly, I do not know exactly how this fixes things in detail and did not have much clue what I was doing, so I would entirely understand if this is not accepted as a valid fix, but it does make this kernel usable compared to previous results and definitely as consistent as I would expect it to be.
I did not notice any degradation or weird behavior in my tests.