Redsea is losing contact? #94

Closed
Saeco opened this issue Aug 10, 2023 · 11 comments

@Saeco

Saeco commented Aug 10, 2023

Hello,

I really appreciate the great work you've done on this RDS decoder. I'm using it in combination with node-red and rtl_fm/rtl_udp; node-red handles switching channels and volume, displaying data, and so on.

While experimenting to find the best parameters, I noticed that redsea sometimes seems to lose contact with rtl_fm/rtl_udp. This happens especially when rtl_fm/rtl_udp sends no data because squelch is muting the output, or when the tuned frequency gives a very noisy signal. Afterwards redsea doesn't find its way back to correct demodulation. Here's an example:

$ rtl_udp -l 0 -A fast -p 0 -s 171k -F -f 101.2M | redsea --feed-through 2>/dev/udp/127.0.0.1/4567 | demux -r 171k -R 44.1k | aplay -f S16_LE -c 2 -r 44100

{"pi":"0xD394","group":"8A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","partial_ps":"WDR 4 ","di":{"artificial_head":false},"is_music":true,"partial_alt_frequencies":[101300,93900,101300,90700,101300,101300,104400],"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"2A","partial_radiotext":"Andreas Bourani ","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"14A","other_network":{"pi":"0xD395","kilohertz":89600,"tp":false},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","ps":"WDR 4 ","di":{"stereo":true},"is_music":true,"partial_alt_frequencies":[101300,93900,101300,90700,101300,101300,104400,101300,101700],"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"3A","open_data_app":{"app_name":"RadioText+ (RT+)","oda_group":"11A"},"prog_type":"Oldies music","tp":false}
Changing squelch to 1000
{"pi":"0xD394","group":"8A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","partial_ps":"WD ","di":{"dynamic_pty":false},"is_music":true,"partial_alt_frequencies":[101300,93900,101300,90700,101300,101300,104400,101300,101700,100000,101300],"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394"}
Changing squelch to 0
{"pi":"0xD394"}
{"pi":"0xD394"}
{"pi":"0xD394"}
{"pi":"0xD394"}
{"pi":"0xD394"}
{"pi":"0xD394","group":"12A","prog_type":"Alarm test","tp":false,"unknown_oda":{"raw_data":"0F ---- ----"}}
{"pi":"0xD394"}

After muting (squelch 1000), redsea does not recover once I unmute (squelch 0). The same thing happens after a period of noise. Roughly every 30 seconds (estimated, not measured) redsea still outputs the station's PI code, but nothing else. The audio, by contrast, comes back immediately after unmuting.

How can this be solved other than by restarting all four processes in the pipeline? Interrupting the audio just to get RDS data back is not very nice for the listener.

@windytan
Owner

Interesting, is there any possibility you could send me some of that raw rtl_udp output? Say, a recording like this:

  • Recording starts a few seconds before "squelch 1000"
  • Recording continues over the time it's squelched
  • Still continues at least 10 seconds after squelch 0

I know it's going to be a big file, but it would really help investigate this problem further.

I also have some questions regarding your system:

  • What kind of a computer is this? (run uname -mprs)
  • What's the output of redsea --version

@Saeco
Author

Saeco commented Aug 15, 2023

The hardware is a Raspberry Pi 3B, where uname -mprs gives: Linux 6.1.21-v7+ armv7l unknown

The redsea version is: redsea 0.21-SNAPSHOT by OH2EIQ

A raw data recording made with a simple rtl_udp -l 0 -A fast -p 0 -s 171k -F -f 101.2M > rtl_udp_output, including approx. 10 seconds of silence with squelch 1000, is in my cloud: https://magentacloud.de/s/g4sEoz6BwWy7qmP

If you need a dump with other parameters, please let me know!

Thank you very much for your help!

@windytan
Owner

And does this file exhibit the problem? If you give the file as an input to redsea, what happens?

On my M1 Mac I get the output below.

If you get fewer groups (fewer than these 27), then I suspect you may have encountered an old mystery problem where redsea on 32-bit Arm Raspberry Pi has worse tolerance for very noisy signals. I've mentioned it in the wiki once but at this point I only have wild guesses as to what the reason might be.
Could it be something about how the Arm vectorizes convolution? Or how it calculates sine and cosine? Maybe it defaults to 32-bit floating point somewhere? All guesses are welcome.

$ ./src/redsea --output-hex < ~/Downloads/rtl_udp_output
---- 8379 040B D363
D394 037A 8488 3420
D394 2375 7368 6974
D394 E362 3520 D395
D394 037F 88A3 2020
D394 8379 040C D3A3
D394 0378 EF7D 5744
D394 2376 730D 2020
D394 E363 2020 D395
D394 0379 787D 5220
D394 6374 1357 9BDF
D394 8379 040C D3A3
D394 037A 7D8E 3420
D394 2370 5744 5220
D394 E365 5608 D395
D394 037F 7D8A 2020
D394 ---- 040C D3A3
D394 0378 7DA3 5744
D394 2371 3420 4D65
D394 E365 7D2D D395
D394 0379 7DA6 5220
D394 037A 7D82 3420
D394 2372 696E 6520
D394 E365 A66F D395
D394 037F 7D88 2020
D394 3370 6280 CD46
D394 ---- ---- ----

@windytan
Owner

windytan commented Aug 17, 2023

I tested on a Raspi 3 B and got the same result, so that theory was probably wrong. But something else was found that's interesting - when the squelched signal comes back, this causes our phase error tracker to momentarily imagine a phase error of 10^5 degrees... this throws the PLL frequency off target and it doesn't easily recover. I'll experiment with a limiter later.
[Image: plot of phase_error]
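
The limiter idea can be sketched as follows. This is only an illustration, not redsea's actual code: `clampPhaseError`, `SimplePll`, the loop gains, and the clamp limit of 0.5 rad are all assumptions made up for the example. The point is that one transient error sample, however large, can move the loop frequency by at most `beta * max_error`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not taken from redsea): limit the phase error, in
// radians, before it reaches the loop filter. Requires C++17 for std::clamp.
inline float clampPhaseError(float phase_error, float limit) {
  assert(limit > 0.0f);
  return std::clamp(phase_error, -limit, limit);
}

// Minimal second-order loop update using the clamp. All constants are
// illustrative assumptions, not redsea's values.
struct SimplePll {
  float phase{0.0f};      // accumulated phase, radians
  float freq{0.0f};       // frequency offset, radians per sample
  float alpha{0.05f};     // proportional gain (assumed)
  float beta{0.001f};     // integral gain (assumed)
  float max_error{0.5f};  // clamp limit, radians (assumed)

  void step(float raw_phase_error) {
    const float e = clampPhaseError(raw_phase_error, max_error);
    freq += beta * e;            // a huge transient can shift freq by at most beta * max_error
    phase += freq + alpha * e;
  }
};
```

With these assumed gains, feeding a transient of 1e5 into `step()` changes `freq` by 0.0005 rad/sample instead of 100 rad/sample.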

@Saeco
Author

Saeco commented Aug 17, 2023

Yes, very interesting. I installed a 64-bit system on a Raspberry Pi 3B and could reproduce your test from comment 4: I got the same hex data, and this JSON output:

{"pi":"0xD394","group":"0A","di":{"artificial_head":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"2A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"14A","other_network":{"pi":"0xD395","tp":false},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","di":{"stereo":true},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"8A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","di":{"dynamic_pty":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"2A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"14A","other_network":{"pi":"0xD395","tp":false},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","di":{"compressed":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"6A","in_house_data":[20,4951,39903],"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"8A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","di":{"artificial_head":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"2A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"14A","other_network":{"pi":"0xD395","kilohertz":88300,"tp":false},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","ps":"WDR 4   ","di":{"stereo":true},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394"}
{"pi":"0xD394","group":"0A","di":{"dynamic_pty":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"2A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"14A","other_network":{"pi":"0xD395","kilohertz":92000,"tp":false},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","di":{"compressed":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"0A","di":{"artificial_head":false},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"2A","prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"14A","other_network":{"pi":"0xD395","kilohertz":98600,"tp":false},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394","group":"0A","ps":"WDR 4   ","alt_frequencies_b":{"tuned_frequency":100000,"same_programme":[99500,101700,101300,103800,104100,100500,101100]},"di":{"stereo":true},"is_music":true,"prog_type":"Oldies music","ta":true,"tp":false}
{"pi":"0xD394","group":"3A","debug":["redsea compiled without TMC support"],"open_data_app":{"app_name":"RDS-TMC: ALERT-C","oda_group":"8A"},"prog_type":"Oldies music","tp":false}
{"pi":"0xD394"}
{"pi":"0xD394"}

This looks very good, but it still gets stuck during live operation, as described in comment 1. But you found a bug in comment 5 :-) Thanks for helping! :-)

windytan added a commit that referenced this issue Aug 17, 2023
After a period of digital silence, the PLL sometimes suddenly jumps
frequencies and can't recover. This was caused by a large transient
value of phase error (on the order of 10^5). Clamping the phase error
seems to improve this.
@windytan
Owner

@Saeco, could you test once more whether the latest commit in master improves this problem?

@Saeco
Author

Saeco commented Aug 23, 2023

@windytan, hello and thanks. Today I found time to compile and test it. It recovers perfectly after silence, very well after noise, and satisfactorily after cross-modulation between two stations while manually scanning the FM band: it mostly comes back, though after noise it sometimes takes a little longer than after silence. It's very good so far. I will keep watching it and report back if I find a (reproducible) situation where it gets stuck.

Thank you very much for your help!

@Saeco
Author

Saeco commented Aug 27, 2023

I tested it under real working conditions. RDS processing is stable for hours on end as long as reception is stable, but it still gets stuck when scanning the band. I can't find a pattern in the conditions under which it stops working, so I can't provide a file for testing. It seems that decoding recovers when the period of unclean signal is short, and that the decoder is more likely to get stuck the longer that period lasts.

My idea: reset the decoder with a token sent over a UDP port, or with a watchdog if the output has been "quiet" for more than 10 seconds. What do you think? This would be a smart way to get back to decoding without interrupting listening.

@windytan
Owner

That's a really good idea! I'll try and have it reset the receiver automatically after some period of lost carrier. And I'll try to record some long-running test sets.

Notes for my future self: I suspect the PLL frequency slowly drifts into unrealistic territory during cross-modulation, or even in the presence of strong noise. After this, it can't find its way back. All noise-free stations seem to come out at a little less than 57 kHz, so maybe it's biased low. It's an interesting mystery that could be investigated at a later time.

windytan added a commit that referenced this issue Aug 27, 2023
When carrier is lost the PLL keeps looking for it and starts drifting
slowly. This can make it harder to lock on to the carrier when it
comes back. One solution is to reset the frequency back to 57 kHz
after 10 seconds of loss of block sync being detected.

Open question: is the PLL interface being used correctly?
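
A rough sketch of the reset-after-timeout idea described in this commit message, for illustration only. `SyncWatchdog` and its interface are hypothetical names, not redsea's implementation. The sketch assumes the timeout is counted in demodulated bits at the RDS bit rate of 1187.5 bit/s, so 10 seconds corresponds to 11875 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical watchdog (not redsea's actual code). It counts bits received
// without block sync and asks the caller to reset the PLL on timeout.
class SyncWatchdog {
 public:
  SyncWatchdog(float bits_per_second, float timeout_seconds)
      : timeout_bits_(static_cast<std::uint32_t>(bits_per_second * timeout_seconds)) {
    assert(timeout_bits_ > 0);
  }

  // Call once per demodulated bit. Returns true when the caller should
  // reset the PLL to its nominal frequency (57 kHz in this thread).
  bool update(bool block_sync_ok) {
    if (block_sync_ok) {
      bits_without_sync_ = 0;
      return false;
    }
    if (++bits_without_sync_ >= timeout_bits_) {
      bits_without_sync_ = 0;  // re-arm so the reset can fire again later
      return true;
    }
    return false;
  }

 private:
  std::uint32_t timeout_bits_;
  std::uint32_t bits_without_sync_{0};  // re-armed on timeout, so it never grows unbounded
};
```

Constructed as `SyncWatchdog wd(1187.5f, 10.0f)`, the watchdog fires on the 11875th consecutive bit without sync and then starts counting again, so a long outage triggers a reset every 10 seconds rather than only once.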
@windytan
Owner

Has there been any improvement from these changes?

@windytan
Owner

I'm taking that as a yes, closing the issue.

windytan added a commit that referenced this issue Jul 18, 2024
* Change any potentially overflowing signed variable to unsigned. Signed
  integer overflow is not well-defined in C++.
* A signed integer overflow may have caused the 'reset' functionality
  of #94 to stop working after ~3.5 hours.
* Another one may have broken the block synchronization after 41 days of
  data.
* Document/cassert some lines that may look like a buffer over-read.
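
As a back-of-the-envelope check of the two durations above, the arithmetic below shows which counter sizes would give similar numbers. It is a sketch under assumptions the commit message does not confirm: that the first counter is a signed 32-bit integer incremented once per input sample at the 171 kHz rate used in this thread, and that the second is a 32-bit counter incremented once per RDS bit at 1187.5 bit/s. `secondsUntilOverflow` is a name made up for the example.

```cpp
#include <cstdint>
#include <limits>

// Seconds until a counter incremented rate_hz times per second exceeds max_value.
constexpr double secondsUntilOverflow(double max_value, double rate_hz) {
  return max_value / rate_hz;
}

// Assumption: signed 32-bit counter, one increment per sample at 171 kHz.
// (2^31 - 1) / 171000 is about 12558 s, i.e. roughly 3.49 hours.
constexpr double kHoursInt32At171k =
    secondsUntilOverflow(std::numeric_limits<std::int32_t>::max(), 171000.0) / 3600.0;

// Assumption: full 32-bit range, one increment per RDS bit at 1187.5 bit/s.
// (2^32 - 1) / 1187.5 is about 3616815 s, i.e. roughly 41.9 days.
constexpr double kDaysUint32AtBitRate =
    secondsUntilOverflow(std::numeric_limits<std::uint32_t>::max(), 1187.5) / 86400.0;

static_assert(kHoursInt32At171k > 3.4 && kHoursInt32At171k < 3.6,
              "about 3.5 hours");
static_assert(kDaysUint32AtBitRate > 41.0 && kDaysUint32AtBitRate < 42.0,
              "about 41 to 42 days");
```

The match with "~3.5 hours" and "41 days" is suggestive but remains a guess about which variables were involved. Note that a signed 32-bit bit counter would overflow after about 21 days, so the second figure only fits the full 32-bit range.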