48khz sampling rate not working with 16 bit #108

Closed
stewartsiu opened this issue Sep 14, 2023 · 29 comments

@stewartsiu

stewartsiu commented Sep 14, 2023

I'm running 1.8.4 on Windows 11, sending to a renderer (gmediarender v1.42) over a wired connection. I noticed that I can only use the following formats without significant stutter or errors:
WAV 16/44.1
WAV 24/48
FLAC 24/48 (smooth but significant latency)

What I really want to use is WAV 16/48, because I'm trying to stream YouTube, whose native audio codec (Opus) runs at 48 kHz. Do we know why 16/44.1 and 24/48 work, but not 16/48? 24/48 is OK, but I'm trying to save bandwidth and reduce the stutters that always occur after a while.

[Edit: For future reference, the correct characterization is that all sample rates eventually drop/stutter, and 16/48 is just a bit worse than others]
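
For a sense of the bandwidth at stake between these formats, here is a rough calculation of the raw PCM payload rate. It is only a sketch: it assumes stereo (the channel count is not stated in the thread) and ignores WAV header and HTTP overhead. The function name is made up for illustration.

```rust
/// Uncompressed PCM payload rate in bytes per second.
/// `channels` is an assumption here (2 = stereo); the thread does not state it.
pub fn pcm_bytes_per_second(sample_rate_hz: u32, bits_per_sample: u32, channels: u32) -> u32 {
    sample_rate_hz * (bits_per_sample / 8) * channels
}
```

Under the stereo assumption this gives 176,400 B/s for 16/44.1, 192,000 B/s for 16/48 and 288,000 B/s for 24/48, so 16/48 would need about a third less bandwidth than 24/48.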

@dheijl
Owner

dheijl commented Sep 14, 2023

swyh-rs does not touch the sample rate, it receives samples from Windows as F32, converts them to the desired integer bit depth, and streams them immediately to the rendering device in 8KB chunks.
Buffering is a function of the rendering device, and I suppose that gmediarender uses an input buffer that is too small in some cases. I have no idea if you can change the input buffer size in gmediarender.

FLAC is better because the compression saves bandwidth and the FLAC decoder is probably a library external to gmediarender anyway, but the compression introduces extra latency that cannot be avoided.
Gmediarender does not have a good reputation and seems to be unmaintained (source: https://github.com/hzeller/gmrender-resurrect).
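
The F32-to-integer conversion described above can be sketched roughly as follows. This is a minimal illustration, not the actual swyh-rs code: the function names, the clamping, and the scaling factors are assumptions.

```rust
/// Convert one f32 sample in [-1.0, 1.0] to a signed 16-bit PCM value.
/// Out-of-range input is clamped first (an assumption, not taken from swyh-rs).
pub fn f32_to_i16(sample: f32) -> i16 {
    (sample.clamp(-1.0, 1.0) * i16::MAX as f32) as i16
}

/// Convert f32 samples to little-endian 16-bit PCM bytes (the WAV byte order).
pub fn to_pcm16_le(samples: &[f32]) -> Vec<u8> {
    samples
        .iter()
        .flat_map(|&s| f32_to_i16(s).to_le_bytes())
        .collect()
}

/// Convert f32 samples to little-endian 24-bit PCM bytes:
/// scale to 2^23 - 1 and keep the three low-order bytes of the i32.
pub fn to_pcm24_le(samples: &[f32]) -> Vec<u8> {
    const MAX_24: f32 = 8_388_607.0;
    samples
        .iter()
        .flat_map(|&s| {
            let v = (s.clamp(-1.0, 1.0) * MAX_24) as i32;
            let b = v.to_le_bytes();
            [b[0], b[1], b[2]]
        })
        .collect()
}
```

Either way the sample rate is untouched; only the number of bytes per sample changes (2 for 16 bit, 3 for 24 bit).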

@stewartsiu
Author

Thanks for the reply! If I understand correctly, the 8 KB is the only additional tunable parameter on the client side? The gmediarender I'm using is a commercial fork (Gustard R26 streamer), and it works without any stutter at any sampling rate (e.g. 384k) if I use foobar2000's plugin foo_out_upnp (https://www.foobar2000.org/components/view/foo_out_upnp), so it seems like there should be a way to do the same thing with Windows audio, with at worst more latency. Right now my 24/48 stream via swyh often starts to stutter after just a few minutes.

@dheijl
Owner

dheijl commented Sep 16, 2023

When using foobar2000, are you playing local music files or are you using an internet streaming source?

The 8 KB HTTP streaming buffering is not controlled by swyh-rs; the HTTP library does it.

Does enabling/disabling chunked transfer make any difference?

Swyh-rs tries to minimize the delay, which is why there is no additional internal buffering, but this means that the receiving end is responsible for preventing stuttering by providing an adequate input buffer (and introducing some delay).
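
To put numbers on how little audio each 8 KB block carries, here is a small sketch. It assumes raw stereo PCM (the channel count is not stated in the thread), and the function name is hypothetical.

```rust
use std::time::Duration;

/// Playback time covered by `chunk_bytes` of raw PCM.
/// `channels` is an assumption (2 = stereo).
pub fn chunk_duration(
    chunk_bytes: u32,
    sample_rate_hz: u32,
    bits_per_sample: u32,
    channels: u32,
) -> Duration {
    let bytes_per_second = sample_rate_hz * (bits_per_sample / 8) * channels;
    Duration::from_secs_f64(f64::from(chunk_bytes) / f64::from(bytes_per_second))
}
```

With these assumptions an 8192-byte block holds roughly 46 ms of audio at 16/44.1, 43 ms at 16/48 and 28 ms at 24/48. A renderer whose input buffer holds less audio than the worst-case gap between arriving blocks would underrun and stutter.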

@stewartsiu
Author

stewartsiu commented Sep 17, 2023

Streaming with foobar2000's foo_out_upnp was with local music files. As for chunked transfer: it actually did not work when the disable chunked transfer option was on, whatever sampling rate I used. For what it's worth, I looked at streaming_server.rs and rebuilt your code with the (None, 8192) changed to some other random values to see what happens. If the stream size / chunked thresholds are too extreme it doesn't work, but when streaming does work with 24/48 WAV, it still eventually starts to stutter after a few minutes, like it does with 16/48. In contrast, there is no stuttering with foobar2000 whatsoever.

@dheijl
Owner

dheijl commented Sep 17, 2023

So it looks like the input buffering of the Gustard has problems with the 8 KB buffer size that Rust uses in std::io when copying data (copy_to() and copy_from()).
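
If the 8 KB block size really does come from a fixed internal copy buffer, one way to experiment with other sizes would be a hand-written copy loop like the sketch below. This is only an illustration: `copy_with_buffer` is a hypothetical helper, and whether the HTTP server library would let such a loop be plugged in is a separate question.

```rust
use std::io::{self, Read, Write};

/// Copy everything from `reader` to `writer` using a caller-chosen buffer size.
/// Returns the total number of bytes copied.
/// Note: a `buf_size` of 0 makes `read` return 0 at once, so nothing is copied.
pub fn copy_with_buffer<R: Read, W: Write>(
    reader: &mut R,
    writer: &mut W,
    buf_size: usize,
) -> io::Result<u64> {
    let mut buf = vec![0u8; buf_size];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => return Ok(total), // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Each write carries at most `buf_size` bytes.
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }
}
```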

@stewartsiu
Author

stewartsiu commented Oct 1, 2023

Two new observations:

  1. 16/48 actually works if I restart the streamer. It's just that once it starts stuttering or stopping, other sampling rates (even 24/384) work for a few minutes if I restart swyh, but 16/48 still stutters if I restart swyh.
  2. When I try to restart swyh while there's stuttering, I get "Streaming to ... has ended" even though the variable nclients=1.

As an experiment I tried foo_out_upnp in between these stuttering situations with all sampling rates, without a streamer restart, and it has no problems sending WAV files to the streamer. Restarting swyh in those cases would not work unless I restart the streamer. So I suspect that foo_out_upnp is using the API differently from swyh-rs, but I don't know the UPnP protocol so I don't have any idea what it could be. In case it's helpful I've attached the three service XML files from my Gustard renderer in a zip file:
render_xml.zip

Edit: Rereading dheijl's comment, maybe the conclusion is just that the incompatibility is inherent to Rust libraries? I don't really know Rust but would be interested in trying any changes you suggest.

@dheijl
Owner

dheijl commented Oct 2, 2023

I don't think it has anything to do with the UPNP protocol, as everything on that side seems to work OK. It's the HTTP streaming that does not work as it should.
And it all points to latency/buffering problems while streaming, but I have no clue where to start looking.
And fiddling with the streaming on the Rust side is not easy: it would mean ripping out the HTTP server (tiny-http) and replacing it with something that allows one to play with the output buffering strategy.
It is also possible that foo_out_upnp does not use HTTP streaming at all but something entirely different, based on the info in the actual service description. It's not open source. Wireshark sniffer traces would show the differences.

@stewartsiu
Author

If HTTP is the problem, here's another potential hint:
disable_chunk_encoding does not work with the current line as follows:
stream_size, chunk_threshold = Some(usize::MAX - 1), usize::MAX
But it works when I change it to:
stream_size, chunk_threshold = Some(usize::MAX - 1), usize::MAX - 1
or
stream_size, chunk_threshold = Some(usize::MAX), usize::MAX

Anyway, will try Wireshark and report back when I have some free time.

@dheijl
Owner

dheijl commented Oct 4, 2023

I'll change it to Some(usize::MAX), usize::MAX then, as it makes no difference in my setup; all three work here.

What you could do is use BubbleUPnP Server as a proxy for the Gustard. You add the Gustard as an OpenHome renderer in the BubbleUPnP GUI, and you stream from swyh-rs to the newly added (local) renderer. Maybe that will solve your problems.

dheijl added a commit that referenced this issue Oct 4, 2023
@stewartsiu
Author

stewartsiu commented Oct 5, 2023

I managed to do a quick Wireshark capture of the foobar output with foo_out_upnp playing a local file at 16/48:
testfoo.zip

One thing I noticed in the capture is that there is a clear GET message from the renderer (192.168.0.11) asking for the stream.wav file, but if I use swyh-rs to play a local file I don't see a similar line asking for swyh.wav even though the sound still comes out. [Edit: Also if I look at the data going from my PC to the renderer, foo_out_upnp uses TCP while swyh-rs uses VNC]

@dheijl
Owner

dheijl commented Oct 6, 2023

Thanks, could you now attach a sniffer trace of a swyh-rs session too?
I don't see any major differences in the trace regarding streaming, except that foo uses a 32-bit usize::MAX for the stream size while swyh-rs uses a 64-bit usize::MAX.

swyh-rs uses the same TCP as foo_out_upnp, but the server port number 5901 that swyh-rs uses is also used by VNC, and that makes Wireshark think that it's looking at a VNC session.

Why you didn't see the GET request beats me, but if it isn't there you won't get sound from swyh-rs.

@stewartsiu
Author

Here you are:
testswyh.zip

@stewartsiu
Author

stewartsiu commented Oct 6, 2023

I changed the port to 5902 to avoid the confusion with VNC, and now the GET swyh.wav line is shown:
testswyh_5902.zip

Two differences I see: the swyh-rs session doesn't talk to renderconnmgr1 before rendertransport1, and all the TCP transmissions of audio data have conversation completeness Incomplete(15), vs Complete(47) in testfoo, which means a reset (32) is missing after the data, if I understand correctly.

@dheijl
Owner

dheijl commented Oct 7, 2023

I see that chunked encoding is still being used. The current version (1.8.5) no longer allows chunked encoding, as it is nowadays considered a largely useless HTTP/1.1 feature and has been removed from HTTP/2.
But streaming obviously works without problems.

Incomplete/Complete: an endless audio stream can never be complete. swyh-rs answers the GET request with an endless HTTP stream, and HTTP has no concept of "conversation completeness"; that's a TCP feature. This use of TCP flags may be caused by the Rust HTTP/TCP libraries, but has never caused any problem so far.

Edit: it seems to be an interpretation by Wireshark that has no real meaning for the TCP stream:

TCP Conversation Completeness

TCP conversations are said to be complete when they have both opening and closing handshakes, independently of any data transfer. However, we might be interested in identifying complete conversations with some data sent, and we are using the following bit values to build a filter value on the tcp.completeness field :

1 : SYN
2 : SYN-ACK
4 : ACK
8 : DATA
16 : FIN
32 : RST
For example, a conversation containing only a three-way handshake will be found with the filter 'tcp.completeness==7' (1+2+4) while a complete conversation with data transfer will be found with a longer filter as closing a connection can be associated with FIN or RST packets, or even both : 'tcp.completeness==31 or tcp.completeness==47 or tcp.completeness==63'

Another way to select specific conversation values is to filter on the tcp.completeness.str field. Thus, 'tcp.completeness.str matches "(R.|F)[^D]ASS"' will find all 'Complete, NO_DATA' conversations, while the 'Complete, WITH_DATA' ones will be found with 'tcp.completeness.str matches "(R.|F)DASS"'.
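
Applied to the two values reported earlier in the thread (15 and 47), the bit values quoted above can be decoded with a small sketch like this. The function name is made up for illustration.

```rust
/// Bit values of Wireshark's tcp.completeness field, as listed in the quote above.
static FLAGS: [(u8, &str); 6] = [
    (1, "SYN"),
    (2, "SYN-ACK"),
    (4, "ACK"),
    (8, "DATA"),
    (16, "FIN"),
    (32, "RST"),
];

/// Return the names of all flags set in a tcp.completeness value.
pub fn decode_completeness(value: u8) -> Vec<&'static str> {
    FLAGS
        .iter()
        .filter(|(bit, _)| (value & *bit) != 0)
        .map(|&(_, name)| name)
        .collect()
}
```

So 15 is handshake plus data with no closing packet seen, and 47 is the same plus an RST. In both cases there is no FIN (16).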

@dheijl
Owner

dheijl commented Oct 7, 2023

The only thing that could be related to the stuttering that I can see in the trace:

  • once streaming starts the server sends 8 KB chunks, and these are each acknowledged by 6 consecutive ACK frames by the Gustard, each ACK for 1/6th of the 8KB chunk
  • at a certain point an MDNS query occurs, and from then on the Gustard acknowledges each 8 KB chunk with only 2 consecutive ACKs (each for 1/2 of the 8KB chunk)
  • a bit later a TCP retransmission occurs, and the Gustard again starts acknowledging each chunk with 6 consecutive ACKs for a short time, but very soon switches back to 2 ACKs

So perhaps the 8 KB chunks are a problem?
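
One possible reading of the 6-ACK pattern, offered only as a guess: if the connection uses a typical Ethernet maximum segment size of 1460 bytes (an assumption, not a value read from the capture), an 8 KB chunk is split into six TCP segments, so six ACKs would be one ACK per segment and two ACKs would be one ACK per three segments.

```rust
/// Number of TCP segments needed to carry `payload_bytes`
/// for a given maximum segment size (`mss` must be non-zero).
pub fn segments_needed(payload_bytes: usize, mss: usize) -> usize {
    (payload_bytes + mss - 1) / mss
}
```

With an MSS of 1460 this gives 6 segments for 8192 bytes, and still 6 when the few bytes of chunked-encoding framing are added.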

Have you tried 1.8.5 yet, which is supposed not to use chunking?

Edit: apparently tiny-http decides to use chunking anyway, regardless of specifying it or not...

I might have to get rid of tiny-http altogether.

@dheijl
Owner

dheijl commented Oct 8, 2023

The current code in master prevents tiny-http from activating chunked transfer.

Does it change anything with regard to stuttering?

@stewartsiu
Author

testswyh_1.8.6.zip
Just ran v1.8.6 from the CLI; streaming didn't start at all and I got "streaming to .... has ended" immediately. Wireshark capture attached.

@dheijl
Owner

dheijl commented Oct 9, 2023

I seem to have broken WAV, I only use FLAC myself.

@dheijl
Owner

dheijl commented Oct 9, 2023

WAV works again here, I replaced 1.8.6 with the fixed version.

@stewartsiu
Author

stewartsiu commented Oct 9, 2023

Neither wav nor flac works...
Capture after latest pull: testswyh_flac.zip

@stewartsiu
Author

Got it to work by changing usize::MAX - 1 to usize::MAX, but the stuttering behavior is the same (I triggered stuttering by starting, stopping with Ctrl-C and starting again). Here is a capture with stuttering:
testswyh_stutter.zip

@dheijl
Owner

dheijl commented Oct 9, 2023

With usize::MAX you have enabled chunking again.
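
One selection rule that would be consistent with this remark is "use chunked transfer when the stream size is unknown or not below the threshold". The sketch below models that rule. It is an assumed model for illustration only, not code copied from tiny-http, and the type and function names are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub enum Transfer {
    Identity,
    Chunked,
}

/// Hypothetical selection rule (an assumption, NOT tiny-http's actual code):
/// chunk when the length is unknown or at least `chunk_threshold`.
pub fn choose_transfer(stream_size: Option<usize>, chunk_threshold: usize) -> Transfer {
    match stream_size {
        Some(len) if len < chunk_threshold => Transfer::Identity,
        _ => Transfer::Chunked,
    }
}
```

Under this model, `Some(usize::MAX - 1)` with a threshold of `usize::MAX` avoids chunking, while `Some(usize::MAX)` with `usize::MAX`, `Some(usize::MAX - 1)` with `usize::MAX - 1`, and `None` with `8192` all select chunked transfer.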

@stewartsiu
Author

How about MAX/2?

@omoknen

omoknen commented Oct 9, 2023

I wanted to give some information because the thread is very long. I am streaming at 44.1 kHz/16 bit and I don't have any problems. I am not sure why 48 kHz is a problem, but I did not have a problem before the first 1.8.6 update, and I don't have one after updating to the first 1.8.6.

44.1 kHz sampling at 16 bit using WAV works for me with Sonos speakers with "Inject silence" enabled, using Windows 10, VB-Audio Virtual Cable, and the MusicBee audio player with WASAPI (shared) output to VB-Audio Virtual Cable.

@dheijl
Owner

dheijl commented Oct 9, 2023

I'll try to experiment with some different values. But MAX - 2 broke WAV with mpd for some reason...

@stewartsiu
Author

Where do you read whether chunking is enabled from the sniffer trace?

@dheijl
Owner

dheijl commented Oct 9, 2023

The data part of the frame starts with the chunk length encoded as ASCII hex digits (8192 = 0x2000, i.e. the bytes 0x32 0x30 0x30 0x30) followed by 0x0d 0x0a, and ends with 0x0d 0x0a.
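
The framing described above is the standard HTTP/1.1 chunked transfer coding. A minimal sketch of how one chunk is framed, with a hypothetical helper name:

```rust
/// Frame one chunk as in HTTP/1.1 chunked transfer coding:
/// length in ASCII hex, CRLF, the data itself, CRLF.
pub fn frame_chunk(data: &[u8]) -> Vec<u8> {
    let mut out = format!("{:x}\r\n", data.len()).into_bytes();
    out.extend_from_slice(data);
    out.extend_from_slice(b"\r\n");
    out
}
```

For an 8192-byte block the frame starts with the six bytes `2000\r\n` (0x32 0x30 0x30 0x30 0x0d 0x0a) and ends with `\r\n`, which is the pattern to look for in the sniffer trace.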

@stewartsiu
Author

Thanks! I didn't get the chance to test further as I decided to return the streamer, at least for now... If there's a way to reproduce the reliability of foobar output I'll probably buy it again.

@dheijl
Owner

dheijl commented Oct 10, 2023

I'm really sorry to hear that. Anyway, I have learned that the Content-Length header value can break streaming. I wasn't aware of that and will investigate further. Thanks for your help in trying to fix it.
