Slow direct play startup #873

Closed
mcarlton00 opened this issue Feb 12, 2019 · 22 comments · Fixed by jellyfin-archive/jellyfin-docs#656
Labels
bug (Something isn't working) · confirmed (This issue has been reviewed and confirmed) · upstream (The issue is due to an upstream library or project)

Comments

@mcarlton00
Member

Describe the bug
Direct play from the server has a large initial delay; it takes an average of 12 seconds to send a response to the client.

To Reproduce
Trigger a direct play using the API:
http://192.168.0.200:8096/emby/Videos/{{ item_id }}/stream?static=true
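For example, the delay can be measured directly against that endpoint. A rough sketch using .NET's HttpClient (host and item id are taken from the logs below; an API token may be needed depending on server settings):

```csharp
// Sketch: time the HEAD request that precedes the ranged GET seen in the logs below.
// Host and item id are from this report; add an api_key/token if your server requires one.
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

class DirectPlayTimer
{
    static async Task Main()
    {
        var itemId = "bf313a860a766111a7e296fcfc8ddccb";
        var url = $"http://192.168.0.200:8096/emby/Videos/{itemId}/stream?static=true";

        using var client = new HttpClient();
        var sw = Stopwatch.StartNew();
        using var request = new HttpRequestMessage(HttpMethod.Head, url);
        using var response = await client.SendAsync(request);
        Console.WriteLine($"HTTP {(int)response.StatusCode} after {sw.ElapsedMilliseconds} ms");
    }
}
```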

Expected behavior
The video starts playing immediately, or within a few seconds

Logs

[2019-02-12 03:05:12.700 +00:00] [INF] HTTP "HEAD" "http://192.168.0.200:8096/emby/Videos/bf313a860a766111a7e296fcfc8ddccb/stream?static=true&MediaSourceId=bf313a860a766111a7e296fcfc8ddccb". "Host=192.168.0.200:8096, User-Agent=Kodi/17.6 (X11; Linux x86_64) Arch Linux/0.0 App_Bitness/64 Version/17.6-Git:20190116-nogitfound, Accept=*/*, Accept-Charset=UTF-8,*;q=0.8"
[2019-02-12 13:29:22.466 +00:00] [INF] HTTP Response 200 to "192.168.0.148". Time: "12032""ms (slow)". "http://192.168.0.200:8096/emby/Videos/bf313a860a766111a7e296fcfc8ddccb/stream?static=true&MediaSourceId=bf313a860a766111a7e296fcfc8ddccb" ""
[2019-02-12 13:29:22.473 +00:00] [INF] HTTP "GET" "http://192.168.0.200:8096/emby/Videos/bf313a860a766111a7e296fcfc8ddccb/stream?static=true&MediaSourceId=bf313a860a766111a7e296fcfc8ddccb". "Host=192.168.0.200:8096, Range=bytes=0-, User-Agent=Kodi/17.6 (X11; Linux x86_64) Arch Linux/0.0 App_Bitness/64 Version/17.6-Git:20190116-nogitfound, Accept=*/*, Accept-Charset=UTF-8,*;q=0.8"
[2019-02-12 13:29:46.551 +00:00] [INF] HTTP Response 206 to "192.168.0.148". Time: "12042""ms (slow)". "http://192.168.0.200:8096/emby/Videos

System (please complete the following information):

  • OS: Docker
  • Browser: Kodi plugin/Firefox/Python requests library
  • Jellyfin Version: 10.1.0

Additional context
Direct streaming works with almost no initial delay; the problem seems to be limited to direct play.

@mcarlton00 mcarlton00 added the bug Something isn't working label Feb 12, 2019
@sachk
Contributor

sachk commented Mar 6, 2019

I'm having the same problem. It's far worse when the server has 200 ms of ping; playback often takes longer than a minute to start.

@mcarlton00
Member Author

I finally narrowed mine down to a poor NFS mount in my library. Something in my NAS/NFS exports was causing massive slowdowns where Jellyfin was concerned, but performed fine in other tests. I've temporarily fixed it by using sshfs mounts to my library instead of NFS mounts. I'm planning to go back and do more tests to see if I can pinpoint exactly what's causing it, though.

@JustAMan
Contributor

JustAMan commented Mar 6, 2019

My current suspect would be slow syscalls on such shares; it would be great if you could strace the problematic cases...

@mcarlton00
Member Author

I'm hoping to get a good strace result this weekend, as long as I get some free time.

@sachk
Contributor

sachk commented Mar 6, 2019

I'm running my Jellyfin on Google Drive mounted with rclone, so I assume I'm also having the same problem. I can also do some profiling soon.

@mcarlton00
Member Author

I ran several tests, mostly searching for specific open or read calls that were taking a lot of time, but I couldn't find any. I finally ran strace with -c to get a summary at the end and let the server run for ~15 minutes. In that time I hit the API directly several times and started/stopped multiple direct play streams from Kodi.

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.68   11.383282       64678       176        15 futex
  0.10    0.011822           5      2182           munmap
  0.09    0.009890           2      4231           mprotect
  0.07    0.008271           3      2486           mmap
  0.01    0.001387           5       266           read
  0.01    0.001215           4       326       108 openat
  0.01    0.001046           1      1684         1 lstat
  0.01    0.000594           1       914           write
  0.00    0.000562           2       311           fstat
  0.00    0.000354           2       168         2 close
  0.00    0.000332           2       200           fcntl
  0.00    0.000280           1       419        44 stat
  0.00    0.000200           6        34           gettid
  0.00    0.000197           2       106           lseek
  0.00    0.000184           6        31           rt_sigaction
  0.00    0.000119          13         9           clone
  0.00    0.000109           4        30        30 access
  0.00    0.000103           4        28           sched_yield
  0.00    0.000057          14         4         2 unlink
  0.00    0.000046           6         8           prlimit64
  0.00    0.000039          20         2           mknod
  0.00    0.000039          10         4           pipe2
  0.00    0.000033           3        12           flock
  0.00    0.000025           1        18           brk
  0.00    0.000017           3         5           ioctl
  0.00    0.000017           3         5           sysinfo
  0.00    0.000016           3         5           readlink
  0.00    0.000015           8         2           getsid
  0.00    0.000014           2         6           getpid
  0.00    0.000011          11         1           socket
  0.00    0.000011          11         1           statfs
  0.00    0.000011           4         3           sched_getaffinity
  0.00    0.000007           4         2           geteuid
  0.00    0.000007           2         3           sigaltstack
  0.00    0.000005           5         1           bind
  0.00    0.000005           3         2           sched_setscheduler
  0.00    0.000004           2         2           sched_getparam
  0.00    0.000004           2         2           sched_getscheduler
  0.00    0.000004           4         1           fadvise64
  0.00    0.000003           2         2           sched_get_priority_max
  0.00    0.000002           1         2           sched_get_priority_min
  0.00    0.000001           1         1           rt_sigreturn
  0.00    0.000000           0         2           poll
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        25           getdents
  0.00    0.000000           0         2           getuid
  0.00    0.000000           0         1           mlock
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   11.420340                 13732       202 total

Each API call/direct play stream still has about a 12.5 second response time according to the JF logs, but none of the syscalls account for it. futex totals ~11 seconds, but it didn't appear to be called in real time when a direct play was started. In fact, I didn't see any new calls at all when a direct play started.

@anthonylavado anthonylavado added this to Needs triage in Issue Triage for Main Repo Jul 4, 2019
@stale

stale bot commented Jul 29, 2019

Issues go stale after 60d of inactivity. Mark the issue as fresh by adding a comment or commit. Stale issues close after an additional 7d of inactivity. If this issue is safe to close now please do so. If you have any questions you can reach us on Matrix or Social Media.

@stale stale bot added the stale Stale and will be closed if no activity occurs label Jul 29, 2019
@stale stale bot closed this as completed Aug 5, 2019
Issue Triage for Main Repo automation moved this from Needs triage to Closed/Done Aug 5, 2019
@mcarlton00
Member Author

Update: reports from users on Matrix indicate this is still happening (FreeBSD host, Linux VM). This probably needs to be reopened.

@sparky8251 sparky8251 reopened this Feb 21, 2020
Issue Triage for Main Repo automation moved this from Closed/Done to Needs triage Feb 21, 2020
@stale stale bot removed the stale Stale and will be closed if no activity occurs label Feb 21, 2020
@mcarlton00
Member Author

Received reports from dinki on Matrix of similar issues when using OpenMediaVault.

@JustAMan
Contributor

Syscalls (and their timings) gathered by strace from a running Jellyfin instance would help immensely here. Otherwise it's a search in the dark.

@stale

stale bot commented Jan 6, 2021

This issue has gone 120 days without comment. To avoid abandoned issues, it will be closed in 21 days if there are no new comments.
If you're the original submitter of this issue, please comment confirming if this issue still affects you in the latest release or nightlies, or close the issue if it has been fixed. If you're another user also affected by this bug, please comment confirming so. Either action will remove the stale label.
This bot exists to prevent issues from becoming stale and forgotten. Jellyfin is always moving forward, and bugs are often fixed as side effects of other changes. We therefore ask that bug report authors remain vigilant about their issues to ensure they are closed if fixed, or re-confirmed - perhaps with fresh logs or reproduction examples - regularly. If you have any questions you can reach us on Matrix or Social Media.

@stale stale bot added the stale Stale and will be closed if no activity occurs label Jan 6, 2021
@stale stale bot closed this as completed Feb 2, 2021
Issue Triage for Main Repo automation moved this from Needs triage to Closed/Done Feb 2, 2021
@mcarlton00
Member Author

Discussed in chat again. Still happening in rare circumstances. Josh ran into a likely related issue when setting up https://github.com/joshuaboniface/rffmpeg. Quote from chat:

and specifically - this is as far as I traced it - basically we're waiting on a "chunk" to be loaded for the streamer to send it back to the client. But at the start of the stream, on NFS, it waits a relatively very long time (about 10-15s) and then it finally picks up the piece and sends it.
when I was debugging this with rffmpeg, where it was writing out the individual files "over" NFS (with the Jellyfin side being the NFS client), I was pretty sure I traced it to the inotify call, which JF was using to know that the transcoding had finished writing the file, but that never came

@dkanada dkanada added confirmed This issue has been reviewed and confirmed and removed stale Stale and will be closed if no activity occurs labels Feb 6, 2021
@dkanada dkanada reopened this Feb 6, 2021
Issue Triage for Main Repo automation moved this from Closed/Done to Needs triage Feb 6, 2021
@DomiStyle
Contributor

I'm also running into this issue on an Ubuntu server with an NFS-mounted library.

Transcoding streams start almost immediately, but direct play streams take between 5 and 15 seconds to start.
Is there any info you need to debug this further? I'm running Plex from the same NFS mount and there are no issues there.

@pakar

pakar commented Feb 24, 2021

I have the same behavior, and it seems reproducible on at least my system.

I can't reinstall the host machine this was tried on, but the slow behavior is repeatable and I've attached an strace.

Setup is on a single machine:
Host:

  • Debian 9, 4-core CPU, 16 GB RAM, NFS server with audio files.
  • VM images located on SSD.
  • NFS share located on HDD. The share contains FLAC and MP3 files plus m3u playlists with relative paths.

VM:

  • Created with default settings in virt-manager (2.0.0-3) using a bridged interface with the host.
  • Fresh install of Ubuntu Server 20.04: 3 cores, 2 GB RAM, latest Jellyfin (both stable and unstable tested with the same results).

Latency between VM and host is <1ms

NFS share mount-options: rw,noatime,vers=3,rsize=32768,wsize=32768,hard,proto=udp,timeo=11,retrans=3,sec=sys,mountvers=3,mountproto=udp,local_lock=none

EnableAudioPlaybackTranscoding: True
RemoteClientBitrateLimit: 1000000000
IsInLocalNetwork: True

[2021-02-24 23:41:20.377 +01:00] [WRN] Slow HTTP Response from "http://10.254.128.32:8096/Audio/ac14450024ef74e3cfe99d1860377209/universal?UserId=05c57322e3e046f5b471ba7db6ba2398&DeviceId=TW96aWxsYS81LjAgKExpbnV4OyBBbmRyb2lkIDExOyBLQjIwMDMgQnVpbGQvUlAxQS4yMDEwMDUuMDAxOyB3dikgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgVmVyc2lvbi80LjAgQ2hyb21lLzg4LjAuNDMyNC4xODEgTW9iaWxlIFNhZmFyaS81MzcuMzZ8MTYxNDIwNjM5NjIxMA11&MaxStreamingBitrate=140000000&Container=opus%2Cwebm%7Copus%2Cmp3%2Caac%2Cm4a%7Caac%2Cm4b%7Caac%2Cflac%2Cwebma%2Cwebm%7Cwebma%2Cwav%2Cogg&TranscodingContainer=ts&TranscodingProtocol=hls&AudioCodec=aac&api_key=184f8beb166b4c01829aa975e2367806&PlaySessionId=1614206396231&StartTimeTicks=0&EnableRedirection=true&EnableRemoteMedia=false" to "10.254.127.107" in 0:00:06.2372201 with Status Code 206
[2021-02-24 23:41:20.950 +01:00] [INF] WS "10.254.127.107" closed

strace collected via:

  1. Fresh boot of the VM
  2. Start the app on an Android phone
  3. strace -o strace-slow.txt -f -p 2146
  4. Press play in the app
  5. Wait for playback to start (~6 seconds in this instance)
  6. Play for a few seconds
  7. Terminate strace

Attached: strace-slow.txt

CPU during these tests remains below 5%.
Network traffic:

  1. Press play on phone
  2. 4-6 big traffic spikes (almost like it's re-reading the full file?)
  3. Log prints the "Slow HTTP Response from"
  4. Playback starts on phone.

I also noticed that you can cause even higher response times by pressing play on a few files without waiting for the audio to start playing.

[2021-02-25 00:14:36.652 +01:00] [WRN] Slow HTTP Response from "http://10.254.128.32:8096/Audio/5e6ecc3e2d492f22f26e1b02420ffc4f/universal?UserId=05c57322e3e046f5b471ba7db6ba2398&DeviceId=TW96aWxsYS81LjAgKExpbnV4OyBBbmRyb2lkIDExOyBLQjIwMDMgQnVpbGQvUlAxQS4yMDEwMDUuMDAxOyB3dikgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgVmVyc2lvbi80LjAgQ2hyb21lLzg4LjAuNDMyNC4xODEgTW9iaWxlIFNhZmFyaS81MzcuMzZ8MTYxNDIwNjM5NjIxMA11&MaxStreamingBitrate=140000000&Container=opus%2Cwebm%7Copus%2Cmp3%2Caac%2Cm4a%7Caac%2Cm4b%7Caac%2Cflac%2Cwebma%2Cwebm%7Cwebma%2Cwav%2Cogg&TranscodingContainer=ts&TranscodingProtocol=hls&AudioCodec=aac&api_key=184f8beb166b4c01829aa975e2367806&PlaySessionId=1614208091765&StartTimeTicks=0&EnableRedirection=true&EnableRemoteMedia=false" to "10.254.127.107" in 0:00:25.5485581 with Status Code 206
[2021-02-25 00:14:48.945 +01:00] [WRN] Slow HTTP Response from "http://10.254.128.32:8096/Audio/ecd4c6a1c006f39b9744a6aea0f642e9/universal?UserId=05c57322e3e046f5b471ba7db6ba2398&DeviceId=TW96aWxsYS81LjAgKExpbnV4OyBBbmRyb2lkIDExOyBLQjIwMDMgQnVpbGQvUlAxQS4yMDEwMDUuMDAxOyB3dikgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgVmVyc2lvbi80LjAgQ2hyb21lLzg4LjAuNDMyNC4xODEgTW9iaWxlIFNhZmFyaS81MzcuMzZ8MTYxNDIwNjM5NjIxMA11&MaxStreamingBitrate=140000000&Container=opus%2Cwebm%7Copus%2Cmp3%2Caac%2Cm4a%7Caac%2Cm4b%7Caac%2Cflac%2Cwebma%2Cwebm%7Cwebma%2Cwav%2Cogg&TranscodingContainer=ts&TranscodingProtocol=hls&AudioCodec=aac&api_key=184f8beb166b4c01829aa975e2367806&PlaySessionId=1614208091767&StartTimeTicks=0&EnableRedirection=true&EnableRemoteMedia=false" to "10.254.127.107" in 0:00:34.9520038 with Status Code 206
[2021-02-25 00:14:48.946 +01:00] [WRN] Slow HTTP Response from "http://10.254.128.32:8096/Audio/7541f16b3f0c57b1e15ee422eef868b9/universal?UserId=05c57322e3e046f5b471ba7db6ba2398&DeviceId=TW96aWxsYS81LjAgKExpbnV4OyBBbmRyb2lkIDExOyBLQjIwMDMgQnVpbGQvUlAxQS4yMDEwMDUuMDAxOyB3dikgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgVmVyc2lvbi80LjAgQ2hyb21lLzg4LjAuNDMyNC4xODEgTW9iaWxlIFNhZmFyaS81MzcuMzZ8MTYxNDIwNjM5NjIxMA11&MaxStreamingBitrate=140000000&Container=opus%2Cwebm%7Copus%2Cmp3%2Caac%2Cm4a%7Caac%2Cm4b%7Caac%2Cflac%2Cwebma%2Cwebm%7Cwebma%2Cwav%2Cogg&TranscodingContainer=ts&TranscodingProtocol=hls&AudioCodec=aac&api_key=184f8beb166b4c01829aa975e2367806&PlaySessionId=1614208091766&StartTimeTicks=0&EnableRedirection=true&EnableRemoteMedia=false" to "10.254.127.107" in 0:00:36.5218395 with Status Code 206

Give me a shout if you want specific traces. This Jellyfin VM is set up for testing/debugging.

@DomiStyle
Contributor

Since I had this issue in a few of my other .NET applications, I went ahead and created an issue in the .NET repository:
dotnet/runtime#48757

It's reproducible even with a blank C# project, so I assume the fault doesn't lie with Jellyfin in this case. That also explains why transcoding doesn't have this issue, since ffmpeg accesses the file directly.
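For anyone who wants to check their own mounts, a minimal sketch of that kind of blank-project repro might look like this (the path is a placeholder, not from this thread; the locking comment reflects the root cause identified later in this thread):

```csharp
// Hypothetical repro sketch: time the first open/read of a file on the NFS mount.
using System;
using System.Diagnostics;
using System.IO;

class NfsOpenTimer
{
    static void Main()
    {
        const string path = "/mnt/media/sample.flac"; // placeholder: point at a file on the affected share

        var sw = Stopwatch.StartNew();
        // File.OpenRead requests FileShare.Read; on Unix the runtime enforces this with an
        // advisory file lock, which is what stalls over NFSv3 when no lock service is available.
        using var stream = File.OpenRead(path);
        var buffer = new byte[64 * 1024];
        int read = stream.Read(buffer, 0, buffer.Length);
        sw.Stop();

        Console.WriteLine($"Read {read} bytes in {sw.ElapsedMilliseconds} ms");
    }
}
```

On an unaffected mount this should complete in milliseconds; on the mounts described above, the open is where the multi-second delay shows up.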

@cvium cvium added the upstream The issue is due to a upstream library or project label Apr 10, 2021
@DomiStyle
Contributor

I think I have identified the problem: my NFS share points to a mergerfs mount. Recently I had an issue with stale file handles and applied the options listed here.

The problem is now gone and my C# applications (including JF) no longer have a 6-second delay.

Can anyone comment if they use any other FUSE file systems, or even mergerfs? This would still be a .NET problem, but at least there's a way to work around it.

@crobibero
Member

Can anyone comment if they use any other FUSE file systems, or even mergerfs? This would still be a .NET problem, but at least there's a way to work around it.

I use mergerfs with SnapRAID and have immediate direct play playback.

@paulmorabito

I also had the same problem as above and filed a separate issue (#6559) before finding this one. I ended up mounting the drives over SMB and passing through the directories as a workaround.

@EorlBruder

The issue in the .NET repo has now received an update which should resolve the performance issues: dotnet/runtime#48757 (comment)

The root cause of the delay is .NET locking over NFSv3 without the lock service being available.

The options are:

  • Disable .NET locking using DOTNET_SYSTEM_IO_DISABLEFILELOCKING (introduced in .NET 6).
  • Disable locking for the mount using the nolock option.
  • Enable the lock service.
  • Use NFSv4, which has built-in lock support.

For me, mounting with the nolock option solved the problem. Is the first option something that should/could be added in Jellyfin (I don't really know anything about .NET)? Otherwise it might be good to add these solutions to the storage documentation.
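For reference, the two client-side workarounds might look something like this (server name, export path, mount point, and the systemd drop-in path are placeholder assumptions, not taken from this thread):

```
# Disable NFS locking for the mount (nolock), e.g. in /etc/fstab:
nas:/export/media  /mnt/media  nfs  rw,vers=3,nolock  0 0

# Or disable .NET file locking instead (.NET 6+), e.g. via a systemd drop-in for the Jellyfin service
# (/etc/systemd/system/jellyfin.service.d/nfs.conf):
[Service]
Environment="DOTNET_SYSTEM_IO_DISABLEFILELOCKING=true"
```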

@mcarlton00
Member Author

Adding it to the docs is probably the best solution for the time being. Feel free to open a PR for it here if you're willing.

EorlBruder added a commit to EorlBruder/jellyfin-docs that referenced this issue Apr 10, 2022
Describe the solution causing slow directplay startup issues with
NFSv3, which is discussed in this issue on the jellyfin-repository
jellyfin/jellyfin#873

Fixes jellyfin/jellyfin#873
@Mengine

Mengine commented Apr 10, 2022

Had this exact problem today when moving my library from a FreeNAS NFS share to a TrueNAS SCALE NFS share. All the server-side export settings were alike, as were the client mount options (NFSv3). The nolock option did, however, solve the problem straight away.

Issue Triage for Main Repo automation moved this from Needs triage to Closed/Done Apr 10, 2022
@SIGSTKFLT

Had this exact problem today when moving my library from a FreeNAS NFS share to a TrueNAS SCALE NFS share. All the server-side export settings were alike, as were the client mount options (NFSv3). The nolock option did, however, solve the problem straight away.

I too was sharing from TrueNAS SCALE and this fixed it for me. Thank you!
