Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catalyst: After stream restarts, certain servers can't replicate stream #65

Closed
iameli opened this issue Aug 2, 2022 · 4 comments
Closed
Assignees

Comments

@iameli
Copy link
Contributor

iameli commented Aug 2, 2022

We had a stream working for a while. That stream started and stopped. Afterward, we couldn't replicate the stream to certain servers. Just this over and over:

[2022-08-02 18:41:29] MistOutHLS:video+b0dcxvwml48mxt2s (490) INFO: Shared metadata not ready yet - no tracks valid (/tmp/drone-7pq2c6BO3H9JHJZs/drone/src/lib/dtsc.cpp:2151) 

Full logs are here.

@emranemran
Copy link

Another additional failure condition is also being caught using the test defined in livepeer/catalyst#86.
Logs attached.
stream-interruption.log

@emranemran
Copy link

emranemran commented Aug 5, 2022

A few updates:

  • This issue is fairly easy to reproduce with 2 nodes and one ingest stream - stopping/starting the ingest stream will trigger the error condition and playback streams are not able to recover. The test in e2e_test: add failing test to detect issues with stream interruption livepeer/catalyst#86 will catch this condition (currently forced to pass until the issue is resolved)

  • I setup a single catalyst node with one ingest stream and can still reproduce the issue - so it's not specific to two separate nodes and the DTSC link in between (verbose logs attached).
    single-node-error.log

  • Discussed with @Thulinma and it looks like a race condition where the "new" HTTP output attempts to read from the previously deleted stream buffers as that previous session shuts down. Adding a few debug prints as next step to understand the code paths.

@emranemran
Copy link

emranemran commented Aug 9, 2022

Repro steps:
(logs here: out.log)

  • Used this ffmpeg command for ingest:
ffmpeg -re -f lavfi -i testsrc=size=480x320:rate=30 -f lavfi -i sine -vf drawtext="text='%{localtime\:%X}':fontsize=20:fontcolor=white:x=7:y=7" -pix_fmt yuv420p -vcodec libx264 -vb 2000k -preset ultrafast -acodec aac -f flv 'rtmp://127.0.0.1:1935/live/stream+foo'
  • Started Mist standalone (not dockerized) as follows (used json config from catalyst repo since it was setup already):
    Note: built Mist using catalyst branch of mistserver repo
./MistController -c ../catalyst/config/catalyst-one.json -g 5 > out.log 2>&1
  • Verified stream is visible in the Mist UI preview page

  • Used https://hls-js.netlify.app/demo/ to attempt playing back in an external player. Got a few manifestLoad errors but after refreshing the page a few times, got the player to stream successfully here:

[2022-08-08 21:59:26] MistOutHTTP (670714) INFO: Received request: /hls/stream foo/index.m3u8 => HLS (stream+foo) (/home/parallels/mac/mistserver/src/output/output_http.cpp:210) 
[2022-08-08 21:59:26] MistOutHTTP (670716) INFO: Received request: /hls/stream foo/0_1/chunk_50023.0.ts => HLS (stream+foo) (/home/parallels/mac/mistserver/src/output/output_http.cpp:210) 
[2022-08-08 21:59:26] MistOutHTTP (670718) INFO: Received request: /hls/stream foo/0_1/chunk_50023.5.ts => HLS (stream+foo) (/home/parallels/mac/mistserver/src/output/output_http.cpp:210) 
  • Then ctrl+C-ed on ffmpeg process and restart the ffmpeg cmd right away. Immediately hits the Shared metadata not ready yet error and the external player is not able to playback anymore.

Observations:
To track the code path, I added a few simple prints as attached in the patch below to understand the flow a bit better.

I can see MistInBuffer responding when I killed the ffmpeg stream here:

[2022-08-08 21:59:38] MistInBuffer:stream+foo (670197) INFO: Disconnected track 1 (/home/parallels/mac/mistserver/src/input/input_buffer.cpp:486) 
[2022-08-08 21:59:38] MistInBuffer:stream+foo (670197) INFO: Should remove track 1 (/home/parallels/mac/mistserver/src/input/input_buffer.cpp:324) 
[2022-08-08 21:59:38] MistInBuffer:stream+foo (670197) INFO: Disconnected track 0 (/home/parallels/mac/mistserver/src/input/input_buffer.cpp:486) 
[2022-08-08 21:59:38] MistInBuffer:stream+foo (670197) INFO: Should remove track 0 (/home/parallels/mac/mistserver/src/input/input_buffer.cpp:324) 
[2022-08-08 21:59:38] MistInBuffer:stream+foo (670197) INFO: Input closing clean, reason: source disconnected for non-resumable stream (/home/parallels/mac/mistserver/src/input/input.cpp:678) 

Then this is where one MistOutHTTP is exiting and a new one is spawned lining up with the new request from the player.

[2022-08-08 21:59:38] MistOutHLS:stream+foo (670785) INFO: Received signal Terminated (15) from process 670197 (/home/parallels/mac/mistserver/lib/config.cpp:620) 
[2022-08-08 21:59:38] MistOutHTTP (670401) INFO: Exit reason: signal Terminated (15) from process 670197 (/home/parallels/mac/mistserver/src/output/mist_out.cpp:73) 
[2022-08-08 21:59:38] MistOutRTMP:stream foo (670810) INFO: Waiting for stream+foo buffer to be ready... (0) (/home/parallels/mac/mistserver/src/output/output.cpp:1988) 
[2022-08-08 21:59:38] MistOutHLS:stream+foo (670785) INFO: Received signal Terminated (15) from process 670197 (/home/parallels/mac/mistserver/lib/config.cpp:620) 
[2022-08-08 21:59:38] MistOutHLS:stream+foo (670785) INFO: Received signal Terminated (15) from process 670197 (/home/parallels/mac/mistserver/lib/config.cpp:620) 
[2022-08-08 21:59:38] MistOutHTTP (670814) INFO: Received request: /stream+foo.mp4 => MP4 (stream+foo) (/home/parallels/mac/mistserver/src/output/output_http.cpp:210) 
[2022-08-08 21:59:38] MistOutMP4:stream+foo (670814) INFO: Switching to Websocket mode (/home/parallels/mac/mistserver/src/output/output_http.cpp:310) 
[2022-08-08 21:59:38] MistOutHTTP (670815) INFO: Received request: /json_stream+foo.js => HTTP (stream+foo) (/home/parallels/mac/mistserver/src/output/output_http.cpp:210) 
[2022-08-08 21:59:38] MistOutHTTP (670815) INFO: Inside websocketHandler (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1088) 
[2022-08-08 21:59:38] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:38] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:39] MistOutHLS:stream+foo (670785) INFO: Received signal Terminated (15) from process 670197 (/home/parallels/mac/mistserver/lib/config.cpp:620) 
[2022-08-08 21:59:39] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:39] MistController (670046) INFO: Stream stream+foo became inactive (/home/parallels/mac/mistserver/src/controller/controller_statistics.cpp:169) 
[2022-08-08 21:59:39] MistOutHLS:stream+foo (670785) INFO: Received signal Terminated (15) from process 670197 (/home/parallels/mac/mistserver/lib/config.cpp:620) 
[2022-08-08 21:59:39] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:39] MistOutRTMP:stream foo (670810) INFO: Reconnecting to stream+foo buffer... (0) (/home/parallels/mac/mistserver/src/output/output.cpp:1993) 

After which point the metadata check goes from invalid to valid but streams still can't recover:

[2022-08-08 21:59:41] MistInBuffer:stream+foo (670905) INFO: Input booting (/home/parallels/mac/mistserver/src/input/input.cpp:310) 
[2022-08-08 21:59:41] MistInBuffer:stream+foo (670907) INFO: Starting serve (/home/parallels/mac/mistserver/src/input/input.cpp:536) 
[2022-08-08 21:59:41] MistInBuffer:stream+foo (670907) INFO: Input started (/home/parallels/mac/mistserver/src/input/input.cpp:652) 
[2022-08-08 21:59:41] MistInBuffer:stream+foo (670907) INFO: Setting maxKeepAway from 0 to new value of 45000 (/home/parallels/mac/mistserver/src/input/input_buffer.cpp:586) 
[2022-08-08 21:59:41] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:41] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:41] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:41] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:42] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:42] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:42] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:42] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:42] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:43] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:43] MistController (670046) INFO: Stream stream+foo became active (/home/parallels/mac/mistserver/src/controller/controller_statistics.cpp:163) 
[2022-08-08 21:59:43] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:44] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:44] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:45] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:45] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:46] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:46] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:47] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: metadata valid (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1121) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: metadata valid (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1121) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: metadata valid (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1121) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: Checking metadata (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1119) 
[2022-08-08 21:59:47] MistOutHTTP (670815) INFO: metadata valid (/home/parallels/mac/mistserver/src/output/output_http_internal.cpp:1121) 
[2022-08-08 21:59:47] MistOutHTTP (670936) INFO: Received request: /stream+foo.mp4 => MP4 (stream+foo) (/home/parallels/mac/mistserver/src/output/output_http.cpp:210) 
[2022-08-08 21:59:47] MistOutMP4:stream+foo (670936) INFO: Switching to Websocket mode (/home/parallels/mac/mistserver/src/output/output_http.cpp:310) 
[2022-08-08 21:59:47] MistOutHLS:stream+foo (670900) INFO: QQQ: Shared metadata not ready yet - no tracks valid (/home/parallels/mac/mistserver/lib/dtsc.cpp:2151) 

This is where I'm a bit lost - does the MistOutHLS process need to be kicked again for it to not look for the older memory pages?

patch with prints FYI:

diff --git a/src/output/output_http_internal.cpp b/src/output/output_http_internal.cpp
index 50bd526b..817c58af 100644
--- a/src/output/output_http_internal.cpp
+++ b/src/output/output_http_internal.cpp
@@ -1085,6 +1085,7 @@ namespace Mist{
   }

   bool OutHTTP::websocketHandler(const HTTP::Parser & req, bool headersOnly){
+    INFO_MSG("Inside websocketHandler");
     stayConnected = true;
     std::string reqHost = HTTP::URL(req.GetHeader("Host")).host;
     if (req.GetHeader("X-Mst-Path").size()){mistPath = req.GetHeader("X-Mst-Path");}
@@ -1115,7 +1116,11 @@ namespace Mist{
         newState = streamStatus.mapped[0];
       }

-      if (meta){meta.reloadReplacedPagesIfNeeded();}
+      INFO_MSG("Checking metadata");
+      if (meta){
+              INFO_MSG("metadata valid");
+	      meta.reloadReplacedPagesIfNeeded();
+      }
       if (newState != prevState || (newState == STRMSTAT_READY && M.getValidTracks() != prevTracks)){
         if (newState == STRMSTAT_READY){
           thisError = "";
           

@Thulinma
Copy link
Contributor

Thulinma commented Aug 9, 2022

Can you see if 950089a perhaps fixes the issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants