mythtranscode: Fix #12602 by detecting write errors #163

twitham1 · 2018-03-28T04:23:19Z

Fix for https://code.mythtv.org/trac/ticket/12602

Explanation of the patch:

multiplex.h:
  the multiplex structure (mx) gets a new "int error" member, the error count.
  write_out_packs and finish_mpg now return int, the error count (so 0 = success).
  These return values are currently unused: callers test the error count instead.
multiplex.c:
  if write() fails, increment the error count.
  log only the first 10 failures so we don't flood syslog and cause it to drop messages.
  if close() fails, increment the error count.
mpeg2fix.cpp:
  copy the error count to a static variable and pass it to pthread_exit.
  capture the error count in pthread_join and log/fail if it is non-zero.
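
As a rough illustration of the multiplex.c part of the list above, here is a minimal sketch of a checked write that counts failures and logs only the first few. It is not the code from this patch: struct out_state, checked_write and MAX_LOGGED_ERRORS are hypothetical names, and fprintf stands in for whatever logging call the real code uses.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_LOGGED_ERRORS 10  /* hypothetical cap, mirroring "first 10 failures" */

/* Hypothetical stand-in for the relevant members of the multiplex structure. */
struct out_state {
    int fd_out;  /* output file descriptor */
    int error;   /* number of failed writes so far */
};

/* Write len bytes. A failed or short write is counted, and only the first
 * few failures are logged. Returns the running error count (0 = no failures). */
static int checked_write(struct out_state *st, const void *buf, size_t len)
{
    ssize_t n = write(st->fd_out, buf, len);

    if (n < 0 || (size_t)n != len) {
        if (st->error < MAX_LOGGED_ERRORS)
            fprintf(stderr, "write failed: %s\n",
                    n < 0 ? strerror(errno) : "short write");
        st->error++;
    }
    return st->error;
}
```

A caller would keep writing as before and inspect st->error once at the end, which matches the "callers test the error count" approach described above.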

With these changes, mythtv keeps the original recording if the disk fills during a transcode, instead of truncating the recording. Of course, the user will still need to free up more space than this recording occupies for a transcode to succeed.

In mpeg2fix.cpp there is also a commented-out section where I tried to exit on the first write failure. This just deadlocked the other thread. It might be possible to cancel the other thread if you wish to abort at the first write error, but I was unsure how to do this. As-is, we just continue to fail and return the failure count when the reader is done. I monitored with top and the process size doesn't grow, so I assume the data is just being dropped rather than filling memory, which is a good thing.
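
For readers unfamiliar with the pthread hand-off mentioned in the mpeg2fix.cpp notes, here is a minimal sketch of passing an error count through pthread_exit and collecting it in pthread_join. It is an assumption-laden illustration, not the patch itself: writer_thread, run_writer and the simulated failure count are hypothetical, and the real thread does actual multiplexing work.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical static counter that the writer thread would update. */
static int writer_errors;

/* Thread body: does its (simulated) work, then hands the error count
 * to whoever joins the thread. */
static void *writer_thread(void *arg)
{
    writer_errors = *(int *)arg;  /* stand-in for real write failures */
    pthread_exit((void *)(intptr_t)writer_errors);
}

/* Start the writer, wait for it to finish, and return its error count.
 * Returns -1 if the thread could not be created or joined. */
static int run_writer(int simulated_failures)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, writer_thread, &simulated_failures) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (int)(intptr_t)result;
}
```

The joining side can then log and fail on any non-zero result. Note that this sketch only reports errors after the thread ends on its own; it does not address aborting early, which is where the deadlock described above came from.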

twitham1 · 2018-03-28T15:26:15Z

See also #164 for equivalent pull request against master.

Beirdo · 2018-03-29T00:16:35Z

mythtv/programs/mythtranscode/replex/multiplex.c

@@ -295,7 +295,11 @@ static void writeout_video(multiplex_t *mx)
 	//estimate next pts based on bitrate of this stream and data written
 	viu->dts = uptsdiff(viu->dts + ((nlength*viu->ptsrate)>>8), 0);

-	write(mx->fd_out, outbuf, written);
+	if (write(mx->fd_out, outbuf, written) != written) {
+	  if (mx->error++ < 10)	/* log only first few failures */


Everywhere you increase the error count, you should make sure there is no chance of wraparound in the 31 bits of the positive 32-bit integer return value. A very large file is not out of the question here, and we could get a very large number of errors should the disk be full, for instance.

twitham1 · Mar 29, 2018

Yes, I have seen a large error count in my log while testing. I think this was over 12GB of failed writes:

E MPEG2Replex mpeg2fix.cpp:504 (WaitBuffers) thread finished with 3576138 write errors

But the error count doesn't matter, since in the end we bail out on any number of errors. The odds of it rolling over to exactly zero are very small, and only a zero error count will keep the truncated file. If it rolls past zero, then we just log 10 more errors and don't log the actual, much larger error count. Maybe simpler is just a boolean, mx->error = 1: we either have errors or we don't.
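
One way to address the wraparound concern while still keeping a count is a saturating increment. The sketch below assumes a plain int counter as in the patch description; bump_error_count is a hypothetical helper, not something from this pull request.

```c
#include <limits.h>

/* Increment an error counter without ever overflowing: once it reaches
 * INT_MAX it stays there, so it can never wrap to zero or go negative. */
static int bump_error_count(int count)
{
    return (count < INT_MAX) ? count + 1 : INT_MAX;
}
```

Used as count = bump_error_count(count), this keeps "0 means success" reliable and also avoids signed overflow, which is undefined behavior in C.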

Beirdo · 2018-03-29T00:17:25Z

mythtv/programs/mythtranscode/replex/multiplex.c

@@ -637,9 +641,16 @@ void finish_mpg(multiplex_t *mx)
 	if (mx->otype == REPLEX_MPEG2)
 		write(mx->fd_out, mpeg_end,4);

+	if (close(mx->fd_out) < 0) {
+	  mx->error++;


Here's another spot to protect from overflow

twitham1 · 2018-03-30T17:43:34Z

Ok, these last 2 tweaks skip 0 so the error count is always true, even if it rolled over. We succeed only on 0 and fail on any other error count.
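
The "skip 0" idea can be sketched roughly as follows. This is only an illustration under the assumption of an unsigned counter (so that wraparound is well defined); the actual commits may differ, and bump_skip_zero and transcode_ok are hypothetical names.

```c
#include <limits.h>

/* Increment the counter; if it wraps around to zero, skip to 1 so that
 * a run with any failure never looks like a clean run. */
static unsigned int bump_skip_zero(unsigned int count)
{
    count++;            /* unsigned wraparound is well defined in C */
    if (count == 0)
        count = 1;
    return count;
}

/* Success only when no error was ever recorded. */
static int transcode_ok(unsigned int count)
{
    return count == 0;
}
```

After a wrap the count is no longer accurate, but it stays non-zero, which is all the success check needs.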

My preference would have been to just exit on the first write failure so we don't waste time with more failed writes. But when I tried this, I just deadlocked the reader thread. A pthread expert might be able to abort both threads properly. Until then, this resolves bug #12602.

#164 is available for the same changes against master. After adjusting for replex having been relocated to external/replex, the same changes applied with no conflicts.

Thanks for your consideration!

twitham1 · 2020-07-08T21:58:42Z

I don't use 0.27 anymore, so we can drop this and just go with #164 on master for future versions instead.

twitham1 added 5 commits March 27, 2018 00:00

Original patch from 2 years ago.

1ccfd95

Strip my name from log messages and add to code comments instead.

7469d7d

Added twitham name to a few more comments.

2dfbea8

s/twitham/bug12602/g in comments.

4e90202

replex: exit 1 on write failure.

a6f5106

twitham1 changed the title ~~Fix #12602 by detecting write errors~~ mythtranscode: Fix #12602 by detecting write errors Mar 28, 2018

twitham1 mentioned this pull request Mar 28, 2018

mythtranscode: Fix #244 by detecting write errors #164

Merged

Beirdo reviewed Mar 29, 2018


twitham1 added 2 commits March 29, 2018 01:49

Avoid error count rolling over to zero.

fba4265

Fail even if error count rolled over to negative.

76f59b8

Avoid negative error counts.

905c7c9

twitham1 closed this Jul 8, 2020
