scc2srt: surprising timestamps #59

MathieuDuponchelle · 2020-03-18T01:29:17Z

First off, thanks for this library :)

gst-plugins-rs uses libcaption and has a test SCC file for unit testing purposes. The file corresponds to https://www.democracynow.org/shows/2018/12/17?autostart=true, and I find that both the GStreamer wrapper and the scc2srt executable generate empty cues and "offset-by-1" cue timestamps.

cat foo.scc:

Scenarist_SCC V1.0

00:00:00;00 942c 942c

00:00:14;01 9420 9420 94ae 94ae 9454 9454 10ae 10ae 46f2 ef6d 20ce e5f7 20d9 eff2 6b2c 94f2 94f2 10ae 10ae f468 e973 20e9 7320 c4e5 6def e3f2 61e3 7920 ceef f7a1 942f 942f

00:00:17;26 9420 9420 94ae 94ae 9452 9452 97a1 97a1 10ae 10ae d9e5 732c 942c 942c 2049 a76d 2073 7570 70ef f2f4 e96e 6780 94f4 94f4 10ae 10ae c4ef 6e61 ec64 2054 f275 6d70 ae80 942f 942f

scc2srt foo.scc:

01
0:00:00,000 --> 00:00:00,000


02
0:00:00,000 --> 00:00:14,033
From New York,
this is Democracy Now!

03
0:00:14,033 --> 00:00:14,033

Here, I believe:

The first SCC cue should be ignored, as it simply clears the screen.
The second cue should be timestamped as 00:00:14;01 plus the number of frames before 0x94 0x2f (end_of_caption), as the cue starts by initiating pop-on mode (0x94 0x20, "resume_caption_loading"). This seems to match the actual media pretty well, as the lady starts saying "From New York" some time during the 15th second.
I suspect the last cue isn't drained, which is probably trivial to fix.
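
To make the arithmetic for the second cue concrete, here is a minimal sketch. It assumes that every 2-byte word in an SCC line occupies exactly one frame and that the file is 29.97 fps drop-frame (the `;` separator); the function names are made up for illustration and are not part of libcaption.

```python
from fractions import Fraction

# Assumption: SCC timecodes written with ';' are 29.97 fps drop-frame.
FPS = Fraction(30000, 1001)
END_OF_CAPTION = 0x942F

# Second cue of foo.scc, copied from above.
CUE_TIMECODE = "00:00:14;01"
CUE_WORDS = [int(w, 16) for w in (
    "9420 9420 94ae 94ae 9454 9454 10ae 10ae 46f2 ef6d 20ce e5f7 20d9 eff2 "
    "6b2c 94f2 94f2 10ae 10ae f468 e973 20e9 7320 c4e5 6def e3f2 61e3 7920 "
    "ceef f7a1 942f 942f"
).split()]


def timecode_to_frame(timecode):
    """Convert an HH:MM:SS;FF drop-frame timecode to an absolute frame count."""
    hh, mm, rest = timecode.split(":")
    ss, ff = rest.split(";")
    h, m, s, f = int(hh), int(mm), int(ss), int(ff)
    total_minutes = 60 * h + m
    # Drop-frame skips frame labels 0 and 1 every minute, except every tenth minute.
    dropped = 2 * (total_minutes - total_minutes // 10)
    return (3600 * h + 60 * m + s) * 30 + f - dropped


def pop_on_display_frame(timecode, words):
    """Frame at which a pop-on caption becomes visible (first end_of_caption)."""
    offset = words.index(END_OF_CAPTION)  # one word per frame (assumption)
    return timecode_to_frame(timecode) + offset


frame = pop_on_display_frame(CUE_TIMECODE, CUE_WORDS)
seconds = float(frame / FPS)
print(frame, round(seconds, 3))
```

Under these assumptions the first 0x94 0x2f sits 30 words into the line, so the cue would be displayed 30 frames after 00:00:14;01, i.e. a little after the 15 second mark.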

From a quick look at the code, I think there are two issues at play here:

libcaption decides that frames are READY too often, for example when encountering 0x94 0x2c (erase_display_memory); this leads to the creation of a cue.
caption_frame_decode updates its timestamp before ignoring duplicate control commands, which explains the offset-by-1 issue. I believe this could be improved further by tracking the current mode and, in pop-on mode, updating the timestamp at the end of the caption instead of when the last call returned READY. This would however require making assumptions about the framerate (are SCC files always 29.97 fps?).
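
As an illustration of the duplicate handling, here is a small sketch of how doubled control commands could be filtered while still counting the frame each one occupies. This is not libcaption's code: the helper names are hypothetical, and it assumes a control code is any word whose first byte, with the parity bit stripped, is in the range 0x10 to 0x1F.

```python
def is_control(word):
    """True if the word is a control code (first byte 0x10..0x1F without parity)."""
    return 0x10 <= ((word >> 8) & 0x7F) <= 0x1F


def drop_repeated_controls(words):
    """Return (frame_offset, word) pairs, skipping the redundant second copy
    of each doubled control code. Offsets still count the skipped words,
    because every word occupies one frame (assumption)."""
    kept = []
    prev = None
    for offset, word in enumerate(words):
        if is_control(word) and word == prev:
            prev = None  # a third identical word would be a new command
            continue
        kept.append((offset, word))
        prev = word
    return kept


# Second cue of foo.scc, copied from above.
words = [int(w, 16) for w in (
    "9420 9420 94ae 94ae 9454 9454 10ae 10ae 46f2 ef6d 20ce e5f7 20d9 eff2 "
    "6b2c 94f2 94f2 10ae 10ae f468 e973 20e9 7320 c4e5 6def e3f2 61e3 7920 "
    "ceef f7a1 942f 942f"
).split()]

kept = drop_repeated_controls(words)
print(len(words), len(kept), kept[-1])
```

The point of keeping the offsets is that any timestamp bookkeeping should happen only for the words that survive this filter; otherwise the duplicate advances the timestamp of a frame that has already been handed out.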

Fixes szatmary#59

* pop_on mode requires incrementing the frame timestamp until end_of_caption is encountered.
* caption_frame_decode now always updates the timestamp of the frame when the timestamp parameter != -1. This requires that callers only pass a valid timestamp when a new one is encountered, for example with SCC the timestamp at the start of the cue, then -1 until the next new timestamp.
* A new enum member is added for the return value, LIBCAPTION_CLEAR. It allows the caller to determine that closed captions should not be displayed anymore, in order to finish the previous cue earlier than the start of the next cue.
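
The calling convention described above can be modelled roughly as follows. This is a toy Python model, not the C API: `ToyDecoder`, `build_cues` and the string statuses are hypothetical stand-ins, timestamps are plain frame counts, caption text is not tracked, and advancing by one frame whenever -1 is passed is an assumption of this sketch.

```python
ERASE_DISPLAYED_MEMORY = 0x942C
END_OF_CAPTION = 0x942F


def is_control(word):
    """Control codes have a first byte of 0x10..0x1F once parity is stripped."""
    return 0x10 <= ((word >> 8) & 0x7F) <= 0x1F


class ToyDecoder:
    def __init__(self):
        self.frame = 0
        self.prev = None

    def decode(self, word, frame=-1):
        # A real timestamp resets the clock; -1 means "next frame".
        self.frame = frame if frame != -1 else self.frame + 1
        if is_control(word) and word == self.prev:
            self.prev = None  # redundant second copy of a control code
            return "OK"
        self.prev = word
        if word == END_OF_CAPTION:
            return "READY"  # pop-on caption becomes visible
        if word == ERASE_DISPLAYED_MEMORY:
            return "CLEAR"  # stand-in for LIBCAPTION_CLEAR
        return "OK"


def build_cues(lines):
    """lines: list of (first_frame, words). Returns (start, end) frame pairs."""
    decoder, cues, start = ToyDecoder(), [], None
    for first_frame, words in lines:
        for i, word in enumerate(words):
            status = decoder.decode(word, first_frame if i == 0 else -1)
            if status in ("READY", "CLEAR") and start is not None:
                cues.append((start, decoder.frame))  # finish the visible cue
                start = None
            if status == "READY":
                start = decoder.frame
    if start is not None:
        cues.append((start, None))  # drain the last cue, end unknown
    return cues


def parse(hex_words):
    return [int(w, 16) for w in hex_words.split()]


# foo.scc from above; 421 and 536 are the frame counts of 00:00:14;01 and 00:00:17;26.
scc = [
    (0, parse("942c 942c")),
    (421, parse("9420 9420 94ae 94ae 9454 9454 10ae 10ae 46f2 ef6d 20ce e5f7 "
                "20d9 eff2 6b2c 94f2 94f2 10ae 10ae f468 e973 20e9 7320 c4e5 "
                "6def e3f2 61e3 7920 ceef f7a1 942f 942f")),
    (536, parse("9420 9420 94ae 94ae 9452 9452 97a1 97a1 10ae 10ae d9e5 732c "
                "942c 942c 2049 a76d 2073 7570 70ef f2f4 e96e 6780 94f4 94f4 "
                "10ae 10ae c4ef 6e61 ec64 2054 f275 6d70 ae80 942f 942f")),
]
cues = build_cues(scc)
print(cues)
```

In this model the initial erase produces no cue, the first caption ends at the erase inside the third SCC line rather than at the start of the next caption, and the final caption is still emitted with an open end.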

MathieuDuponchelle linked a pull request Mar 18, 2020 that will close this issue

caption_frame_decode: rework API #60

Open
