Closed caption export tool #633
Comments
Yes, that sounds like a job for ld-extract-metadata to me.
I've done some initial work on this but, like everything laserdisc, it's not a straightforward problem. The CC data includes lots of commands that tell the subtitling hardware what to do: scroll up, move cursor, go back, etc. None of this is readily represented in a textual output. So, for now, I've added code which strips most commands, except for a few which are output as spaces or newlines. Testing this on my Cinderella LD, it produces a pleasing text file that's very readable.

The command line is something like:

    ld-export-metadata --closed-captions /home/sdi/Decodes/cc.txt /home/sdi/Decodes/cinder/cindys1.tbc.json

where cc.txt is the output plain-text file.

Of course, this isn't even close to what a subtitle file probably should look like, so I'm open to suggestions as to what the correct format should be (along with links to the formatting standard). Once that's agreed, we can work towards it.

Please test and then make some suggestions! (Remember that, thanks to American lawyers, subtitle files are auto-copyright and belong to the owner of the motion picture, so don't include any as examples without randomizing a bit.)
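As a rough illustration of the command-stripping approach described in the comment above, here is a minimal Python sketch. It is not the ld-export-metadata implementation: the function name `cc_pairs_to_text` is hypothetical, and the choice of which commands become newlines or spaces (carriage return and end-of-caption become newlines, mid-row codes become spaces) is an assumption, since the comment does not say which ones are kept. It assumes the input is a sequence of CEA-608 byte pairs, one pair per field, as found in the visible line data.

```python
def cc_pairs_to_text(pairs):
    """Reduce CEA-608 byte pairs to plain text (illustrative sketch only)."""
    out = []
    last_control = None
    for b1, b2 in pairs:
        # Drop the parity bit carried in bit 7 of each byte.
        b1 &= 0x7F
        b2 &= 0x7F
        if 0x10 <= b1 <= 0x1F:
            # Control codes are normally transmitted twice; skip the repeat.
            if (b1, b2) == last_control:
                last_control = None
                continue
            last_control = (b1, b2)
            if b1 in (0x14, 0x1C) and b2 in (0x2D, 0x2F):
                # Assumption: carriage return / end of caption -> newline.
                out.append("\n")
            elif b1 in (0x11, 0x19) and 0x20 <= b2 <= 0x2F:
                # Assumption: mid-row style codes -> a single space.
                out.append(" ")
            # Every other command (cursor moves, erases, ...) is dropped.
            continue
        last_control = None
        for b in (b1, b2):
            # Keep printable characters; 0x00 padding is skipped. The 608
            # character set differs from ASCII in a few positions, which
            # this sketch ignores.
            if 0x20 <= b <= 0x7E:
                out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    sample = [(0x94, 0x20), (0x94, 0x20), (0xC8, 0x45), (0x4C, 0x4C),
              (0x4F, 0x80), (0x94, 0xAD), (0x94, 0xAD), (0xC8, 0x49)]
    print(cc_pairs_to_text(sample))
```

The sketch also shows why plain text is lossy: positioning and timing information carried by the dropped commands cannot be recovered from the output.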
ld-export-metadata can now copy closed caption data from the input metadata into a plain text file.
https://github.com/CCExtractor/ccextractor might be a good resource for this.
Just checked the git wiki for that project and the only documentation is for building the project; the format description pages are blank. Any other suggestions that have documentation, or do you know of input format documents for the link?
There were some docs in https://github.com/CCExtractor/ccextractor/tree/master/docs, but from there I followed some links to http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/SCC_FORMAT.HTML. The .scc file format which it describes seems to be the standard for closed captions.
Thanks; the .SCC description makes a lot of sense. Since the current code is pulling the text and commands out of the visible line data, it should be possible to do. The time-code extraction seems to be the odd bit; the spec doesn't really seem to say why you line-break and insert a new code... I'll have a play.
There's also .srt format if you want something really low-tech.
I think SCC is the way to go, since it doesn't add the need to translate the CC into captions. Looking at @Gamnn's links, you output SCC (which is basically timecoded CC raw data) and then the CCExtractor tool converts it into other formats like .srt. The CC protocol is pretty complex, so letting another tool interpret it is a lot less work, I think. Anyway, I'll give the SCC stuff a go and see if I can use CCExtractor to get something useful from it; otherwise I can make a simple output and drop it as .srt.
OK, I've removed the raw text output and replaced it with SCC-formatted output. The included timecodes are relative to the input video file (i.e. they are calculated from the field number rather than the VBI timecode/frame number). I'd guess that adding both relative and VBI-based timecode output would be a good future feature though.

It turns out that CCExtractor is not the right tool for using the SCC file. I found ttconv, which happens to have a nice web-based UI too: if you take the .scc output from ld-export-metadata and set ttconv to output .srt, it gives you a nice VLC-compatible file back, and VLC seems to be totally happy with the result.

The code is simple right now, and I have limited test discs with CC, so please test.
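To make the "timecode calculated from the field number" idea concrete, here is a small Python sketch of how a file-relative SCC line could be built. It is an illustration under stated assumptions, not the code in ld-export-metadata: the function names are hypothetical, field numbers are assumed to start at 1, two fields are assumed per frame, and a nominal 30 fps non-drop-frame timecode is assumed (the tool's actual frame-rate handling is not described in this thread).

```python
def field_to_timecode(field_number, fps=30):
    """Convert a 1-based field number to HH:MM:SS:FF (non-drop-frame)."""
    frame = (field_number - 1) // 2          # two fields per frame
    ff = frame % fps
    total_seconds = frame // fps
    ss = total_seconds % 60
    mm = (total_seconds // 60) % 60
    hh = total_seconds // 3600
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def scc_line(field_number, pairs):
    """Format one SCC line: timecode, a tab, then 4-digit hex words."""
    words = " ".join(f"{b1:02x}{b2:02x}" for b1, b2 in pairs)
    return f"{field_to_timecode(field_number)}\t{words}"


def scc_document(captions):
    """Build SCC text from (field_number, byte_pairs) tuples."""
    lines = ["Scenarist_SCC V1.0"]           # SCC header line
    for field_number, pairs in captions:
        lines.append("")                     # blank line between entries
        lines.append(scc_line(field_number, pairs))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 0x94 0x20 is "resume caption loading" with parity bits included.
    print(scc_document([(61, [(0x94, 0x20), (0x94, 0x20)])]))
```

Because the byte pairs are written out raw, interpreting the CC commands is left entirely to the downstream converter, which is the division of labour argued for earlier in the thread.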
Added wiki documentation: https://github.com/happycube/ld-decode/wiki/Working-with-subtitles

Closing this issue as complete now. Please test and report any suggestions/bugs/problems as new issues. Thanks!
It would be useful to be able to export the closed captions to a subtitle format, maybe with ld-extract-metadata. I'm not sure whether this is out of scope for this project (or whether such a feature already exists somewhere; I couldn't find one).