Add support for C-SPAN subtitles #5146

Closed
lfl2 opened this issue Mar 6, 2015 · 2 comments

@lfl2 lfl2 commented Mar 6, 2015

Most C-SPAN videos contain subtitles. They are served in DFXP format and loaded from a URL like this:
http://data.c-spanvideo.org/Programs/aaa/bbb/ccc.dfxp

DFXP can be easily converted to SRT format:

#!/usr/bin/env python3
# Usage: python tt2srt.py source.xml output.srt

import sys
from xml.dom.minidom import parse


def srt_time(value):
    # Convert a time offset in seconds (e.g. "12.345") to HH:MM:SS,mmm.
    try:
        total_ms = int(round(float(value) * 1000))
    except ValueError:
        total_ms = 0  # fall back to 0 when the value is not a plain number
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return '%02d:%02d:%02d,%03d' % (hours, minutes, seconds, millis)


dom = parse(sys.argv[1])
body = dom.getElementsByTagName("body")[0]
paras = body.getElementsByTagName("p")
with open(sys.argv[2], 'w', encoding='utf-8') as out:
    for i, para in enumerate(paras, 1):
        # Cue number, then the "begin --> end" timing line.
        out.write(str(i) + "\n")
        begin = srt_time(para.attributes['begin'].value)
        end = srt_time(para.attributes['end'].value)
        out.write(begin + ' --> ' + end + "\n")
        # Cue text: <br/> becomes a newline, text nodes are copied as-is.
        for child in para.childNodes:
            if child.nodeName == 'br':
                out.write("\n")
            elif child.nodeName == '#text':
                out.write(child.data)
        out.write("\n\n")
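
For anyone who wants to check the timestamp arithmetic without a real C-SPAN file, here is a minimal in-memory sketch of the same idea. The sample XML and the names `SAMPLE`, `srt_timestamp`, and `dfxp_to_srt` are made up for illustration. It assumes `begin`/`end` are plain second offsets, as in the script above, and it ignores nested elements such as `<span>`, so it may need adjusting for real files.

```python
from xml.dom.minidom import parseString

# Hypothetical sample; real DFXP files may carry namespaces and extra attributes.
SAMPLE = (
    '<tt><body><div>'
    '<p begin="1.5" end="3.25">Hello<br/>world</p>'
    '<p begin="3725.042" end="3726">Second cue</p>'
    '</div></body></tt>'
)


def srt_timestamp(seconds):
    # Work in whole milliseconds so the fractional part is not lost.
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    secs, millis = divmod(rest, 1000)
    return '%02d:%02d:%02d,%03d' % (hours, minutes, secs, millis)


def dfxp_to_srt(xml_text):
    dom = parseString(xml_text)
    cues = []
    for index, para in enumerate(dom.getElementsByTagName('p'), 1):
        begin = srt_timestamp(float(para.getAttribute('begin')))
        end = srt_timestamp(float(para.getAttribute('end')))
        # Keep direct text nodes and turn <br/> into a line break.
        text = ''.join(
            '\n' if child.nodeName == 'br' else child.data
            for child in para.childNodes
            if child.nodeName in ('br', '#text')
        )
        cues.append('%d\n%s --> %s\n%s\n' % (index, begin, end, text))
    # A blank line separates consecutive cues.
    return '\n'.join(cues)


if __name__ == '__main__':
    print(dfxp_to_srt(SAMPLE))
```

Going through integer milliseconds with `divmod` keeps the hours, minutes, seconds, and millisecond fields consistent with each other, which is easier to verify than computing each field separately from the float.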
yan12125 added a commit that referenced this issue Apr 25, 2015

@yan12125 yan12125 (Collaborator) commented Apr 25, 2015

Thanks @lfl2. I've adopted your implementation and made some modifications so that it integrates better with youtube-dl. dfxp/TTML subtitles are now supported.

@yan12125 yan12125 closed this Apr 25, 2015