merge feature #15

Closed
karelv opened this issue May 12, 2012 · 5 comments

Comments

karelv commented May 12, 2012

Hello,

I was looking for a dual subtitle feature but didn't find one.
But I found this fantastic package, and I would like to share my merge script here.
For me it works perfectly in my VLC player.
Perhaps it could be added to your command line tool.

Usage:
./subtitle_merge.py en.srt de.srt en_de.srt

Best regards,
Karel.

See the code below....

Owner

byroot commented May 12, 2012

I would be happy to add useful features to the srt command.

But before that I need to understand exactly what your script intends to do.

It builds hybrid subtitles with mixed languages, right?

Can you provide me with 2 test subtitles to see it in action?

Author

karelv commented May 12, 2012

OK, I have to admit that I was too quick in sharing my code.

So yes, indeed, I wanted to have 2 languages.
I'm currently learning a new language, and it is useful to listen to and read the language you are learning while also having the subtitles in your mother tongue...

Here is my new code, along with 2 input files and the result.
They are just 2 test files.

subtitle_merge.py

    import argparse
    from pysrt import SubRipFile, SubRipItem, SubRipTime

    parser = argparse.ArgumentParser(description='Merge 2 srt files.')

    parser.add_argument('fin', type=str, nargs=2,
                       help='input file')
    parser.add_argument(dest='fout', type=str, nargs=1,
                       help='the output file')

    args = parser.parse_args() 
    master = args.fin[0]
    slave = args.fin[1]
    result = args.fout[0]

    msubs = SubRipFile.open (master, encoding='iso-8859-1')
    ssubs = SubRipFile.open (slave, encoding='iso-8859-1')
    rsubs = SubRipFile.from_string ("", encoding='iso-8859-1')
    tp = SubRipTime()

    for msub in msubs:
        #"start before, ends before"
        for ssub in ssubs.slice (starts_after=tp.ordinal - 1, ends_before=msub.start.ordinal+1):
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=ssub.end, text=ssub.text))
            ssubs.remove (ssub) # remove as completely handled.

        tp = msub.start

        #"start before; ends before end of master sub"
        for ssub in ssubs.slice (starts_before=msub.start.ordinal, ends_before=msub.end.ordinal):
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=msub.start, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=msub.start, end=ssub.end, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=msub.start, end=ssub.end, text=msub.text))        
            ssubs.remove (ssub) # remove as completely handled.
            tp = ssub.end

        #"start before; ends after end of master sub"
        for ssub in ssubs.slice (starts_before=msub.start, ends_after=msub.end):
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=msub.start, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=msub.start, end=msub.end, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=msub.start, end=msub.end, text=msub.text))        
            rsubs.append (SubRipItem(index=len(rsubs), start=msub.end, end=ssub.end, text=ssub.text))
            ssubs.remove (ssub) # remove as completely handled.
            tp = msub.end


        #"start during; ends before master"
        for ssub in ssubs.slice (starts_after=msub.start.ordinal-1, ends_before=msub.end.ordinal+1):
            if (tp != ssub.start):
                rsubs.append (SubRipItem(index=len(rsubs), start=tp, end=ssub.start, text=msub.text))            
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=ssub.end, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=ssub.end, text=msub.text))
            tp = ssub.end
            ssubs.remove (ssub) # remove as completely handled.

        #"start during; ends after master"
        for ssub in ssubs.slice (starts_before=msub.end, ends_after=msub.end):
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=msub.end, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=msub.end,   end=ssub.end, text=ssub.text))
            rsubs.append (SubRipItem(index=len(rsubs), start=ssub.start, end=msub.end, text=msub.text))
            ssubs.remove (ssub) # remove as completely handled.
            tp = msub.end

        if (tp != msub.end):
            if (tp.ordinal < msub.start.ordinal):
                tp = msub.start
            rsubs.append (SubRipItem(index=len(rsubs), start=tp, end=msub.end, text=msub.text))
            tp = msub.end

    rsubs.sort(key=lambda item: item.start.ordinal)

    for idx, rsub in enumerate(rsubs):
        rsub.index = idx
    rsubs.save (result, encoding='iso-8859-1')

master.srt

1
00:00:01,000 --> 00:00:03,000
Master sub title!

2
00:00:10,000 --> 00:00:15,000
First master sub.

3
00:00:20,000 --> 00:00:22,000
2nd master sub.

4
00:00:30,000 --> 00:00:40,000
3rd master sub.

5
00:00:45,000 --> 00:01:00,000
4th master sub.

slave.srt

1
00:00:01,000 --> 00:00:03,000
Slave sub title!

2
00:00:04,000 --> 00:00:05,500
First slave sub before 1st msub.

3
00:00:06,500 --> 00:00:08,000
Second slave sub before 1st msub.

4
00:00:09,000 --> 00:00:10,000
Third slave sub until 1st msub.

5
00:00:15,000 --> 00:00:17,000
Fourth slave sub right after 1st msub.

6
00:00:18,000 --> 00:00:24,000
Fifth slave sub @ before and after 2nd msub.

7
00:00:44,000 --> 00:00:46,000
Sixth slave sub @ 4th msub entry.

8
00:00:47,000 --> 00:00:48,000
Seventh slave sub during 4th msub.

9
00:00:49,000 --> 00:00:50,000
Eighth slave sub during 4th msub.

After ./subtitle_merge.py master.srt slave.srt merge.srt you get:

merge.srt

0
00:00:01,000 --> 00:00:03,000
Slave sub title!

1
00:00:01,000 --> 00:00:03,000
Master sub title!

2
00:00:04,000 --> 00:00:05,500
First slave sub before 1st msub.

3
00:00:06,500 --> 00:00:08,000
Second slave sub before 1st msub.

4
00:00:09,000 --> 00:00:10,000
Third slave sub until 1st msub.

5
00:00:10,000 --> 00:00:15,000
First master sub.

6
00:00:15,000 --> 00:00:17,000
Fourth slave sub right after 1st msub.

7
00:00:18,000 --> 00:00:20,000
Fifth slave sub @ before and after 2nd msub.

8
00:00:20,000 --> 00:00:22,000
Fifth slave sub @ before and after 2nd msub.

9
00:00:20,000 --> 00:00:22,000
2nd master sub.

10
00:00:22,000 --> 00:00:24,000
Fifth slave sub @ before and after 2nd msub.

11
00:00:30,000 --> 00:00:40,000
3rd master sub.

12
00:00:44,000 --> 00:00:45,000
Sixth slave sub @ 4th msub entry.

13
00:00:45,000 --> 00:00:46,000
Sixth slave sub @ 4th msub entry.

14
00:00:45,000 --> 00:00:46,000
4th master sub.

15
00:00:46,000 --> 00:00:47,000
4th master sub.

16
00:00:47,000 --> 00:00:48,000
Seventh slave sub during 4th msub.

17
00:00:47,000 --> 00:00:48,000
4th master sub.

18
00:00:48,000 --> 00:00:49,000
4th master sub.

19
00:00:49,000 --> 00:00:50,000
Eighth slave sub during 4th msub.

20
00:00:49,000 --> 00:00:50,000
4th master sub.

21
00:00:50,000 --> 00:01:00,000
4th master sub.

Author

karelv commented May 12, 2012

And as you can see, I had a quick look at the flavors of GitHub! :-)

Owner

byroot commented May 12, 2012

Can you tell me whether this simpler version works for you?

    import argparse
    from pysrt import SubRipFile, SubRipItem, SubRipTime

    parser = argparse.ArgumentParser(description='Merge 2 srt files.')

    parser.add_argument('fin', type=str, nargs=2,
                       help='input file')
    parser.add_argument(dest='fout', type=str, nargs=1,
                       help='the output file')

    args = parser.parse_args()
    master, slave = args.fin
    result = args.fout[0]

    msubs = SubRipFile.open(master, encoding='iso-8859-1')
    ssubs = SubRipFile.open(slave, encoding='iso-8859-1')

    rsubs = msubs + ssubs
    rsubs.sort()
    rsubs.clean_indexes()
    rsubs.save(result, encoding='iso-8859-1')
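For readers without pysrt at hand, the effect of concatenating two subtitle lists and sorting them can be sketched with plain tuples. This is only an illustration: the tuple layout and the `concat_and_sort` helper are hypothetical, and it assumes the sort orders cues by start time, then end time. The timings are taken from the test files in this thread.

```python
# Cues as (start_ms, end_ms, language, text); hypothetical stand-ins
# for subtitle items, not pysrt objects.
master = [
    (10000, 15000, "master", "First master sub."),
    (45000, 60000, "master", "4th master sub."),
]
slave = [
    (9000, 10000, "slave", "Third slave sub until 1st msub."),
    (47000, 48000, "slave", "Seventh slave sub during 4th msub."),
]


def concat_and_sort(a, b):
    """Concatenate two cue lists and order them by (start, end)."""
    return sorted(a + b, key=lambda cue: (cue[0], cue[1]))


merged = concat_and_sort(master, slave)
order = [cue[2] for cue in merged]
print(order)  # ['slave', 'master', 'master', 'slave']
```

With a plain sort, which language comes first around an overlap depends only on which cue happens to start earlier, so the order of the two languages is not fixed.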

Author

karelv commented May 12, 2012

This is what I did first, or at least something similar. I'm relatively new to Python, so I was appending the items one by one instead of using '+'. I did the sort as in my shared code, and likewise the equivalent of 'clean_indexes'.

I was not happy with the output since I always got a mixture: sometimes the master subtitle comes first and then the slave,
sometimes the opposite,
and sometimes it needs up to 6 lines to display all the subtitles.

I like your solution more: it has cleaner code and it executes faster. But I prefer my version for its output.

Thanks for the challenge anyhow!

PS. I also tried your version. It suffers from the same issue as my very first version, at least in VLC player 2.0.1.
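One way to get a fixed language order, sketched here as an alternative and not as the script posted in this thread, is to cut the timeline at every cue boundary and emit one combined cue per segment, master text first. The `Cue` type and `merge_cues` function are hypothetical names, times are plain milliseconds, and no pysrt calls are used.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: int  # milliseconds
    end: int    # milliseconds
    text: str


def merge_cues(master: List[Cue], slave: List[Cue]) -> List[Cue]:
    """Split the timeline at every cue boundary and combine overlapping texts."""
    # Every start or end time is a potential boundary of an output segment.
    bounds = sorted({t for c in master + slave for t in (c.start, c.end)})
    merged = []
    for seg_start, seg_end in zip(bounds, bounds[1:]):
        # Collect cues that cover the whole segment, always master first.
        lines = [c.text for c in master
                 if c.start <= seg_start and c.end >= seg_end]
        lines += [c.text for c in slave
                  if c.start <= seg_start and c.end >= seg_end]
        if lines:  # skip gaps where nothing is displayed
            merged.append(Cue(seg_start, seg_end, "\n".join(lines)))
    return merged


master = [Cue(20000, 22000, "2nd master sub.")]
slave = [Cue(18000, 24000, "Fifth slave sub @ before and after 2nd msub.")]
for cue in merge_cues(master, slave):
    print(cue.start, cue.end, repr(cue.text))
```

Unlike the script above, which writes separate items for each language, this sketch puts both texts into a single item per segment. The nested scan is quadratic in the number of cues, which should be acceptable for typical subtitle files but is not tuned for speed.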

@byroot byroot mentioned this issue Aug 9, 2012
@byroot byroot closed this as completed Jan 6, 2013