lexicographical soundfile names #16

Open
AlexLashield opened this issue Aug 19, 2023 · 1 comment
Comments

@AlexLashield

Thanks for this tool! It has been very helpful to me.
However, there is a small problem: when I use a batch program to traverse the result set for STT, I found that the file names of the results do not follow lexicographical order.
I would like to suggest a revision to slicer2.py:
Change Line 182 to

soundfile.write(os.path.join(out, f'%s_%05d.wav' % (os.path.basename(args.audio).rsplit('.', maxsplit=1)[0], i)), chunk, sr)
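To illustrate the ordering problem described above, here is a minimal sketch comparing plain string sorting of unpadded and zero-padded names. The stem `sample` and the count of 12 segments are hypothetical values chosen only for the demonstration; nothing here is taken from slicer2.py beyond the `%s_%05d.wav` pattern proposed above.

```python
# Hypothetical stem and segment count, for illustration only.
stem = "sample"
indices = range(12)

# Names without padding versus names with the proposed %05d padding.
unpadded = ["%s_%d.wav" % (stem, i) for i in indices]
padded = ["%s_%05d.wav" % (stem, i) for i in indices]

# Plain string sorting, as a batch script calling sorted() on a
# directory listing would do.
sorted_unpadded = sorted(unpadded)
sorted_padded = sorted(padded)

print(sorted_unpadded[:4])  # "sample_10.wav" lands before "sample_2.wav"
print(sorted_padded[:4])    # padded names keep their numeric order
```

With padding, string order and numeric order agree as long as every index fits in the chosen width.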
@yqzhishen
Member

Yes, the segments here are ordered numerically rather than lexicographically. But Windows File Explorer normally sorts these files correctly.

I agree that following lexicographical order would make it more convenient for many user programs to deal with the segments. However, the problem is how to decide the default width of a segment index. For example, if you use %05d, then what if there are over 100,000 segments (even if this seems practically impossible)? Or, if we use dynamic widths (for example, %02d for 11-100 segments, %03d for 101-1000 segments), will that bring new problems to user programs when matching name patterns? What is your opinion on this?
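One way to picture the dynamic-width option is the sketch below. It assumes the total number of segments is known before any file is written. The helper `segment_name` and the pattern `SEGMENT_PATTERN` are hypothetical names introduced for this example and are not part of slicer2.py.

```python
import re


def segment_name(stem, index, total):
    """Build a file name whose index is padded to fit the largest index."""
    # The largest index is total - 1; its digit count is the padding width.
    width = len(str(max(total - 1, 0)))
    return f"{stem}_{index:0{width}d}.wav"


# A pattern that accepts any index width, so a user program does not
# need to know how many digits were used.
SEGMENT_PATTERN = re.compile(r"^(?P<stem>.+)_(?P<index>\d+)\.wav$")

names = [segment_name("sample", i, 150) for i in range(150)]
match = SEGMENT_PATTERN.match(names[7])
print(names[7], int(match.group("index")))
```

A user program that matches on `_\d+\.wav` rather than on a fixed number of digits is unaffected by the width changing between runs, though names produced by runs with different segment counts would not share a single width.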
