YouTube Video Transcription with Whisper #262
Replies: 4 comments 7 replies
-
Congratulations. You are one of the top devs of the year.
On Thu, Oct 6, 2022 at 12:15 PM marferca wrote:
Hi everyone!
I have created a Streamlit app that lets you transcribe YouTube videos
using Whisper and download the output as TXT or SubRip. I hope you like it
and will be more than happy to hear your thoughts on this.
Streamlit app:
https://marferca-yt-whisper-demo-streamlit-app-luptcq.streamlitapp.com/
Kudos to the OpenAI team, for your fantastic work and for sharing it with
the community. I hope we can keep building next-gen apps with your powerful
models 🚀.
|
-
Brilliant. Thanks for your work. It would be good to add translation functionality. Is that an easy thing to do? Traditionally I would not have thought so - but these days, who can tell? |
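On the translation question: Whisper's `transcribe` call accepts a `task` option, and `task="translate"` asks the model for English output instead of a transcript in the source language. The sketch below shows roughly how that could be wired in. The helper names `whisper_options` and `transcribe_file` are hypothetical, and how the Streamlit app actually calls Whisper is not shown in this thread, so treat it as an illustration rather than the app's code.

```python
def whisper_options(translate: bool = False) -> dict:
    """Build keyword arguments for Whisper's transcribe call (hypothetical helper)."""
    # "translate" requests English output; "transcribe" keeps the spoken language.
    return {"task": "translate" if translate else "transcribe"}


def transcribe_file(audio_path: str, translate: bool = False, model_name: str = "base") -> dict:
    """Run Whisper on a local audio file (hypothetical wrapper, not the app's code)."""
    # Imported lazily so whisper_options() works without openai-whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path, **whisper_options(translate))
```

Note that Whisper's translation task targets English only, so this would not cover translation into other languages.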
-
Hey, thanks for creating this app! I was wondering whether it'd be possible to support YouTube videos longer than 8 minutes. I'm thinking of transcribing some videos that are at most 30 minutes long and would love to use this app. Thanks! |
-
Problem with audio duration:
def _format_timestamp(seconds):
    # SRT timestamps are HH:MM:SS,mmm. Working in whole milliseconds
    # avoids rounding a value like 59.9996 up to "60,000" seconds.
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, ms)

def _whisper_result_to_srt(result):
    # Each SRT entry: index line, "start --> end" line, then the segment text.
    text = []
    for i, s in enumerate(result['segments']):
        text.append(str(i + 1))
        text.append(_format_timestamp(s['start']) + " --> " + _format_timestamp(s['end']))
        text.append(s['text'].strip() + "\n")
    return "\n".join(text)

# result is the dict returned by Whisper's model.transcribe(...)
out = _whisper_result_to_srt(result)
print(out)
The output suggests that this audio file lasts 36 seconds, but that is not true:
1
00:00:00,000 --> 00:00:07,000
项目的物业 富华行物业 管理长安俱乐部
2
00:00:07,000 --> 00:00:36,000
长安俱乐部是十大俱乐部之首 物业费是24块钱一瓶 |
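One possible workaround for the overshooting end time, sketched under the assumption that the real audio length is known from another source (for example, the file's metadata): clamp every segment to that length before building the SRT. The function name `clamp_segments` and the 12.5-second duration below are hypothetical, since the thread does not state the clip's actual length.

```python
def clamp_segments(segments, audio_duration):
    """Return copies of the segments with times limited to [0, audio_duration]."""
    clamped = []
    for seg in segments:
        # Keep start inside the audio, and never let end precede start.
        start = min(max(seg["start"], 0.0), audio_duration)
        end = min(max(seg["end"], start), audio_duration)
        clamped.append({**seg, "start": start, "end": end})
    return clamped


# Segment times copied from the output above; the text is a placeholder.
segments = [
    {"start": 0.0, "end": 7.0, "text": "first segment"},
    {"start": 7.0, "end": 36.0, "text": "second segment"},
]
# 12.5 is a made-up duration used only for illustration.
fixed = clamp_segments(segments, audio_duration=12.5)
print(fixed[-1]["end"])
```

The clamped list can then be passed on as `{"segments": fixed}` to a function like `_whisper_result_to_srt`. This only hides the symptom; it does not explain why the final segment's end time is too large.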
-
Hi everyone!
I have created a Streamlit app that lets you transcribe YouTube videos using Whisper and download the output as TXT or SubRip. I hope you like it and will be more than happy to hear your thoughts on this.
Streamlit app: https://marferca-yt-whisper-demo-streamlit-app-luptcq.streamlitapp.com/
Kudos to the OpenAI team, for your fantastic work and for sharing it with the community. I hope we can keep building next-gen apps with your powerful models 🚀.