# H46Whisper

H46Whisper is a Google Colab notebook that generates Japanese subtitle files from video, streamlining the subsequent translation and timing steps. It is intended to improve the productivity of Hinatazaka46 subbers.

The notebook is based on [Whisper](https://github.com/openai/whisper), a general-purpose speech recognition model.

The output file is in .ass format with the built-in style of the selected sub group, so it can be imported directly into [Aegisub](https://github.com/Aegisub/Aegisub) for further editing.

Contact: @roihn

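H46Whisper runs on Google Colab. It was developed to improve the productivity of Hinatazaka46 subtitle groups, but it is equally suitable for subtitling any Japanese-language video.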

Click the "Run" icon at the far left of each cell below, in order, to execute the cells.

In [None]:
#@markdown **Mount Google Drive**
#@markdown **<br/>[IMPORTANT]:** Select GPU under "Edit" -> "Notebook settings" -> "Hardware accelerator"; otherwise processing will be very slow.

from google.colab import drive
drive.mount('/drive')
print('Google Drive mounted')

In [None]:
#@markdown **Setup Whisper**

!pip install git+https://github.com/openai/whisper.git
!wget https://raw.githubusercontent.com/Ayanaminn/N46Whisper/main/srt2ass.py
print('Whisper installed')

In [None]:
#@markdown **Upload files**

#@markdown There are three ways to provide a file:
#@markdown 1. Click the upload button below to upload from your local machine
#@markdown 2. Import from a Google Drive directory
#@markdown 3. Download via URL

from google.colab import files

uploaded = files.upload()
file_name = list(uploaded.keys())[0]
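The cell above implements only the first of the three listed methods (local upload). The other two could be covered by a small helper along the following lines. This is a minimal sketch, not part of the original notebook: `fetch_input` is a hypothetical name, it uses only the Python standard library, and it assumes Drive files are reachable as ordinary paths under the mount point.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_input(source: str, work_dir: str = ".") -> str:
    """Return a local file path for `source` (hypothetical helper).

    `source` is either an http(s) URL, which is downloaded into `work_dir`,
    or a path to an existing file (for example on the mounted Drive),
    which is copied into `work_dir`.
    """
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # Use the last URL path component as the local file name.
        name = Path(parsed.path).name or "downloaded_media"
        target = work / name
        with urllib.request.urlopen(source) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
        return str(target)
    src = Path(source)
    if not src.is_file():
        raise FileNotFoundError(f"No file found at {source}")
    target = work / src.name
    # Skip the copy when the file already lives in the working directory.
    if src.resolve() != target.resolve():
        shutil.copy(src, target)
    return str(target)
```

With Drive mounted at `/drive`, a call such as `file_name = fetch_input('/drive/MyDrive/clip.mp4')` (an example path, not one taken from the notebook) would set `file_name` for the later cells in place of `files.upload()`.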

In [None]:
# @markdown **Required settings:**


# @markdown <font size="2">Select the type of the uploaded file (video or audio).</font>

# encoding:utf-8
file_type = "audio"  # @param ["audio","video"]

# @markdown <font size="2">Model size affects processing time and transcription quality.</font>
model_size = "base"  # @param ["base", "medium", "large"]
language = "japanese"  # @param {type:"string"}
# @markdown <font size="2">Please contact us if you want to have your sub style integrated.
# @markdown <br/>Currently supported subtitle styles:
# @markdown <br/><li>ikedaCN - 特蕾纱熊猫观察会字幕组
# @markdown <br/><li>sugawaraCN - 坂上之月字幕组
# @markdown <br/><li>kaedeCN - 三番目の枫字幕组
sub_style = "ikedaCN"  # @param ["default", "ikedaCN", "kaedeCN","sugawaraCN"]

# @markdown **Advanced settings (under development, not yet available):**

# @markdown <font size="2">Adjusting the parameters below may improve transcription quality and avoid some issues,
# @markdown <br/>but don't change anything here unless you know what you are doing.

compression_ratio_threshold = 2.4 # @param {type:"number"}
no_speech_threshold = 0.6 # @param {type:"number"}
logprob_threshold = -1.0 # @param {type:"number"}
condition_on_previous_text = "True" # @param ["True", "False"]
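The advanced settings above are marked as under development, and the run cell does not pass them to Whisper. One way they could be wired in is sketched below. `build_transcribe_options` is a hypothetical helper, not part of the notebook. The keyword names match parameters accepted by Whisper's `transcribe`, and the explicit string comparison is there because the form stores `condition_on_previous_text` as the string `"True"` or `"False"`.

```python
def build_transcribe_options(
    compression_ratio_threshold: float = 2.4,
    no_speech_threshold: float = 0.6,
    logprob_threshold: float = -1.0,
    condition_on_previous_text: str = "True",
) -> dict:
    """Collect the form values into keyword arguments (hypothetical helper)."""
    if condition_on_previous_text not in ("True", "False"):
        raise ValueError("condition_on_previous_text must be 'True' or 'False'")
    return {
        "compression_ratio_threshold": float(compression_ratio_threshold),
        "no_speech_threshold": float(no_speech_threshold),
        "logprob_threshold": float(logprob_threshold),
        # The form stores a string; bool("False") would be True, so compare explicitly.
        "condition_on_previous_text": condition_on_previous_text == "True",
    }


options = build_transcribe_options(condition_on_previous_text="False")
print(options)
```

The resulting dictionary could then be unpacked into the call, for example `model.transcribe(audio=file_name, language=language, **options)`.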

In [None]:
#@markdown **Run Whisper**
#@markdown <br/>The ass file will be downloaded automatically when processing finishes.

import os
import subprocess
import sys
import time
from pathlib import Path

import torch
import whisper

sys.path.append('/drive/content')

assert file_name != "", "file_name is empty: upload a file first"
assert language != "", "language must not be empty"

if not os.path.exists(file_name):
    raise FileNotFoundError(f"No {file_name} found in current path.")

# Base name (without extension) and directory used for the output files
file_basename = Path(file_name).stem
output_dir = Path(file_name).parent.resolve()
print(file_basename)
print(output_dir)

# Transcribe the uploaded file directly unless audio has to be extracted first
audio_path = file_name
if file_type == "video":
    print('Extracting audio from video file...')
    audio_path = f'{file_basename}.mp3'
    # Pass the arguments as a list so file names containing spaces are handled safely
    subprocess.run(['ffmpeg', '-y', '-i', file_name, '-f', 'mp3', '-ab', '192000', '-vn', audio_path], check=True)
    print('Done.')

torch.cuda.empty_cache()
print('Loading model...')
model = whisper.load_model(model_size)

# Transcribe
tic = time.time()
print('Converting...')
result = model.transcribe(audio=audio_path, language=language)
toc = time.time()
print('Done')
print(f'Time consumption: {toc - tic:.1f}s')

# Write SRT file
# Note: write_srt ships with older Whisper releases; newer releases expose whisper.utils.get_writer instead.
from whisper.utils import write_srt
with open(Path(output_dir) / (file_basename + ".srt"), "w", encoding="utf-8") as srt:
    write_srt(result["segments"], file=srt)

# Convert SRT to ASS
from srt2ass import srt2ass
assSub = srt2ass(file_basename + ".srt", sub_style)
print('ASS subtitle saved as: ' + assSub)
files.download(assSub)
os.remove(file_basename + ".srt")
torch.cuda.empty_cache()
print('All done!')
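The SRT to ASS step is handled by the downloaded `srt2ass.py`, whose internals are not shown here. To illustrate what such a conversion involves, the sketch below turns one SRT cue into an ASS `Dialogue` line. It is an assumption-based illustration, not the actual `srt2ass` implementation: `srt_time_to_ass` and `srt_block_to_dialogue` are hypothetical names, and the style name `Default` is a placeholder for whichever sub group style is selected.

```python
import re

SRT_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def srt_time_to_ass(stamp: str) -> str:
    """Convert 'HH:MM:SS,mmm' (SRT) to 'H:MM:SS.cc' (ASS, centiseconds)."""
    match = SRT_TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"Not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    # ASS keeps two decimal places, so milliseconds are truncated to centiseconds.
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis // 10:02d}"


def srt_block_to_dialogue(block: str, style: str = "Default") -> str:
    """Turn one SRT cue (index, timing line, text lines) into an ASS Dialogue line."""
    lines = block.strip().splitlines()
    start, end = (part.strip() for part in lines[1].split("-->"))
    # ASS uses \N for a hard line break inside one event.
    text = r"\N".join(lines[2:])
    # Fields: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return (
        f"Dialogue: 0,{srt_time_to_ass(start)},{srt_time_to_ass(end)},"
        f"{style},,0,0,0,,{text}"
    )


cue = "1\n00:00:01,500 --> 00:00:03,000\nこんにちは"
print(srt_block_to_dialogue(cue))
```

A complete .ass file additionally needs the `[Script Info]`, `[V4+ Styles]`, and `[Events]` sections. The per-group styles mentioned in the settings cell would live in the styles section.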