# yohane <img src="https://hikari.butaishoujo.moe/p/06bfbaf3/680954239740411973.png" height="24px" width="24px" style="display:inline;object-fit:contain;vertical-align:middle" >

---

Please click the badge below to open the latest version of the notebook:

<a target="_blank" href="https://colab.research.google.com/github/Japan7/yohane/blob/main/notebook/yohane.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

---

**Before proceeding, change your runtime type to GPU (Toolbar > Runtime > Change runtime type > T4 GPU).**

![](https://hikari.butaishoujo.moe/p/bde50ce2/out.png)


## Install

Execute the next cells to install yohane.


In [None]:
!python3 --version


In [None]:
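%%bash
REPO_URL=https://github.com/Japan7/yohane.git
# List remote tags, newest version first. --refs hides the peeled "^{}" entries of annotated tags.
# Each line is "<sha>\trefs/tags/<tag>", so the tag name is the third "/"-separated field.
LATEST_TAG=$(git ls-remote --tags --refs --sort=-v:refname "$REPO_URL" | head -n1 | cut --delimiter='/' --fields=3)

# Install yohane from the latest tagged release.
pip3 install "git+$REPO_URL@$LATEST_TAG"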
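
The install cell picks the newest tag by version-sorting the output of `git ls-remote --tags`. As a rough illustration of that selection step, here is a minimal Python sketch. The function name `latest_tag` and the sample tags are made up for the example and are not part of yohane:

```python
import re


def latest_tag(ls_remote_output: str) -> str:
    """Return the highest version tag found in `git ls-remote --tags` output."""
    tags = set()
    for line in ls_remote_output.splitlines():
        # Each line looks like "<sha>\trefs/tags/<name>". Annotated tags also
        # appear a second time with a "^{}" suffix, which is stripped here.
        _, _, ref = line.partition("\t")
        if ref.startswith("refs/tags/"):
            tags.add(ref[len("refs/tags/"):].removesuffix("^{}"))

    def version_key(tag: str) -> list[int]:
        # Compare numeric components as integers so that v0.10.0 > v0.9.0.
        return [int(n) for n in re.findall(r"\d+", tag)]

    return max(tags, key=version_key)


# Hypothetical sample output, for illustration only.
sample = (
    "1111111\trefs/tags/v0.9.0\n"
    "2222222\trefs/tags/v0.10.0\n"
    "3333333\trefs/tags/v0.10.0^{}\n"
    "4444444\trefs/tags/v0.2.1\n"
)
print(latest_tag(sample))  # v0.10.0
```

Sorting by integer components rather than by plain string comparison is what makes `v0.10.0` rank above `v0.9.0`.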


Restart the runtime if prompted.

In [None]:
!pip3 show yohane


## Parameters

The next cells will set the parameters for the yohane pipeline.

In [None]:
# @title Song { display-mode: "form" }
# @markdown Run this cell and use the form below to **upload your song**.
#
# @markdown It can be either an audio or video file.
#
# @markdown **If the upload fails, try another browser or upload your file manually in the Files section of the left sidebar.**

from google.colab import files

files.upload_file("song")


In [None]:
# @title Lyrics { display-mode: "form", run: "auto" }
# @markdown Run this cell and **paste your lyrics** in the box below.

from IPython.display import display
from ipywidgets import Layout, Textarea

lyrics_area = Textarea(layout=Layout(width="100%", height="200px"))
display(lyrics_area)


In [None]:
# @title Vocals Extractor { display-mode: "form", run: "auto" }
# @markdown Run this cell and select the desired **Vocals Extractor**:
# @markdown - `VocalRemoverVocalsExtractor` is based on the [`vocal-remover`](https://github.com/tsurumeso/vocal-remover) library. Take this one if you don't know what to choose.
# @markdown - `HybridDemucsVocalsExtractor` uses `torchaudio`'s [Hybrid Demucs model](https://pytorch.org/audio/2.1.0/tutorials/hybrid_demucs_tutorial.html), which is faster but less aggressive.
# @markdown - `None` skips the vocals extraction step, if you don't care about it.

from yohane.audio import VocalRemoverVocalsExtractor, HybridDemucsVocalsExtractor

vocals_extractor_class = VocalRemoverVocalsExtractor # @param ["VocalRemoverVocalsExtractor", "HybridDemucsVocalsExtractor", "None"] {type:"raw"}
vocals_extractor = vocals_extractor_class() if vocals_extractor_class is not None else None


## Run

When ready, execute the next cells to run the pipeline.
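
Before running the pipeline, it can be worth checking that the song file and the lyrics are actually in place, since a missing upload would otherwise only surface partway through. The following is a minimal sketch of such a check. `check_inputs` is a hypothetical helper written for this example, not part of yohane:

```python
from pathlib import Path


def check_inputs(song_path, lyrics: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    path = Path(song_path)
    if not path.is_file():
        problems.append(f"song file not found: {path}")
    elif path.stat().st_size == 0:
        problems.append(f"song file is empty: {path}")
    if not lyrics.strip():
        problems.append("lyrics are empty")
    return problems


# Example with a path that does not exist and no lyrics.
print(check_inputs("no-such-song-file", ""))
```

In this notebook it could be called as `check_inputs("song", lyrics_area.value)`, assuming the song was uploaded under the default name `song`.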


In [None]:
# @title Generate
# @markdown **Replace the song filename here if you uploaded it manually**

import logging
from pathlib import Path
from yohane import Yohane

logging.basicConfig(level="INFO", force=True)

song_filename = "song" # @param {type:"string"}

yohane = Yohane(vocals_extractor)
yohane.load_song(Path(song_filename))
yohane.load_lyrics(lyrics_area.value)
yohane.extract_vocals()
yohane.force_align()
subs = yohane.make_subs()


In [None]:
# @title Save and download

from google.colab import files

subs.save("karaoke.ass")
files.download("karaoke.ass")


The karaoke file should now have been downloaded. If not, open Files in the left sidebar and look for `karaoke.ass`.

**Next recommended steps in Aegisub:**

1. Load the .ass and the video
2. Replace the _Default_ style with your own
3. Because the lyrics are normalized during processing, the timed lines are lowercased and stripped of special characters: use the original lines, kept as comments, to fix the timed lines
4. Subtitle > Select Lines… > check _Comments_ and _Set selection_ > OK, then delete the selected lines
5. Listen to each line and fix its End time
6. Iterate over each line in karaoke mode and merge/fix syllable timings
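
To give an idea of why step 3 is needed, here is a minimal sketch of the kind of normalization that lowercases a line and removes special characters. It is an assumption made for illustration only: yohane's actual rules and kept character set may differ, and `normalize_line` is a hypothetical name.

```python
import re
import unicodedata


def normalize_line(line: str) -> str:
    # Decompose accented characters, then drop the combining marks ("é" -> "e").
    decomposed = unicodedata.normalize("NFKD", line)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    lowered = no_accents.lower()
    # Keep only ASCII letters, apostrophes and spaces (an assumed character set).
    kept = re.sub(r"[^a-z' ]", " ", lowered)
    # Collapse the runs of whitespace left behind by removed characters.
    return " ".join(kept.split())


print(normalize_line("Kimi no KOKORO wa, kagayaiteru kai?"))
# kimi no kokoro wa kagayaiteru kai
```

Capitalization and punctuation are lost in a transformation like this, which is why the original lines are needed to restore them.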

**Happy editing!**

![](https://hikari.butaishoujo.moe/p/bd9b7a37/genjitsunoyohane-ep1-scr01.jpg)
