<a href="https://colab.research.google.com/github/milimilim0406/colab-voice/blob/main/run_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# VoxUnravel - Google Colab Setup

This notebook runs the **VoxUnravel** project in the Google Colab environment.\
Before executing any cell, select **Runtime -> Change runtime type -> Hardware accelerator: GPU** from the menu bar at the top of the page.
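
Before running the remaining cells, it may help to confirm that a GPU runtime is actually attached. The sketch below is not part of VoxUnravel. It assumes the NVIDIA driver exposes `nvidia-smi` on the PATH, as Colab GPU runtimes normally do, and `gpu_runtime_attached` is a hypothetical helper name.

```python
import shutil
import subprocess


def gpu_runtime_attached() -> bool:
    """Return True if `nvidia-smi` exists and reports at least one GPU."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return False  # No NVIDIA tooling on PATH, so most likely a CPU runtime
    # `nvidia-smi -L` prints one line per visible GPU and exits non-zero on failure
    result = subprocess.run([smi, "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime attached:", gpu_runtime_attached())
```

If this prints `False`, change the runtime type as described above and rerun the notebook from the first cell.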

In [3]:
import os

# 1. Clone GitHub Repository
GITHUB_URL = 'https://github.com/N01N9/VoxUnravel.git'
PROJECT_DIR = '/content/VoxUnravel'

if not os.path.exists(PROJECT_DIR):
    !git clone {GITHUB_URL}
else:
    print("Already cloned! Pulling latest changes...")
    %cd {PROJECT_DIR}
    !git pull origin main

# Move to the project directory
%cd {PROJECT_DIR}

Cloning into 'VoxUnravel'...
remote: Enumerating objects: 629, done.
remote: Counting objects: 100% (629/629), done.
remote: Compressing objects: 100% (334/334), done.
remote: Total 629 (delta 284), reused 563 (delta 234), pack-reused 0 (from 0)
Receiving objects: 100% (629/629), 29.78 MiB | 25.18 MiB/s, done.
Resolving deltas: 100% (284/284), done.
/content/VoxUnravel


## Create 4 Independent Virtual Environments and Install Required Packages

This step runs the `install_colab.sh` script, which creates the four virtual environments the project needs (`main`, `sep`, `dia`, `asr`).\
The script creates each environment with the `python3.11 -m venv` command and installs the corresponding requirements into it.

*The installation may take around 5-10 minutes.*
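
As a rough illustration of what such a setup involves, the sketch below only builds and prints the commands, without running them. It is not the content of `install_colab.sh`: the `/content/.venv_<name>` directories follow the interpreter paths this notebook reports later, while the `requirements_<name>.txt` file names and the helper `build_setup_commands` are assumptions made for the example.

```python
from pathlib import Path

ENV_NAMES = ["main", "sep", "dia", "asr"]


def build_setup_commands(base_dir: str = "/content") -> list[list[str]]:
    """Return the commands that would create each venv and install its requirements."""
    commands = []
    for name in ENV_NAMES:
        venv_dir = Path(base_dir) / f".venv_{name}"
        # Assumed file name; check install_colab.sh for the real one
        requirements = f"requirements_{name}.txt"
        commands.append(["python3.11", "-m", "venv", str(venv_dir)])
        commands.append([str(venv_dir / "bin" / "pip"), "install", "-r", requirements])
    return commands


# Dry run: show the commands instead of executing them
for cmd in build_setup_commands():
    print(" ".join(cmd))
```

Keeping one environment per subsystem lets each one pin its own dependency versions without conflicting with the others.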

In [4]:
import os

# Ensure we are in the project directory
%cd /content/VoxUnravel

# Convert line endings from CRLF to LF to fix 'bad interpreter' error
!sed -i 's/\r$//' install_colab.sh

# Grant execution permissions to the script
!chmod +x install_colab.sh

# Run the installation script
!./install_colab.sh

/content/VoxUnravel
VoxUnravel Colab All-in-One Setup
Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Get:2 https://cli.github.com/packages stable InRelease [3,917 B]
Get:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease [1,581 B]
Get:4 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ Packages [85.2 kB]
Get:5 https://cli.github.com/packages stable/main amd64 Packages [357 B]
Hit:6 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:7 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Get:8 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  Packages [2,383 kB]
Get:10 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:11 https://r2u.stat.illinois.edu/ubuntu jammy/main all Packages [9,785 kB]
Get:12 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease [18.1 k

## Verify the Created Environments

Once the installation script has completed successfully, check that the `environments.json` file contains the Python interpreter path of each virtual environment. The main program uses these paths to integrate with each subsystem.

In [5]:
import json
import os

env_path = 'data/environments.json'
if os.path.exists(env_path):
    with open(env_path, 'r') as f:
        envs = json.load(f)
    print("✅ Virtual environment paths verified (Total of 4 or more mappings):")
    for k, v in envs.items():
        print(f"  - {k}: {v}")
else:
    print("❌ environments.json file not found. Please check the installation logs in the cell above.")

✅ Virtual environment paths verified (Total of 4 or more mappings):
  - main: /content/.venv_main/bin/python
  - inference: /content/.venv_sep/bin/python
  - separation: /content/.venv_sep/bin/python
  - diarization: /content/.venv_dia/bin/python
  - asr: /content/.venv_asr/bin/python
  - pipeline: /content/.venv_main/bin/python
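
The cell above only prints the mapping. A slightly stricter check, sketched here with a hypothetical helper `find_missing_interpreters`, would also confirm that every mapped interpreter exists on disk. It assumes `environments.json` is a flat `{name: path}` object, as the output above suggests.

```python
import json
import os
import tempfile


def find_missing_interpreters(env_path: str) -> dict:
    """Return the {name: path} entries whose interpreter file does not exist."""
    with open(env_path, "r", encoding="utf-8") as f:
        envs = json.load(f)
    return {name: path for name, path in envs.items() if not os.path.isfile(path)}


# Self-contained demo on a throwaway file; in the notebook, pass 'data/environments.json'
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "environments.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"main": os.path.join(tmp, "no_such_python")}, f)
    print(find_missing_interpreters(demo_path))
```

An empty dictionary means every listed interpreter was found.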


## Launch Web GUI (Gradio)

Run the cell below to launch the Gradio interface. It prints a public link (e.g., `https://xxxx.gradio.live`) that opens the web UI in your browser.
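
The public link appears among many log lines and can be wrapped in terminal color codes. If you capture the launcher's output as text, a small parser like the one below can pull the link out. This is a sketch under assumptions: `extract_public_url` is a hypothetical helper, and the pattern relies on the `Running on public URL:` wording that Gradio printed in this notebook's runs.

```python
import re

# Matches ANSI terminal escape sequences such as "\x1b[0m"
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
PUBLIC_URL = re.compile(r"Running on public URL:\s*(https://\S+\.gradio\.live)")


def extract_public_url(log_text: str):
    """Return the first gradio.live URL found in the log text, or None."""
    clean = ANSI_ESCAPE.sub("", log_text)  # drop terminal color codes first
    match = PUBLIC_URL.search(clean)
    return match.group(1) if match else None


sample = (
    "* Running on local URL:  http://127.0.0.1:7860\n"
    "* Running on public URL: https://xxxx.gradio.live\x1b[0m\n"
)
print(extract_public_url(sample))
```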

In [10]:
# Launch Gradio Web GUI
!/content/.venv_main/bin/python gradio_app.py

2026-02-26 08:51:30,187 - INFO - HTTP Request: HEAD https://huggingface.co/api/telemetry/https%3A/api.gradio.app/gradio-initiated-analytics "HTTP/1.1 200 OK"
2026-02-26 08:51:30,218 - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
* Running on local URL:  http://127.0.0.1:7860
2026-02-26 08:51:30,454 - INFO - HTTP Request: GET http://127.0.0.1:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2026-02-26 08:51:30,488 - INFO - HTTP Request: HEAD http://127.0.0.1:7860/ "HTTP/1.1 200 OK"
2026-02-26 08:51:30,708 - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
* Running on public URL: https://464fb85f8846a95b2b.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
2026-02-26 08:51:30,972 - INFO - HTTP Request: HEAD https://huggingface.co/api

In [13]:
import os
import glob
from google.colab import files

# 1. Path Setup
os.chdir("/content/VoxUnravel")
wav_dir = "tts_dataset_output/diarization/Speaker_1"
asr_dir = "tts_dataset_output/asr_results/Speaker_1"
os.makedirs(asr_dir, exist_ok=True)

# 2. Create Repair Worker Script
print("Step 1: Creating repair script...")
worker_code = """
import whisper
import os
import glob

print("Loading Whisper Medium model...")
model = whisper.load_model("medium")
wav_files = glob.glob("tts_dataset_output/diarization/Speaker_1/*.wav")

for wav in wav_files:
    txt_name = os.path.basename(wav).replace(".wav", ".txt")
    txt_path = os.path.join("tts_dataset_output/asr_results/Speaker_1", txt_name)

    if not os.path.exists(txt_path):
        print("Processing: " + os.path.basename(wav))
        result = model.transcribe(wav, language="ko")
        with open(txt_path, "w", encoding="utf-8") as tf:
            tf.write(result["text"].strip())
"""
with open("repair_worker.py", "w", encoding="utf-8") as f:
    f.write(worker_code)

# 3. Execute Repair
print("Step 2: Running Whisper ASR...")
os.system("/content/.venv_asr/bin/python repair_worker.py")

# 4. Generate .list file
print("Step 3: Generating list file...")
all_lines = []
wav_files = glob.glob(os.path.join(wav_dir, "*.wav"))
for wav_path in sorted(wav_files):
    wav_name = os.path.basename(wav_path)
    txt_path = os.path.join(asr_dir, wav_name.replace(".wav", ".txt"))
    if os.path.exists(txt_path):
        with open(txt_path, "r", encoding="utf-8") as tf:
            text = tf.read().strip()
        line = "dataset/rua/" + wav_name + "|rua|ko|" + text
        all_lines.append(line)

with open("tts_dataset_output/rua_heaven.list", "w", encoding="utf-8") as f:
    f.write("\n".join(all_lines))

# 5. Zip and Download
print("Step 4: Archiving results...")
os.system("zip -j rua_heaven_complete.zip tts_dataset_output/diarization/Speaker_1/*.wav tts_dataset_output/rua_heaven.list")

print("Done! Starting download...")
files.download('rua_heaven_complete.zip')


Step 1: Creating repair script...
Step 2: Running Whisper ASR...
Step 3: Generating list file...
Step 4: Archiving results...
Done! Starting download...
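
Each line written to `rua_heaven.list` in the cell above has four `|`-separated fields: audio path, speaker name, language code, and transcript. A transcript that itself contains a `|` or a line break would corrupt that layout. The sketch below shows one way to guard against it. `make_list_line` is a hypothetical helper, and deriving the `dataset/<speaker>/` prefix from the speaker name is an assumption generalized from the hard-coded `dataset/rua/` path.

```python
def make_list_line(wav_name: str, text: str, speaker: str = "rua", lang: str = "ko") -> str:
    """Build one 'path|speaker|lang|text' line, keeping the transcript on a single line."""
    # Replace '|' so it cannot be read as a field separator, then collapse
    # all whitespace (including newlines) into single spaces
    clean_text = " ".join(text.replace("|", " ").split())
    return f"dataset/{speaker}/{wav_name}|{speaker}|{lang}|{clean_text}"


print(make_list_line("seg_0001.wav", "  hello\nworld "))
```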



In [15]:
!rm -rf tts_dataset_output

In [None]:
%cd /content/VoxUnravel
!/content/.venv_main/bin/python gradio_app.py --share

/content/VoxUnravel
2026-02-26 11:32:15,440 - INFO - HTTP Request: HEAD https://huggingface.co/api/telemetry/https%3A/api.gradio.app/gradio-initiated-analytics "HTTP/1.1 200 OK"
2026-02-26 11:32:15,486 - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
* Running on local URL:  http://127.0.0.1:7860
2026-02-26 11:32:15,995 - INFO - HTTP Request: GET http://127.0.0.1:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2026-02-26 11:32:16,041 - INFO - HTTP Request: HEAD http://127.0.0.1:7860/ "HTTP/1.1 200 OK"
2026-02-26 11:32:16,286 - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
* Running on public URL: https://d3fdfbff1e41312193.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
2026-02-26 11:32:16,544 - INFO - HTTP Request: HEAD https:

In [18]:
# 2. Roony emergency recovery script (uses the Medium model for stability)
worker_code = """
import whisper
import os
import glob

# Roony was detected as Speaker_1 this time, so the paths are adjusted accordingly
target_speaker = "Speaker_1"
asr_out_dir = f"tts_dataset_output/asr_results/{target_speaker}"
diar_dir = f"tts_dataset_output/diarization/{target_speaker}"

os.makedirs(asr_out_dir, exist_ok=True)

print(f"Loading Whisper Medium model for Roony ({target_speaker})...")
model = whisper.load_model("medium")
wav_files = glob.glob(f"{diar_dir}/*.wav")

for wav in sorted(wav_files):
    txt_name = os.path.basename(wav).replace(".wav", ".txt")
    txt_path = os.path.join(asr_out_dir, txt_name)

    if not os.path.exists(txt_path):
        print(f"Processing Roony (Medium): {os.path.basename(wav)}")
        # Transcribe reliably with Whisper medium
        result = model.transcribe(wav, language="ko")
        with open(txt_path, "w", encoding="utf-8") as tf:
            tf.write(result["text"].strip())
    else:
        print(f"Skipping (Already exists): {os.path.basename(wav)}")
"""

with open("colab_worker.py", "w", encoding="utf-8") as f:
    f.write(worker_code)

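import subprocess
print("Starting Roony's Recovery Process...")
# Use the ASR venv interpreter, as in the earlier repair cell; the plain "python" may not have whisper installed.
# check=True raises an error if the worker fails instead of silently continuing.
subprocess.run(["/content/.venv_asr/bin/python", "colab_worker.py"], check=True)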
print("Roony's ASR is FINISHED! Proceed to next cell.")


Starting Roony's Recovery Process...
Roony's ASR is FINISHED! Proceed to next cell.


In [9]:
import os
import shutil

target_dir = "/content/VoxUnravel/input"
os.makedirs(target_dir, exist_ok=True)

print("🚀 Checking file locations...")
# Search /content for the raw voice files and move them into the input folder
for root, dirs, filenames in os.walk('/content'):
    for filename in filenames:
        if filename in ['rua_raw_voice.wav', 'roony_raw_voice.wav']:
            current_path = os.path.join(root, filename)
            destination_path = os.path.join(target_dir, filename)
            if current_path != destination_path:
                shutil.move(current_path, destination_path)
                print(f"✅ Moved {filename}: {destination_path}")
print("\n📂 Current contents of the input folder:")
!ls -lh /content/VoxUnravel/input/

print("\n🚀 Restarting the web UI...")
%cd /content/VoxUnravel
!/content/.venv_main/bin/python gradio_app.py --share

🚀 Checking file locations...

📂 Current contents of the input folder:
total 0

🚀 Restarting the web UI...
/content/VoxUnravel
2026-02-26 07:27:30,081 - INFO - HTTP Request: HEAD https://huggingface.co/api/telemetry/https%3A/api.gradio.app/gradio-initiated-analytics "HTTP/1.1 200 OK"
2026-02-26 07:27:30,099 - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
* Running on local URL:  http://127.0.0.1:7860
2026-02-26 07:27:30,540 - INFO - HTTP Request: GET http://127.0.0.1:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2026-02-26 07:27:30,576 - INFO - HTTP Request: HEAD http://127.0.0.1:7860/ "HTTP/1.1 200 OK"
2026-02-26 07:27:30,804 - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
* Running on public URL: https://6bc533fdaa5deb93d1.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)