# 🎧 TheLostChapter CMS - All-in-One

An audiobook content manager with:

| Feature | Description |
|---------|-------------|
| 🎙️ **Voice Recorder** | Record directly in the browser, with noise filtering and volume adjustment |
| 🧑‍💻 **Voice Cloning** | Clone a voice from a sample file (viXTTS) |
| 📝 **Content Editor** | View and edit book content |
| 🔊 **Audio Generation** | Generate audio from a cloned voice, or record it directly |
| 🚀 **GitHub Push** | Save changes to GitHub |

## ⚡ Quick Start
1. **Runtime → Change runtime type → T4 GPU** (needed for voice cloning)
2. Add `GITHUB_TOKEN` to Colab Secrets (🔑 in the sidebar)
3. **Run All** (Ctrl+F9)
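
Step 2 matters because the setup cell reads the token from Colab Secrets first and only prompts when that lookup fails. A minimal sketch of such a fallback order, assuming a hypothetical `resolve_token` helper with an injectable `secret_getter` (standing in for `google.colab.userdata.get`) so that it also runs outside Colab. The environment-variable fallback is an assumption of this sketch, not something the notebook does:

```python
import os
from typing import Callable, Optional

def resolve_token(secret_getter: Callable[[str], Optional[str]],
                  name: str = "GITHUB_TOKEN") -> Optional[str]:
    """Return the token from the secret store, else from the environment.

    `secret_getter` is a stand-in for a secrets lookup; any exception it
    raises (missing secret, access not granted) is treated as "not found".
    """
    try:
        token = secret_getter(name)
    except Exception:
        token = None
    # Assumption: fall back to an environment variable of the same name
    return token or os.environ.get(name)
```

In Colab the call could look like `resolve_token(userdata.get)`; a `None` result would then be the signal to prompt the user.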

In [None]:
#@title 📦 Setup Environment { display-mode: "form" }
#@markdown ### Set up the environment and clone the repo

GITHUB_USERNAME = "nmnhut-it" #@param {type:"string"}
REPO_NAME = "english-learning-app" #@param {type:"string"}
BRANCH = "claude/audio-book-app-8dJZq" #@param {type:"string"}

import subprocess, sys, os

# ========== Install Dependencies ==========
print("="*50)
print("📦 Installing dependencies...")
print("="*50)

subprocess.run([sys.executable, "-m", "pip", "install", "-q", 
                "coqui-tts", "torchcodec", "soundfile", "pydub", 
                "noisereduce", "ipywidgets"], check=True)

import torch, json, re, numpy as np, soundfile as sf
from pathlib import Path
from datetime import datetime
from pydub import AudioSegment
from google.colab import userdata, files, output
from IPython.display import Audio, display, HTML, Javascript
import ipywidgets as widgets
import noisereduce as nr

HAS_GPU = torch.cuda.is_available()
print(f"\n✅ Installed! GPU: {torch.cuda.get_device_name() if HAS_GPU else 'None (record-only mode)'}")

# ========== Clone Repo ==========
print("\n" + "="*50)
print("📥 Cloning repository...")
print("="*50)

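try:
    GITHUB_TOKEN = userdata.get('GITHUB_TOKEN')
except Exception:
    # Secret missing or notebook access not granted: prompt without echoing the token
    from getpass import getpass
    GITHUB_TOKEN = getpass("Enter GitHub token: ")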

REPO_URL = f"https://{GITHUB_USERNAME}:{GITHUB_TOKEN}@github.com/{GITHUB_USERNAME}/{REPO_NAME}.git"
REPO_DIR = Path(f"/content/{REPO_NAME}")

if REPO_DIR.exists():
    os.chdir(REPO_DIR)
    subprocess.run(["git", "pull", "origin", BRANCH], check=True)
else:
    subprocess.run(["git", "clone", "--depth", "1", "-b", BRANCH, REPO_URL, str(REPO_DIR)], check=True)

os.chdir(REPO_DIR)
subprocess.run(["git", "config", "user.email", "colab@thelostchapter.app"])
subprocess.run(["git", "config", "user.name", "TheLostChapter CMS"])

# Setup paths
TLC_DIR = REPO_DIR / "the-lost-chapter"
VOICES_DIR = TLC_DIR / "voices"
CONTENT_DIR = TLC_DIR / "content" / "books"
VOICES_DIR.mkdir(parents=True, exist_ok=True)

print(f"✅ Repository ready at: {REPO_DIR}")

# ========== Store Globals ==========
# These will be populated by subsequent cells
model = None
voice_profiles = {"vi": None, "en": None}

print("\n" + "="*50)
print("🎉 Setup complete! Choose your workflow below.")
print("="*50)

---
## 🎤 OPTION A: Record Voice Directly

Record audio directly in the browser; no GPU required.
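
The recorder cell exposes a linear `VOLUME_BOOST` multiplier but applies it as a gain in decibels. A small sketch of that conversion, using a hypothetical `boost_to_db` helper (the name and the input check are assumptions of this sketch):

```python
import math

def boost_to_db(factor: float) -> float:
    """Convert a linear amplitude multiplier to a gain in decibels."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 20 * math.log10(factor)

# Print the gain for the two ends of the slider and its default value
for factor in (0.5, 1.0, 2.0):
    print(f"{factor}x -> {boost_to_db(factor):+.1f} dB")
```

Doubling the amplitude corresponds to roughly +6 dB and halving it to roughly -6 dB, so the slider range of 0.5 to 2.0 spans about 12 dB.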

In [None]:
#@title 🎙️ Voice Recorder { display-mode: "form" }
#@markdown ### Record your voice directly from the browser
#@markdown 
#@markdown **Tips:**
#@markdown - Speak clearly in a quiet environment
#@markdown - Noise filtering is applied automatically
#@markdown - After recording, the file is saved temporarily (push to GitHub to keep it)

NOISE_REDUCTION = 0.5 #@param {type:"slider", min:0, max:1, step:0.1}
VOLUME_BOOST = 1.0 #@param {type:"slider", min:0.5, max:2.0, step:0.1}

# HTML/JS for recording
RECORDER_HTML = '''
<div id="recorder-container" style="padding: 20px; background: #1e1e1e; border-radius: 10px; color: white; font-family: sans-serif;">
    <h3 style="margin-top: 0;">🎙️ Voice Recorder</h3>
    
    <div style="margin: 20px 0;">
        <label style="display: block; margin-bottom: 10px;">Record for:</label>
        <select id="rec-target" style="padding: 10px; font-size: 16px; border-radius: 5px; width: 200px;">
            <option value="chapter">Chapter Audio</option>
            <option value="voice-vi">Voice Sample (VI)</option>
            <option value="voice-en">Voice Sample (EN)</option>
        </select>
    </div>
    
    <div style="margin: 20px 0;">
        <label style="display: block; margin-bottom: 10px;">Chapter ID (if recording chapter):</label>
        <input type="text" id="rec-chapter-id" value="ch01" style="padding: 10px; font-size: 16px; border-radius: 5px; width: 180px;">
        <select id="rec-lang" style="padding: 10px; font-size: 16px; border-radius: 5px; margin-left: 10px;">
            <option value="vi">Vietnamese</option>
            <option value="en">English</option>
        </select>
    </div>
    
    <div style="display: flex; gap: 10px; margin: 20px 0;">
        <button id="btn-record" onclick="toggleRecording()" 
                style="padding: 15px 30px; font-size: 18px; background: #e53935; color: white; border: none; border-radius: 50px; cursor: pointer;">
            ⏺ Start Recording
        </button>
        <button id="btn-play" onclick="playRecording()" disabled
                style="padding: 15px 30px; font-size: 18px; background: #444; color: white; border: none; border-radius: 50px; cursor: pointer;">
            ▶ Play
        </button>
        <button id="btn-save" onclick="saveRecording()" disabled
                style="padding: 15px 30px; font-size: 18px; background: #43a047; color: white; border: none; border-radius: 50px; cursor: pointer;">
            💾 Save
        </button>
    </div>
    
    <div id="rec-status" style="margin: 15px 0; padding: 10px; background: #333; border-radius: 5px;">
        Ready to record
    </div>
    
    <div id="rec-timer" style="font-size: 48px; font-weight: bold; text-align: center; margin: 20px 0;">
        00:00
    </div>
    
    <audio id="rec-audio" controls style="width: 100%; margin-top: 10px; display: none;"></audio>
</div>

<script>
let mediaRecorder = null;
let audioChunks = [];
let recordingBlob = null;
let timerInterval = null;
let startTime = null;

async function toggleRecording() {
    const btn = document.getElementById("btn-record");
    const status = document.getElementById("rec-status");
    
    if (mediaRecorder && mediaRecorder.state === "recording") {
        // Stop recording
        mediaRecorder.stop();
        btn.innerHTML = "⏺ Start Recording";
        btn.style.background = "#e53935";
        status.innerHTML = "Processing...";
        clearInterval(timerInterval);
    } else {
        // Start recording
        try {
            const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
            mediaRecorder = new MediaRecorder(stream, { mimeType: "audio/webm" });
            audioChunks = [];
            
            mediaRecorder.ondataavailable = (e) => {
                audioChunks.push(e.data);
            };
            
            mediaRecorder.onstop = () => {
                recordingBlob = new Blob(audioChunks, { type: "audio/webm" });
                const audioUrl = URL.createObjectURL(recordingBlob);
                const audio = document.getElementById("rec-audio");
                audio.src = audioUrl;
                audio.style.display = "block";
                document.getElementById("btn-play").disabled = false;
                document.getElementById("btn-save").disabled = false;
                status.innerHTML = "✅ Recording complete! Click Save to process.";
                stream.getTracks().forEach(track => track.stop());
            };
            
            mediaRecorder.start();
            btn.innerHTML = "⏹ Stop Recording";
            btn.style.background = "#f44336";
            status.innerHTML = "🔴 Recording...";
            
            // Timer
            startTime = Date.now();
            timerInterval = setInterval(() => {
                const elapsed = Math.floor((Date.now() - startTime) / 1000);
                const mins = String(Math.floor(elapsed / 60)).padStart(2, "0");
                const secs = String(elapsed % 60).padStart(2, "0");
                document.getElementById("rec-timer").innerHTML = `${mins}:${secs}`;
            }, 1000);
        } catch (err) {
            status.innerHTML = "❌ Error: " + err.message;
        }
    }
}

function playRecording() {
    document.getElementById("rec-audio").play();
}

async function saveRecording() {
    if (!recordingBlob) return;
    
    const status = document.getElementById("rec-status");
    status.innerHTML = "💾 Saving...";
    
    const target = document.getElementById("rec-target").value;
    const chapterId = document.getElementById("rec-chapter-id").value;
    const lang = document.getElementById("rec-lang").value;
    
    // Convert blob to base64
    const reader = new FileReader();
    reader.readAsDataURL(recordingBlob);
    reader.onloadend = () => {
        const base64 = reader.result.split(",")[1];
        google.colab.kernel.invokeFunction("notebook.save_recording", [base64, target, chapterId, lang], {});
    };
}
</script>
'''

# Python callback for saving
def save_recording(base64_audio, target, chapter_id, lang):
    import base64
    from pydub import AudioSegment
    import noisereduce as nr
    import soundfile as sf
    import io
    
    print(f"\n💾 Processing recording...")
    print(f"   Target: {target}")
    print(f"   Chapter: {chapter_id}")
    print(f"   Language: {lang}")
    
    # Decode base64
    audio_bytes = base64.b64decode(base64_audio)
    
    # Save temp webm
    temp_webm = "/content/temp_recording.webm"
    with open(temp_webm, "wb") as f:
        f.write(audio_bytes)
    
    # Convert to wav
    audio = AudioSegment.from_file(temp_webm, format="webm")
    audio = audio.set_frame_rate(24000).set_channels(1)
    
    # Apply volume boost
    if VOLUME_BOOST != 1.0:
        gain_db = 20 * np.log10(VOLUME_BOOST)
        audio = audio + gain_db
        print(f"   Volume boost: {VOLUME_BOOST}x ({gain_db:.1f} dB)")
    
    # Export to numpy for noise reduction
    temp_wav = "/content/temp_recording.wav"
    audio.export(temp_wav, format="wav")
    
    # Apply noise reduction
    if NOISE_REDUCTION > 0:
        data, sr = sf.read(temp_wav)
        reduced = nr.reduce_noise(y=data, sr=sr, prop_decrease=NOISE_REDUCTION)
        sf.write(temp_wav, reduced, sr)
        print(f"   Noise reduction: {NOISE_REDUCTION*100:.0f}%")
    
    # Determine output path
    if target == "voice-vi":
        output_path = VOICES_DIR / "my-voice-vi.wav"
    elif target == "voice-en":
        output_path = VOICES_DIR / "my-voice-en.wav"
    else:
        # Chapter audio: default to the first book directory (sorted by name)
        books = sorted(p for p in CONTENT_DIR.iterdir() if p.is_dir()) if CONTENT_DIR.exists() else []
        if books:
            book_dir = books[0]  # Default to first book
            audio_dir = book_dir / "audio"
            audio_dir.mkdir(exist_ok=True)
            output_path = audio_dir / f"{chapter_id}-{lang}.wav"
        else:
            output_path = Path(f"/content/{chapter_id}-{lang}.wav")
    
    # Copy to final location
    import shutil
    shutil.copy(temp_wav, output_path)
    
    duration = len(AudioSegment.from_wav(str(output_path))) / 1000
    print(f"\n✅ Saved: {output_path}")
    print(f"   Duration: {duration:.1f}s")
    print(f"   Size: {output_path.stat().st_size/1024:.1f} KB")
    
    return str(output_path)

# Register callback
output.register_callback('notebook.save_recording', save_recording)

# Display recorder
display(HTML(RECORDER_HTML))

---
## 🧑‍💻 OPTION B: Voice Cloning (GPU Required)

Clone a voice from a sample file to generate audio automatically.
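
Before Vietnamese text reaches the model, the loader cell swaps in a simpler preprocessing step: it drops double quotes and collapses runs of whitespace. A standalone sketch of that normalization, with `normalize_vi` as a hypothetical name for illustration:

```python
import re

def normalize_vi(text: str) -> str:
    """Strip double quotes and collapse whitespace in Vietnamese text."""
    text = text.replace('"', '')
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

print(normalize_vi('  Xin   chào,\n "thế giới"!  '))
```

Diacritics pass through untouched; only quotes and spacing change.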

In [None]:
#@title 🧠 Load viXTTS Model { display-mode: "form" }
#@markdown ### Load the voice cloning model (requires a GPU)

if not HAS_GPU:
    print("❌ No GPU available! Voice cloning requires a GPU.")
    print("💡 Tip: Runtime → Change runtime type → T4 GPU")
    print("\nOr use the Voice Recorder in Option A to record directly.")
else:
    print("="*50)
    print("🚀 Loading viXTTS model...")
    print("="*50)
    
    from huggingface_hub import hf_hub_download
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts
    from TTS.tts.layers.xtts import tokenizer as xtts_tokenizer
    
    # Patch for Vietnamese
    _orig_preprocess = xtts_tokenizer.VoiceBpeTokenizer.preprocess_text
    def _patched(self, txt, lang):
        if lang == "vi":
            txt = txt.replace('"', '')
            txt = re.sub(r'\s+', ' ', txt)
            return txt.strip()
        return _orig_preprocess(self, txt, lang)
    xtts_tokenizer.VoiceBpeTokenizer.preprocess_text = _patched
    
    MODEL_DIR = Path("/content/models/vixtts")
    MODEL_DIR.mkdir(parents=True, exist_ok=True)
    
    for f in ["config.json", "model.pth", "vocab.json"]:
        if not (MODEL_DIR / f).exists():
            print(f"  Downloading {f}...")
            hf_hub_download(repo_id="capleaf/viXTTS", filename=f, local_dir=str(MODEL_DIR))
        else:
            print(f"  ✓ {f} (cached)")
    
    config = XttsConfig()
    config.load_json(str(MODEL_DIR / "config.json"))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_path=str(MODEL_DIR / "model.pth"),
                          vocab_path=str(MODEL_DIR / "vocab.json"))
    model.cuda()
    print(f"\n✅ Model loaded on GPU!")

In [None]:
#@title 🎵 Clone Voice from Samples { display-mode: "form" }
#@markdown ### Clone a voice from sample files

if model is None:
    print("❌ Model not loaded! Run the 'Load viXTTS Model' cell first.")
else:
    print("="*50)
    print("🎵 Scanning voice samples...")
    print("="*50)
    
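    # Find samples: audio files only (skip saved .pt profiles), de-duplicated
    # because the glob patterns overlap (e.g. "my-voice-vi.wav" matches two of them)
    AUDIO_EXTS = {'.wav', '.mp3', '.m4a', '.webm', '.ogg', '.flac'}
    def find_samples(lang):
        patterns = (f'*-{lang}.*', f'*_{lang}.*', f'*{lang}.wav')
        matches = {p for pat in patterns for p in VOICES_DIR.glob(pat)}
        return sorted(p for p in matches if p.suffix.lower() in AUDIO_EXTS)
    vi_samples = find_samples('vi')
    en_samples = find_samples('en')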
    
    print(f"\n🎵 Voice samples:")
    print(f"   VI: {[f.name for f in vi_samples] if vi_samples else '❌ MISSING'}")
    print(f"   EN: {[f.name for f in en_samples] if en_samples else '❌ MISSING'}")
    
    def convert_to_wav(input_file, output_name="speaker"):
        wav_path = f"/content/{output_name}.wav"
        ext = Path(input_file).suffix.lower()
        
        if ext == '.m4a':
            audio = AudioSegment.from_file(str(input_file), format='m4a')
        elif ext == '.mp3':
            audio = AudioSegment.from_mp3(str(input_file))
        elif ext == '.wav':
            audio = AudioSegment.from_wav(str(input_file))
        else:
            audio = AudioSegment.from_file(str(input_file))
        
        audio = audio.set_frame_rate(22050).set_channels(1)
        audio.export(wav_path, format="wav")
        duration = len(audio) / 1000
        print(f"  ✓ Converted {ext} → wav ({duration:.1f}s)")
        return wav_path, duration
    
    def clone_voice(sample_file, profile_name, lang_code):
        profile_file = VOICES_DIR / f"{profile_name}.pt"
        
        if profile_file.exists():
            print(f"\n✅ {lang_code.upper()}: Profile exists ({profile_name}.pt)")
            data = torch.load(profile_file, weights_only=False)
            return data["gpt_cond_latent"].cuda(), data["speaker_embedding"].cuda()
        
        print(f"\n🧬 {lang_code.upper()}: Cloning from {sample_file.name}...")
        wav_path, duration = convert_to_wav(sample_file, f"speaker_{lang_code}")
        
        if duration < 10:
            print(f"  ⚠️ Warning: Audio is only {duration:.1f}s. Recommend 30-60s.")
        
        gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(audio_path=wav_path)
        
        torch.save({
            "gpt_cond_latent": gpt_cond_latent.cpu(),
            "speaker_embedding": speaker_embedding.cpu(),
            "source": sample_file.name,
            "language": lang_code,
            "duration": duration,
            "created": datetime.now().isoformat(),
            "model": "viXTTS"
        }, profile_file)
        
        print(f"  ✅ Saved as {profile_name}.pt")
        return gpt_cond_latent.cuda(), speaker_embedding.cuda()
    
    # Clone voices
    if vi_samples:
        voice_profiles["vi"] = clone_voice(vi_samples[0], "default-vi", "vi")
    
    if en_samples:
        voice_profiles["en"] = clone_voice(en_samples[0], "default-en", "en")
    
    print("\n" + "="*50)
    print("🎵 Voice profiles ready!")
    print("="*50)

---
## 📝 Content Editor

View and edit book content.
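
Judging from the cells in this section, each chapter is a JSON file with a `sections` list, where markdown sections carry `type`, `lang`, and `content`. A minimal sketch of editing one section outside the widget UI; `update_section` and the sample chapter are made up for illustration:

```python
import json
import tempfile
from pathlib import Path

def update_section(chapter_file: Path, index: int, new_content: str) -> dict:
    """Replace the content of one markdown section and write the file back."""
    chapter = json.loads(chapter_file.read_text(encoding="utf-8"))
    section = chapter["sections"][index]
    if section.get("type") != "markdown":
        raise ValueError(f"section {index} is not a markdown section")
    section["content"] = new_content
    # ensure_ascii=False keeps Vietnamese characters readable in the file
    chapter_file.write_text(
        json.dumps(chapter, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return chapter

# Made-up sample chapter, for illustration only
sample = {
    "title": "Sample",
    "sections": [
        {"type": "markdown", "lang": "vi", "content": "Xin chào"},
        {"type": "exercise", "id": "ex1", "exerciseType": "quiz", "question": "?"},
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "ch01.json"
    path.write_text(json.dumps(sample, ensure_ascii=False), encoding="utf-8")
    updated = update_section(path, 0, "Chào buổi sáng")
    reloaded = json.loads(path.read_text(encoding="utf-8"))
```

Exercise sections are rejected rather than edited, mirroring the editor, which shows them read-only.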

In [None]:
#@title 📚 Browse Books { display-mode: "form" }
#@markdown ### List the available books and chapters

print("="*50)
print("📚 Available Books")
print("="*50)

books_data = {}

for book_dir in sorted(CONTENT_DIR.iterdir()):
    if not book_dir.is_dir():
        continue
    
    book_json = book_dir / "book.json"
    if book_json.exists():
        with open(book_json, encoding='utf-8') as f:
            book = json.load(f)
        
        books_data[book_dir.name] = book
        
        print(f"\n📖 {book['title']}")
        print(f"   ID: {book_dir.name}")
        print(f"   Language: {book.get('language', 'vi')}")
        print(f"   Chapters: {len(book.get('chapters', []))}")
        
        for ch_id in book.get('chapters', []):
            ch_file = book_dir / "chapters" / f"{ch_id}.json"
            if ch_file.exists():
                with open(ch_file, encoding='utf-8') as f:
                    ch = json.load(f)
                print(f"      - {ch_id}: {ch.get('title', 'Untitled')}")

if not books_data:
    print("❌ No books found!")

In [None]:
#@title ✏️ Edit Chapter Content { display-mode: "form" }
#@markdown ### Edit chapter content

BOOK_ID = "gentle-mind" #@param {type:"string"}
CHAPTER_ID = "ch01" #@param {type:"string"}

chapter_file = CONTENT_DIR / BOOK_ID / "chapters" / f"{CHAPTER_ID}.json"

if not chapter_file.exists():
    print(f"❌ Chapter not found: {chapter_file}")
else:
    with open(chapter_file, encoding='utf-8') as f:
        chapter_data = json.load(f)
    
    print(f"📖 Editing: {chapter_data.get('title', CHAPTER_ID)}")
    print("="*50)
    
    # Create widgets for each section
    section_widgets = []
    
    for i, section in enumerate(chapter_data.get('sections', [])):
        section_type = section.get('type', 'unknown')
        section_lang = section.get('lang', '')
        
        if section_type == 'markdown':
            content = section.get('content', '')
            
            # Header
            header = widgets.HTML(
                value=f"<h4 style='margin: 20px 0 10px 0;'>📄 Section {i+1} ({section_type}) {f'[{section_lang.upper()}]' if section_lang else ''}</h4>"
            )
            
            # Text area
            textarea = widgets.Textarea(
                value=content,
                layout=widgets.Layout(width='100%', height='200px'),
                description='',
            )
            textarea.section_index = i
            section_widgets.append((header, textarea, section_lang))
            
        elif section_type == 'exercise':
            ex_id = section.get('id', f'ex{i}')
            ex_type = section.get('exerciseType', 'unknown')
            question = section.get('question', '')
            
            header = widgets.HTML(
                value=f"<h4 style='margin: 20px 0 10px 0;'>🧩 Section {i+1} (exercise: {ex_type})</h4>"
            )
            
            info = widgets.HTML(
                value=f"<p style='color: #666;'>ID: {ex_id}<br>Question: {question}</p>"
            )
            section_widgets.append((header, info, None))
    
    # Save button
    save_button = widgets.Button(
        description='💾 Save Changes',
        button_style='success',
        layout=widgets.Layout(width='200px', height='40px')
    )
    
    status_output = widgets.Output()
    
    def on_save_click(b):
        with status_output:
            status_output.clear_output()
            print("💾 Saving...")
            
            # Update chapter data from widgets
            for header, widget, lang in section_widgets:
                if hasattr(widget, 'section_index'):
                    idx = widget.section_index
                    chapter_data['sections'][idx]['content'] = widget.value
            
            # Write back to file
            with open(chapter_file, 'w', encoding='utf-8') as f:
                json.dump(chapter_data, f, ensure_ascii=False, indent=2)
            
            print(f"✅ Saved to {chapter_file.name}")
    
    save_button.on_click(on_save_click)
    
    # Display all widgets
    for header, widget, lang in section_widgets:
        display(header)
        display(widget)
    
    display(widgets.HBox([save_button]))
    display(status_output)

---
## 🔊 Audio Generation

Generate audio from book content.
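
The generation cell splits text into sentences, synthesizes each one, and records start and end times with a 0.5 s pause after every sentence. The timing logic can be sketched without the model; the durations below are made-up stand-ins for synthesized clip lengths, and both helper names are hypothetical:

```python
import re

GAP_SECONDS = 0.5  # pause appended after each sentence

def split_sentences(text: str) -> list[str]:
    """Split on ., ! and ?, dropping fragments of three characters or fewer."""
    parts = (s.strip() for s in re.split(r"[.!?]", text))
    return [s for s in parts if len(s) > 3]

def build_timestamps(sentences, durations, gap=GAP_SECONDS):
    """Pair each sentence with its start/end time in the combined audio."""
    timestamps, current = [], 0.0
    for sentence, duration in zip(sentences, durations):
        timestamps.append({
            "start": round(current, 2),
            "end": round(current + duration, 2),
            "text": sentence,
        })
        current += duration + gap  # the next sentence starts after the pause
    return timestamps

sentences = split_sentences("Hello there. How are you? Ok. Fine, thanks!")
timestamps = build_timestamps(sentences, [1.0, 2.0, 1.5])
```

Note that the short fragment "Ok" is filtered out, so it never gets a timestamp entry.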

In [None]:
#@title 🎵 Generate Audio with Cloned Voice { display-mode: "form" }
#@markdown ### Generate audio with the cloned voice

BOOK_ID = "gentle-mind" #@param {type:"string"}
SKIP_EXISTING = True #@param {type:"boolean"}
GENERATE_VI = True #@param {type:"boolean"}
GENERATE_EN = True #@param {type:"boolean"}

if model is None:
    print("❌ Model not loaded! Run 'Load viXTTS Model' first.")
else:
    # Load voice profiles
    print("="*50)
    print("🎵 Loading voice profiles...")
    print("="*50)
    
    vi_profile = VOICES_DIR / "default-vi.pt"
    en_profile = VOICES_DIR / "default-en.pt"
    
    gpt_vi, spk_vi, gpt_en, spk_en = None, None, None, None
    
    if GENERATE_VI and vi_profile.exists():
        vi_data = torch.load(vi_profile, weights_only=False)
        gpt_vi = vi_data["gpt_cond_latent"].cuda()
        spk_vi = vi_data["speaker_embedding"].cuda()
        print(f"✅ VI: Loaded")
    elif GENERATE_VI:
        print("❌ VI: No profile found")
        GENERATE_VI = False
    
    if GENERATE_EN and en_profile.exists():
        en_data = torch.load(en_profile, weights_only=False)
        gpt_en = en_data["gpt_cond_latent"].cuda()
        spk_en = en_data["speaker_embedding"].cuda()
        print(f"✅ EN: Loaded")
    elif GENERATE_EN:
        print("❌ EN: No profile found")
        GENERATE_EN = False
    
    if not GENERATE_VI and not GENERATE_EN:
        print("❌ No voice profiles! Clone voice first.")
    else:
        # Setup directories
        BOOK_DIR = CONTENT_DIR / BOOK_ID
        AUDIO_DIR = BOOK_DIR / "audio"
        AUDIO_DIR.mkdir(parents=True, exist_ok=True)
        
        def extract_text_by_lang(sections, target_lang):
            texts = []
            for section in sections:
                if section.get('type') != 'markdown':
                    continue
                if section.get('lang', '') == target_lang:
                    content = section.get('content', '')
                    lines = []
                    for line in content.split('\n'):
                        line = line.strip()
                        if line in ['---', '']: 
                            continue
                        if line.startswith('#'):
                            lines.append(line.lstrip('#').strip())
                        else:
                            lines.append(line)
                    texts.append(' '.join(lines))
            return ' '.join(texts)
        
        def generate_audio(text, output_path, lang, gpt_cond, speaker_emb):
            sentences = [s.strip() for s in re.split(r'[.!?]', text) if s.strip() and len(s.strip()) > 3]
            if not sentences:
                return 0
            
            all_audio, timestamps = [], []
            silence = np.zeros(int(24000 * 0.5))
            current_time = 0.0
            
            for i, sentence in enumerate(sentences):
                display_text = sentence[:50] + "..." if len(sentence) > 50 else sentence
                print(f"  [{i+1}/{len(sentences)}] {display_text}")
                
                out = model.inference(sentence + ".", lang, gpt_cond, speaker_emb, temperature=0.7)
                audio_data = out["wav"]
                
                duration = len(audio_data) / 24000
                timestamps.append({"start": round(current_time, 2), "end": round(current_time + duration, 2), "text": sentence})
                current_time += duration + 0.5
                
                all_audio.extend([audio_data, silence])
            
            combined = np.concatenate(all_audio)
            sf.write(str(output_path), combined, 24000)
            
            with open(output_path.with_suffix('.json'), 'w', encoding='utf-8') as f:
                json.dump(timestamps, f, ensure_ascii=False, indent=2)
            
            return len(combined) / 24000
        
        # Load book
        with open(BOOK_DIR / "book.json", encoding='utf-8') as f:
            book = json.load(f)
        
        print(f"\n📖 Book: {book['title']}")
        print(f"📑 Chapters: {book['chapters']}")
        
        generated, skipped = 0, 0
        
        for chapter_id in book['chapters']:
            chapter_file = BOOK_DIR / "chapters" / f"{chapter_id}.json"
            with open(chapter_file, encoding='utf-8') as f:
                chapter = json.load(f)
            
            print(f"\n{'='*40}")
            print(f"📖 {chapter_id}: {chapter['title']}")
            
            sections = chapter.get('sections', [])
            
            if GENERATE_VI:
                output_vi = AUDIO_DIR / f"{chapter_id}-vi.wav"
                if SKIP_EXISTING and output_vi.exists():
                    print(f"⏭️ {chapter_id}-vi.wav: exists")
                    skipped += 1
                else:
                    vi_text = extract_text_by_lang(sections, "vi")
                    if vi_text.strip():
                        print(f"\n🇻🇳 Generating Vietnamese...")
                        duration = generate_audio(vi_text, output_vi, "vi", gpt_vi, spk_vi)
                        print(f"  ✅ {output_vi.name} ({duration:.1f}s)")
                        generated += 1
            
            if GENERATE_EN:
                output_en = AUDIO_DIR / f"{chapter_id}-en.wav"
                if SKIP_EXISTING and output_en.exists():
                    print(f"⏭️ {chapter_id}-en.wav: exists")
                    skipped += 1
                else:
                    en_text = extract_text_by_lang(sections, "en")
                    if en_text.strip():
                        print(f"\n🇺🇸 Generating English...")
                        duration = generate_audio(en_text, output_en, "en", gpt_en, spk_en)
                        print(f"  ✅ {output_en.name} ({duration:.1f}s)")
                        generated += 1
        
        print(f"\n{'='*50}")
        print(f"📊 Summary: {generated} generated, {skipped} skipped")

In [None]:
#@title 🔊 Preview Audio { display-mode: "form" }

BOOK_ID = "gentle-mind" #@param {type:"string"}
CHAPTER = "ch01" #@param ["ch01", "ch02", "ch03"]
LANGUAGE = "vi" #@param ["vi", "en"]

audio_file = CONTENT_DIR / BOOK_ID / "audio" / f"{CHAPTER}-{LANGUAGE}.wav"

if audio_file.exists():
    print(f"🔊 Playing: {audio_file.name}")
    display(Audio(str(audio_file)))
else:
    print(f"❌ Not found: {audio_file}")
    print(f"\nAvailable files:")
    audio_dir = audio_file.parent
    if audio_dir.exists():
        for f in sorted(audio_dir.glob("*.wav")):
            print(f"   - {f.name}")

---
## 🚀 Push to GitHub

Save all changes to GitHub.
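
The push cell prints `git status --short` before committing. As a rough sketch, that output can be summarized by status code; `summarize_short_status` is a hypothetical helper, and it assumes the usual layout of a two-character code, a space, then the path:

```python
from collections import Counter

def summarize_short_status(output: str) -> Counter:
    """Count entries per status code in `git status --short` output."""
    counts = Counter()
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        counts[line[:2].strip()] += 1
    return counts

# Made-up status output, for illustration only
sample_status = (
    " M the-lost-chapter/content/books/gentle-mind/chapters/ch01.json\n"
    "?? the-lost-chapter/voices/my-voice-vi.wav\n"
)
summary = summarize_short_status(sample_status)
```

Here `M` marks a modified file and `??` an untracked one, which is a quick way to confirm that new audio files will be picked up by `git add`.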

In [None]:
#@title 🚀 Commit & Push Changes { display-mode: "form" }
#@markdown ### Push all changes to GitHub

COMMIT_MESSAGE = "Update content and audio" #@param {type:"string"}

print("="*50)
print("📊 Git Status")
print("="*50)

os.chdir(REPO_DIR)

# Show status
result = subprocess.run(["git", "status", "--short"], capture_output=True, text=True)
if result.stdout.strip():
    print(result.stdout)
else:
    print("✅ No changes to commit.")

# Add all changes
subprocess.run(["git", "add", "the-lost-chapter/"])

# Check if there are staged changes
result = subprocess.run(["git", "diff", "--cached", "--quiet"])
if result.returncode == 0:
    print("\n⚠ Nothing to commit.")
else:
    print(f"\n📝 Committing: {COMMIT_MESSAGE}")
    subprocess.run(["git", "commit", "-m", COMMIT_MESSAGE])
    
    print(f"\n🚀 Pushing to {BRANCH}...")
    result = subprocess.run(["git", "push", "origin", BRANCH], capture_output=True, text=True)
    
    if result.returncode == 0:
        print(f"✅ Pushed successfully!")
        print(f"\n🔗 View at: https://github.com/{GITHUB_USERNAME}/{REPO_NAME}/tree/{BRANCH}")
    else:
        print(f"❌ Push failed:")
        print(result.stderr)

In [None]:
#@title 📄 View Recent Commits { display-mode: "form" }

os.chdir(REPO_DIR)
result = subprocess.run(["git", "log", "--oneline", "-10"], capture_output=True, text=True)
print("📄 Recent commits:")
print(result.stdout)

---
## 📤 Upload Voice Samples

Upload voice sample files if you do not have any yet.
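
Renaming an upload relies on spotting a language marker in the file name. A sketch of token-based matching, which avoids treating a name such as `david.m4a` as Vietnamese merely because it contains `vi`; both helper names and the accepted marker sets are assumptions of this sketch:

```python
import re
from pathlib import Path
from typing import Optional

def detect_language(filename: str) -> Optional[str]:
    """Return 'vi' or 'en' if the file stem contains a language token."""
    # Split the stem on anything that is not a letter: "sample_VI" -> {"sample", "vi"}
    tokens = set(re.split(r"[^a-z]+", Path(filename).stem.lower()))
    if tokens & {"vi", "viet", "vietnamese"}:
        return "vi"
    if tokens & {"en", "eng", "english"}:
        return "en"
    return None

def canonical_name(filename: str) -> str:
    """Map a recognized upload to my-voice-<lang><ext>; keep other names as-is."""
    lang = detect_language(filename)
    return f"my-voice-{lang}{Path(filename).suffix}" if lang else filename
```

For example, `canonical_name("sample_VI.m4a")` gives `my-voice-vi.m4a`, while `david.mp3` and `garden.wav` are left unchanged.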

In [None]:
#@title 📤 Upload Voice Samples { display-mode: "form" }
#@markdown ### Upload voice samples from your computer

print("📤 Upload your voice samples:")
print("   - my-voice-vi.m4a (30-60 seconds of Vietnamese)")
print("   - my-voice-en.m4a (30-60 seconds of English)")
print()

uploaded = files.upload()

if uploaded:
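    for filename, content in uploaded.items():
        # Auto-rename based on language tokens in the file name; splitting on
        # separators keeps names like "david.m4a" from matching "vi"
        tokens = set(re.split(r'[^a-z]+', Path(filename).stem.lower()))
        if tokens & {'vi', 'viet', 'vietnamese'}:
            new_name = "my-voice-vi" + Path(filename).suffix
        elif tokens & {'en', 'eng', 'english'}:
            new_name = "my-voice-en" + Path(filename).suffix
        else:
            new_name = filename
        
        # files.upload() returns the raw bytes, so write them straight to voices/
        dest = VOICES_DIR / new_name
        dest.write_bytes(content)
        print(f"✅ Saved: {dest}")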
    
    print(f"\n📁 Files in voices/:")
    for f in VOICES_DIR.iterdir():
        print(f"   - {f.name}")
else:
    print("⚠ No files uploaded.")