# # Phase 2: Story2Audio Pipeline

This notebook implements the Story2Audio pipeline for Phase 2 of the NLP project. It:
- Preprocesses a story into chunks.
- Enhances chunks using tiiuae/falcon-rw-1b locally.
- Generates audio using hexgrad/Kokoro-82M locally.
- Stitches audio into a final .mp3 file.

**Requirements**:
- Python 3.11
- FFmpeg installed and added to PATH
- Dependencies: numpy, transformers, torch, kokoro, pydub, soundfile
- Hardware: CPU (GPU recommended for faster inference)

**Output**: outputs/final_story.mp3

In [1]:
import numpy as np
print(np.__version__)
print(np.array([1, 2, 3]))

1.24.4
[1 2 3]


In [2]:
import os
import logging
from src.preprocess import chunk_story
from src.enhancer_local import StoryEnhancer
from src.kokoro_tts import text_to_coqui_audio
from src.utils import combine_audio

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)



# # Step 1: Load and Chunk Story

In [3]:
try:
    # Load sample story
    with open('sample_story.txt', 'r', encoding='utf-8') as f:
        story_text = f.read()
    logger.info('✅Story loaded successfully')

    # Chunk story (~150 words per chunk)
    chunks = chunk_story(story_text, chunk_size=150)
    logger.info(f'✅Story split into {len(chunks)} chunks')
except Exception as e:
    logger.error(f'Error in preprocessing: {e}')
    raise

INFO:__main__:✅Story loaded successfully
INFO:__main__:✅Story split into 1 chunks
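
The source of `src.preprocess.chunk_story` is not shown in this notebook, so the following is only a minimal sketch of how a word-budget chunker could behave. The function name `chunk_story_sketch` and the sentence-splitting rule are assumptions for illustration, not the project's actual implementation.

```python
import re


def chunk_story_sketch(text, chunk_size=150):
    """Group sentences into chunks of at most ~chunk_size words (hypothetical helper)."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        if n_words == 0:
            continue
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Under this sketch a story shorter than `chunk_size` words yields a single chunk, which is consistent with the `1 chunks` log line above. A single sentence longer than the budget is kept whole rather than cut mid-sentence.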


# # Step 2: Enhance Chunks with tiiuae/falcon-rw-1b

In [4]:
try:
    # Initialize enhancer
    enhancer = StoryEnhancer()
    logger.info('✅StoryEnhancer initialized Locally')

    # Enhance each chunk
    enhanced_chunks = []
    for idx, chunk in enumerate(chunks):
        enhanced = enhancer.enhance_chunk(chunk)
        enhanced_chunks.append(enhanced)
        logger.info(f'✅Enhanced chunk {idx + 1}/{len(chunks)}')

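    # Save enhanced chunks to a file, one chunk per line.
    # Collapse internal whitespace first: Step 3 reads the file back line by
    # line, so a newline inside a chunk would split it into several chunks.
    with open('enhanced_chunks.txt', 'w', encoding='utf-8') as f:
        for chunk in enhanced_chunks:
            f.write(' '.join(chunk.split()) + '\n')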
    logger.info('✅Enhanced chunks saved to enhanced_chunks.txt')

except Exception as e:
    logger.error(f'Error in enhancement: {e}')
    raise

Device set to use cpu
INFO:src.enhancer_local:Initialized StoryEnhancer locally with model: tiiuae/falcon-rw-1b
INFO:__main__:✅StoryEnhancer initialized Locally
INFO:src.enhancer_local:Tokenized input length: 164 tokens
INFO:__main__:✅Enhanced chunk 1/1
INFO:__main__:✅Enhanced chunks saved to enhanced_chunks.txt
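
`src.enhancer_local.StoryEnhancer` is not shown here, so the sketch below only illustrates one plausible shape for the enhancement step. The prompt wording, the function names, and the generation settings are all assumptions; the only parts taken from the notebook are the model name and the fact that it runs on CPU. The text generator is passed in as a callable so the prompt logic can be exercised without downloading the model.

```python
def build_prompt(chunk):
    """Hypothetical prompt; the real StoryEnhancer prompt is not shown."""
    return (
        "Rewrite the following story passage so it is vivid and engaging "
        "when read aloud. Keep the plot unchanged.\n\n"
        f"Passage: {chunk}\n\nRewritten passage:"
    )


def enhance_chunk_sketch(chunk, generate):
    """Return the model's rewrite, or the original chunk if nothing comes back.

    `generate` is any callable mapping a prompt string to generated text.
    """
    continuation = generate(build_prompt(chunk)).strip()
    return continuation if continuation else chunk


def make_falcon_generator(model_name="tiiuae/falcon-rw-1b", max_new_tokens=200):
    """Wrap a Hugging Face text-generation pipeline as a `generate` callable."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name, device=-1)  # -1 = CPU

    def generate(prompt):
        result = generator(
            prompt,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,          # assumed value, not taken from the project
            return_full_text=False,   # return only the newly generated text
        )
        return result[0]["generated_text"]

    return generate
```

Usage would look like `enhance_chunk_sketch(chunk, make_falcon_generator())`. Note that falcon-rw-1b is a base language model rather than an instruction-tuned one, so its continuation may drift from the source passage; checking the enhanced text before synthesis is advisable.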


# # Step 3: Generate Audio with hexgrad/Kokoro-82M


In [5]:
try:
    # Read enhanced chunks from file
    with open('enhanced_chunks.txt', 'r', encoding='utf-8') as f:
        enhanced_chunks = [line.strip() for line in f if line.strip()]
    logger.info(f'✅Read {len(enhanced_chunks)} enhanced chunks from enhanced_chunks.txt')

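    # text_to_coqui_audio is imported from src.kokoro_tts; despite its name,
    # this step synthesizes speech with Kokoro, not Coqui.
    os.makedirs('outputs/temp', exist_ok=True)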
    audio_files = text_to_coqui_audio(enhanced_chunks, output_dir='outputs/temp')
    logger.info(f'✅Generated audio files: {audio_files}')
except Exception as e:
    logger.error(f'Error in audio generation: {e}')
    raise

INFO:__main__:✅Read 1 enhanced chunks from enhanced_chunks.txt




INFO:src.kokoro_tts:Generated audio for chunk 1 - Graphemes: Once upon a time, Lila wandered through the forest alone because her father, the king, had come home late that night. When he returned home the next day, he brought her father-in-law a beautiful golden box, which he, Phonemes: wˈʌns əpˈɑn ɐ tˈIm, lˈilə wˈɑndəɹd θɹu ðə fˈɔɹəst əlˈOn bəkˈʌz hɜɹ fˈɑðəɹ, ðə kˈɪŋ, hæd kˈʌm hˈOm lˈAt ðˈæt nˈIt. wˌɛn hi ɹətˈɜɹnd hˈOm ðə nˈɛkst dˈA, hi bɹˈɔt hɜɹ fˈɑðəɹənlˌɔ ɐ bjˈuTəfəl ɡˈOldən bˈɑks, wˌɪʧ hi
INFO:src.kokoro_tts:Audio saved to outputs/temp/chunk_0.wav
INFO:__main__:✅Generated audio files: ['outputs/temp/chunk_0.wav']
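
The body of `text_to_coqui_audio` lives in `src/kokoro_tts.py` and is not reproduced in this notebook. As a rough sketch only, a Kokoro-based version might look like the following. It assumes the `kokoro` package's `KPipeline` interface, the `af_heart` voice, American English (`lang_code="a"`), and Kokoro's 24 kHz output rate; the function names are hypothetical.

```python
import os

SAMPLE_RATE = 24000  # Kokoro generates 24 kHz audio


def chunk_wav_path(output_dir, idx):
    """Build the per-chunk file path, matching the chunk_0.wav naming in the log."""
    return os.path.join(output_dir, f"chunk_{idx}.wav")


def chunks_to_wavs(chunks, output_dir, voice="af_heart", lang_code="a"):
    """Synthesize each text chunk to its own WAV file and return the paths."""
    # Imported lazily so chunk_wav_path is usable without the TTS stack.
    import numpy as np
    import soundfile as sf
    from kokoro import KPipeline

    os.makedirs(output_dir, exist_ok=True)
    pipeline = KPipeline(lang_code=lang_code)
    paths = []
    for idx, text in enumerate(chunks):
        # The pipeline yields (graphemes, phonemes, audio) for each segment of
        # the text; np.asarray assumes a CPU tensor or a NumPy array.
        segments = [np.asarray(audio) for _, _, audio in pipeline(text, voice=voice)]
        if not segments:
            continue  # nothing was synthesized for this chunk
        path = chunk_wav_path(output_dir, idx)
        sf.write(path, np.concatenate(segments), SAMPLE_RATE)
        paths.append(path)
    return paths
```

Concatenating all segments matters for longer chunks: the pipeline may split one chunk into several segments, and writing only the first would silently truncate the audio.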


# # Step 4: Stitch Audio into Final MP3

In [6]:
try:
    # Combine audio files
    output_path = 'outputs/final_story.mp3'
    combine_audio(audio_files, output_path)
    logger.info(f'✅Audio generated: {output_path}')
except Exception as e:
    logger.error(f'Error in audio stitching: {e}')
    raise

INFO:src.utils:Audio stitched and saved to outputs/final_story.mp3
INFO:__main__:✅Audio generated: outputs/final_story.mp3
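
`src.utils.combine_audio` is not shown, so this is a hedged sketch of how the stitching could be done with pydub (which needs FFmpeg on PATH for MP3 export, as listed in the requirements). The function name and the `pause_ms` gap between chunks are assumptions added for illustration.

```python
import os


def combine_audio_sketch(wav_paths, output_path, pause_ms=300):
    """Concatenate WAV files in order and export a single MP3 (hypothetical helper)."""
    if not wav_paths:
        raise ValueError("no audio files to combine")

    # Imported lazily so the input check above does not require pydub.
    from pydub import AudioSegment

    combined = AudioSegment.empty()
    pause = AudioSegment.silent(duration=pause_ms)  # short gap between chunks
    for i, path in enumerate(wav_paths):
        if i > 0:
            combined += pause
        combined += AudioSegment.from_wav(path)

    # Make sure the target directory exists before exporting.
    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    combined.export(output_path, format="mp3")
    return output_path
```

Passing the paths in list order preserves the story order, which is why Step 3 returns `audio_files` as an ordered list.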


# # Step 5: Verify Output

In [7]:
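# Note: this only checks that the file exists. It does not decode the audio,
# so the "playable" wording in the log message is not actually verified.
if os.path.exists(output_path):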
    logger.info('✅ Verification: Final audio file exists and is playable')
else:
    logger.error('❌ Verification: Final audio file not found')
    raise FileNotFoundError(f'Output file {output_path} not found')

INFO:__main__:✅ Verification: Final audio file exists and is playable
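
An existence check alone cannot tell an empty or truncated file from a good one. A slightly stronger check, sketched below with only the standard library, confirms that each intermediate WAV chunk contains audio frames and that the final file is non-empty. The helper names are hypothetical, and the sketch assumes the chunk files are PCM WAV, which is what the `wave` module can read. It still does not decode the MP3 itself.

```python
import os
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def verify_outputs(wav_paths, final_path):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    for path in wav_paths:
        try:
            if wav_duration_seconds(path) <= 0:
                problems.append(f"{path}: no audio frames")
        except (OSError, EOFError, wave.Error) as exc:
            # Missing, truncated, or non-PCM files end up here.
            problems.append(f"{path}: {exc}")
    if not os.path.isfile(final_path):
        problems.append(f"{final_path}: not found")
    elif os.path.getsize(final_path) == 0:
        problems.append(f"{final_path}: file is empty")
    return problems
```

In the notebook this could be called as `verify_outputs(audio_files, output_path)`, logging each returned problem and raising if the list is non-empty.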
