# Katha: Smart Cultural Storyteller
## Project Notebook: Design, Architecture, and Roadmap

---

## 1. Introduction

**Katha** is an AI-powered cultural storytelling platform that revitalizes ancient Indian epics such as the **Ramayana** and **Mahabharata** for modern audiences. The platform blends cultural authenticity with AI-driven multimedia to create immersive, interactive reading experiences across text, audio, visuals, and maps.

This notebook documents:
* Current implemented features
* System architecture
* Planned enhancements
* Draft technical logic for upcoming AI modules

## 2. Problem Statement

Traditional access to Indian epics faces several challenges:
* **Static text formats** reduce engagement for younger audiences accustomed to rich media.
* **Language barriers** (archaic English or Sanskrit) limit accessibility.
* **Lack of spatial context** makes it hard to visualize journeys (e.g., Rama's path from Ayodhya to Lanka).
* **Minimal interactivity** leads to passive consumption rather than active exploration.

Katha addresses these by combining **AI narration**, **dynamic visualization**, **interactive maps**, and **cultural design**.

## 3. Core Objectives

1. **Preserve Cultural Authenticity**: Honor the source material while modernizing its presentation.
2. **Enhance Engagement**: Use multimedia (visuals, audio) and gamification to keep users reading.
3. **Ensure Accessibility**: Provide multi-lingual support and easy-to-read interfaces.
4. **Build a Scalable Architecture**: Design the system so that AI content generation (audio/video) can be automated.

## 4. Current Implementation (Deployment-Ready)

The following features are implemented and functional in the v1.0 release, except where an item is marked as in progress:

### 4.1 Smart Chapter Reader
* Clean, immersive reading UI with simplified navigation.
* Ambient background themes that adapt to the story's mood.
* Adjustable text rendering for optimal readability.

### 4.2 AI Audiobooks
* Integrated Text-to-Speech (TTS) narration.
* Scene-by-scene audio playback.
* (In progress) Background sound layering for immersion.

### 4.3 Fast Visuals (Static)
* Instant AI-generated scene illustrations using lower-latency models.
* Served immediately to users to provide visual context.
* Act as placeholders for the upcoming Cinematic Video Reels.
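
The placeholder behavior described above could be expressed as a simple selection rule: serve the reel when one exists for the scene, otherwise serve the static illustration. The sketch below is illustrative only; `SceneMedia`, `pick_visual`, and the field names are assumptions, not names from the Katha codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SceneMedia:
    """Hypothetical media record for one scene (field names are assumptions)."""
    scene_id: int
    image_url: str                   # fast static illustration, always present
    reel_url: Optional[str] = None   # cinematic reel, filled in once generated


def pick_visual(media: SceneMedia) -> dict:
    """Prefer the reel when it exists; otherwise fall back to the static image."""
    if media.reel_url:
        return {"kind": "video", "url": media.reel_url}
    return {"kind": "image", "url": media.image_url}


# A scene that only has its static illustration so far.
static_only = SceneMedia(scene_id=101, image_url="/static/images/scene_101.png")
print(pick_visual(static_only))
```

With this shape, the frontend would not need to change when reels arrive: it renders whatever `kind` the backend returns.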

### 4.4 Interactive Epic Map
* Geographic visualization of epic events.
* Clickable locations (e.g., Ayodhya, Janakpur, Lanka, Kurukshetra).
* Content linking: Clicking a map marker opens the related chapter.
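
As a rough illustration of the content linking above, a marker could carry the ID of its related chapter, and a click handler could resolve that to a reader route. This is a minimal sketch under stated assumptions: `MapLocation`, `chapter_route_for`, the `/chapters/<id>` route, and the chapter IDs are hypothetical, and the coordinates are approximate modern-day positions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MapLocation:
    """Hypothetical marker record linking a place to a chapter."""
    name: str
    lat: float        # approximate latitude
    lon: float        # approximate longitude
    chapter_id: int   # illustrative ID, not a real database key


LOCATIONS = {
    "ayodhya": MapLocation("Ayodhya", 26.8, 82.2, chapter_id=1),
    "kurukshetra": MapLocation("Kurukshetra", 30.0, 76.9, chapter_id=18),
}


def chapter_route_for(marker_key: str) -> Optional[str]:
    """Return the reader route for a clicked marker, or None if it is unknown."""
    location = LOCATIONS.get(marker_key.strip().lower())
    if location is None:
        return None
    return f"/chapters/{location.chapter_id}"


print(chapter_route_for("Ayodhya"))
```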

### 4.5 Gamification Engine
* **XP System**: Users earn Experience Points for reading and exploring.
* **Streaks**: Daily reading tracking.
* **Badges**: Unlockable achievements like "Scholar", "Devotee", and "Explorer".
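
The streak and badge mechanics above might look something like the following. The XP values and badge thresholds are placeholders chosen for illustration; the actual rules used by Katha are not documented in this notebook, and the function names are assumptions.

```python
from datetime import date, timedelta

# Illustrative values only; real XP rules and thresholds may differ.
XP_PER_CHAPTER = 50
BADGE_THRESHOLDS = {"Explorer": 100, "Scholar": 500, "Devotee": 1000}


def current_streak(read_days: set[date], today: date) -> int:
    """Count consecutive days with reading activity, ending on `today`."""
    streak = 0
    day = today
    while day in read_days:
        streak += 1
        day -= timedelta(days=1)
    return streak


def earned_badges(total_xp: int) -> list[str]:
    """Return badges whose XP threshold has been reached, lowest threshold first."""
    ranked = sorted(BADGE_THRESHOLDS.items(), key=lambda item: item[1])
    return [name for name, needed in ranked if total_xp >= needed]


today = date(2024, 1, 10)
days = {date(2024, 1, 10), date(2024, 1, 9), date(2024, 1, 8), date(2024, 1, 5)}
print(current_streak(days, today))          # 3 (Jan 8, 9, 10; the gap on Jan 7 ends it)
print(earned_badges(3 * XP_PER_CHAPTER))    # ['Explorer']
```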

### 4.6 Authentication System
* Secure Login and Registration flows.
* User profile management.
* Persistent reading history across devices.

## 5. System Architecture Overview

The Katha platform uses a full-stack architecture with a React frontend, a FastAPI backend, external AI services, and a SQL database:

### 5.1 High-Level Components

1.  **Frontend (React/Vite)**
    *   **Reader UI**: The core reading experience.
    *   **Visuals & Map**: Interactive components using Leaflet/Mapbox and standard video players.
    *   **Gamification**: UI elements for badges and progress bars.

2.  **Backend (FastAPI/Python)**
    *   **User Management**: Auth, profiles, progress tracking.
    *   **Content Delivery**: Serving chapters, scenes, and media assets.
    *   **AI Orchestration**: Managing request queues for image/audio/video generation.

3.  **AI Services (External APIs)**
    *   **Text-to-Speech**: ElevenLabs / EdgeTTS.
    *   **Image Generation**: Pollinations.ai / Stability AI.
    *   **Video Generation (Planned)**: SVD / Runway.
    *   **Translation (Planned)**: DeepL / Google Translate API.

4.  **Database**
    *   **SQLModel (SQLite/PostgreSQL)**: Stores Users, Stories, Chapters, Scenes, and various mappings.
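
To make the "AI Orchestration" responsibility more concrete, here is one possible shape for a generation request queue: a first-in, first-out queue that ignores a request when an identical one is already waiting. It is an in-memory sketch using only the standard library; `MediaJob` and `GenerationQueue` are hypothetical names, and a production system would likely use a persistent queue or task runner instead.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Set


@dataclass(frozen=True)
class MediaJob:
    """Hypothetical generation request (names are assumptions)."""
    scene_id: int
    media_type: str  # "image", "audio", or "video"


class GenerationQueue:
    """FIFO queue that skips a job if an identical one is already waiting."""

    def __init__(self) -> None:
        self._jobs: Deque[MediaJob] = deque()
        self._pending: Set[MediaJob] = set()

    def submit(self, job: MediaJob) -> bool:
        """Enqueue the job; return False when it duplicates a waiting job."""
        if job in self._pending:
            return False
        self._pending.add(job)
        self._jobs.append(job)
        return True

    def next_job(self) -> Optional[MediaJob]:
        """Pop the oldest job, or return None when the queue is empty."""
        if not self._jobs:
            return None
        job = self._jobs.popleft()
        self._pending.discard(job)
        return job


jobs = GenerationQueue()
jobs.submit(MediaJob(101, "image"))
jobs.submit(MediaJob(101, "image"))  # duplicate request, ignored
jobs.submit(MediaJob(101, "audio"))
print(jobs.next_job())
```

Deduplicating at submission time keeps repeated page loads for the same scene from triggering repeated calls to a paid external API.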

## 6. Upcoming Features (Phase 2)

The next phase focuses on high-fidelity media production and linguistic inclusivity.

### 6.1 Cinematic Video Reel Generation

**Goal**: Replace static scene visuals with short, high-quality cinematic video reels to increase immersion.

**Planned Capabilities**:
*   **Model Integration**: Stable Video Diffusion (SVD), Runway, or Pika.
*   **Aspect Ratio**: 9:16 (Vertical video, mobile-first design).
*   **Duration**: 4–6 second looping clips.
*   **Pipeline**: Automated prompt-to-video workflow.
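
The aspect ratio and duration targets above can be checked before a generated clip is accepted. The following is a minimal sketch of such a check; `ReelSpec` and `validate_reel_spec` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ReelSpec:
    """Hypothetical output settings for one reel."""
    width: int
    height: int
    duration_s: float


def validate_reel_spec(spec: ReelSpec) -> list[str]:
    """Return a list of problems; an empty list means the spec matches the plan."""
    problems = []
    # Fraction reduces width/height exactly, so 1080x1920 compares equal to 9:16.
    if Fraction(spec.width, spec.height) != Fraction(9, 16):
        problems.append(f"aspect ratio {spec.width}:{spec.height} is not 9:16")
    if not 4 <= spec.duration_s <= 6:
        problems.append(f"duration {spec.duration_s}s is outside 4-6s")
    return problems


print(validate_reel_spec(ReelSpec(1080, 1920, 5)))   # []
print(validate_reel_spec(ReelSpec(1920, 1080, 8)))   # two problems reported
```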

#### Draft Logic for Video Generation

```python
import time


class VideoGenerationService:
    """
    Mock service to orchestrate high-fidelity video generation.
    Intended to wrap models like SVD or external APIs; no real API is called here.
    """

    def __init__(self, model_name: str = "stable-video-diffusion") -> None:
        self.model_name = model_name

    def enhance_prompt(self, base_prompt: str) -> str:
        """
        Adds cinematic keywords to steer the model toward a consistent visual style.
        """
        modifiers = [
            "cinematic",
            "slow pan camera movement",
            "dramatic lighting",
            "4k resolution",
            "highly detailed",
            "ancient india atmosphere",
        ]
        return f"{base_prompt}, {', '.join(modifiers)}"

    def generate_reel(self, scene_id: int, prompt: str, duration: int = 4) -> str:
        """
        Generates a video reel for a specific scene and returns its URL.
        `duration` is the requested clip length in seconds.
        """
        print(f"[VideoService] Processing Scene {scene_id} using {self.model_name}...")

        # 1. Enhance Prompt
        final_prompt = self.enhance_prompt(prompt)
        print(f"[VideoService] Enhanced Prompt: '{final_prompt}'")

        # 2. Call AI Model API (Simulated)
        # Replace with actual API call: e.g., client.text_to_video(...)
        print(f"[VideoService] Requesting a {duration}s clip from {self.model_name} API...")
        time.sleep(1)  # Simulating API latency

        # 3. Process Result
        # Ideally, we would download the video, optimize it to WebP/MP4, and upload to storage.
        # The URL below is a mock value; nothing is uploaded in this draft.
        generated_url = f"https://katha-storage.s3.amazonaws.com/reels/scene_{scene_id}_hq.mp4"

        print(f"[VideoService] Generation Complete: {generated_url}")
        return generated_url


# Example Usage
video_service = VideoGenerationService(model_name="SVD-XT-1.1")
reel_url = video_service.generate_reel(
    scene_id=101,
    prompt="A golden chariot flying through gathering storm clouds",
)
print(f"Result: {reel_url}")
```

### 6.2 Multi-Lingual Support (Hindi & Sanskrit)

**Goal**: Enable users to experience epics in their original and vernacular languages while maintaining narrative accuracy.

**Planned Capabilities**:
*   **Language Toggle**: Switch between English, Hindi, and Sanskrit on the fly.
*   **Devanagari Rendering**: Proper font support for Indian scripts.
*   **Localized AI Narration**: Using culturally appropriate voice models for each language.
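
Two pieces of the capabilities above lend themselves to a small sketch: falling back to English when a translation is missing, and detecting Devanagari text so the UI can apply a suitable font. The function names and the dictionary layout are assumptions made for illustration; the Devanagari Unicode block range (U+0900 to U+097F) is standard.

```python
SUPPORTED_LANGS = ("english", "hindi", "sanskrit")


def contains_devanagari(text: str) -> bool:
    """True if any character falls in the Devanagari Unicode block (U+0900 to U+097F)."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def resolve_scene_text(translations: dict[str, str], requested: str) -> tuple[str, str]:
    """Return (language, text), falling back to English when a translation is missing."""
    lang = requested.lower()
    if lang in SUPPORTED_LANGS and translations.get(lang):
        return lang, translations[lang]
    return "english", translations["english"]


scene = {
    "english": "Long ago, in the city of Ayodhya...",
    "hindi": "प्राचीन काल में...",
}

lang, text = resolve_scene_text(scene, "Sanskrit")   # no Sanskrit text yet
print(lang, contains_devanagari(text))               # english False

lang, text = resolve_scene_text(scene, "hindi")
print(lang, contains_devanagari(text))               # hindi True
```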

#### Draft Logic for Translation & Narration

```python
class LocalizationService:
    """
    Handles translation and localized audio generation.
    All external calls are mocked in this draft.
    """

    def __init__(self) -> None:
        # Placeholder voice IDs; replace with real IDs from the chosen TTS provider.
        self.voice_map = {
            "hindi": "hindi_narrator_male_deep",
            "sanskrit": "sanskrit_scholar_priest",
            "english": "english_storyteller_soothing",
        }

    def translate_text(self, text: str, target_lang: str) -> str:
        """
        Stands in for a call to an external Neural Machine Translation service.
        Returns the input unchanged for languages without a mock translation.
        """
        # Mock Translation Logic
        print(f"[LocService] Translating text to {target_lang}...")
        if target_lang == "hindi":
            return "(Hindi Translation Placeholder) प्राचीन काल में..."
        elif target_lang == "sanskrit":
            return "(Sanskrit Translation Placeholder) पुराकाले..."
        return text

    def generate_localized_asset(self, scene_id: int, original_text: str, target_lang: str) -> dict:
        """
        Orchestrates the full localization pipeline for a scene.
        """
        # 1. Translate
        translated_text = self.translate_text(original_text, target_lang)

        # 2. Select Voice (falls back to a default for unmapped languages)
        voice_id = self.voice_map.get(target_lang, "default_voice")
        print(f"[LocService] Selected Voice ID: {voice_id}")

        # 3. Generate Audio (Simulated TTS Call)
        print("[LocService] Generating Audio via TTS...")
        audio_url = f"/static/audio/{target_lang}/scene_{scene_id}.mp3"

        return {
            "language": target_lang,
            "text": translated_text,
            "audio_url": audio_url,
            "voice_used": voice_id,
        }


# Example Usage
loc_service = LocalizationService()
asset_hindi = loc_service.generate_localized_asset(
    scene_id=42,
    original_text="Long ago, in the city of Ayodhya...",
    target_lang="hindi",
)
print(asset_hindi)
```

## 7. Ethical and Cultural Considerations

When integrating AI with sacred or culturally significant texts, we adhere to strict guidelines:
1.  **No Distortion**: The AI should not alter the core narrative or moral lessons of the original scriptures.
2.  **Visual Respect**: Generated imagery must be respectful of traditional iconographies (e.g., correct attire, symbols).
3.  **Educational Intent**: Users are informed that AI features are for immersion and educational purposes, bridging the gap to original texts.

## 8. Future Extensions

Beyond the current implementation and the planned Phase 2 features, the long-term vision includes:
*   **Personalized Paths**: Story branches based on user choices (where applicable in folklore).
*   **AI Oracle (Rishi)**: A chatbot interface to ask questions about characters and dharma.
*   **Classroom Mode**: Tools for teachers to track student progress and assign chapters.
*   **Offline Support**: Full PWA capabilities for reading without internet.

## 9. Conclusion

**Katha** stands at the intersection of heritage and technology. The current deployment serves as a robust foundation, providing a fast, interactive, and gamified reading experience. With the upcoming integration of cinematic video reels and multi-lingual support, Katha aims to become the definitive platform for experiencing the timeless wisdom of Indian epics.