# 🎤 CLARISSA Voice Input Showcase

**Talk to Your Reservoir Simulation**

This notebook demonstrates CLARISSA's voice interface for controlling reservoir simulations through natural language commands.

---

## 🚀 Quick Start

**Run the Quick Start cell below.** It will:
1. Install the `openai` package and set up everything automatically
2. Show a voice recording interface (Google Colab only)
3. Transcribe your speech (if the `OPENAI_API_KEY` environment variable is set)

---

## 🌐 Web Demo

For an even simpler experience, use the **standalone web demo** (no setup required):

👉 **[Voice Notebook Demo](https://irena-40cc50.gitlab.io/demos/voice-demo.html)**

👉 **[Recording Suite (Voice + Screen)](https://irena-40cc50.gitlab.io/demos/demo-recording-suite.html)**

---

In [None]:
# ═══════════════════════════════════════════════════════════════════════════════
# 🚀 QUICK START - Just run this cell!
# ═══════════════════════════════════════════════════════════════════════════════
#
# This single cell sets up everything and launches the voice interface.
# After recording, your audio is transcribed automatically if OPENAI_API_KEY is set.
#
# 💡 For the standalone web demo (no setup needed), visit:
#    https://irena-40cc50.gitlab.io/demos/voice-demo.html
#

# ─────────────────────────────────────────────────────────────────────────────────
# Silent Setup
# ─────────────────────────────────────────────────────────────────────────────────
import subprocess
import sys

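# Install the OpenAI client quietly, but warn if pip fails instead of hiding the error
result = subprocess.run([sys.executable, "-m", "pip", "install", "-q", "openai"],
                        capture_output=True, text=True)
if result.returncode != 0:
    print("⚠️  Could not install the 'openai' package; transcription will be unavailable.")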

from IPython.display import display, Javascript, HTML, Audio, clear_output
from base64 import b64decode
import os

# Check environment
try:
    from google.colab import output
    IN_COLAB = True
except ImportError:
    IN_COLAB = False
    print("⚠️  The voice recorder in this notebook requires Google Colab!")
    print("   Open in Colab: https://colab.research.google.com/github/wolfram-laube/clarissa/blob/main/docs/tutorials/notebooks/16_Voice_Input_Showcase.ipynb")
    print()
    print("   Or use the web demo: https://irena-40cc50.gitlab.io/demos/voice-demo.html")

# ─────────────────────────────────────────────────────────────────────────────────
# Voice Recorder (Colab-compatible)
# ─────────────────────────────────────────────────────────────────────────────────

VOICE_UI = """
(function() {
    // Remove any existing UI
    const existing = document.getElementById('clarissa-voice-ui');
    if (existing) existing.remove();
    
    // State
    let mediaRecorder, audioChunks = [], audioContext, analyser, dataArray;
    let isRecording = false, timerInterval, seconds = 0;
    
    // Create UI
    const container = document.createElement('div');
    container.id = 'clarissa-voice-ui';
    container.innerHTML = `
        <style>
            #clarissa-voice-ui * { box-sizing: border-box; }
            .cv-container {
                font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
                background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
                border-radius: 20px;
                padding: 30px;
                max-width: 500px;
                margin: 20px auto;
                box-shadow: 0 15px 50px rgba(0,0,0,0.5);
                color: white;
            }
            .cv-header { text-align: center; margin-bottom: 25px; }
            .cv-title {
                font-size: 1.8em;
                font-weight: 700;
                background: linear-gradient(90deg, #e94560, #0f3460);
                -webkit-background-clip: text;
                -webkit-text-fill-color: transparent;
                margin: 0;
            }
            .cv-subtitle { color: #888; margin-top: 8px; font-size: 0.95em; }
            .cv-waveform {
                height: 80px;
                background: rgba(0,0,0,0.4);
                border-radius: 12px;
                display: flex;
                align-items: center;
                justify-content: center;
                gap: 3px;
                padding: 0 15px;
                margin: 20px 0;
            }
            .cv-bar {
                width: 4px;
                height: 15px;
                background: linear-gradient(180deg, #e94560 0%, #0f3460 100%);
                border-radius: 2px;
                transition: height 0.05s ease-out;
            }
            .cv-timer {
                text-align: center;
                font-size: 2.5em;
                font-weight: 700;
                font-family: 'SF Mono', Monaco, monospace;
                color: #4caf50;
                margin: 15px 0;
            }
            .cv-status {
                text-align: center;
                color: #aaa;
                min-height: 24px;
                margin: 15px 0;
            }
            .cv-status.recording { color: #e94560; font-weight: 600; }
            .cv-status.success { color: #4caf50; }
            .cv-status.error { color: #ff6b6b; }
            .cv-buttons {
                display: flex;
                justify-content: center;
                gap: 15px;
                margin-top: 20px;
            }
            .cv-btn {
                padding: 15px 35px;
                font-size: 16px;
                font-weight: 600;
                border: none;
                border-radius: 30px;
                cursor: pointer;
                transition: all 0.3s ease;
                display: flex;
                align-items: center;
                gap: 8px;
            }
            .cv-btn:hover { transform: translateY(-2px); }
            .cv-btn-record {
                background: linear-gradient(145deg, #e94560, #c23a51);
                color: white;
                box-shadow: 0 4px 20px rgba(233,69,96,0.4);
            }
            .cv-btn-stop {
                background: linear-gradient(145deg, #4caf50, #388e3c);
                color: white;
                box-shadow: 0 4px 20px rgba(76,175,80,0.4);
            }
            .cv-hint {
                text-align: center;
                color: #666;
                font-size: 0.85em;
                margin-top: 20px;
                font-style: italic;
            }
            .cv-link {
                text-align: center;
                margin-top: 15px;
                padding-top: 15px;
                border-top: 1px solid rgba(255,255,255,0.1);
            }
            .cv-link a { color: #e94560; text-decoration: none; font-size: 0.85em; }
            .cv-link a:hover { text-decoration: underline; }
            @keyframes pulse {
                0%, 100% { box-shadow: 0 4px 20px rgba(233,69,96,0.4); }
                50% { box-shadow: 0 4px 35px rgba(233,69,96,0.7); }
            }
            .cv-btn-record.recording { animation: pulse 1s ease-in-out infinite; }
        </style>
        <div class="cv-container">
            <div class="cv-header">
                <h2 class="cv-title">🎤 CLARISSA Voice Input</h2>
                <p class="cv-subtitle">Speak to control reservoir simulation</p>
            </div>
            <div class="cv-waveform" id="cv-waveform"></div>
            <div class="cv-timer" id="cv-timer">00:00</div>
            <div class="cv-status" id="cv-status">Click "Start Recording" to begin</div>
            <div class="cv-buttons">
                <button class="cv-btn cv-btn-record" id="cv-record">🎤 Start Recording</button>
                <button class="cv-btn cv-btn-stop" id="cv-stop" style="display:none">⏹️ Stop</button>
            </div>
            <p class="cv-hint">💡 Try: "show permeability" or "what is the water cut?"</p>
            <div class="cv-link">
                <a href="https://irena-40cc50.gitlab.io/demos/voice-demo.html" target="_blank">
                    Open full demo in new tab ↗
                </a>
            </div>
        </div>
    `;
    document.body.appendChild(container);
    
    // Create waveform bars
    const waveform = document.getElementById('cv-waveform');
    for (let i = 0; i < 35; i++) {
        const bar = document.createElement('div');
        bar.className = 'cv-bar';
        waveform.appendChild(bar);
    }
    const bars = waveform.querySelectorAll('.cv-bar');
    
    const recordBtn = document.getElementById('cv-record');
    const stopBtn = document.getElementById('cv-stop');
    const status = document.getElementById('cv-status');
    const timer = document.getElementById('cv-timer');
    
    function animateWaveform() {
        if (!isRecording) return;
        analyser.getByteFrequencyData(dataArray);
        bars.forEach((bar, i) => {
            const value = dataArray[i % dataArray.length] || 0;
            bar.style.height = Math.max(8, value / 3) + 'px';
        });
        requestAnimationFrame(animateWaveform);
    }
    
    function updateTimer() {
        seconds++;
        const mins = Math.floor(seconds / 60);
        const secs = seconds % 60;
        timer.textContent = mins.toString().padStart(2, '0') + ':' + secs.toString().padStart(2, '0');
    }
    
    recordBtn.onclick = async () => {
        try {
            status.textContent = '🔄 Requesting microphone...';
            status.className = 'cv-status';
            
            const stream = await navigator.mediaDevices.getUserMedia({ 
                audio: { channelCount: 1, sampleRate: 16000, echoCancellation: true, noiseSuppression: true }
            });
            
            audioContext = new AudioContext();
            analyser = audioContext.createAnalyser();
            analyser.fftSize = 128;
            dataArray = new Uint8Array(analyser.frequencyBinCount);
            audioContext.createMediaStreamSource(stream).connect(analyser);
            
            mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
            audioChunks = [];
            mediaRecorder.ondataavailable = e => audioChunks.push(e.data);
            mediaRecorder.start(100);
            
            isRecording = true;
            seconds = 0;
            timerInterval = setInterval(updateTimer, 1000);
            animateWaveform();
            
            recordBtn.style.display = 'none';
            stopBtn.style.display = 'flex';
            status.innerHTML = '🔴 <b>Recording...</b> Speak now!';
            status.className = 'cv-status recording';
            recordBtn.classList.add('recording');
            
        } catch (err) {
            status.textContent = '❌ ' + err.message;
            status.className = 'cv-status error';
        }
    };
    
    stopBtn.onclick = () => {
        if (!isRecording) return;
        isRecording = false;
        clearInterval(timerInterval);
        
        status.textContent = '⏳ Processing...';
        status.className = 'cv-status';
        stopBtn.style.display = 'none';
        bars.forEach(bar => bar.style.height = '15px');
        
        // Attach the handler before calling stop() so it is in place when the stop event fires
        mediaRecorder.onstop = () => {
            const blob = new Blob(audioChunks, { type: 'audio/webm' });
            const reader = new FileReader();
            reader.onloadend = () => {
                window._clarissaAudioResult = reader.result;
                status.textContent = '✅ Recording complete! Processing...';
                status.className = 'cv-status success';
                google.colab.kernel.invokeFunction('notebook.process_audio', [reader.result], {});
            };
            reader.readAsDataURL(blob);
            if (audioContext) audioContext.close();
        };
        
        mediaRecorder.stop();
        mediaRecorder.stream.getTracks().forEach(t => t.stop());
    };
})();
"""

# Python callback to process audio
def process_audio(audio_base64):
    """Process recorded audio - called from JavaScript."""
    global last_recording, last_transcript
    
    clear_output(wait=True)
    
    audio_data = audio_base64.split(',')[1]
    audio_bytes = b64decode(audio_data)
    last_recording = audio_bytes
    
    print("═" * 60)
    print("✅ Recording captured!")
    print("═" * 60)
    print(f"   Audio size: {len(audio_bytes):,} bytes")
    print()
    print("🔊 Your recording:")
    display(Audio(audio_bytes, autoplay=False))
    
    api_key = os.getenv('OPENAI_API_KEY')
    if api_key:
        print()
        print("─" * 60)
        print("📝 Transcribing with Whisper...")
        print("─" * 60)
        
        try:
            from openai import OpenAI
            import tempfile
            
            client = OpenAI(api_key=api_key)
            
            with tempfile.NamedTemporaryFile(suffix='.webm', delete=False) as f:
                f.write(audio_bytes)
                temp_path = f.name
            
            try:
                with open(temp_path, 'rb') as audio_file:
                    transcript = client.audio.transcriptions.create(
                        model="whisper-1",
                        file=audio_file,
                        language="en",
                        prompt="Reservoir simulation: permeability, porosity, pressure, saturation, water cut, oil rate"
                    )
            finally:
                # Remove the temporary file even if transcription fails
                os.unlink(temp_path)
            last_transcript = transcript.text
            
            print()
            print(f'   "{transcript.text}"')
            print()
            print("─" * 60)
            print("💡 Use the `last_transcript` variable to access this text")
            
        except Exception as e:
            print(f"   ❌ Transcription failed: {e}")
            last_transcript = None
    else:
        print()
        print("─" * 60)
        print("💡 To enable transcription, set your OpenAI API key:")
        print('   os.environ["OPENAI_API_KEY"] = "sk-..."')
        print("   Then run this cell again.")
        print()
        print("💡 Use the `last_recording` variable to access the audio bytes")
        last_transcript = None
    
    print()
    print("🎤 Run this cell again to record another command")

if IN_COLAB:
    output.register_callback('notebook.process_audio', process_audio)

# ─────────────────────────────────────────────────────────────────────────────────
# Launch UI
# ─────────────────────────────────────────────────────────────────────────────────

last_recording = None
last_transcript = None

if IN_COLAB:
    print("🎤 CLARISSA Voice Input Ready!")
    print("─" * 40)
    print()
    display(Javascript(VOICE_UI))


---

## 🔑 Optional: Enable Transcription

To automatically transcribe your recordings with OpenAI Whisper, run this cell and enter your API key:

In [None]:
# Set your OpenAI API key for automatic transcription
import os
from getpass import getpass

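if not os.getenv('OPENAI_API_KEY'):
    api_key = getpass("Enter your OpenAI API key: ").strip()
    if api_key:
        os.environ['OPENAI_API_KEY'] = api_key
        print("✅ API key set! New recordings will be transcribed automatically.")
    else:
        # An empty key would make every transcription request fail, so do not store it
        print("⚠️  No key entered; transcription stays disabled.")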
else:
    print("✅ API key already configured.")
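
A recording captured before the key was set stays in `last_recording` without a transcript. The sketch below re-sends those bytes to Whisper without recording again. It assumes the `openai` Python package (v1-style client) and uses a hypothetical helper name, `transcribe_bytes`, that is not part of CLARISSA; the model, language, and prompt mirror the Quick Start cell.

```python
def transcribe_bytes(audio_bytes, client, language="en"):
    """Transcribe in-memory WebM audio with Whisper and return the text.

    `transcribe_bytes` is an illustrative helper, not part of CLARISSA.
    `client` is expected to be an `openai.OpenAI` instance.
    """
    if not audio_bytes:
        raise ValueError("No audio to transcribe; record something first.")
    result = client.audio.transcriptions.create(
        model="whisper-1",
        # The SDK accepts a (filename, contents, media type) tuple for uploads
        file=("recording.webm", audio_bytes, "audio/webm"),
        language=language,
        prompt="Reservoir simulation: permeability, porosity, pressure, "
               "saturation, water cut, oil rate",
    )
    return result.text


# Usage in the notebook, after setting OPENAI_API_KEY:
# from openai import OpenAI
# last_transcript = transcribe_bytes(last_recording, OpenAI())
```

Passing the client in as an argument keeps the helper easy to try with a stub object before spending API credits.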

---

## 📚 Working with Results

After recording, you can access your data:

```python
# Audio bytes (webm format)
last_recording

# Transcribed text (if API key set)
last_transcript
```
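
Because `last_recording` holds plain `bytes`, it can be written to disk for reuse outside the notebook. A minimal sketch, assuming a hypothetical helper name `save_recording` and an arbitrary default filename (neither is part of CLARISSA):

```python
from pathlib import Path


def save_recording(audio_bytes, path="clarissa_command.webm"):
    """Write recorded audio bytes to disk and return the resulting Path.

    `save_recording` and the default filename are illustrative names,
    not part of CLARISSA.
    """
    if not audio_bytes:
        raise ValueError("No audio to save; record something first.")
    target = Path(path)
    target.write_bytes(audio_bytes)
    return target


# Usage in the notebook, once a recording exists:
# saved = save_recording(last_recording)
# print(f"Saved {saved.stat().st_size:,} bytes to {saved}")
```

The guard against empty input matters here because `last_recording` is `None` until the first recording completes.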

### Example: Process the transcript

```python
if last_transcript:
    # Simple keyword-based intent check (case-insensitive)
    text = last_transcript.lower()
    if "permeability" in text:
        print("User wants to see permeability data!")
    elif "water cut" in text:
        print("User is asking about water cut!")
```
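
As the number of commands grows, the chain of `if`/`elif` checks can be replaced by a lookup table. The following is only a sketch: the intent names and the `match_intent` helper are hypothetical and not part of CLARISSA, and the phrases are taken from the vocabulary used in the Whisper prompt above.

```python
from typing import Optional

# Hypothetical intent names mapped to trigger phrases; not part of CLARISSA.
INTENT_KEYWORDS = {
    "show_permeability": ("permeability",),
    "show_porosity": ("porosity",),
    "query_water_cut": ("water cut", "watercut"),
    "query_oil_rate": ("oil rate",),
}


def match_intent(transcript: Optional[str]) -> Optional[str]:
    """Return the first intent whose phrase occurs in the transcript, else None."""
    if not transcript:
        return None
    text = transcript.lower()
    for intent, phrases in INTENT_KEYWORDS.items():
        if any(phrase in text for phrase in phrases):
            return intent
    return None


# Usage in the notebook:
# intent = match_intent(last_transcript)
```

Plain substring matching is deliberately simple; it does not handle negation or multiple intents in one sentence, so treat it as a starting point rather than a full parser.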

---

## 🔗 Related Resources

- **[Web Demo](https://irena-40cc50.gitlab.io/demos/voice-demo.html)** - Standalone voice interface
- **[Recording Suite](https://irena-40cc50.gitlab.io/demos/demo-recording-suite.html)** - Voice + Screen recording
- **[CLARISSA Documentation](https://irena-40cc50.gitlab.io/)** - Full project docs