# 💬 Nova2 Omni Interactive Chat Workshop

## 🎯 Workshop Overview

Welcome to the **Amazon Nova2 Omni Interactive Chat Workshop**! This hands-on session demonstrates the multimodal capabilities of Nova2 Omni, an AWS foundation model that can understand and generate text and images, and can process audio input.

### What You'll Learn:
- **Multimodal Conversations**: Chat with Nova2 using text and voice inputs
- **Real-time Audio Processing**: Record voice messages and get intelligent responses
- **Image Generation**: Request custom images through natural language
- **Interactive UI**: Experience a modern chat interface with Nova2

### Key Features Demonstrated:
- 🎤 **Voice Input**: Record audio messages for transcription and response
- 📝 **Text Chat**: Traditional text-based conversations
- 🎨 **Image Creation**: Generate images from text descriptions
- 🧠 **Intelligent Responses**: Context-aware AI conversations

---

## 📋 Prerequisites & Setup

Before we begin, ensure you have:
- ✅ AWS Account with Bedrock access
- ✅ Nova2 Omni model access enabled in `us-west-2` region
- ✅ Microphone access for voice input features
- ✅ Python environment with Jupyter notebook support

### 🔧 Installation
The following cell installs the required dependencies for our interactive chat application:

In [None]:
!pip install sounddevice ipywidgets boto3 -q
!pip install -r ../requirements.txt

### 📚 Import Libraries & Initialize Nova2 Omni

Let's import the necessary libraries and set up our connection to Amazon Nova2 Omni:

- **boto3**: AWS SDK for Python to interact with Bedrock
- **sounddevice**: For real-time audio recording
- **ipywidgets**: For creating interactive UI components
- **Nova2 Omni Model**: `us.amazon.nova-2-omni-v1:0` - AWS's multimodal foundation model
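
Before running the import cell, it can help to confirm that the packages listed above are importable in the current kernel. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the notebook.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Third-party packages the notebook imports (numpy is imported there as well).
required = ["boto3", "sounddevice", "ipywidgets", "numpy"]
missing = missing_packages(required)
if missing:
    print("Install before continuing:", ", ".join(missing))
else:
    print("All workshop dependencies are importable.")
```

If anything is reported missing, re-run the installation cell and restart the kernel before importing.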

In [5]:
import boto3
import json
import base64
import sounddevice as sd
import numpy as np
import wave
import io
import ipywidgets as widgets
import re
from IPython.display import display, HTML
from datetime import datetime

bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')
model_id = "us.amazon.nova-2-omni-v1:0"

In [6]:
class NovaChat:
    def __init__(self):
        self.recording = False
        self.audio_data = []
        self.messages = []
        
        self.chat_display = widgets.HTML(
            value=self._render_chat(),
            layout=widgets.Layout(width='100%', height='600px', overflow='auto', 
                                border='1px solid #ddd', border_radius='15px', padding='10px')
        )
        
    def _render_chat(self):
        if not self.messages:
            return """
            <div style='height: 100%; display: flex; align-items: center; justify-content: center; 
                       background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); 
                       border-radius: 10px; color: white; font-size: 18px;'>
                🤖 Start chatting with Nova Omni...
            </div>
            """
        
        return f"<div style='padding: 10px;'>{''.join(self.messages)}</div>"
    
    def format_markdown(self, text):
        # Convert a small subset of Markdown to HTML
        text = re.sub(r'\*\*(.*?)\*\*', r'<strong>\1</strong>', text)
        # Convert list bullets before emphasis so a leading "* " is not read as an opening asterisk
        text = re.sub(r'^\* (.*?)$', r'• \1', text, flags=re.MULTILINE)
        text = re.sub(r'^- (.*?)$', r'• \1', text, flags=re.MULTILINE)
        text = re.sub(r'\*(.*?)\*', r'<em>\1</em>', text)
        text = re.sub(r'^### (.*?)$', r'<h4 style="margin: 10px 0 5px 0; color: #fff;">\1</h4>', text, flags=re.MULTILINE)
        text = re.sub(r'^## (.*?)$', r'<h3 style="margin: 15px 0 8px 0; color: #fff;">\1</h3>', text, flags=re.MULTILINE)
        text = text.replace('\n', '<br>')
        text = re.sub(r'---+', '<hr style="border: 1px solid rgba(255,255,255,0.3); margin: 10px 0;">', text)
        return text
        
    def start_recording(self):
        self.recording = True
        self.audio_data = []
        
        def callback(indata, frames, time, status):
            if self.recording:
                self.audio_data.append(indata.copy())
        
        self.stream = sd.InputStream(callback=callback, samplerate=16000, channels=1, dtype=np.float32)
        self.stream.start()
        self.update_status("🎤 Recording...")
        
    def stop_recording(self):
        self.recording = False
        if hasattr(self, 'stream'):
            self.stream.stop()
            self.stream.close()
        
        if self.audio_data:
            audio_array = np.concatenate(self.audio_data, axis=0)
            # Clip to [-1, 1] first so loud samples cannot wrap around when cast to int16
            audio_int16 = (np.clip(audio_array, -1.0, 1.0) * 32767).astype(np.int16)
            
            wav_buffer = io.BytesIO()
            with wave.open(wav_buffer, 'wb') as wav_file:
                wav_file.setnchannels(1)
                wav_file.setsampwidth(2)
                wav_file.setframerate(16000)
                wav_file.writeframes(audio_int16.tobytes())
            wav_buffer.seek(0)
            
            self.process_audio(wav_buffer.getvalue())
            
    def process_audio(self, audio_bytes):
        try:
            self.update_status("🤖 Processing...")
            
            audio_base64 = base64.b64encode(audio_bytes).decode('utf-8')
            response = bedrock.invoke_model(
                modelId=model_id,
                contentType="application/json",
                accept="application/json",
                body=json.dumps({
                    "messages": [{
                        "role": "user",
                        "content": [
                            {"audio": {"format": "wav", "source": {"bytes": audio_base64}}},
                            {"text": "Transcribe this audio."}
                        ]
                    }],
                    "inferenceConfig": {"maxTokens": 500}
                })
            )
            
            result = json.loads(response['body'].read())
            text = result['output']['message']['content'][0]['text']
            
            self.add_message("You", text, True)
            self.get_response(text)
            
        except Exception as e:
            self.update_status(f"❌ Error: {e}")
            
    def update_status(self, message):
        if hasattr(self, 'status_widget'):
            color = "#28a745" if "✅" in message else "#dc3545" if "❌" in message else "#007bff"
            self.status_widget.value = f"<div style='padding: 8px 15px; background: {color}; color: white; border-radius: 20px; text-align: center; font-weight: 500;'>{message}</div>"
    
    def add_message(self, sender, message, is_user):
        time = datetime.now().strftime("%H:%M")
        
        # Format AI responses with markdown
        if not is_user:
            message = self.format_markdown(message)
        
        if is_user:
            style = "background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; float: right; border-radius: 20px 20px 5px 20px;"
            icon = "👤"
        else:
            style = "background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; float: left; border-radius: 20px 20px 20px 5px;"
            icon = "🤖"
        
        html = f"""
        <div style="margin: 15px 0; clear: both;">
            <div style="{style} padding: 15px 20px; display: inline-block; max-width: 95%; box-shadow: 0 2px 10px rgba(0,0,0,0.1);">
                <div style="font-size: 11px; opacity: 0.8; margin-bottom: 8px;">{icon} {sender} • {time}</div>
                <div style="font-size: 14px; line-height: 1.5;">{message}</div>
            </div>
        </div>
        """
        
        self.messages.append(html)
        self.chat_display.value = self._render_chat()
    
    def add_image_message(self, sender, image_base64):
        time = datetime.now().strftime("%H:%M")
        
        html = f"""
        <div style="margin: 15px 0; clear: both;">
            <div style="background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; float: left; border-radius: 20px 20px 20px 5px; padding: 15px 20px; display: inline-block; max-width: 95%; box-shadow: 0 2px 10px rgba(0,0,0,0.1);">
                <div style="font-size: 11px; opacity: 0.8; margin-bottom: 8px;">🤖 {sender} • {time}</div>
                <div style="font-size: 14px; margin-bottom: 10px;">Generated image:</div>
                <img src="data:image/png;base64,{image_base64}" style="max-width: 100%; border-radius: 10px;" />
            </div>
        </div>
        """
        
        self.messages.append(html)
        self.chat_display.value = self._render_chat()
        
    def send_message(self, text):
        if not text.strip():
            return
            
        self.add_message("You", text, True)
        self.get_response(text)
        
    def get_response(self, text):
        try:
            self.update_status("🤖 Thinking...")
            
            response = bedrock.invoke_model(
                modelId=model_id,
                contentType="application/json",
                accept="application/json",
                body=json.dumps({
                    "messages": [{"role": "user", "content": [{"text": text}]}],
                    "inferenceConfig": {"maxTokens": 2000, "temperature": 0.7}
                })
            )
            
            result = json.loads(response['body'].read())
            
            for content_item in result['output']['message']['content']:
                if 'text' in content_item:
                    self.add_message("Nova", content_item['text'], False)
                elif 'image' in content_item:
                    image_data = content_item['image']['source']['bytes']
                    self.add_image_message("Nova", image_data)
            
            self.update_status("✅ Ready")
                        
        except Exception as e:
            self.update_status(f"❌ Error: {e}")
    
    def clear_chat(self):
        self.messages = []
        self.chat_display.value = self._render_chat()
        self.update_status("✅ Chat cleared")

chat = NovaChat()
print("🚀 Nova Chat Ready!")

🚀 Nova Chat Ready!


## 🏗️ Understanding the NovaChat Class

The `NovaChat` class we just created provides:

### Core Capabilities:
- **🎤 Audio Recording**: Real-time voice capture and processing
- **📝 Text Processing**: Natural language understanding and generation
- **🎨 Image Generation**: Create images from text descriptions
- **💬 Chat Interface**: Modern, responsive UI with message history
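
The audio path works by converting float samples into 16-bit PCM and wrapping them in a WAV container before they are sent to the model. The sketch below illustrates that conversion with the standard library only, using a synthetic sine tone in place of microphone input. The helper name `float_to_wav_bytes` is hypothetical, and the clamping step is an assumption added here to keep out-of-range samples from overflowing.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # matches the rate the notebook records at

def float_to_wav_bytes(samples, sample_rate=SAMPLE_RATE):
    """Encode float samples in [-1.0, 1.0] as mono 16-bit PCM WAV bytes."""
    # Clamp first so out-of-range input cannot overflow the int16 range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buffer.getvalue()

# One second of a 440 Hz tone stands in for microphone input.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
wav_bytes = float_to_wav_bytes(tone)
print(len(wav_bytes), "bytes of WAV data")
```

Writing to an in-memory `BytesIO` buffer avoids temporary files, which is the same approach `stop_recording()` takes.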

### Key Methods:
- `start_recording()` / `stop_recording()`: Handle voice input
- `process_audio()`: Convert speech to text using Nova2 Omni
- `get_response()`: Generate intelligent responses
- `add_message()`: Display formatted chat messages

---

In [7]:
display(HTML("""
<style>
.widget-button {
    border-radius: 50% !important;
    font-weight: 600 !important;
    padding: 0 !important;
    margin: 5px !important;
    transition: all 0.3s ease !important;
    font-size: 18px !important;
}
.widget-textarea textarea {
    border-radius: 15px !important;
    border: 2px solid #e0e0e0 !important;
    padding: 15px !important;
    font-size: 14px !important;
    font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif !important;
}
.widget-textarea textarea:focus {
    border-color: #667eea !important;
    box-shadow: 0 0 10px rgba(102, 126, 234, 0.3) !important;
}
</style>
"""))

text_input = widgets.Textarea(
    placeholder='💭 Ask me anything or request an image...',
    layout=widgets.Layout(width='100%', height='90px')
)

send_btn = widgets.Button(description='📤', button_style='primary', layout=widgets.Layout(width='50px', height='50px'))
record_btn = widgets.Button(description='🎤', button_style='info', layout=widgets.Layout(width='50px', height='50px'))
clear_btn = widgets.Button(description='🗑️', button_style='danger', layout=widgets.Layout(width='50px', height='50px'))

status_widget = widgets.HTML(value="<div style='padding: 10px 20px; background: #28a745; color: white; border-radius: 25px; text-align: center; font-weight: 500; box-shadow: 0 2px 10px rgba(40, 167, 69, 0.3);'>✅ Ready to chat</div>")
chat.status_widget = status_widget

def on_send(b):
    if text_input.value.strip():
        chat.send_message(text_input.value)
        text_input.value = ''

def on_record_toggle(b):
    if not chat.recording:
        chat.start_recording()
        record_btn.description = '⏹'
        record_btn.button_style = 'warning'
    else:
        chat.stop_recording()
        record_btn.description = '🎤'
        record_btn.button_style = 'info'

def on_clear(b):
    chat.clear_chat()
    text_input.value = ''

send_btn.on_click(on_send)
record_btn.on_click(on_record_toggle)
clear_btn.on_click(on_clear)

header = widgets.HTML("""
<div style='text-align: center; padding: 25px; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); 
           border-radius: 20px; margin-bottom: 20px; color: white; box-shadow: 0 4px 20px rgba(102, 126, 234, 0.3);'>
    <h2 style='margin: 0; font-size: 32px; font-weight: 700;'>🤖 Nova2 Omni Chat</h2>
    <p style='margin: 8px 0 0 0; opacity: 0.9; font-size: 16px;'>Intelligent Text & Image Generation</p>
</div>
""")

input_section = widgets.VBox([
    widgets.HTML("<h4 style='color: #333; margin: 15px 0 10px 0; font-size: 18px;'>💬 Your Message</h4>"),
    text_input,
    widgets.HBox([send_btn, record_btn, clear_btn], 
                 layout=widgets.Layout(justify_content='center', margin='15px 0')),
    status_widget
], layout=widgets.Layout(padding='25px', background='#f8f9fa', border_radius='20px', 
                        box_shadow='0 2px 15px rgba(0,0,0,0.1)'))

main_ui = widgets.VBox([
    header,
    chat.chat_display,
    input_section
], layout=widgets.Layout(width='100%', margin='0', padding='10px'))

display(main_ui)


## 🎮 How to Use the Interactive Chat

### Interface Controls:
- **📤 Send Button**: Send your typed message to Nova2 Omni
- **🎤 Record Button**: Click to start recording, click again (⏹) to stop and send
- **🗑️ Clear Button**: Clear the entire chat history

### Try These Examples:

#### 💬 Text Conversations:
- "Explain quantum computing in simple terms"
- "Write a creative story about a robot chef"
- "Help me plan a weekend trip to Seattle"

#### 🎨 Image Generation:
- "Create an image of a futuristic city at sunset"
- "Generate a logo for a coffee shop called 'Bean Dreams'"
- "Draw a cartoon cat wearing a space helmet"
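
When a prompt like these produces an image, `get_response()` walks the response's content items and treats each one as either `text` or a base64-encoded `image`. The sketch below shows that branching on a hand-built response dictionary, so no service call is made. The helper name `split_content` is hypothetical, and the response shape is copied from the notebook's parsing code rather than confirmed against the service documentation.

```python
import base64

def split_content(result):
    """Separate text parts and decoded image bytes from a response dict."""
    texts, images = [], []
    for item in result["output"]["message"]["content"]:
        if "text" in item:
            texts.append(item["text"])
        elif "image" in item:
            # The notebook treats this field as base64 text suitable for a data URI.
            images.append(base64.b64decode(item["image"]["source"]["bytes"]))
    return texts, images

# Hand-built stand-in for a model response (the 8 bytes are the PNG signature only).
fake_png = b"\x89PNG\r\n\x1a\n"
sample = {"output": {"message": {"content": [
    {"text": "Here is your image."},
    {"image": {"format": "png",
               "source": {"bytes": base64.b64encode(fake_png).decode("ascii")}}},
]}}}
texts, images = split_content(sample)
print(texts, len(images))
```

Decoded bytes could then be written to disk with `open(path, "wb")` if you want to keep generated images beyond the chat session.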

#### 🎤 Voice Interactions:
- Click the microphone button and speak your question
- Nova2 will transcribe your speech and respond intelligently
- Perfect for hands-free interaction!
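
A voice interaction is a two-step flow: the recording is transcribed first, and the transcript is then sent as an ordinary text message. The sketch below models that flow with injected callables so it runs without a microphone or AWS credentials. The function `voice_turn` and both stand-in callables are hypothetical, and the empty-transcript guard is an addition that the notebook does not include.

```python
def voice_turn(audio_bytes, transcribe, respond):
    """Run one voice interaction: transcribe first, then answer the transcript."""
    transcript = transcribe(audio_bytes).strip()
    if not transcript:
        return None, None  # nothing intelligible was recorded
    return transcript, respond(transcript)

# Stand-in callables; in the notebook both steps are Bedrock requests.
def fake_transcribe(audio):
    return "  what is the capital of France?  "

def fake_respond(text):
    return f"You asked: {text}"

transcript, reply = voice_turn(b"\x00\x01", fake_transcribe, fake_respond)
print(transcript)
print(reply)
```

Passing the two steps in as functions keeps the flow testable and makes it simple to swap in a different transcription backend.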

### 🚀 Ready to Chat!
The interface above is now ready for interaction. Start by typing a message or clicking the microphone to record your voice!

## 🎯 Workshop Summary

Congratulations! You've successfully built and interacted with an intelligent multimodal chat application using Amazon Nova2 Omni.

### What We Accomplished:
- ✅ **Multimodal AI Integration**: Combined text, voice, and image capabilities
- ✅ **Real-time Audio Processing**: Implemented voice-to-text functionality
- ✅ **Interactive UI**: Created a modern chat interface with icon-based controls
- ✅ **Image Generation**: Demonstrated Nova2's creative capabilities

### Key Takeaways:
- **Nova2 Omni** is a powerful multimodal model that can handle diverse input types
- **Real-time interaction** is possible with proper audio handling and UI design
- **Bedrock integration** makes it easy to access advanced AI capabilities
- **Interactive notebooks** provide an excellent platform for AI experimentation

### Next Steps:
- Explore other Nova2 capabilities in the repository's workshops
- Customize the chat interface for your specific use cases
- Integrate this pattern into your own applications
- Experiment with different prompt engineering techniques
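
One possible customization is multi-turn memory. The notebook's `get_response()` sends only the latest message, so the model does not see earlier turns. The sketch below accumulates turns in a small helper class, assuming (this is not confirmed by the notebook) that the model accepts alternating `user` and `assistant` entries in the same `messages` schema. The class name `Conversation` and its `max_turns` cap are hypothetical.

```python
class Conversation:
    """Accumulate alternating user/assistant turns for multi-turn requests."""

    def __init__(self, max_turns=10):
        self.max_turns = max_turns  # cap on stored user/assistant pairs
        self.messages = []

    def add_user(self, text):
        self.messages.append({"role": "user", "content": [{"text": text}]})

    def add_assistant(self, text):
        self.messages.append({"role": "assistant", "content": [{"text": text}]})
        # Drop the oldest messages once the cap on pairs is exceeded.
        excess = len(self.messages) - 2 * self.max_turns
        if excess > 0:
            self.messages = self.messages[excess:]

    def request_body(self, max_tokens=2000):
        """Return a dict ready for json.dumps, including all stored turns."""
        return {"messages": list(self.messages),
                "inferenceConfig": {"maxTokens": max_tokens}}

convo = Conversation(max_turns=2)
convo.add_user("Name a city in Washington state.")
convo.add_assistant("Seattle.")
convo.add_user("What is it known for?")
body = convo.request_body()
print(len(body["messages"]), "messages in the next request")
```

Capping the stored turns keeps request size bounded; a production version would more likely trim by token count than by message count.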

---

**🚀 Continue exploring the other workshops in this repository to discover more Nova2 Omni capabilities!**