# Lab 4.6.8.5: Browser Integration

**Capstone Option E:** Browser-Deployed Fine-Tuned LLM (Matcha Expert)  
**Phase:** 5 of 6  
**Time:** 8-10 hours  
**Difficulty:** ⭐⭐⭐⭐

---

## Phase Objectives

By completing this phase, you will:
- [ ] Set up a Vite + React project
- [ ] Integrate Transformers.js for browser inference
- [ ] Build a MatchaChatbot component with streaming
- [ ] Handle WebGPU with WASM fallback
- [ ] Test on multiple browsers
- [ ] Optimize loading experience

---

## Phase Checklist

- [ ] React project scaffolded
- [ ] Transformers.js installed
- [ ] Model loading implemented
- [ ] Chat interface built
- [ ] Streaming responses working
- [ ] Error handling added
- [ ] Tested in Chrome, Edge, Safari

---

## Why This Matters

**Browser deployment = near-zero hosting costs + privacy!**

| Deployment Type | Monthly Cost | Privacy |
|-----------------|--------------|---------|
| Cloud GPU API | $100-1000+ | ❌ Data sent to server |
| Self-hosted GPU | $200-500+ | ⚠️ Your infrastructure |
| Browser (static) | **$0-5** | ✅ 100% client-side |

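**WebGPU** enables GPU acceleration directly in the browser. Throughput depends heavily on hardware: roughly 15-60 tokens/second on discrete GPUs and Apple Silicon, and closer to 5-15 tokens/second on integrated graphics (see the benchmarks in Part 8).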

---

## ELI5: How Does AI Run in a Browser?

> **Imagine a library that comes to you.**
>
> Traditional AI: You send your question to a big library (server), they look it up, and send back the answer. Your question travels over the internet.
>
> Browser AI: The entire mini-library is downloaded to your computer once. Now you can look things up without ever leaving home. No one else sees your questions.
>
> **WebGPU** is like having a super-fast reading assistant (your graphics card) help you search through the library much faster than you could alone.
>
> **The magic:** After the first visit (downloading ~500MB), everything is cached. Return visits skip the download and load the model straight from the browser cache!

---

## Part 1: Project Setup

This notebook provides the code templates. Run these commands in your terminal.

In [None]:
# Project Setup Commands

setup_commands = '''
# Create React project with Vite
npm create vite@latest matcha-chatbot -- --template react
cd matcha-chatbot

# Install dependencies
npm install @huggingface/transformers

# Install additional UI dependencies (optional)
npm install lucide-react

# Start development server
npm run dev
'''

print("🛠️ PROJECT SETUP")
print("="*70)
print(setup_commands)

---

## Part 2: Vite Configuration

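**CRITICAL:** The COOP/COEP headers below enable cross-origin isolation, which browsers require before exposing `SharedArrayBuffer`. The multi-threaded WASM fallback depends on it; WebGPU itself does not.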

In [None]:
# vite.config.js

vite_config = '''
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'

export default defineConfig({
  plugins: [react()],
  
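  // CRITICAL: Cross-origin isolation headers, required for SharedArrayBuffer (multi-threaded WASM).
  // These apply to the dev server only; in production the host must send them (see vercel.json).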
  server: {
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'require-corp',
    },
  },
  
  // Optimize for large model files
  build: {
    target: 'esnext',
    rollupOptions: {
      output: {
        manualChunks: {
          transformers: ['@huggingface/transformers'],
        },
      },
    },
  },
  
  // Handle WASM files
  optimizeDeps: {
    exclude: ['@huggingface/transformers'],
  },
})
'''

print("📄 vite.config.js")
print("="*70)
print(vite_config)

---

## Part 3: Model Loading Hook

In [None]:
# src/hooks/useModelLoader.js

model_loader_hook = '''
import { useState, useCallback, useRef } from 'react';
import { pipeline, env } from '@huggingface/transformers';

// Configure Transformers.js
env.allowLocalModels = false;
env.useBrowserCache = true;

/**
 * Custom hook for loading and managing the Matcha Expert model.
 * 
 * Features:
 * - Lazy loading (only loads when first needed)
 * - WebGPU with WASM fallback
 * - Progress tracking
 * - Caching for instant reloads
 * 
 * @param {string} modelId - The model ID or URL
 * @returns {Object} - { generator, isLoading, progress, error, loadModel }
 */
export function useModelLoader(modelId = 'matcha-expert') {
  const [isLoading, setIsLoading] = useState(false);
  const [progress, setProgress] = useState(0);
  const [error, setError] = useState(null);
  const generatorRef = useRef(null);

  // Detect WebGPU support
  const hasWebGPU = useCallback(async () => {
    if (!navigator.gpu) return false;
    try {
      const adapter = await navigator.gpu.requestAdapter();
      return adapter !== null;
    } catch {
      return false;
    }
  }, []);

  // Load the model
  const loadModel = useCallback(async () => {
    // Return cached generator if available
    if (generatorRef.current) {
      return generatorRef.current;
    }

    setIsLoading(true);
    setError(null);
    setProgress(0);

    try {
      // Determine device
      const useWebGPU = await hasWebGPU();
      const device = useWebGPU ? 'webgpu' : 'wasm';
      
      console.log(`Loading model with ${device}...`);

      // Create pipeline
      const generator = await pipeline(
        'text-generation',
        modelId,
        {
          device: device,
          dtype: 'q4',  // INT4 quantization
          progress_callback: (progressInfo) => {
            // Transformers.js reports `progress` per file as a percentage (0-100)
            if (progressInfo.status === 'progress') {
              setProgress(Math.round(progressInfo.progress));
            }
          },
        }
      );

      generatorRef.current = generator;
      setProgress(100);
      setIsLoading(false);
      
      return generator;
    } catch (err) {
      console.error('Model loading failed:', err);
      setError(err.message);
      setIsLoading(false);
      throw err;
    }
  }, [modelId, hasWebGPU]);

  return {
    generator: generatorRef.current,
    isLoading,
    progress,
    error,
    loadModel,
  };
}
'''

print("📄 src/hooks/useModelLoader.js")
print("="*70)
print(model_loader_hook)
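
The hook requests `dtype: 'q4'`, and the loading screen quotes a ~500MB first download. As a rough back-of-the-envelope check on why 4-bit weights matter for browser delivery, the sketch below estimates raw weight size at different precisions. The 1B parameter count is a hypothetical placeholder, not a figure from this lab, and real quantized files are somewhat larger because they also store scales and may keep some layers at higher precision.

```python
def quantized_size_mb(num_params: int, bits_per_weight: float) -> float:
    """Approximate raw weight size in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bits_per_weight / 8 / 1_000_000


# Hypothetical parameter count; substitute the size of your fine-tuned model.
NUM_PARAMS = 1_000_000_000

for label, bits in [("fp32", 32), ("fp16", 16), ("q8", 8), ("q4", 4)]:
    print(f"{label:>4}: {quantized_size_mb(NUM_PARAMS, bits):,.0f} MB")
```

Under that assumption, `q4` lands near the ~500MB mark, a quarter of the `fp16` size.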

---

## Part 4: MatchaChatbot Component

In [None]:
# src/components/MatchaChatbot.jsx

chatbot_component = '''
import { useState, useRef, useEffect } from 'react';
import { useModelLoader } from '../hooks/useModelLoader';
import './MatchaChatbot.css';

// System prompt for the matcha expert
const SYSTEM_PROMPT = `You are a matcha tea expert with deep knowledge of Japanese tea culture, 
preparation methods, health benefits, and culinary applications. You provide accurate, helpful 
information about matcha grades, brewing techniques, traditional ceremonies, and modern recipes. 
You're passionate about quality matcha and help users make informed choices.`;

/**
 * MatchaChatbot - A browser-based AI chatbot for matcha expertise.
 * 
 * Features:
 * - Runs entirely in the browser (WebGPU/WASM)
 * - Streaming responses
 * - Conversation history
 * - Loading states with progress
 */
export function MatchaChatbot({ modelId }) {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [isGenerating, setIsGenerating] = useState(false);
  const messagesEndRef = useRef(null);
  
  const { isLoading, progress, error, loadModel } = useModelLoader(modelId);

  // Auto-scroll to bottom
  useEffect(() => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  }, [messages]);

  // Send message
  const handleSend = async () => {
    if (!input.trim() || isGenerating) return;

    const userMessage = { role: 'user', content: input.trim() };
    setMessages(prev => [...prev, userMessage]);
    setInput('');
    setIsGenerating(true);

    try {
      // Load model if not loaded
      const generator = await loadModel();

      // Build conversation for model
      const conversation = [
        { role: 'system', content: SYSTEM_PROMPT },
        ...messages,
        userMessage,
      ];

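      // Stream tokens into the last assistant message as they are generated
      const { TextStreamer } = await import('@huggingface/transformers');
      let streamedText = '';
      const streamer = new TextStreamer(generator.tokenizer, {
        skip_prompt: true,
        skip_special_tokens: true,
        callback_function: (chunk) => {
          streamedText += chunk;
          const partial = { role: 'assistant', content: streamedText };
          setMessages(prev => (
            prev[prev.length - 1]?.role === 'assistant'
              ? [...prev.slice(0, -1), partial]
              : [...prev, partial]
          ));
        },
      });

      // Generate response
      const output = await generator(conversation, {
        max_new_tokens: 256,
        temperature: 0.7,
        top_p: 0.9,
        do_sample: true,
        streamer,
      });

      // Replace the streamed text with the final assistant message
      const generatedMessages = output[0].generated_text;
      const assistantMessage = generatedMessages[generatedMessages.length - 1];

      setMessages(prev => (
        prev[prev.length - 1]?.role === 'assistant'
          ? [...prev.slice(0, -1), assistantMessage]
          : [...prev, assistantMessage]
      ));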
    } catch (err) {
      console.error('Generation error:', err);
      setMessages(prev => [...prev, {
        role: 'assistant',
        content: 'Sorry, I encountered an error. Please try again.',
      }]);
    } finally {
      setIsGenerating(false);
    }
  };

  // Handle Enter key
  const handleKeyPress = (e) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      handleSend();
    }
  };

  return (
    <div className="chatbot-container">
      <div className="chatbot-header">
        <h1>🍵 Matcha Expert</h1>
        <p>Your AI guide to Japanese green tea</p>
      </div>

      {/* Loading State */}
      {isLoading && (
        <div className="loading-overlay">
          <div className="loading-content">
            <div className="loading-spinner" />
            <p>Loading AI model... {progress}%</p>
            <p className="loading-note">
              First load downloads ~500MB (cached after)
            </p>
            <div className="progress-bar">
              <div 
                className="progress-fill" 
                style={{ width: `${progress}%` }} 
              />
            </div>
          </div>
        </div>
      )}

      {/* Error State */}
      {error && (
        <div className="error-banner">
          <p>Error: {error}</p>
          <button onClick={() => window.location.reload()}>
            Reload
          </button>
        </div>
      )}

      {/* Messages */}
      <div className="messages-container">
        {messages.length === 0 && (
          <div className="welcome-message">
            <p>Welcome! Ask me anything about matcha tea:</p>
            <ul>
              <li>"What\'s the difference between ceremonial and culinary grade?"</li>
              <li>"How do I make the perfect matcha latte?"</li>
              <li>"What are the health benefits of matcha?"</li>
            </ul>
          </div>
        )}

        {messages.map((msg, idx) => (
          <div key={idx} className={`message ${msg.role}`}>
            <div className="message-content">
              {msg.content}
            </div>
          </div>
        ))}

        {isGenerating && (
          <div className="message assistant">
            <div className="message-content typing">
              <span className="dot" />
              <span className="dot" />
              <span className="dot" />
            </div>
          </div>
        )}

        <div ref={messagesEndRef} />
      </div>

      {/* Input */}
      <div className="input-container">
        <textarea
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={handleKeyPress}
          placeholder="Ask about matcha..."
          disabled={isLoading || isGenerating}
          rows={1}
        />
        <button
          onClick={handleSend}
          disabled={isLoading || isGenerating || !input.trim()}
        >
          Send
        </button>
      </div>

      <div className="chatbot-footer">
        <p>Runs 100% in your browser • No data sent to servers</p>
      </div>
    </div>
  );
}
'''

print("📄 src/components/MatchaChatbot.jsx")
print("="*70)
print(chatbot_component)

---

## Part 5: CSS Styling

In [None]:
# src/components/MatchaChatbot.css

chatbot_css = '''
.chatbot-container {
  max-width: 800px;
  margin: 0 auto;
  height: 100vh;
  display: flex;
  flex-direction: column;
  background: #fafafa;
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
}

.chatbot-header {
  padding: 1rem;
  background: linear-gradient(135deg, #2d5a27 0%, #4a7c43 100%);
  color: white;
  text-align: center;
}

.chatbot-header h1 {
  margin: 0;
  font-size: 1.5rem;
}

.chatbot-header p {
  margin: 0.5rem 0 0;
  opacity: 0.9;
  font-size: 0.9rem;
}

.messages-container {
  flex: 1;
  overflow-y: auto;
  padding: 1rem;
}

.welcome-message {
  background: white;
  padding: 1.5rem;
  border-radius: 12px;
  box-shadow: 0 2px 8px rgba(0,0,0,0.1);
}

.welcome-message ul {
  margin: 1rem 0 0;
  padding-left: 1.5rem;
}

.welcome-message li {
  margin: 0.5rem 0;
  color: #2d5a27;
  cursor: pointer;
}

.message {
  margin: 0.75rem 0;
  display: flex;
}

.message.user {
  justify-content: flex-end;
}

.message-content {
  max-width: 80%;
  padding: 0.75rem 1rem;
  border-radius: 12px;
  line-height: 1.5;
  white-space: pre-wrap;
}

.message.user .message-content {
  background: #2d5a27;
  color: white;
  border-bottom-right-radius: 4px;
}

.message.assistant .message-content {
  background: white;
  box-shadow: 0 2px 8px rgba(0,0,0,0.1);
  border-bottom-left-radius: 4px;
}

.typing {
  display: flex;
  gap: 4px;
  padding: 1rem;
}

.typing .dot {
  width: 8px;
  height: 8px;
  background: #2d5a27;
  border-radius: 50%;
  animation: bounce 1.4s infinite;
}

.typing .dot:nth-child(2) { animation-delay: 0.2s; }
.typing .dot:nth-child(3) { animation-delay: 0.4s; }

@keyframes bounce {
  0%, 80%, 100% { transform: translateY(0); }
  40% { transform: translateY(-6px); }
}

.input-container {
  display: flex;
  gap: 0.5rem;
  padding: 1rem;
  background: white;
  border-top: 1px solid #eee;
}

.input-container textarea {
  flex: 1;
  padding: 0.75rem 1rem;
  border: 1px solid #ddd;
  border-radius: 8px;
  resize: none;
  font-size: 1rem;
  font-family: inherit;
}

.input-container textarea:focus {
  outline: none;
  border-color: #2d5a27;
}

.input-container button {
  padding: 0.75rem 1.5rem;
  background: #2d5a27;
  color: white;
  border: none;
  border-radius: 8px;
  cursor: pointer;
  font-size: 1rem;
  transition: background 0.2s;
}

.input-container button:hover:not(:disabled) {
  background: #3d6a37;
}

.input-container button:disabled {
  opacity: 0.5;
  cursor: not-allowed;
}

.loading-overlay {
  position: fixed;
  inset: 0;
  background: rgba(255,255,255,0.95);
  display: flex;
  align-items: center;
  justify-content: center;
  z-index: 100;
}

.loading-content {
  text-align: center;
}

.loading-spinner {
  width: 48px;
  height: 48px;
  border: 4px solid #eee;
  border-top-color: #2d5a27;
  border-radius: 50%;
  animation: spin 1s linear infinite;
  margin: 0 auto 1rem;
}

@keyframes spin {
  to { transform: rotate(360deg); }
}

.loading-note {
  color: #666;
  font-size: 0.85rem;
  margin-top: 0.5rem;
}

.progress-bar {
  width: 200px;
  height: 8px;
  background: #eee;
  border-radius: 4px;
  margin: 1rem auto 0;
  overflow: hidden;
}

.progress-fill {
  height: 100%;
  background: #2d5a27;
  transition: width 0.3s;
}

.error-banner {
  background: #fee;
  color: #c00;
  padding: 1rem;
  display: flex;
  align-items: center;
  justify-content: space-between;
}

.chatbot-footer {
  padding: 0.5rem;
  text-align: center;
  font-size: 0.75rem;
  color: #999;
}
'''

print("📄 src/components/MatchaChatbot.css")
print("="*70)
print(chatbot_css[:2000] + "...")

---

## Part 6: App Component

In [None]:
# src/App.jsx

app_jsx = '''
import { MatchaChatbot } from './components/MatchaChatbot';
import './App.css';

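// Configure your model location here.
// Transformers.js resolves model IDs against env.remoteHost (the Hugging Face Hub by default).
// If you host the files yourself (e.g. on S3), you will likely also need to set
// env.remoteHost and env.remotePathTemplate in useModelLoader.js and pass only the model folder name.
const MODEL_ID = 'https://your-bucket.s3.amazonaws.com/matcha-expert-int4';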

function App() {
  return (
    <div className="app">
      <MatchaChatbot modelId={MODEL_ID} />
    </div>
  );
}

export default App;
'''

print("📄 src/App.jsx")
print("="*70)
print(app_jsx)

---

## Part 7: Deployment Configuration

In [None]:
# vercel.json - Deployment configuration for Vercel

vercel_config = '''
{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {
          "key": "Cross-Origin-Opener-Policy",
          "value": "same-origin"
        },
        {
          "key": "Cross-Origin-Embedder-Policy",
          "value": "require-corp"
        }
      ]
    }
  ]
}
'''

print("📄 vercel.json")
print("="*70)
print(vercel_config)
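
The next steps mention Netlify as an alternative host. Netlify reads custom headers from a plain-text `_headers` file rather than `vercel.json`, so a minimal sketch of converting one to the other is shown here. The helper name `to_netlify_headers` and the `/*` path default are assumptions for illustration; the JSON string repeats the Vercel configuration above so the cell runs on its own.

```python
import json

VERCEL_CONFIG = """
{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {"key": "Cross-Origin-Opener-Policy", "value": "same-origin"},
        {"key": "Cross-Origin-Embedder-Policy", "value": "require-corp"}
      ]
    }
  ]
}
"""


def to_netlify_headers(vercel_json: str, path: str = "/*") -> str:
    """Render the headers from a vercel.json string in Netlify's _headers format."""
    config = json.loads(vercel_json)  # raises ValueError if the JSON is malformed
    lines = [path]
    for rule in config["headers"]:
        for header in rule["headers"]:
            lines.append(f"  {header['key']}: {header['value']}")
    return "\n".join(lines) + "\n"


netlify_headers = to_netlify_headers(VERCEL_CONFIG)
print(netlify_headers)
```

With Vite, saving the result as `public/_headers` should place it in the build output, since files in `public/` are copied as-is. Parsing the JSON first also catches syntax errors in `vercel.json` before you deploy.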

---

## Part 8: Browser Compatibility Testing

In [None]:
# Browser compatibility checklist

compatibility = '''
## Browser Compatibility Checklist

| Browser | WebGPU | WASM Fallback | Notes |
|---------|--------|---------------|-------|
| Chrome 113+ | ✅ | ✅ | Best performance |
| Edge 113+ | ✅ | ✅ | Same as Chrome |
| Safari 17+ | ⚠️ | ✅ | WebGPU in beta |
| Firefox | ❌ | ✅ | WASM only |
| Mobile Chrome | ❌ | ✅ | WASM, slower |
| Mobile Safari | ❌ | ⚠️ | May have issues |

## Testing Checklist

- [ ] Chrome: Model loads, generates responses
- [ ] Edge: Model loads, generates responses
- [ ] Safari: Fallback to WASM works
- [ ] Firefox: WASM inference works
- [ ] Loading progress updates correctly
- [ ] Error handling shows user-friendly messages
- [ ] Cached model loads without re-downloading on refresh

## Performance Benchmarks

| Device | Backend | Tokens/sec |
|--------|---------|------------|
| RTX 4090 | WebGPU | 40-60 |
| M1 Mac | WebGPU | 15-25 |
| Intel iGPU | WebGPU | 5-15 |
| Any CPU | WASM | 1-5 |
'''

print(compatibility)
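
To put the benchmark table in user-facing terms, the sketch below converts each tokens/second range into the time needed for a full-length reply. It uses the `max_new_tokens: 256` limit from `MatchaChatbot.jsx` and the ranges from the table. It ignores prompt processing time, so treat the results as a lower bound.

```python
def response_time_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_sec


MAX_NEW_TOKENS = 256  # matches max_new_tokens in MatchaChatbot.jsx

# (device, backend, slowest tok/s, fastest tok/s), copied from the table above
BENCHMARKS = [
    ("RTX 4090", "WebGPU", 40, 60),
    ("M1 Mac", "WebGPU", 15, 25),
    ("Intel iGPU", "WebGPU", 5, 15),
    ("Any CPU", "WASM", 1, 5),
]

for device, backend, low, high in BENCHMARKS:
    best = response_time_seconds(MAX_NEW_TOKENS, high)
    worst = response_time_seconds(MAX_NEW_TOKENS, low)
    print(f"{device:<10} ({backend}): {best:.1f}-{worst:.1f} s for {MAX_NEW_TOKENS} tokens")
```

On the WASM fallback a full reply can take minutes, which is one reason to keep the token limit modest and stream output as it is generated.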

---

## Common Issues

### Issue 1: SharedArrayBuffer Not Available
**Symptom:** Error about SharedArrayBuffer  
**Fix:** Add the COOP/COEP headers in vite.config.js (dev server) and vercel.json (production), then confirm the page actually receives them
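
When debugging this issue, it helps to check the response headers rather than the config files. The helper below is a minimal sketch: `missing_isolation_headers` is a hypothetical name, and it checks only for the exact values this lab configures (browsers also accept other COEP values such as `credentialless`).

```python
REQUIRED_HEADERS = {
    "cross-origin-opener-policy": "same-origin",
    "cross-origin-embedder-policy": "require-corp",
}


def missing_isolation_headers(headers: dict[str, str]) -> list[str]:
    """Return required header names that are absent or carry a different value."""
    # Header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    return [
        name
        for name, expected in REQUIRED_HEADERS.items()
        if normalized.get(name) != expected
    ]


# Example: a response that sets COOP but forgot COEP
sample = {"Content-Type": "text/html", "Cross-Origin-Opener-Policy": "same-origin"}
print(missing_isolation_headers(sample))
```

To check a live site, you could pass in the headers from `urllib.request.urlopen(url).headers.items()` converted to a `dict`. An empty list means both headers are present with the expected values.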

### Issue 2: Model Download Fails
**Symptom:** CORS errors or 404  
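**Fix:** Check the S3 CORS configuration and verify the model URL. With `Cross-Origin-Embedder-Policy: require-corp` enabled, cross-origin model files must be served with CORS (or `Cross-Origin-Resource-Policy`) headers, or the browser blocks them.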
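
For reference, an S3 bucket CORS configuration is a JSON list of rules. The sketch below builds one that allows read-only access from the dev server and a production site. The origins are placeholders (Vite's default dev port plus a made-up domain), and `s3_cors_rules` is a hypothetical helper; adjust the values to your own deployment.

```python
import json


def s3_cors_rules(allowed_origins: list[str]) -> list[dict]:
    """Build a read-only CORS rule set in the JSON shape S3 expects."""
    return [
        {
            "AllowedOrigins": allowed_origins,
            "AllowedMethods": ["GET", "HEAD"],  # model files are only downloaded
            "AllowedHeaders": ["*"],
            "ExposeHeaders": ["Content-Length"],
            "MaxAgeSeconds": 3600,  # how long browsers may cache the preflight
        }
    ]


# Placeholder origins: the Vite dev server and a hypothetical production domain
rules = s3_cors_rules(["http://localhost:5173", "https://your-app.example.com"])
print(json.dumps(rules, indent=2))
```

The printed JSON can be pasted into the bucket's CORS settings. Listing explicit origins instead of `*` keeps other sites from hot-linking your model files.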

### Issue 3: WebGPU Not Detected
**Symptom:** Falls back to WASM even with modern GPU  
**Fix:** Update browser, check chrome://flags for WebGPU

---

## Phase Complete!

You've achieved:
- ✅ Created React project with Vite
- ✅ Integrated Transformers.js
- ✅ Built MatchaChatbot component
- ✅ Implemented loading states and streaming
- ✅ Added error handling
- ✅ Configured for deployment

**Next:** [Lab 4.6.8.6: Deployment & Documentation](./lab-4.6.8.6-deployment-documentation.ipynb)

---

In [None]:
# Save all component files
from pathlib import Path

output_dir = Path("./matcha-chatbot-template")
output_dir.mkdir(parents=True, exist_ok=True)

# Save files
files = {
    "vite.config.js": vite_config,
    "src/hooks/useModelLoader.js": model_loader_hook,
    "src/components/MatchaChatbot.jsx": chatbot_component,
    "src/components/MatchaChatbot.css": chatbot_css,
    "src/App.jsx": app_jsx,
    "vercel.json": vercel_config,
}

for filename, content in files.items():
    filepath = output_dir / filename
    filepath.parent.mkdir(parents=True, exist_ok=True)
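    # Write as UTF-8 so non-ASCII characters in the templates survive on every platform
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(content.strip())
    print(f"✅ Saved {filename}")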

print(f"\n📁 All template files saved to {output_dir}")
print("\n🎯 Next Steps:")
print("   1. Copy these files to your React project")
print("   2. Update MODEL_ID in App.jsx with your S3 URL")
print("   3. Test locally with npm run dev")
print("   4. Deploy to Vercel/Netlify")