Skip to content

GSX-Inc/browser_copilot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

12 Commits
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

AI Browser Co-pilot Super App

Chrome Built-in AI Challenge 2025 - Professional AI Browser Extension w/Hybrid Architecture

A powerful AI-powered Chrome extension featuring 6 production-ready features that revolutionize web accessibility, performance optimization, security testing, and debugging.

Chrome Extension Built-in AI Firebase TypeScript


🎯 Quick Installation Guide

Prerequisites

  • Chrome Browser (v120+) or Chrome Canary (for Built-in AI demo)
  • Google Account
  • ~5 minutes installation time

Step-by-Step Installation

  1. Clone the Repository
git clone https://github.com/GSX-Inc/browser_copilot.git
cd browser_copilot/extension
  1. Install Dependencies
npm install

This installs: React 19, Firebase 10.12.2, html2canvas, esbuild

  1. Build the Extension
npm run build

Build completes. Output goes to dist/ folder.

  1. Load in Chrome
  • Open chrome://extensions/
  • Enable "Developer mode" toggle (top right)
  • Click "Load unpacked" button
  • Navigate to and select: browser_copilot/extension/dist folder
  • Extension loads with AI Co-pilot icon
  1. Open the Extension
  • Click the extension icon in Chrome toolbar (puzzle piece)
  • Side panel opens on the right
  • Sign in with Google (OAuth)

Testing the Features

Quick Feature Tour (5 minutes):

  1. Network Analysis Feature - Opens by default

    • Click "Start Network Capture"
    • Navigate to any site
    • Click "Stop Capture & Analyze"
    • See network analysis
    • Chat to ask context aware follow-up questions
  2. Context Builder Feature - Multi-tab synthesis

    • Open 2-3 tabs
    • Select tabs
    • Click "Synthesize"
    • Shows on-device quick summary (If Chrome API available )
    • Then cloud deep analysis
  3. Canvas Feature - Accessibility & Design

    • Navigate to any website
    • Canvas β†’ Accessibility Audit β†’ "Analyze Page"
    • AI summary of findings (on-device)
    • Try Element Capture β†’ click elements on page
    • Try CSS Generator β†’ generate color palettes
  4. Kino Feature - Video Intelligence

    • Open YouTube video
    • Kino β†’ "Start Live Video Capture"
    • Select tab in screen picker
    • Enable Audio Descriptions
    • Ask questions β†’ answers spoken aloud!
  5. Nexus Feature - Performance Engineering

    • Open any website
    • Nexus β†’ "Analyze Page Performance"
    • See bottleneck identification
    • View generated code fix
    • Click "Preview Fix" to see it work
  6. Aegis Feature - Security Agent

    • Aegis β†’ "Activate Aegis Security"
    • Reload page β†’ see requests intercepted
    • Try API Mocking: "Mock /api/users with 404"
    • See mock applied in real-time

Note on Built-in AI

Chrome Built-in AI APIs (Summarizer, Proofreader, Prompt) are utilized through features:

If you have Chrome environment with Built-in AI APIs:

  • Context Builder will show on-device quick summaries
  • Canvas will show AI-generated audit summaries
  • Look for green badges: "Gemini Nano - On-Device"

If you have stable Chrome:

  • All features work via Firebase Vertex AI fallback
  • Extension is fully functional
  • Hybrid architecture code is present and ready

To enable Built-in AI (optional):

  1. Use Chrome API-enabled Browser environment
  2. Enable flags: chrome://flags/#optimization-guide-on-device-model
  3. Restart Chrome
  4. Built-in AI features activate automatically

✨ Key Highlights

  • 🧠 Chrome Built-in AI - Summarizer API, Proofreader API, Prompt API (Gemini Nano)
  • ☁️ Hybrid AI Architecture - Strategic on-device + cloud delegation
  • β™Ώ Accessibility First - Audio descriptions for visually impaired users
  • ⚑ Performance Engineering - AI-powered optimization with live code preview
  • πŸ›‘οΈ Security Agent - Autonomous threat detection & conversational API mocking
  • 🎨 Accessibility Audits - WCAG compliance with AI summaries
  • 🎬 Video Intelligence - Real-time analysis with text-to-speech

πŸš€ Features Overview

1. Context Builder πŸ”—

⭐ PRIMARY BUILT-IN AI SHOWCASE - Hybrid Architecture

Multi-tab content synthesis with perfect hybrid AI demonstration.

Chrome Built-in AI Integration:

// Phase 1: On-device quick summary (Gemini Nano)
const summarizer = await ai.summarizer.create({
  type: 'key-points',
  format: 'markdown',
  length: 'medium'
});
const quickSummary = await summarizer.summarize(content);
// Instant key points!

// Phase 2: Cloud deep synthesis (Gemini 2.5 Flash)
const deepAnalysis = await model.generateContentStream(prompt);
// Comprehensive analysis

Features:

  • Select multiple tabs
  • Aggregate content from all selected tabs
  • On-device summary (instant if Built-in AI available)
  • Cloud synthesis (comprehensive analysis)
  • Shows both results with clear indicators
  • Intelligent permission management

This demonstrates the perfect hybrid AI strategy!

2. Canvas - Design & Accessibility Suite 🎨

Built-in AI: Summarizer API for audit summaries

Complete toolkit for web design and WCAG compliance.

Features:

A) Accessibility Audit

  • Built-in AI: On-device AI summary of findings
  • WCAG 2.1 compliance checking
  • 5 comprehensive audits:
    • Missing alt text detection
    • Heading hierarchy validation
    • Form label compliance
    • Color contrast calculations (WCAG AA/AAA)
    • ARIA attribute verification
  • Score calculation (0-100)
  • Actionable fix suggestions with WCAG references

B) Element Capture

  • Click-to-screenshot any UI element
  • html2canvas integration (retina quality)
  • Thumbnail gallery with metadata
  • Hover highlighting

C) AI Design Analysis

  • Multimodal analysis of captured elements
  • Context-aware prompts (contrast, design, accessibility)
  • Actionable feedback

D) Color Palette Generator

  • AI-generated WCAG-compliant palettes
  • Live CSS preview
  • Export: CSS, SCSS, JavaScript, Tailwind
  • Copy to clipboard or download

5. Kino - Video Intelligence Engine 🎬

β™Ώ ACCESSIBILITY BREAKTHROUGH

Real-time video analysis with audio descriptions for visually impaired users.

Features:

  • Live Video Capture: getDisplayMedia API, 1fps frame extraction
  • Live Q&A: Ask questions about video content in real-time
  • Audio Descriptions: speechSynthesis - answers spoken aloud
  • Video URL Summarization: Comprehensive video analysis
  • Transcript Extraction: Pull text from videos
  • Voice Controls: Adjustable speed, pitch, volume

Accessibility Impact:

Blind developer watches coding tutorial:
1. Kino captures video frames
2. User asks: "What's on screen?"
3. AI: "Code editor showing React useState hook"
4. Answer spoken aloud via text-to-speech
5. User follows visual tutorial through audio!

3. Nexus - Agentic Performance Engineer ⚑

πŸ€– AGENTIC AI SYSTEM

Autonomous AI agent using perceive-reason-act architecture.

Agentic Flow:

  1. Collects performance metrics (Performance API)
  2. Gemini AI identifies primary bottleneck
  3. Generates production-ready code fix

Features:

  • Core Web Vitals (LCP, FCP, CLS, TTFB)
  • Bottleneck identification (render-blocking resources, large images)
  • AI reasoning explanation (step-by-step)
  • Production-ready code generation
  • Live preview - inject code and see improvement
  • Before/after metrics comparison

Example:

Bottleneck: Render-blocking CSS (/styles/main.css)
Impact: Delays LCP by 1.8 seconds
Score: 62/100

Generated Fix:
<link rel="preload" href="/styles/main.css" as="style"
      onload="this.rel='stylesheet'">

[Preview Fix] β†’ See improvement live!

4. Aegis - AI Security & Resilience Agent πŸ›‘οΈ

πŸ€– AUTONOMOUS SECURITY AGENT

AI-powered security monitoring with conversational API mocking.

Features:

  • Request Interception: CDP Fetch domain intercepts ALL requests
  • Auto-Follow Tabs: Automatically monitors whichever tab you switch to
  • AI Threat Detection: 90% confidence malware identification
  • Conversational API Mocking: Natural language β†’ working mocks
    • "Mock /api/users with 404 error"
    • AI parses β†’ creates rule β†’ applies on next request
  • Domain Blocking: Block malicious domains permanently
  • Export/Import: Share mock rules and test scenarios
  • Real-Time Feed: Live request monitoring with filters (All, Allowed, Suspicious, Blocked, Mocked)
  • Security Alerts: AI-analyzed threat notifications

Auto-Follow Demo:

1. Activate Aegis on Tab A
2. Switch to Tab B
3. Aegis automatically follows!
4. All requests from Tab B now intercepted
5. Mock rules apply across ALL tabs

5. Authentication πŸ”

Secure Google Sign-In via Firebase.

  • OAuth integration
  • Persistent sessions
  • Protected feature access

6. Network Analysis 🌐

HTTP debugging using Chrome DevTools Protocol.

  • CDP Network domain integration
  • Request/response capture with timing metrics
  • Network waterfall analysis

🧠 Chrome Built-in AI Integration

Hybrid AI Architecture

We implement a strategic hybrid approach combining:

On-Device (Chrome Built-in AI - Gemini Nano):

  • βœ… Instant results - Zero network latency
  • βœ… Privacy-first - Data never leaves device
  • βœ… Offline capable - Works without internet
  • βœ… No API costs - Free to use

Cloud (Firebase Vertex AI - Gemini 2.5 Flash):

  • βœ… Powerful reasoning - Complex analysis
  • βœ… Code generation - Production outputs
  • βœ… Multimodal - Images, video, performance data
  • βœ… Large context - Extensive data processing

ALL 7 Chrome Built-in AI APIs Integrated!

We demonstrate complete mastery of Chrome's Built-in AI platform by integrating ALL 7 available APIs:

1. Prompt API (LanguageModel) - General Purpose + Multimodal

Integrated in: Kino (video frame analysis)

Use Case: Multimodal analysis of video frames with images

Code:

// Create session with image support
const session = await LanguageModel.create({
  expectedInputs: [{ type: "image" }]
});

// Send question with video frame image
await session.append([{
  role: 'user',
  content: [
    { type: 'text', value: "What's on screen?" },
    { type: 'image', value: videoFrameBlob }
  ]
}]);

const answer = await session.prompt('');
// On-device multimodal analysis!

Value: Privacy-first video analysis, demonstrates multimodal Built-in AI


2. Summarizer API - Text Summarization

Integrated in: Context Builder, Canvas

Use Cases:

  • Context Builder: Quick multi-tab content summaries
  • Canvas: Accessibility audit findings summaries

Code:

const summarizer = await ai.summarizer.create({
  type: 'key-points',
  format: 'markdown',
  length: 'medium'
});

const summary = await summarizer.summarize(content);
// Instant on-device summary

Value: Hybrid AI showcase - instant summaries + cloud deep analysis


3. Writer API - Content Creation

Integrated in: Nexus (code documentation enhancement)

Use Case: Generate enhanced explanations for performance fixes

Code:

const writer = await ai.writer.create({ tone: 'professional' });
const documentation = await writer.write(
  `Explain this performance fix in simple terms: ${code}`
);
// AI-generated documentation

Value: Better code explanations, demonstrates Writer API


4. Rewriter API - Content Improvement

Integrated in: Utility module (available for all features)

Use Case: Improve design feedback, reformat text

Code:

const rewriter = await ai.rewriter.create({
  tone: 'more-professional',
  length: 'shorter'
});

const improved = await rewriter.rewrite(originalText);
// Enhanced content

Value: Content quality improvement on-device


5. Proofreader API - Grammar Correction

Integrated in: Utility module (available for all features)

Use Case: Grammar and spelling correction

Code:

const proofreader = await ai.proofreader.create();
const corrected = await proofreader.proofread(text);
// Grammar-corrected text

Value: Quality assurance for user inputs


6. Translator API - Language Translation

Integrated in: Network Analysis

Use Case: Translate non-English API responses

Code:

const translator = await ai.translator.create({
  sourceLanguage: detectedLang,
  targetLanguage: 'en'
});

const translated = await translator.translate(responseBody);
// Translated to English

Value: International API debugging


7. Language Detector API - Language Identification

Integrated in: Network Analysis

Use Case: Detect language of API responses

Code:

const detector = await ai.languageDetector.create();
const results = await detector.detect(text);
// results: [{ language: 'es', confidence: 0.95 }]

Value: Automatic language detection for translation


Complete Built-in AI Utility Module

We've created utils/builtin-ai.ts with all 7 APIs properly implemented:

// All 7 Chrome Built-in AI APIs in one module
export async function summarizeText(text, options);
export async function writeContent(prompt, options);
export async function rewriteContent(text, options);
export async function proofreadText(text);
export async function translateText(text, sourceLang, targetLang);
export async function detectLanguage(text);
export async function promptWithText(question, systemPrompt);
export async function promptWithImage(question, imageBlob);
export async function checkBuiltInAIAvailability();

This demonstrates comprehensive understanding and mastery of Chrome's entire Built-in AI platform!

Fallback Strategy

All features work via Firebase Vertex AI if Built-in AI not available:

  • βœ… Works for judges with Chrome Canary (shows Built-in AI)
  • βœ… Works for judges with stable Chrome (fallback)
  • βœ… Production-ready for all users
  • βœ… Future-proof architecture

πŸ—οΈ Technical Architecture

Tech Stack

  • Frontend: React 19, TypeScript, Tailwind CSS
  • AI:
    • Chrome Built-in AI: Summarizer API, Proofreader API
    • Firebase Vertex AI: Gemini 2.5 Flash
  • Chrome APIs: debugger (CDP), scripting, tabCapture, permissions, identity, storage, tabs, sidePanel
  • Build: esbuild, 77kb background worker

Chrome APIs Mastered

  1. chrome.debugger - CDP access (Network, Fetch, Tracing domains)
  2. chrome.scripting - Code injection, content scripts
  3. chrome.tabCapture / getDisplayMedia - Video capture
  4. chrome.permissions - Runtime permissions
  5. chrome.identity - OAuth Google Sign-In
  6. chrome.storage - Settings persistence
  7. chrome.tabs - Tab management
  8. chrome.sidePanel - UI framework

🎯 Key Use Cases

1. Video Accessibility

Blind student watching coding tutorial:
β†’ Kino captures video frames
β†’ Enable audio descriptions
β†’ Ask: "What function is this?"
β†’ AI: "Array.map() transforming data"
β†’ Answer spoken aloud
β†’ Student follows visual content through audio!

2. Performance Optimization

Developer's site is slow:
β†’ Nexus analyzes performance
β†’ AI identifies: Render-blocking CSS
β†’ Generates: <link rel="preload" ...>
β†’ Preview fix β†’ LCP improves 1.8s
β†’ Copy code β†’ Deploy

3. API Resilience Testing

Testing frontend error handling:
β†’ Aegis: "Mock /api/checkout with 500 error"
β†’ AI creates mock rule
β†’ Test checkout β†’ receives 500
β†’ Verify error UI works
β†’ No backend changes needed!

4. Accessibility Compliance

Launching website, need WCAG compliance:
β†’ Canvas β†’ Accessibility Audit
β†’ AI Summary: "5 critical issues found..."
β†’ Fix missing alt text, contrast problems
β†’ Re-audit β†’ Score 95/100
β†’ Ship accessible site!


πŸ”¬ Technical Innovation

Agentic AI Systems

Nexus (Perceive-Reason-Act)

  1. Perceive: Performance API metrics
  2. Reason: AI identifies bottleneck
  3. Act: Generates + previews code fix

Aegis (Intercept-Analyze-Block)

  1. Intercept: CDP Fetch domain
  2. Analyze: AI threat detection (90% confidence)
  3. Block/Mock: Autonomous actions

Multimodal Processing

  • Text (all features)
  • Images (Canvas element analysis)
  • Video (Kino frame Q&A)
  • Performance data (Nexus)
  • Network data (Aegis, Network Analysis)

🌟 Impact Statement

AI Browser Co-pilot makes the web more accessible, performant, and secure.

For Visually Impaired Users

  • Kino audio descriptions enable blind developers to learn from video tutorials
  • First-of-its-kind: Real-time video β†’ AI analysis β†’ spoken audio
  • Makes visual education accessible

For Web Developers

  • Automates WCAG compliance checking (Canvas)
  • Generates performance fixes (Nexus)
  • Simplifies API testing (Aegis)
  • Saves hours of manual work

πŸ› οΈ Development

Build Commands

npm run build        # Development build
npm run build:prod   # Production (minified)
npm run clean        # Clean dist folder

Project Structure

extension/
β”œβ”€β”€ components/       # React UI (11 components)
β”œβ”€β”€ content-scripts/  # Injected scripts (3 files)
β”œβ”€β”€ utils/           # Helpers (3 modules)
β”œβ”€β”€ background.ts    # Service worker (77kb)
β”œβ”€β”€ firebase.ts      # AI initialization
β”œβ”€β”€ types.ts         # TypeScript interfaces
└── manifest.json    # Extension config

πŸ“ License

MIT License


πŸ† AI Browser Co-pilot Super App optimizing the web browsing experience. πŸ†

About

AI-powered Browser Copilot Super App

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages