Skip to content

avnishsinha/Zero-

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

6 Commits
Β 
Β 
Β 
Β 

Repository files navigation

zer0 πŸ€–

zer0 is an AI-powered browser extension that acts as a bridge between your thoughts and actions. Using advanced language models (OpenAI/Azure OpenAI), zer0 can understand natural language instructions and autonomously perform tasks in your browser.


🌟 Features

  • AI-Powered Automation: Execute complex browser tasks using natural language commands
  • Intelligent DOM Interaction: Automatically clicks, fills forms, and navigates websites
  • Task History Tracking: Monitor and review all actions performed by the agent
  • Voice Input Support: Control the extension using voice commands
  • Multi-Model Support: Compatible with OpenAI and Azure OpenAI services
  • Privacy-Focused: All processing happens locally in your browser

πŸ› οΈ Technologies Used

  • Frontend: React 18, TypeScript, Chakra UI
  • State Management: Zustand with Immer
  • AI Integration: OpenAI API, Azure OpenAI
  • Build Tools: Webpack, Babel
  • Testing: Jest
  • Extension Platform: Chrome Extension Manifest V3

πŸ“‚ Repository Structure

browser-extension-master/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ common/              # Reusable React components
β”‚   β”‚   β”œβ”€β”€ App.tsx
β”‚   β”‚   β”œβ”€β”€ TaskUI.tsx
β”‚   β”‚   β”œβ”€β”€ TaskHistory.tsx
β”‚   β”‚   └── ModelDropdown.tsx
β”‚   β”œβ”€β”€ helpers/             # Core logic and utilities
β”‚   β”‚   β”œβ”€β”€ availableActions.ts      # Defines browser actions
β”‚   β”‚   β”œβ”€β”€ determineNextAction.ts   # AI decision logic
β”‚   β”‚   β”œβ”€β”€ domActions.ts            # DOM manipulation
β”‚   β”‚   β”œβ”€β”€ parseResponse.ts         # AI response parsing
β”‚   β”‚   └── simplifyDom.ts           # DOM simplification
β”‚   β”œβ”€β”€ pages/               # Extension pages
β”‚   β”‚   β”œβ”€β”€ Background/      # Service worker
β”‚   β”‚   β”œβ”€β”€ Content/         # Content script
β”‚   β”‚   β”œβ”€β”€ Popup/           # Extension popup
β”‚   β”‚   β”œβ”€β”€ Options/         # Settings page
β”‚   β”‚   └── Panel/           # DevTools panel
β”‚   β”œβ”€β”€ state/               # Zustand state stores
β”‚   β”‚   β”œβ”€β”€ currentTask.ts
β”‚   β”‚   β”œβ”€β”€ settings.ts
β”‚   β”‚   └── ui.ts
β”‚   └── manifest.json        # Extension manifest
β”œβ”€β”€ utils/                   # Build scripts
β”œβ”€β”€ package.json
└── webpack.config.js

πŸ‘¨β€πŸ’» Team Members and Contributions

1. Avnish

  • Led the architecture and development of the AI agent system
  • Implemented OpenAI/Azure OpenAI integration
  • Designed the task execution pipeline

2. Armann

  • Developed the DOM interaction and manipulation system
  • Built the content script injection mechanism
  • Optimized the browser automation engine

3. Utkarsh

  • Implemented testing framework and wrote unit tests
  • Focused on debugging and error handling
  • Enhanced the deployment and build pipeline

4. Agam

  • Created the UI components and user interface
  • Developed the responsive layout using Chakra UI
  • Implemented the task history and status displays

πŸš€ Installation & Setup

Prerequisites

  • Node.js (v14 or higher)
  • pnpm (recommended) or npm
  • A Chromium-based browser (Chrome, Edge, Brave, etc.)
  • OpenAI API key or Azure OpenAI credentials

Installation Steps

  1. Clone the repository

    git clone https://github.com/avnishsinha/Zero-.git
    cd Zero-/browser-extension-master
  2. Install dependencies

    pnpm install
    # or
    npm install
  3. Build the extension

    pnpm build
    # or
    npm run build
  4. Load the extension in Chrome

    • Open Chrome and navigate to chrome://extensions/
    • Enable "Developer mode" (toggle in top-right corner)
    • Click "Load unpacked"
    • Select the build folder from the project directory
  5. Configure API credentials

    • Click the extension icon in your browser toolbar
    • Go to Options/Settings
    • Enter your OpenAI API key or Azure OpenAI credentials
    • Select your preferred model

πŸ’‘ Usage

  1. Activate the extension

    • Use the keyboard shortcut: Ctrl+Shift+Y (Windows/Linux) or Cmd+Shift+Y (Mac)
    • Or click the extension icon in your toolbar
  2. Enter a task

    • Type a natural language instruction (e.g., "Fill out this form with my information")
    • Or use voice input to describe the task
  3. Watch zer0 work

    • The AI agent will analyze the page
    • Execute actions step-by-step
    • Display progress in the task history
  4. Review results

    • Check the task history to see all actions performed
    • Review any errors or completion messages

Example Tasks

  • "Click the login button"
  • "Fill in the email field with example@email.com"
  • "Navigate to the pricing page"
  • "Search for AI tools on this website"

πŸ§ͺ Development

Run in development mode

pnpm start
# or
npm start

Run tests

pnpm test
# or
npm test

Code formatting

pnpm prettier
# or
npm run prettier

Linting

pnpm lint
# or
npm run lint

πŸ”§ Available Actions

The AI agent can perform the following actions on web pages:

  • click: Click on any element
  • setValue: Fill input fields with text
  • finish: Mark the task as complete
  • fail: Indicate inability to complete the task

More actions can be added by extending the availableActions array in the codebase.


🌟 Future Enhancements

  • Add more browser actions (hover, scroll, drag-and-drop)
  • Support for Firefox and Safari
  • Enhanced error recovery and retry logic
  • Local model support (Ollama, LM Studio)
  • Multi-step task planning and execution
  • Screenshot and vision capabilities
  • Task templates and saved workflows
  • Collaborative task sharing

⚠️ Privacy & Security

  • Data Privacy: All AI processing happens through your API keys; we don't store or access your data
  • API Keys: Your API keys are stored locally in the browser's secure storage
  • Permissions: The extension requires broad permissions to interact with web pages as requested
  • Review Actions: Always review the task history to understand what actions were performed

🀝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“§ Contact

For inquiries or collaboration:


πŸ“œ License

This project is licensed under the MIT License.


πŸ™ Acknowledgments

  • OpenAI and Azure OpenAI for providing the AI capabilities
  • The Chrome Extensions community for excellent documentation
  • All contributors and testers who helped improve zer0

zer0 - Bridging the gap between your thoughts and browser actions πŸš€

About

This Project was made by me and my team at the Cal Hacks 10.0 hackathon in Nov 2023.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors