OmniMCP: Direct Host Control Bridge Between OmniParser and Claude MCP #947
Open: abrichr wants to merge 24 commits into main from feat/omnimcp-clean
This commit adds OmniMCP, a system that enables Claude to control the computer using the Model Context Protocol. Key components: - OmniParser adapter for UI element detection - MCP server implementation - CLI interface for commands and debugging - Comprehensive documentation OmniMCP combines OmniParser's visual understanding with Claude's natural language capabilities to automate UI interactions.
- Create dedicated omnimcp folder with pyproject.toml and setup.py - Add installation scripts for Windows (install.bat) and Unix (install.sh) - Set up minimal package structure that uses OpenAdapt imports - Configure entry points for CLI commands 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Updated comment in omnimcp.py to use "CLI mode" instead of "interactively" for consistency with other documentation and code. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Replace hardcoded 800x600 visualization size with actual monitor dimensions from utils.get_monitor_dims() to ensure consistent scaling across different display configurations. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Mark install.sh as executable for Unix/Mac users - Add a note to the README about permissions in case Git doesn't preserve them 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Create a dedicated pathing.py module for OpenAdapt path management - Add descriptive error messages for troubleshooting import issues - Centralize path setup logic with proper error handling - Update importing modules to use the new path handling 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
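The centralized path setup described in this commit could look roughly like the sketch below. The function name `ensure_on_sys_path`, its parameters, and the error wording are hypothetical and are not taken from `pathing.py`; the only idea carried over from the commit message is "set up the path in one place and fail with a descriptive message".

```python
import sys
from pathlib import Path


def ensure_on_sys_path(root, marker="openadapt"):
    """Add `root` to sys.path if it contains the `marker` package directory.

    Hypothetical helper: the name, signature, and messages are illustrative.
    """
    root = Path(root).resolve()
    if not (root / marker).is_dir():
        # Descriptive error so import problems are easier to troubleshoot.
        raise ImportError(
            f"Could not find a '{marker}' directory under {root}. "
            "Check that the repository is checked out at the expected location."
        )
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

Modules that need the parent package would call this helper once before importing from it, instead of each manipulating `sys.path` themselves.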
- Add lazy imports for BeautifulSoup in utils.py functions - Add jinja2 to OmniMCP dependencies - Simplify setup.py to use dependencies from pyproject.toml - Preserve OpenAdapt path handling in setup.py 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
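A lazy import, as mentioned in this commit, simply moves the `import` statement into the function that needs it. The sketch below illustrates the pattern; `html_to_text` is a hypothetical function, not one from `utils.py`.

```python
def html_to_text(html: str) -> str:
    """Return the visible text of an HTML fragment.

    Hypothetical example function. BeautifulSoup is imported inside the
    function body, so importing this module does not require bs4; the
    dependency is only needed when the function is actually called.
    """
    from bs4 import BeautifulSoup  # deferred (lazy) import

    return BeautifulSoup(html, "html.parser").get_text()
```

With this layout, a missing `bs4` package raises `ImportError` at call time rather than at module import time.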
- Add posthog to OmniMCP dependencies - Keep BeautifulSoup lazy loaded in utils.py functions - Revert DistinctIDPosthog class to its original implementation 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add multiprocessing-utils to OmniMCP dependencies - Restore original implementation of process_local storage - Add development command to README.md for resetting environment 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add numpy as a dependency for array operations - Required by utils.py 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add orjson as a dependency for fast JSON handling - Required by utils.py 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
77b04c2 to b30c6a7
- Add dictalchemy for SQLAlchemy dict utilities - Required for openadapt.db module 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Update models.py to use string literals for BeautifulSoup types - Allow OmniMCP to run without BeautifulSoup dependency 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
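String-literal annotations let a module name a type without importing it at runtime. A minimal sketch of the idea follows; `ElementRecord` is a hypothetical class used only for illustration and does not reflect the actual contents of `models.py`.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; never executed at runtime.
    from bs4 import BeautifulSoup


class ElementRecord:
    """Hypothetical stand-in for a model class that can expose parsed HTML."""

    def __init__(self, html: str) -> None:
        self.html = html
        self._soup = None

    def get_soup(self) -> "BeautifulSoup":
        # The annotation is a plain string, so defining this class does not
        # require bs4. The import happens only when the method is called.
        if self._soup is None:
            from bs4 import BeautifulSoup

            self._soup = BeautifulSoup(self.html, "html.parser")
        return self._soup
```

The class can be defined and instantiated without BeautifulSoup installed; only `get_soup()` needs it.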
- Add joblib for caching functionality - Required by openadapt.cache module 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add boto3 and botocore for AWS SDK - Required for deploying OmniParser service 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add allow_no_parser flag to make it explicit when running without OmniParser - Fail by default if OmniParser server is not available - Update README with clear instructions for OmniParser configuration - Add TODO for future Anthropic ComputerUse integration 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
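The fail-by-default behavior described here can be sketched as follows. Only the `allow_no_parser` name comes from the commit message; the function names, the probe mechanism, and the error text are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def probe_http(url: str, timeout: float = 2.0) -> bool:
    """Return True if `url` answers with a 2xx status (assumed health check)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False


def ensure_parser(server_url: str, allow_no_parser: bool = False,
                  probe=probe_http) -> bool:
    """Return True if the parser is reachable, False if running without it.

    Raises RuntimeError when the server is unreachable and the caller has
    not explicitly opted in to running without a parser.
    """
    if probe(server_url):
        return True
    if allow_no_parser:
        return False
    raise RuntimeError(
        f"OmniParser server not available at {server_url}. "
        "Start or deploy the server, or set allow_no_parser to continue without it."
    )
```

Passing the probe as a parameter keeps the decision logic testable without a network connection.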
- Add detailed comparison of OmniMCP and Anthropic ComputerUse approaches - Describe key architectural differences and integration opportunities - Add TODO comment for future ComputerUse integration possibilities 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add auto-deploy functionality with user confirmation - Add skip-confirmation flag to deploy without prompting - Add TODO for simplified AWS configuration in the future - Update documentation with new options and deployment scenarios - Expand README with detailed OmniParser configuration instructions 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
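The confirmation step with a skip option might be structured like this sketch. The function name, prompt text, and accepted answers are hypothetical; the commit message only states that deployment asks for confirmation unless a skip-confirmation flag is set.

```python
def confirm_deploy(skip_confirmation: bool = False, ask=input) -> bool:
    """Decide whether to proceed with deployment.

    `ask` defaults to the built-in input() and can be replaced in tests.
    The prompt wording below is illustrative only.
    """
    if skip_confirmation:
        return True
    answer = ask("OmniParser server not found. Deploy it now? [y/N] ")
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" means that pressing Enter at the prompt never triggers a deployment by accident.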
- Added an environment variable override for PROJECT_NAME - Added .env.example to show required AWS credentials - Updated README with clearer installation instructions - Added CLAUDE.md with important command notes - Added paramiko dependency for OmniParser deployment - Modified omnimcp.py to ensure PROJECT_NAME consistency - Simplified openadapt/adapters/__init__.py imports
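An environment variable override of this kind usually reduces to a lookup with a fallback. In the sketch below, `PROJECT_NAME` is the variable named in the commit message, while the default value and the function name are assumptions.

```python
import os

# Assumed default for illustration; the real default is not stated in this PR.
DEFAULT_PROJECT_NAME = "omniparser"


def get_project_name(env=None) -> str:
    """Return PROJECT_NAME from the environment, or the default if unset or empty."""
    env = os.environ if env is None else env
    return env.get("PROJECT_NAME") or DEFAULT_PROJECT_NAME
```

Reading the value through one function is one way to keep the name consistent between the deployment script and the client that later looks up the deployed instance.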
This is a work-in-progress commit that: 1. Moves OmniMCP, OmniParser adapter, and MCP server to omnimcp package 2. Updates imports and dependencies to match new structure 3. Adds Computer Use integration (loop.py) as a demo 4. Updates setup.py to include the new entry points Still TODO: - Ensure all imports from OpenAdapt are minimal (just utils.py) - Finish testing the OmniParser + MCP integration - Clean up any remaining references to OpenAdapt
This commit makes OmniMCP more independent from OpenAdapt: 1. Create a local config.py to replace openadapt.config dependency 2. Use the Anthropic SDK directly instead of openadapt.drivers.anthropic 3. Update the Claude model to use latest versions (3.5/3.7) 4. Replace run_omnimcp.py with a local implementation 5. Update imports throughout the codebase to use local modules
- Fixed import path in omniparser.py to use correct deploy.deploy.models.omniparser.deploy - Added subnet creation for VPCs without subnets - Fixed key path handling to avoid permission issues - Improved EC2 instance discovery to connect to remote server - Enhanced documentation in CLAUDE.md with detailed troubleshooting steps - Added PROJECT_NAME to .env.example for consistency - Fixed string formatting in deploy.py Docker commands 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
OmniMCP: Direct Host Control Bridge Between OmniParser and Claude MCP
What Makes OmniMCP Unique
OmniMCP bridges Microsoft's OmniParser (for UI detection) with Anthropic's Model Context Protocol (MCP) to enable direct control of the host computer.
Unlike Computer Use (which runs in a VM with custom tools), OmniMCP provides a lightweight bridge that runs directly on the host and captures the entire screen, making it more flexible for general automation tasks outside a sandbox.
Key Improvements
Fixed OmniParser Auto-Deployment
Modular Package Structure
Three Operational Modes
CLI Mode:
Server Mode:
Debug Mode:
Installation and Usage
AWS Requirements
For OmniParser deployment to work properly:
Key Implementation Files
- omnimcp/omnimcp.py: Core implementation
- omnimcp/adapters/omniparser.py: OmniParser client and deployment logic
- omnimcp/mcp/server.py: MCP server implementation
- deploy/models/omniparser/deploy.py: AWS deployment script with fixes
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>