Merged
6 changes: 0 additions & 6 deletions integrations/computer-use/anthropic.mdx
Original file line number Diff line number Diff line change
@@ -4,12 +4,6 @@ title: "Anthropic"

[Computer Use](https://docs.claude.com/en/docs/agents-and-tools/tool-use/computer-use-tool) is Anthropic's groundbreaking capability that enables Claude to interact with computers the way humans do—by looking at screens, moving cursors, clicking buttons, and typing text. This powerful feature allows AI agents to control web browsers, navigate interfaces, and perform complex tasks across applications.

With Computer Use, Claude can:
- **Navigate websites and applications** by interpreting visual interfaces
- **Click buttons and fill forms** just like a human would
- **Take screenshots** to understand and verify its actions
- **Perform multi-step workflows** that span multiple applications or web pages

By integrating Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents.

## Quick setup with Computer Use
28 changes: 16 additions & 12 deletions integrations/computer-use/gemini.mdx
@@ -2,30 +2,34 @@
title: "Gemini"
---

Google's [Gemini 2.5 Computer Use model](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) is a specialized model built on Gemini 2.5 Pro's capabilities to power agents that can interact with user interfaces.
[Gemini 2.5 Computer Use](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) is Google's groundbreaking capability that enables AI models to interact with computers the way humans do—by looking at screens, moving cursors, clicking buttons, and typing text. This powerful feature allows AI agents to control web browsers, navigate interfaces, and perform complex tasks across applications.

By integrating Gemini 2.5 Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents.

## Quick setup with our example template
## Quick setup with Computer Use

Get started quickly with our TypeScript template that demonstrates Gemini 2.5 Computer Use with Kernel.
Get started with Gemini Computer Use and Kernel using our pre-configured app template:

Check out the [Open-source Gemini Template](https://github.com/onkernel/ts-stagehand-google-cua-agent) repository for a complete working example that shows how to:
- Set up Gemini 2.5 Computer Use with Kernel
- Use Stagehand for browser automation
- Run AI-powered web interactions on cloud infrastructure
```bash
npx @onkernel/create-kernel-app my-computer-use-app
```

## Benefits of using Kernel with Gemini Computer Use
Choose `TypeScript` as the programming language and then select `gemini-cua` as the template.

Then follow the [Quickstart guide](/quickstart/) to deploy and run your Computer Use automation on Kernel's infrastructure.

## Benefits of using Kernel with Computer Use

- **No local browser management**: Run Computer Use automations without installing or maintaining browsers locally
- **Scalability**: Launch multiple browser sessions in parallel for concurrent automations
- **Stealth mode**: Built-in anti-detection features for web interactions
- **Scalability**: Launch multiple browser sessions in parallel for concurrent AI agents
- **Stealth mode**: Built-in anti-detection features for reliable web interactions
- **Session persistence**: Maintain browser state across automation runs
- **Live view**: Debug your automations with real-time browser viewing
- **Live view**: Debug your Computer Use agents with real-time browser viewing
- **Cloud infrastructure**: Run computationally intensive AI agents without local resource constraints

## Next steps

- Check out [live view](/browsers/live-view) for debugging your automations
- Check out [live view](/browsers/live-view) for debugging your Computer Use automations
- Learn about [stealth mode](/browsers/stealth) for avoiding detection
- Learn how to properly [terminate browser sessions](/browsers/termination)
- Learn how to [deploy](/apps/deploy) your Computer Use app to Kernel
4 changes: 3 additions & 1 deletion integrations/computer-use/openai.mdx
@@ -2,7 +2,9 @@
title: "OpenAI"
---

[Computer Use](https://openai.com/index/computer-using-agent/) is OpenAI's feature that enables AI models to interact with computers like humans do - through screen observation, cursor movement, and keyboard input. By integrating with Kernel, you can run Computer Use automations with cloud-hosted browsers, allowing your AI agents to navigate websites, fill forms, and interact with web applications autonomously.
[Computer Use](https://openai.com/index/computer-using-agent/) is OpenAI's feature that enables AI models to interact with computers the way humans do—by looking at screens, moving cursors, clicking buttons, and typing text. This powerful feature allows AI agents to control web browsers, navigate interfaces, and perform complex tasks across applications.

By integrating Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents.

## Quick setup with our Computer Use example app
