PC-Agent

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

📢News

🔥[2025-03-12] The code has been updated.

🔥[2025-02-21] We have released an updated version of PC-Agent. Check the paper for details. The code will be updated soon.

🔥[2024-08-23] We have released the code of PC-Agent, supporting both Mac and Windows platforms.

📺Demo

Download.paper.from.Chorme.mp4
Search.NBA.FMVP.and.send.to.friend.mp4
Write.an.introduction.of.Alibaba.in.Word.mp4

📋Introduction

  • PC-Agent is a multi-agent collaboration system that automates productivity applications (e.g. Chrome, Word, and WeChat) based on user instructions.
  • An active perception module, designed for dense and diverse interactive elements, is better adapted to the PC platform.
  • The hierarchical multi-agent cooperative structure improves the success rate of more complex task sequences.

🔧Getting Started

Installation

Now Windows is supported.

conda create --name pcagent python=3.10
conda activate pcagent

# For Windows
pip install -r requirements.txt

git clone https://github.com/Topdu/OpenOCR.git
pip install openocr-python

Configuration

Edit config.json to add your API key and customize settings. Replace the "token" value with your actual API key. JSON does not allow comments, so keep the file free of them:

{
  "vl_model_name": "GPT-4o",
  "llm_model_name": "GPT-4o",
  "token": "sk-...",
  "url": "https://api.openai.com/v1"
}
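Before a first run, it can help to check that config.json parses and that the placeholder token has been replaced. The snippet below is a minimal sketch, not part of PC-Agent: the helper names validate_config and load_config are hypothetical, and the only assumption about the file is the four keys shown above.

```python
import json

# The four fields shown in the example config.json above.
REQUIRED_KEYS = ("vl_model_name", "llm_model_name", "token", "url")


def validate_config(config):
    """Hypothetical helper: raise ValueError if a field is missing or unset."""
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError("config.json is missing: " + ", ".join(missing))
    if config["token"] == "sk-...":
        raise ValueError("replace the placeholder token with your real API key")
    return config


def load_config(path="config.json"):
    """Hypothetical helper: parse the JSON file and validate its fields."""
    with open(path, encoding="utf-8") as handle:
        # json.load fails on '#' comments, which plain JSON does not allow.
        return validate_config(json.load(handle))
```

Calling load_config() from the repository root raises a ValueError (or a json.JSONDecodeError for malformed files) before any model request is made.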

Test on your computer

  1. After adding your GPT-4o API token to config.json, run run.py with your instruction. For example:
python run.py --instruction="Create a new doc on Word, write a brief introduction of Alibaba, and save the document."
  2. Optionally, you can add specific operational knowledge via the --add_info option to help PC-Agent operate more accurately.

  3. To make PC-Agent run faster, you can set --disable_reflection to skip the reflection process. Note that this may reduce the success rate of the operation.

  4. If the task is not very complex, you can set --simple 1 to skip the task decomposition.
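The options above can be combined in one invocation. The following is a minimal sketch of assembling such a command line from Python. The helper build_command is hypothetical, and it assumes that --add_info takes a string value and that --disable_reflection is a plain switch with no value; check run.py for the actual argument definitions.

```python
def build_command(instruction, add_info=None, disable_reflection=False, simple=False):
    """Hypothetical helper: build an argument list for run.py."""
    command = ["python", "run.py", f"--instruction={instruction}"]
    if add_info:
        # Assumed to accept free-form text with extra operational knowledge.
        command.append(f"--add_info={add_info}")
    if disable_reflection:
        # Assumed to be a value-less switch that skips the reflection step.
        command.append("--disable_reflection")
    if simple:
        # "--simple 1" skips task decomposition for simple tasks.
        command += ["--simple", "1"]
    return command


if __name__ == "__main__":
    print(build_command("Open Chrome and search for PC-Agent.", simple=True))
```

The returned list can be passed to subprocess.run, which avoids shell quoting problems with instructions that contain spaces or quotation marks.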