[Question/Proposal] Evaluating external custom agents (AEagent) against AssetOpsBench #284

@ygpark2

Hi @shuxin and the AssetOpsBench team,

Following up on my recent email conversation with Dhaval, I am opening this issue to discuss the best approach for evaluating an external agent against AssetOpsBench.

Background

I am the author of AEagent, an open-source, Elixir/OTP-based autonomous agent system. It focuses on strategic planning, tool execution with safety policies, long-term execution memory, and multi-agent delegation. Given this architecture, I believe AssetOpsBench is a good fit for testing its industrial reasoning and MCP-based workflow capabilities.

Questions regarding Integration Path

I would like to build an adapter to evaluate AEagent, but I noticed there are two distinct structures in the repository:

  1. CODS Track 1/2 scripts
  2. The newer aobench scenario-server/client structure

Could you clarify the recommended path for integrating an external agent?

  • Are the CODS Track scripts strictly for the fixed competition workflow, or can they be adapted for external systems?
  • Or is targeting the newer aobench scenario-client interface the preferred method moving forward?

Proposed Contribution

AEagent currently exposes a CLI and can return outputs in the expected JSON format (including the result and trace fields). Once the preferred integration path is clarified, I would be happy to build the adapter and contribute an example external-agent runner or integration guide to the repository, which may be useful to other researchers and developers.
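To make the proposal concrete, here is a minimal sketch of what such an external-agent runner could look like. It is not the AssetOpsBench interface or AEagent's actual CLI contract. It assumes the agent reads the task text on stdin and prints a single JSON object containing `result` and `trace` on stdout. The names `run_external_agent`, `REQUIRED_FIELDS`, and `FAKE_AGENT` are hypothetical, and a small Python one-liner stands in for the real agent binary so the example runs on its own.

```python
import json
import subprocess
import sys
from typing import Any, Sequence

# Field names taken from the issue text; everything else here is an assumption.
REQUIRED_FIELDS = ("result", "trace")


def run_external_agent(
    command: Sequence[str], task: str, timeout_s: float = 300.0
) -> dict[str, Any]:
    """Run an external agent CLI on one task and return its parsed JSON output.

    Assumption: the CLI reads the task on stdin and prints one JSON object
    on stdout. The real AEagent invocation may differ.
    """
    completed = subprocess.run(
        list(command),
        input=task,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    output = json.loads(completed.stdout)
    if not isinstance(output, dict):
        raise ValueError("agent output must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in output]
    if missing:
        raise ValueError(f"agent output is missing fields: {missing}")
    return output


# Hypothetical stand-in for the agent binary: echoes the task back in the
# assumed output shape so the sketch can be run without AEagent installed.
FAKE_AGENT = [
    sys.executable,
    "-c",
    "import sys, json; t = sys.stdin.read(); "
    "print(json.dumps({'result': 'echo: ' + t, "
    "'trace': [{'step': 1, 'action': 'echo'}]}))",
]

if __name__ == "__main__":
    answer = run_external_agent(FAKE_AGENT, "example task text")
    print(answer["result"])
```

Keeping the agent behind a subprocess boundary means the benchmark side does not need to know the agent is written in Elixir. Whichever integration path is recommended, only the `command` list and the input/output convention would need to change.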

Looking forward to your guidance on where to start!

Best regards,
Young Gyu Park
