Are agentic systems in scope for your leaderboard? #114

@L3Gaunt

I suspect the scores you report would be easy to beat, perhaps even approaching 100%, with multi-turn and agentic systems such as LLMLingua-2 or GraphReader.

Combining such techniques, and understanding what it takes to get acceptable performance out of an LLM, seems important to anyone building a production system.

Would you consider accepting submissions of such agentic systems to your leaderboard? If so, it would be interesting to also report the total tokens consumed and the number of consecutive steps taken for each submission.
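To make the request concrete, here is a rough sketch of what a submission record carrying the token and step counts could look like. Everything in it is hypothetical: the class name `AgenticSubmission`, its fields, and the tie-breaking rule in `rank` are my assumptions for illustration, not anything taken from your leaderboard code.

```python
from dataclasses import dataclass


@dataclass
class AgenticSubmission:
    """Hypothetical leaderboard entry; field names are assumptions, not the benchmark's schema."""
    system_name: str
    score: float        # benchmark score in percent
    total_tokens: int   # prompt + completion tokens summed over all LLM calls
    num_steps: int      # number of consecutive LLM calls the system made

    def tokens_per_point(self) -> float:
        """Tokens spent per percentage point of score, as a rough cost figure."""
        if self.score <= 0:
            return float("inf")
        return self.total_tokens / self.score


def rank(entries):
    """Sort by score (descending); ties go to the entry that used fewer tokens."""
    return sorted(entries, key=lambda e: (-e.score, e.total_tokens))
```

With the cost columns next to the score, a reader could see at a glance whether a higher-ranked system got there by spending many more tokens or steps than a single-call baseline.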
