Add AgentReputationSkill for multi-dimensional agent trust scoring #153

Merged
lbartoszcze merged 1 commit into main from wisent/agent-reputation-system
Feb 8, 2026

@lbartoszcze
Contributor

Summary

  • Adds AgentReputationSkill - tracks agent reliability, competence, and trustworthiness across 5 dimensions
  • 10 actions: record_event, get_reputation, get_leaderboard, compare, record_task_outcome, record_vote, endorse, penalize, get_history, reset
  • Foundation for reputation-weighted consensus voting and trust-based task delegation
  • Also fixes an f-string syntax error in service_api.py introduced by PR #152 (Integrate APIGatewaySkill into ServiceAPI for production-grade API auth)

Pillar

Replication + Self-Improvement - Enables trust-based coordination between replicas and self-awareness of performance

Key Features

  • 5 reputation dimensions (0-100 scale): Competence, Reliability, Trustworthiness, Leadership, Cooperation
  • Task outcome integration: Completed tasks boost competence and reliability; failures decrease them. Budget efficiency provides a bonus.
  • Voting integration: Participation boosts cooperation; correct votes boost trustworthiness
  • Peer endorsements: Weighted by endorser's own reputation (0.5x to 1.5x multiplier)
  • Leaderboard: Rank agents by any dimension with minimum-events filtering
  • Comparison: Side-by-side comparison of any two agents across all dimensions
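The scoring rules in the list above can be illustrated with a rough sketch. This is not the code from this PR: the function names, the base delta of 3.0, the bonus cap of 2.0, the base boost of 2.0, and the linear mapping from an endorser's 0-100 score onto the 0.5x to 1.5x range are all assumptions chosen for illustration. The sketch attaches the budget bonus to competence, and treats an endorser's score as a single number, which may differ from the actual skill.

```python
def clamp(score: float) -> float:
    """Keep a score inside the 0-100 scale described in the PR."""
    return max(0.0, min(100.0, score))


def task_outcome_deltas(success: bool, budget: float, spent: float,
                        base: float = 3.0, max_bonus: float = 2.0) -> dict:
    """Hypothetical deltas for one task outcome.

    Success raises competence and reliability, failure lowers both.
    The unspent share of the budget adds a bonus to competence (assumed rule).
    """
    if not success:
        return {"competence": -base, "reliability": -base}
    saved_fraction = max(0.0, (budget - spent) / budget) if budget > 0 else 0.0
    bonus = max_bonus * min(1.0, saved_fraction)
    return {"competence": base + bonus, "reliability": base}


def endorsement_multiplier(endorser_score: float) -> float:
    """Map an endorser's 0-100 score linearly onto 0.5x-1.5x (assumed mapping)."""
    return 0.5 + clamp(endorser_score) / 100.0


def apply_endorsement(target_score: float, endorser_score: float,
                      base_boost: float = 2.0) -> float:
    """Return the endorsed agent's new score after one weighted endorsement."""
    return clamp(target_score + base_boost * endorsement_multiplier(endorser_score))
```

Under these assumptions a neutral endorser (score 50) applies the base boost unchanged, while a low-reputation endorser contributes half of it, so endorsements from untrusted agents count for less without being ignored.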

Test plan

  • 18 unit tests covering all 10 actions, edge cases, and scoring logic
  • 17 smoke tests pass
  • f-string syntax fix verified

🤖 Generated with Claude Code

…and trust

New skill that maintains multi-dimensional reputation scores for agents,
computed from task delegation outcomes, consensus voting history, and peer
endorsements. This is the foundation for trust-based coordination between
replicas and reputation-weighted consensus voting.

Reputation dimensions (0-100, start at 50 neutral):
- Competence: task success rate + budget efficiency
- Reliability: on-time delivery, timeout avoidance
- Trustworthiness: voting consistency, honesty
- Leadership: election wins, role performance
- Cooperation: conflict resolution, consensus participation
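One way to picture the per-agent state implied by the dimensions above is a small record holding five scores that start at 50 and stay within 0-100. The class name `ReputationRecord`, its fields, and the `adjust` method are hypothetical and are not taken from this commit.

```python
from dataclasses import dataclass, field

# Dimension names come from the commit message; the constants are illustrative.
DIMENSIONS = ("competence", "reliability", "trustworthiness",
              "leadership", "cooperation")
NEUTRAL = 50.0


@dataclass
class ReputationRecord:
    """Hypothetical per-agent record: five scores plus an event counter."""
    agent_id: str
    scores: dict = field(default_factory=lambda: {d: NEUTRAL for d in DIMENSIONS})
    event_count: int = 0

    def adjust(self, dimension: str, delta: float) -> float:
        """Apply a delta to one dimension, clamp to 0-100, and count the event."""
        if dimension not in self.scores:
            raise ValueError(f"unknown dimension: {dimension}")
        self.scores[dimension] = max(0.0, min(100.0, self.scores[dimension] + delta))
        self.event_count += 1
        return self.scores[dimension]
```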

10 actions: record_event, get_reputation, get_leaderboard, compare,
record_task_outcome, record_vote, endorse, penalize, get_history, reset

Key features:
- Task outcomes affect competence + reliability with budget efficiency bonus
- Voting participation boosts cooperation; correct votes boost trustworthiness
- Peer endorsements weighted by endorser's own reputation (0.5x to 1.5x)
- Leaderboard ranking by any dimension with min-events filtering
- Side-by-side agent comparison across all dimensions
- Also fixes f-string syntax error in service_api.py from PR #152
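The leaderboard and comparison features listed above might look roughly like the following. This is a sketch under stated assumptions: the input shape (a dict of agent id to `{"scores": ..., "events": ...}`), the function names, and the `limit` parameter are invented for illustration and may not match the skill's real actions or storage format.

```python
def leaderboard(agents: dict, dimension: str,
                min_events: int = 0, limit: int = 10) -> list:
    """Rank agents by one dimension, skipping those with too few recorded events."""
    eligible = [
        (agent_id, data["scores"][dimension])
        for agent_id, data in agents.items()
        if data["events"] >= min_events
    ]
    # Highest score first; sort is stable, so ties keep insertion order.
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return eligible[:limit]


def compare(agents: dict, a: str, b: str) -> dict:
    """Per-dimension difference (a minus b) between two agents."""
    return {
        dim: agents[a]["scores"][dim] - agents[b]["scores"][dim]
        for dim in agents[a]["scores"]
    }
```

The minimum-events filter keeps an agent with one lucky result from outranking agents with a long track record.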

18 tests pass, 17 smoke tests pass.

Pillar: Replication + Self-Improvement

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lbartoszcze lbartoszcze merged commit 429febd into main Feb 8, 2026
0 of 4 checks passed