Social Simulation with Moral Agents

An evolutionary multi-agent simulation exploring how moral behaviors emerge through cooperation. LLM-powered agents with different moral frameworks (universal, reciprocal, kin-focused, selfish) interact in a resource-gathering environment — hunting, sharing, fighting, and reproducing — to test hypotheses about why morality might be favored by natural selection.

🏛️ Ecosystem & Project Page

This simulation engine is the core framework for the ACL 2026 paper: "Investigating Moral Evolution via LLM-based Agent Simulation", hosted under the MoralAgentSim organization.

Setup

Prerequisites

  • Python 3.12+
  • uv (package manager)
  • Git

Installation

git clone https://github.com/MoralAgentSim/social-evol-sim.git
cd social-evol-sim

# Install dependencies
uv sync

# Set up environment variables
cp .env.example .env
# Edit .env and set OPENROUTER_API_KEY (recommended — gives access to all
# OpenRouter-supported models through a single provider)

Database (optional)

For checkpoint storage in PostgreSQL:

pip install "psycopg[binary]"

Running the Simulation

# Start a fresh simulation
uv run python main.py run --config_dir configZ_major_v2

# Run with real-time dashboard
uv run python main.py run --config_dir configA_z8_easyHunting_visible --dashboard

# Resume from a checkpoint
uv run python main.py resume <RUN_ID> --config.world.max_life_steps 50

# Resume from a specific time step
uv run python main.py resume <RUN_ID> --time_step 10

# List available runs
uv run python main.py list-runs

# Estimate token usage and cost
uv run python main.py estimate-cost --config_dir configZ_major_v2

# Use OpenRouter as the LLM provider (any OpenRouter-supported model works)
uv run python main.py run \
  --config_dir configZ_major_v2 \
  --config.llm.provider openrouter \
  --config.llm.chat_model anthropic/claude-sonnet-4

# Combine with other flags
uv run python main.py run \
  --config_dir configZ_major_v4 \
  --config.llm.provider openrouter \
  --config.llm.chat_model openai/gpt-4o-mini \
  --config.llm.async_config.max_concurrent_calls 2 \
  --config.world.max_life_steps 1 \
  --dashboard

# Run 4 kin-focused agents only (override agent count and ratios)
uv run python main.py run \
  --config_dir configZ_major_v4 \
  --config.llm.provider openrouter \
  --config.llm.chat_model google/gemini-2.5-flash \
  --config.llm.async_config.max_concurrent_calls 20 \
  --config.agent.initial_count 4 \
  --config.agent.ratio.kin_focused_moral 1.0 \
  --config.agent.ratio.universal_group_focused_moral 0.0 \
  --config.agent.ratio.reciprocal_group_focused_moral 0.0 \
  --config.agent.ratio.reproductive_selfish 0.0 \
  --config.world.max_life_steps 3 \
  --dashboard

Model selection

This project follows the OpenRouter model-naming convention (<vendor>/<model-id>), giving access to the full catalogue of providers — Anthropic, OpenAI, Google, DeepSeek, Mistral, Meta, and more — through a single API key. See the full list at openrouter.ai/models.

Examples: anthropic/claude-sonnet-4, openai/gpt-4o-mini, google/gemini-2.5-flash, deepseek/deepseek-chat-v3-0324.

CLI Subcommands

Subcommand        Description
run               Start a fresh simulation (--config_dir required)
resume <RUN_ID>   Resume from a checkpoint
list-runs         List available simulation runs
estimate-cost     Estimate token usage and cost (--config_dir required)

Shared Flags (for run and resume)

Flag                Description
--checkpoint_dir    Checkpoint save location (default: ./data)
--dashboard         Enable Rich Live real-time dashboard
--log_level         debug, info, warning, error, critical
--debug_responses   Save raw LLM responses on validation errors
--no_db             Disable database; file-only checkpoints
--config.*          Override any nested config field (auto-generated from the Pydantic model)

Common config overrides:

Override                                            Description
--config.world.max_life_steps N                     Max simulation steps
--config.world.communication_and_sharing_steps N    Communication frequency
--config.llm.provider                               LLM provider (recommended: openrouter; also supports openai, deepseek, alibaba)
--config.llm.chat_model                             Model id in OpenRouter format (e.g., anthropic/claude-sonnet-4, openai/gpt-4o-mini)
--config.llm.async_config.max_concurrent_calls N    Max concurrent LLM calls (default: 10)
--config.agent.initial_count N                      Number of starting agents
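The `--config.*` flags are auto-generated from a nested Pydantic settings model. The core mechanism — resolving a dotted CLI path into a nested structure — can be sketched with plain dictionaries (the function name and dict layout here are illustrative, not the project's actual API):

```python
def apply_override(config: dict, dotted_key: str, value):
    """Set a nested config value from a dotted CLI path, e.g.
    'llm.async_config.max_concurrent_calls' -> config['llm']['async_config'][...]."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})  # create intermediate levels as needed
    node[leaf] = value
    return config

# Mirrors: --config.llm.async_config.max_concurrent_calls 2
config = {"world": {"max_life_steps": 80}, "llm": {"provider": "openrouter"}}
apply_override(config, "llm.async_config.max_concurrent_calls", 2)
```

In the real CLI, Pydantic additionally validates each override against the field's declared type before the run starts.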

Architecture

The simulation runs an async three-phase step loop:

  1. Phase 1 — Parallel LLM Decisions: all living agents query the LLM concurrently against a frozen checkpoint of the world state, returning pure AgentDecisionResult objects with no side effects.
  2. Phase 2 — Sequential Action Application: Decisions are applied one-by-one. A stale-action guard catches ValueError for race conditions (e.g., two agents hunting the same prey).
  3. Phase 3 — Environment Updates: Social and physical environment updates (plant regrowth, prey respawn).
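The three phases above can be sketched as follows. This is a minimal illustration with stub decision and state objects — the real engine uses AgentDecisionResult objects and checkpointed state:

```python
import asyncio

def apply_decision(decision: dict, state: dict) -> None:
    # Phase 2 helper: mutates shared state; may raise ValueError on stale actions
    state.setdefault("log", []).append(decision["action"])

async def decide(agent: str, frozen_state: dict) -> dict:
    # Phase 1: stands in for an LLM call; returns a pure, side-effect-free decision
    await asyncio.sleep(0)
    return {"agent": agent, "action": "Collect"}

async def simulation_step(agents: list[str], state: dict) -> dict:
    # Phase 1 — parallel decisions against a frozen snapshot of the world
    frozen = dict(state)
    decisions = await asyncio.gather(*(decide(a, frozen) for a in agents))

    # Phase 2 — sequential application with a stale-action guard
    for decision in decisions:
        try:
            apply_decision(decision, state)
        except ValueError:
            continue  # e.g. two agents hunted the same prey; drop the stale action

    # Phase 3 — environment updates (plant regrowth, prey respawn)
    state["step"] = state.get("step", 0) + 1
    return state

state = asyncio.run(simulation_step(["alice", "bob"], {}))
```

The key property is that Phase 1 never mutates shared state, so concurrent LLM calls cannot race; all conflicts surface in Phase 2, where they are resolved deterministically in sequence.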

Agent Actions

Agents choose from 8 action types each step: Collect, Allocate, Hunt, Fight, Rob, Reproduce, Communicate, DoNothing.
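The eight actions could be modeled as an enum; the member names below are an assumption, only the eight action labels come from this README:

```python
from enum import Enum

class ActionType(Enum):
    COLLECT = "Collect"
    ALLOCATE = "Allocate"
    HUNT = "Hunt"
    FIGHT = "Fight"
    ROB = "Rob"
    REPRODUCE = "Reproduce"
    COMMUNICATE = "Communicate"
    DO_NOTHING = "DoNothing"

# Parsing an action label from an LLM response back into a typed value:
action = ActionType("Hunt")
```

Using an enum means a malformed action string from the LLM fails loudly as a ValueError rather than silently passing through the action dispatcher.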

Morality Types

Agents are assigned one of 5 moral frameworks that shape their LLM prompts:

  • Universal group-focused moral — cooperates broadly
  • Reciprocal group-focused moral — tit-for-tat cooperation
  • Kin-focused moral — prioritizes family/offspring
  • Reproductive selfish — self-interested, reproduces aggressively
  • Reproduction-averse selfish — self-interested, avoids reproduction costs

🧬 Experimental Design & Core Configurations

The evolutionary trajectory of the simulation is governed by physical environment limits and cognitive (LLM) framing parameters. All core experiments run for 80 scaled time steps, with N=4 simulation replications per condition.

Each config directory under config/ contains an isolated settings.json (world physics parameters, LLM model tuning, agent ratios) alongside injected moral framework prompt templates. We expose four fundamental experimental environments to observe the emergence (or extinction) of specific moral behaviors under selective pressures:
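A settings.json might look like the fragment below. Every key here is hypothetical, chosen to mirror the `--config.*` override paths listed earlier (world physics, LLM tuning, agent ratios) — consult an actual config directory for the real schema:

```json
{
  "world": {
    "max_life_steps": 80,
    "communication_and_sharing_steps": 5
  },
  "llm": {
    "provider": "openrouter",
    "chat_model": "anthropic/claude-sonnet-4",
    "async_config": { "max_concurrent_calls": 10 }
  },
  "agent": {
    "initial_count": 4,
    "ratio": {
      "universal_group_focused_moral": 0.25,
      "reciprocal_group_focused_moral": 0.25,
      "kin_focused_moral": 0.25,
      "reproductive_selfish": 0.25
    }
  }
}
```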

1. Baseline Configuration (configZ_major_v*)

  • Mechanics: High resource spawn rate (carrying_capacity: abundant) and frictionless (0 HP cost) agent-to-agent communication.
  • Evolutionary Pressure: Low pressure environments act as a control. Without life-threatening constraints, purely survival-oriented selection pressure drops.
  • Observed Emergence: Kin-focused agents dominate this condition. They safely exploit the peaceful environment to rapidly expand familial lineages without needing complex, risky trust negotiations outside their in-group.

2. Resource Scarcity (configA_z8_easyHunting_visible)

  • Mechanics: Ecological carrying capacity is artificially suppressed. The environment's prey/food regeneration rates are halved, and baseline vitality drainage per step is increased.
  • Evolutionary Pressure: High environmental attrition. Agents can no longer survive independently; cooperation and resource sharing become mandatory for long-term health.
  • Observed Emergence: This condition naturally selects for Reciprocal agents. Able to evaluate outsiders and negotiate resource sharing, their trust clusters scale horizontally, bypassing the starvation limits of purely familial groups.

3. Social Interaction Cost & Friction (config03_*)

  • Mechanics: Imposes a stiff metabolic penalty for dialogue. The Communicate action now costs an explicit 1 HP and 10 tokens per invocation.
  • Evolutionary Pressure: Taxing dialogue structurally penalizes highly social and cooperative types (Universal/Reciprocal) who rely heavily on communication protocols to forge alliances.
  • Observed Emergence: As cooperative types drain themselves through constant socialization, isolated Selfish agents thrive. By avoiding the communication tax entirely and hoarding individual resources, they outlive the over-extending cooperators.

4. Moral Type Observability (config04_*)

  • Mechanics: Removes the LLM's blindness to peer alignment: observation prompts are injected with explicit tags revealing the internal moral framework of each observed peer.
  • Evolutionary Pressure: Simulates a perfect "reputation system." Free-rider and defection problems are effectively eliminated, since risk can be assessed before any interaction.
  • Observed Emergence: Highly efficient selective altruism. Universal and Reciprocal agents quickly isolate Selfish defectors, starving them of shared resources and driving selfish behaviors to rapid, enforced extinction.

Testing

# Run all tests
uv run pytest scr/tests/ -v

# Run a single test file
uv run pytest scr/tests/test_stale_action_guard.py -v

# Run a specific test
uv run pytest scr/tests/test_async_step.py::TestEventBus::test_publish_subscribe -v

Integration tests that require API keys will auto-skip when keys are unavailable.
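One common way to implement that auto-skip with pytest is a skipif marker keyed on the environment variable. The marker name and test body below are illustrative, not the repository's actual test code:

```python
import os

import pytest

# Skip any decorated test when no OpenRouter key is configured.
requires_openrouter = pytest.mark.skipif(
    "OPENROUTER_API_KEY" not in os.environ,
    reason="OPENROUTER_API_KEY not set; skipping live-API integration test",
)

@requires_openrouter
def test_live_chat_completion():
    # Would exercise a real LLM call; only runs when the key is present.
    assert os.environ["OPENROUTER_API_KEY"]
```

Skipped tests are reported as `s` in the pytest summary rather than failing, so CI stays green without API credentials.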

Project Structure

main.py                              # Entry point (async)
config/                              # Simulation configurations
scr/
  api/
    llm_api/                         # LLM client (litellm), config, providers
    db_api/                          # PostgreSQL checkpoint storage
  models/
    agent/                           # Agent, actions, responses, decision_result
    environment/                     # Physical & social environments
    simulation/                      # Checkpoint
    core/                            # Config, metadata, logs
    prompt_manager/                  # Prompt construction, messages
  simulation/
    runner/                          # simulation_step (3-phase), runner, resumer
    agent_decision/                  # Async LLM decision-making, retry
    act_manager/                     # Action dispatch + handlers
    env_manager/                     # Environment step logic
    cli/                             # CLI parsing + command execution
    event_bus.py                     # AsyncIO pub/sub
    dashboard.py                     # Rich Live dashboard
  utils/                             # Logging, checkpoint I/O, random
  tests/                             # Unit and integration tests
