Testing whether the ADICO institutional grammar framework is linguistically universal or typologically specific.
This repository contains code, data, and analysis for a research project examining how linguistic typology shapes institutional reasoning. We test the Crawford-Ostrom Institutional Grammar (IG) framework by generating deliberative debates in four typologically diverse languages and analyzing whether the ADICO structure (Attribute-Deontic-Aim-Condition-Or-else) emerges naturally.
Does the "attributed, deontically qualified action" assumed by institutional analysis represent a universal cognitive unit, or does it reflect the grammatical affordances of English and similar languages?
ADICO is a genre-specific grammar, not a universal template. All four languages can produce ADICO-compatible outputs when explicitly instructed, but none do so by default. Each language channels normativity through different grammatical pathways:
| Language | Alignment | Default Normative Strategy |
|---|---|---|
| English | Nominative-accusative | Agent-action framing, modal deontics |
| Basque | Ergative-absolutive | Process-orientation, distributed agency |
| Czech | Nom-acc + aspect | Middle voice, state descriptions |
| Hebrew | Nom-acc + binyanim | Causative templates, implicit deontics |
ErgativeAgentsSims2025/
├── debate.py # Main debate generation script
├── research_agent.py # Cross-linguistic analysis engine
├── visualizations.py # Chart and dashboard generation
├── requirements.txt # Python dependencies
│
├── logs2025/ # Raw debate transcripts (JSONL)
│ ├── english_*.jsonl
│ ├── basque_*.jsonl
│ ├── czech_*.jsonl
│ └── hebrew_*.jsonl
│
├── research_outputs/ # Analysis results
│ └── session_*/
│ ├── reports/ # JSON + Markdown reports
│ └── visualizations/ # Charts and dashboards
│
├── article_figures/ # Publication-ready figures
│ ├── appendix/ # Appendix visualizations
│ └── *.png
│
├── docs/ # Documentation
│ ├── ARTICLE_MATERIALS.md # Draft article content
│ ├── THREE_STUDY_COMPARISON.md
│ ├── DATA_DICTIONARY.md # Data file documentation
│ └── METHODOLOGY.md # Full methodology
│
├── analyzers/ # NLP analysis modules
│ ├── morphological_analyzer.py
│ ├── syntactic_analyzer.py
│ └── ...
│
└── tests/ # Test suite
- Python 3.10 or higher
- OpenAI API key (for debate generation)
- ~4GB disk space (for NLP models)
# Clone the repository
git clone https://github.com/yourusername/ErgativeAgentsSims2025.git
cd ErgativeAgentsSims2025
# Create virtual environment
python -m venv venv
# Activate (Windows PowerShell)
.\venv\Scripts\Activate.ps1
# Activate (Linux/Mac)
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Download NLP models
python -m spacy download en_core_web_sm
python -c "import stanza; stanza.download('eu')" # Basque
python -c "import stanza; stanza.download('cs')" # Czech
python -c "import stanza; stanza.download('he')" # Hebrew
# Configure API key
echo "OPENAI_API_KEY=your_key_here" > .env
To reproduce the main findings using existing data:
# Analyze the neutral condition debates (4 languages)
python research_agent.py --logs logs2025/english_ai_harm_prevention_open_20260210_*.jsonl \
logs2025/basque_ai_harm_prevention_open_20260210_*.jsonl \
logs2025/czech_ai_harm_prevention_open_20260210_*.jsonl \
logs2025/hebrew_ai_harm_prevention_open_20260210_*.jsonl
# Results appear in research_outputs/session_YYYYMMDD_HHMMSS/
# Run a 6-round debate with neutral prompts
python debate.py --language english --open-form --topic ai_harm_prevention --rounds 6
# Run with rule-demanding prompts (15 rounds)
python debate.py --language basque --open-form --topic ai_harm_prevention --rounds 15 \
--prompt-style proposal
# Run with anti-rules prompts
python debate.py --language czech --open-form --topic ai_harm_prevention --rounds 6 \
--prompt-style anti-rules
# Generate appendix figures
python generate_appendix_figures.py
# Generate article figures
python generate_article_figures.py
# Figures saved to article_figures/
The files in logs2025/ are JSONL debate transcripts. Each line is a JSON object with:
{
"round": 1,
"speaker": "Agent_A",
"timestamp": "2026-02-10T13:35:56.123Z",
"content": "Debate utterance text...",
"metadata": {
"language": "english",
"topic": "ai_harm_prevention",
"prompt_condition": "neutral"
}
}
File naming convention: {language}_{topic}_{mode}_{datetime}_{hash}.jsonl
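The record format and naming convention above can be read with a few lines of Python. The following is a minimal sketch, not code from this repository: the helper names (`parse_log_name`, `parse_jsonl`, `turns_per_speaker`) are hypothetical, and the regular expression assumes that `{datetime}` is written as `YYYYMMDD_HHMMSS` and that `{mode}` is a single lowercase word, neither of which is confirmed here.

```python
import json
import re

# Assumed layout: {language}_{topic}_{mode}_{datetime}_{hash}.jsonl
# Assumption: datetime is YYYYMMDD_HHMMSS and mode is one lowercase word.
# The topic may itself contain underscores (e.g. ai_harm_prevention).
FILENAME_RE = re.compile(
    r"^(?P<language>[a-z]+)_(?P<topic>.+)_(?P<mode>[a-z]+)"
    r"_(?P<datetime>\d{8}_\d{6})_(?P<hash>[^_]+)\.jsonl$"
)


def parse_log_name(name):
    """Split a transcript filename into its named components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected transcript filename: {name!r}")
    return match.groupdict()


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines (e.g. an open file), skipping blanks."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def turns_per_speaker(records):
    """Count how many utterances each speaker contributed."""
    counts = {}
    for record in records:
        speaker = record["speaker"]
        counts[speaker] = counts.get(speaker, 0) + 1
    return counts
```

To use it on a real transcript, pass an open file handle to `parse_jsonl`, for example `with open(path, encoding="utf-8") as handle: records = parse_jsonl(handle)`. If the actual timestamps use a different layout, only the `datetime` group of the pattern needs to change.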
Analysis outputs include:
- research_report_*.json - Structured quantitative data
- research_report_*.md - Human-readable analysis
- SESSION_SUMMARY.md - Cross-linguistic comparison
Key metrics in JSON reports:
- subjects_per_sentence - Agent visibility measure
- hhi_agency - Herfindahl-Hirschman Index for agency concentration
- voice_valency_analysis - Distribution of grammatical voice types
- information_status - Du Bois Given A Constraint metrics
See docs/DATA_DICTIONARY.md for complete documentation.
| Condition | Prompt Style | Rounds | Purpose |
|---|---|---|---|
| Rule-demanding | "formulate clear rules and guidelines" | 15 | Test ADICO capacity |
| Anti-rules | "focus on experiences, not formal rules" | 6 | Test natural defaults |
| Neutral | "describe how things should be" | 6 | Baseline comparison |
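The three conditions in the table can be expressed as a small lookup that builds the corresponding `debate.py` command lines. This is a sketch under assumptions: the mapping from condition to `--prompt-style` value is inferred from the usage examples in this README (rule-demanding uses `proposal`, anti-rules uses `anti-rules`, neutral passes no style flag), and `CONDITIONS` and `build_command` are hypothetical names, not part of the repository.

```python
import shlex

# Condition -> CLI settings, inferred from the README usage examples.
# Assumption: the neutral condition is the default when --prompt-style is omitted.
CONDITIONS = {
    "rule-demanding": {"prompt_style": "proposal", "rounds": 15},
    "anti-rules": {"prompt_style": "anti-rules", "rounds": 6},
    "neutral": {"prompt_style": None, "rounds": 6},
}

LANGUAGES = ["english", "basque", "czech", "hebrew"]


def build_command(language, condition, topic="ai_harm_prevention"):
    """Return the argument list for one debate run."""
    spec = CONDITIONS[condition]
    args = [
        "python", "debate.py",
        "--language", language,
        "--open-form",
        "--topic", topic,
        "--rounds", str(spec["rounds"]),
    ]
    if spec["prompt_style"] is not None:
        args += ["--prompt-style", spec["prompt_style"]]
    return args


if __name__ == "__main__":
    # Print every language x condition combination as a shell command.
    for language in LANGUAGES:
        for condition in CONDITIONS:
            print(shlex.join(build_command(language, condition)))
```

Returning an argument list rather than a string keeps the commands safe to pass to `subprocess.run` without shell quoting issues.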
- Subject Realization - Measures explicit agent naming (subjects per sentence)
- Agency Distribution (HHI) - Concentration of grammatical roles (A/S/O)
- Voice Analysis - Distribution of active, passive, middle, causative constructions
- Information Status - Du Bois's Given A Constraint adherence
- ADICO Failure Modes - Which components resist coding, by language
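Two of the metrics above reduce to simple arithmetic. The sketch below assumes the standard Herfindahl-Hirschman definition (the sum of squared shares) applied to counts of grammatical roles, and a plain mean for subjects per sentence; the repository's analyzers may compute these differently, and the function names are hypothetical.

```python
from collections import Counter


def hhi(role_labels):
    """Herfindahl-Hirschman Index over a sequence of role labels.

    Assumes the standard definition: the sum of squared shares.
    With three roles (A, S, O) the value ranges from 1/3 (evenly
    spread) to 1.0 (all arguments in a single role).
    """
    counts = Counter(role_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no role labels supplied")
    return sum((n / total) ** 2 for n in counts.values())


def subjects_per_sentence(subject_counts):
    """Mean number of explicit subjects, given one count per sentence."""
    if not subject_counts:
        raise ValueError("no sentences supplied")
    return sum(subject_counts) / len(subject_counts)
```

For instance, `hhi(["A", "A", "S", "O"])` gives 0.5² + 0.25² + 0.25² = 0.375, slightly more concentrated than an even split across the three roles.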
Results should be interpreted with awareness that:
- Debates are LLM-mediated (GPT-4o), not native speaker production
- Single topic (AI harm prevention) may not generalize
- NLP tools have variable accuracy across languages
- Genre conventions may influence outputs
See docs/METHODOLOGY.md for full methodology documentation.
| Metric | English | Basque | Czech | Hebrew |
|---|---|---|---|---|
| Subjects/Sentence | 1.78 | 0.80 | 1.80 | 1.37 |
| HHI Agency | 0.415 | 0.457 | 0.336 | 0.336 |
| Active Transitive % | 81.5 | 23.5 | 53.5 | 21.3 |
| Given A Adherence % | 51.9 | 42.8 | 54.1 | 50.1 |
The analysis identifies four deliberative grammar types that emerge naturally:
- PPO (Process-Participant-Orientation) - Basque default
- RST (Relational-State-Transition) - Czech default
- AFIG (Affected-First Grammar) - Ergative-aligned discourse
- ECG (Enunciative-Contextual Grammar) - Hebrew default
If you use this project in research, please cite:
@software{cross_ling_ig_2026,
title = {Cross-Linguistic Institutional Grammar Analysis},
author = {[Author Name]},
year = {2026},
url = {https://github.com/yourusername/ErgativeAgentsSims2025},
note = {Code and data for testing ADICO universality across languages}
}
This project is licensed under the MIT License - see LICENSE for details.
- OpenAI GPT-4o for debate generation
- Stanford NLP Group for Stanza
- spaCy for English analysis
- The Ostrom Workshop for institutional analysis frameworks
- Crawford, S., & Ostrom, E. (1995). A Grammar of Institutions. American Political Science Review.
- Dixon, R. M. W. (1994). Ergativity. Cambridge University Press.
- Du Bois, J. W. (1987). The Discourse Basis of Ergativity. Language.
- Dowty, D. (1991). Thematic Proto-Roles and Argument Selection. Language.
For questions about the code or methodology, please open an issue on GitHub.