Chorus::Engine

Chorus is a Perl inference engine that turns a normative corpus into a conformity-checking pipeline. An AI agent builds the knowledge base; the engine executes it deterministically and traceably — no LLM, no network, on any machine with Perl.

The system works in two distinct phases:

Phase A — Build   [AI agent, supervised, once per standard]
  Raw corpus → chorus-feed → KB + YAML rules
             → chorus-check → deployable Perl pipeline

Phase B — Execute [Chorus alone, no LLM, for every project]
  project.json → perl run.pl → conformity report
  100 % deterministic · reproducible · certifiable

The LLM is involved only in Phase A, where it reads the corpus, structures the knowledge and generates the artefacts. Phase B involves no LLM at all: the Perl pipeline runs on its own, deterministically and reproducibly.

Normative corpus (PDF, plain text, Word, Excel)
        │
   chorus-pdf / chorus-word / chorus-excel + chorus-feed   ← AI agent extracts and formalises the rules
        │
   KB: ontology · YAML rules · normative tables
        │
   chorus-check               ← generates the Perl pipeline, runs it
        │
   perl run.pl project.json   ← deterministic, reproducible, no AI agent
        ▼
  ✅ COMPLIANT / ❌ NON_COMPLIANT  (per element, per agent, with reason and reference)

Origin

Chorus belongs to the tradition of symbolic AI (explicit knowledge representation, typed structures, deterministic inference), in the lineage of expert systems and Marvin Minsky's Frames.

The first version was born in 2013 from the porting to Perl of an original LISP project. The goal was twofold: to show that Perl was perfectly suited to this kind of implementation, and to offer the CPAN community an inference engine inspired by Minsky's Frames — typed objects, slots, inheritance, inference chain.

More than a decade later, an LLM's analysis of the project revealed an unexpected complementarity: where the symbolic engine excels at executing rules deterministically and traceably, the LLM excels at reading a corpus and formalising them. The real friction — writing YAML rules by hand, a tedious task — was the LLM's natural ground.

That encounter gave rise to version 2.

Chorus v2 is an augmented symbolic system: the inference engine remains sovereign — frames, slots, inference chain, no neural network in the decision layer. The LLM is a preprocessing tool, not a decision-maker. Two forms of AI, complementary rather than competing.


Why an LLM cannot run the verification itself

Chorus occupies a specific position in the current AI landscape. Most hybrid systems use a language model as the decision layer and rules as guardrails. Chorus inverts this: the LLM is an extraction tool that reads documents and formalises rules; the inference engine handles all reasoning. The LLM never draws a conclusion.

1. Exhaustive corpus coverage — impossible to guarantee. A language model does probabilistic completion, not exhaustive enumeration. Rare clauses, normative footnotes, and cross-references between standards are silently omitted. The problem: the model does not know what it omits.

2. Consistency across a full project dossier — certain degradation. A real dossier includes many heterogeneous documents — specifications, calculation notes, product data sheets, supporting evidence. On long contexts, an LLM loses precision on items introduced early and does not reliably detect cross-document contradictions.

3. Reproducibility — absent by nature. Two runs on the same project can produce different verdicts. For a control bureau or an insurer, this is disqualifying.

4. Traceability — structurally absent. An LLM may hallucinate references, paraphrase imprecisely, or conflate two clauses. It cannot guarantee that each assertion is anchored to a specific article of a specific standard.

5. Normative updates — opaque. When a standard is revised, there is no way to know which part of the LLM's reasoning is affected. With an explicit rule engine, the update is surgical: the affected YAML rules are identified, corrected, and re-tested in isolation.

The division of labour

An LLM is an excellent extractor and translator of normative text into formal rules. It is a poor conformity checker.

This is precisely the division of labour Chorus implements: the LLM generates and formalises the rules (chorus-feed); the inference engine executes them deterministically and traceably (chorus-check). Together they cover what neither can do alone.

Running the generated pipeline twice on the same project file (perl run.pl project.json), on any machine, always produces the same output: no sampling, no temperature, no randomness in the decision layer.
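One way to check this property on a generated pipeline is to run it twice and compare digests of the two reports. The sketch below uses only core modules. The helper name is_reproducible and the stand-in command are illustrative and not part of Chorus; in practice the command would be perl run.pl project.json.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Run a command twice and report whether both runs produced
# byte-identical output. Hypothetical helper, not a Chorus API.
sub is_reproducible {
    my (@cmd) = @_;
    my @digests;
    for my $run (1 .. 2) {
        open(my $out, '-|', @cmd) or die "cannot run @cmd: $!";
        local $/;                       # slurp the whole report
        my $report = <$out>;
        close $out;
        push @digests, md5_hex($report // '');
    }
    return $digests[0] eq $digests[1];
}

# Stand-in for: is_reproducible('perl', 'run.pl', 'project.json')
my $ok = is_reproducible($^X, '-e', 'print "beam-01 COMPLIANT\n"');
print $ok ? "reproducible\n" : "outputs differ\n";
```

Comparing digests rather than eyeballing reports makes the check easy to automate, for example as a regression test after each KB enrichment.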


AI-assisted pipeline — chorus-* commands

The chorus-* commands are AI agent skills — not shell scripts. Each is loaded by an AI agent (Claude, Copilot, ECA…) and executed interactively in your development environment. The Perl pipeline they produce runs entirely on its own: no AI agent, no LLM, no network connection required at runtime.

Pipeline overview

Normative corpus (PDF, plain text, Word, Excel)
        │
   chorus-pdf          ← extracts PDFs (hybrid by default / text / auto / images)
   chorus-word         ← extracts Word documents (.docx)
   chorus-excel        ← extracts Excel spreadsheets and CSV (.xlsx, .csv)
        │
   corpus/<NNN>-<slug>.txt / -vision.md
        │
   chorus-feed         ← builds the KB: ontology, YAML rules, Helpers.pm
        │
   agent/agents/*.org · rules/**/*.yml · lib/.../Helpers.pm
        │                 ← domain expert reviews and corrects
   chorus-check        ← generates Feed.pm, Agent/*.pm, Expert.pm, run.pl
        │                   then runs: perl run.pl project.json
        ▼
  ✅ COMPLIANT / ❌ NON_COMPLIANT  (per element, per agent, with reason)
        │
   chorus-strengthen   ← classifies gaps, produces enrichment roadmap
        │
   chorus-feed --enrich ← targeted KB enrichment
        └──────────────────────────────────────────┐
                                                   │ reinforcement loop
                                            chorus-check --all ✅

The project file fed to chorus-check can be:

  • written by hand (if the slot vocabulary is known)
  • generated from the KB with chorus-create-project (conforming + KO variants; optional 4-file coverage suite with --batch)
  • aligned from engineer documents with chorus-import-project (PDF, Word, Excel, inline table), which bridges engineer terminology to KB slot names by enriching a thesaurus and assigning a confidence level to each source term:

| Level | Meaning |
|---|---|
| ✅ certain | Exact or trivially equivalent match |
| ⚠️ probable | Close match with documented transformation |
| ❓ ambiguous | Multiple KB candidates; human decision required |
| ⛔ gap | Required slot absent from source; blocks the pipeline |
| ⬜ out-of-scope | Present in source, absent from KB; noted but ignored |

The alignment report (import-report-NNN.org) serves as the audit trail for each mapping decision. The thesaurus is re-read and enriched on subsequent imports, so matching against the corpus terminology improves over time.
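The effect of the confidence levels on the import step can be sketched as a simple gate. This is an illustration only: the data layout, the function name import_status, and the sample terms and slot names are all hypothetical, and only the level names and their consequences come from the table above.

```perl
use strict;
use warnings;

# Hypothetical alignment result: source term => [ KB slot, level ].
# Level names follow the table above; terms and slots are invented.
my %alignment = (
    'Section (mm)'   => [ 'section_mm',     'certain'      ],
    'Timber grade'   => [ 'strength_class', 'probable'     ],
    'Span'           => [ undef,            'ambiguous'    ],
    'Supplier phone' => [ undef,            'out-of-scope' ],
);

# Decide what the import step may do with a set of mappings.
sub import_status {
    my ($map) = @_;
    my %count;
    $count{ $_->[1] }++ for values %$map;
    return 'blocked'      if $count{gap};        # required slot missing
    return 'needs-review' if $count{ambiguous};  # human decision required
    return 'ready';       # certain, probable and out-of-scope pass through
}

print import_status(\%alignment), "\n";   # → needs-review
```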

Commands at a glance

| Command | Role |
|---|---|
| chorus-quickstart | Guided overview; start here if new to Chorus |
| chorus-pdf | Extract a PDF corpus (hybrid by default / text / auto / images) |
| chorus-word | Extract a Word document (.docx) into an enriched corpus |
| chorus-excel | Extract an Excel spreadsheet or CSV into an enriched corpus |
| chorus-feed | Build or enrich the KB from a corpus |
| chorus-check | Generate infrastructure + run conformity check |
| chorus-create-project | Generate a synthetic project JSON from the KB |
| chorus-import-project | Align engineer documents with KB slot names |
| chorus-strengthen | Identify rule gaps, produce enrichment roadmap |

Reinforcement loop

Once the first pipeline is running, chorus-strengthen classifies every discordance (rule too strict, rule too permissive, Feed targeting gap) and recommends the corpus needed to close each gap:

chorus-create-project <sb> --batch          ← 4-file coverage suite
chorus-check <sb> --all                     ← synthesis table
chorus-strengthen <sb>                      ← gap report + roadmap
chorus-feed <sb> corpus-fix.txt --enrich    ← targeted enrichment
chorus-check <sb> --all                     ← verify convergence ✅
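The three discordance classes can be read as a comparison between the verdict a synthetic project was built to produce and the verdict the pipeline returned. The sketch below is one possible reading, not the logic of chorus-strengthen: the function name classify, the class labels, and in particular the assumption that a Feed targeting gap shows up as a missing verdict are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical classifier: expected verdict (from the synthetic
# project) versus actual verdict (from the pipeline run).
sub classify {
    my ($expected, $actual) = @_;
    # Assumption: an element no rule reached yields no verdict at all.
    return 'feed-targeting-gap'  unless defined $actual;
    return 'ok'                  if $expected eq $actual;
    return 'rule-too-strict'     if $expected eq 'COMPLIANT';
    return 'rule-too-permissive';
}

print classify('COMPLIANT', 'NON_COMPLIANT'), "\n";   # → rule-too-strict
print classify('NON_COMPLIANT', 'COMPLIANT'), "\n";   # → rule-too-permissive
print classify('COMPLIANT', undef), "\n";             # → feed-targeting-gap
```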

Once generated, runs without an AI agent

# On any machine with Perl installed:
perl run.pl project.json

# Re-run with a different project — no regeneration:
perl run.pl other-project.json

Full command reference: doc/en/04-chorus-commands.md


Application domains

Chorus is not tied to any particular sector. A domain is Chorus-compatible whenever three conditions hold:

  1. The project is described by typed elements — each object to validate (structural member, contractual clause, software component…) has measurable attributes and a discriminating type.
  2. The standard states thresholds, conditions and reference tables — explicit requirements, not open-ended prose.
  3. The decision must be traceable and reproducible — audit, certification, regulatory filing, litigation.
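As a rough illustration of the three conditions, the plain-Perl sketch below checks typed elements against a reference table and attaches a reason and a reference to every verdict. It does not use the Chorus API; the element names, the threshold value and the clause identifier are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical project elements: each has a discriminating type
# and measurable attributes (condition 1).
my @elements = (
    { id => 'joist-01', type => 'joist', span_m => 3.2 },
    { id => 'joist-02', type => 'joist', span_m => 4.8 },
);

# Hypothetical reference table: an explicit threshold per type,
# tied to a clause identifier (condition 2). Values are invented.
my %max_span_m = (
    joist => { limit => 4.5, ref => 'EXAMPLE-STD clause 5.2' },
);

# Each verdict carries its reason and reference (condition 3).
sub check {
    my ($e) = @_;
    my $rule = $max_span_m{ $e->{type} } or return;   # no rule for this type
    my $ok   = $e->{span_m} <= $rule->{limit};
    return {
        id      => $e->{id},
        verdict => $ok ? 'COMPLIANT' : 'NON_COMPLIANT',
        reason  => "span $e->{span_m} m vs limit $rule->{limit} m",
        ref     => $rule->{ref},
    };
}

for my $r (map { check($_) } @elements) {
    print "$r->{id}: $r->{verdict} ($r->{reason}; $r->{ref})\n";
}
```

A requirement that cannot be reduced to this shape (a type, an attribute, a threshold or table lookup, a clause to cite) is a sign that the corresponding part of the corpus is open-ended prose and needs expert interpretation before it can become a rule.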

| Domain | Typical corpus |
|---|---|
| 🔐 Cybersecurity / NIS2 / DORA | SecNumCloud v3.2, NIS2 Annex II, DORA, ETSI EN 319 412 |
| 🌿 CSRD / Environment | ESRS E1–E5, S1–S4, GHG Protocol, EU Taxonomy |
| 🏗️ Construction / BIM | Eurocodes EC2/EC3/EC5, Building Regs, DTU |
| ⚖️ GDPR / Public procurement | GDPR Art. 13/14/28/30/35, NIS2, procurement code |
| 🏦 Finance / RegTech | Basel IV (CRR3), MiFID II, EMIR |
| 💊 Pharmaceuticals / GMP | EU GMP Annex 1, ICH Q8/Q9/Q10, European Pharmacopoeia |
| 🏥 Medical devices | MDR 2017/745, ISO 13485, IEC 62304, ISO 14971 |
| 🚗 Automotive / ISO 26262 | ASIL A/B/C/D, ASPICE v3.1, MISRA C:2012 |
| ✈️ Aerospace / DO-178C | DO-178C, ARP4754A, AMC 20-115 (EASA) |
| ⚡ Energy / Nuclear | RCC-M, IEC 61511, ASN safety guide, IEC 62351 |

The key variable is corpus quality, not domain complexity. A well-structured corpus (numbered requirements, explicit reference tables, defined hierarchy levels) onboards in 2 to 4 weeks.

Full domain reference: doc/en/03-applications.md


Full working example

sandboxes/demo_en — timber-frame construction compliance against BS EN 338, EC5, Building Regulations Part L/B, BS EN 13501 (simulation).

perl sandboxes/demo_en/run.pl sandboxes/demo_en/project-01.json

Engine internals (YAML DSL, Chorus::Frame API, _MAX_CYCLES, _reset()): doc/en/01-intro.md

The core — Perl inference engine

The chorus-* pipeline runs on a pure Perl inference engine whose only runtime dependencies are YAML (from CPAN) and Scalar::Util and Digest::MD5 (both shipped with Perl).

Chorus implements the classic recognise–act cycle of the expert-system tradition: at each iteration, the engine identifies rules applicable to the current working memory, fires them, then loops — until nothing changes or a goal is reached.
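The cycle can be sketched in a few lines of plain Perl. This is not the Chorus::Engine implementation: plain hashes stand in for frames, rules are simple code references, and run_to_fixpoint and its cycle cap are hypothetical names used for illustration.

```perl
use strict;
use warnings;

# Working memory: plain hashes stand in for Chorus::Frame objects.
my @memory = (
    { label => 'sky',  color => 'blue' },
    { label => 'fire', color => 'red'  },
);

# A rule returns true when it changed something, false otherwise.
my @rules = (
    sub {   # tag every blue item once
        my $changed = 0;
        for my $f (grep { $_->{color} eq 'blue' && !$_->{tagged} } @memory) {
            $f->{tagged} = 'yes';
            $changed = 1;
        }
        return $changed;
    },
    sub {   # derive a second fact from the first
        my $changed = 0;
        for my $f (grep { $_->{tagged} && !$_->{reported} } @memory) {
            $f->{reported} = 1;
            $changed = 1;
        }
        return $changed;
    },
);

# Recognise-act loop: fire all rules, repeat until a full pass
# changes nothing. The cap guards against rules that never settle.
sub run_to_fixpoint {
    my ($max_cycles) = @_;
    for my $cycle (1 .. $max_cycles) {
        my $fired = 0;
        $fired += $_->() ? 1 : 0 for @rules;
        return $cycle unless $fired;
    }
    die "no fixpoint after $max_cycles cycles\n";
}

my $cycles = run_to_fixpoint(10);
print "stable after $cycles cycles\n";   # → stable after 2 cycles
```

The first pass fires both rules; the second pass changes nothing, which is what ends the loop. Each rule guards against re-firing on the same item, since a rule that always reports a change would never let the loop settle.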

The working memory is made of Chorus::Frame objects whose properties (slots) carry domain knowledge. Chorus::Expert chains several specialised engines over a shared working memory.

| Module | Role |
|---|---|
| Chorus::Frame | Knowledge representation: slots, inheritance, global registries, forward/backward chaining |
| Chorus::Engine | Inference loop: rules, scope combinatorics, flow control, YAML loading |
| Chorus::Expert | Multi-agent orchestration: shared BOARD, outer loop |
| Chorus::Collection::List | Ordered Frame sequences: bidirectional prev/succ navigation, merge, positional tests |
| Chorus::Collection::Filter | Regex-like filtering on Frame sequences, with capture groups in @_VFILTER |

Direct API

use Chorus::Engine;
use Chorus::Frame;

my $agent = Chorus::Engine->new();

Chorus::Frame->new(color => 'blue', label => 'sky');
Chorus::Frame->new(color => 'red',  label => 'fire');

$agent->addrule(
    _SCOPE => { f => sub { [ grep { $_->{color} eq 'blue' } fmatch(slot => 'color') ] } },
    _APPLY => sub {
        my %o = @_;
        return if $o{f}->{tagged};
        $o{f}->set('tagged', 'yes');
        print "Tagged: ", $o{f}->{label}, "\n";   # → Tagged: sky
        return 1;
    },
);

$agent->loop();

The YAML DSL expresses the same logic without repetitive Perl code:

RULE: tag-blue-frames
FIND:
  f:
    attribut: color
    filtre:   blue
EXCEPTION: defined $f->{tagged}
ACTION: |
  $f->set('tagged', 'yes');
  print "Tagged: $f->{label}\n";   # → Tagged: sky
  return 1;

Each YAML rule lives in its own .yml file. To load them, save the rule as rules/tag-blue-frames.yml and call loadRules() instead of addrule():

use Chorus::Engine;
use Chorus::Frame;

my $agent = Chorus::Engine->new();

Chorus::Frame->new(color => 'blue', label => 'sky');
Chorus::Frame->new(color => 'red',  label => 'fire');

$agent->loadRules('rules/');   # loads all *.yml in the directory

$agent->loop();

Files are compiled in alphabetical order — prefix with R01-, R02-… to control priority. Multiple loadRules() calls accumulate.
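Zero-padding the prefix matters if the alphabetical order is a plain string comparison, which is assumed here rather than confirmed from the engine source. The sketch below sorts some hypothetical file names the way Perl's default sort would:

```perl
use strict;
use warnings;

# Hypothetical rule file names, as they might appear in rules/.
my @files = ('R10-report.yml', 'R2-derive.yml', 'R01-tag.yml', 'R02-check.yml');

# Plain string sort, i.e. alphabetical order.
my @order = sort @files;
print "$_\n" for @order;
# → R01-tag.yml
# → R02-check.yml
# → R10-report.yml
# → R2-derive.yml
```

R2-derive.yml ends up after R10-report.yml because the comparison is character by character, so a consistent width (R01, R02, … R10) keeps the intended priority.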

Full technical reference: perldoc Chorus::Engine · perldoc Chorus::Frame · perldoc Chorus::Expert


Installation

cpanm Chorus::Engine

Or from source:

perl Makefile.PL && make && make test && make install

Documentation


Contributing

Contributions are welcome — bug reports, documentation fixes, new examples, or rule engine improvements.


Repository

https://github.com/civorra/Chorus
