40 changes: 40 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,46 @@ it reaches `1.0.0`.

### Added

- **19 new AC-aligned rules + domain-agnostic `text_features` extractor** ([#TBD]):
Closes the AC discrepancy backlog identified in
`docs/comparison/lambda-rag-vs-air-canada.md`. Adds 19 new rules to
`samples/contracts/ac-demo-ruleset.json` (now v2.0.0, 24 rules total)
covering payment terms, IP/work-for-hire, liability carve-outs,
insurance limits, security/cryptography, privacy obligations
(residency, breach window, consent, retention, explicit-laws), AI
addenda, subcontracting approval, service locations, and Quebec
governance.
- **`TextFeatureExtractor`** (new in `LambdaRag.Projection`): pure-regex,
domain-agnostic numeric extraction over English prose. Adds
`text_features.{day_counts, month_counts, year_counts, percent_values,
dollar_amounts}` arrays plus `_min`/`_max` scalars on every projected
section. Rule authors target numeric thresholds via lambdas like
`input1.text_features.day_count_max <= 45` — usable by **any** ruleset,
not just AC.
- Engine remains domain-agnostic: 11 new tests
(`TextFeatureExtractorTests` + `GenericTextFeaturesEvaluationTests`)
prove the extractor and evaluator work on synthetic non-AC corpora
(vendor bond, permit response, ESG recycled-content, oil-and-gas).
- Projector bumped to **v1.4.0**; topic-map (`contract.v1.json`) bumped
to **v1.1.0** (adds `tax`, `subcontracting`, `ai`, `service_locations`
topics).
- End-to-end vs the AC contract: **`pass=5 fail=21 gap=1`** — every Fail
is a genuine deterministic finding (NET 60 > 45, 2% > 1.5%, no Quebec
governance, etc.).

### Fixed

- `TextFeatureExtractor.DollarRx`: shorthand suffixes (`m|b|k`) no longer
match the leading letter of an unrelated trailing word (e.g. `$1,000,000
bond` was previously parsed as `$1,000,000 b` → 10¹⁵). Suffix now
requires a word boundary via negative lookahead `(?![A-Za-z])`.
- `TextFeatureExtractor.DayCountRx`: now also matches hyphenated forms
(`120-day`, `30-day`) in addition to spaced forms (`120 days`).

## [Unreleased — earlier]

### Added

- **AC-style comment format in markup output** ([#TBD]):
Word comments now match the Air Canada contract-review UX so reviewers
see the same visual feedback as in the agentic flow:
28 changes: 28 additions & 0 deletions README.md
@@ -186,6 +186,34 @@ If the ontology you need isn't in the table above, copy
`my-industry.v1.json`, add your headings/aliases per topic, rebuild,
and pass `--topic-map my-industry.v1` to the extractor.

### Numeric thresholds with `text_features` (projector v1.4.0+)

Every projected section now carries a `text_features` block with
generic numeric facts extracted from the section's prose:

| Field | What it captures | Example match |
|-------|------------------|---------------|
| `day_counts` / `day_count_min` / `day_count_max` | day quantities | `45 days`, `120-day cure`, `90 calendar days` |
| `month_counts` / `_min` / `_max` | month quantities | `12 months`, `36-month term` |
| `year_counts` / `_min` / `_max` | year quantities | `5 years`, `2-year warranty` |
| `percent_values` / `percent_min` / `percent_max` | percentages | `1.5%`, `30 percent` |
| `dollar_amounts` / `dollar_min` / `dollar_max` | dollar values | `$5,000,000`, `$1.5M`, `USD 10,000,000`, `CAD$ 2.5 million` |

Rule lambdas reference these fields directly — no per-domain code:

```json
{
"predicate": "input1.topics.Contains(\"insurance\") && input1.text_features.dollar_amounts.Count > 0",
"lambda": "input1.text_features.dollar_max >= 5000000"
}
```

This is a *generic* extractor: it works on **any** domain (vendor
bonds, ESG recycled-content thresholds, permit response windows,
pipeline pressure-test durations…). The same rule shape is used for
contracts, public-sector permitting, oil-and-gas, FSI policies, and
governance frameworks.
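
A minimal Python sketch of how the predicate/lambda pair above gates a section and then checks its threshold. This is not the engine's code: the dict layout, the sample values, the helper names, and the `not-applicable` label are all assumptions made for illustration.

```python
# Hypothetical projected section. Field names follow the table above;
# the values are invented for this example.
section = {
    "topics": ["insurance"],
    "text_features": {
        "dollar_amounts": [2_000_000, 5_000_000],
        "dollar_min": 2_000_000,
        "dollar_max": 5_000_000,
    },
}


def predicate(s):
    # Mirrors: input1.topics.Contains("insurance")
    #          && input1.text_features.dollar_amounts.Count > 0
    return "insurance" in s["topics"] and len(s["text_features"]["dollar_amounts"]) > 0


def rule(s):
    # Mirrors: input1.text_features.dollar_max >= 5000000
    return s["text_features"]["dollar_max"] >= 5_000_000


def evaluate(s):
    # "not-applicable" is this sketch's own label for sections the
    # predicate excludes; it is not one of the engine's verdict names.
    if not predicate(s):
        return "not-applicable"
    return "pass" if rule(s) else "fail"


print(evaluate(section))
```

Checking `dollar_amounts.Count > 0` in the predicate keeps the threshold comparison from running on sections that mention no dollar amounts at all.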

## CLI cheat sheet

91 changes: 91 additions & 0 deletions docs/comparison/lambda-rag-vs-air-canada.md
@@ -257,3 +257,94 @@ Both engine-level defects flagged in §4 were fixed in this same PR (no separate
Re-run vs the same AC sample contract: `pass=4 fail=1 gap=1` (was `pass=3 fail=0 gap=2`). §9 now correctly **Fails** on IP indemnity, §9 + §11 both **Pass** on the explicit liability cap, the spurious §9 liability Gap is gone, and the legitimate `AC-WAR-001` Gap (no warranty section — AC missed it) remains.

The 22 `N`-class coverage gaps require authoring the AC-policy ruleset from the six PDFs (`out/ac-full/ac-policies-ruleset.json`) and are tracked separately. Section 5 of this document lists the 19 priority rules to author.


## Phase F — 19 priority rules + `text_features` extractor (post-audit, this PR)

Closes the §5 backlog identified above. Three changes, all merged together
in this PR:

1. **19 new rules** authored as data only in
`samples/contracts/ac-demo-ruleset.json` (now **v2.0.0**, 24 rules
total). New rule IDs: `AC-LIAB-CARVEOUTS`, `AC-TERM-CONV`,
`AC-PAY-NET45`, `AC-PAY-INT-MAX`, `AC-TAX-EXCL`,
`AC-IP-WORKFORHIRE`, `AC-INS-GCL-5M`, `AC-INS-CYBER-10M`,

question (typo): Check whether the rule ID AC-IP-WORKFORHIRE is intentionally spelled without a hyphen.

The identifier differs from the natural-language phrase (“work-for-hire”) and the changelog (IP/work-for-hire). Please confirm whether WORKFORHIRE is the intended canonical form, or if this should be adjusted (e.g., AC-IP-WORK-FOR-HIRE) for consistency with your naming conventions.

`AC-CYBER-CRYPTO`, `AC-PRIV-RESIDENCY`, `AC-PRIV-72H-AC`,
`AC-PRIV-CONSENT`, `AC-PRIV-RETENTION`,
`AC-PRIV-EXPLICIT-LAWS`, `AC-AI-ADDENDUM`,
`AC-AI-PRIVACY-SUPP`, `AC-SUBK-APPROVAL`, `AC-SVC-LOCATION`,
`AC-LAW-QUEBEC`.

2. **`TextFeatureExtractor`** (new in `LambdaRag.Projection`,
projector v1.4.0). A pure-regex, *domain-agnostic* numeric extractor
that runs over each section's prose and emits
`text_features.{day_counts, month_counts, year_counts,
percent_values, dollar_amounts}` arrays plus `_min`/`_max`
scalars. Rule authors target numeric thresholds via lambdas like
`input1.text_features.day_count_max <= 45` — usable by **any**
ruleset, not just AC. This is the keystone of the genericness story: the
engine never looks at AC-specific content; it just exposes
structured numeric facts that *any* downstream policy can compare
against.

3. **Topic-map `contract.v1.json` → v1.1.0**: adds four generic
topics (`tax`, `subcontracting`, `ai`, `service_locations`)
so the new rules' `input1.topics.Contains(...)` predicates can
target the right sections without hard-coding string regexes in
lambdas.
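
To make the array-plus-scalars shape in item 2 concrete, here is a hedged Python sketch for the percent fields only. The pattern is a simplified stand-in, not the project's actual regex, and omitting the `_min`/`_max` scalars when nothing matches is an assumption of this sketch.

```python
import re

# Illustrative pattern only: a number followed by "%" or the word "percent".
PERCENT_RX = re.compile(r"(\d+(?:\.\d+)?)\s*(?:%|percent\b)", re.IGNORECASE)


def percent_features(text):
    """Return the percent-related text_features for one section's prose."""
    values = [float(m.group(1)) for m in PERCENT_RX.finditer(text)]
    features = {"percent_values": values}
    if values:
        # Scalars summarise the array so rules can compare a single number.
        features["percent_min"] = min(values)
        features["percent_max"] = max(values)
    return features


sample = "Interest accrues at 2% per month; a 1.5 percent discount applies."
print(percent_features(sample))
```

A rule such as `percent_max <= 1.5` would then fail on this sample, because the largest percentage found in the prose is 2.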

### Genericness guardrail

The user's hard constraint was *"every time we tighten what we're doing
we need to make sure it's generic enough to reuse with completely
different rules, documents, and domains."* To prove the engine stays
domain-agnostic this PR adds **11 new tests**:

- `TextFeatureExtractorTests` (7) — regex behaviour proven on
oil-and-gas pipeline prose, ESG recycled-content, generic
payment terms, etc.
- `GenericTextFeaturesEvaluationTests` (4) — full evaluator runs
with **synthetic non-AC rulesets** (`rs-vendor-x`,
`rs-municipal`, `rs-esg`) over synthetic non-AC sections,
proving the engine evaluates `text_features`-based predicates
generically.

All existing corpus verticals (`contract`, `oil-gas`,
`permitting`, `gov-architecture`, `fsi`) continue to match
their golden verdicts byte-for-byte after the projector v1.4.0 bump:
the only drift is the mechanical addition of the new `text_features`
field, with **zero verdict changes**.

### End-to-end vs the AC sample contract

| Run | Pass | Fail | Gap | Err |
|-----|------|------|-----|-----|
| Before this PR (5 rules) | 4 | 1 | 1 | 0 |
| After this PR (24 rules) | **5** | **21** | **1** | **0** |

Spot-checked Fails (all genuinely correct findings):

- `AC-PAY-NET45` Fails on §4 — contract's `Net 60` violates
`day_count_max <= 45`.
- `AC-PAY-INT-MAX` Fails on §4 — `2% per month` exceeds
`percent_max <= 1.5`.
- `AC-LAW-QUEBEC` Fails on §12 — Governing law is Ontario, not
Quebec.
- `AC-INS-GCL-5M` / `AC-INS-CYBER-10M` Fail on insurance limits
below `` / ``.
Comment on lines +333 to +334

issue (typo): The example for insurance limits has missing threshold values (empty backticks).

In the bullet for AC-INS-GCL-5M / AC-INS-CYBER-10M, the text below `` / ``. looks like an unfilled placeholder for the actual threshold amounts. Please replace these with the intended limits (e.g. $5M / $10M) so the rule is clear.

Suggested change (before):
- `AC-INS-GCL-5M` / `AC-INS-CYBER-10M` Fail on insurance limits
below `` / ``.

Suggested change (after):
- `AC-INS-GCL-5M` / `AC-INS-CYBER-10M` Fail on insurance limits
below `$5M` / `$10M`.


### Two extractor bugs found and fixed

While building the genericness tests, two regex defects in
`TextFeatureExtractor` surfaced and were fixed in this same PR:

- **`DollarRx` shorthand-suffix bug**: `(million|billion|m|b|k)?`
matched the leading `b` of an unrelated trailing word, so
`$1,000,000 bond` was parsed as `$1,000,000 b` → 10¹⁵. Fixed by
requiring a word boundary via `(?![A-Za-z])` lookahead.
- **`DayCountRx` hyphen support**: `120-day cure window` (very
common in legal English) wasn't matching. Fixed by allowing
`[\s-]*` between the digit and the unit word.
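
The `DollarRx` defect and its lookahead fix can be reproduced with a small Python sketch. Both patterns are simplified stand-ins written for this example (only the suffix alternation and the `(?![A-Za-z])` lookahead come from the description above); `parse_dollars` and `MULTIPLIER` are hypothetical helpers.

```python
import re

SUFFIX = r"(million|billion|m|b|k)"
NUMBER = r"\$\s*(\d[\d,]*(?:\.\d+)?)"

# Before: the optional suffix may grab the first letter of the next word.
BUGGY_RX = re.compile(NUMBER + r"\s*" + SUFFIX + r"?", re.IGNORECASE)
# After: a suffix only counts if it is not followed by another letter.
FIXED_RX = re.compile(NUMBER + r"(?:\s*" + SUFFIX + r"(?![A-Za-z]))?", re.IGNORECASE)

MULTIPLIER = {"k": 1e3, "m": 1e6, "million": 1e6, "b": 1e9, "billion": 1e9}


def parse_dollars(rx, text):
    """Return every dollar amount in text, scaled by its shorthand suffix."""
    amounts = []
    for m in rx.finditer(text):
        value = float(m.group(1).replace(",", ""))
        suffix = (m.group(2) or "").lower()
        amounts.append(value * MULTIPLIER.get(suffix, 1))
    return amounts


print(parse_dollars(BUGGY_RX, "$1,000,000 bond"))  # the "b" of "bond" is read as billion
print(parse_dollars(FIXED_RX, "$1,000,000 bond"))  # suffix rejected by the lookahead
print(parse_dollars(FIXED_RX, "$1.5M cap"))        # genuine suffixes still apply
```

The negative lookahead leaves real shorthand such as `$1.5M` or `$2.5 million` untouched, since those suffixes are followed by whitespace, punctuation, or the end of the text rather than a letter.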

Both fixes leave AC end-to-end outcomes identical (pass=5 fail=21
gap=1 unchanged).
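
The `DayCountRx` hyphen fix can be sketched the same way in Python. `SPACED_ONLY_RX` is a hypothetical stand-in for the earlier behaviour, since the original pattern is not shown here; the only detail taken from the description above is the `[\s-]*` separator. Qualifier words such as `90 calendar days` are deliberately out of scope for this sketch.

```python
import re

# Assumed earlier behaviour: digits, whitespace, then "day" or "days".
SPACED_ONLY_RX = re.compile(r"(\d+)\s+days?\b", re.IGNORECASE)
# After the fix: any run of whitespace or hyphens may separate them.
HYPHEN_AWARE_RX = re.compile(r"(\d+)[\s-]*days?\b", re.IGNORECASE)


def day_counts(rx, text):
    """Return every day quantity that rx finds in text, in order."""
    return [int(m.group(1)) for m in rx.finditer(text)]


clause = "a 120-day cure window and 30 days notice"
print(day_counts(SPACED_ONLY_RX, clause))   # misses the hyphenated form
print(day_counts(HYPHEN_AWARE_RX, clause))  # picks up both forms
```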