Skip to content

feat: add frontmatter validation to schema system#597

Merged
phernandez merged 2 commits intomainfrom
feature/schema-frontmatter-validation
Feb 21, 2026
Merged

feat: add frontmatter validation to schema system#597
phernandez merged 2 commits intomainfrom
feature/schema-frontmatter-validation

Conversation

@phernandez
Copy link
Member

Summary

  • Add optional settings.frontmatter validation rules to schema notes, using the same Picoschema syntax as the schema block
  • Validates frontmatter keys like tags, status, etc. for presence, enum value membership, and array contents
  • Schema notes only (inline schemas skip frontmatter validation automatically via default frontmatter_fields=[])

Schema Note Format

settings:
  validation: warn
  frontmatter:
    tags?(array): string
    status?(enum): [draft, published]
schema:
  name: string
  role?: string

Changes

  • parser.py: Add frontmatter_fields to SchemaDefinition, parse settings.frontmatter via existing parse_picoschema()
  • validator.py: Add frontmatter param to validate_note(), add _validate_frontmatter_field() helper
  • schema_router.py: Pass frontmatter=frontmatter to both validate_note() call sites
  • SPEC-SCHEMA.md / SPEC-SCHEMA-IMPL.md: Document frontmatter validation rules and examples

Test plan

  • 3 parser tests: with frontmatter, without, non-dict settings.frontmatter
  • 11 validator tests: required/optional present/missing, enum valid/invalid, array, None/empty, extra keys
  • 3 router integration tests: matching metadata passes, missing required warns, enum mismatch warns
  • just fast-check passes (lint, format, typecheck, impacted tests, smoke)

🤖 Generated with Claude Code

Add optional frontmatter validation rules to schema notes under
`settings.frontmatter`, using the same Picoschema syntax as the
`schema` block. This allows schema notes to validate metadata keys
like `tags` and `status` in addition to observation/relation content.

- Add `frontmatter_fields` to `SchemaDefinition` dataclass
- Parse `settings.frontmatter` in `parse_schema_note` via existing `parse_picoschema()`
- Add `_validate_frontmatter_field()` helper for presence/enum checks
- Pass frontmatter dict through `validate_note()` to schema router
- Inline schemas skip frontmatter validation automatically (default `[]`)
- 17 new tests: 3 parser, 11 validator, 3 router integration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 21, 2026

Claude finished @phernandez's task in 2m 22s —— View job


PR Review: feat: add frontmatter validation to schema system

Tasks:

  • Gather context and understand the request
  • Read changed files
  • Check against team checklist
  • Post detailed review

Summary

Clean, well-structured addition. The approach of reusing parse_picoschema() for frontmatter field parsing is the right call — no duplication, consistent behavior. Test coverage is thorough for the happy path and most edge cases. One behavioral issue found worth addressing before merge.


Bug: Empty frontmatter dict skips validation

src/basic_memory/schema/validator.py:122

if frontmatter and schema.frontmatter_fields:

This uses truthiness on frontmatter. If frontmatter={} (empty dict), bool({}) == False and validation is skipped entirely — even if the schema has required frontmatter keys.

The stated intent is that None means "no frontmatter data available" (inline schema path), while {} means "a note with no frontmatter keys." These two cases should produce different results: required fields on a note with empty frontmatter should warn/error.

The fix is:

if frontmatter is not None and schema.frontmatter_fields:

In practice the router always injects at least {"type": entity_type} via _entity_frontmatter(), so this won't manifest on real entities unless entity_type is also None. But it's a latent correctness bug and the None-vs-{} distinction is semantically meaningful here.

Related: test_frontmatter_none_skips_validation (validator.py:356) correctly tests None, but there's no corresponding test for frontmatter={}. Adding test_empty_frontmatter_dict_triggers_validation would close this gap. Fix this →


Minor: Doc has stale router path

docs/specs/SPEC-SCHEMA-IMPL.md (line 282 area, "API Endpoints" section):

**Location:** src/basic_memory/api/schema_router.py

Actual path is src/basic_memory/api/v2/routers/schema_router.py. Low-stakes but easy to fix in this PR since these docs are already being touched.


Behavioral note: empty array treated as "present"

src/basic_memory/schema/validator.py:298-304

if isinstance(value, list):
    return FieldResult(field=schema_field, status="present", values=[str(v) for v in value])

If tags: [] appears in frontmatter, the field is reported as "present" with values=[]. Whether this is intentional (key exists = present) or a gap (empty array = effectively missing) is a product decision. Either is defensible, but it's worth documenting or adding a test to make the behavior explicit.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory's coding conventions in CLAUDE.md — section headers, decision point comments, narrative style all present
  • Python 3.12+ type annotations and async patterns
  • SQLAlchemy 2.0 best practices — N/A (no DB changes)
  • FastAPI conventions followed — router, dependency injection, response models
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new functions/methods — 3 parser tests, 11 validator tests
  • Integration tests for new features — 3 router integration tests
  • Test coverage for edge cases — None/non-dict settings, extra keys ignored, strict vs warn mode
  • 100% test coverage maintainedfrontmatter={} path not tested; see bug above
  • Documentation updated — SPEC-SCHEMA.md and SPEC-SCHEMA-IMPL.md both updated
  • SPEC-SCHEMA-IMPL.md has stale router path (minor)

Basic Memory Architecture

  • MCP tools follow atomic, composable design — no MCP changes in scope
  • Database changes include Alembic migrations — N/A, no schema changes
  • Preserves local-first architecture principles
  • Knowledge graph operations maintain consistency
  • Markdown file handling preserves integrity
  • Parser reuse is the right design — parse_picoschema() used for both schema and frontmatter fields

Security & Performance

  • No hardcoded secrets or credentials
  • Input validation — non-dict settings.frontmatter gracefully defaults to []
  • Proper error handling — parser raises ValueError for missing required fields
  • Performance considerations — N/A for this scope
  • No sensitive data in logs

Compatibility

  • File path comparisons — N/A, no path operations added
  • No emojis or Unicode in console/log output

Overall

Good PR. One real bug to fix (if frontmatterif frontmatter is not None), one doc nit, and an optional test addition. The core design is clean and the test suite is solid.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d8d7a6688f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

# Trigger: schema has frontmatter_fields and caller provided frontmatter dict
# Why: settings.frontmatter rules validate metadata keys like tags, status
# Outcome: frontmatter fields produce the same FieldResult/warning/error as content fields
if frontmatter and schema.frontmatter_fields:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate empty frontmatter dicts instead of skipping

Change the guard to avoid truthiness checks here: when frontmatter={} is passed, this block is skipped entirely and required settings.frontmatter fields are never marked missing, so strict-mode validation can incorrectly pass. Since validate_note() accepts a frontmatter dict and {} is a valid “no keys present” input, this should run whenever frontmatter is provided (e.g., frontmatter is not None).

Useful? React with 👍 / 👎.

`frontmatter={}` is a valid input where required keys should be flagged
missing. The truthiness check `if frontmatter` skipped validation for
empty dicts, allowing strict-mode to incorrectly pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 21, 2026

Claude finished @phernandez's task in 1m 13s —— View job


PR Review: feat: add frontmatter validation to schema system

Tasks:

  • Gather context and understand the request
  • Read changed files to verify fixes
  • Check against team checklist
  • Post detailed review

Update: Follow-up Commit Review

This review covers the follow-up commit e70f55a ("fix: use is not None guard for empty frontmatter dict validation") pushed after the initial review.


Bug Fix: Verified ✅

src/basic_memory/schema/validator.py:122

The guard has been corrected:

# Before (buggy)
if frontmatter and schema.frontmatter_fields:

# After (correct)
if frontmatter is not None and schema.frontmatter_fields:

This correctly distinguishes None (inline schema path — skip validation) from {} (note with no frontmatter keys — run validation and flag missing required fields).


New Tests: Complete ✅

Two tests were added that close the coverage gap identified in the prior review (tests/schema/test_validator.py:363-378):

  • test_empty_frontmatter_dict_validates_missing_keys — confirms frontmatter={} in warn mode produces a warning for a required field marked missing
  • test_empty_frontmatter_dict_strict_fails — confirms frontmatter={} in strict mode causes passed=False with an error entry

Both tests are correctly structured and cover the semantically distinct None-vs-{} cases.


Remaining Nit: Stale Doc Path

docs/specs/SPEC-SCHEMA-IMPL.md:282 — still points to the old router location:

Location: src/basic_memory/api/schema_router.py

Actual path is src/basic_memory/api/v2/routers/schema_router.py. Low-stakes and doesn't affect runtime behavior, but easy to fix. Fix this →


Team Checklist

Code Quality & Standards

  • Follows Basic Memory's coding conventions in CLAUDE.md — section headers, decision point comments, narrative style all present
  • Python 3.12+ type annotations and async patterns
  • SQLAlchemy 2.0 best practices — N/A (no DB changes)
  • FastAPI conventions followed — router, dependency injection, response models
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new functions/methods — 3 parser tests, 11 validator tests (now 13 with new empty-dict cases)
  • Integration tests for new features — 3 router integration tests
  • Test coverage for edge cases — None/non-dict settings, extra keys ignored, strict vs warn mode, {} dict
  • 100% test coverage maintainedfrontmatter={} path now covered by two new tests
  • Documentation updated — SPEC-SCHEMA.md and SPEC-SCHEMA-IMPL.md both updated
  • SPEC-SCHEMA-IMPL.md:282 has stale router path (minor)

Basic Memory Architecture

  • MCP tools follow atomic, composable design — no MCP changes in scope
  • Database changes include Alembic migrations — N/A, no schema changes
  • Preserves local-first architecture principles
  • Knowledge graph operations maintain consistency
  • Markdown file handling preserves integrity
  • Parser reuse is the right design — parse_picoschema() used for both schema and frontmatter fields

Security & Performance

  • No hardcoded secrets or credentials
  • Input validation — non-dict settings.frontmatter gracefully defaults to []
  • Proper error handling — parser raises ValueError for missing required fields
  • Performance considerations — N/A for this scope
  • No sensitive data in logs

Compatibility

  • File path comparisons — N/A, no path operations added
  • No emojis or Unicode in console/log output

Overall

The primary bug from the first review has been fixed correctly, and both new tests accurately cover the frontmatter={} path in warn and strict modes. One doc nit remains (stale router path in SPEC-SCHEMA-IMPL.md:282). The core implementation is clean and ready for merge.

@phernandez phernandez merged commit 8451f2b into main Feb 21, 2026
26 checks passed
@phernandez phernandez deleted the feature/schema-frontmatter-validation branch February 21, 2026 04:57
@phernandez phernandez added this to the v0.19.0 milestone Feb 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant