Schema docs: enum and array examples with descriptions produce invalid YAML #16

@groksrc

Summary

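The schema docs at /concepts/schema-system show enum and array fields with descriptions placed after the value (e.g. status?(enum): [active, inactive, alumni], current relationship status). The enum form is invalid YAML and fails to parse in any standard YAML 1.1 / 1.2 parser, because trailing text after a flow sequence ([...]) is a syntax error. The array form (expertise?(array): string, areas of knowledge) does not raise on its own, but YAML reads the whole right-hand side as one plain string, so the description is not separated from the type.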

The picoschema spec itself (dotprompt picoschema reference) places the description inside the parentheses, before the colon:

status(enum, current state): [active, inactive, alumni]
tags(array, list of tags): string

This form is valid YAML.

Relationship to #612

#612 ("ParseError: Schema frontmatter with picoschema commas breaks YAML parser") was filed and closed as COMPLETED on 2026-02-27, but the bug is still reproducible in basic-memory 0.20.3 (verified 2026-05-05) and the documentation has not been updated. The example that triggers the parse failure is the "Complete annotated example" in the schema docs:

schema:
  expertise?(array): string, areas of knowledge
  status?(enum): [active, inactive, alumni], current relationship status

Filing this as a separate issue against the docs because the docs are the entry point and need a fix regardless of how the parser issue is ultimately resolved. Even if the parser never accepts this syntax, the docs should not show users an example that fails before any Basic Memory code runs.

Steps to reproduce

Save this as a .md file:

---
title: Person
type: schema
entity: person
schema:
  name: string, full legal name
  role?: string, current job title
  expertise?(array): string, areas of knowledge
  status?(enum): [active, inactive, alumni], current relationship status
settings:
  validation: warn
---

This is the example from docs.basicmemory.com/raw/concepts/schema-system.md "Complete annotated example". Try to parse it with PyYAML or any standard YAML parser:

import yaml
# The frontmatter is the text between the first two '---' delimiters.
with open('person.md') as f:
    yaml.safe_load(f.read().split('---')[1])

Result:

yaml.scanner.ScannerError: while parsing a block mapping
expected <block end>, but found ','
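
To narrow the failure down without a file on disk, each schema line can be fed to PyYAML separately. This is only a sketch: the labels and the try_parse helper are made up for illustration, and it catches the generic yaml.YAMLError rather than assuming a specific exception subclass. With PyYAML, the enum line from the docs should be the only one that raises; the array line from the docs parses, but its description ends up inside the value string.

```python
import yaml  # PyYAML

# Labels are hypothetical; the lines are the docs form and the spec form.
lines = {
    "docs_array": "expertise?(array): string, areas of knowledge",
    "docs_enum": "status?(enum): [active, inactive, alumni], current relationship status",
    "spec_array": "expertise?(array, areas of knowledge): string",
    "spec_enum": "status?(enum, current relationship status): [active, inactive, alumni]",
}


def try_parse(text):
    """Return (parsed, None) on success or (None, error) on failure."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        return None, exc


results = {label: try_parse(text) for label, text in lines.items()}
for label, (parsed, error) in results.items():
    print(label, "->", "ERROR" if error else parsed)
```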

Expected

Either:

  1. Docs fix: Update the docs to use the spec-correct syntax with descriptions inside the parens. For example:

    schema:
      name: string, full legal name
      role?: string, current job title
      expertise?(array, areas of knowledge): string
      status?(enum, current relationship status): [active, inactive, alumni]

    This is valid YAML and matches the picoschema reference.

  2. Or, parser fix: Pre-process the YAML text to extract trailing descriptions before passing to the YAML parser, so the docs example is also accepted.

Option 1 is simpler and aligns with the picoschema spec. Option 2 introduces a Basic-Memory-specific deviation from the spec.
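
For reference, option 2 could look roughly like the sketch below. It is not Basic Memory's code; the function name lift_trailing_descriptions and the regular expression are assumptions made for illustration. It rewrites a line whose value is a flow sequence followed by ", description" into the spec form before the text reaches the YAML parser, and leaves every other line untouched. It does not handle multi-line sequences, nested brackets, or descriptions containing ": " or " #".

```python
import re

# Matches e.g.:  status?(enum): [a, b], some description
_TRAILING = re.compile(
    r"^(?P<indent>\s*)"
    r"(?P<name>[^\s(:#]+)"        # field name, may end in "?"
    r"\((?P<mods>[^)]*)\)"        # modifiers such as "enum"
    r":\s*(?P<seq>\[[^\]]*\])"    # single-line flow sequence value
    r"\s*,\s*(?P<desc>.+?)\s*$"   # trailing description
)


def lift_trailing_descriptions(frontmatter: str) -> str:
    """Move a trailing description into the parentheses (hypothetical helper)."""
    out = []
    for line in frontmatter.splitlines():
        m = _TRAILING.match(line)
        if m:
            line = "{indent}{name}({mods}, {desc}): {seq}".format(**m.groupdict())
        out.append(line)
    return "\n".join(out)


fixed = lift_trailing_descriptions(
    "schema:\n"
    "  expertise?(array): string, areas of knowledge\n"
    "  status?(enum): [active, inactive, alumni], current relationship status"
)
print(fixed)
```

Only the enum line is rewritten here, because the array line is already valid YAML as written. A real pre-processor would also have to decide what to do with that line, which is part of why option 1 looks like the smaller change.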

Additional finding

In testing, the schema parser does not actually parse picoschema modifiers like (enum, ...): [...] or (array, ...): type even when the YAML parses cleanly — it treats the entire status(enum, current state) as the field name verbatim. Filing that separately.
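
To make concrete what "parsing the modifiers" would mean, here is a minimal sketch of splitting a picoschema-style key into its parts. FieldKey and parse_field_key are hypothetical names, not Basic Memory's API, and the grammar is assumed to be name, an optional "?", and an optional "(type, description)" suffix.

```python
import re
from dataclasses import dataclass
from typing import Optional

# name, optional "?", optional "(type, description)"
_KEY = re.compile(r"^(?P<name>[^?(]+)(?P<opt>\?)?(?:\((?P<mods>.*)\))?$")


@dataclass
class FieldKey:
    name: str
    optional: bool
    kind: Optional[str]          # e.g. "enum" or "array"
    description: Optional[str]


def parse_field_key(key: str) -> FieldKey:
    """Split a key such as 'status?(enum, current state)' (hypothetical helper)."""
    m = _KEY.match(key.strip())
    if m is None:
        raise ValueError(f"unrecognised field key: {key!r}")
    kind = description = None
    if m["mods"] is not None:
        # Everything before the first comma is the type, the rest is the description.
        kind, _, rest = m["mods"].partition(",")
        kind = kind.strip() or None
        description = rest.strip() or None
    return FieldKey(m["name"].strip(), m["opt"] is not None, kind, description)


print(parse_field_key("status?(enum, current relationship status)"))
print(parse_field_key("tags(array, list of tags)"))
```

With something like this in place, the field would be stored under the name status rather than under the full status(enum, current state) string.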

Why this matters

The schema docs are the entry point for users adopting validation. The "Complete annotated example" is the most-likely-copied snippet. Currently it produces an opaque YAML scanner error before any Basic Memory code runs, which suggests to a new user that schemas don't work at all.
