Skip to content

feat(struct): add StructDef and dot-access MVP#41

Merged
thiremani merged 28 commits intomasterfrom
codex/structdef-on-master
Mar 14, 2026
Merged

feat(struct): add StructDef and dot-access MVP#41
thiremani merged 28 commits intomasterfrom
codex/structdef-on-master

Conversation

@thiremani
Copy link
Copy Markdown
Owner

Summary

  • add Phase 1 StructDef support in .pt using header-row + single value row syntax
  • add .spt dot field access (p.name) with parser/type-solver/compiler support
  • store struct constants as LLVM named struct globals and extract fields via extractvalue
  • enforce duplicate struct type rejection and duplicate struct field header rejection
  • keep struct metadata in Code.Struct (parallel to Code.Const and Code.Func)
  • align struct type mangling/demangling to nominal type form (no Struct_ namespace prefix)

Syntax covered

p = Person
    :name age height
    "Tejas" 35 184.5
p.name, p.age, p.height

Implementation notes

  • lexer: . tokenization with .5 float compatibility
  • AST: StructLiteral, DotExpression, StructDef container
  • parser: struct literal parsing in code mode; dot postfix parsing in script mode
  • compiler/types/solver/cfg: struct kind/type handling + field access typing + traversal support
  • formatting: friendlier direct-struct print diagnostic

Tests

  • go test ./...
  • python3 test.py tests/struct
  • python3 test.py

thiremani and others added 21 commits February 27, 2026 17:40
Move reserved type-name source to a shared types package and use it from parser + compiler validations.

Also reject reserved function names (e.g. Int/I64) while keeping such identifiers valid for variables.
…ecks

Introduce StructStatement as a first-class code statement and wire it through parser/codecompiler/compiler flows.

Track immutable names via Code.ConstNames for shared redeclaration and CFG write-protection checks.

Allow repeated struct type definitions when headers match, and reject only conflicting headers across parse and merged compile validation.

Refactor codeparser const-binding validation to operate on identifier lists directly (remove temporary ConstStatement wrapper), and align struct-header comparison helper naming.

Update parser/compiler tests for repeated vs conflicting struct definitions and struct statement expectations.
Update StructLiteral.String to print multiline output with indented header and row blocks.

Add parser assertion for struct statement string rendering.

Validated with go test -race ./parser ./compiler and go test ./...
Remove the defensive nil check when emitting struct row expressions in StructLiteral.String.

Validated with go test -race ./parser ./compiler and go test ./...
Use singular Row for Phase 1 struct literals to reflect single-instance semantics and reduce ambiguity with future struct arrays.

Update parser/compiler/solver/cfg references and struct parser tests accordingly.

Validated with go test -race ./parser ./compiler and go test ./...
Remove trivial isGlobalConst helper and perform ConstNames lookup directly inside isDefined.

No behavior change; validated with go test ./compiler.
Guard unknown-header validation when the first-seen struct entry has empty headers.
Remove dead seenHeaders tracking in codecompiler and drop unreachable lexer dot-number branch.
Add parser regression test for empty init before full struct definition.
Rename validateAndTrackStructHeaders to validateStructHeaders for readability.
No behavior changes.
Redesign struct validation to use a clean two-pass architecture:
- Pass 1: find canonical definition (max-header statement, first-seen wins)
- Pass 2: validate all statements against canonical def (field existence + order)

Key changes:
- Replace structDef with Struct directly, eliminating duplicate type
- Add FieldSet (map[string]struct{}) to Struct for O(1) field lookups
- Move struct field validation from parser to compiler (semantic concern)
- Remove validateFuncDefs (parser owns reserved-name validation)
- Separate checkFieldOrder from validateStructUsage for clarity
- Fix equal-width subset ordering bug (defer conflict checks to pass 2)
- Simplify getStructSchema to pure StructCache lookup
- Use zero values for missing fields instead of definition defaults

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract struct constant compilation into compileStructConst method
- Remove tautological onElse != nil check in bounds.go
- Use early continue to flatten if/else in struct field loop
- Run gofmt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move all reserved-name validation from parser to compiler, following the
production compiler pattern where the parser is purely syntactic and
semantic checks live in the compiler phase.

- Move `reservedTypeNames` map and `IsReservedTypeName` from `types/`
  package into `compiler/types.go`; delete `types/` package
- Add `Compiler.rejectReservedName` method for unified error reporting
- Add `CodeCompiler.validateReservedNames` covering constants, struct
  type names, struct binding names, and function names
- Add `ScriptCompiler.validateReservedNames` covering script variables
- Add `Struct.FieldIndex` helper; reorder field-order check before
  field-type check in `validateStructHeaders`
- Use `prior := len(c.Errors)` pattern in `validateStructDefs` to
  avoid false positives from earlier validation errors
- Add unit tests for all reserved-name branches (const, func, struct
  binding, struct type, script variable)
- Remove reserved-name checks and tests from parser

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@thiremani
Copy link
Copy Markdown
Owner Author

Code Review: codex/structdef-on-master

Solid PR — the struct definition, constant compilation, and dot-access MVP is well-structured with good test coverage (345 lines of compiler tests, parser tests, and an E2E integration test). The two-pass canonical-definition approach and reserved-name consolidation are clean design decisions. A few issues to address:


1. TypeStructLiteral leaves FieldSet nil on solver-inferred Struct

Severity: Important (latent bug) | Confidence: 88%

compiler/solver.go ~line 820:

structType := Struct{
    Name:   sl.Token.Literal,
    Fields: fields,
    // FieldSet is nil!
}

buildStructDef in codecompiler.go correctly initializes FieldSet, but the solver path doesn't. Currently safe because compileDotExpression and TypeDotExpression both use Fields (not FieldSet), but any future code that calls FieldSet[name] on a solver-derived struct type will silently return "not found" for valid fields. This inconsistency between the two Struct construction sites is a trap.

Fix:

fieldSet := make(map[string]struct{}, len(fields))
for _, f := range fields {
    fieldSet[f.Name] = struct{}{}
}
structType := Struct{
    Name:     sl.Token.Literal,
    Fields:   fields,
    FieldSet: fieldSet,
}

2. cfg.checkWrite / cfg.isDefined will panic if CodeCompiler is nil

Severity: Latent critical | Confidence: 85%

compiler/cfg.go lines ~439 and ~491 unconditionally dereference cfg.CodeCompiler:

cc := cfg.CodeCompiler
if _, ok := cc.Code.ConstNames[e.Name]; ok {  // panic if cc == nil

NewCFG accepts a nil CodeCompiler and Analyze is public. A nil-CodeCompiler CFG that processes any write event will nil-panic. Currently all call sites pass a non-nil CodeCompiler, but this is a fragile contract.

Fix: Add a nil guard:

if cc := cfg.CodeCompiler; cc != nil {
    if _, ok := cc.Code.ConstNames[e.Name]; ok {
        cfg.addError(e.Token, ...)
    }
}

3. collectStructDefs: Ambiguous same-arity definitions produce confusing error messages

Severity: Important (UX) | Confidence: 82%

compiler/codecompiler.go ~lines 135-140:

if !exists || len(stmt.Value.Headers) > len(existing.Fields) {
    def, defErrs := buildStructDef(stmt)
    defs[typeName] = def
}

When two full-length definitions have the same field count, first-seen silently wins. The second is then validated as a "usage" — producing either "field X not in struct type Y" (non-overlapping fields) or "ambiguous struct field order" (same fields, different order). Both are technically correct rejections, but misleading to users who wrote what they intended as two separate definitions.

Consider detecting tied-max and emitting a dedicated "conflicting struct definition for type X" error. No test covers the equal-arity-different-fields case.


4. Code.MergeStruct.Map retains first-merged entry regardless of field count

Severity: Latent (minor) | Confidence: 85%

ast/ast.go ~line 81:

maps.Copy(c.Struct.Map, other.Struct.Map)

maps.Copy destination-wins semantics means Struct.Map keeps the first-merged entry for a type name, even if a later-merged file has the canonical (max-header) definition. Currently harmless because collectStructDefs works from Struct.Statements, not Struct.Map. But Struct.Map now misrepresents the canonical definition — any future caller relying on it will get the wrong answer.


5. bounds.go — nil-check removal on onElse

Severity: Low | Confidence: 80%

compiler/bounds.go line ~87: The nil-check on onElse was removed. All current callers pass a non-nil callback, but this narrows the contract of withGuardedBranch silently. If this is intentional (all callers must provide onElse), consider documenting it; otherwise restore the guard.


Minor observations

  • Struct alignment (compiler.go ~line 979): Hardcoded to 8 bytes is correct for Phase 1 scalar/pointer fields, but will need revisiting if nested structs or smaller-than-pointer field types are added.
  • Test quality: Excellent coverage of edge cases — ambiguous order, unknown fields, type mismatch, reserved names, empty init before definition. The TestStructUnknownFieldNoSpuriousError test is particularly good defensive testing.
  • parseStructRowConstants: The comma handling is a nice touch for future flexibility, but currently structs use space-separated values. If commas aren't part of the syntax yet, consider whether the comma path is dead code or intentional forward-compatibility.

Overall this is a clean, well-tested MVP. Issues 1 and 3 are the most actionable. Nice work! 🎯

thiremani and others added 7 commits March 10, 2026 23:52
…, and dot-spacing

- checkFieldTypes: allow int→float promotion to match codegen widening
- demangleIdent: restrict _ consumption to n-prefix case only, preventing
  greedy merging of adjacent nominal types (e.g., PersonAnimal)
- demangleNominalType: delegate to demangleIdent for full mixed-identifier
  support (ASCII + Unicode segments like foo_π, πbar)
- parseDotPostfix: reject whitespace after '.' in field access (p. name)
- Extract demangleIdentSegment to reduce cognitive complexity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- TypeStructLiteral: initialize FieldSet to match buildStructDef, preventing
  potential nil-map panic if future code reads FieldSet on solver-derived types
- validateStructStmt: emit "struct X redefined with different fields" when
  same-arity definitions have unknown fields, instead of misleading per-field errors
- checkFieldOrder: clarify message to "fields reordered ... all fields must
  match definition order"
- Code.Merge: document last-writer-wins behavior on Struct.Map

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove dead empty-fields guard and switch to compact space-separated
format (e.g. Point{x:I64 y:F64}) consistent with Array.String().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Structs can now be printed directly with `p` or as format markers
("-p", "x=-p y=99"). Output format:

    Person
        :name age height
        Tejas 35 184.5

The struct block ends with \n so it naturally separates from
following values in mixed prints (p, 99) without needing extra
printf calls or format-string splitting.

Changes:
- compiler/format.go: structFormatArgs builds the multi-line format
  string (Name\n    :fields\n    values\n); parseFormatting routes
  StructKind to structFormatArgs; defaultSpecifier returns %s for structs
- compiler/compiler.go: appendPrintSymbol handles StructKind by
  appending the struct format (no spec+space, no flush needed)
- tests/struct: add print test cases and expected output
- parser/scriptparser_test.go: extend dot-expression whitespace tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use FLOAT directly for dot-prefixed numeric literals.
This keeps bare '.' as PERIOD while making the token type match the only valid outcome for inputs like '.5'.
Reject comma-separated struct value rows with an explicit parser error and simplify the row parser now that commas are no longer accepted.
Also reuse a shared line-continuation helper across struct and array header/row parsing.
Inline struct literal construction and remove redundant newline/indent expectPeek checks in parseStructLiteralStatement while preserving the existing bare-definition cases.
@thiremani thiremani merged commit b526f74 into master Mar 14, 2026
2 checks passed
@thiremani thiremani deleted the codex/structdef-on-master branch March 14, 2026 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant