Merged
4 changes: 2 additions & 2 deletions specs/020-subject-serialization/contracts/api.md
@@ -12,12 +12,12 @@ toGram :: Pattern Subject -> String

```haskell
-- | Parse a gram notation string into a Pattern Subject.
-- Automatically assigns unique IDs (e.g., _anon_1) to anonymous subjects.
-- Automatically assigns unique IDs (e.g., #1) to anonymous subjects.
fromGram :: String -> Either ParseError (Pattern Subject)
```

## Invariants

1. **Round-Trip**: `fromGram (toGram s) == Right s` (modulo potential ID generation for previously anonymous subjects, which become named).
2. **Identity**: `toGram` output for a Subject with ID `_anon_1` is `(_anon_1)` (or similar valid syntax).
2. **Identity**: `toGram` output for a Subject with ID `#1` is `(#1)` (or similar valid syntax).

4 changes: 2 additions & 2 deletions specs/020-subject-serialization/data-model.md
@@ -8,13 +8,13 @@ The core data structure representing a node or relationship's content.

| Field | Type | Description | Constraints |
|-------|------|-------------|-------------|
| `identity` | `Symbol` | Unique identifier | **Mandatory**. Cannot be empty string in a valid graph (though type allows it). Parsed anonymous subjects receive generated IDs (e.g., `_anon_1`). |
| `identity` | `Symbol` | Unique identifier | **Mandatory**. Cannot be empty string in a valid graph (though type allows it). Parsed anonymous subjects receive generated IDs (e.g., `#1`). |
| `labels` | `Set String` | Classification tags | Unique set. |
| `properties` | `Map String Value` | Key-value attributes | Keys are strings. Values are typed. |

### Identity Generation

- **Format**: `_anon_<N>`
- **Format**: `#<N>` (e.g., `#1`, `#2`, ...)
- **Scope**: Local to a single `fromGram` parse operation.
- **Counter**: Starts at 1 for each parse.

10 changes: 5 additions & 5 deletions specs/020-subject-serialization/research.md
@@ -14,12 +14,12 @@
2. **Sequential IDs (Global)**: Use a global counter.
* *Pros*: Simple.
* *Cons*: Requires `IO` / `MVar` / global state. Breaks purity.
3. **Sequential IDs (Local/Deterministic)**: Use a counter scoped to the `fromGram` call (e.g., `_anon_1`, `_anon_2`).
3. **Sequential IDs (Local/Deterministic)**: Use a counter scoped to the `fromGram` call (e.g., `#1`, `#2`).
* *Pros*: Pure, deterministic, easy to test.
* *Cons*: IDs are only unique within that specific parse result. Merging two parsed graphs could cause collisions if not handled (but that's a separate concern; `Subject` semigroup handles merging).

**Decision**: **Option 3: Sequential IDs (Local/Deterministic)**.
We will generate IDs of the form `_anon_<N>` (or similar distinct prefix) using a `State` monad during the transformation phase (`Gram.Transform`).
We will generate IDs of the form `#<N>` (or similar distinct prefix) using a `State` monad during the transformation phase (`Gram.Transform`).

**Implementation Details**:
- Modify `transformGram` to be `transformGram :: CST.Gram -> P.Pattern S.Subject` (keeping signature pure) but internally use `evalState` with a stateful transformation function.
@@ -30,16 +30,16 @@ We will generate IDs of the form `_anon_<N>` (or similar distinct prefix) using
transformIdentifier Nothing = do
    n <- get
    put (n + 1)
    return $ S.Symbol ("_anon_" ++ show n)
    return $ S.Symbol ("#" ++ show n)
transformIdentifier (Just (CST.IdentSymbol (CST.Symbol s))) = return $ S.Symbol s
-- ...
```
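
The counter-threading idea above can be tried in isolation. The following is a minimal sketch, not the project's `Gram.Transform` code: `Symbol` is a hypothetical stand-in for `S.Symbol`, `assignIdentity` and `assignAll` are illustrative names, and identifiers are modelled as plain `Maybe String` rather than CST nodes. It assumes the `mtl` package for `Control.Monad.State`.

```haskell
import Control.Monad.State (State, evalState, get, put)

-- Hypothetical stand-in for S.Symbol; the real type lives in the Subject module.
newtype Symbol = Symbol String
  deriving (Eq, Show)

-- | Keep an explicit identifier as-is; number an anonymous one from the counter.
assignIdentity :: Maybe String -> State Int Symbol
assignIdentity (Just s) = return (Symbol s)
assignIdentity Nothing = do
  n <- get
  put (n + 1)
  return (Symbol ("#" ++ show n))

-- | Pure wrapper: the counter starts at 1 on every call, so results are deterministic.
assignAll :: [Maybe String] -> [Symbol]
assignAll idents = evalState (mapM assignIdentity idents) 1
```

With these definitions, `assignAll [Nothing, Just "a", Nothing]` evaluates to `[Symbol "#1", Symbol "a", Symbol "#2"]`. Because `evalState` restarts the counter at 1 on every call, two separate calls can both produce `#1`, which is the collision caveat noted for Option 3.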

## 2. Round-trip Consistency

**Context**: We want `fromGram . toGram == id` (conceptually).
If we parse `()` -> `Subject "_anon_1"`, then `toGram` will produce `(_anon_1)`.
Parsing `(_anon_1)` -> `Subject "_anon_1"`.
If we parse `()` -> `Subject "#1"`, then `toGram` will produce `(#1)`.
Parsing `(#1)` -> `Subject "#1"`.
This preserves the data identity.

**Decision**: Accept that anonymous subjects become named subjects after a round-trip. This is consistent with the requirement that `Subject` *has* an identity. The "anonymous" syntax is just a shorthand for "I don't care about the ID, make one up". Once made up, it persists.
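
To make the decision concrete, here is a toy model of the round trip. It is a sketch under strong assumptions: a subject is reduced to its identity alone, the syntax is only `(id)` or `()`, and `Subject`, `toGramToy`, and `fromGramToy` are hypothetical names rather than the real `toGram` / `fromGram`. Whether the actual grammar accepts `#1` as a bare identifier is not shown in this change.

```haskell
-- Toy model: a subject is only its identity, written "(id)", or "()" when anonymous.
newtype Subject = Subject String
  deriving (Eq, Show)

-- | Serialize a subject; the identity is always written out explicitly.
toGramToy :: Subject -> String
toGramToy (Subject ident) = "(" ++ ident ++ ")"

-- | Parse a single subject; an empty body receives the first generated ID, "#1".
fromGramToy :: String -> Either String Subject
fromGramToy ('(' : rest)
  | not (null rest) && last rest == ')' =
      let ident = init rest
      in Right (Subject (if null ident then "#1" else ident))
fromGramToy input = Left ("cannot parse: " ++ input)
```

In this model `fromGramToy "()"` yields `Right (Subject "#1")`, serializing that gives `"(#1)"`, and parsing `"(#1)"` yields `Right (Subject "#1")` again: the first parse names the subject, and every later round trip is stable.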