# NanoServices URN & Message Semantics — Team Guide
**Date:** 2025-08-29

This notebook documents the **finalized, core-compatible** conventions for URN naming and message semantics.
It is intentionally **transport-agnostic** (Temporal/HTTP/MQTT are bindings outside the core ontology).



## Executive summary (decisions)

**1) `urn:system` is *only* a scope.**  
It is no longer also used as a resource identifier. Every URN follows the canonical pattern:

```
urn:<scope>:<namePath>(#<operation>)?
```

**2) We consistently build URNs from `name` (not `identifier`).**  
Names must be normalized and deterministic. A dotted `namePath` expresses hierarchy.

**3) Tasks/operations use the fragment.**  
Example: `urn:system:host.fuseki-worker#get` (`#get` is the operation on the resource).

**4) Message stays within the core model as a Projection.**  
- `Projection.urn` is the **Task endpoint** (e.g., `urn:system:host.fuseki-worker#get`).  
- `View` carries **arguments and results** (`contentType`, `language`, `schemaRef`, and `content`).  
- **No** `from` / `to` fields added; transport/binding is external (Port/Connection), optional.

**5) One-message-per-task (default).**  
A single Message is created for a task call and then **carried through the plan**: when a task completes, its result is written back into the **same** Message's `View.content`, and that Message is passed to the next task.  
When the Message crosses a system boundary, the Port/Connection (e.g., Temporal) **serializes** it; its semantics do not change.



## Canonical grammar

```
urn:<scope>:<namePath>(#<operation>)?
```

- `<scope>`: e.g., `system`, `data`, `user`, `process`, ...
- `<namePath>`: **dot-separated**, normalized path (hierarchy), ASCII-only, lowercase: `[a-z0-9.-]+`
- `#<operation>`: optional operation/task name (e.g., `get`, `put`, `register`)

### Normalization rules for names
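1. Apply Unicode NFKD normalization and drop any remaining non-ASCII characters (this strips diacritics, e.g., `ü` → `u`).
2. Lowercase all letters.
3. Replace spaces and underscores with `-`.
4. Replace any run of characters outside `[a-z0-9.-]` with a single `-`.
5. Collapse repeated `.` or `-` into one.
6. Trim leading and trailing `.` and `-`; an empty result is an error.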

### Examples
- Host (main system): `urn:system:host`
- Subsystems: `urn:system:host.worker`, `urn:system:host.fuseki`
- Worker task: `urn:system:host.fuseki-worker#get`
- Data resource (kept as-is): `urn:data:system`


```python
import re
import unicodedata

VALID_NAME_RE = re.compile(r"^[a-z0-9.-]+$")
VALID_SCOPE_RE = re.compile(r"^[a-z][a-z0-9-]*$")
URN_RE = re.compile(r"^urn:([a-z][a-z0-9-]*):([a-z0-9.-]+)(?:#([a-z0-9.-]+))?$")

def normalize_name(name):
    """
    Normalize a single name segment or a dotted path to the canonical form:
    - NFKD unicode normalization, then drop non-ASCII characters
    - lowercase
    - spaces/underscores -> '-'
    - any run of characters outside a-z 0-9 . - becomes a single '-'
    - collapse duplicate '.' and '-'
    - strip leading/trailing '.' and '-'
    """
    if name is None:
        raise ValueError("name cannot be None")
    s = unicodedata.normalize("NFKD", str(name)).encode("ascii", "ignore").decode("ascii")
    s = s.lower().replace("_", "-").replace(" ", "-")
    # replace every run of characters outside [a-z0-9.-] with '-'
    s = re.sub(r"[^a-z0-9.-]+", "-", s)
    # collapse '--' and '..' repeatedly
    while "--" in s or ".." in s:
        s = s.replace("--", "-").replace("..", ".")
    # strip leading/trailing separators
    s = s.strip(".-")
    if not s:
        raise ValueError(f"Invalid empty name after normalization (input={name!r})")
    if not VALID_NAME_RE.fullmatch(s):
        raise ValueError(f"Name normalization failed: {s!r}")
    return s

def compose_urn(scope, name_path, op=None):
    """
    Compose a URN with the canonical grammar:
      urn:<scope>:<namePath>(#<op>)?
    """
    if scope is None or name_path is None:
        raise ValueError("scope and name_path are required")
    scope = str(scope).lower()
    # fullmatch (not match) so a trailing newline in the scope is rejected
    if not VALID_SCOPE_RE.fullmatch(scope):
        raise ValueError("Invalid scope %r. Must match %s" % (scope, VALID_SCOPE_RE.pattern))
    name_path = normalize_name(name_path)
    if op is not None:
        op = normalize_name(op)
        return "urn:%s:%s#%s" % (scope, name_path, op)
    return "urn:%s:%s" % (scope, name_path)

def parse_urn(urn):
    """
    Parse a URN into {scope, namePath, operation} or raise ValueError.
    """
    # Non-string input and trailing newlines are rejected with ValueError.
    m = URN_RE.fullmatch(urn) if isinstance(urn, str) else None
    if not m:
        raise ValueError("Invalid URN: %r" % (urn,))
    scope, name_path, op = m.group(1), m.group(2), m.group(3)
    return {"scope": scope, "namePath": name_path, "operation": op}

def example_urns():
    return [
        compose_urn("system", "host"),
        compose_urn("system", "host.worker"),
        compose_urn("system", "host.fuseki"),
        compose_urn("system", "host.fuseki-worker", "get"),
        compose_urn("data", "system"),
        compose_urn("user", "alice.profile"),
        compose_urn("process", "host.temporal"),
    ]

example_urns()
```



## Message semantics (core-compatible)

- **Message is a Projection.**  
  `Projection.urn = <task-endpoint>`; e.g., `urn:system:host.fuseki-worker#get`

- **View carries arguments & results**, e.g.:
  - `contentType`: `text/plain`, `application/json`, `text/turtle`, `application/ld+json`, ...
  - `language`, `schemaRef` (optional)
  - `content`: for `get` → the target resource URN (e.g., `"urn:data:system"`); for `put` → the dataset/model.

- **Policy** and **Schedule** belong to the Message (or the calling System/NanoService) — **never** to the Port.

- **Transport binding** (Temporal/HTTP/MQTT) is outside the ontology.  
  The Message is serialized/deserialized by the Port/Connection as needed; semantics remain unchanged.
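
The shape described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the core model itself: the Python class names `View` and `Message`, the attribute name `view`, and the helpers `to_wire`/`from_wire` are hypothetical, and JSON merely stands in for whatever encoding a Port/Connection would use. The sketch shows that a serialization round trip leaves the task endpoint and the View unchanged, and that no `from`/`to` fields are involved.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class View:
    # Arguments and results travel here (field names taken from the guide).
    content: Any = None
    contentType: str = "text/plain"
    language: Optional[str] = None
    schemaRef: Optional[str] = None


@dataclass
class Message:
    # A Message is a Projection: `urn` is the task endpoint.
    urn: str
    view: View = field(default_factory=View)


def to_wire(msg: Message) -> str:
    """Hypothetical Port-side serialization (JSON is an assumption)."""
    return json.dumps(asdict(msg), sort_keys=True)


def from_wire(payload: str) -> Message:
    """Hypothetical Port-side deserialization, the inverse of to_wire."""
    raw = json.loads(payload)
    return Message(urn=raw["urn"], view=View(**raw["view"]))


msg = Message(
    urn="urn:system:host.fuseki-worker#get",
    view=View(content="urn:data:system"),
)
restored = from_wire(to_wire(msg))
print(restored == msg)
```

Flattening `Projection.urn` into `Message.urn` is a simplification made for the sketch; a real binding would follow the core model's own class layout.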

### One-message-per-task (default)
- **A single Message** is created for the task call.
- The task runs **synchronously** (e.g., via a local bean/adapter like `FusekiBean`) and writes its result back into the **same** Message's `View.content`.
- The **plan** moves that **same** Message to the next task by changing only `Projection.urn` to the next task endpoint.

### Example: `get`
- Call:
  - `Projection.urn = "urn:system:host.fuseki-worker#get"`
  - `View.content = "urn:data:system"`
- Task execution (internally synchronous):
  - Reads the dataset and sets `View.content` to the result (e.g., Turtle).
  - Optionally changes `View.contentType` to `text/turtle`.
- Next plan step:
  - Re-target the Message by updating `Projection.urn` (e.g., to a transform task), keep `View.content` as input.
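
The `get` walkthrough above could look roughly like this in code. Everything beyond the URNs and field names from the guide is an assumption made for illustration: the in-memory `DATASETS` store, the `fuseki_get` and `count_lines` functions, the `urn:system:host.transform-worker#count` endpoint, and the `run_plan` helper are all hypothetical. The point of the sketch is only that one Message object is re-targeted and reused across both steps.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class View:
    content: Any = None
    contentType: str = "text/plain"


@dataclass
class Message:
    urn: str  # Projection.urn: the current task endpoint
    view: View = field(default_factory=View)


# Hypothetical stand-in for the dataset store behind the worker.
DATASETS: Dict[str, str] = {
    "urn:data:system": '<urn:system:host> <urn:example:label> "Host" .',
}


def fuseki_get(msg: Message) -> None:
    """Read the dataset named in View.content and write the result back."""
    target = msg.view.content
    msg.view.content = DATASETS[target]
    msg.view.contentType = "text/turtle"


def count_lines(msg: Message) -> None:
    """Hypothetical transform task: count the non-empty lines of the input."""
    lines = [ln for ln in msg.view.content.splitlines() if ln.strip()]
    msg.view.content = str(len(lines))
    msg.view.contentType = "text/plain"


TASKS: Dict[str, Callable[[Message], None]] = {
    "urn:system:host.fuseki-worker#get": fuseki_get,
    "urn:system:host.transform-worker#count": count_lines,
}


def run_plan(msg: Message, steps: List[str]) -> Message:
    """Carry the same Message through each step, changing only its urn."""
    for endpoint in steps:
        msg.urn = endpoint    # re-target the Message
        TASKS[endpoint](msg)  # synchronous call; result lands in msg.view
    return msg


msg = Message(
    urn="urn:system:host.fuseki-worker#get",
    view=View(content="urn:data:system"),
)
result = run_plan(msg, [
    "urn:system:host.fuseki-worker#get",
    "urn:system:host.transform-worker#count",
])
print(result.urn, result.view.contentType, result.view.content)
```

`run_plan` returns the object it was given, so callers can check that no second Message was created along the way.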


```python
# Demonstration of canonical composition & parsing
urns = example_urns()
for u in urns:
    print(u, "->", parse_urn(u))

# A quick normalization showcase
samples = [
    ("system", "Host  FüSEKI-Worker", "GET"),
    ("data",   "SYSTEM", None),
    ("user",   "Alice  Profile", None),
    ("process","HOST..Temporal", None),
]
print("\nNormalization showcase:")
for scope, name, op in samples:
    print("input:", scope, name, op, "=>", compose_urn(scope, name, op))
```



## Migration checklist (safe & minimal)

1. Replace ambiguous resource uses of `urn:system` with **`urn:system:host`** (or another explicit name).  
2. Ensure all service/system/worker names are normalized and expressed as dotted paths if hierarchical (e.g., `host.fuseki`).
3. Keep `urn:data:system` intact (it is already `scope=data`, `name=system`).  
4. Update documentation/examples to show `#operation` fragments only for tasks (e.g., `#get`, `#put`, `#register`).
5. Optional but recommended:
   - Maintain an internal stable `uid` per entity (for redirects/renames), **not** used in URNs.
   - Add a small redirect table for legacy URNs if any exist.
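
The optional redirect table from step 5 might be as small as the sketch below. The table contents and the `resolve_urn` helper are assumptions for illustration: `urn:system` → `urn:system:host` follows step 1 of the checklist, while the `urn:system:fuseki` entry is a made-up legacy name, not one taken from an existing system.

```python
# Hypothetical legacy-to-canonical table; keys and values omit the fragment.
LEGACY_REDIRECTS = {
    "urn:system": "urn:system:host",
    "urn:system:fuseki": "urn:system:host.fuseki",  # illustrative entry
}


def resolve_urn(urn: str) -> str:
    """Follow redirects on the resource part, preserving any #operation."""
    base, sep, operation = urn.partition("#")
    seen = set()
    while base in LEGACY_REDIRECTS:
        if base in seen:
            raise ValueError(f"Redirect cycle at {base!r}")
        seen.add(base)
        base = LEGACY_REDIRECTS[base]
    return base + sep + operation


print(resolve_urn("urn:system"))
print(resolve_urn("urn:system:fuseki#get"))
print(resolve_urn("urn:data:system"))
```

Redirecting only the resource part keeps the table independent of which operations a task exposes.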



## Do & Don't

**Do**
- Use `urn:<scope>:<namePath>(#<op>)?` everywhere.
- Keep names normalized (lowercase ASCII, `[a-z0-9.-]`).
- Use `#<op>` only for actual operations/tasks.
- Keep Message ontology unchanged; put all arguments/results in `View`.

**Don't**
- Re-use `urn:system` as a resource — it's a **scope only**.
- Add `from/to` to the Message — transport belongs to Port/Connection.
- Put Policy on Port/Connection — Policy belongs to Message/System/NanoService.



## FAQ

**Q: Why not put `to/from` into Message?**  
A: We keep the core ontology clean and stable. Addressing of a *task endpoint* is already captured by the Message being a Projection with `Projection.urn`. Transport-specific routing (`queue`, `endpoint`) belongs to **binding** (Port/Connection), not the Message.
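
To make the separation concrete, here is one way a binding lookup could sit outside the Message. This is a sketch under assumptions only: the `Binding` class, the `BINDINGS` table, the `route` helper, and the queue name and URL in it are hypothetical and do not describe an existing Port/Connection API. The Message contributes nothing but its `Projection.urn`; routing details live in the table.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Binding:
    transport: str  # e.g. "temporal", "http", "mqtt"
    target: str     # transport-specific routing (queue name, endpoint URL, ...)


# Hypothetical binding table, keyed by resource URN (no fragment).
BINDINGS: Dict[str, Binding] = {
    "urn:system:host.fuseki-worker": Binding("temporal", "fuseki-task-queue"),
    "urn:system:host.worker": Binding("http", "http://localhost:8080/worker"),
}


def route(task_endpoint: str) -> Binding:
    """Look up the binding for a task endpoint such as '...#get'."""
    resource, _, _operation = task_endpoint.partition("#")
    try:
        return BINDINGS[resource]
    except KeyError:
        raise LookupError(f"No binding registered for {resource!r}") from None


print(route("urn:system:host.fuseki-worker#get"))
```

Changing a transport then means editing the table, while Messages and plans stay as they are.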

**Q: How do we reference the "main system" now?**  
A: Use `urn:system:host`. Subsystems: `urn:system:host.worker`, `urn:system:host.fuseki`, etc.

**Q: What about `identifier`?**  
A: We only use `name` for URN composition. If you need internal indirection for renames, keep a hidden `uid` (not part of the URN).

**Q: Can a plan change the Message address?**  
A: Yes. The plan sets the next task by updating the Message's `Projection.urn` to the next task endpoint. The `View.content` moves along as the working payload.



## Appendix — Quick reference

**Grammar:** `urn:<scope>:<namePath>(#<operation>)?`  
**Scopes:** `system`, `data`, `user`, `process`, ...  
**Name path:** dotted hierarchy, normalized (`[a-z0-9.-]+`)  
**Operation:** optional fragment (task), normalized

**Examples**
- System host: `urn:system:host`
- Fuseki: `urn:system:host.fuseki`
- Worker task get: `urn:system:host.fuseki-worker#get`
- Data system dataset: `urn:data:system`
