# ACDC Schemas

**Objective:** Understand the role of schemas in defining ACDC structures, how they leverage Self-Addressing Identifiers (SAIDs) for verifiability, and learn how to create and process a basic schema using the KERI `kli` tool.

*(Assumed Knowledge: You should understand what an ACDC is and be familiar with core KERI concepts like AIDs and SAIDs.)*

## What is a Schema?

Before we can issue or verify an Authentic Chained Data Container (ACDC), we need a blueprint that describes exactly what information it should contain and how that information should be structured. This blueprint is called a **Schema**.

Schemas serve several critical purposes:

* **Structure & Validation:** They define the names, data types (like text, numbers, dates), and constraints for the data within an ACDC. This allows recipients to validate that a received ACDC contains the expected information in the correct format.
* **Interoperability:** When different parties agree on a common schema, they can reliably exchange and understand ACDCs for a specific purpose (e.g., everyone knows what fields to expect in a "Membership Card" ACDC).
* **Verifiability:** As we'll see, KERI schemas themselves are cryptographically verifiable, ensuring the blueprint hasn't been tampered with.

## Writing ACDC Schemas

ACDC schemas are written using the widely adopted **JSON Schema** specification. If you're familiar with JSON Schema, you'll find ACDC schemas very similar, with a few KERI-specific conventions.

Let's look at the main parts of a typical ACDC schema:

```json
{
    "$id": "",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Sample Credential",
    "description": "A very basic credential schema for demonstration.",
    "type": "object",
    "credentialType": "SampleCredential",
    "version": "1.0.0",
    "properties": {
        "v": {
            "description": "Credential Version String",
            "type": "string"
        },
        "d": {
            "description": "Credential SAID",
            "type": "string"
        },
        "u": {
            "description": "One time use nonce",
            "type": "string"
        },
        "i": {
            "description": "Issuer AID",
            "type": "string"
        },
        "rd": {
            "description": "Registry SAID",
            "type": "string"
        },
        "s": {
            "description": "Schema SAID",
            "type": "string"
        },
        "a": {
            "oneOf": [
                {
                    "description": "Attributes block SAID",
                    "type": "string"
                },
                {
                    "$id": "",
                    "description": "Attributes block",
                    "type": "object",
                    "properties": {
                        "d": {
                            "description": "Attributes data SAID",
                            "type": "string"
                        },
                        "i": {
                            "description": "Issuee AID",
                            "type": "string"
                        },
                        "dt": {
                            "description": "Issuance date time",
                            "type": "string",
                            "format": "date-time"
                        },
                        "claim": {
                            "description": "Custom claim being made",
                            "type": "string"
                        }
                    },
                    ...
                }
            ]
        }
    },
    ...
}
```

1.  **Schema Metadata (Top Level):** Describes the schema itself.
    * `$id`: This field holds the SAID of the entire schema file once processed. It's not a URL like in standard JSON Schema. It's computed after all internal SAIDs are calculated.
    * `$schema`: Specifies the JSON Schema version (e.g., `"http://json-schema.org/draft-07/schema#"`). 🚧
    * `title`, `description`: Human-readable name and explanation.
    * `type`: Usually `"object"` for the top level of an ACDC schema. 🚧
    * `credentialType`: A specific name for this type of credential.
    * `version`: A semantic version for this specific credential type (e.g., `"1.0.0"`), used to manage schema evolution. It is distinct from the ACDC instance's `v` field.
2.  **`properties` (Top Level):** Defines the fields that will appear in the ACDC's envelope and payload. 🚧
    * ACDC Metadata Fields: Defines required fields like
        * `v`: ACDC version/serialization
        * `d`: ACDC SAID
        * `u`: salty nonce
        * `i`: Issuer AID
        * `rd`: Registry SAID
        * `s`: Schema SAID
    * Payload Sections: Defines the payload structures
        * `a`: Defines the structure for the **attributes block**, which holds the actual data or claims being made by the credential.
            * **`oneOf`**: This standard JSON Schema keyword indicates that the value for the `a` block in an actual ACDC instance can be *one of* the following two formats:
                1.  **Compact Form (String):**
                    * `{"description": "Attributes block SAID", "type": "string"}`: This option defines the *compact* representation. Instead of including the full attributes object, the ACDC can simply contain a single string value: the SAID of the attributes block itself. This SAID acts as a verifiable reference to the full attribute data, which might be stored elsewhere. (We won't cover compact ACDCs in this material.)
                2.  **Un-compact Form (Object):**
                    * `{"$id": "", "description": "Attributes block", "type": "object", ...}`: This option defines the full or un-compacted representation, where the ACDC includes the complete attributes object directly.
                        * **`$id`**: This field will hold the SAID calculated for *this specific attributes block structure* after the schema is processed (`SAIDified`). Initially empty `""` when writing the schema.
                        * **`description`**: Human-readable description of this block.
                        * **`type`: `"object"`**: Specifies that this form is a JSON object.
                        * **`properties`**: Defines the fields contained within the attributes object:
                            * **`d`**: Holds the SAID calculated from the *actual data* within the attributes block
                            * **`i`**: The AID of the **Issuee** or subject of the credential – the entity the claims are *about*.
                            * **`dt`**: An ISO 8601 date-time string indicating when the credential was issued.
                            * **`claim`** (and other custom fields): These are the specific data fields defined by your schema. In this example, `"claim"` is a string representing the custom information this credential conveys. You would define all your specific credential attributes here.
3.  **`additionalProperties`, `required`:** Standard JSON Schema fields controlling whether extra properties are allowed and which defined properties must be present.
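
To make the `oneOf` choice for the `a` section concrete, here is a minimal sketch in plain Python (standard library only). It classifies an ACDC-like dictionary as compact or full and checks that the string-typed fields from the example schema really are strings. The function names and the placeholder values are hypothetical, and this is not how KERI libraries validate credentials; a real verifier would run a full JSON Schema validator against the SAIDified schema.

```python
# Field names taken from the example schema above; all are declared as strings.
ENVELOPE_FIELDS = ("v", "d", "u", "i", "rd", "s")
ATTRIBUTE_FIELDS = ("d", "i", "dt", "claim")


def attributes_form(acdc: dict) -> str:
    """Mirror the schema's `oneOf`: `a` is either a SAID string or an object."""
    a = acdc.get("a")
    if isinstance(a, str):
        return "compact"
    if isinstance(a, dict):
        return "full"
    raise ValueError("`a` must be a SAID string or an attributes object")


def type_problems(acdc: dict) -> list:
    """List fields that are present but not strings (presence is not checked)."""
    problems = []
    for name in ENVELOPE_FIELDS:
        if name in acdc and not isinstance(acdc[name], str):
            problems.append(f"{name}: expected string")
    a = acdc.get("a")
    if isinstance(a, dict):
        for name in ATTRIBUTE_FIELDS:
            if name in a and not isinstance(a[name], str):
                problems.append(f"a.{name}: expected string")
    return problems


# Placeholder values only; real ACDCs carry actual SAIDs and AIDs here.
full = {
    "i": "<issuer AID>",
    "s": "<schema SAID>",
    "a": {"i": "<issuee AID>", "dt": "2024-01-01T00:00:00+00:00", "claim": "example"},
}
compact = {"i": "<issuer AID>", "s": "<schema SAID>", "a": "<attributes block SAID>"}

print(attributes_form(full))     # full
print(attributes_form(compact))  # compact
print(type_problems({"i": 42, "a": {"claim": 7}}))
```

The sketch deliberately ignores `required` and `additionalProperties`, because the example schema elides them.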

## Schema SAIDs

A key feature of KERI and ACDCs is the use of SAIDs (Self-Addressing Identifiers) for schemas. The SAID in a schema's `$id` field is a hash of the canonical form of that schema block.

Calculating and embedding these SAIDs requires a specific process, often called **"SAIDifying"**. This involves calculating the SAIDs for the innermost blocks (like attributes, edges, rules) first, embedding them, and then calculating the SAID for the next level up, until the top-level schema SAID is computed.

* **Why it matters:**
    * **Integrity:** If anyone modifies the schema file after its SAID has been calculated and embedded, the SAID will no longer match the content, making tampering evident.
    * **Immutability:** Once a schema version is SAIDified and published, it's cryptographically locked. New versions require a new SAID.
    * **Lookup:** SAIDs provide a universal, unique identifier to retrieve a specific, verified version of a schema.
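
The bottom-up process can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the `kli` implementation: the `E` prefix seen in KERI SAIDs indicates a Blake3-256 digest, which is not in the Python standard library, so the sketch substitutes SHA3-256 (CESR code `H`). The helper names `compute_said` and `verify_said` are hypothetical. The general recipe is: fill the `$id` field with a placeholder of the final length, serialize compactly, hash, and encode the digest as 44 Base64 URL-safe characters.

```python
import base64
import hashlib
import json

SAID_LEN = 44  # length of a CESR-encoded 32-byte digest


def compute_said(block: dict, label: str = "$id") -> str:
    """Return an illustrative SAID for `block` (SHA3-256 stand-in, code 'H')."""
    dummy = dict(block)
    dummy[label] = "#" * SAID_LEN  # placeholder keeps the serialized size stable
    raw = json.dumps(dummy, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    digest = hashlib.sha3_256(raw).digest()
    # One zero pad byte makes 33 bytes -> 44 Base64 chars; the first char
    # (always 'A') is then replaced by the derivation code.
    b64 = base64.urlsafe_b64encode(b"\x00" + digest).decode("ascii")
    return "H" + b64[1:]


def verify_said(block: dict, label: str = "$id") -> bool:
    """Check that the embedded SAID matches the block's content."""
    return compute_said(block, label) == block.get(label)


schema = {
    "$id": "",
    "title": "Sample Credential",
    "type": "object",
    "properties": {
        "a": {
            "oneOf": [
                {"type": "string"},
                {"$id": "", "type": "object",
                 "properties": {"claim": {"type": "string"}}},
            ]
        }
    },
}

# Innermost block first, then the enclosing schema.
inner = schema["properties"]["a"]["oneOf"][1]
inner["$id"] = compute_said(inner)
schema["$id"] = compute_said(schema)

print(verify_said(inner), verify_said(schema))  # True True

schema["title"] = "Tampered Credential"
print(verify_said(schema))  # False
```

Because the top-level digest covers the already-embedded inner SAID, changing anything inside the attributes block after the fact would also invalidate the top-level SAID.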

## Practical: SAIDifying a Schema with KLI

Let's create a very simple schema and process it using the `kli` tool.

### Step 1: Write the Schema JSON

First, create a JSON file with the basic structure shown earlier. The commands in the next step operate on `./data/sample_schema.json`. Notice that the `$id` fields are initially empty strings `""`; they are placeholders for the SAIDs that will be computed.

### Step 2: SAIDify the Schema

Now, use the KERI command-line tool (`kli`) to process this file. The `kli saidify` command calculates the digest of the file's content and embeds it as a SAID in the field named by `--label` (here `$id`), modifying the file in place.

```bash
# Restore a clean copy of the schema with empty "$id" fields
cp ./data/sample_schema.json.bak ./data/sample_schema.json

# Show every "$id" value before SAIDifying
jq '.. | objects | .["$id"]? // empty' ./data/sample_schema.json

# Compute the SAID and embed it in the "$id" field
kli saidify --file ./data/sample_schema.json --label '$id'

# Show every "$id" value after SAIDifying
jq '.. | objects | .["$id"]? // empty' ./data/sample_schema.json
```

```text
""
""
"EGjvtlqnmv7NAyDujKJN4N7ZFdFGBLnG2-Ie8q-GGayf"
""
```


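After running this command, inspect `./data/sample_schema.json` again. As the output above shows, the previously empty top-level `"$id": ""` field has been populated with a SAID string (a 44-character Base64-like identifier), while the `$id` inside the `a` block is still empty. This single invocation filled in only the top-level field, so nested blocks have to be SAIDified separately, innermost first, as described in the previous section.

In this run, the top-level `$id` now reads:
`"$id": "EGjvtlqnmv7NAyDujKJN4N7ZFdFGBLnG2-Ie8q-GGayf"`
Once the `a` block has been SAIDified as well, its `$id` will hold a similar string, for example:
`"$id": "EFGH456def..."`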

(Note: The actual SAIDs generated will depend on the exact content and formatting used by the `kli saidify` tool at the time of execution.)

You now have a cryptographically verifiable schema identified by its top-level SAID!
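
The `jq` filter used in this step walks the whole document and prints the `$id` of every object that has one. If `jq` is not available, the same check can be sketched in Python; the function name `iter_ids` is hypothetical, and the inline schema below stands in for the file on disk.

```python
from typing import Any, Iterator


def iter_ids(node: Any, label: str = "$id") -> Iterator[Any]:
    """Yield the `label` value of every object, parent before children.

    Mirrors `.. | objects | .["$id"]? // empty`: values that are null or
    false are skipped, but empty strings are kept.
    """
    if isinstance(node, dict):
        value = node.get(label)
        if value is not None and value is not False:
            yield value
        for child in node.values():
            yield from iter_ids(child, label)
    elif isinstance(node, list):
        for child in node:
            yield from iter_ids(child, label)


# Inline stand-in for the schema file; load the real one with json.load().
schema = {
    "$id": "",
    "type": "object",
    "properties": {
        "a": {"oneOf": [{"type": "string"}, {"$id": "", "type": "object"}]}
    },
}

print(list(iter_ids(schema)))  # ['', '']
```

Any empty string in the result points at a block that still needs to be SAIDified.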

### Step 3: Caching Schemas (Conceptual) 🚧
For an issuer to issue an ACDC using this schema, and for a recipient/verifier to validate it, they both need access to this exact, SAIDified schema definition.

How do they get it? The SAID acts as a universal lookup key. Common ways to make schemas available include:

- Simple Web Server: Host the SAIDified JSON file on a basic web server (like the vLEI-server mentioned in the tutorial). Controllers can be configured (often via OOBIs, covered later) to fetch the schema from that URL using its SAID.
- Content-Addressable Network: Store the schema on a network like IPFS, where the SAID naturally aligns with the content hash used for retrieval.
- Direct Exchange: For specific interactions, the schema could potentially be exchanged directly between parties (though less common for widely used schemas).

The key point is that the schema, identified by its SAID, must be retrievable by parties needing to issue or verify credentials based on it.
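
Because the SAID is derived from the schema's content, a party can check whatever it retrieved before trusting it, regardless of where it came from. The sketch below shows that idea with an in-memory cache keyed by SAID. It is an assumption-laden illustration: `SchemaCache` and `said_of` are hypothetical names, the digest is SHA3-256 (CESR code `H`) as a standard-library stand-in for the Blake3-256 digest behind `E`-prefixed SAIDs, and real KERI controllers manage schema resolution through their own mechanisms.

```python
import base64
import hashlib
import json


def said_of(schema: dict, label: str = "$id") -> str:
    """Illustrative SAID: digest of the schema with a placeholder in `label`."""
    dummy = {**schema, label: "#" * 44}
    raw = json.dumps(dummy, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    b64 = base64.urlsafe_b64encode(b"\x00" + hashlib.sha3_256(raw).digest())
    return "H" + b64.decode("ascii")[1:]


class SchemaCache:
    """Store schemas by SAID, rejecting any whose content does not match."""

    def __init__(self) -> None:
        self._schemas = {}

    def add(self, schema: dict) -> str:
        said = schema.get("$id", "")
        if not said or said_of(schema) != said:
            raise ValueError("schema content does not match its $id")
        self._schemas[said] = schema
        return said

    def get(self, said: str) -> dict:
        return self._schemas[said]


schema = {"$id": "", "title": "Sample Credential", "type": "object"}
schema["$id"] = said_of(schema)

cache = SchemaCache()
key = cache.add(schema)
print(cache.get(key)["title"])  # Sample Credential
```

A schema fetched from a web server or exchanged directly would pass through the same `add` check, so the transport does not need to be trusted.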

In the next notebook, we'll use our SAIDified schema to set up a Credential Registry and issue our first actual ACDC.