GENERAL

Why use a schema (a.k.a "Turing tarpit")? Why not use pure Perl?

A schema language is a domain-specific language (DSL) that should be more concise to write than the equivalent Perl code for common validation tasks. It is never meant to be as powerful as Perl.

90% of the time, my schemas are variations on simple cases like these:

"str*"
["str",   {"len_between": [1, 10], "match": "some regex"}]
["str",   {"in": ["a", "b", "c", ...]}]
["array", {"of": "some_other_type"}]
["hash",  {"keys": {"key1": "some schema", ...}, "req_keys": [...], ...}]

and writing schemas is faster and less tedious and error-prone than writing the equivalent Perl code. In addition, Data::Sah can generate JavaScript code and human-readable description text for me. For more complex validation I stay with Sah until it starts to get unwieldy. It can usually go pretty far, since I can add functions and custom clauses to its types; it is only for very complex and dynamic validation needs that I go pure Perl. Your mileage may vary.

What does "Sah" mean?

Sah is an Indonesian word meaning "valid" or "legal". It was picked because it is short.

The previous incarnation of this module used the Data::Schema namespace. It was started in 2009 and deprecated in 2011 in favor of Sah.

Comparison to other schema languages and type systems

Comparison to JSON schema?

JSON schema limits its type system to the types supported by JSON/JavaScript.

JSON schema's syntax is simpler. Its metaschema (the schema for the schema) is only about 130 lines, and there are no shortcut forms.

JSON schema's features are more limited. It has no expressions and no functions.

Comparison to Data::Rx?

TBD

Comparison to Data::FormValidator (DFV)?

TBD

Comparison to Moose types?

TBD

SYNTAX

General

Why is `req` not enabled by default?

I am following SQL's behavior. A type declaration like:

INT

in SQL means NULL is allowed, while:

INT NOT NULL

means NULL is not allowed. The latter is equivalent to specifying this in Sah:

int*

One could argue that setting req to 1 by default is safer or more convenient, so that int would mean ["int", "req", 1] while something like int? would mean ["int", "req", 0]. But this is simply a design choice, and each option has its pros and cons. Nullable by default can also be convenient in some cases, such as when specifying program options, most of which are optional.
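
The nullable-by-default rule can be modeled with a few lines of Python. This is only a sketch of the semantics described above, not Data::Sah's implementation or API; the function names parse_shortcut and check_int are made up for illustration, and Python's None stands in for Perl's undef.

```python
def parse_shortcut(schema):
    """Split a shortcut such as "int*" into (type_name, req)."""
    if schema.endswith("*"):
        return schema[:-1], True   # trailing "*" turns req on
    return schema, False           # req is off by default, like SQL's INT


def check_int(schema, value):
    """Validate value against "int" or "int*" (hypothetical helper)."""
    type_name, req = parse_shortcut(schema)
    if type_name != "int":
        raise ValueError("this sketch only models the int type")
    if value is None:
        return not req             # undef passes only when req is off
    return isinstance(value, int) and not isinstance(value, bool)
```

With this model, check_int("int", None) passes while check_int("int*", None) fails, which mirrors INT versus INT NOT NULL.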

How about adding a `default_req` configuration in `Data::Sah` then?

In general I am against compiler configuration that changes language behavior. In this case, it would give a simple schema like int an ambiguous meaning (is the undefined value allowed or not? It would depend on the compiler configuration).

How to express "not-something"? Why isn't there a `not` or `not_in` clause?

There are generally no not_CLAUSE clauses. Instead, a generic !CLAUSE syntax is provided. Examples:

// an integer that is not 0
["int", {"!is": 0}]

// a username that is not one of the forbidden/reserved ones
["str", {"!in": ["root", "admin", "superuser"]}]
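
To make the idea concrete, here is a small Python sketch of how a generic "!" prefix can replace a family of not_CLAUSE clauses. It is an illustration under my own assumptions, not how Data::Sah compiles schemas; the CLAUSES table and check_clause function are hypothetical names.

```python
# Hypothetical table of positive clauses: name -> predicate(value, arg).
CLAUSES = {
    "is": lambda value, arg: value == arg,
    "in": lambda value, arg: value in arg,
}


def check_clause(name, arg, value):
    """Evaluate one clause; a leading "!" negates its result."""
    negate = name.startswith("!")
    base_name = name[1:] if negate else name
    result = CLAUSES[base_name](value, arg)
    return (not result) if negate else result
```

Only the positive clause needs to be defined once; the negated form comes for free for every clause in the table.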

How to state `in` as well as `!in` in the same clause set?

You can't do this since it will cause a conflict:

["str", {"in": ["a","b","c"], "!in": ["x","y","z"]}]

However, you can do this:

["str", {"cset&": [{"in": ["a","b","c"]}, {"!in": ["x","y","z"]}]}]

How to express mutual failure ("if A fails, B must also fail")?

You can use the if clause and negate both clauses. For example:

"if": [{"!div_by": 2}, {"!div_by": 5}]

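The negated if clause above reads as an implication: if the value is not divisible by 2, then it must not be divisible by 5 either. A Python sketch of that logic follows; it only models the truth table and assumes nothing about Data::Sah internals. The helper names are made up.

```python
def div_by(n, d):
    """True when n is divisible by d."""
    return n % d == 0


def if_clause(condition, consequence):
    """Logical implication: when condition holds, consequence must hold."""
    return (not condition) or consequence


def valid(n):
    # "if": [{"!div_by": 2}, {"!div_by": 5}]
    return if_clause(not div_by(n, 2), not div_by(n, 5))
```

Under this reading, 4 and 10 pass because the condition does not apply to them, 3 passes because both clauses fail together, and 5 is rejected because div_by 2 fails while div_by 5 succeeds.
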
General advice when writing schemas?

Avoid any or all if you know that data is of a certain type

For performance and ease of reflection, it is better to create a custom clause than to use the any type, especially with a long list of alternatives. An example:

// dns_record is either a_record, mx_record, ns_record, cname_record, ...
["any", "of", [
        "a_record",
        "mx_record",
        "ns_record",
        "cname_record",
        ...
    ]
]

// base_record
["hash", "keys", {
    "owner": "str*",
    "ttl": "int"
}]

// a_record
["base_record", "merge.normal.keys", {
    "type": ["str*", "is", "A"],
    "address": "str*"
}]

// mx_record
["base_record", "merge.normal.keys", {
    "type": ["str*", "is", "MX"],
    "host": "str*",
    "prio": "int"
}]

...

As the declarations above show, every record is a hash, so it is better to declare dns_record as a hash instead of an any. But we then need to select a different schema based on the type key. We can develop a custom clause like this:

["hash", "select_schema_on_key", ["type", {
    "A": "a_record",
    "MX": "mx_record",
    "NS": "ns_record",
    "CNAME": "cname_record",
    ...
}]]

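This will be faster, because the validator can select one schema from the value of the type key instead of trying each alternative in turn.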
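
The difference between the two approaches can be sketched in Python. This models only the control flow (try every alternative versus look up one schema by key); the record checkers are simplified stand-ins that I made up, and none of this is Data::Sah code.

```python
def check_a_record(rec):
    """Simplified stand-in for the a_record schema."""
    return rec.get("type") == "A" and isinstance(rec.get("address"), str)


def check_mx_record(rec):
    """Simplified stand-in for the mx_record schema."""
    return rec.get("type") == "MX" and isinstance(rec.get("host"), str)


ALTERNATIVES = [check_a_record, check_mx_record]
BY_TYPE = {"A": check_a_record, "MX": check_mx_record}


def check_any(rec):
    """The any approach: try each alternative until one matches."""
    return any(check(rec) for check in ALTERNATIVES)


def check_by_key(rec):
    """The select-on-key approach: one dictionary lookup, one check."""
    check = BY_TYPE.get(rec.get("type"))
    return check is not None and check(rec)
```

Both functions accept the same records, but check_by_key runs at most one checker no matter how many record types exist.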

Hash

How does Sah check allowed/unallowed keys?

If the keys clause is specified, then by default only the keys defined in that clause are allowed, unless its .restrict attribute is set to false, in which case the clause places no restriction on allowed keys. The same applies to re_keys.

If the allowed_keys and/or allowed_keys_re clauses are specified, then only keys matching those clauses are allowed. This is in addition to any restrictions placed by other clauses, of course.

How do I specify schemas for some keys, but still allow some other keys?

Set the .restrict attribute for keys or re_keys to false. Example:

["hash", {
    "keys": {"a": "int", "b": "int"},
    "keys.restrict": 0,
    "allowed_keys": ["a", "b", "c", "d", "e"]
}]

The above schema allows the keys a, b, c, d, and e, and specifies schemas for the values of a and b. Another example:

["hash", {
    "keys": {"a": "int", "b": "int"},
    "keys.restrict": 0,
    "allowed_keys_re": "^[ab_]"
}]

The above schema specifies schemas for the values of a and b, but still allows other keys beginning with an underscore.
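
The interaction between keys.restrict, allowed_keys, and allowed_keys_re can be sketched in Python as follows. This models only the key-name checks described above (value schemas are left out), and the function name and parameters are my own invention rather than anything from Data::Sah.

```python
import re


def check_key_names(data, keys, restrict=True,
                    allowed_keys=None, allowed_keys_re=None):
    """Return True when every key in data is permitted."""
    for name in data:
        # keys restricts names by default; .restrict = 0 lifts that.
        if restrict and name not in keys:
            return False
        # allowed_keys and allowed_keys_re apply in addition, when given.
        if allowed_keys is not None and name not in allowed_keys:
            return False
        if allowed_keys_re is not None and not re.search(allowed_keys_re, name):
            return False
    return True
```

For example, with keys a and b, restrict off, and allowed_keys_re set to "^[ab_]", a hash containing a and _note passes while one containing c does not.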

What is the difference between the `keys` and `req_keys` clauses?

req_keys requires the listed keys to exist, but their values are governed by the schemas in keys or re_keys. Here are the four possible combinations, each with its schema:

To require a hash key to exist, but its value can be undef:

["hash", "keys", {"a": "int"}, "req_keys", ["a"]]

To allow a hash key to not exist, but when it exists it must not be undef:

["hash", "keys", {"a": "int*"}]

To allow a hash key to not exist, or its value to be undef when it exists:

["hash", "keys", {"a": "int"}]

To require a hash key to exist and its value to not be undef:

["hash", "keys", {"a": "int*"}, "req_keys", ["a"]]
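
The four combinations can be summarized with a small Python model. It is a sketch of the semantics only, with None standing in for undef; check_key_a and its two flags are hypothetical names, where key_required corresponds to listing a in req_keys and value_required corresponds to writing int* instead of int.

```python
def check_key_a(data, key_required, value_required):
    """Model a hash with one key "a" holding an int."""
    if "a" not in data:
        return not key_required        # missing key fails only under req_keys
    value = data["a"]
    if value is None:
        return not value_required      # undef value fails only under int*
    return isinstance(value, int) and not isinstance(value, bool)
```

The two flags are independent: req_keys decides whether the key may be absent, and the * on the value schema decides whether a present key may hold undef.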

Merging and hash keys?

XXX (Turn off hash merging using the '' Data::ModeMerge options key.)
