Schema Syntax
Essentially, any data can be described by a function.
For example, a positive integer:
def validate_int_plus(value):
    assert isinstance(value, int), 'invalid number'
    assert value > 0, 'number must > 0'
Another example, a string whose maximum length is 120 characters:
def validate_str_maxlen_120(value):
    assert isinstance(value, str), 'invalid string'
    assert len(value) <= 120, 'string too long'
Generalize these validation functions and you get validators: functions that take parameters and return a validation function.
For example, an integer validator with a minimum and a maximum value:
def int_validator(min, max):
    def validate(value):
        assert isinstance(value, int), 'invalid number'
        assert value >= min, f'number must >= {min}'
        assert value <= max, f'number must <= {max}'
    return validate
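As a rough illustration of how such a validator might be used, the sketch below repeats int_validator, builds one validator from it, and calls it with a valid and an invalid value. The name validate_percent is made up for this example.

```python
def int_validator(min, max):
    """Return a function that checks a value is an int within [min, max]."""
    def validate(value):
        assert isinstance(value, int), 'invalid number'
        assert value >= min, f'number must >= {min}'
        assert value <= max, f'number must <= {max}'
    return validate


# Hypothetical usage: one validator, configured once, reused many times.
validate_percent = int_validator(0, 100)
validate_percent(50)  # a valid value passes silently

try:
    validate_percent(101)
except AssertionError as ex:
    message = str(ex)  # the failed assertion carries the error message
```

Note that Python skips assert statements when run with the -O flag, so a real validator would more likely raise an explicit exception.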
A schema is just a representation of a validator.
After some research, I found that a step-by-step syntax is suitable.
For example, an integer validator:
int
A min 0 integer validator:
int.min(0)
A min 0, max 100 integer validator:
int.min(0).max(100)
The schema can be refined step by step, as requirements dictate.
In addition, if a parameter's value is True, we can omit the value:
int.optional(True) == int.optional
For list and dict validators, we can also validate their elements:
list( int.min(0).max(100) )
dict(
    key1=int.min(0).max(100),
    key2=str.maxlen(10),
)
Since int, str, dict, list, etc. are already defined in Python, we add a T. prefix to the validator names, e.g.:
T.int.min(0).max(100)
T.dict(
    key1=T.int.min(0).max(100),
    key2=T.str.maxlen(10),
)
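To make the chaining idea concrete, here is a minimal sketch of how such a syntax could be recorded in Python. It is not the library's actual implementation: the Builder class and its attributes are hypothetical, and it covers only scalar validators (no list or dict items). Each attribute access adds a parameter with the value True, and a following call replaces that value, which is why an omitted True compares equal to an explicit one.

```python
class Builder:
    """Hypothetical schema builder: records a validator name and its params."""

    def __init__(self, name=None, params=None, last=None):
        self._name = name            # validator name, e.g. 'int'
        self._params = params or {}  # parameters collected so far
        self._last = last            # parameter that a call would assign to

    def __getattr__(self, key):
        if key.startswith('_'):
            raise AttributeError(key)
        if self._name is None:
            return Builder(name=key)  # T.int selects the validator
        # A bare parameter defaults to True; a call may overwrite it.
        return Builder(self._name, {**self._params, key: True}, last=key)

    def __call__(self, value):
        if self._last is None:
            raise TypeError('no parameter to assign a value to')
        return Builder(self._name, {**self._params, self._last: value})

    def __eq__(self, other):
        return (isinstance(other, Builder)
                and (self._name, self._params) == (other._name, other._params))

    def __repr__(self):
        steps = [self._name or 'T']
        for key, value in self._params.items():
            steps.append(key if value is True else f'{key}({value!r})')
        return '.'.join(steps)


T = Builder()
schema = T.int.min(0).max(100)
same = T.int.optional(True) == T.int.optional
```

Every step returns a new Builder rather than mutating the old one, so a partial schema such as T.int.min(0) can be reused as the starting point for several refinements.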
You can also define a dict schema with a class:
@modelclass
class MyModel:
    key1 = T.int.min(0).max(100)
    key2 = T.str.maxlen(10)
From a structural point of view, JSON data can be divided into 3 types:
- scalar: string, number, true, false, null.
- sequence: also known as array or list.
- mapping: a collection of name/value pairs, also known as object or dictionary.
The schema is also JSON; it describes JSON data using the same 3 structures.
As a result, the schema is called Isomorph-JSON-Schema.
A mapping uses $self to describe itself; the other keys describe its inner elements:
{
    "$self": "schema",
    "key": "schema"
}
A sequence uses its first element to describe itself and its second element to describe its inner elements:
["schema", Item]
A sequence can omit the self-description and describe only its inner elements:
[Item]
A scalar uses a string to describe itself:
"schema"
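The three rules above can be sketched as a small function that splits one schema node into its self-description and its inner elements. This is only an illustration: the function name split_schema is made up, and falling back to "dict" or "list" when the self-description is omitted is an assumption rather than something the rules state.

```python
def split_schema(node):
    """Split one schema node into (self_schema, inner_elements)."""
    if isinstance(node, dict):
        # mapping: "$self" describes the mapping, other keys its elements
        inner = {key: value for key, value in node.items() if key != '$self'}
        return node.get('$self', 'dict'), inner  # 'dict' default is assumed
    if isinstance(node, list):
        if len(node) == 2:
            # sequence: [self_schema, item_schema]
            return node[0], node[1]
        if len(node) == 1:
            # sequence without self-description: [item_schema]
            return 'list', node[0]  # 'list' default is assumed
        raise ValueError('a sequence schema must have 1 or 2 elements')
    if isinstance(node, str):
        # scalar: the string describes the value itself
        return node, None
    raise TypeError(f'unsupported schema node: {node!r}')


scalar = split_schema('int.min(0)')
sequence = split_schema(['list.unique', 'str.minlen(1)'])
mapping = split_schema({'$self': 'dict', 'id': 'int'})
```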
Note: the JSON syntax is for exchanging schemas between languages.
Here is some actual data:
{
    "id": 1,
    "name": "A green door",
    "price": 12.50,
    "tags": ["home", "green"]
}
And its schema in Python:
T.dict(
    id=T.int.desc('The unique identifier for a product'),
    name=T.str.desc('Name of the product'),
    price=T.float.exmin(0),
    tags=T.list(
        T.str.minlen(1)
    ).unique
)
The same schema as a Python class:
@modelclass
class Product:
    id = T.int.desc('The unique identifier for a product')
    name = T.str.desc('Name of the product')
    price = T.float.exmin(0)
    tags = T.list(
        T.str.minlen(1)
    ).unique
The same schema in JSON:
{
    "$self": "dict",
    "id": "int.desc('The unique identifier for a product')",
    "name": "str.desc('Name of the product')",
    "price": "float.exmin(0)",
    "tags": [
        "list.unique",
        "str.minlen(1)"
    ]
}
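One way to see why the schema is called isomorphic is to check that the JSON schema and the data have the same shape. The sketch below does only that structural comparison, using the standard json module. The function same_shape is hypothetical, it does not evaluate strings such as "float.exmin(0)", and it assumes every key in the schema is present in the data.

```python
import json

SCHEMA = json.loads("""
{
    "$self": "dict",
    "id": "int.desc('The unique identifier for a product')",
    "name": "str.desc('Name of the product')",
    "price": "float.exmin(0)",
    "tags": ["list.unique", "str.minlen(1)"]
}
""")

DATA = json.loads("""
{
    "id": 1,
    "name": "A green door",
    "price": 12.50,
    "tags": ["home", "green"]
}
""")


def same_shape(schema, data):
    """Return True if data has the structure that schema describes."""
    if isinstance(schema, dict):
        keys = set(schema) - {'$self'}  # "$self" describes the mapping itself
        return (isinstance(data, dict)
                and keys == set(data)
                and all(same_shape(schema[k], data[k]) for k in keys))
    if isinstance(schema, list):
        if not schema:
            return False
        item = schema[-1]  # the last element describes the items
        return (isinstance(data, list)
                and all(same_shape(item, x) for x in data))
    # scalar schema: the data must be a scalar as well
    return isinstance(schema, str) and not isinstance(data, (dict, list))


matches = same_shape(SCHEMA, DATA)
```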