# what can schema do for you!

schema are critical to strong software interfaces. we can see this in modern python type annotations, sql schema, and numpy and pandas dtypes. the open web is driven by schema, such as the resource description frameworks that describe html metadata. there are standards for presenting schema in `json`.

in all, schema, non-empirical rules, are consistent features across programming interfaces and cultures at large. perhaps schema are so embedded in our systems because they were first discussed in [Plato and Aristotle's philosophical camps](https://plato.stanford.edu/entries/schema/#ScheHistLogi).

in this work we will not define what schema are; instead, we will discuss different forms of schema and what their definitions provide.

from here on we'll use `schemata` to demonstrate what external affordances schema provide us. what we will observe are different forms, or outward representations, of schema in different contexts. for example, `repr` and `str` present the contents of the same `object` with _perhaps_ differing forms. within interactive computing environments, like `jupyter`, we can extend our schema to interactive, modern web displays.
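to make the `repr` and `str` remark concrete, here is a small sketch using only the standard library. the date value is an arbitrary choice for illustration; the point is that one object yields two different forms.

```python
import datetime

day = datetime.date(2020, 7, 27)

# str favors a human-readable form
print(str(day))   # 2020-07-27

# repr favors an unambiguous, developer-facing form
print(repr(day))  # datetime.date(2020, 7, 27)
```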


## a practical study of the github api schema

the github api defines an openapi specification. this is helpful to their product because the schema provides a general description, in json, that can be translated into different systems or environments. one immediate benefit to github of having a conventional schema is that they get api documentation and a playground for free: https://docs.github.com/en/rest

another position from which we can view that api schema is the `https://api.github.com` endpoint. we could explore it in our web browser or with `pandas` to observe different forms of the schema.

In [1]:
    import pandas

In [2]:
    if "api" not in locals():
        # fetch the api root once and keep it as a one-column frame
        api = pandas.read_json("https://api.github.com", typ="series").to_frame("github api")
    api.T

Unnamed: 0,current_user_url,current_user_authorizations_html_url,authorizations_url,code_search_url,commit_search_url,emails_url,emojis_url,events_url,feeds_url,followers_url,...,rate_limit_url,repository_url,repository_search_url,current_user_repositories_url,starred_url,starred_gists_url,user_url,user_organizations_url,user_repositories_url,user_search_url
github api,https://api.github.com/user,https://github.com/settings/connections/applic...,https://api.github.com/authorizations,https://api.github.com/search/code?q={query}{&...,https://api.github.com/search/commits?q={query...,https://api.github.com/user/emails,https://api.github.com/emojis,https://api.github.com/events,https://api.github.com/feeds,https://api.github.com/user/followers,...,https://api.github.com/rate_limit,https://api.github.com/repos/{owner}/{repo},https://api.github.com/search/repositories?q={...,"https://api.github.com/user/repos{?type,page,p...",https://api.github.com/user/starred{/owner}{/r...,https://api.github.com/gists/starred,https://api.github.com/users/{user},https://api.github.com/user/orgs,https://api.github.com/users/{user}/repos{?typ...,https://api.github.com/search/users?q={query}{...


because github obeys a schema, we can be confident that the shape of this data will consistently be the same, and if something is wrong we know there is documentation in the openapi docs.

an important feature of schema is that they may be translated and applied in different languages. for example, when we load the github api into pandas we can translate the schema to dtypes or json table schema.
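as a rough sketch of what such a translation involves, the snippet below maps python values to table schema style field descriptions by hand. `TABLE_TYPES`, `infer_fields`, and the `hypothetical_count` key are made up for illustration; this is not how pandas does it internally.

```python
# assumed mapping from python types to table schema type names
TABLE_TYPES = {bool: "boolean", int: "integer", float: "number", str: "string"}


def infer_fields(record):
    """describe each key of a flat dict as a table schema style field."""
    return [
        {"name": name, "type": TABLE_TYPES.get(type(value), "any")}
        for name, value in record.items()
    ]


record = {
    "emojis_url": "https://api.github.com/emojis",
    "events_url": "https://api.github.com/events",
    "hypothetical_count": 3,  # not part of the github api, added to show a second type
}
fields = infer_fields(record)
print(fields)
```

the same record now has a second form: a list of named, typed fields that another system could consume.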

### dtypes

In [3]:
    api.T.dtypes

current_user_url                        object
current_user_authorizations_html_url    object
authorizations_url                      object
code_search_url                         object
commit_search_url                       object
emails_url                              object
emojis_url                              object
events_url                              object
feeds_url                               object
followers_url                           object
following_url                           object
gists_url                               object
hub_url                                 object
issue_search_url                        object
issues_url                              object
keys_url                                object
label_search_url                        object
notifications_url                       object
organization_url                        object
organization_repositories_url           object
organization_teams_url                  object
public_gists_

### table schema

In [4]:
    pandas.read_json(api.to_json(orient="table"), typ="series")["schema"]

{'fields': [{'name': 'index', 'type': 'string'},
  {'name': 'github api', 'type': 'string'}],
 'primaryKey': ['index'],
 'pandas_version': '0.20.0'}

https://github.blog/2020-07-27-introducing-githubs-openapi-description/
https://github.com/github/rest-api-description

## learning some schema by writing it

In [5]:
    from schemata import *
    import math, pytest

In [6]:
    NegativeNumber = Numeric.ExclusiveMaximum[0] 
    NegativeNumber.schema(ravel=True)

{'exclusiveMaximum': 0}

In [7]:
    NegativeNumber = Float[NegativeNumber]
    NegativeNumber.schema(ravel=True)

{'type': 'number', 'exclusiveMaximum': 0}

### schema provides validation

one of the most immediate benefits of defining a schema is that it provides validation.

In [8]:
    with pytest.raises(BaseException):
        NegativeNumber(1.)

In [9]:
    assert NegativeNumber(-1.) == -1
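to see what that validation amounts to, here is a minimal plain python sketch of the two keywords in the `NegativeNumber` schema. `validate_number` is a hypothetical helper written for this example; it is not how `schemata` validates, and it handles only `type: number` and `exclusiveMaximum`.

```python
import numbers

schema = {"type": "number", "exclusiveMaximum": 0}


def validate_number(value, schema):
    """return value if it satisfies the two keywords handled here, else raise ValueError."""
    if schema.get("type") == "number":
        # bool is a subclass of int in python, but json schema treats booleans separately
        if isinstance(value, bool) or not isinstance(value, numbers.Real):
            raise ValueError(f"{value!r} is not a number")
    if "exclusiveMaximum" in schema and not value < schema["exclusiveMaximum"]:
        raise ValueError(f"{value!r} is not less than {schema['exclusiveMaximum']}")
    return value


print(validate_number(-1.0, schema))
```

note that `exclusiveMaximum` excludes the bound itself, so `0` is rejected along with every positive number.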

### schema define the null space

sometimes what a thing is can only be defined by what it is not.

In [10]:
    NotNegativeNumber = -NegativeNumber
    NotNegativeNumber.schema(ravel=True)

{'not': {'type': 'number', 'exclusiveMaximum': 0}}
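the `not` keyword can be read as plain negation of another schema. the sketch below assumes a tiny, hand-written subset of `jsonschema` (`type: number`, `exclusiveMaximum`, and `not`); `is_valid` is a hypothetical function for illustration, not the `schemata` implementation.

```python
def is_valid(value, schema):
    """check a value against a tiny subset of json schema keywords."""
    # not: the value must fail the nested schema
    if "not" in schema and is_valid(value, schema["not"]):
        return False
    if schema.get("type") == "number":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
    if "exclusiveMaximum" in schema and not value < schema["exclusiveMaximum"]:
        return False
    return True


negative = {"type": "number", "exclusiveMaximum": 0}
not_negative = {"not": negative}

print(is_valid(-1.0, negative), is_valid(-1.0, not_negative))
print(is_valid(0, not_negative), is_valid("text", not_negative))
```

one consequence worth noticing: under this reading the null space holds everything that is not a negative number, including values that are not numbers at all.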

### schema describe things

In [11]:
    class NegativeNumber(NegativeNumber, Examples[-math.pi, -42.] + Description[
        "a castable negative float"
    ]): 
        pass
    NegativeNumber.schema()

{'type': 'number',
 'exclusiveMaximum': 0,
 'examples': (-3.141592653589793, -42.0),
 'description': 'a castable negative float'}

In [12]:
    class NotNegativeNumber(-NegativeNumber, Examples[0, math.pi, 100], Description[
        "the null space of negative numbers"
    ]):
        pass
    NotNegativeNumber.schema()

{'not': __main__.NegativeNumber,
 'examples': (0, 3.141592653589793, 100),
 'description': 'the null space of negative numbers'}

based on the enriched schema we can infer `numpy` style docstrings as shown for `NegativeNumber` and `NotNegativeNumber` below.

### schema generate documentation

In [13]:
    print(NegativeNumber.__doc__, "\n"*3, NotNegativeNumber.__doc__)

a castable negative float

Examples
--------

>>> NegativeNumber(-3.141592653589793)
-3.141592653589793
>>> NegativeNumber(-42.0)
-42.0 


 the null space of negative numbers

Examples
--------

>>> NotNegativeNumber(0)
0
>>> NotNegativeNumber(3.141592653589793)
3.141592653589793
>>> NotNegativeNumber(100)
100
>>> with __import__("pytest").raises(BaseException): NotNegativeNumber(-3.141592653589793)
>>> with __import__("pytest").raises(BaseException): NotNegativeNumber(-42.0)
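a docstring like the first one above can be assembled from just the `description` and `examples` keys. the following is a sketch under that assumption; `numpy_docstring` is a hypothetical helper and not the code `schemata` uses.

```python
def numpy_docstring(name, schema):
    """build a numpy style docstring from the description and examples keys."""
    lines = [schema.get("description", ""), "", "Examples", "--------", ""]
    for example in schema.get("examples", ()):
        # each example becomes a doctest: a call and its expected repr
        lines.append(f">>> {name}({example!r})")
        lines.append(repr(example))
    return "\n".join(lines)


negative_schema = {
    "type": "number",
    "exclusiveMaximum": 0,
    "examples": (-3.141592653589793, -42.0),
    "description": "a castable negative float",
}
doc = numpy_docstring("NegativeNumber", negative_schema)
print(doc)
```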


by formalizing a schema as numpy docstrings we can immediately use the standard library `doctest` module to test our types; the test results are shown below.

these generated docstrings also enrich the translation to documentation using the `autodoc` and `napoleon` `sphinx` extensions.

https://schemata--4.org.readthedocs.build/en/4/src/schemata/tests/readme.html

#### `sphinx-jsonschema`

https://schemata--4.org.readthedocs.build/en/4/src/schema.html

### schema generate test cases

In [14]:
    import doctest
    doctest.testmod(verbose=2, optionflags=doctest.ELLIPSIS)

Trying:
    NegativeNumber(-3.141592653589793)
Expecting:
    -3.141592653589793
ok
Trying:
    NegativeNumber(-42.0)
Expecting:
    -42.0
ok
Trying:
    NotNegativeNumber(0)
Expecting:
    0
ok
Trying:
    NotNegativeNumber(3.141592653589793)
Expecting:
    3.141592653589793
ok
Trying:
    NotNegativeNumber(100)
Expecting:
    100
ok
Trying:
    with __import__("pytest").raises(BaseException): NotNegativeNumber(-3.141592653589793)
Expecting nothing
ok
Trying:
    with __import__("pytest").raises(BaseException): NotNegativeNumber(-42.0)
Expecting nothing
ok
1 items had no tests:
    __main__
2 items passed all tests:
   2 tests in __main__.NegativeNumber
   5 tests in __main__.NotNegativeNumber
7 tests in 3 items.
7 passed and 0 failed.
Test passed.


TestResults(failed=0, attempted=7)
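`doctest.testmod` collects docstrings from a whole module. to watch `doctest` consume a single docstring in isolation, the sketch below parses and runs one directly. `negative_number` is a hypothetical stand-in for `NegativeNumber`, and the expected traceback is an alternative to the `pytest.raises` form used in the generated docstrings.

```python
import doctest


def negative_number(value):
    """hypothetical stand-in that accepts only values below zero."""
    if not value < 0:
        raise ValueError(f"{value!r} is not negative")
    return value


docstring = """
>>> negative_number(-42.0)
-42.0
>>> negative_number(1.0)
Traceback (most recent call last):
    ...
ValueError: 1.0 is not negative
"""

# parse the examples out of the string, then run them against our namespace
parser = doctest.DocTestParser()
test = parser.get_doctest(
    docstring, globs={"negative_number": negative_number},
    name="sketch", filename=None, lineno=0,
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 2
```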

let's catch our breath here. we just demonstrated that python can be used to write `jsonschema` specifications describing a `NegativeNumber` and its null space `NotNegativeNumber`. by using specific keys described in the `jsonschema` specification we could infer:
* numpy docstrings
    * including `doctest`s
* evaluations of both the null and non-null space of our `NegativeNumber`

#### advanced testing with `hypothesis`

an advanced application is statistically sampled tests with `hypothesis`.

In [15]:
    import hypothesis
    types = hypothesis.strategies.sampled_from((NegativeNumber, NotNegativeNumber))

    @hypothesis.strategies.composite
    def draw_pair(draw, type=types):
        t = draw(type)
        return t, draw(t.strategy())

    @hypothesis.given(draw_pair())
    @hypothesis.settings(max_examples=4)
    def test_types(pair):
        global counter
        counter += 1
        cls, value = pair
        assert cls(value) == value

running the test function samples types and matching values.

In [16]:
    counter = 0; test_types()
    F"{counter} tests run"

'4 tests run'
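the idea behind the sampled test is to draw a type, draw a value that should belong to it, and check that it does. as a rough stand-in that avoids the dependency, and is not how `hypothesis` works internally, the same loop can be sketched with the standard library `random` module. the `spaces` table and its samplers are assumptions made for this example.

```python
import random


def is_negative(value):
    return value < 0


# each hypothetical "type" pairs a predicate with a sampler that should only produce valid values
spaces = {
    "NegativeNumber": (is_negative, lambda rng: -rng.uniform(1e-9, 1e6)),
    "NotNegativeNumber": (lambda v: not is_negative(v), lambda rng: rng.uniform(0, 1e6)),
}

rng = random.Random(0)  # seeded so the run is repeatable
counter = 0
for _ in range(4):
    name = rng.choice(sorted(spaces))
    predicate, sample = spaces[name]
    value = sample(rng)
    assert predicate(value), (name, value)
    counter += 1

print(f"{counter} tests run")  # 4 tests run
```

what `hypothesis` adds on top of this is the derivation of the samplers and the shrinking of failing cases to minimal examples.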

at this point we haven't really made our own instance, the python form of the schema, of `NegativeNumber`.

### schema constrains form

the form is what it looks like.

there are different contexts that this `NegativeNumber` can be printed in, like:

* the standard python `print`

In [17]:
    NegativeNumber(-math.pi)

-3.141592653589793

* a richer ascii representation using the `rich` library

In [18]:
    # NBVAL_IGNORE_OUTPUT
    NegativeNumber(-math.pi).print()

#### schema and form

i think now is a good place for a little wordplay. i find validation in the nearness of __schema__ and __form__ in the name of the popular `react-jsonschema-form` library, which presents user interfaces based on schema. it has widespread use in the javascript community.

`schemata` ~loosely~ adopts `react-jsonschema-form` conventions to encode user interfaces, including layout and styling. the back end for the `schemata` UI is the popular `ipywidgets` library, which constructs widgets similar to those of rjsf.

In [19]:
    WideSlider = ui.Slider + ui.Layout[dict(width="60%")] + ui.Style[dict(width="100px")] + Numeric.Minimum[-10]
    WideSlider.schema()

{'ui:widget': 'slider',
 'layout': {'width': '60%'},
 'style': {'width': '100px'},
 'minimum': -10}

In [20]:
    NegativeNumber[WideSlider](-math.pi)

FloatSlider(value=-3.141592653589793, description='a castable negative float', layout=Layout(width='60%'), max…

In [21]:
    NegativeNumber.text()(-math.tau)

FloatText(value=-6.283185307179586, description='a castable negative float')

## conclusion

a general feature of schema, particularly `jsonschema`, is that its definition impacts documentation, testing, and interactive programming experiences. 

there is a healthy practice in translating code or ideas into schema. it is a pathway to understanding the basis of a thing.

---

## understanding schema takes time.

this document has been a journey for me. i've approached schema from literary, scientific, and computable perspectives, hunting for a ghost, because, at best, a schema is an informal abstract.

while learning schema, it is important to observe many positions for schema, and the extended outcomes of well described systems. schema can move between systems and define constraints. each schema translated into a different system creates a different form. these are the forms we must observe.

so to understand schema, i wrote code about schema, and explored tools that extended schema to help me do more with less. the outcome of these studies is a python library called `schemata` that implements, and extends, `jsonschema` conventions to python in general.

on the open web, however, we have explicit specifications for schema, written in terse language, that allow others to adopt implementations in their language. these implementations are forms of schema; the specification languages enable consistent interfaces across programming systems.