Skip to content

Latest commit

 

History

History
603 lines (489 loc) · 32.8 KB

README_checks.md

File metadata and controls

603 lines (489 loc) · 32.8 KB

checks.pl

A more powerful replacement for the venerable must_be/2 to perform type and domain checks on predicate entry (among others).

** This documentation is out of sync with the latest update **

Synopsis

Check that term X fulfills all the conditions in the list Conditions. Conditions that are marked tuned will preferentially fail instead of throwing if the condition is not fulfilled:

check_that(+X,@Conditions)

Check that term X fulfills all the conditions in the list Conditions. If Tuned is instantiated to hard, conditions that are marked tuned will throw if the condition is not fulfilled. If Tuned is instantiated to soft, conditions that are marked tuned will fail if the condition is not fulfilled.

check_that(+X,@Conditions,@Tuned)

Same as check_that/2, but a Name for X is given. This name will be used when the message for an excpetion is constructed:

check_that_named(X,Conditions,Name)

Same as check_that/3, but a Name for X is given. This name will be used when the message for an excpetion is constructed:

check_that_named(X,Conditions,Name,Tuned)

Exception terms

The exception terms thrown by check_that/N and check_that_named/N are not ISO exception terms, although they still retain the outer error(Formal,Context) structure. They are structured as follows:

error(check(Type,Expected,Msg,Culprit),_).

Where Type is an exception-identifying atom that is generally one of:

  • type - the culprit is of the wrong type to pass the check
  • domain - the culprit is of the correct type but of the wrong domain to pass the check
  • uninstantiation - the culprit is too instantiated (generally, fully instantiated when it shouldn't be)
  • instantiation - the culprit is not instantiated enough (generally, fully uninstantiated when it shouldn't be)

and

  • Expected - either an uninstantiated term or a string that explains what is expected
  • Msg - either an uninstantaited term or a string giving additional information
  • Culprit - the term that caused the exception to be thrown; may be large or contain sensitive information!

A hook into prolog:error_message/1 formats the exception for the toplevel printer.

Description

check_that/3 and friends: a replacement for the must_be/2 predicate of (SWI-)Prolog. must_be/2 is used to check preconditions on predicate entry, but is not very flexible. Can we change that?

A call to check_that/3 looks as follows:

check_that(X,Conditions,Tuned)

where

  • X is the term that is subject to being checked.
  • Conditions is a proper list of conditions to be evaluated left-to-right
  • Tuned is a flag that determines whether to, in certain settings, preferentially throw (if it is hard) or fail (if it is soft; actually anything other than hard)

The simpler

check_that(X,Conditions)

assumes that Tuned is soft.

The more extensive

check_that_named(X,Conditions,Name)
check_that_named(X,Conditions,Name,Tuned)

also take a Name to designate the X that is being checked. This Name can then be inserted into exception messages. Generally one would not bother with this.

TODO:

I seems one sometimes wants to provide an explicit error message instead of having check_that construct a confusing one. Make that possible!

The list of conditions

The list of conditions in the call check_that(X,Conditions) behaves like a "conditional and" (aka. "short-circuiting and") of verifications.

A condition is a compound term, a tagged check. The functor name of the condition specifies what to do depending on the outcome of the check.

The behaviour is as given below for the various tags allowed in conditions.

The "check precondition fails" generally means that the actual check cannot determine from the current state of X whether to reasonably succeed or fail because X is not instantiated enough. The check would then raise an instantiation exception (in the current implementation, a non-ISO exception term error(check(instantiation,_,_,_),_).

The Prolog default checks like atom/1 take the high way and fail if an uninstantaited term is passed. However, atom(_) failing means that either it pretends to know something about _ that it doesn't (namely that this is not an atom) or it is actually a second-order predicate like var/1 that can analyze the momentary state of the computation and say that a term is indeed uninstantiated. In either case, we have something dubious.

break/1

  • Precondition fails: An exception (generally a 'uninstantiated error' exception) is thrown.
  • Verification fails: The condition succeeds, leading to examination of the next condition to the right.
  • Verification succeeds: Condition processing stops and check_that succeeds overall

smooth/1

  • Precondition fails: The condition fails, leading to the whole of check_that/N failing. This is like the behaviour of prolog predicates like atom/N when they are given an uninstantiated term: They just fail.
  • Verification fails: The condition fails, leading to the whole of check_that/N failing.
  • Verification succeeds: The condition succeeds, leading to examination of the next condition to the right.

soft/1

  • Precondition fails: An exception (generally a 'uninstantiated error' exception) is thrown.
  • Verification fails: The condition fails, leading to the whole of check_that/N failing.
  • Verification succeeds: The condition succeeds, leading to examination of the next condition to the right.

tuned/1 and the Tuned flag is soft: behaves like soft/1

  • Precondition fails: An exception (generally a 'uninstantiated error' exception) is thrown.
  • Verification fails: The condition fails, leading to the whole of check_that/N failing.
  • Verification succeeds: The condition succeeds, leading to examination of the next condition to the right.

tuned/1 and the Tuned flag is hard: behaves like hard/1

  • Precondition fails: An exception (generally a 'uninstantiated error' exception) is thrown.
  • Verification fails: An exception (generally a 'type error' if X is out-of-type, and a domain error if X is 'out of domain') is thrown.
  • Verification succeeds: The condition succeeds, leading to examination of the next condition to the right.

hard/1

  • Check precondition fails: An exception (generally a 'uninstantiated error' exception) is thrown.
  • Check verification fails: An exception (generally a 'type error' if X is out-of-type, and a domain error if X is 'out of domain') is thrown.
  • Verification succeeds: The condition succeeds, leading to examination of the next condition to the right.

How would one use this

 client code                     circumspect                       "logical"
      |                          client code                      client code
      |                      (in particular if                         |
      |                    it calls predicate_x/N+1                    |
      |                       with Throw = true)                       |
      |                               |                                |
      |                               |                                |
      |                               |                                |
      V                               |                                V
predicate_x/N                         |                        predicate_x_smooth/N
      |                               |                       - - - - - - - - - - -
      |                               |                     captures all the argument
      |                               |                 subdomains where predicate_x/N+1
      +----------------------------+  |                    would throw and *fails them*
               Calls               |  |               (using appropriate check_that/2 calls)
          predicate_x/N+1          |  |                                |
               with                |  |                                |
           Throw = false           |  |                                |
          - - - - - - - -          |  |  +-----------------------------+
      As a consequence, some       |  |  |                Calls
      domains of problematic       |  |  |           predicate_x/N+1
      arguments subdomains will    |  |  |                with
      just cause failure instead   |  |  |            Throw = false
      of exceptions.               |  |  |
                                   |  |  |
                                   |  |  |
                                   |  |  |
                                   V  V  V
                                predicate_x/N+1
                               - - - - - - - - -
                             takes an additional
                       "Throw" argument that determines
                      what check_that/3 conditions tagged
                       as "tuned/1" will do: throw or fail
                                      |
                                      |
                                      |
                                      V
                          explicitly allow the cases
                            of unbound arguments
                          - - - - - - - - - - - - -
                       This is done with calls that break        | Note that checks generally throw if they
                         if the argument is unbound:             | are given an unbound argument, which is why
                       check_that(X,[break(var),...])            | "break" is needed. Contrariwise, Prolog's atom(_) fails
                                      |                          | but check_that(_,[soft(atom)]) throws "instantiation error".
                                      |                          | Alternatively, check_that(_,[smooth(atom)]) fails.
                                      |
                                      V
                            underspecified problem? ------------> throw "instantion error"
                            - - - - - - - - - - - -               Conditions checking this should be tagged with hard/1
                               some arguments or                  so that they always throw. If you want
                            tuples of arguments are               to see failure, you should catch-and-fail
                            not instantiated enough               in a wrapping predicate.
                                      |
                                      |
                                      |
                                      V
                                  wrong type? ------------------> These conditions should be tagged with hard/1
                               - - - - - - - -                    as such an error smells like a programming error.
                              some arguments are                  In a "smooth" setting, one may want these to fail,
                              of the wrong type                   in that case, you should catch-and-fail
                       (e.g. passing 'foo' as integer arg)        in a wrapping predicate.
                                      |
                                      |
                                      |
                                      V
                                wrong domain? ------------------> These conditions should be tagged with tuned/1
                                - - - - - - -                     so that they throw if member(Throw,[true,throw]) but only
                       some arguments or argument tuples          fail otherwise, thus causing predicate_x/N+1 to
                         are out of the allowed domain            appear more relaxed regarding some problems.
                        (such a test can become complex)
                                      |
                                      |
                                      |
                                      V
                         cross-argument correlations -----------> Generally the same as domain errors.
                         - - - - - - - - - - - - - -
                  complex conditions may need to be checked;
                  for these, use passany, passall, passnone
                          checks. This can be costly.
                                      |
                                      |
                                      |
                                      V
                               "BUSINESS CODE" ------------------------> Throw or fail? It depends, but if a problem is
                  actual processing can more easily and cheaply          detected "after the entry checks" on would
                  check additional conditions at certain points          probably want to throw.
                      and should not refrain from doing so.
                                                    |
                                                    +------------------> Augment with assertion/1 calls according to taste.
                                                                         These can be considered "live documentation" and
                                                                         can always be compiled-out later.

Check keywords implemented so far

"TBC" stands for "The term to be checked".

Keyword If TBC is var TBC must be a/an ....
true doesn't care This check always succeeds, whatever TBC is.
false,fail doesn't care This check always fails, whatever TBC is.
random(Probability) doesn't care This check randomly fails with probability 0 =< Probability =< 1.
Useful for playing chaos monkey.
var covered by test uninstantiated term.
nonvar covered by test instantiated term.
nonground covered by test nonground term (and thus may be uninstantiated).
ground covered by test ground term (and thus will be instantiated).
atom,symbol throws atom.
atomic,constant throws atomic term.
compound throws compound term.
boolean throws one of the atoms true or false.
pair throws compound term with functor name - and arity 2.
string throws SWI-Prolog string.
stringy throws either an atom or an SWI-Prolog string.
nonempty_stringy throws nonempty stringy: a stringy that is different from '' and "".
char throws atom of length 1. This is the traditional Prolog char type.
char_list,chars throws proper list of 0 or more chars. Unbound elements are not allowed.
code throws integer between 0 and 0x10FFFF (a Unicode code point).
Detailed range checks are not done.
code_list,codes throws proper list of 0 or more codes. Unbound elements are not allowed.
chary throws char or a code.
chary_list,charys throws proper list of 0 or more chars or codes (but consistently only one of those).
Unbound list elements cause an exception.
stringy_typeid throws one of the atoms string or atom. Compare with the check boolean.
chary_typeid throws one of the atoms char or code. Compare with the check boolean.
number throws number (encompasses float, rational, integer).
float throws float (64 bit double precision), including +1.0Inf, -1.0Inf, NaN, -0.0.
float_not_nan throws float, excluding NaN.
float_not_inf throws float, excluding +1.0Inf, -1.0Inf.
float_not_neginf throws float, excluding -1.0Inf.
float_not_posinf throws float, excluding +1.0Inf.
int/integer throws integer.
rational throws SWI-Prolog _rational, which encompasses integer.
nonint_rational,proper_rational throws SWI-Prolog rational that is not an integer.
negnum,negnumber throws strictly negative number.
posnum,posnumber throws strictly positive number.
neg0num,neg0number throws negative-or-zero number.
pos0num,pos0number throws positive-or-zero number.
non0num,non0number throws non-zero number.
negint,negative_integer throws strictly negative integer.
posint,positive_integer throws strictly positive integer.
neg0int throws negative-or-zero integer.
pos0int,nonneg throws positive-or-zero integer.
negfloat, throws strictly negative float.
posfloat throws strictly positive float.
neg0float throws negative-or-zero float.
pos0float throws positive-or-zero float.
inty throws integer or a float that represents an integer, e.g 1.0.
neginty throws strictly negative inty.
posinty throws strictly positive inty.
neg0inty throws negative-or-0 inty.
pos0inty throws positive-or-0 inty.
list,proper_list throws proper list, including the empty list. (TODO: open lists)
nonempty_list throws proper list that is not empty.
dict throws SWI-Prolog dict (which is a compound term following some special requirements).
dict_has_key(Key) throws term must be a dict, and it must contain an entry for Key.
cyclic throws term for which one can unambiguously decide whether it is cyclic.
cyclic_now covered by test term that has a cyclic structure now.
check_that(_,soft(cyclic_now)) fails.
acyclic_now covered by test term that has no cyclic structure now, but may acquire it later, unless the term is ground.
check_that(_,soft(acyclic_now)). succeeds.
acyclic_forever covered by test term that is both ground and acyclic and will never become cyclic.
check_that(_,soft(acyclic_forever)). fails.
stream throws term that is either "stream name" (certain atoms) or a valid stream handle (certain blobs).
The known stream names are user_input, user_output, user_error, current_input, current_output.
unifies(Z) covered by test term that unifies with Z. Unification is rolled back by use \+ \+.
member(ListOfValues) throws term that is member of the ListOfValues. Membership test is unification, as for member/2.
forall(ListOfChecks) depends TBC must pass all checks in ListOfChecks. Recursive. Same as writing the checks down normally, thus not really useful.
forany(ListOfChecks) depends TBC must pass at least one check in ListOfChecks. Recursive. Useful for implementing an or.
fornone(ListOfChecks) depends TBC must pass no check in ListOfChecks. Recursive. Useful for negation.
passall(Check) throws Check is one of the check keywords. TBC must be a proper list. All the terms in that list must pass Check. Useful for terse code.
passany(Check) throws Check is one of the check keywords. TBC must be a proper list. At least one terms in that list must pass Check.
passnone(Check) throws Check is one of the check keywords. TBC must be a proper list. None of the terms in that list must pass Check.

Keywords provided by must_be/2 not yet implemented

  • between(FloatL,FloatU): if FloatL is float, all other values may be float or integer (FIXME?); the limits are both INCLUSIVE; limits may be equal but NOT reversed
  • between(IntL,IntU): if IntL is integer, all other values must be integer; the limits are both INCLUSIVE; limits may be equal but NOT reversed. FIXME: there should be specific between_int/2 and between_float/2 if one goes that way.
  • text: atom or string or chars or codes (but not numbers even though some predicates "textify" those)
  • list(Type): proper list with elements of type Type (must_be/2(Type,_) is called on elements); empty list is allowed; on error the index is not indicated. A type like "list(list(integer))" is ok! Actually corresponds to passall(Type)
  • list_or_partial_list: A partial list (one ending in a variable: [x|_]). This includes an unbound variable.
  • callable: passes callable/1. Relatively usesless, as "callable" is ill-defined. Basically (atom(X);compound(X))
  • encoding: valid name for a character encoding; see current_encoding/1, e.g. utf8 (but not "utf8" or 'utf-8'; also fails for 'iso_8859_1')
  • type: Meta: Term is a valid type specification for must_be/2. This is done by looking up whether a clause has_type(Type,_) :- .... exists. Example: must_be(type,list(integer)). However, "must_be(type,list(grofx))": OK, but "must_be(type,grofx)": NOT OK.

Possible extension:

  • predicate_indicator: A Name/Arity predicate indicator
  • list_length_X: Tests for length of lists (larger, smaller, equal)
  • subsumes
  • does_not_unify / dif

A note on cyclic/acyclic

Consider the "instantiation career" of a term, going from "most uninstantiated" to "ground":

                       uninstantiated
                             |
                             V
              +--------------+-------------+
              |                            |
              V                            |
      nonground noncyclic                  |
              |                            |
              +----------------------------+
              |                            |
              |                            V
              |                     nonground cyclic
              |                            |
              V                            V
        ground acylic                ground cyclic

For example

                             X=_
                   not "cyclic", "acyclic"
                             |
                             V
              +--------------+-------------+
              |                            |
              V                            |
            X=s(_)                         |
   not "cyclic", "acyclic"                 |
              |                            |
              +----------------------------+
              |                            |
              |                            V
              |                         X=s(X,_)
              |                 "cyclic", not "acyclic"
              |                            |
              V                            V
            X=s(a)                       X=s(X)
    not "cyclic", "acyclic"      "cyclic", not "acyclic"

We would like to see a predicate which throws unless it can be sure, which is not the case with the builtins cyclic_term/1 and acyclic_term/1.

The predicates cyclic_term/1 and acyclic_term/1 are in fact "second order": They say something about the current state of computation, not about the term they are examining.

We define the following four checks, where acyclic_now corresponds exactly to acyclic_term/1 and cyclic_now corresponds exactly to cyclic_term/1. Conditions using those check never throw.

cyclic cyclic_now
(cyclic_term/1)
acyclic_now
(acyclic_term/1)
acyclic_forever
uninstantiated throw (hard mode) false (could change) true (could change) false
nonground acyclic throw (hard mode) false (could change) true (could change) false
ground acylic false (for sure) false (for sure) true (for sure) true
nonground cyclic true (for sure) true (for sure) false (for sure) false
ground cyclic true (for sure) true (for sure) false (for sure) false

Examples

Fail if X is not a string (but throw if X is unbound): check_that(X,[soft(string)])

?- check_that(foo,[soft(string)]).
false.

Throw if X is not a string (or unbound): check_that(X,[hard(string)])

?- check_that(foo,[hard(string)]).
ERROR: check failed : type error (the culprit is not of the required type)
ERROR:    message   : the value should fulfill 'string-ness'
ERROR:    culprit   : foo

Throw if X is not an integer and then fail if X is not a positive integer: check_that(X,[hard(int),soft(posint)])

?- check_that(foo,[hard(int),soft(posint)]).
ERROR: check failed : type error (the culprit is not of the required type)
ERROR:    message   : the value should fulfill 'integer-ness'
ERROR:    culprit   : foo
?- check_that(-1,[hard(int),soft(posint)]).
false.
?- check_that(1,[hard(int),soft(posint)]).
true.

A type of test often done on predicate entry: break off with success if X is unbound, but otherwise it must absolutely be stringy, and even a nonempty stringy (for example): check_that(X,[break(var),hard(stringy),soft(nonempty_stringy)]).

?- check_that(X,[break(var),hard(stringy),soft(nonempty_stringy)]).
true.
?- check_that(12,[break(var),hard(stringy),soft(nonempty_stringy)]).
ERROR: check failed : type error (the culprit is not of the required type)
ERROR:    message   : the value should fulfill 'stringy-ness'
ERROR:    culprit   : 12
?- check_that("",[break(var),hard(stringy),soft(nonempty_stringy)]).
false.
?- check_that("foo",[break(var),hard(stringy),soft(nonempty_stringy)]).
true.

Smooth, soft, tuned, hard failure modes

Using the tuned/1 tag, you can switch from "hard" to "soft" behaviour:

This behaves as if you had written hard(posint):

check_that(X,[hard(int),tuned(posint)],throw)

This behaves as if you had written soft(posint):

check_that(X,[hard(int),tuned(posint)],false)

Succeed if X is unbound, and then fail or throw depending on Throw if X is not a member of the given list:

check_that(X,[break(var),tuned(member([alpha,bravo,charlie]))],Throw).

Running it:

?- check_that(X,[break(var),tuned(member([alpha,bravo,charlie]))],throw).
true.
?- check_that(bar,[break(var),tuned(member([alpha,bravo,charlie]))],throw).
ERROR: check failed : domain error (the culprit is outside the required domain)
ERROR:    message   : the value should fulfill 'list_membership-ness'
ERROR:    culprit   : bar
?- check_that(bar,[break(var),tuned(member([alpha,bravo,charlie]))],false).
false.

Usage scenarios:

  • Use check_that/2 to verify predicate entry preconditions
    • Expecting to switch them off at "production time" (assertions) to gain performance
    • They are also goo "live documentation" saying exactly what parts of the input space are covered by throws and which ones by fails One can rely on "implicitly building" that space via the behaviour of the predicates called in turn, but may become unclear what that space is. (This may be ok depending on coding style)
    • Using check_that/2 "normally" as a guard to make the predicate
      • throw on extremely bad arguments (i.e. badly typed ones)
      • fail on bad arguments (i.e. out-of-domain ones)
    • Logic/Search parts of the program will preferentially fail (or only enter the situation where a fail makes sense)
    • Functional programming parts of the program will preferentially throw (or only enter the situation where a throw makes sense)
  • Use check_that/2 to verify invariants inside of predicates (generally not needed as this is done by checking pre/postconditions in Prolog)
    • Expecting to switch them off at "production time" (assertions) to gain performance
    • TODO: Such cases must be marked as "this is not expected to be violated in running code" (throwing a really nasty exception) And then the test must be switch-offable using a specific hierarchical key, just like you switch logging on or off that way.
  • Use check_that/2 normally in code, as just a particular form of a guard, i.e. it is not expected that they will be switched off
  • (There is no easy way to perform postconditions-on-success in Prolog except by wrapping a predicate with another predicate. Annoying.)

Assertions

  • Assertions are basically the subset of conditions that one does not expect to fail or throw at runtime. The idea is to remove those tests because they "never fail" and the insurance will pick up the slack if they do. Prolog is special in that "failing" is an integral part of its approach, so switching off checks wholesale is not an option (Unless one wants to really have separate instructions for "necessary checks" and "assertion checks") To "remove unnecessary checks" it must both be possible to:
    1. identify them. This can be done my giving them a special name, e.g. lenient_assert instead of just lenient
    2. be able to "remove them" cheaply; this can be done during compilation phase: the marked conditions can be written out
    3. select those which should be removed based on program structure: eg. all those in module XY should be removed this also seems a case for the compilation phase

Design question especially clear in case of complex checking of lists:

  • The structure to test may have multiple layers of testing. For example, for a "proper list of char":
    • X is a var -> fail or exception
    • X is not a proper list -> fail or exception
    • X contains vars -> fail or exception
    • X contains non-chars -> fails or exception
    • and finally success A caller could demand throwing down to any level of the above (and possibly accept var) The proper way to have this flexibility is exactly to use check_that/2 with the detailed more-and-more-precise check sequence, going from strictness to leniency depending on taste, for example check_that(X,[break(var),hard(list),hard(list(nonvar)),tuned(list(char))]) instead of a single monolitic check_that(X,[break(var),hard(chars)]) However, in that case the "rightmost checks" may perform wasteful checks against that we already know will succeed. So there is a need for doing bare-bones checks (maybe?). Probably not worth it. Note that what exception to throw is generally made in this order: X uninstantiated -> throw instantiation error X instantiated but not of the correct type (e.g. expecting integer but it's a float) -> throw type error X of the correct type but not in the correct domain (e.g. expecting positive integer but it's negative) -> throw domain error

Bugs

Good:

?- check_that([a,_,b],hard(chars)).
ERROR: check failed : instantiation error (the culprit is not instantiated (enough))
ERROR:    message   : the value should be instantiated. Can't check for 'atom-ness'
?- check_that([a,_,b],soft(chars)).
ERROR: check failed : instantiation error (the culprit is not instantiated (enough))
ERROR:    message   : the value should be instantiated. Can't check for 'atom-ness'

But:

?- check_that([1,_,b],soft(chars)).
false.

because the above fails at 1. It should throw.