LPW2013

Daniel Perrett edited this page · 1 revision

Test::Proto - QA Sugar for any* validation problem

Introduction

Can you hear me?

Please let me know if I'm talking too

qr/fast|slow|quiet|loud/

Or if there is anything else preventing you from getting the most out of this talk.

Who am I?

Daniel Perrett
Reference Systems Controller,
Cambridge University Press

(I help make dictionaries and put them on the web.)

  • This involves lots of QA.
  • Lexicographers are hard to find.
  • Now data is mostly JSON, XML, tables, etc.

Thus, big incentive to automate QA.

What is this about?

  • I have a problem.
  • There are some solutions.
  • There's another way of handling this problem.
  • Examples of implementation.
  • Highlights of implementation.
  • Limitations and future development.

What is the problem?

Defining the problem

  • I have a $thing.
  • I don't know what it contains.
  • I know what I want it to contain.
  • Though perhaps not exactly.

Data Validation

Imprecision

When you deal with a lot of data, sometimes you don't want to be doing exact input/output.

Example

my $thing = {
    id=>'12345',
    name=>"200g Ginger Biscuits"
};

if ($thing->{id} > 0) {
    print "\n" . $thing->{id};
    print "\t" . $thing->{name};
}

What can go wrong

  • $thing needs to be a hashref, or it dies
  • The ID needs to be defined, or it warns
  • The ID needs to be a number, or it warns
  • The ID probably needs to be an integer

I just want a yes or no answer!

The alternative is laborious

Defensive programming often requires lots of work, like this big castle

Standard solution

  • IF ref $thing is that of a hashref
  • AND its id is defined
  • AND its id looks like a number
  • AND its id is an integer
  • AND its id is greater than 0

... this means several lines, lots of repetition of $thing->{id}, and checking we've got our orders of precedence right.
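Spelled out in core Perl, the checklist above might look something like this. It is only a sketch: the function name has_positive_integer_id is made up for illustration, and looks_like_number comes from Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: returns 1 if $thing is a hashref whose id is a
# positive integer, and 0 otherwise. One guard per bullet point.
sub has_positive_integer_id {
    my ($thing) = @_;
    return 0 unless ref $thing eq 'HASH';                 # is a hashref
    return 0 unless defined $thing->{id};                 # id is defined
    return 0 unless looks_like_number( $thing->{id} );    # id is numeric
    return 0 unless $thing->{id} == int $thing->{id};     # id is an integer
    return $thing->{id} > 0 ? 1 : 0;                      # id is positive
}

my $thing = { id => '12345', name => "200g Ginger Biscuits" };
my $answer = has_positive_integer_id($thing) ? 'yes' : 'no';
print "$answer\n";
```

Five guards for one yes-or-no answer, and $thing->{id} appears in almost every line.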

Clever solutions

For any given problem, there is probably a shortcut to reduce the number of steps.

  • eval, do
  • blessed, looks_like_number
  • map, grep, first
  • //, short-circuiting

Fast, but idiosyncratic, potentially fragile, and/or unintuitive.

CPAN Solutions

  • Test::Deep, Test::Deep::NoTest
  • Data::Sah
  • Data::DPath, Data::Path

Each does a job well, but not easily extensible.

Problem summary

A general approach to validation that is

  • widely applicable
  • obvious
  • reusable
  • extensible

Contexts

  • Input validation (arguments, Moo(se) attributes)
  • In .t files, ensuring data structures returned by your code are ok
  • Ensuring that a server API is spitting out the JSON or XML you're expecting it to
  • Dispatch

Matcher Coderef

The native matcher type

“We already have a generic ‘matcher’ object in perl: the coderef”

-- Father Chrysostomos, perl5-porters@perl.org, 2013-08-18, via RT

if (sub{shift == 42}->($got)){
    print 'Correct!';
}

Inconvenient to write

  • Argument handling is cumbersome
  • Lots of ugly code
  • Each merge is another level of sub { $foo->($_[0]) and $bar->($_[0])}
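To make the merging problem concrete, here is a rough sketch of combining coderef matchers by hand. The helper all_matchers is hypothetical, written for this illustration only; it is not part of Test::Proto.

```perl
use strict;
use warnings;

# Hypothetical helper: build one matcher that passes only if every
# supplied matcher passes. Stops at the first failure.
sub all_matchers {
    my @matchers = @_;
    return sub {
        my ($subject) = @_;
        for my $matcher (@matchers) {
            return 0 unless $matcher->($subject);
        }
        return 1;
    };
}

my $is_defined  = sub { defined $_[0] };
my $is_positive = sub { $_[0] > 0 };
my $is_small    = sub { $_[0] < 100 };

my $check  = all_matchers( $is_defined, $is_positive, $is_small );
my $answer = $check->(42) ? 'match' : 'no match';
print "$answer\n";
```

It works, but every matcher is still an anonymous sub poking at $_[0], and a failure tells you nothing about which condition went wrong.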

Matcher Object

The Prototype Object

The prototype is an object which embodies a pattern of a type of thing you are looking for.

Calling a method on the object adds a test case.

Once you've composed your prototype, then you run it on the test subject.

You can reuse a prototype, and add more test cases later.

Hardly original

  • Test::Deep: object = condition/structure
  • Data::DPath: object = expression

Test::Proto makes more use of methods.

Validating Scalars

Test::Proto::Base

p means...

Test::Proto::Base->new

...assuming you earlier did...

use Test::Proto qw( p );

Trivial Example

my $pTrue = 
  p->true
   ->ne('NULL')
   ->unlike(qr/^false$/i);
$pTrue->ok($unknown);

ok and validate

validate means "run the test and return a result object".

if ( p->defined->validate($foo) ) { ... }

ok means "run the test, talking to Test::Builder, printing diagnostics, and return a result object"

p->defined->ok($foo);

Creating a prototype

sub{$_[0] eq 'foo'};
p->eq('foo');
p('foo');

sub{$_[0] > 10};
p->num_gt(10);

Upgrading

p($expected) will attempt to upgrade $expected.

Upgrading prototypes is safe: p(p) is p.

my $code = sub{$_[0] eq 'foo'};
p($code);
p->try($code);

p(qr/foo/);
p->like(qr/foo/);

Lots of useful functions

  • like, unlike
  • looks_like_number, looks_unlike_number
  • all_of, some_of, any_of, none_of
  • blessed, ref, is_a, refaddr, refaddr_of

Scalars only

Core assumption

Test::Proto will only handle scalars.

But that's ok, because anything can be a scalar if you reference it.

So, to validate a list, put it into an ArrayRef.

Validating Hashes

hash_of

use Test::Proto qw( pHash );

pHash->hash_of({'name'=>qr/bar/})

i.e. all the keys match and all the values match.

superhash_of, subhash_of

pHash->superhash_of (
    {id => ( p->num_gt(0) )}
);

cf. Test::Deep.

keys

pHash->keys ( ['id', 'name'] );

... but order is not defined!

Luckily, we can use another prototype. We need to know about validating arrays.

Validating Arrays

What can we do with ArrayRefs?

  • Most or all of the useful stuff from List::Util, some of it improved!
  • More general utility functions
  • Sets and Bags
  • Series Validation

Utility functions

  • array_before, array_after
  • ascending, descending, array_all_unique, array_all_same
  • in_groups (e.g. of 3), group_when

Set and bag comparison

This is when you know what something should contain, but not the order.

pHash->keys(
    pArray->bag_of(['name','id'])
);

Set Example

e.g. [4,1,2,3,2,3,1] is a subset of [1..5]

p->subset_of([1..5])
 ->ok([4,1,2,3,2,3,1]);
p->superset_of([1..5])
 ->ok([1,2,1,3..6]);

Note that you can use items in a set repeatedly.

Bag Example

A bag is different. You can use items in a bag only once (but you may have them in the bag more than once).

p->subbag_of([1..5])
 ->ok([4,1,2,3,2,3,1]); # fails!
p->subbag_of([1..5,1..3])
 ->ok([4,1,2,3,2,3,1]); # passes!

Remember the difference!

Items taken out of a bag cannot be reused
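The difference can be sketched in core Perl with plain string equality. This is not how Test::Proto implements it (which also accepts prototypes as members); is_subset and is_subbag are names invented for this illustration.

```perl
use strict;
use warnings;

# Set semantics: every item in $got must appear in $allowed.
# An allowed item may be matched any number of times.
sub is_subset {
    my ( $got, $allowed ) = @_;
    my %in_set = map { $_ => 1 } @$allowed;
    return ( grep { !$in_set{$_} } @$got ) ? 0 : 1;
}

# Bag semantics: each occurrence in $allowed can be used only once,
# so we count occurrences and use them up as we go.
sub is_subbag {
    my ( $got, $allowed ) = @_;
    my %remaining;
    $remaining{$_}++ for @$allowed;
    for my $item (@$got) {
        return 0 unless $remaining{$item};
        $remaining{$item}--;
    }
    return 1;
}

my $got = [ 4, 1, 2, 3, 2, 3, 1 ];
printf "subset of 1..5: %d\n",       is_subset( $got, [ 1 .. 5 ] );
printf "subbag of 1..5: %d\n",       is_subbag( $got, [ 1 .. 5 ] );
printf "subbag of 1..5,1..3: %d\n",  is_subbag( $got, [ 1 .. 5, 1 .. 3 ] );
```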

You can use prototypes

p->subbag_of([p, p->ne('3'), '2'])
 ->ok([1,2,3]);

An Ancient Bug in Test::Deep

For this I guess most people normally use Test::Deep.

But it is buggy. It has had a known (and documented) bug for 10 years:

cmp_deeply(
    ['furball', 'furry'], 
    bag(re("^fur"), re("furb"))
)

Solved

Test::Proto solves this bug by calculating all the matches and determining if any combination can work.

       furball furry 
/^fur/    1      1
/furb/    1      0

The downside is it becomes slow when the table is very large!
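One way to picture "determining if any combination can work" is a backtracking search over that table. The sketch below is an assumption about the general technique, not Test::Proto's actual code; bag_matches and _assign are hypothetical names, and the matchers are plain coderefs.

```perl
use strict;
use warnings;

# Try to give matcher number $m (and all later ones) a distinct item,
# undoing a choice whenever the remaining matchers cannot be satisfied.
sub _assign {
    my ( $items, $matchers, $m, $used ) = @_;
    return 1 if $m == @$matchers;    # every matcher has its own item
    for my $i ( 0 .. $#$items ) {
        next if $used->{$i};
        next unless $matchers->[$m]->( $items->[$i] );
        $used->{$i} = 1;
        return 1 if _assign( $items, $matchers, $m + 1, $used );
        delete $used->{$i};          # backtrack and try the next item
    }
    return 0;
}

# A bag matches when items and matchers pair off one-to-one.
sub bag_matches {
    my ( $items, $matchers ) = @_;
    return 0 unless @$items == @$matchers;
    return _assign( $items, $matchers, 0, {} );
}

my @matchers = ( sub { $_[0] =~ /^fur/ }, sub { $_[0] =~ /furb/ } );
my $answer = bag_matches( [ 'furball', 'furry' ], \@matchers ) ? 'ok' : 'not ok';
print "$answer\n";
```

A greedy matcher gives 'furball' to /^fur/ and then has nothing left for /furb/. The search backs out of that choice and finds the pairing that works, at the cost of exploring many combinations when the table is large.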

Series Comparison

If you've ever wanted to use regex-like quantifiers and groupings (i.e. ?, *, +, (?:...)) on arrays, here's how:

pArray->contains_only(
    pSeries(
        pRepeatable(
            pAlternation('A', 'B'),
        )->max(5),
        'C'
    )
)->ok(['B','B','A','C']); 

Object Validation

$p->method('open')->ok($subject); 
    # i.e. 'can'

$p->method_list_context(
    'open', ['test.txt','>'], [$pFileHandle]
)->ok($subject);

$p->method_void_context(
    'open', ['test.txt','>']
)->ok($subject);

Extensibility

Any data can be semantic

Even things which are not objects can have layers of meaning.

Example

"CBEDUK7408.WAV" is a string. It's also:

  • a potentially valid filename
  • a name of a file which actually exists
  • a British English soundfile
  • recorded for the Business English Dictionary

Why is this interesting

Perhaps I want to compose tests for these.

pSoundFileName
    ->is_british
    ->file_exists
    ->dict(p->any('CBED', 'CEED'))

Subclassing

package SoundFileName;
use strict; use warnings;
use Moo;
use Test::Proto::Common;
extends 'Test::Proto::Base';
with 'Test::Proto::Value';

# ...

Define your tests: add another test

sub is_british {
    shift->like(qr/UK/i, @_);
}

This test just wraps like. However, it appears in the diags as like, not as is_british.

Define your tests: simple_test

simple_test file_exists => sub {
    return -e shift;
};

This is a simple test which has a true/false return, but doesn't do any subtests.

Define your tests: define_test

sub dict {
    my ( $self, $expected, $reason ) = @_;
    $self->add_test(
        'dict',
        { expected => $expected },
        $reason
    );      
}
define_test dict => sub {
    my ( $runner, $data, $reason ) = @_;
    my $args = $data->{args};
    my $expected = upgrade($data->{expected});
    my $dict = substr($runner->subject, 0, 4);
    return $expected->validate( $dict, $runner );
};

This upgrades $expected to a prototype and validates the first four characters of the subject against it.

Dispatch

Experimental

This is how I envisaged given/when to work.

use Test::Proto::Where;

my $something = {foo=>'bar'};

print test_subject $something =>
            where [], sub{ 'Empty array' },
            where pHash, sub{ 'A hash' },
            otherwise sub { 'Something else' };
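For comparison, the same idea can be sketched with bare coderefs in core Perl. The dispatch helper is hypothetical and has nothing to do with how Test::Proto::Where is implemented; it just runs the handler paired with the first matcher that passes.

```perl
use strict;
use warnings;

# Hypothetical helper: @cases is a flat list of matcher/handler pairs.
# Returns the result of the first handler whose matcher accepts $subject.
sub dispatch {
    my ( $subject, @cases ) = @_;
    while ( my ( $matcher, $handler ) = splice @cases, 0, 2 ) {
        return $handler->($subject) if $matcher->($subject);
    }
    return;
}

my @cases = (
    sub { ref $_[0] eq 'ARRAY' && !@{ $_[0] } }, sub { 'Empty array' },
    sub { ref $_[0] eq 'HASH' },                 sub { 'A hash' },
    sub { 1 },                                   sub { 'Something else' },
);

my $something = { foo => 'bar' };
my $result = dispatch( $something, @cases );
print "$result\n";
```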

Limitations

Verbose for certain operations

@$got > 3;

pArray->array_length(
    p->num_gt(3)
  )->validate($got);

... but still safer.
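Why safer? A small core-Perl sketch, with longer_than as a hypothetical stand-in for the guarded check: the terse form dies under strict refs if $got turns out not to be an arrayref, while the guarded form just answers no.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for an arrayref with more than $n items.
sub longer_than {
    my ( $got, $n ) = @_;
    return ( ref $got eq 'ARRAY' && @$got > $n ) ? 1 : 0;
}

my $got = 'not an array';

# The terse form throws an exception when $got is a plain string,
# so eval returns undef here.
my $terse = eval { @$got > 3 };
print "terse check died\n" unless defined $terse;

# The guarded form gives a plain yes-or-no answer instead.
my $answer = longer_than( $got, 3 ) ? 'yes' : 'no';
print "$answer\n";
```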

Performance?

  • Currently creates a deeply nested result object for each call to validate or ok.
  • Some development could mitigate this if important.
  • Never going to be as fast as raw perl

Perverse Subjects

my $tabby = 
  Acme::Cat::Schroedinger->new();
p([])->ok($tabby);
my $ginger = 
  Acme::Cat::Schroedinger->new();
p({})->ok($ginger);
p({})->ok($tabby); # fails!

It is possible to create objects which save state or mutate their properties or behaviour when inspected, whether predictably, randomly or perversely!

The future!

Contributors welcome!

github.com/pdl/Test-Proto

Todo

  • Handle scalar refs
  • Validate regex (as opposed to using regex)
  • Make the formatter prettier (or make more formatters)
  • Parallelising tests
  • Upgrade Test::Deep objects

Bugs?

  • Decent Code Coverage: 800+ tests, coverage report >95%
  • But I'm sure there are bugs
  • If you find any, please report them!

The End

Any questions?

...

Thank you!

  • Castle image © M. Benoist, CC-BY-SA 3.0
  • Bag image © Bengt B, CC BY-SA 1.0