Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Documentation for the 'rules' system and the 'Scheduler' #5

Merged
merged 5 commits into from

3 participants

@markstos

This is a follow-up to the pull request to my previous pull request, in which I created a patch to add sequence-only exceptions for parallel test runs, only to discover that there was an undocumented option to prove called '--rules' which already has this feature.
--rules is handled by TAP::Parser::Scheduler, and much of the parts of the code for --rules and the Scheduler were not documented since they were introduced in 2008.

Despite being difficult to discover by most people, the rules system has been well exercised over time, as it is used by t/harness in the Perl distribution, which uses a more advanced syntax for it.

The ability to mark some tests as a not parallel-ready will be a very welcome feature for users with large test suites. Indeed, it turns out it was key to speeding up the Perl testing process. With this feature available, it can be a fairly simple process to convert a complex test suite to parallel testing:

  1. Start with tests 100% passing as a baseline (or at least inventory those which are failing)
  2. Try a fully parallel test run (or 5 or 10)
  3. Note which tests fail when run in parallel
  4. Add these exceptions to your .proverc using the --rules syntax.

Profit!

There are a couple of discussion points going forward:

  1. In the little documentation that was pre-existing, the rule system was marked as "experimental" in 2008. Is it still experimental because it's been largely undocumented and undiscovered, or is it considered stable now before because it's 4 years old? I would inclined to continue to mark it as experimental until more people can discover it and provide feedback. I left it marked as such in the documentation.

  2. Now that I figured out how to use it via prove, I think the syntax for prove could use particular scrutiny to see if that's what we want to keep, or if we want to simply it further. Here's my concern. The primary use-case I see for having "rules" options in prove is to make some tests as not-parallel-ready. Yet, here's the cumbersome, non-memorable recipe for that:

    # All tests are allowed to run in parallel, except those starting with "p"
    --rules='seq=t/p.t' --rules='par=*'

That's a pretty gross syntax for what could be a common usage. On one hand, people may copy/paste this syntax into their .proverc file and forget. On the other, I think we can do better. My initial inclination is to leave the "--rules" flag exposed for advanced uses, but support a simpler shorthand, for example, this could mean the same as the above:

# Tests starting with "p" are not parallel-ready, they must be run in sequence. 
--not-parallel='t/p*.t' 

There would be some details to work out there, like how the "simple" and advanced syntaxes would interact, but it seems worth considering.

@omega omega commented on the diff
bin/prove
@@ -264,6 +265,61 @@ The C<--state> switch may be used more than once.
$ prove -b --state=hot --state=all,save
+=head2 --rules
+
+The C<--rules> option is used to control which tests are run sequentially and
+which are run in parallel, if the C<--jobs> option is specified. The option may
+be specified multiple times, and the order matters.
+
+The most practical use is likely to specify that some tests are not
+"parallel-ready". Since mentioning a file with --rules doens't cause it to
+selected to run as a test, you can "set and forget" some rules preferences in
+your .proverc file. Then you'll be able to take maximum advantage of the
+performance benefits of parallel testing, while some exceptions are still run
+in parallel.
@omega
omega added a note

did you mean "sequentially" here? Right now it says parallel twice in my mind :)

Correct. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@schwern
Owner

I don't know anything about the rules system, but having docs is better than not having docs. Even if they're wrong, it's a good place to revise from.

@schwern schwern merged commit 99633c2 into from
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Aug 18, 2012
  1. @markstos
  2. @markstos
Commits on Aug 20, 2012
  1. @markstos
  2. @markstos
  3. @markstos

    refine rules docs

    markstos authored
        - Update code comments to quit giving the impression that --rules
          specifiy tests to run.
    
        - Provide link to further documentation if you need it.
This page is out of date. Refresh to see the latest.
View
56 bin/prove
@@ -70,6 +70,7 @@ Options that take arguments:
-j, --jobs N Run N test jobs in parallel (try 9.)
--state=opts Control prove's persistent state.
--rc=rcfile Process options from rcfile
+ --rules Rules for parallel vs sequential processing.
=head1 NOTES
@@ -264,6 +265,61 @@ The C<--state> switch may be used more than once.
$ prove -b --state=hot --state=all,save
+=head2 --rules
+
+The C<--rules> option is used to control which tests are run sequentially and
+which are run in parallel, if the C<--jobs> option is specified. The option may
+be specified multiple times, and the order matters.
+
+The most practical use is likely to specify that some tests are not
+"parallel-ready". Since mentioning a file with --rules doens't cause it to
+selected to run as a test, you can "set and forget" some rules preferences in
+your .proverc file. Then you'll be able to take maximum advantage of the
+performance benefits of parallel testing, while some exceptions are still run
+in parallel.
@omega
omega added a note

did you mean "sequentially" here? Right now it says parallel twice in my mind :)

Correct. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+
+=head3 --rules examples
+
+ # All tests are allowed to run in parallel, except those starting with "p"
+ --rules='seq=t/p*.t' --rules='par=**'
+
+ # All tests must run in sequence except those starting with "p", which should be run parallel
+ --rules='par=t/p*.t'
+
+=head3 --rules resolution
+
+=over4
+
+=item * By default, all tests are eligible to be run in parallel. Specifying any of your own rules removes this one.
+
+=item * "First match wins". The first rule that matches a test will be the one that applies.
+
+=item * Any test which does not match a rule will be run in sequence at the end of the run.
+
+=item * The existence of a rule does not imply selecting a test. You must still specify the tests to run.
+
+=item * Specifying a rule to allow tests to run in parallel does not make the run in parallel. You still need specify the number of parallel C<jobs> in your Harness object.
+
+=back
+
+=head3 --rules Glob-style pattern matching
+
+We implement our own glob-style pattern matching for --rules. Here are the
+supported patterns:
+
+ ** is any number of characters, including /, within a pathname
+ * is zero or more characters within a filename/directory name
+ ? is exactly one character within a filename/directory name
+ {foo,bar,baz} is any of foo, bar or baz.
+ \ is an escape character
+
+=head3 More advance specifications for parallel vs sequence run rules
+
+If you need more advanced management of what runs in parallel vs in sequence, see
+the associated 'rules' documentation in L<TAP::Harness> and L<TAP::Parser::Scheduler>.
+If what's possible directly through C<prove> is not sufficient, you can write your own
+harness to access these features directly.
+
=head2 @INC
prove introduces a separation between "options passed to the perl which
View
45 lib/TAP/Harness.pm
@@ -330,20 +330,37 @@ run only one test at a time.
=item * C<rules>
-A reference to a hash of rules that control which tests may be
-executed in parallel. This is an experimental feature and the
-interface may change.
-
- $harness->rules(
- { par => [
- { seq => '../ext/DB_File/t/*' },
- { seq => '../ext/IO_Compress_Zlib/t/*' },
- { seq => '../lib/CPANPLUS/*' },
- { seq => '../lib/ExtUtils/t/*' },
- '*'
- ]
- }
- );
+A reference to a hash of rules that control which tests may be executed in
+parallel. If no rules are declared, all tests are eligible for being run in
+parallel. Here some simple examples. For the full details of the data structure
+and the related glob-style pattern matching, see
+L<TAP::Parser::Scheduler/"Rules data structure">.
+
+ # Run all tests in sequence, except those starting with "p"
+ $harness->rules({
+ par => 't/p*.t'
+ });
+
+ # Run all tests in parallel, except those starting with "p"
+ $harness->rules({
+ seq => [
+ { seq => 't/p*.t' },
+ { par => '**' },
+ ],
+ });
+
+ # Run some startup tests in sequence, then some parallel tests than some
+ # teardown tests in sequence.
+ $harness->rules({
+ seq => [
+ { seq => 't/startup/*.t' },
+ { par => ['t/a/*.t','t/b/*.t','t/c/*.t'], }
+ { seq => 't/shutdown/*.t' },
+ ],
+
+ });
+
+This is an experimental feature and the interface may change.
=item * C<stdout>
View
145 lib/TAP/Parser/Scheduler.pm
@@ -30,9 +30,98 @@ $VERSION = '3.25';
=head3 C<new>
- my $sched = TAP::Parser::Scheduler->new;
+ my $sched = TAP::Parser::Scheduler->new(tests => \@tests);
+ my $sched = TAP::Parser::Scheduler->new(
+ tests => [ ['t/test_name.t','Test Description'], ... ],
+ rules => \%rules,
+ );
+
+Given 'tests' and optional 'rules' as input, returns a new
+C<TAP::Parser::Scheduler> object. Each member of C<@tests> should be either a
+a test file name, or a two element arrayref, where the first element is a test
+file name, and the second element is a test description. By default, we'll use
+the test name as the description.
+
+The optional C<rules> attribute provides direction on which tests should be run
+in parallel and which should be run sequentially. If no rule data structure is
+provided, a default data structure is used which makes every test eligible to
+be run in parallel:
+
+ { par => '**' },
+
+The rules data structure is documented more in the next section.
+
+=head2 Rules data structure
+
+The "C<rules>" data structure is the the heart of the scheduler. It allows you
+to express simple rules like "run all tests in sequence" or "run all tests in
+parallel except these five tests.". However, the rules structure also supports
+glob-style pattern matching and recursive definitions, so you can also express
+arbitarily complicated patterns.
+
+The rule must only have one top level key: either 'par' for "parallel" or 'seq'
+for "sequence".
+
+Values must be either strings with possible glob-style matching, or arrayrefs
+of strings or hashrefs which follow this pattern recursively.
+
+Every element in an arrayref directly below a 'par' key is eligible to be run
+in parallel, while vavalues directly below a 'seq' key must be run in sequence.
+
+=head3 Rules examples
+
+Here are some examples:
+
+ # All tests be run in parallel (the default rule)
+ { par => '**' },
+
+ # Run all tests in sequence, except those starting with "p"
+ { par => 't/p*.t' },
+
+ # Run all tests in parallel, except those starting with "p"
+ {
+ seq => [
+ { seq => 't/p*.t' },
+ { par => '**' },
+ ],
+ }
+
+ # Run some startup tests in sequence, then some parallel tests than some
+ # teardown tests in sequence.
+ {
+ seq => [
+ { seq => 't/startup/*.t' },
+ { par => ['t/a/*.t','t/b/*.t','t/c/*.t'], }
+ { seq => 't/shutdown/*.t' },
+ ],
+ },
+
-Returns a new C<TAP::Parser::Scheduler> object.
+=head3 Rules resolution
+
+=over4
+
+=item * By default, all tests are eligible to be run in parallel. Specifying any of your own rules removes this one.
+
+=item * "First match wins". The first rule that matches a test will be the one that applies.
+
+=item * Any test which does not match a rule will be run in sequence at the end of the run.
+
+=item * The existence of a rule does not imply selecting a test. You must still specify the tests to run.
+
+=item * Specifying a rule to allow tests to run in parallel does not make the run in parallel. You still need specify the number of parallel C<jobs> in your Harness object.
+
+=back
+
+=head3 Glob-style pattern matching for rules
+
+We implement our own glob-style pattern matching. Here are the patterns it supports:
+
+ ** is any number of characters, including /, within a pathname
+ * is zero or more characters within a filename/directory name
+ ? is exactly one character within a filename/directory name
+ {foo,bar,baz} is any of foo, bar or baz.
+ \ is an escape character
=cut
@@ -70,6 +159,9 @@ sub new {
sub _set_rules {
my ( $self, $rules, $tests ) = @_;
+
+ # Convert all incoming tests to job objects.
+ # If no test description is provided use the file name as the description.
my @tests = map { TAP::Parser::Scheduler::Job->new(@$_) }
map { 'ARRAY' eq ref $_ ? $_ : [ $_, $_ ] } @$tests;
my $schedule = $self->_rule_clause( $rules, \@tests );
@@ -185,6 +277,8 @@ sub _expand {
return @match;
}
+=head2 Instance Methods
+
=head3 C<get_all>
Get a list of all remaining tests.
@@ -207,9 +301,9 @@ sub _gather {
=head3 C<get_job>
-Return the next available job or C<undef> if none are available. Returns
-a C<TAP::Parser::Scheduler::Spinner> if the scheduler still has pending
-jobs but none are available to run right now.
+Return the next available job as L<TAP::Parser::Scheduler::Job> object or
+C<undef> if none are available. Returns a L<TAP::Parser::Scheduler::Spinner> if
+the scheduler still has pending jobs but none are available to run right now.
=cut
@@ -281,9 +375,50 @@ sub _find_next_job {
=head3 C<as_string>
Return a human readable representation of the scheduling tree.
+For example:
+
+ my @tests = (qw{
+ t/startup/foo.t
+ t/shutdown/foo.t
+
+ t/a/foo.t t/b/foo.t t/c/foo.t t/d/foo.t
+ });
+ my $sched = TAP::Parser::Scheduler->new(
+ tests => \@tests,
+ rules => {
+ seq => [
+ { seq => 't/startup/*.t' },
+ { par => ['t/a/*.t','t/b/*.t','t/c/*.t'] },
+ { seq => 't/shutdown/*.t' },
+ ],
+ },
+ );
+
+Produces:
+
+ par:
+ seq:
+ par:
+ seq:
+ par:
+ seq:
+ 't/startup/foo.t'
+ par:
+ seq:
+ 't/a/foo.t'
+ seq:
+ 't/b/foo.t'
+ seq:
+ 't/c/foo.t'
+ par:
+ seq:
+ 't/shutdown/foo.t'
+ 't/d/foo.t'
+
=cut
+
sub as_string {
my $self = shift;
return $self->_as_string( $self->{schedule} );
View
28 lib/TAP/Parser/Scheduler/Job.pm
@@ -31,10 +31,11 @@ Represents a single test 'job'.
=head3 C<new>
my $job = TAP::Parser::Scheduler::Job->new(
- $name, $desc
+ $filename, $description
);
-Returns a new C<TAP::Parser::Scheduler::Job> object.
+Given the filename and description of a test as scalars, returns a new
+L<TAP::Parser::Scheduler::Job> object.
=cut
@@ -47,9 +48,14 @@ sub new {
}, $class;
}
+=head2 Instance Methods
+
=head3 C<on_finish>
-Register a closure to be called when this job is destroyed.
+ $self->on_finish(\&method).
+
+Register a closure to be called when this job is destroyed. The callback
+will be passed the C<TAP::Parser::Scheduler::Job> object as it's only argument.
=cut
@@ -60,7 +66,10 @@ sub on_finish {
=head3 C<finish>
-Called when a job is complete to unlock it.
+ $self->finish;
+
+Called when a job is complete to unlock it. If a callback has been registered
+with C<on_finish>, it calls it. Otherwise, it does nothing.
=cut
@@ -71,6 +80,15 @@ sub finish {
}
}
+=head2 Attributes
+
+ $self->filename;
+ $self->description;
+ $self->context;
+
+These are all "getters" which return the data set for these attributes during object construction.
+
+
=head3 C<filename>
=head3 C<description>
@@ -96,6 +114,8 @@ sub as_array_ref {
=head3 C<is_spinner>
+ $self->is_spinner;
+
Returns false indicating that this is a real job rather than a
'spinner'. Spinners are returned when the scheduler still has pending
jobs but can't (because of locking) return one right now.
View
10 lib/TAP/Parser/Scheduler/Spinner.pm
@@ -34,12 +34,14 @@ return a real job.
my $job = TAP::Parser::Scheduler::Spinner->new;
-Returns a new C<TAP::Parser::Scheduler::Spinner> object.
+Ignores any arguments and returns a new C<TAP::Parser::Scheduler::Spinner> object.
=cut
sub new { bless {}, shift }
+=head2 Instance Methods
+
=head3 C<is_spinner>
Returns true indicating that is a 'spinner' job. Spinners are returned
@@ -50,4 +52,10 @@ return one right now.
sub is_spinner {1}
+=head1 SEE ALSO
+
+L<TAP::Parser::Scheduler>, L<TAP::Parser::Scheduler::Job>
+
+=cut
+
1;
Something went wrong with that request. Please try again.