
Documentation for the 'rules' system and the 'Scheduler' #5

Merged
merged 5 commits into from about 1 year ago

3 participants

Mark Stosberg Michael G. Schwern Andreas Marienborg
Mark Stosberg

This is a follow-up to my previous pull request, in which I created a patch to add sequence-only exceptions for parallel test runs, only to discover that prove already has an undocumented option, '--rules', which provides this feature.
--rules is handled by TAP::Parser::Scheduler, and much of the code for --rules and the Scheduler has gone undocumented since it was introduced in 2008.

Despite being difficult for most people to discover, the rules system has been well exercised over time, as it is used by t/harness in the Perl distribution, which uses a more advanced syntax for it.

The ability to mark some tests as not parallel-ready will be a very welcome feature for users with large test suites. Indeed, it turns out it was key to speeding up the Perl testing process. With this feature available, it can be a fairly simple process to convert a complex test suite to parallel testing:

  1. Start with tests 100% passing as a baseline (or at least inventory those which are failing)
  2. Try a fully parallel test run (or 5 or 10)
  3. Note which tests fail when run in parallel
  4. Add these exceptions to your .proverc using the --rules syntax.

Profit!
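
Before committing rules to .proverc in step 4, it can help to see how a rule set schedules a given list of tests. The following is a minimal sketch, not part of this patch: it builds a TAP::Parser::Scheduler directly, using the rules structure that the documentation in this pull request gives for "run all tests in parallel, except those starting with p", prints the scheduling tree, and then drains the jobs the way a serial harness loop would. The test file names are made up for illustration; the scheduler only matches on names, so the files do not need to exist.

```perl
use strict;
use warnings;
use TAP::Parser::Scheduler;

# Hypothetical test file names, used only for pattern matching.
my @tests = qw(
    t/alpha.t t/beta.t t/parse.t t/print.t
);

# Rules structure documented in this pull request for
# "run all tests in parallel, except those starting with p".
my $sched = TAP::Parser::Scheduler->new(
    tests => \@tests,
    rules => {
        seq => [
            { seq => 't/p*.t' },
            { par => '**' },
        ],
    },
);

# Show the scheduling tree that the rules produce.
print $sched->as_string;

# Drain the schedule one job at a time, as a serial harness would.
my @order;
while ( my $job = $sched->get_job ) {
    next if $job->is_spinner;    # nothing runnable right now; ask again
    push @order, $job->filename;
    $job->finish;                # unlock whatever was waiting on this job
}
print "$_\n" for @order;
```

Nothing in this sketch runs in parallel by itself. The rules only say which tests are eligible, and an actual parallel run still needs --jobs.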

There are a couple of discussion points going forward:

  1. In the little documentation that was pre-existing, the rule system was marked as "experimental" in 2008. Is it still experimental because it's been largely undocumented and undiscovered, or is it considered stable now because it's 4 years old? I would be inclined to continue to mark it as experimental until more people can discover it and provide feedback. I left it marked as such in the documentation.

  2. Now that I have figured out how to use it via prove, I think the syntax for prove could use particular scrutiny to see if that's what we want to keep, or if we want to simplify it further. Here's my concern. The primary use-case I see for having "rules" options in prove is to mark some tests as not-parallel-ready. Yet, here's the cumbersome, non-memorable recipe for that:

    # All tests are allowed to run in parallel, except those starting with "p"
    --rules='seq=t/p*.t' --rules='par=**'

That's a pretty gross syntax for what could be a common usage. On one hand, people may copy/paste this syntax into their .proverc file and forget. On the other, I think we can do better. My initial inclination is to leave the "--rules" flag exposed for advanced uses, but support a simpler shorthand, for example, this could mean the same as the above:

    # Tests starting with "p" are not parallel-ready, they must be run in sequence.
    --not-parallel='t/p*.t'

There would be some details to work out there, like how the "simple" and advanced syntaxes would interact, but it seems worth considering.
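
One possible way the simple form could map onto the advanced one is sketched below. This is purely hypothetical: neither a --not-parallel option nor the helper name rules_from_not_parallel exists in prove or TAP::Harness. The sketch only shows that a list of "not parallel-ready" globs could be expanded into the same rules structure that --rules='seq=t/p*.t' --rules='par=**' is documented to express.

```perl
use strict;
use warnings;

# Hypothetical helper: expand "not parallel-ready" globs into a rules
# structure. The name and the option it would back are assumptions.
sub rules_from_not_parallel {
    my @globs = @_;
    return {
        seq => [
            ( map { +{ seq => $_ } } @globs ),    # these run one at a time
            { par => '**' },                      # everything else may run in parallel
        ],
    };
}

my $rules = rules_from_not_parallel('t/p*.t');
```

Placing the seq entries ahead of the catch-all par entry matters because the first rule that matches a test is the one that applies.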

Andreas Marienborg omega commented on the diff August 27, 2012
bin/prove
@@ -264,6 +265,61 @@ The C<--state> switch may be used more than once.
264 265
 
265 266
     $ prove -b --state=hot --state=all,save
266 267
 
  268
+=head2 --rules
  269
+
  270
+The C<--rules> option is used to control which tests are run sequentially and
  271
+which are run in parallel, if the C<--jobs> option is specified. The option may
  272
+be specified multiple times, and the order matters.
  273
+
  274
+The most practical use is likely to specify that some tests are not
  275
+"parallel-ready".  Since mentioning a file with --rules doesn't cause it to
  276
+be selected to run as a test, you can "set and forget" some rules preferences in
  277
+your .proverc file. Then you'll be able to take maximum advantage of the
  278
+performance benefits of parallel testing, while some exceptions are still run
  279
+in parallel.
2
Andreas Marienborg
omega added a note August 27, 2012

did you mean "sequentially" here? Right now it says parallel twice in my mind :)

Mark Stosberg
markstos added a note August 28, 2012

Correct. Thank you!

Michael G. Schwern
Owner

I don't know anything about the rules system, but having docs is better than not having docs. Even if they're wrong, it's a good place to revise from.

Michael G. Schwern schwern merged commit 99633c2 into from April 02, 2013
Michael G. Schwern schwern closed this April 02, 2013

Showing 5 unique commits by 1 author.

Aug 18, 2012
Mark Stosberg Document what's going with this: map { 'ARRAY' eq ref $_ ? $_ : [ $_,…
… $_ ] } @$tests;
ba38c97
Mark Stosberg Only new() is a class method. The rest are instance methods. 28943fd
Aug 20, 2012
Mark Stosberg Attempting to complete docs for 'rules' and 'Scheduler' in TAP/* modu…
…les.
a6007fb
Mark Stosberg Document the --rules option for prove 0a2034f
Mark Stosberg refine rules docs
    - Update code comments to quit giving the impression that --rules
      specifies tests to run.

    - Provide link to further documentation if you need it.
02d8c9e
56  bin/prove
@@ -70,6 +70,7 @@ Options that take arguments:
70 70
  -j,  --jobs N          Run N test jobs in parallel (try 9.)
71 71
       --state=opts      Control prove's persistent state.
72 72
       --rc=rcfile       Process options from rcfile
  73
+      --rules           Rules for parallel vs sequential processing.
73 74
 
74 75
 =head1 NOTES
75 76
 
@@ -264,6 +265,61 @@ The C<--state> switch may be used more than once.
264 265
 
265 266
     $ prove -b --state=hot --state=all,save
266 267
 
  268
+=head2 --rules
  269
+
  270
+The C<--rules> option is used to control which tests are run sequentially and
  271
+which are run in parallel, if the C<--jobs> option is specified. The option may
  272
+be specified multiple times, and the order matters.
  273
+
  274
+The most practical use is likely to specify that some tests are not
  275
+"parallel-ready".  Since mentioning a file with --rules doesn't cause it to
  276
+be selected to run as a test, you can "set and forget" some rules preferences in
  277
+your .proverc file. Then you'll be able to take maximum advantage of the
  278
+performance benefits of parallel testing, while some exceptions are still run
  279
+in parallel.
  280
+
  281
+=head3 --rules examples
  282
+
  283
+    # All tests are allowed to run in parallel, except those starting with "p"
  284
+    --rules='seq=t/p*.t' --rules='par=**'
  285
+
  286
+    # All tests must run in sequence except those starting with "p", which should be run parallel
  287
+    --rules='par=t/p*.t'
  288
+
  289
+=head3 --rules resolution
  290
+
  291
+=over 4
  292
+
  293
+=item * By default, all tests are eligible to be run in parallel. Specifying any of your own rules removes this one.
  294
+
  295
+=item * "First match wins". The first rule that matches a test will be the one that applies.
  296
+
  297
+=item * Any test which does not match a rule will be run in sequence at the end of the run.
  298
+
  299
+=item * The existence of a rule does not imply selecting a test. You must still specify the tests to run.
  300
+
  301
+=item * Specifying a rule to allow tests to run in parallel does not make them run in parallel. You still need to specify the number of parallel C<jobs> in your Harness object.
  302
+
  303
+=back
  304
+
  305
+=head3 --rules Glob-style pattern matching
  306
+
  307
+We implement our own glob-style pattern matching for --rules. Here are the
  308
+supported patterns:
  309
+
  310
+    ** is any number of characters, including /, within a pathname
  311
+    * is zero or more characters within a filename/directory name
  312
+    ? is exactly one character within a filename/directory name
  313
+    {foo,bar,baz} is any of foo, bar or baz.
  314
+    \ is an escape character
  315
+
  316
+=head3 More advanced specifications for parallel vs sequence run rules
  317
+
  318
+If you need more advanced management of what runs in parallel vs in sequence, see
  319
+the associated 'rules' documentation in L<TAP::Harness> and L<TAP::Parser::Scheduler>.
  320
+If what's possible directly through C<prove> is not sufficient, you can write your own
  321
+harness to access these features directly.
  322
+
267 323
 =head2 @INC
268 324
 
269 325
 prove introduces a separation between "options passed to the perl which
45  lib/TAP/Harness.pm
@@ -330,20 +330,37 @@ run only one test at a time.
330 330
 
331 331
 =item * C<rules>
332 332
 
333  
-A reference to a hash of rules that control which tests may be
334  
-executed in parallel. This is an experimental feature and the
335  
-interface may change.
336  
-
337  
-    $harness->rules(
338  
-        {   par => [
339  
-                { seq => '../ext/DB_File/t/*' },
340  
-                { seq => '../ext/IO_Compress_Zlib/t/*' },
341  
-                { seq => '../lib/CPANPLUS/*' },
342  
-                { seq => '../lib/ExtUtils/t/*' },
343  
-                '*'
344  
-            ]
345  
-        }
346  
-    );
  333
+A reference to a hash of rules that control which tests may be executed in
  334
+parallel. If no rules are declared, all tests are eligible for being run in
  335
+parallel. Here are some simple examples. For the full details of the data structure
  336
+and the related glob-style pattern matching, see
  337
+L<TAP::Parser::Scheduler/"Rules data structure">.
  338
+
  339
+    # Run all tests in sequence, except those starting with "p"
  340
+    $harness->rules({
  341
+        par => 't/p*.t'
  342
+    });
  343
+
  344
+    # Run all tests in parallel, except those starting with "p"
  345
+    $harness->rules({
  346
+        seq => [
  347
+                  { seq => 't/p*.t' },
  348
+                  { par => '**'     },
  349
+               ],
  350
+    });
  351
+
  352
+    # Run some startup tests in sequence, then some parallel tests, then some
  353
+    # teardown tests in sequence.
  354
+    $harness->rules({
  355
+        seq => [
  356
+            { seq => 't/startup/*.t' },
  357
+            { par => ['t/a/*.t','t/b/*.t','t/c/*.t'] },
  358
+            { seq => 't/shutdown/*.t' },
  359
+        ],
  360
+
  361
+    });
  362
+
  363
+This is an experimental feature and the interface may change.
347 364
 
348 365
 =item * C<stdout>
349 366
 
145  lib/TAP/Parser/Scheduler.pm
@@ -30,9 +30,98 @@ $VERSION = '3.25';
30 30
 
31 31
 =head3 C<new>
32 32
 
33  
-    my $sched = TAP::Parser::Scheduler->new;
  33
+    my $sched = TAP::Parser::Scheduler->new(tests => \@tests);
  34
+    my $sched = TAP::Parser::Scheduler->new(
  35
+        tests => [ ['t/test_name.t','Test Description'], ... ],
  36
+        rules => \%rules,
  37
+    );
  38
+
  39
+Given 'tests' and optional 'rules' as input, returns a new
  40
+C<TAP::Parser::Scheduler> object.  Each member of C<@tests> should be either a
  41
+test file name, or a two-element arrayref, where the first element is a test
  42
+file name, and the second element is a test description. By default, we'll use
  43
+the test name as the description.
  44
+
  45
+The optional C<rules> attribute provides direction on which tests should be run
  46
+in parallel and which should be run sequentially. If no rule data structure is
  47
+provided, a default data structure is used which makes every test eligible to
  48
+be run in parallel:
  49
+
  50
+    { par => '**' },
  51
+
  52
+The rules data structure is documented more in the next section.
  53
+
  54
+=head2 Rules data structure
  55
+
  56
+The "C<rules>" data structure is the heart of the scheduler. It allows you
  57
+to express simple rules like "run all tests in sequence" or "run all tests in
  58
+parallel except these five tests". However, the rules structure also supports
  59
+glob-style pattern matching and recursive definitions, so you can also express
  60
+arbitrarily complicated patterns.
  61
+
  62
+The rule must only have one top level key: either 'par' for "parallel" or 'seq'
  63
+for "sequence".
  64
+
  65
+Values must be either strings with possible glob-style matching, or arrayrefs
  66
+of strings or hashrefs which follow this pattern recursively.
  67
+
  68
+Every element in an arrayref directly below a 'par' key is eligible to be run
  69
+in parallel, while values directly below a 'seq' key must be run in sequence.
  70
+
  71
+=head3 Rules examples
  72
+
  73
+Here are some examples:
  74
+
  75
+    # All tests may be run in parallel (the default rule)
  76
+    { par => '**' },
  77
+
  78
+    # Run all tests in sequence, except those starting with "p"
  79
+    { par => 't/p*.t' },
  80
+
  81
+    # Run all tests in parallel, except those starting with "p"
  82
+    {
  83
+        seq => [
  84
+                  { seq => 't/p*.t' },
  85
+                  { par => '**'     },
  86
+               ],
  87
+    }
  88
+
  89
+    # Run some startup tests in sequence, then some parallel tests, then some
  90
+    # teardown tests in sequence.
  91
+    {
  92
+        seq => [
  93
+            { seq => 't/startup/*.t' },
  94
+            { par => ['t/a/*.t','t/b/*.t','t/c/*.t'] },
  95
+            { seq => 't/shutdown/*.t' },
  96
+        ],
  97
+    },
  98
+
34 99
 
35  
-Returns a new C<TAP::Parser::Scheduler> object.
  100
+=head3 Rules resolution
  101
+
  102
+=over 4
  103
+
  104
+=item * By default, all tests are eligible to be run in parallel. Specifying any of your own rules removes this one.
  105
+
  106
+=item * "First match wins". The first rule that matches a test will be the one that applies.
  107
+
  108
+=item * Any test which does not match a rule will be run in sequence at the end of the run.
  109
+
  110
+=item * The existence of a rule does not imply selecting a test. You must still specify the tests to run.
  111
+
  112
+=item * Specifying a rule to allow tests to run in parallel does not make them run in parallel. You still need to specify the number of parallel C<jobs> in your Harness object.
  113
+
  114
+=back
  115
+
  116
+=head3 Glob-style pattern matching for rules
  117
+
  118
+We implement our own glob-style pattern matching. Here are the patterns it supports:
  119
+
  120
+    ** is any number of characters, including /, within a pathname
  121
+    * is zero or more characters within a filename/directory name
  122
+    ? is exactly one character within a filename/directory name
  123
+    {foo,bar,baz} is any of foo, bar or baz.
  124
+    \ is an escape character
36 125
 
37 126
 =cut
38 127
 
@@ -70,6 +159,9 @@ sub new {
70 159
 
71 160
 sub _set_rules {
72 161
     my ( $self, $rules, $tests ) = @_;
  162
+
  163
+    # Convert all incoming tests to job objects. 
  164
+    # If no test description is provided use the file name as the description. 
73 165
     my @tests = map { TAP::Parser::Scheduler::Job->new(@$_) }
74 166
       map { 'ARRAY' eq ref $_ ? $_ : [ $_, $_ ] } @$tests;
75 167
     my $schedule = $self->_rule_clause( $rules, \@tests );
@@ -185,6 +277,8 @@ sub _expand {
185 277
     return @match;
186 278
 }
187 279
 
  280
+=head2 Instance Methods
  281
+
188 282
 =head3 C<get_all>
189 283
 
190 284
 Get a list of all remaining tests.
@@ -207,9 +301,9 @@ sub _gather {
207 301
 
208 302
 =head3 C<get_job>
209 303
 
210  
-Return the next available job or C<undef> if none are available. Returns
211  
-a C<TAP::Parser::Scheduler::Spinner> if the scheduler still has pending
212  
-jobs but none are available to run right now.
  304
+Return the next available job as a L<TAP::Parser::Scheduler::Job> object or
  305
+C<undef> if none are available. Returns a L<TAP::Parser::Scheduler::Spinner> if
  306
+the scheduler still has pending jobs but none are available to run right now.
213 307
 
214 308
 =cut
215 309
 
@@ -281,9 +375,50 @@ sub _find_next_job {
281 375
 =head3 C<as_string>
282 376
 
283 377
 Return a human readable representation of the scheduling tree.
  378
+For example:
  379
+
  380
+    my @tests = (qw{
  381
+        t/startup/foo.t 
  382
+        t/shutdown/foo.t
  383
+    
  384
+        t/a/foo.t t/b/foo.t t/c/foo.t t/d/foo.t
  385
+    });
  386
+    my $sched = TAP::Parser::Scheduler->new(
  387
+        tests => \@tests,
  388
+        rules => {
  389
+            seq => [
  390
+                { seq => 't/startup/*.t' },
  391
+                { par => ['t/a/*.t','t/b/*.t','t/c/*.t'] },
  392
+                { seq => 't/shutdown/*.t' },
  393
+            ],
  394
+        },
  395
+    );
  396
+
  397
+Produces:
  398
+
  399
+    par:
  400
+      seq:
  401
+        par:
  402
+          seq:
  403
+            par:
  404
+              seq:
  405
+                't/startup/foo.t'
  406
+            par:
  407
+              seq:
  408
+                't/a/foo.t'
  409
+              seq:
  410
+                't/b/foo.t'
  411
+              seq:
  412
+                't/c/foo.t'
  413
+            par:
  414
+              seq:
  415
+                't/shutdown/foo.t'
  416
+        't/d/foo.t'
  417
+
284 418
 
285 419
 =cut
286 420
 
  421
+
287 422
 sub as_string {
288 423
     my $self = shift;
289 424
     return $self->_as_string( $self->{schedule} );
28  lib/TAP/Parser/Scheduler/Job.pm
@@ -31,10 +31,11 @@ Represents a single test 'job'.
31 31
 =head3 C<new>
32 32
 
33 33
     my $job = TAP::Parser::Scheduler::Job->new(
34  
-        $name, $desc 
  34
+        $filename, $description
35 35
     );
36 36
 
37  
-Returns a new C<TAP::Parser::Scheduler::Job> object.
  37
+Given the filename and description of a test as scalars, returns a new
  38
+L<TAP::Parser::Scheduler::Job> object.
38 39
 
39 40
 =cut
40 41
 
@@ -47,9 +48,14 @@ sub new {
47 48
     }, $class;
48 49
 }
49 50
 
  51
+=head2 Instance Methods
  52
+
50 53
 =head3 C<on_finish>
51 54
 
52  
-Register a closure to be called when this job is destroyed.
  55
+    $self->on_finish(\&method);
  56
+
  57
+Register a closure to be called when this job is destroyed. The callback
  58
+will be passed the C<TAP::Parser::Scheduler::Job> object as its only argument.
53 59
 
54 60
 =cut
55 61
 
@@ -60,7 +66,10 @@ sub on_finish {
60 66
 
61 67
 =head3 C<finish>
62 68
 
63  
-Called when a job is complete to unlock it.
  69
+   $self->finish;
  70
+
  71
+Called when a job is complete to unlock it. If a callback has been registered
  72
+with C<on_finish>, it calls it. Otherwise, it does nothing. 
64 73
 
65 74
 =cut
66 75
 
@@ -71,6 +80,15 @@ sub finish {
71 80
     }
72 81
 }
73 82
 
  83
+=head2 Attributes
  84
+
  85
+  $self->filename;
  86
+  $self->description;
  87
+  $self->context;
  88
+
  89
+These are all "getters" which return the data set for these attributes during object construction.
  90
+
  91
+
74 92
 =head3 C<filename>
75 93
 
76 94
 =head3 C<description>
@@ -96,6 +114,8 @@ sub as_array_ref {
96 114
 
97 115
 =head3 C<is_spinner>
98 116
 
  117
+  $self->is_spinner;
  118
+
99 119
 Returns false indicating that this is a real job rather than a
100 120
 'spinner'. Spinners are returned when the scheduler still has pending
101 121
 jobs but can't (because of locking) return one right now.
10  lib/TAP/Parser/Scheduler/Spinner.pm
@@ -34,12 +34,14 @@ return a real job.
34 34
 
35 35
     my $job = TAP::Parser::Scheduler::Spinner->new;
36 36
 
37  
-Returns a new C<TAP::Parser::Scheduler::Spinner> object.
  37
+Ignores any arguments and returns a new C<TAP::Parser::Scheduler::Spinner> object.
38 38
 
39 39
 =cut
40 40
 
41 41
 sub new { bless {}, shift }
42 42
 
  43
+=head2 Instance Methods
  44
+
43 45
 =head3 C<is_spinner>
44 46
 
45 47
 Returns true indicating that is a 'spinner' job. Spinners are returned
@@ -50,4 +52,10 @@ return one right now.
50 52
 
51 53
 sub is_spinner {1}
52 54
 
  55
+=head1 SEE ALSO
  56
+
  57
+L<TAP::Parser::Scheduler>, L<TAP::Parser::Scheduler::Job>
  58
+
  59
+=cut
  60
+
53 61
 1;