Generators: testcase numbering #274

RagnarGrootKoerkamp · 2023-06-29T21:57:52Z

This follows from the discussion in #262 (particularly #262 (comment)).

To ensure that included testcases (from other testgroups/directories) are run in order, they must be numbered with a lower number. Kattis and/or problemtools automatically dedup testcases with the same name, so we can not 'renumber' testcases once they are included into a different directory.

This somewhat implies the need for a global numbering of testcases.
Also, note that yaml explicitly specified that directories are unordered, and hence in these cases we use alphabetical order (not the order as specified in the file).

We could number in a few ways:

Do per-testgroup numbering after all

Number everything lexicographically using in-order:

1-easy/2-test.in
1-easy/3-test.in
4-hard/5-test.in
6-last.in

Number testgroups first and then testcases

1-easy/3-test.in
1-easy/4-test.in
2-hard/5-test.in
6-last.in

Number testgroups and testcases individually
```
1-easy/1-test.in
1-easy/2-test.in
2-hard/3-test.in
4-last.in
```
As long as each group contains at least one generated case, numbering conflicts shouldn't happen.

Non-numbered testgroups/testcases can be parents of numbered ones that are just skipped in the global numbering. I don't think that should cause complications.

I think this last option is quite reasonable.

thorehusfeldt · 2023-06-28T11:59:27Z

I have no strong opinion.

For completeness, testdata_tools autonumbers testcases (never testgroups), and does so in order of appearance in the script. An underlying assumption is that testgroup names appear in lexicographic order in the script.

I guess that if I wrote a problem like this:

sample:
  '1': graph --n 2 --m 1 
  '2': graph --n 3 --m 2
secret:
  - 'paths':
   - ...
  - 'trees':
    - ...
  - 'undirected':
    - ...
  - 'directed':
    - ...

and 82 secret testcases (all specified as list items), I’d like the testcase names to be

sample/1.in
sample/2.in
secret/1-paths/01-pathgraph-n-3.in
secret/1-paths/02-pathgraph-n-10.in
secret/2-trees/03-complete-binary-tree-depth-2.in

and so on. (Because I want to think of the secret testcases as 01, 02, ….)

Trying to reengineer this rule, I’m in favour of your last rule. (In particular, separate counters for testcases and testgroups.)

I think testgroups can be individually numbered per-parent.

I admit I have no experience with more than two levels of nesting, the first of which is the never-numbered sample / secret, so it’s only child test groups of secret that could ever get a number. But it feels weird to me that 2-trees (which Kattis will display as “Group 2” in the interface) should change its name to 3-trees just because 1-paths has received a numbered child testgroup during problem development.

So to expand on your last example:

1-easy/1-test.in
1-easy/2-test.in
2-hard/1-undirected/3-test.in
2-hard/2-directed/4-test.in
2-hard/5-test.in
6-last.in

Note that evaluation order in testgroup 2-hard now depends on the number of testcases somewhere else (because test.in could have become 05-test.in and the grader would view that name as “earlier” than the two testgroups.) That’s just the way it is, I think.

RagnarGrootKoerkamp · 2023-06-28T12:46:05Z

Testcase numbering: I agree that globally unique numbers independent of testgroups make sense and are intuitive ✔️
Testgroup numbering: Right, per directory numbering feels a bit more intuitive and clean, as global ordering isn't necessary here.

Indeed the two orderings can interfere in a slightly non-intuitive way, but if that's acceptable, this does seem like a clean way to go about things. I'll try to implement it.

RagnarGrootKoerkamp · 2023-06-29T21:59:58Z

@thorehusfeldt I've pushed an implementation for global testcase numbering to this pr (numbering branch).

1st commit is implementation
2nd commit updates the in-repo numbering for test/problems/identity
3rd commit adds your test zip from earlier to test/problems/generatornumbering

Can you test as well? I had some issues earlier after renaming things and changing between named and numbered directories and such, but at least the ones I encountered are now fixed.

thorehusfeldt · 2023-06-30T05:56:00Z

Here’s a rewritten-to-use-generators problem that crashes gracelessly. (But that can’t be because of numbering.)

mieogfie.zip

➜  dpop23 git:(bapc) ✗ bt run --problem mieogfie
PROBLEM mieogfie
Generate: secret/group1         SKIPPED: testdata.yaml
Traceback (most recent call last):####################-------------------------------------------------------------------- 12/48]
  File "/usr/local/bin/bt", line 1030, in <module>
    main()
  File "/usr/local/bin/bt", line 1026, in main
    run_parsed_arguments(parser.parse_args())
  File "/usr/local/bin/bt", line 893, in run_parsed_arguments
    success &= generate.generate(problem)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1532, in generate
    config.run()
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1319, in run
    self.root_dir.walk(p.put, generate_dir)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 809, in walk
    d.walk(testcase_f, dir_f, dir_last=dir_last)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 809, in walk
    d.walk(testcase_f, dir_f, dir_last=dir_last)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 805, in walk
    dir_f(self)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1316, in generate_dir
    p.join()
  File "/Users/thore/Developer/BAPCtools/bin/parallel.py", line 109, in join
    raise self.first_error
  File "/Users/thore/Developer/BAPCtools/bin/parallel.py", line 64, in _worker
    self.f(task)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1313, in <lambda>
    p = parallel.Parallel(lambda t: t.listed and t.generate(self.problem, self, bar))
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 710, in generate
    meta_yaml['validator_hashes'] = testcase.validator_hashes('input_format')
  File "/Users/thore/Developer/BAPCtools/bin/run.py", line 109, in validator_hashes
    h = combine_hashes_dict(o)
  File "/Users/thore/Developer/BAPCtools/bin/util.py", line 939, in combine_hashes_dict
    hasher.update(d[key].encode())

thorehusfeldt · 2023-06-30T07:37:26Z

Here’s a minimal generator for the above problem that fails with

  File "/Users/thore/Developer/BAPCtools/bin/util.py", line 939, in combine_hashes_dict
    hasher.update(d[key].encode())
AttributeError: 'list' object has no attribute 'encode'

exactly when I remove the # (i.e., set the input validator flags).

solution: /submissions/accepted/th.py

data:
  sample: 
    data:
      "1": gen_explicit.py 40 45
  secret:
    testdata.yaml:
          #    input_validator_flags: --smallmie
    data:
      40-41  : gen_explicit.py 40 41

thorehusfeldt · 2023-06-30T07:46:13Z

A user story related to numbering is that I may have written a bunch of

foo: generate_me 34
bar: generate_me 64
baz: generate_me 76

and then later change my mind to preferring numbered testcases by introducing dashes into my generator script:

- foo: generate_me 34
- bar: generate_me 64
- baz: generate_me 76

Currently, this (correctly) triggers

 Testcase not listed in generator.yaml (delete using --clean).

but maybe there should be a more helpful? In general, if a data directory includes both foo.in and \d+-foo.in then probably something unintended happened.

But I’m not sure. Maybe an attempt to provide a more plausible error message ("Testcase not listed (but numbered variant listed)") causes more confusion.

RagnarGrootKoerkamp · 2023-06-30T07:52:05Z

Right, most of those warnings should be handled by the automatic removing of duplicate cases though. But around symlinks it's not super smooth yet i think

RagnarGrootKoerkamp · 2023-06-30T09:06:58Z

Re the not listed warning: I've improved this to check whether the case is a duplicate, and if so, mention that --force will remove the unlisted duplicate.
I'll make it so that generate -f also deletes broken symlinks

RagnarGrootKoerkamp · 2023-06-30T10:03:06Z

With this latest commit, changing from named to numbered or back and doing generate -f should give some info messages that duplicate cases and broken symlinks are deleted, and convert everything cleanly to the new state. generate again shouldn't do anything.

RagnarGrootKoerkamp · 2023-06-30T11:01:02Z

Zero-padding should work again -- there was a bug in counting the total number of testcases.

thorehusfeldt · 2023-07-02T06:51:52Z

Another suggestion for better warnings: when I add a new testcase in the middle of a numbered list I already generated, I get a confusing list of “duplicate rule” ERRORS like this:

ERROR: Found duplicate rule "data/secret/edgecases/14-star.in" at secret/edgecases/15-star and secret/edgecases/14-star

or

Unlisted duplicate of secret/edgecases/16-branch => delete with --force.

This seems overly dramatic for an innocuous change that merely renumbers existing cases and requires a lot of --clean and -f.

I’m not sure what the user experience should be, but currently there is big difference between editing generators with many numbered or many unnumbered cases.

RagnarGrootKoerkamp · 2023-07-02T07:30:35Z

Right, let's make it so that this unlisted duplicate warning checks if cases only differ by their number and in that case we can remove the duplicate silently without problem. That would still give warnings if you rename and renumber at the same time but that's much more rare.

The duplicate rule warning is triggered if two cases have exactly the same value of input: and/or in: in the yaml. I think that's fine, but I don't really understand why this triggers for adding a case. Can you give an example? (But the message could be clearer either way.)

RagnarGrootKoerkamp · 2023-07-02T08:54:37Z

Reworded duplicate rule to identical input
identical input now only triggers for listed testcases, since unlisted testcases already give errors elsewhere anyway.
Removing of duplicate unlisted rules is now done silently. (Message is only shown with -v.) This is slightly aggressive, but removes clutter that's almost always unneeded. One thing that may happen is if you manually added a .png for the unlisted case that wasn't regenerated. But such additional files already give a warning at the moment so should be OK.
Remaining issue: all testcases after an insertion/deletion are renumbered, and they all print NEW 22-testcase.in. To fix this we would probably need some dedicated logic to handle renames/renumbering, since currently there isn't really an association between the old and new name of the testcase. The new one is generated, and then in a final step the old unlisted one is cleaned up.

RagnarGrootKoerkamp mentioned this pull request Jun 27, 2023

Support generating included testcases #262

Merged

RagnarGrootKoerkamp added 3 commits June 29, 2023 23:55

[gen] Support including numbered cases by 'key' (not 'name')

aad6bab

[gen] Update numbering for identity

8a87902

[gen] Add generatorincludes problem to test

275ac51

RagnarGrootKoerkamp added 2 commits June 30, 2023 10:46

[gen] Fix validating with input_validator_flags

f3a08e2

[gen] Hint that generate -f deletes duplicate cases

1a53b82

[gen] Delete unlisted broken symlinks with generate -f

4af8be0

RagnarGrootKoerkamp force-pushed the numbering branch from 9218055 to 4af8be0 Compare June 30, 2023 10:06

RagnarGrootKoerkamp added 2 commits June 30, 2023 13:13

[gen] Fix counting number of testcases for padding

36c990c

[tests] Update testcase numbers

0e66769

RagnarGrootKoerkamp force-pushed the numbering branch from 308ae38 to 0e66769 Compare June 30, 2023 11:13

[gen] #238 Error & clean unlisted testdata.yaml

d6731f0

RagnarGrootKoerkamp merged commit d2d12f9 into master Jun 30, 2023
2 checks passed

RagnarGrootKoerkamp deleted the numbering branch June 30, 2023 14:48

RagnarGrootKoerkamp mentioned this pull request Jul 21, 2023

[meta] Relevant BAPCtools issues Kattis/problem-package-format#27

Closed

thorehusfeldt mentioned this pull request Dec 26, 2023

Make auto-numbering of bt generate consistent with DOMjudge numbering #325

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generators: testcase numbering #274

Generators: testcase numbering #274

RagnarGrootKoerkamp commented Jun 29, 2023 •

edited

Loading

thorehusfeldt commented Jun 28, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jun 28, 2023

RagnarGrootKoerkamp commented Jun 29, 2023 •

edited

Loading

thorehusfeldt commented Jun 30, 2023

thorehusfeldt commented Jun 30, 2023

thorehusfeldt commented Jun 30, 2023

RagnarGrootKoerkamp commented Jun 30, 2023

RagnarGrootKoerkamp commented Jun 30, 2023

RagnarGrootKoerkamp commented Jun 30, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jun 30, 2023

thorehusfeldt commented Jul 2, 2023

RagnarGrootKoerkamp commented Jul 2, 2023

RagnarGrootKoerkamp commented Jul 2, 2023

Generators: testcase numbering #274

Generators: testcase numbering #274

Conversation

RagnarGrootKoerkamp commented Jun 29, 2023 • edited Loading

thorehusfeldt commented Jun 28, 2023 • edited Loading

RagnarGrootKoerkamp commented Jun 28, 2023

RagnarGrootKoerkamp commented Jun 29, 2023 • edited Loading

thorehusfeldt commented Jun 30, 2023

thorehusfeldt commented Jun 30, 2023

thorehusfeldt commented Jun 30, 2023

RagnarGrootKoerkamp commented Jun 30, 2023

RagnarGrootKoerkamp commented Jun 30, 2023

RagnarGrootKoerkamp commented Jun 30, 2023 • edited Loading

RagnarGrootKoerkamp commented Jun 30, 2023

thorehusfeldt commented Jul 2, 2023

RagnarGrootKoerkamp commented Jul 2, 2023

RagnarGrootKoerkamp commented Jul 2, 2023

RagnarGrootKoerkamp commented Jun 29, 2023 •

edited

Loading

thorehusfeldt commented Jun 28, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jun 29, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jun 30, 2023 •

edited

Loading