Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generators: testcase numbering #274

Merged
merged 9 commits into from
Jun 30, 2023
Merged

Generators: testcase numbering #274

merged 9 commits into from
Jun 30, 2023

Conversation

RagnarGrootKoerkamp
Copy link
Owner

@RagnarGrootKoerkamp RagnarGrootKoerkamp commented Jun 29, 2023

This follows from the discussion in #262 (particularly #262 (comment)).

To ensure that included testcases (from other testgroups/directories) are run in order, they must be numbered with a lower number. Kattis and/or problemtools automatically dedup testcases with the same name, so we can not 'renumber' testcases once they are included into a different directory.

This somewhat implies the need for a global numbering of testcases.
Also, note that yaml explicitly specified that directories are unordered, and hence in these cases we use alphabetical order (not the order as specified in the file).

We could number in a few ways:

  • Do per-testgroup numbering after all
  • Number everything lexicographically using in-order:
    1-easy/2-test.in
    1-easy/3-test.in
    4-hard/5-test.in
    6-last.in
    
  • Number testgroups first and then testcases
    1-easy/3-test.in
    1-easy/4-test.in
    2-hard/5-test.in
    6-last.in
    
  • Number testgroups and testcases individually
    1-easy/1-test.in
    1-easy/2-test.in
    2-hard/3-test.in
    4-last.in
    
    As long as each group contains at least one generated case, numbering conflicts shouldn't happen.

Non-numbered testgroups/testcases can be parents of numbered ones that are just skipped in the global numbering. I don't think that should cause complications.

I think this last option is quite reasonable.

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jun 28, 2023

I have no strong opinion.

For completeness, testdata_tools autonumbers testcases (never testgroups), and does so in order of appearance in the script. An underlying assumption is that testgroup names appear in lexicographic order in the script.

I guess that if I wrote a problem like this:

sample:
  '1': graph --n 2 --m 1 
  '2': graph --n 3 --m 2
secret:
  - 'paths':
   - ...
  - 'trees':
    - ...
  - 'undirected':
    - ...
  - 'directed':
    - ...

and 82 secret testcases (all specified as list items), I’d like the testcase names to be

sample/1.in
sample/2.in
secret/1-paths/01-pathgraph-n-3.in
secret/1-paths/02-pathgraph-n-10.in
secret/2-trees/03-complete-binary-tree-depth-2.in

and so on. (Because I want to think of the secret testcases as 01, 02, .)

Trying to reengineer this rule, I’m in favour of your last rule. (In particular, separate counters for testcases and testgroups.)

I think testgroups can be individually numbered per-parent.

I admit I have no experience with more than two levels of nesting, the first of which is the never-numbered sample / secret, so it’s only child test groups of secret that could ever get a number. But it feels weird to me that 2-trees (which Kattis will display as “Group 2” in the interface) should change its name to 3-trees just because 1-paths has received a numbered child testgroup during problem development.

So to expand on your last example:

1-easy/1-test.in
1-easy/2-test.in
2-hard/1-undirected/3-test.in
2-hard/2-directed/4-test.in
2-hard/5-test.in
6-last.in

Note that evaluation order in testgroup 2-hard now depends on the number of testcases somewhere else (because test.in could have become 05-test.in and the grader would view that name as “earlier” than the two testgroups.) That’s just the way it is, I think.

@RagnarGrootKoerkamp
Copy link
Owner Author

Testcase numbering: I agree that globally unique numbers independent of testgroups make sense and are intuitive ✔️
Testgroup numbering: Right, per directory numbering feels a bit more intuitive and clean, as global ordering isn't necessary here.

Indeed the two orderings can interfere in a slightly non-intuitive way, but if that's acceptable, this does seem like a clean way to go about things. I'll try to implement it.

@RagnarGrootKoerkamp
Copy link
Owner Author

RagnarGrootKoerkamp commented Jun 29, 2023

@thorehusfeldt I've pushed an implementation for global testcase numbering to this pr (numbering branch).

  • 1st commit is implementation
  • 2nd commit updates the in-repo numbering for test/problems/identity
  • 3rd commit adds your test zip from earlier to test/problems/generatornumbering

Can you test as well? I had some issues earlier after renaming things and changing between named and numbered directories and such, but at least the ones I encountered are now fixed.

@thorehusfeldt
Copy link
Collaborator

Here’s a rewritten-to-use-generators problem that crashes gracelessly. (But that can’t be because of numbering.)

mieogfie.zip

➜  dpop23 git:(bapc) ✗ bt run --problem mieogfie
PROBLEM mieogfie
Generate: secret/group1         SKIPPED: testdata.yaml
Traceback (most recent call last):####################-------------------------------------------------------------------- 12/48]
  File "/usr/local/bin/bt", line 1030, in <module>
    main()
  File "/usr/local/bin/bt", line 1026, in main
    run_parsed_arguments(parser.parse_args())
  File "/usr/local/bin/bt", line 893, in run_parsed_arguments
    success &= generate.generate(problem)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1532, in generate
    config.run()
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1319, in run
    self.root_dir.walk(p.put, generate_dir)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 809, in walk
    d.walk(testcase_f, dir_f, dir_last=dir_last)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 809, in walk
    d.walk(testcase_f, dir_f, dir_last=dir_last)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 805, in walk
    dir_f(self)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1316, in generate_dir
    p.join()
  File "/Users/thore/Developer/BAPCtools/bin/parallel.py", line 109, in join
    raise self.first_error
  File "/Users/thore/Developer/BAPCtools/bin/parallel.py", line 64, in _worker
    self.f(task)
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 1313, in <lambda>
    p = parallel.Parallel(lambda t: t.listed and t.generate(self.problem, self, bar))
  File "/Users/thore/Developer/BAPCtools/bin/generate.py", line 710, in generate
    meta_yaml['validator_hashes'] = testcase.validator_hashes('input_format')
  File "/Users/thore/Developer/BAPCtools/bin/run.py", line 109, in validator_hashes
    h = combine_hashes_dict(o)
  File "/Users/thore/Developer/BAPCtools/bin/util.py", line 939, in combine_hashes_dict
    hasher.update(d[key].encode())

@thorehusfeldt
Copy link
Collaborator

Here’s a minimal generator for the above problem that fails with

  File "/Users/thore/Developer/BAPCtools/bin/util.py", line 939, in combine_hashes_dict
    hasher.update(d[key].encode())
AttributeError: 'list' object has no attribute 'encode'

exactly when I remove the # (i.e., set the input validator flags).

solution: /submissions/accepted/th.py

data:
  sample: 
    data:
      "1": gen_explicit.py 40 45
  secret:
    testdata.yaml:
          #    input_validator_flags: --smallmie
    data:
      40-41  : gen_explicit.py 40 41

@thorehusfeldt
Copy link
Collaborator

A user story related to numbering is that I may have written a bunch of

foo: generate_me 34
bar: generate_me 64
baz: generate_me 76

and then later change my mind to preferring numbered testcases by introducing dashes into my generator script:

- foo: generate_me 34
- bar: generate_me 64
- baz: generate_me 76

Currently, this (correctly) triggers

 Testcase not listed in generator.yaml (delete using --clean).

but maybe there should be a more helpful? In general, if a data directory includes both foo.in and \d+-foo.in then probably something unintended happened.

But I’m not sure. Maybe an attempt to provide a more plausible error message ("Testcase not listed (but numbered variant listed)") causes more confusion.

@RagnarGrootKoerkamp
Copy link
Owner Author

Right, most of those warnings should be handled by the automatic removing of duplicate cases though. But around symlinks it's not super smooth yet i think

@RagnarGrootKoerkamp
Copy link
Owner Author

Re the not listed warning: I've improved this to check whether the case is a duplicate, and if so, mention that --force will remove the unlisted duplicate.
I'll make it so that generate -f also deletes broken symlinks

@RagnarGrootKoerkamp
Copy link
Owner Author

RagnarGrootKoerkamp commented Jun 30, 2023

With this latest commit, changing from named to numbered or back and doing generate -f should give some info messages that duplicate cases and broken symlinks are deleted, and convert everything cleanly to the new state. generate again shouldn't do anything.

@RagnarGrootKoerkamp
Copy link
Owner Author

Zero-padding should work again -- there was a bug in counting the total number of testcases.

@RagnarGrootKoerkamp RagnarGrootKoerkamp merged commit d2d12f9 into master Jun 30, 2023
2 checks passed
@RagnarGrootKoerkamp RagnarGrootKoerkamp deleted the numbering branch June 30, 2023 14:48
@thorehusfeldt
Copy link
Collaborator

Another suggestion for better warnings: when I add a new testcase in the middle of a numbered list I already generated, I get a confusing list of “duplicate rule” ERRORS like this:

ERROR: Found duplicate rule "data/secret/edgecases/14-star.in" at secret/edgecases/15-star and secret/edgecases/14-star

or

Unlisted duplicate of secret/edgecases/16-branch => delete with --force.

This seems overly dramatic for an innocuous change that merely renumbers existing cases and requires a lot of --clean and -f.

I’m not sure what the user experience should be, but currently there is big difference between editing generators with many numbered or many unnumbered cases.

@RagnarGrootKoerkamp
Copy link
Owner Author

Right, let's make it so that this unlisted duplicate warning checks if cases only differ by their number and in that case we can remove the duplicate silently without problem. That would still give warnings if you rename and renumber at the same time but that's much more rare.

The duplicate rule warning is triggered if two cases have exactly the same value of input: and/or in: in the yaml. I think that's fine, but I don't really understand why this triggers for adding a case. Can you give an example? (But the message could be clearer either way.)

@RagnarGrootKoerkamp
Copy link
Owner Author

  • Reworded duplicate rule to identical input

  • identical input now only triggers for listed testcases, since unlisted testcases already give errors elsewhere anyway.

  • Removing of duplicate unlisted rules is now done silently. (Message is only shown with -v.) This is slightly aggressive, but removes clutter that's almost always unneeded. One thing that may happen is if you manually added a .png for the unlisted case that wasn't regenerated. But such additional files already give a warning at the moment so should be OK.

  • Remaining issue: all testcases after an insertion/deletion are renumbered, and they all print NEW 22-testcase.in. To fix this we would probably need some dedicated logic to handle renames/renumbering, since currently there isn't really an association between the old and new name of the testcase. The new one is generated, and then in a final step the old unlisted one is cleaned up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants