Support generating `include`d testcases #262
Re. ordering (#261 (comment)), this may already happen, but either way we should generate included testcases after all other testcases |
what is the intended behaviour of |
@mzuenni : as far as I can tell, there is no authoritative source for the intended semantics of In particular, and this is not important for the semantics, it creates symlinks to those testcases. In my example above, > ls -l data/secret
total 16
lrwxr-xr-x 1 thore staff 15 20 Jun 17:01 1.ans -> ../sample/1.ans
lrwxr-xr-x 1 thore staff 14 20 Jun 17:01 1.in -> ../sample/1.in
-rw-r--r-- 1 thore staff 3 20 Jun 17:01 foo.ans
-rw-r--r-- 1 thore staff 4 20 Jun 17:01 foo.in So this already works (except for the can’t-run-it-twice issue reported here.) More importantly when bar:
type: directory
include: foo means: include the testcases that are descendants of (An alternative choice would be to include below |
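The relative symlinks in the listing above, together with the can't-run-it-twice issue, suggest making the linking step idempotent. A minimal sketch, assuming a hypothetical helper `link_testcase` (not a BAPCtools function) and that targets are plain files or symlinks:

```python
import os
import tempfile
from pathlib import Path


def link_testcase(source: Path, target: Path) -> None:
    """Create or refresh a relative symlink at `target` pointing to `source`.

    Hypothetical helper for illustration; assumes `target` is never a directory.
    """
    # A relative link such as ../sample/1.in keeps the data directory relocatable.
    relative = os.path.relpath(source, start=target.parent)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Remove whatever a previous run left behind; is_symlink() also catches dangling links.
    if target.is_symlink() or target.exists():
        target.unlink()
    target.symlink_to(relative)


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data"
        (data / "sample").mkdir(parents=True)
        (data / "sample" / "1.in").write_text("1 2\n")
        # Linking twice must not fail: the second call replaces the first link.
        for _ in range(2):
            link_testcase(data / "sample" / "1.in", data / "secret" / "1.in")
        return os.readlink(data / "secret" / "1.in")
```

Unlinking before linking is the simplest way to survive a second run; a real implementation would probably warn before overwriting a distinct file.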
ok i'm starting to remember things slowly :D The PR at https://github.com/Kattis/problem-package-format/pull/2/files is the best 'groundtruth' there is at the moment. I suppose I implemented So yes, I implemented (side note: even though |
In fact, the generator script used by the Swedish crowd (https://github.com/Kodsport/testdata_tools) has a directive as well as the directive Their semantics is exactly like proposed by It would really be nice to get this working. (It’s very close already.) A difference is that |
great! Unless one of you has a look, i'll get to it by tomorrow probably |
My guess is that in the second run, the symlinked files get picked up as unlisted manual testcases which are added to the set of known testcases? in that case the assert should be different? |
yeah probably something like that. |
If you want something more meaty to test than the minimal example upthread, here’s a tiny “add two numbers” problem with half a dozen testcases and a complicated test group structure generated by: solution: /submissions/accepted/th.py
data:
sample:
type: directory
data:
'1': stdout.py 1 2
'2': stdout.py 1 -3
secret:
type: directory
data:
small:
type: directory
data:
positive:
type: directory
data:
'sm-all-pos': stdout.py 4 6
'sm-zero': stdout.py 4 0
include:
- 'sample/1'
general:
type: directory
data:
'sm-mixed': stdout.py 4 -6
'sm-all-neg': stdout.py -4 -6
include:
- 'sample/2'
general:
type: directory
data:
'lg': stdout.py 83413870413975664 -11
include:
- 'secret/small' I’m not very sure about the desired root of the |
I'm working on this now. I've merged #261 to avoid merge conflicts later on. Will put some findings here: When running for the first time with an empty
This will probably be fixed by either:
I think I prefer the 1st option. Ah I see; this was because Currently the yaml entries are sorted (alphabetically) before processing them. Processing by file-order sounds more intuitive so will remove the
Does this really already work? (or am I misreading?) For me there are a few more issues (such as only globbing for I agree that flattening included directories simplifies things. Also, it prevents 'double inclusion' of the samples as in your
A separate issue is how to deal with numbered testcases/directories. This may or may not already work; will play with it after the above works. |
Python |
I'll disallow including parent directories since that would create infinite recursion. |
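The ancestor check can be done purely on the yaml paths, before touching the filesystem. A minimal sketch, assuming includes are written as `/`-separated paths relative to `data/`; `is_ancestral_include` is a hypothetical name, not part of BAPCtools:

```python
from pathlib import PurePosixPath


def is_ancestral_include(including_dir: str, included: str) -> bool:
    """Return True if `included` is `including_dir` itself or one of its ancestors.

    Such an include would make a directory (transitively) contain itself.
    """
    directory = PurePosixPath(including_dir)
    target = PurePosixPath(included)
    # `parents` lists every proper ancestor, e.g. secret/small -> secret -> .
    return target == directory or target in directory.parents
```

For example, `secret/small/positive` including `secret` is rejected, while including `sample/1` is fine.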
I'm also being lazy with warnings for now; symlinks are always written, also if they would overwrite distinct files/symlinks. |
It does not. (Single files almost-work, subgroup inclusion does not work.)
Ancestral includes must be disallowed (they make no sense.) I can’t see a use-case for descendant-includes, so I’d just disallow them. The two actually-important use-cases (that I use all the time) are
Those are the two inclusion mechanisms supported by What is important to understand is how this interacts with (There is a nag about output validator flags for “different” occurrences of the “same” testcase. But Kattis does not actually implement those anyway, as I learned somewhat painfully during a recent contest. Kattis seems to run every testcase exactly once and reuse the judgement. So we shouldn’t worry about that.) |
ok thanks for clarifying :) It now almost works for me locally.
Yes, these will work for sure. I'm rewriting to resolve all includes in code after all (rather than globbing the filesystem), and I think child-inclusions will work after all.
Oh this is a good point I hadn't considered yet. That may imply I have to upgrade included testcases from simple strings to actual 'TestcaseRule' objects. I'll first get things working without this additional validation and then go from there.
Hmm. So to reword: input validation should always happen for included testcases, but output validation not necessarily, since even though we can validate the provided |
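Resolving includes in code rather than by globbing could look roughly like the following. This is only a sketch under assumptions: the nesting of the example config upthread is reconstructed by guesswork, included directories are flattened, nested includes are not followed, and all function names are hypothetical.

```python
from typing import Dict, List

# Assumed reading of the example config: a testcase is a command string,
# a directory is a dict with "data" and an optional "include" list.
CONFIG = {"data": {
    "sample": {"type": "directory", "data": {"1": "stdout.py 1 2", "2": "stdout.py 1 -3"}},
    "secret": {"type": "directory", "data": {
        "small": {"type": "directory", "data": {
            "positive": {"type": "directory",
                         "data": {"sm-all-pos": "stdout.py 4 6", "sm-zero": "stdout.py 4 0"},
                         "include": ["sample/1"]},
            "general": {"type": "directory",
                        "data": {"sm-mixed": "stdout.py 4 -6", "sm-all-neg": "stdout.py -4 -6"},
                        "include": ["sample/2"]},
        }},
        "general": {"type": "directory",
                    "data": {"lg": "stdout.py 83413870413975664 -11"},
                    "include": ["secret/small"]},
    }},
}}


def lookup(root: Dict, path: str):
    """Walk from the root to the node at a /-separated path."""
    node = root
    for part in path.split("/"):
        node = node["data"][part]
    return node


def descendants(node, path: str) -> List[str]:
    """Paths of all testcases listed at or below `node` (includes are not followed)."""
    if isinstance(node, str):
        return [path]
    result: List[str] = []
    for name, child in node["data"].items():
        result.extend(descendants(child, f"{path}/{name}"))
    return result


def resolve_includes(root: Dict, directory: str) -> Dict[str, str]:
    """Map each flat link name in `directory` to the testcase path it points to."""
    links: Dict[str, str] = {}
    for included in lookup(root, directory).get("include", []):
        for source in descendants(lookup(root, included), included):
            name = source.rsplit("/", 1)[-1]
            # Flattening only works if testcase names are unique.
            if links.get(name, source) != source:
                raise ValueError(f"{name} is included from both {links[name]} and {source}")
            links[name] = source
    return links
```

With this reading, `secret/general` ends up with four flat links into `secret/small`, and the name-clash check is where the unique-names convention discussed later would be enforced.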
Ok basic generating should work now. I've converted the issue into a PR. Outstanding issues:
Here's an extended config for the test problem:
|
@thorehusfeldt Ok I think this is in pretty good shape now. I've added caching of validator flags so they should only rerun for includes if the arguments changed. Could you test it out? After generating once, rerunning should be pretty much instant, provided there are no errors. Then if you modify the flags in a One open question is whether I should by default skip running on included cases. Since I (and Kattis) don't support having distinct output validator flags, running once should be OK, right? |
It seems there are some parallelism issues now: we must make sure testcases have finished processing before we can include them elsewhere. I'll probably rewrite things into two passes after all:
|
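The two passes are not spelled out above. One plausible shape, assuming the first pass generates all testcases in parallel and the second handles includes only after every generation job has finished, is sketched below; `run_two_passes`, `generate` and `link` are hypothetical names, not BAPCtools functions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple


def run_two_passes(
    testcases: Iterable[str],
    includes: Iterable[Tuple[str, str]],
    generate: Callable[[str], None],
    link: Callable[[str, str], None],
    workers: int = 4,
) -> None:
    """Generate every testcase first, then create the included copies."""
    # Pass 1: generate in parallel. Consuming the map iterator waits for all
    # jobs and re-raises the first exception, if any.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(generate, testcases))
    # Pass 2: every source now exists, so includes can be handled sequentially.
    for target, source in includes:
        link(source, target)


def demo() -> List[tuple]:
    events: List[tuple] = []
    run_two_passes(
        ["sample/1", "sample/2"],
        [("secret/1", "sample/1")],
        generate=lambda name: events.append(("generate", name)),
        link=lambda source, target: events.append(("link", source, target)),
    )
    return events
```

The barrier between the passes trades a little parallelism for the guarantee that an include never sees a half-generated testcase.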
Generation is already done in two steps (first all listed, then all unlisted), so a third step won't harm :D |
let's hope this fixed it |
Alas, I have no success It just gets stuck.
I’m on 9a0d821, and on a 2017 Mac, Python 3.10. |
Can you run with |
(to success using In better news, I have rewritten https://dpop23.kattis.com/contests/dpop-23-1/problems/dpop23.storebededag to use |
An interesting issue is the following: If, in the running example, we use numbered testcasenames in data:
- "sm-all-pos": stdout.py 4 6
- "sm-zero": stdout.py 4 0 what is then the proper value in the later include in include:
- "sample/2"
- "secret/small/positive/sm-zero" or include:
- "sample/2"
- "secret/small/positive/2-sm-zero" Currently, the latter works, and the former predictably throws File "/Users/thore/Developer/BAPCtools/bin/util.py", line 467, in read_yaml
assert path.is_file()
AssertionError I don’t pretend to have thought this through, but I think I prefer |
I think the commit above has the fixes you also did. Re. numbered testcases:
So for now the simplest solution to me sounds like forbidding using A completely different approach would be to have a global numbering instead of per-directory numbering. Then, we could include cases and reuse their existing number. Still the issue remains that testcases may have non-unique names. TL;DR: Unless you would really like to use it with numbering, let's only allow |
I agree with that observation and also like the convention very much, and would be very happy to move the specification in that direction more explicitly. Before going there I just want to make sure we aren’t talking past each other about what “unique names” means: “name” here means that the name of a testcase whose input is at |
generators.yaml: accept_score: 25 should be accept_score: "25" (The joys of actually validating against a schema. |
Oh ok, sure. (I assume the testdata.yaml spec is somewhat well established, and changing it to a number is too late now? And this was not simply an oversight?) |
ok I think it's time to merge this; some other changes depend on it, and it seems to work so far |
(Note: I converted @thorehusfeldt's original issue into a PR.)
Here is a simple generator, say for an `addtwonumbers`-like problem:
It uses the `include` key to include the `1 2` instance called `1` also in the secret data.
This works once:
but not twice