Generators: Allow hardcoding testcases input/answer #272

RagnarGrootKoerkamp · 2023-06-30T14:32:50Z

Currently we do

testcase: stdout.py 1 2 3

Would be nice to instead do

testcase:
  in: 1 2 3

Then, we can directly include samples and other small manual cases. This should support keys for some more extensions (corresponding to text files):

testcase:
  in: 1 2 3
  ans: 6
  hint: what is the sum of 1, 2, and 3?
  desc: sum of 3

(images probably shouldn't be made this way.)

Note that this would prevent future overriding of these keys by commands.
input (rename to gen), solution, and visualizer are used for that currently.

Result

Generating

See doc/generators.yaml for the result. In brief:

testcase:

indicates testcase.{in,ans,...} are given directly in the target directory (discouraged).

testcase: generator.py {seed}

indicates the generator to run

testcase:
  copy: manual_cases/sample/3

copies manual_cases/sample/3.* to testcase.*

testcase:
  in: "1"
  ans: "1"
  desc: Identity of 1.
  hint: What is the identity function?

Priorities:

testcase:
  # First generate things
  generate: generator.py {seed}
  # Then copy in given files (overwriting generated files)
  copy: path/to/files
  # Then overwrite with hardcoded content
  in: 1 2
  ans: "3"

Numbering:

Testgroups are numbered per directory as g1-<name>, g2-<name>.
Testcases are numbered globally within the problem.

Including

Testgroups can contain a include: key that maps to a list of strings, each the name of a testcase or testgroup. (For numbered cases, the non-numbered dictionary key as written in generators.yaml must be used.) The referred-to testcases/testgroups must have a unique name within the problem.

mpsijm · 2023-06-28T07:35:50Z

Related to (/ supersedes?) #220 🙂

RagnarGrootKoerkamp · 2023-06-30T14:37:28Z

This turned out to be quite easy to implement. @mpsijm @mzuenni @thorehusfeldt who wants to give it a try?

todo:

a bit more testing
docs
update generators.yaml
update corresponding json and cue specs
allow .interaction as well

What I implemented now:
In each testcase, at least one of input (the current command) and in (the new hardcoding) is required.
Then:

run the input command if present
copy over in: to testcase.in (overwriting the command output), and also copy over other present keys in ans, hint, desc.
run the solution if ans is not given.
validate as usual.

Most of the code in the PR is actually to get the caching and info messages and such correct.

thorehusfeldt · 2023-07-02T07:06:59Z

I like this a lot. If I understand correctly, we are at something like this:

generator :: command | {
    input: command
    in: string
    ans: string
    desc: string
    hint: string
    file_config
    directory_reserved
    ...
}

where the new fields are in, ans, desc, and hint.

(input should really be renamed command or invocation or generator-invocation (but not generator); sooner is better than later.)

I can’t see a use-case for having both (solution and ans), nor for having (input and in and ans), we could forbid these. Most other combinations do make sense for me but I haven’t thought this through.

thorehusfeldt · 2023-07-02T07:14:25Z

Should the option of supplying the command key with null still exist, as in

my_testcase:

Or should we now insist on command to be either string or nonempty dict, as in

my_testcase:
  in:
  ans:

(I prefer the latter, it’s more explicit.)

RagnarGrootKoerkamp · 2023-07-02T07:40:28Z

Let me rename input => command now. It's a bit verbose but it's very unlikely you'll need to expand testcase: command into the long form anyway. [done]
Good idea to keep things simple and require exactly one of in: and command:. [done]
So far the semantics of solution are that it's run whenever .ans is not generated by command. That is compatible with having both ans: and solution: in a testcase, but agreed it's kinda silly so probably this should at least be a warning anyway. [also an error now]
Hmm; I do like sample-1: as an empty key. Another problem I ran into with the tests is that we have some problem there where we provide data/sample/1.{in,ans,png}. After changing the yaml to
```
1:
  in: 1 2
  ans: 3
```
the changes in this PR triggered a warning for 1.png existing while it isn't generated. I reverted that so that for samples having extra files is OK again. Basically we don't have a good way currently of saying explicitly that 1.png exists and was manually created, unless we add image extensions to the list and only allow them to be empty.
In the current implementation and empty in: value simply means that the input should be empty, but this is somewhat conflicting with an empty testcase: key which means it's manually provided.

A bigger change could be to not allow any manually provided files but I think that's too much.

thorehusfeldt · 2023-07-02T07:58:55Z

A bigger change could be to not allow any manually provided files but I think that's too much.

My only uneasiness are manually provided files that are not explictly listed.

But I admit that I no longer have a consistent model for what “absence of value” means in generators.yaml, or how a manually provided file should be best specified.

thorehusfeldt · 2023-07-02T08:43:41Z

I expect the following to be used a lot:

data:
  sample:
    data:
      - '':
          in: |
            123 32
            foo
            bar baz
      - '':
          in: |
            2 45
            eki eki
            ptang

(And this is excellent! Finally a good way to specify samples.)

The indentation seems excessive, but this is proper yaml. (It looks weird because - : adds two layers of indentation.)

Nag: If the user (understandably) forgets one level of indentation, as in

data:
  sample:
    data:
      - '':
        in: |
          123 32
          foo
          bar baz
      - '':
        in: |
          2 45
          eki eki
          ptang

the bt generate emits the unhelpful

FATAL ERROR: Dictionary must contain exactly one named testcase/group: sample

thorehusfeldt · 2023-07-02T20:04:13Z

Consider being more explicit by adding the key path: go CUE object generator, next to in, ans, command, desc, etc:

foo: 
  path: manual/foo.in
bar:
  in: 34 64
baz:
  command: random {seed}
bum:
  path: null # same as path/to/bum.in, deprecated

Setting exactly one of path, in, and command makes sense.

When the value is yaml-string (instead of yaml-dictionary), it is interpreted as a shorthand for either path (*) or command using some pattern matching:

a : manual/foo.in # end in .in and contains no whitespace: interpreted as a path
b : random {seed} # a command
c : null # same as "path/to/c.in"
d :  # same as null

The introduction of path is mainly to avoid

command: manual/foo.in
command:

which look wrong to me.

I have no strong opinion about what path: should called. Other suggestions are copy-from: , cp: or manual:. (Manual, though established, strikes me as a surprisingly bad name. After all, in:-cases are exactly manual, whereas included testcases were often automatically produced, not manually. This is about copying, not about how the case was produced.)
Maybe overkill, but path: [manual/foo.in, manual/foo.ans] (allowing a list) makes sense to me, as does path: manual/foo (specifying both manual/foo.in and manual/foo.ans if they exist. But it’s implicit and therefore I don’t really like it.) I’m not sure this is a good idea.

thorehusfeldt · 2023-07-03T10:53:44Z

After some thinking (and actually writing a specification):

path-to-infile should have the same syntax as path-to-ansfile, and the above proposal breaks that. Alternatives:

tc-a:
  in: 34 643     # string specifies contents
  ans: 54        # string specifies contents
tc-b:
  in-path: manual/sample/foo.in # string specifies path
  ans-path: manual/sample/foo.ans # string specifies path

Arguably, .hint and .desc should also be first-level citizens (so we should have hint-path etc.), but then we could even elevate .png. But I don’t think any of this will ever be used.

Alternative is to be lenient in the value specification:

tc-c:
  in: 34 54 # string specifies contents of `tc-c.in`
tc-d:
  in: manual/secret/group4/foo.in # string looks like a path, so it's a path

That gives me gray hairs but is close in spirit to what generators.py does right now.

The third, possibly cleanest suggestion would be the field path, which takes a suffix-less path:

tc-e:
  in: 54 12 # string specifies contents
  path: manual/sample/foo # copy whatever `.in`, `.ans`. `.whatever` is found at that path

The order might be:

if key command is present, run it, possibly creating in, ans, etc.
if key path: <path> is present, copy <path>.in, <path>.ans etc., to testcase.in etc., possibly overwriting the existing files
if keys in, ans, desc, hint are present, copy those strings to testcase.in etc.,
if still no testcase.in has manifested, look for it it <problemname>/data/path/to/testcase.in, etc. (Deprecated, prefer form 2.) This case is triggered, e.g., if the entire dict is empty, like tc-f: null, consistent with the current specification.
still no testcase.in? Error.
If no testcase.ans has manifested, run solution on testcase.in

thorehusfeldt · 2023-07-03T11:18:51Z

I think this is what I would prefer the specification to be:

package problemformat
import "struct"

#command: !="" & (=~"^[^{}]*(\\{(name|seed(:[0-9]+)?)\\}[^{}]*)*$")
#file_config: {
	solution?:    #command        // null disallowed (specify #testcase.ans instead)
	visualizer?:  #command | null // null means: skip visualisation
	random_salt?: string
}

#testcase: null | #command  | { // null means: expect file in <problemname>/data/...
	command?: #command   // invocation of a generator
	path?: #path         // copy <problemname>/generators/path.in etc.
	in?: string          // explicitly given
	ans?: string
	desc?: string
	hint?: string
	#file_config
}

#data_dict: [string]: #testgroup | #testcase

#testgroup: {
	#file_config
	testdata_settings?: #testdata_settings
	data: #data_dict | [...{#data_dict & struct.MaxFields(1)} ] // list of singleton dicts
}

#Generators: {
	generators?: [string]: [...string]
	#testgroup
	...                  // do allow unknown_key at top level for tooling
} & { data: close({          // top level data dict has keys {sample, secret}
	sample: #testgroup
	secret: #testgroup
	})
}

(I have now a CUE package problemformat that specifies testdata.yaml files, but which doesn’t belong in the generators spec. It defines #testdata_settings.)

RagnarGrootKoerkamp · 2023-07-03T12:18:05Z

ok, many new ideas here to reply to. I really like your 'order of steps' proposal, and think that that corresponds exactly to my mental model of this. Also the CUE spec looks good to me.

Some more thoughts:

'inline' testcases

I call testcases for which in/ans files are provided directly in data/* 'inline' in the code.
These are currently specified with testcase: (with an empty/null value).

We never used these for secret data
We typically use this sample data:
```
sample:
  data:
    '1':
    '2':
```
or I believe even this works to ignore that directory completely:
```
data:
 sample:
   data:
```
Now that we can hardcode sample data directly, I don't see actually see much need for inline cases at all anymore. Either you don't use generators and just manually put all your data in data/ as you like, or you opt in to use this and then you need to be explicit about everything and basically give up all control over data/.

This existing 'middle ground' of allowing a subset of data/ to be 'inline' cases while the rest is generated also gives some mental overhead; making it fully generated is simpler and more consistent.
One issue is that the sample: '1': syntax is used a lot in existing problems, so either way the implementation should keep supporting that, probably with a deprecation warning.

'manual'/'copied' testcases

There are testcases currently given as testcase: manual/testcase.in.

I like your path: key that takes a path without extension:

testcase:
  path: manual/testcase

where the semantics are that all files generators/manual/testcase.EXT for known extensions EXT are copied to the testcase in data/. This also removes the current inconsistency that testcase: manual/testcase.in actually copies all extensions, not just the .in.

Regarding the name:

copy-from: could also work but sligtly long
cp: is slightly too linuxy/unclear
copy: would work well for me
path: isn't super informative about what's actually being done with this path, but is OK.

Then again, command: is also just saying what follows, not what is done with it. It could also be e.g. execute: or run:.

Hardcoded cases

I think we're clear on this: testcase: in: <string> copies that to testcase.in.

Maybe we can dissallow this to be empty. (But see below.)

Unlisted files

I think by default, we should warn for every file in data/ that is not explicitly generated, and we should discourage using inline files.

However, there is still one case that would be nice to have a proper solution for: manual visualizations for the samples, that are not generated using a visualizer: script.

Some options:

'inline' the entire sample directory using sample: data: null, so that anything is allowed there
'copy' all testcases on a per-testcase basis: sample: data: 1: path: manual/sample/1 (i'm lazy and compressing the syntax), where generators/manual/sample/1.{in,ans,png} exist.
'hardcode' the in: and ans: per testcase, and provide path: manual/sample/1 in addition to that to copy over 1.png.
'hardcode' the in: and ans: per testcase, and set png: to an empty value to indicate the file is provided 'inline' in data/sample/1.png.

With this last option, testcase: null means that any testcase.* may be present inline, while testcase: <ext>: null means that testcase.<ext> may be present inline.

thorehusfeldt · 2023-07-03T17:20:11Z

I like option 2 and 3 (explicit) more than 1 and 4 (weird semantics of null.)

mpsijm · 2023-07-03T18:55:11Z

Specific feedback for option 4: I think that the key should not be named png:, because the visualization is not always a PNG. Perhaps we can "reuse" the visualizer: key for this, although we then get overloaded behaviour where the value of this key is either a command to run for the visualizer, or a path to a pre-generated image.

To fix that, here's a proposal that is in some way a mix between options 3 and 4 (let's call it option 5):

We could allow keys like in, ans, hint, desc, etc. to be either a string or a path: object. For example:

- '':
    in: 2 4  # Content of a testcase, copied to `1.in`
    ans:
      path: manual/sample/1.ans  # Path to the answer, copied to `1.ans`

- '':
    in: 2 4  # Content of a testcase, copied to `2.in`
    ans: 6  # Content copied to `2.ans`
    visualizer:
      path: manual/sample/2.png  # Path to the PNG, instead of a command to run

thorehusfeldt · 2023-07-03T19:55:24Z

image (or even visualization) may be better than png. Alternatively, support png jpeg jpg, and pdf. To me, keys are “conceptually cheap” (adding more keys adds little cognitive overhead and very little extra code), whereas overloading interpretations or funky semantics are ”conceptually expensive”. So I prefer the former.

RagnarGrootKoerkamp · 2023-07-03T20:41:28Z

Oh yes, anywhere I wrote png I implicied meant that this can be any of the valid image extensions, and you have to choose the right one. That way it's consistent with in and ans directly corresponding to files.

mpsijm · 2023-07-04T06:15:07Z

@thorehusfeldt I agree with you that "conceptually cheap" specifications are to be preferred.

In practice, I do not think we need the granular control that option 5 provides (why would we ever want not all files matching a prefix to be copied?). The way how option 3 copies all manual files matching the prefix given in the global-for-the-testcase path: key except those that have their contents explicitly provided, should be enough.

I also like options 2 and 3 best, in that case, but I also think we should keep maintaining option 1 (but with an empty value instead of an explicit null) for backwards compatibility (and I wouldn't mind adding a warning there 🙂). (EDIT: I initially listed option 4 here as well, but that's not needed for backwards compatibility of course, so I'm fine with dropping it 😄)

thorehusfeldt · 2023-07-04T06:24:37Z

but with an empty value instead of an explicit null

My understanding is that foo: , foo: null, and foo: ~ are the same in yaml.

RagnarGrootKoerkamp · 2023-07-04T07:27:25Z

Ok so:

option 1 (testcase: null) is already supported by backwards compatibility for samples (data: sample: '1': null) and should keep working (with a warning).
options 2 and 3 come by default with the current proposal (the list in this comment)
Option 4 would require additional logic for an empty testcase: png: null key. But since we don't like this so much anyway, I'd prefer to just not implement this and disallow empty testcase: <ext>: null keys completely. We only keep the empty testcase: null for completely inline testcases.

Thus, you can either keep your testcase completely inline, or move all of it into generators.yaml.

…dcoded .in

…d/found

…file from sample case this is already mentioned at the top of the file

RagnarGrootKoerkamp mentioned this pull request Jun 27, 2023

Support generating included testcases #262

Merged

mpsijm mentioned this pull request Jun 28, 2023

Generators: Add testcase: count: X for multiple testcase copies #271

Closed

RagnarGrootKoerkamp mentioned this pull request Jun 28, 2023

Support hardcoding manual testcases directly in generators.yaml #220

Closed

RagnarGrootKoerkamp added a commit that referenced this pull request Jun 30, 2023

[gen] #272 Allow hardcoding testcases

53966c2

RagnarGrootKoerkamp added a commit that referenced this pull request Jun 30, 2023

[gen] #272 Allow hardcoding testcases

3e340e0

RagnarGrootKoerkamp force-pushed the hardcoding branch from 53966c2 to 3e340e0 Compare June 30, 2023 14:38

RagnarGrootKoerkamp added a commit that referenced this pull request Jun 30, 2023

[gen] #272 Allow hardcoding testcases

7c0ce31

RagnarGrootKoerkamp force-pushed the hardcoding branch from 3e340e0 to 7c0ce31 Compare June 30, 2023 14:51

RagnarGrootKoerkamp added a commit that referenced this pull request Jun 30, 2023

[gen] #272 Allow hardcoding testcases

a06af92

RagnarGrootKoerkamp force-pushed the hardcoding branch from 7c0ce31 to a06af92 Compare June 30, 2023 14:52

RagnarGrootKoerkamp added a commit that referenced this pull request Jul 1, 2023

[gen] #272 Allow hardcoding testcases

e549177

RagnarGrootKoerkamp force-pushed the hardcoding branch from 53db57d to e549177 Compare July 1, 2023 13:18

thorehusfeldt and others added 24 commits August 25, 2023 15:27

generators JSON schema: rename input to command

a3dda4a

[gen] Improved error message when forgetting one indent level for har…

ff9ca5d

…dcoded .in

[gen] Add samples 4 and 5 to generators.yaml example

ecbba48

[gen] JSON schema: add descriptions for in, ans, desc, hint

ec6623e

[gen] expand docs re multiline yaml syntax

d06bb8a

[gen] Implement 6-step approach (#272)

3d2745f

[test] Add /usr/bin/env python3 shebangs to test submissions

3c23898

[gen] update generators.yaml for new testcase: path: path/to/name

9b2be62

[gen] Require at least one of 'command', 'path', 'in'

0558fc9

[gen] {path => copy}

4f28adb

[gen] {command => generate}

6ae8493

[gen] Update {path=>copy} and {command=>generate} in yaml files

0fcfb27

JSON schema: use generate, copy

2408262

[gen] turn some errors that are handled anyway into warnings

30eafbf

[gen] Fix python error and give nice warning when no .in was generate…

577222e

…d/found

[fuzz] Update old usage of testcase.manual

a345b72

[gen] Fix bug for unlisted cases

993f244

[gen] {--add-manual => --add-unlisted}

3cde760

[gen] Rename 'manual inline' cases to 'unlisted' cases

42c8013

[gen] hardcode samples for identity

d2c6cc7

[gen] doc/generators.yaml: Fix some small textual issues

7914a71

[gen] doc/generators.yaml: Remove notice of solution generation .ans …

c661280

…file from sample case this is already mentioned at the top of the file

remove +x bits

ab81165

fix tests and doc

9f2e29e

RagnarGrootKoerkamp force-pushed the hardcoding branch from 1886d90 to 9f2e29e Compare August 25, 2023 13:27

RagnarGrootKoerkamp merged commit e588586 into master Aug 25, 2023
1 of 2 checks passed

RagnarGrootKoerkamp deleted the hardcoding branch August 25, 2023 13:43

RagnarGrootKoerkamp added a commit that referenced this pull request Aug 25, 2023

[gen] #272 Allow hardcoding testcases

4d65985

RagnarGrootKoerkamp added a commit that referenced this pull request Aug 25, 2023

[gen] Implement 6-step approach (#272)

a968d71

mpsijm mentioned this pull request Feb 27, 2024

Check whether inline testcases are git-ignored. #73

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generators: Allow hardcoding testcases input/answer #272

Generators: Allow hardcoding testcases input/answer #272

RagnarGrootKoerkamp commented Jun 30, 2023 •

edited

Loading

mpsijm commented Jun 28, 2023

RagnarGrootKoerkamp commented Jun 30, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited by mpsijm

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 3, 2023 •

edited

Loading

thorehusfeldt commented Jul 3, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jul 3, 2023 •

edited

Loading

thorehusfeldt commented Jul 3, 2023

mpsijm commented Jul 3, 2023 •

edited

Loading

thorehusfeldt commented Jul 3, 2023

RagnarGrootKoerkamp commented Jul 3, 2023

mpsijm commented Jul 4, 2023 •

edited

Loading

thorehusfeldt commented Jul 4, 2023

RagnarGrootKoerkamp commented Jul 4, 2023

Generators: Allow hardcoding testcases input/answer #272

Generators: Allow hardcoding testcases input/answer #272

Conversation

RagnarGrootKoerkamp commented Jun 30, 2023 • edited Loading

Result

Generating

Numbering:

Including

mpsijm commented Jun 28, 2023

RagnarGrootKoerkamp commented Jun 30, 2023 • edited Loading

thorehusfeldt commented Jul 2, 2023 • edited Loading

thorehusfeldt commented Jul 2, 2023 • edited Loading

RagnarGrootKoerkamp commented Jul 2, 2023 • edited Loading

thorehusfeldt commented Jul 2, 2023 • edited Loading

thorehusfeldt commented Jul 2, 2023 • edited by mpsijm Loading

thorehusfeldt commented Jul 2, 2023 • edited Loading

thorehusfeldt commented Jul 3, 2023 • edited Loading

thorehusfeldt commented Jul 3, 2023 • edited Loading

RagnarGrootKoerkamp commented Jul 3, 2023 • edited Loading

'inline' testcases

'manual'/'copied' testcases

Hardcoded cases

Unlisted files

thorehusfeldt commented Jul 3, 2023

mpsijm commented Jul 3, 2023 • edited Loading

thorehusfeldt commented Jul 3, 2023

RagnarGrootKoerkamp commented Jul 3, 2023

mpsijm commented Jul 4, 2023 • edited Loading

thorehusfeldt commented Jul 4, 2023

RagnarGrootKoerkamp commented Jul 4, 2023

RagnarGrootKoerkamp commented Jun 30, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jun 30, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 2, 2023 •

edited by mpsijm

Loading

thorehusfeldt commented Jul 2, 2023 •

edited

Loading

thorehusfeldt commented Jul 3, 2023 •

edited

Loading

thorehusfeldt commented Jul 3, 2023 •

edited

Loading

RagnarGrootKoerkamp commented Jul 3, 2023 •

edited

Loading

mpsijm commented Jul 3, 2023 •

edited

Loading

mpsijm commented Jul 4, 2023 •

edited

Loading