Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generators: Allow hardcoding testcases input/answer #272

Merged
merged 30 commits into from
Aug 25, 2023

Conversation

RagnarGrootKoerkamp
Copy link
Owner

@RagnarGrootKoerkamp RagnarGrootKoerkamp commented Jun 30, 2023

Currently we do

testcase: stdout.py 1 2 3

Would be nice to instead do

testcase:
  in: 1 2 3

Then, we can directly include samples and other small manual cases. This should support keys for some more extensions (corresponding to text files):

testcase:
  in: 1 2 3
  ans: 6
  hint: what is the sum of 1, 2, and 3?
  desc: sum of 3

(images probably shouldn't be made this way.)

Note that this would prevent future overriding of these keys by commands.
input (rename to gen), solution, and visualizer are used for that currently.


Result

Generating

See doc/generators.yaml for the result. In brief:

testcase:

indicates testcase.{in,ans,...} are given directly in the target directory (discouraged).

testcase: generator.py {seed}

indicates the generator to run

testcase:
  copy: manual_cases/sample/3

copies manual_cases/sample/3.* to testcase.*

testcase:
  in: "1"
  ans: "1"
  desc: Identity of 1.
  hint: What is the identity function?

Priorities:

testcase:
  # First generate things
  generate: generator.py {seed}
  # Then copy in given files (overwriting generated files)
  copy: path/to/files
  # Then overwrite with hardcoded content
  in: 1 2
  ans: "3"

Numbering:

  • Testgroups are numbered per directory as g1-<name>, g2-<name>.
  • Testcases are numbered globally within the problem.

Including

Testgroups can contain a include: key that maps to a list of strings, each the name of a testcase or testgroup. (For numbered cases, the non-numbered dictionary key as written in generators.yaml must be used.) The referred-to testcases/testgroups must have a unique name within the problem.

@mpsijm
Copy link
Collaborator

mpsijm commented Jun 28, 2023

Related to (/ supersedes?) #220 🙂

@RagnarGrootKoerkamp
Copy link
Owner Author

RagnarGrootKoerkamp commented Jun 30, 2023

This turned out to be quite easy to implement. @mpsijm @mzuenni @thorehusfeldt who wants to give it a try?

todo:

  • a bit more testing
  • docs
  • update generators.yaml
  • update corresponding json and cue specs
  • allow .interaction as well

What I implemented now:
In each testcase, at least one of input (the current command) and in (the new hardcoding) is required.
Then:

  • run the input command if present
  • copy over in: to testcase.in (overwriting the command output), and also copy over other present keys in ans, hint, desc.
  • run the solution if ans is not given.
  • validate as usual.

Most of the code in the PR is actually to get the caching and info messages and such correct.

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 2, 2023

I like this a lot. If I understand correctly, we are at something like this:

generator :: command | {
    input: command
    in: string
    ans: string
    desc: string
    hint: string
    file_config
    directory_reserved
    ...
}

where the new fields are in, ans, desc, and hint.

(input should really be renamed command or invocation or generator-invocation (but not generator); sooner is better than later.)

I can’t see a use-case for having both (solution and ans), nor for having (input and in and ans), we could forbid these. Most other combinations do make sense for me but I haven’t thought this through.

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 2, 2023

Should the option of supplying the command key with null still exist, as in

my_testcase: 

Or should we now insist on command to be either string or nonempty dict, as in

my_testcase:
  in:
  ans:

(I prefer the latter, it’s more explicit.)

@RagnarGrootKoerkamp
Copy link
Owner Author

RagnarGrootKoerkamp commented Jul 2, 2023

  • Let me rename input => command now. It's a bit verbose but it's very unlikely you'll need to expand testcase: command into the long form anyway. [done]
  • Good idea to keep things simple and require exactly one of in: and command:. [done]
  • So far the semantics of solution are that it's run whenever .ans is not generated by command. That is compatible with having both ans: and solution: in a testcase, but agreed it's kinda silly so probably this should at least be a warning anyway. [also an error now]
  • Hmm; I do like sample-1: as an empty key. Another problem I ran into with the tests is that we have some problem there where we provide data/sample/1.{in,ans,png}. After changing the yaml to
    1:
      in: 1 2
      ans: 3
    the changes in this PR triggered a warning for 1.png existing while it isn't generated. I reverted that so that for samples having extra files is OK again. Basically we don't have a good way currently of saying explicitly that 1.png exists and was manually created, unless we add image extensions to the list and only allow them to be empty.
  • In the current implementation and empty in: value simply means that the input should be empty, but this is somewhat conflicting with an empty testcase: key which means it's manually provided.

A bigger change could be to not allow any manually provided files but I think that's too much.

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 2, 2023

A bigger change could be to not allow any manually provided files but I think that's too much.

My only uneasiness are manually provided files that are not explictly listed.

But I admit that I no longer have a consistent model for what “absence of value” means in generators.yaml, or how a manually provided file should be best specified.

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 2, 2023

I expect the following to be used a lot:

data:
  sample:
    data:
      - '':
          in: |
            123 32
            foo
            bar baz
      - '':
          in: |
            2 45
            eki eki
            ptang

(And this is excellent! Finally a good way to specify samples.)

The indentation seems excessive, but this is proper yaml. (It looks weird because - : adds two layers of indentation.)

Nag: If the user (understandably) forgets one level of indentation, as in

data:
  sample:
    data:
      - '':
        in: |
          123 32
          foo
          bar baz
      - '':
        in: |
          2 45
          eki eki
          ptang

the bt generate emits the unhelpful

FATAL ERROR: Dictionary must contain exactly one named testcase/group: sample

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 2, 2023

Consider being more explicit by adding the key path: go CUE object generator, next to in, ans, command, desc, etc:

foo: 
  path: manual/foo.in
bar:
  in: 34 64
baz:
  command: random {seed}
bum:
  path: null # same as path/to/bum.in, deprecated

Setting exactly one of path, in, and command makes sense.

When the value is yaml-string (instead of yaml-dictionary), it is interpreted as a shorthand for either path (*) or command using some pattern matching:

a : manual/foo.in # end in .in and contains no whitespace: interpreted as a path
b : random {seed} # a command
c : null # same as "path/to/c.in"
d :  # same as null

The introduction of path is mainly to avoid

command: manual/foo.in
command: 

which look wrong to me.

  • I have no strong opinion about what path: should called. Other suggestions are copy-from: , cp: or manual:. (Manual, though established, strikes me as a surprisingly bad name. After all, in:-cases are exactly manual, whereas included testcases were often automatically produced, not manually. This is about copying, not about how the case was produced.)
  • Maybe overkill, but path: [manual/foo.in, manual/foo.ans] (allowing a list) makes sense to me, as does path: manual/foo (specifying both manual/foo.in and manual/foo.ans if they exist. But it’s implicit and therefore I don’t really like it.) I’m not sure this is a good idea.

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 3, 2023

After some thinking (and actually writing a specification):

path-to-infile should have the same syntax as path-to-ansfile, and the above proposal breaks that. Alternatives:

tc-a:
  in: 34 643     # string specifies contents
  ans: 54        # string specifies contents
tc-b:
  in-path: manual/sample/foo.in # string specifies path
  ans-path: manual/sample/foo.ans # string specifies path

Arguably, .hint and .desc should also be first-level citizens (so we should have hint-path etc.), but then we could even elevate .png. But I don’t think any of this will ever be used.

Alternative is to be lenient in the value specification:

tc-c:
  in: 34 54 # string specifies contents of `tc-c.in`
tc-d:
  in: manual/secret/group4/foo.in # string looks like a path, so it's a path

That gives me gray hairs but is close in spirit to what generators.py does right now.

The third, possibly cleanest suggestion would be the field path, which takes a suffix-less path:

tc-e:
  in: 54 12 # string specifies contents
  path: manual/sample/foo # copy whatever `.in`, `.ans`. `.whatever` is found at that path

The order might be:

  1. if key command is present, run it, possibly creating in, ans, etc.
  2. if key path: <path> is present, copy <path>.in, <path>.ans etc., to testcase.in etc., possibly overwriting the existing files
  3. if keys in, ans, desc, hint are present, copy those strings to testcase.in etc.,
  4. if still no testcase.in has manifested, look for it it <problemname>/data/path/to/testcase.in, etc. (Deprecated, prefer form 2.) This case is triggered, e.g., if the entire dict is empty, like tc-f: null, consistent with the current specification.
  5. still no testcase.in? Error.
  6. If no testcase.ans has manifested, run solution on testcase.in

@thorehusfeldt
Copy link
Collaborator

thorehusfeldt commented Jul 3, 2023

I think this is what I would prefer the specification to be:

package problemformat
import "struct"

#command: !="" & (=~"^[^{}]*(\\{(name|seed(:[0-9]+)?)\\}[^{}]*)*$")
#file_config: {
	solution?:    #command        // null disallowed (specify #testcase.ans instead)
	visualizer?:  #command | null // null means: skip visualisation
	random_salt?: string
}

#testcase: null | #command  | { // null means: expect file in <problemname>/data/...
	command?: #command   // invocation of a generator
	path?: #path         // copy <problemname>/generators/path.in etc.
	in?: string          // explicitly given
	ans?: string
	desc?: string
	hint?: string
	#file_config
}

#data_dict: [string]: #testgroup | #testcase

#testgroup: {
	#file_config
	testdata_settings?: #testdata_settings
	data: #data_dict | [...{#data_dict & struct.MaxFields(1)} ] // list of singleton dicts
}

#Generators: {
	generators?: [string]: [...string]
	#testgroup
	...                  // do allow unknown_key at top level for tooling
} & { data: close({          // top level data dict has keys {sample, secret}
	sample: #testgroup
	secret: #testgroup
	})
}

(I have now a CUE package problemformat that specifies testdata.yaml files, but which doesn’t belong in the generators spec. It defines #testdata_settings.)

@RagnarGrootKoerkamp
Copy link
Owner Author

RagnarGrootKoerkamp commented Jul 3, 2023

ok, many new ideas here to reply to. I really like your 'order of steps' proposal, and think that that corresponds exactly to my mental model of this. Also the CUE spec looks good to me.

Some more thoughts:

'inline' testcases

I call testcases for which in/ans files are provided directly in data/* 'inline' in the code.
These are currently specified with testcase: (with an empty/null value).

  • We never used these for secret data

  • We typically use this sample data:

    sample:
      data:
        '1':
        '2':

    or I believe even this works to ignore that directory completely:

    data:
     sample:
       data:
  • Now that we can hardcode sample data directly, I don't see actually see much need for inline cases at all anymore. Either you don't use generators and just manually put all your data in data/ as you like, or you opt in to use this and then you need to be explicit about everything and basically give up all control over data/.

    This existing 'middle ground' of allowing a subset of data/ to be 'inline' cases while the rest is generated also gives some mental overhead; making it fully generated is simpler and more consistent.

  • One issue is that the sample: '1': syntax is used a lot in existing problems, so either way the implementation should keep supporting that, probably with a deprecation warning.

'manual'/'copied' testcases

There are testcases currently given as testcase: manual/testcase.in.

I like your path: key that takes a path without extension:

testcase:
  path: manual/testcase

where the semantics are that all files generators/manual/testcase.EXT for known extensions EXT are copied to the testcase in data/. This also removes the current inconsistency that testcase: manual/testcase.in actually copies all extensions, not just the .in.

Regarding the name:

  • copy-from: could also work but sligtly long
  • cp: is slightly too linuxy/unclear
  • copy: would work well for me
  • path: isn't super informative about what's actually being done with this path, but is OK.

Then again, command: is also just saying what follows, not what is done with it. It could also be e.g. execute: or run:.

Hardcoded cases

I think we're clear on this: testcase: in: <string> copies that to testcase.in.

Maybe we can dissallow this to be empty. (But see below.)

Unlisted files

I think by default, we should warn for every file in data/ that is not explicitly generated, and we should discourage using inline files.

However, there is still one case that would be nice to have a proper solution for: manual visualizations for the samples, that are not generated using a visualizer: script.

Some options:

  1. 'inline' the entire sample directory using sample: data: null, so that anything is allowed there
  2. 'copy' all testcases on a per-testcase basis: sample: data: 1: path: manual/sample/1 (i'm lazy and compressing the syntax), where generators/manual/sample/1.{in,ans,png} exist.
  3. 'hardcode' the in: and ans: per testcase, and provide path: manual/sample/1 in addition to that to copy over 1.png.
  4. 'hardcode' the in: and ans: per testcase, and set png: to an empty value to indicate the file is provided 'inline' in data/sample/1.png.

With this last option, testcase: null means that any testcase.* may be present inline, while testcase: <ext>: null means that testcase.<ext> may be present inline.

@thorehusfeldt
Copy link
Collaborator

I like option 2 and 3 (explicit) more than 1 and 4 (weird semantics of null.)

@mpsijm
Copy link
Collaborator

mpsijm commented Jul 3, 2023

Specific feedback for option 4: I think that the key should not be named png:, because the visualization is not always a PNG. Perhaps we can "reuse" the visualizer: key for this, although we then get overloaded behaviour where the value of this key is either a command to run for the visualizer, or a path to a pre-generated image.

To fix that, here's a proposal that is in some way a mix between options 3 and 4 (let's call it option 5):

We could allow keys like in, ans, hint, desc, etc. to be either a string or a path: object. For example:

- '':
    in: 2 4  # Content of a testcase, copied to `1.in`
    ans:
      path: manual/sample/1.ans  # Path to the answer, copied to `1.ans`

- '':
    in: 2 4  # Content of a testcase, copied to `2.in`
    ans: 6  # Content copied to `2.ans`
    visualizer:
      path: manual/sample/2.png  # Path to the PNG, instead of a command to run

@thorehusfeldt
Copy link
Collaborator

image (or even visualization) may be better than png. Alternatively, support png jpeg jpg, and pdf. To me, keys are “conceptually cheap” (adding more keys adds little cognitive overhead and very little extra code), whereas overloading interpretations or funky semantics are ”conceptually expensive”. So I prefer the former.

@RagnarGrootKoerkamp
Copy link
Owner Author

Oh yes, anywhere I wrote png I implicied meant that this can be any of the valid image extensions, and you have to choose the right one. That way it's consistent with in and ans directly corresponding to files.

@mpsijm
Copy link
Collaborator

mpsijm commented Jul 4, 2023

@thorehusfeldt I agree with you that "conceptually cheap" specifications are to be preferred.

In practice, I do not think we need the granular control that option 5 provides (why would we ever want not all files matching a prefix to be copied?). The way how option 3 copies all manual files matching the prefix given in the global-for-the-testcase path: key except those that have their contents explicitly provided, should be enough.

I also like options 2 and 3 best, in that case, but I also think we should keep maintaining option 1 (but with an empty value instead of an explicit null) for backwards compatibility (and I wouldn't mind adding a warning there 🙂). (EDIT: I initially listed option 4 here as well, but that's not needed for backwards compatibility of course, so I'm fine with dropping it 😄)

@thorehusfeldt
Copy link
Collaborator

but with an empty value instead of an explicit null

My understanding is that foo: , foo: null, and foo: ~ are the same in yaml.

@RagnarGrootKoerkamp
Copy link
Owner Author

Ok so:

  • option 1 (testcase: null) is already supported by backwards compatibility for samples (data: sample: '1': null) and should keep working (with a warning).
  • options 2 and 3 come by default with the current proposal (the list in this comment)
  • Option 4 would require additional logic for an empty testcase: png: null key. But since we don't like this so much anyway, I'd prefer to just not implement this and disallow empty testcase: <ext>: null keys completely. We only keep the empty testcase: null for completely inline testcases.

Thus, you can either keep your testcase completely inline, or move all of it into generators.yaml.

thorehusfeldt and others added 24 commits August 25, 2023 15:27
…file from sample case

this is already mentioned at the top of the file
@RagnarGrootKoerkamp RagnarGrootKoerkamp merged commit e588586 into master Aug 25, 2023
1 of 2 checks passed
@RagnarGrootKoerkamp RagnarGrootKoerkamp deleted the hardcoding branch August 25, 2023 13:43
RagnarGrootKoerkamp added a commit that referenced this pull request Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants