
Add instructions on how to extract canonical data to the Contributing Guide #230

Closed
petertseng opened this issue Apr 19, 2016 · 15 comments

Comments

@petertseng
Member

petertseng commented Apr 19, 2016

The TODO issues like https://github.com/exercism/todo/issues/100 point us to some examples as guidelines for what should go in a test inputs/outputs JSON file. It would probably be useful to assemble a guide somewhere about that, using what we know of our existing JSON files.

This is prompted by exercism/discussions#10 (comment) where I muse that it could have been good if some non-obvious corner cases had been commented with their purpose. Sure, in many situations the test name is enough, but when it isn't, I'm always for more documentation.

@kotp
Member

kotp commented Apr 19, 2016

I had noted in this comment how (what appears to be) a comment made it into the test title. That should not happen. If we do need comments, they should be in the JSON file under "notes" or something similar, rather than in the title.

@kytrinyx
Member

Agreed, it would be really great to have a comprehensive guide (that we could add to). Probably somewhere in the contributing guide.

We haven't quite worked out the format for comments--if you have suggestions for how to improve them, that would be great.

Maybe an optional key "notes" for any test case? That way we can emit the notes as comments in the generated test suites when they exist.

@kotp
Member

kotp commented Apr 19, 2016

Hey! "notes" is a good idea. 😆 I was thinking that messages for the tests may be helpful. The generated tests simply fail at the moment with no clarification of what may be the problem, other than the test title. So we have two kinds of "comments" to consider.

I am not familiar enough with other testing frameworks to know how error messaging is handled, but a "test-message" key for test messages where appropriate, and a "notes" key for commentary, such as an explanation for a magic number, might be nice.
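
To make that concrete, something like this is what I have in mind (a sketch only; neither "notes" nor "test-message" exists in any agreed-upon schema, and the Python below just renders the hypothetical JSON):

import json

# Hypothetical canonical-data.json case combining the two ideas above.
# Neither "notes" nor "test-message" is an agreed-upon key; sketch only.
case = {
    "description": "century year not divisible by 400 is a common year",
    "input": 1900,
    "expected": False,
    "notes": "1900 trips up solutions that only check divisibility by 4",
    "test-message": "expected a common year but got a leap year",
}
print(json.dumps(case, indent=2))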

@kytrinyx
Member

We did have messages for some tests, but I found them incredibly redundant. I get the benefit of working with both Go and Ruby, which have very different approaches to testing. I think we can use the test data to generate messages where necessary (e.g. the leap year test suite could have a message that says "case: 1998" or something like that).
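
For example, something along these lines (just a sketch, not any track's actual generator; the case data here is made up, not the real leap-year canonical-data.json):

# Sketch of deriving a failure message from the canonical data itself, along
# the lines of the "case: 1998" idea; the case layout below is illustrative,
# not the real leap-year canonical-data.json.
cases = [
    {"description": "year not divisible by 4", "input": 1998, "expected": False},
    {"description": "year divisible by 400", "input": 2000, "expected": True},
]

for case in cases:
    print(f"case: {case['input']} ({case['description']})")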

I really like the idea of having notes that are meant for the end user, though, for places where it makes sense.

@kotp
Member

kotp commented Apr 20, 2016

I agree, comments like the ones in food_chain (not generated) are meant to be informative to the learner, and I think they are in the right place. As far as messages... since I have (finally) done a generator class, I think it isn't such a big deal to get values into the messages for the tests. If the tests are actually descriptive enough (perhaps with test titles following the pattern in #231: "standard and odd year"), we can then use the description in the failure message of the test, perhaps with the value of the offending call and the inputs.


#231: A little more descriptive Leap Year test titles.

@kytrinyx
Member

Agreed - if the description is good, then that combined with inputs/outputs should help dramatically.

@petertseng
Member Author

petertseng commented Jul 5, 2016

Notes: in #289 there is a question about whether there should be a facility for marking tests as primarily testing performance, so that tracks that auto-generate can read that and react accordingly (if they want to treat performance tests differently).

Another musing I had: should there be a difference between comments that need only be visible to track maintainers (or anyone who wants to port the exercise to a new language) versus those that should be added directly to auto-generated test files so that students can see them?
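
As a straw man (none of these keys exist in the current data files), a case could carry a performance marker plus separate maintainer-facing and student-facing annotations; the Python below just renders the hypothetical JSON:

import json

# Straw-man keys only, none of them part of the current schema:
#   "optional"  marks a performance-focused case that generating tracks may skip,
#   "comments"  is aimed at maintainers and porters,
#   "notes"     is aimed at students via generated test comments.
case = {
    "description": "handles a very large input without timing out",
    "input": 10**6,
    "expected": True,
    "optional": "performance",
    "comments": "exists to catch accidentally quadratic solutions",
    "notes": "this case is about speed; a correct but slow solution will fail it",
}
print(json.dumps(case, indent=2))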

@kytrinyx changed the title from "Guide for what should go in a test json file" to "Add instructions on how to extract canonical data to the Contributing Guide" on Jul 5, 2016
@kytrinyx
Member

kytrinyx commented Aug 1, 2016

Quick note: this might be helpful: http://x.exercism.io/v3/problems/data-todos

@IanWhitney
Contributor

IanWhitney commented Aug 3, 2016

My general strategy:

  1. Look at the list of the tracks that currently implement the problem. I use http://x.exercism.io/problems/:problem-name
  2. Open up a bunch of tabs, 1 per language that implements the problem, to http://exercism.io/exercises/:language-name/:problem-name
  3. Scan the test suites to find Families and Outliers.
    • Usually a problem has a Family: a common set of tests that started off in one language and has been copied into a bunch of other languages, sometimes with minor modifications. If you're lucky, the problem only has one Family.
    • Problems usually have one Outlier language (looking at you, Go, and sometimes Haskell) that has decided to forge its own path and write a brand-new test suite from scratch.
  4. From that survey, compile a set of tests that:
    • Covers the described problem thoroughly enough (note: not thoroughly, just enough)
    • Generally represents the tests that are currently in Production, because asking every track maintainer to fully rewrite their test suite is mean.
  5. Determine test order. General rules for that:
    • Put the simplest test first. Get students to a green test ASAP.
    • Introduce 1 new wrinkle per test
    • Try not to have tests that pass without any code changes. If the test already passes, why is it there?
    • Try to avoid tests that exist to check performance, or clearly mark them as optional. These tests are unit tests and should report whether the code works. Performant code is great, but those tests belong in an acceptance suite.

I gots lots of opinions on this, apparently. Take or leave as you see fit.

@petertseng
Member Author

petertseng commented Aug 10, 2016

I'm not sure I have so much more to add, but I'd like to point out a small thing that is helpful to me (more helpful and up-to-date than the individual issues in the todo repository, for sure...)

List problems:

curl http://x.exercism.io/v3/problems/data-todos | jq '.[] | map(.["slug"])'

Look at a given problem (replace accumulate with your chosen exercise):

curl http://x.exercism.io/v3/problems/data-todos | jq '.[] | map(select(.["slug"] == "accumulate"))'

This lists the implementing tracks, is aware of whether a track has moved its exercises under the exercises/ directory, and links to them correctly.

@rbasso
Contributor

rbasso commented Aug 10, 2016

Problems usually have one Outlier language (looking at you, Go, and sometimes Haskell) that has decided to forge its own path and write a brand-new test suite from scratch.

We are fixing that in the Haskell track, @IanWhitney. 😄 exercism/haskell#211

Try to avoid tests that exist to check performance.

We still have a few custom performance tests in xhaskell. Sorry! 😔
I would like to move them to benchmark suites in the future, but it will take some time before we get to that, because it is low priority.

I gots lots of opinions on this, apparently. Take or leave as you see fit.

Makes perfect sense to me. 👍

@petertseng
Member Author

As we have discussed at #375 (comment), just as important as what does go in the canonical test data is what does not. There may be cases (such as negative lengths for the triangle exercise) that people might reasonably think should be tested, but that we discuss and choose not to test (perhaps to focus on the problem at hand and/or to avoid edge-itis).

It is beneficial to record the decision of omission so that people don't have to trawl through issues and/or risk duplicating past work. The record may be made either in the JSON file or in the README; the README is especially appropriate if we wish students to be aware of it and/or want to offer that they can add tests for such cases themselves, which seems like a fine thing to encourage.
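
One possible shape for such a record in the file itself (purely a sketch; whether it belongs in the JSON or the README is exactly the open question, and no top-level "comments" key currently exists):

import json

# Sketch of a top-level record of a deliberately omitted case, so future
# contributors can find the decision without trawling through old issues.
data = {
    "exercise": "triangle",
    "comments": [
        "Negative side lengths are deliberately not tested here.",
        "Tracks may invite students to add such tests themselves.",
    ],
    "cases": [],
}
print(json.dumps(data, indent=2))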

@junedev
Member

junedev commented Oct 21, 2016

The contributing guide has been updated. How about closing this issue?

@petertseng
Member Author

petertseng commented Nov 7, 2016

I have a bit of an identity crisis with this issue.

Reading my opening statement on this issue, it appears I wished to discuss a standard for how to annotate an individual test case with the rationale for having it. At some point it appears it turned into a duplicate of #139 instead. I'll gladly close #139. It's my fault for not keeping this one on my original track.

To make clear what I'm talking about, here is a question that I ask: When I look at a canonical-data.json, how can I be sure that every test case is necessary? What does each individual case test for and what mistaken solutions might it help catch? If it is not already clear from the test description, I want some way to tell.

Now, we may decide that the way to tell is through the git commit message, and take no further action. It may be hard to look through many commit messages to find the case I'm looking for, but maybe we don't have that problem yet. But this is why I ask whether it should be written directly in the JSON file.


There are discussions of various things in this issue: the original topic (how to express the intent of non-obvious test cases) and what the title later became (instructions on how to extract canonical data).

So I'll either rename this issue to reflect the first topic since there are comments about it in here already (I'll do this first), or create a new issue and close this one if it's become too unwieldy.

@petertseng changed the title from "Add instructions on how to extract canonical data to the Contributing Guide" to "How do we express the intention of non-obvious test cases in our canonical-data.json files?" on Nov 7, 2016
@petertseng
Member Author

It's too distracting to have to read through everything. I'll create a new issue and reference comments from this issue where appropriate.

@petertseng changed the title from "How do we express the intention of non-obvious test cases in our canonical-data.json files?" to "Add instructions on how to extract canonical data to the Contributing Guide" on Nov 7, 2016