say: Tests vulnerable to octal interpretation of chunks #1586

thriqon · 2019-09-18T20:27:59Z

The tests of say do not catch an erroneous interpretation of chunks as
octal (which might happen due to leading zeroes inside a chunk). For example, my code produced the following:

bash say.sh 1023
one thousand nineteen

This is due to 023 oct == 19 dec. Unfortunately, none of the test
cases test for this (and I'm sure quite a lot of implementations in the
database are wrong because of this).

yawpitch · 2019-09-18T21:28:51Z

This repo is locked to anything but bug fixes (see #1560), and I’m not sure this qualifies. I’m also not sure how many languages can actually be effected by this (for instance in Python casting “023” to an int will always give you 23 as to be interpreted as octal it would need to be in the 0o23 octal literal from), though obviously Bash is effected.

Perhaps this should be implemented as a track-specific test in those languages that see “023” as an octal literal?

sshine · 2019-09-19T03:18:33Z

I don't really know if it qualifies as a bug or as an improvement that does not alter the substance of the exercise, but I'd consider it passable.

@thriqon: Did you submit or review a solution that erroneously passed existing tests but fails on this one, and would you mind sharing it? We have two conflicting forces at work: these tests aim to address arbitrary kinds of mistakes in dozens of languages, and yet sadly, the format limits us to a degree where we would like the cases to be succint.

There are existing tests with inner zeroes, e.g. 1002345 -> "one million two thousand three hundred forty-five". Does this not evoke the same kind of error as the first test case in this PR? Is there a motivation to find a smaller counter-example such as 1023? Does "0023" not get interpreted as an octal by the same program?

The second test case in this PR, if somehow the inner part is interpreted as octal, would fail differently. I'm not sure the description "a number with higher digits" fully catches the intent; maybe go with the existing test naming of simply "one thousand nine", since a description of the octal corner case, I doubt, will make sense in a lot of languages.

rpottsoh · 2019-09-19T12:49:08Z

@iHiD repo problem-specifications has WIP on the brain.

thriqon · 2019-09-19T19:31:28Z

@sshine, unfortunately, I've since fixed my code and it is gone, I'd have to reintroduce the bug.

The testcase 1,002,345 contains inner zeroes, but not in the bug-triggering position. In order to make the bug visible a chunk has to be a) lower than 100 and b) greater than 7. I'd be happy to rewrite this PR changing 1,023,045, as that should cover all the old intentions AND the octal case.

petertseng · 2019-09-21T01:02:32Z

I would approve either this PR as it stands, or 1023045. In both cases, please ensure that Travis CI succeeds. Please read the results of the current run at https://travis-ci.org/exercism/problem-specifications/builds/586724026 if unsure.

It seems we are uncertain what constitutes a bugfix. Here are some possible thoughts.

"If and only if no types have changed, the PR is a bugfix". Correctly classifies leap: Add years that refute some unusual factors #1581 as a bugfix and two-bucket: test inability to reach the goal #1580 as not-a-bugfix. Classifies this PR as a bugfix.
"If and only if the README is not changed, the PR is a bugfix". Correctly classifies leap: Add years that refute some unusual factors #1581 as a bugfix. Incorrectly classifies two-bucket: test inability to reach the goal #1580 as a bugfix. Classifies this PR as a bugfix.
"If and only if we believe most submitted solutions already do this correctly, the PR is a bugfix". Relies on human guesswork. By my guesses, correctly classifies leap: Add years that refute some unusual factors #1581 as a bugfix and two-bucket: test inability to reach the goal #1580 as not-a-bugfix. By my guesses, classifies this PR as a bugfix. The only implementations affected are those that convert parts of string to an integer as an intermediate step. A reminder that the resulting integers then have to be converted back into a string.

That is the best I can do for how to determine whether it is a bugfix. Help is appreciated.

The tests of `say` do not catch an erroneous interpretation of chunks as octal, which happens easily with internal zeroes. For example, my code produced the following: ````bash bash say.sh 1023 one thousand nineteen ```` This is due to `023 oct == 19 dec`. Unfortunately, none of the test cases test for this (and I'm sure quite a lot of implementations in the database are wrong because of this). These two cases catch a) wrong interpretation with valid octal numbers b) interpretation of chunks containing digits invalid in octal, i.e. 8 and 9.

sshine

One last thing and I'll approve: Avoid changing the linebreak on the last line of the file.

It is unrelated to the intent of the PR.
It messes with an (undocumented?) invariance:

for e in exercises/*; do
  cdata="$e/canonical-data.json"
  [ -e "$cdata" -a "$(tail -c 1 "$cdata" | wc -l)" -eq 0 ] \
    && echo "$e has got a newline at eof"
done # currently, nothing is printed.

I actually thought we had a CI script that would complain here. Edit: We address this in #1443.

Also, I see that you use force pushes: If you avoid that, it provides a better overview of the process in the PR, and we can always "squash merge" the PR so it looks nice in the history afterwards. We don't force particular GitHub workflows here, though, just an FYI.

(This part resolves discussion.)

@thriqon wrote:

I'd be happy to rewrite this PR changing 1,023,045, as that should cover all the old intentions AND the octal case.

Looking at #399, it seems that there was no immediate justification for the 1002345 test case. While there probably went some thought into this unit as representing some equivalence class, the tests seem to skip some coverage; modifying this unit will modify the coverage, but it's not immediately clear to me if this is a reduction.

I'm sorry for causing confusion by suggesting that an existing unit might work when it doesn't. I would probably prefer the separate tests you originally started out with, but on @petertseng's initiative, am also happy with either.

exercises/say/canonical-data.json

sshine · 2019-09-21T17:49:53Z

(The rest is unrelated to fixing this bug in particular.)

@petertseng wrote:

It seems we are uncertain what constitutes a bugfix.

Yes, we are. For the purpose of this PR, it should be enough to assert that it is either a bugfix or an improvement that does not alter the substance of the exercise. We could clarify what we mean by bugfix, since there is a distinction between adding a unit test that fixes a common bug in solutions on a given track, and fixing a bug in canonical data (#1319 is the ~~only~~ most unambiguous example of a bugfix I can think of).

But I don't know if it's necessary, since the aim of categorizing something as a bugfix or an improvement that does not alter the substance of the exercise is to avoid the kind of tension that caused #1560.

sshine

LGTM.

Perhaps we should wait for two more approvals.

petertseng

approval conditional on commits being squashed

in about 24 hours it will have been a full week since 1023045 was mentioned. I warn that I may decide to merge at some point after that, on the premise that we've had sufficient time to see any objections.

yawpitch · 2019-09-26T12:14:17Z

I'm still not certain that this shouldn't be a track-specific test, since octal interpretation of the chunks is a language-implementation-specific issue, but I don't have a formal objection based on that.

What I do somewhat object to is modifying an existing test without also modifying description to incorporate the reasoning for the change ... without looking at the commit history it both numbers appear entirely arbitrary and a future commit could undo the "added" benefit of testing octal interpretation unknowingly.

Honestly I'd rather have a new test with a description that specifically mentioned that it would block an issue with octal interpretation of chunks in languages for which that's an issue.

sshine · 2019-09-26T13:41:57Z

I'm still not certain that this shouldn't be a track-specific test

I think that principal discussion goes into the work of @iHiD's project group work later this year.

Making the distinction every time people add test cases is one of the most common roadblocks for public, iterative improvement of this repo. We tried to mitigate it with optional tests for common cases (#1492, #1518) and other proposals and discussions (e.g. #1545, your #1553), but "octal interpretation" would be a fringe case even in that model.

What I do somewhat object to is modifying an existing test without also modifying description to incorporate the reasoning for the change ... without looking at the commit history it both numbers appear entirely arbitrary [...]

Yes; as I uncovered, #399 does not immediately justify its particular choice of 1002345, so we can only assert that it has some coverage, and modifying it to 1023045 changes that coverage arbitrarily.

Honestly I'd rather have a new test with a description

Same. But I wasn't blocking either.

that specifically mentioned that it would block an issue with octal interpretation of chunks in languages for which that's an issue.

It's a tradeoff. Some tracks will find that redundant as it's so fringey. The test descriptions in this exercise don't even mention what they're testing most of the time, anyways. Lots of loose ends wrt. having a standard for test descriptions and upholding that. If we're going to solve them in this PR, then the scope change of this PR is both unreasonable and would exceed the temporary lockdown. (Edit: Changed sentence structure.)

As in: If this test case description is simply "one million twenty-three thousand forty-five", it's not broken a standard that the canonical data for this exercise currently upholds; and looking at the history for the test case, we'd at least find this PR rather than the initial commit that contains no coverage description.

I've reverted the test case to 1023, which was in @thriqon's original PR and later confused by my indecision.

exercises/say/canonical-data.json

yawpitch · 2019-09-27T10:17:17Z

It's a tradeoff. Some tracks will find that redundant as it's so fringey. The test descriptions in this exercise don't even mention what they're testing most of the time, anyways. Lots of loose ends wrt. having a standard for test descriptions and upholding that. If we're going to solve them in this PR, then the scope change of this PR is both unreasonable and would exceed the temporary lockdown. (Edit: Changed sentence structure.)

That seems spurious. The existence of many loose ends doesn't mean there's any value in adding another loose end. Nor does not adding another loose end here implicate that we need to fix all of the other tests in this exercise in this PR. I see no reason that if we merge this it shouldn't have a sensible description all its own.

Moreover with the hold as described in #1560 a PR to update all the descriptions, including this one, should be merged, but arguably this PR should not as the existing tests are not broken. I'm confident that this is not a bug fix, but a new test that is intended to address a language-specific issue. Even before the hold it likely should have been given the "optional" tag so that track maintainers would know if they had reason to implement it.

As in: If this test case description is simply "one million twenty-three thousand forty-five", it's not broken a standard that the canonical data for this exercise currently upholds; and looking at the history for the test case, we'd at least find this PR rather than the initial commit that contains no coverage description.

You're right, this exercise currently upholds no standard whatsoever, including the minimal one implied by common meaning of "description" on that field, however recapitulating a previously existing error is just shooting yourself in the foot again.

I've had reservations about this PR from the start. There is a reason why this repo is on hold, which is why this:

Making the distinction every time people add test cases is one of the most common roadblocks for public, iterative improvement of this repo.

Runs into the wall of my reservations ... until #1560 is closed there are not supposed to be any public, iterative improvements to this repo except for clear and obvious bug fixes -- existing tests are wrong, not merely that the existing tests fail to address all possible issues -- and copy improvements. I'm all for copy improvements and I'd consider the description field to be copy ... but to bypass the intent of the lockdown while not even making a minimal effort at ensuring the new test has good copy seems unworkable to me.

yawpitch

My overall discomfort with merging this will remain, but if a decision is made to merge I'd think that at minimum either a meaningful "description" OR "comment" OR "optional" tag should be added before that merge occurs.

We should not require future committers to do a git history search to try and discover why a specific arbitrary set of digits was used. I mean if we can't even get people to read and respect a prominently pinned Issue in Github, why should we believe they'll troll through a git log?

sshine · 2019-09-27T12:19:04Z

this PR should not [be merged] as the existing tests are not broken

In that case, we'll put it on hold. I agree with your reasoning.

@thriqon can add this as a track-specific test and refer to this PR in the meantime.

petertseng · 2019-09-27T16:03:03Z

I see that I must have misunderstood, thinking #1581 was a precedent. I now see the error of my ways. My bad!

The reason I now see my misunderstanding is that I now see that (in the leap exercise) dividing by a number other than 100 or 400 is an operation independent of a language (we can do it in our heads, we can do it on paper and pencil), whereas the knowledge that a string that starts with 0 will be interpreted as octal depends on the particular specification of how strings are interpreted as numbers in a given language.

It must be that I initially did not give this enough thought and therefore failed to see this distinction.

When it comes time to consider this PR again, please do consider a description that says why the case is added, not just echo the expected output. Thank you.

#1586 (comment)

yawpitch · 2019-09-27T16:47:25Z

I see that I must have misunderstood, thinking #1581 was a precedent. I now see the error of my ways. My bad!

It's not an error, it's just a point of view on which rough consensus can be difficult to achieve; there's obviously some wiggle room on what constitutes a bug fix issue. Since #1581 was us making, essentially, a flawed statement about the problem, rather than about any given solution to that problem, it leans a little more towards tipping to "bug fix" and therefore merge-able during the hold. This issue addresses a valid and interesting edge case, but one that will effect only solutions to the problem implemented in certain languages... that leans a little away from "bug fix", even if it's something we definitely want to address once the hold gets lifted.

ErikSchierboom · 2022-01-12T13:34:46Z

@thriqon Are you still interested in working on this PR?

IsaacG · 2023-01-01T18:39:13Z

@thriqon Is this PR still relevant or should it be closed out?

kytrinyx · 2023-01-03T08:09:19Z

Given that it's been a year since the previous ping with no response, I think we are good to close it.

IsaacG · 2023-01-03T19:49:22Z

Closing out for now. Feel free to reopen if you want to work on this further!

sshine requested changes Sep 21, 2019

View reviewed changes

exercises/say/canonical-data.json Outdated Show resolved Hide resolved

Drop trailing newline

ed5bf4b

sshine approved these changes Sep 24, 2019

View reviewed changes

petertseng previously approved these changes Sep 25, 2019

View reviewed changes

Revert to @thriqon's original test of 1023.

e35ae57

petertseng reviewed Sep 26, 2019

View reviewed changes

exercises/say/canonical-data.json Outdated Show resolved Hide resolved

Revert the entire test to 1023, not just the description.

f2252fc

yawpitch requested changes Sep 27, 2019

View reviewed changes

sshine added the hold label Sep 27, 2019

sshine mentioned this pull request Oct 13, 2019

resistor-color-duo: Fix wording in description, add test case #1605

Closed

Base automatically changed from master to main January 27, 2021 15:31

ErikSchierboom removed the hold label Oct 15, 2021

IsaacG closed this Jan 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

say: Tests vulnerable to octal interpretation of chunks #1586

say: Tests vulnerable to octal interpretation of chunks #1586

thriqon commented Sep 18, 2019

yawpitch commented Sep 18, 2019

sshine commented Sep 19, 2019

rpottsoh commented Sep 19, 2019

thriqon commented Sep 19, 2019

petertseng commented Sep 21, 2019

sshine left a comment •

edited

sshine commented Sep 21, 2019 •

edited

sshine left a comment

petertseng left a comment

yawpitch commented Sep 26, 2019

sshine commented Sep 26, 2019 •

edited

yawpitch commented Sep 27, 2019 •

edited

yawpitch left a comment •

edited

sshine commented Sep 27, 2019 •

edited

petertseng commented Sep 27, 2019

yawpitch commented Sep 27, 2019

ErikSchierboom commented Jan 12, 2022

IsaacG commented Jan 1, 2023

kytrinyx commented Jan 3, 2023

IsaacG commented Jan 3, 2023

say: Tests vulnerable to octal interpretation of chunks #1586

say: Tests vulnerable to octal interpretation of chunks #1586

Conversation

thriqon commented Sep 18, 2019

yawpitch commented Sep 18, 2019

sshine commented Sep 19, 2019

rpottsoh commented Sep 19, 2019

thriqon commented Sep 19, 2019

petertseng commented Sep 21, 2019

sshine left a comment • edited

Choose a reason for hiding this comment

sshine commented Sep 21, 2019 • edited

sshine left a comment

Choose a reason for hiding this comment

petertseng left a comment

Choose a reason for hiding this comment

yawpitch commented Sep 26, 2019

sshine commented Sep 26, 2019 • edited

yawpitch commented Sep 27, 2019 • edited

yawpitch left a comment • edited

Choose a reason for hiding this comment

sshine commented Sep 27, 2019 • edited

petertseng commented Sep 27, 2019

yawpitch commented Sep 27, 2019

ErikSchierboom commented Jan 12, 2022

IsaacG commented Jan 1, 2023

kytrinyx commented Jan 3, 2023

IsaacG commented Jan 3, 2023

sshine left a comment •

edited

sshine commented Sep 21, 2019 •

edited

sshine commented Sep 26, 2019 •

edited

yawpitch commented Sep 27, 2019 •

edited

yawpitch left a comment •

edited

sshine commented Sep 27, 2019 •

edited