-
-
Notifications
You must be signed in to change notification settings - Fork 535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
say: Tests vulnerable to octal interpretation of chunks #1586
Conversation
This repo is locked to anything but bug fixes (see #1560), and I’m not sure this qualifies. I’m also not sure how many languages can actually be effected by this (for instance in Python casting “023” to an int will always give you Perhaps this should be implemented as a track-specific test in those languages that see “023” as an octal literal? |
I don't really know if it qualifies as a bug or as an improvement that does not alter the substance of the exercise, but I'd consider it passable. @thriqon: Did you submit or review a solution that erroneously passed existing tests but fails on this one, and would you mind sharing it? We have two conflicting forces at work: these tests aim to address arbitrary kinds of mistakes in dozens of languages, and yet sadly, the format limits us to a degree where we would like the cases to be succint. There are existing tests with inner zeroes, e.g. 1002345 -> "one million two thousand three hundred forty-five". Does this not evoke the same kind of error as the first test case in this PR? Is there a motivation to find a smaller counter-example such as 1023? Does "0023" not get interpreted as an octal by the same program? The second test case in this PR, if somehow the inner part is interpreted as octal, would fail differently. I'm not sure the description "a number with higher digits" fully catches the intent; maybe go with the existing test naming of simply "one thousand nine", since a description of the octal corner case, I doubt, will make sense in a lot of languages. |
@iHiD repo |
@sshine, unfortunately, I've since fixed my code and it is gone, I'd have to reintroduce the bug. The testcase |
I would approve either this PR as it stands, or 1023045. In both cases, please ensure that Travis CI succeeds. Please read the results of the current run at https://travis-ci.org/exercism/problem-specifications/builds/586724026 if unsure. It seems we are uncertain what constitutes a bugfix. Here are some possible thoughts.
That is the best I can do for how to determine whether it is a bugfix. Help is appreciated. |
The tests of `say` do not catch an erroneous interpretation of chunks as octal, which happens easily with internal zeroes. For example, my code produced the following: ````bash bash say.sh 1023 one thousand nineteen ```` This is due to `023 oct == 19 dec`. Unfortunately, none of the test cases test for this (and I'm sure quite a lot of implementations in the database are wrong because of this). These two cases catch a) wrong interpretation with valid octal numbers b) interpretation of chunks containing digits invalid in octal, i.e. 8 and 9.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last thing and I'll approve: Avoid changing the linebreak on the last line of the file.
- It is unrelated to the intent of the PR.
- It messes with an (undocumented?) invariance:
for e in exercises/*; do
cdata="$e/canonical-data.json"
[ -e "$cdata" -a "$(tail -c 1 "$cdata" | wc -l)" -eq 0 ] \
&& echo "$e has got a newline at eof"
done # currently, nothing is printed.
I actually thought we had a CI script that would complain here. Edit: We address this in #1443.
Also, I see that you use force pushes: If you avoid that, it provides a better overview of the process in the PR, and we can always "squash merge" the PR so it looks nice in the history afterwards. We don't force particular GitHub workflows here, though, just an FYI.
(This part resolves discussion.)
@thriqon wrote:
I'd be happy to rewrite this PR changing 1,023,045, as that should cover all the old intentions AND the octal case.
Looking at #399, it seems that there was no immediate justification for the 1002345 test case. While there probably went some thought into this unit as representing some equivalence class, the tests seem to skip some coverage; modifying this unit will modify the coverage, but it's not immediately clear to me if this is a reduction.
I'm sorry for causing confusion by suggesting that an existing unit might work when it doesn't. I would probably prefer the separate tests you originally started out with, but on @petertseng's initiative, am also happy with either.
(The rest is unrelated to fixing this bug in particular.) @petertseng wrote:
Yes, we are. For the purpose of this PR, it should be enough to assert that it is either a bugfix or an improvement that does not alter the substance of the exercise. We could clarify what we mean by bugfix, since there is a distinction between adding a unit test that fixes a common bug in solutions on a given track, and fixing a bug in canonical data (#1319 is the But I don't know if it's necessary, since the aim of categorizing something as a bugfix or an improvement that does not alter the substance of the exercise is to avoid the kind of tension that caused #1560. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Perhaps we should wait for two more approvals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approval conditional on commits being squashed
in about 24 hours it will have been a full week since 1023045 was mentioned. I warn that I may decide to merge at some point after that, on the premise that we've had sufficient time to see any objections.
I'm still not certain that this shouldn't be a track-specific test, since octal interpretation of the chunks is a language-implementation-specific issue, but I don't have a formal objection based on that. What I do somewhat object to is modifying an existing test without also modifying description to incorporate the reasoning for the change ... without looking at the commit history it both numbers appear entirely arbitrary and a future commit could undo the "added" benefit of testing octal interpretation unknowingly. Honestly I'd rather have a new test with a description that specifically mentioned that it would block an issue with octal interpretation of chunks in languages for which that's an issue. |
I think that principal discussion goes into the work of @iHiD's project group work later this year. Making the distinction every time people add test cases is one of the most common roadblocks for public, iterative improvement of this repo. We tried to mitigate it with optional tests for common cases (#1492, #1518) and other proposals and discussions (e.g. #1545, your #1553), but "octal interpretation" would be a fringe case even in that model.
Yes; as I uncovered, #399 does not immediately justify its particular choice of 1002345, so we can only assert that it has some coverage, and modifying it to 1023045 changes that coverage arbitrarily.
Same. But I wasn't blocking either.
It's a tradeoff. Some tracks will find that redundant as it's so fringey. The test descriptions in this exercise don't even mention what they're testing most of the time, anyways. Lots of loose ends wrt. having a standard for test descriptions and upholding that. If we're going to solve them in this PR, then the scope change of this PR is both unreasonable and would exceed the temporary lockdown. (Edit: Changed sentence structure.) As in: If this test case description is simply "one million twenty-three thousand forty-five", it's not broken a standard that the canonical data for this exercise currently upholds; and looking at the history for the test case, we'd at least find this PR rather than the initial commit that contains no coverage description. I've reverted the test case to 1023, which was in @thriqon's original PR and later confused by my indecision. |
That seems spurious. The existence of many loose ends doesn't mean there's any value in adding another loose end. Nor does not adding another loose end here implicate that we need to fix all of the other tests in this exercise in this PR. I see no reason that if we merge this it shouldn't have a sensible description all its own. Moreover with the hold as described in #1560 a PR to update all the descriptions, including this one, should be merged, but arguably this PR should not as the existing tests are not broken. I'm confident that this is not a bug fix, but a new test that is intended to address a language-specific issue. Even before the hold it likely should have been given the "optional" tag so that track maintainers would know if they had reason to implement it.
You're right, this exercise currently upholds no standard whatsoever, including the minimal one implied by common meaning of "description" on that field, however recapitulating a previously existing error is just shooting yourself in the foot again. I've had reservations about this PR from the start. There is a reason why this repo is on hold, which is why this:
Runs into the wall of my reservations ... until #1560 is closed there are not supposed to be any public, iterative improvements to this repo except for clear and obvious bug fixes -- existing tests are wrong, not merely that the existing tests fail to address all possible issues -- and copy improvements. I'm all for copy improvements and I'd consider the description field to be copy ... but to bypass the intent of the lockdown while not even making a minimal effort at ensuring the new test has good copy seems unworkable to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My overall discomfort with merging this will remain, but if a decision is made to merge I'd think that at minimum either a meaningful "description" OR "comment" OR "optional" tag should be added before that merge occurs.
We should not require future committers to do a git history search to try and discover why a specific arbitrary set of digits was used. I mean if we can't even get people to read and respect a prominently pinned Issue in Github, why should we believe they'll troll through a git log
?
In that case, we'll put it on hold. I agree with your reasoning. @thriqon can add this as a track-specific test and refer to this PR in the meantime. |
I see that I must have misunderstood, thinking #1581 was a precedent. I now see the error of my ways. My bad! The reason I now see my misunderstanding is that I now see that (in the leap exercise) dividing by a number other than 100 or 400 is an operation independent of a language (we can do it in our heads, we can do it on paper and pencil), whereas the knowledge that a string that starts with 0 will be interpreted as octal depends on the particular specification of how strings are interpreted as numbers in a given language. It must be that I initially did not give this enough thought and therefore failed to see this distinction. When it comes time to consider this PR again, please do consider a description that says why the case is added, not just echo the expected output. Thank you. |
It's not an error, it's just a point of view on which rough consensus can be difficult to achieve; there's obviously some wiggle room on what constitutes a bug fix issue. Since #1581 was us making, essentially, a flawed statement about the problem, rather than about any given solution to that problem, it leans a little more towards tipping to "bug fix" and therefore merge-able during the hold. This issue addresses a valid and interesting edge case, but one that will effect only solutions to the problem implemented in certain languages... that leans a little away from "bug fix", even if it's something we definitely want to address once the hold gets lifted. |
@thriqon Are you still interested in working on this PR? |
@thriqon Is this PR still relevant or should it be closed out? |
Given that it's been a year since the previous ping with no response, I think we are good to close it. |
Closing out for now. Feel free to reopen if you want to work on this further! |
The tests of
say
do not catch an erroneous interpretation of chunks asoctal (which might happen due to leading zeroes inside a chunk). For example, my code produced the following:
This is due to
023 oct == 19 dec
. Unfortunately, none of the testcases test for this (and I'm sure quite a lot of implementations in the
database are wrong because of this).