Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

say: Tests vulnerable to octal interpretation of chunks #1586

Closed
wants to merge 4 commits into from
Closed

say: Tests vulnerable to octal interpretation of chunks #1586

wants to merge 4 commits into from

Conversation

thriqon
Copy link

@thriqon thriqon commented Sep 18, 2019

The tests of say do not catch an erroneous interpretation of chunks as
octal (which might happen due to leading zeroes inside a chunk). For example, my code produced the following:

bash say.sh 1023
one thousand nineteen

This is due to 023 oct == 19 dec. Unfortunately, none of the test
cases test for this (and I'm sure quite a lot of implementations in the
database are wrong because of this).

@yawpitch
Copy link
Contributor

This repo is locked to anything but bug fixes (see #1560), and I’m not sure this qualifies. I’m also not sure how many languages can actually be effected by this (for instance in Python casting “023” to an int will always give you 23 as to be interpreted as octal it would need to be in the 0o23 octal literal from), though obviously Bash is effected.

Perhaps this should be implemented as a track-specific test in those languages that see “023” as an octal literal?

@sshine
Copy link
Contributor

sshine commented Sep 19, 2019

I don't really know if it qualifies as a bug or as an improvement that does not alter the substance of the exercise, but I'd consider it passable.

@thriqon: Did you submit or review a solution that erroneously passed existing tests but fails on this one, and would you mind sharing it? We have two conflicting forces at work: these tests aim to address arbitrary kinds of mistakes in dozens of languages, and yet sadly, the format limits us to a degree where we would like the cases to be succint.

There are existing tests with inner zeroes, e.g. 1002345 -> "one million two thousand three hundred forty-five". Does this not evoke the same kind of error as the first test case in this PR? Is there a motivation to find a smaller counter-example such as 1023? Does "0023" not get interpreted as an octal by the same program?

The second test case in this PR, if somehow the inner part is interpreted as octal, would fail differently. I'm not sure the description "a number with higher digits" fully catches the intent; maybe go with the existing test naming of simply "one thousand nine", since a description of the octal corner case, I doubt, will make sense in a lot of languages.

@rpottsoh
Copy link
Member

@iHiD repo problem-specifications has WIP on the brain.

@thriqon
Copy link
Author

thriqon commented Sep 19, 2019

@sshine, unfortunately, I've since fixed my code and it is gone, I'd have to reintroduce the bug.

The testcase 1,002,345 contains inner zeroes, but not in the bug-triggering position. In order to make the bug visible a chunk has to be a) lower than 100 and b) greater than 7. I'd be happy to rewrite this PR changing 1,023,045, as that should cover all the old intentions AND the octal case.

@petertseng
Copy link
Member

I would approve either this PR as it stands, or 1023045. In both cases, please ensure that Travis CI succeeds. Please read the results of the current run at https://travis-ci.org/exercism/problem-specifications/builds/586724026 if unsure.


It seems we are uncertain what constitutes a bugfix. Here are some possible thoughts.

That is the best I can do for how to determine whether it is a bugfix. Help is appreciated.

The tests of `say` do not catch an erroneous interpretation of chunks as
octal, which happens easily with internal zeroes. For example, my code
produced the following:

````bash
bash say.sh 1023
one thousand nineteen
````

This is due to `023 oct == 19 dec`. Unfortunately, none of the test
cases test for this (and I'm sure quite a lot of implementations in the
database are wrong because of this). These two cases catch

a) wrong interpretation with valid octal numbers
b) interpretation of chunks containing digits invalid in octal, i.e. 8
and 9.
Copy link
Contributor

@sshine sshine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing and I'll approve: Avoid changing the linebreak on the last line of the file.

  • It is unrelated to the intent of the PR.
  • It messes with an (undocumented?) invariance:
for e in exercises/*; do
  cdata="$e/canonical-data.json"
  [ -e "$cdata" -a "$(tail -c 1 "$cdata" | wc -l)" -eq 0 ] \
    && echo "$e has got a newline at eof"
done # currently, nothing is printed.

I actually thought we had a CI script that would complain here. Edit: We address this in #1443.

Also, I see that you use force pushes: If you avoid that, it provides a better overview of the process in the PR, and we can always "squash merge" the PR so it looks nice in the history afterwards. We don't force particular GitHub workflows here, though, just an FYI.


(This part resolves discussion.)

@thriqon wrote:

I'd be happy to rewrite this PR changing 1,023,045, as that should cover all the old intentions AND the octal case.

Looking at #399, it seems that there was no immediate justification for the 1002345 test case. While there probably went some thought into this unit as representing some equivalence class, the tests seem to skip some coverage; modifying this unit will modify the coverage, but it's not immediately clear to me if this is a reduction.

I'm sorry for causing confusion by suggesting that an existing unit might work when it doesn't. I would probably prefer the separate tests you originally started out with, but on @petertseng's initiative, am also happy with either.

exercises/say/canonical-data.json Outdated Show resolved Hide resolved
@sshine
Copy link
Contributor

sshine commented Sep 21, 2019

(The rest is unrelated to fixing this bug in particular.)

@petertseng wrote:

It seems we are uncertain what constitutes a bugfix.

Yes, we are. For the purpose of this PR, it should be enough to assert that it is either a bugfix or an improvement that does not alter the substance of the exercise. We could clarify what we mean by bugfix, since there is a distinction between adding a unit test that fixes a common bug in solutions on a given track, and fixing a bug in canonical data (#1319 is the only most unambiguous example of a bugfix I can think of).

But I don't know if it's necessary, since the aim of categorizing something as a bugfix or an improvement that does not alter the substance of the exercise is to avoid the kind of tension that caused #1560.

Copy link
Contributor

@sshine sshine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Perhaps we should wait for two more approvals.

petertseng
petertseng previously approved these changes Sep 25, 2019
Copy link
Member

@petertseng petertseng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approval conditional on commits being squashed

in about 24 hours it will have been a full week since 1023045 was mentioned. I warn that I may decide to merge at some point after that, on the premise that we've had sufficient time to see any objections.

@yawpitch
Copy link
Contributor

I'm still not certain that this shouldn't be a track-specific test, since octal interpretation of the chunks is a language-implementation-specific issue, but I don't have a formal objection based on that.

What I do somewhat object to is modifying an existing test without also modifying description to incorporate the reasoning for the change ... without looking at the commit history it both numbers appear entirely arbitrary and a future commit could undo the "added" benefit of testing octal interpretation unknowingly.

Honestly I'd rather have a new test with a description that specifically mentioned that it would block an issue with octal interpretation of chunks in languages for which that's an issue.

@sshine
Copy link
Contributor

sshine commented Sep 26, 2019

I'm still not certain that this shouldn't be a track-specific test

I think that principal discussion goes into the work of @iHiD's project group work later this year.

Making the distinction every time people add test cases is one of the most common roadblocks for public, iterative improvement of this repo. We tried to mitigate it with optional tests for common cases (#1492, #1518) and other proposals and discussions (e.g. #1545, your #1553), but "octal interpretation" would be a fringe case even in that model.

What I do somewhat object to is modifying an existing test without also modifying description to incorporate the reasoning for the change ... without looking at the commit history it both numbers appear entirely arbitrary [...]

Yes; as I uncovered, #399 does not immediately justify its particular choice of 1002345, so we can only assert that it has some coverage, and modifying it to 1023045 changes that coverage arbitrarily.

Honestly I'd rather have a new test with a description

Same. But I wasn't blocking either.

that specifically mentioned that it would block an issue with octal interpretation of chunks in languages for which that's an issue.

It's a tradeoff. Some tracks will find that redundant as it's so fringey. The test descriptions in this exercise don't even mention what they're testing most of the time, anyways. Lots of loose ends wrt. having a standard for test descriptions and upholding that. If we're going to solve them in this PR, then the scope change of this PR is both unreasonable and would exceed the temporary lockdown. (Edit: Changed sentence structure.)

As in: If this test case description is simply "one million twenty-three thousand forty-five", it's not broken a standard that the canonical data for this exercise currently upholds; and looking at the history for the test case, we'd at least find this PR rather than the initial commit that contains no coverage description.

I've reverted the test case to 1023, which was in @thriqon's original PR and later confused by my indecision.

@yawpitch
Copy link
Contributor

yawpitch commented Sep 27, 2019

It's a tradeoff. Some tracks will find that redundant as it's so fringey. The test descriptions in this exercise don't even mention what they're testing most of the time, anyways. Lots of loose ends wrt. having a standard for test descriptions and upholding that. If we're going to solve them in this PR, then the scope change of this PR is both unreasonable and would exceed the temporary lockdown. (Edit: Changed sentence structure.)

That seems spurious. The existence of many loose ends doesn't mean there's any value in adding another loose end. Nor does not adding another loose end here implicate that we need to fix all of the other tests in this exercise in this PR. I see no reason that if we merge this it shouldn't have a sensible description all its own.

Moreover with the hold as described in #1560 a PR to update all the descriptions, including this one, should be merged, but arguably this PR should not as the existing tests are not broken. I'm confident that this is not a bug fix, but a new test that is intended to address a language-specific issue. Even before the hold it likely should have been given the "optional" tag so that track maintainers would know if they had reason to implement it.

As in: If this test case description is simply "one million twenty-three thousand forty-five", it's not broken a standard that the canonical data for this exercise currently upholds; and looking at the history for the test case, we'd at least find this PR rather than the initial commit that contains no coverage description.

You're right, this exercise currently upholds no standard whatsoever, including the minimal one implied by common meaning of "description" on that field, however recapitulating a previously existing error is just shooting yourself in the foot again.

I've had reservations about this PR from the start. There is a reason why this repo is on hold, which is why this:

Making the distinction every time people add test cases is one of the most common roadblocks for public, iterative improvement of this repo.

Runs into the wall of my reservations ... until #1560 is closed there are not supposed to be any public, iterative improvements to this repo except for clear and obvious bug fixes -- existing tests are wrong, not merely that the existing tests fail to address all possible issues -- and copy improvements. I'm all for copy improvements and I'd consider the description field to be copy ... but to bypass the intent of the lockdown while not even making a minimal effort at ensuring the new test has good copy seems unworkable to me.

Copy link
Contributor

@yawpitch yawpitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My overall discomfort with merging this will remain, but if a decision is made to merge I'd think that at minimum either a meaningful "description" OR "comment" OR "optional" tag should be added before that merge occurs.

We should not require future committers to do a git history search to try and discover why a specific arbitrary set of digits was used. I mean if we can't even get people to read and respect a prominently pinned Issue in Github, why should we believe they'll troll through a git log?

@sshine
Copy link
Contributor

sshine commented Sep 27, 2019

this PR should not [be merged] as the existing tests are not broken

In that case, we'll put it on hold. I agree with your reasoning.

@thriqon can add this as a track-specific test and refer to this PR in the meantime.

@sshine sshine added the hold label Sep 27, 2019
@petertseng
Copy link
Member

I see that I must have misunderstood, thinking #1581 was a precedent. I now see the error of my ways. My bad!

The reason I now see my misunderstanding is that I now see that (in the leap exercise) dividing by a number other than 100 or 400 is an operation independent of a language (we can do it in our heads, we can do it on paper and pencil), whereas the knowledge that a string that starts with 0 will be interpreted as octal depends on the particular specification of how strings are interpreted as numbers in a given language.

It must be that I initially did not give this enough thought and therefore failed to see this distinction.

When it comes time to consider this PR again, please do consider a description that says why the case is added, not just echo the expected output. Thank you.

@yawpitch
Copy link
Contributor

I see that I must have misunderstood, thinking #1581 was a precedent. I now see the error of my ways. My bad!

It's not an error, it's just a point of view on which rough consensus can be difficult to achieve; there's obviously some wiggle room on what constitutes a bug fix issue. Since #1581 was us making, essentially, a flawed statement about the problem, rather than about any given solution to that problem, it leans a little more towards tipping to "bug fix" and therefore merge-able during the hold. This issue addresses a valid and interesting edge case, but one that will effect only solutions to the problem implemented in certain languages... that leans a little away from "bug fix", even if it's something we definitely want to address once the hold gets lifted.

Base automatically changed from master to main January 27, 2021 15:31
@ErikSchierboom
Copy link
Member

@thriqon Are you still interested in working on this PR?

@IsaacG
Copy link
Member

IsaacG commented Jan 1, 2023

@thriqon Is this PR still relevant or should it be closed out?

@kytrinyx
Copy link
Member

kytrinyx commented Jan 3, 2023

Given that it's been a year since the previous ping with no response, I think we are good to close it.

@IsaacG
Copy link
Member

IsaacG commented Jan 3, 2023

Closing out for now. Feel free to reopen if you want to work on this further!

@IsaacG IsaacG closed this Jan 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants