Word Count - Adds multiple apostrophe test case #1628

c4llmeco4ch · 2020-01-07T23:28:14Z

I noticed when working through the word-count problem on the Python track, I was able to pass the test suite with an incorrect solution. Effectively, the test suite does not check that users ignore non-contracting apostrophes. As such, I've added a simple test checking that the word "''hey''" (the middle quotes are apostrophes) successfully comes out to "hey". I could see this being potentially out-of-scope though.
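
A minimal sketch of the behaviour the proposed test checks, assuming a Python solution built on a regular expression. The names `count_words` and `WORD` are hypothetical and chosen for illustration; this is not the track's reference solution.

```python
import re
from collections import Counter

# Hypothetical pattern: a word is a run of letters or digits, optionally
# joined by single internal apostrophes (so "can't" stays whole).
# Apostrophes at the edges of a word are never part of a match.
WORD = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def count_words(sentence):
    """Return a dict mapping each lowercase word to its number of occurrences."""
    return dict(Counter(WORD.findall(sentence.lower())))


print(count_words("''hey''"))  # {'hey': 1}
print(count_words("Joe can't tell between 'large' and large."))
```

Because the pattern only matches apostrophes that sit between alphanumeric characters, any number of surrounding quote characters is dropped without extra handling.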

yawpitch · 2020-01-18T15:33:22Z

This is something of an oversight; however, I believe merging this is currently blocked by #1560.

c4llmeco4ch · 2020-01-18T18:59:28Z

Fair enough, wasn't sure if this fell under the scope of "test cases being wrong" as mentioned in that post, but I'll look to send it over to the language-specific repos and see what trickles down until the lock is lifted.

yawpitch · 2020-01-19T13:47:25Z

Yeah, I'm a little uncomfortable with a missing test being the same thing as a test being wrong ... the description.md does mention just a single intervening apostrophe, but we also don't test things like "0th" or other interpolations of digits between letters, or hyphenated words ... I mean, what should the word count be for 'twouldn't've, anyway? The tests we've written generally describe a minimally functional implementation; they don't describe a bug-free one. So while I don't mind the idea of tightening this one up a little further once the block has lifted, I'd say it doesn't quite rise to the urgency that would be required to make an exception.

c4llmeco4ch · 2020-01-21T21:47:08Z

Sure, makes sense. Should we close this for now or leave it on hold? I'm not sure how long the lock-down is planned to last. I don't want to unnecessarily clutter the pull request queue, especially since it sounds like the maintainers are incredibly busy.

Thanks for your consideration!

iHiD · 2020-01-21T23:16:37Z

Leave it open on hold. Thank you :)

ErikSchierboom · 2021-12-10T10:36:12Z

I'm not sure that the additional test case is something that we'd need to test for. How likely is it that someone writes an implementation that passes single quotes but fails with double quotes? Similarly, is it reasonable to expect double quotes in input?

ErikSchierboom · 2021-12-10T10:36:36Z

CC @exercism/reviewers

wolf99

LGTM.
If implementations are unlikely to make this mistake then they will pass implicitly anyway.
If they do make the mistake then it is good to catch it as it may indicate an overly convoluted implementation.

petertseng · 2021-12-10T12:25:49Z

For what it's worth, my current implementation wouldn't have handled this properly since I'm only removing exactly one pair of surrounding single quotes from a given word. And since no test currently has two pairs of surrounding single quotes, this test would add value. If our tests only ever have words surrounded by one pair, no incentive to handle more than one pair, and I predict (but have no hard data to back up) that the path of least resistance is to in fact just handle one pair (because no looping required).
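
A small sketch of the difference described above, assuming a word has already been isolated as a string. Both helper names are hypothetical and are not taken from any submitted solution.

```python
def strip_one_pair(word):
    """Remove exactly one pair of surrounding single quotes, if present."""
    if len(word) >= 2 and word[0] == "'" and word[-1] == "'":
        return word[1:-1]
    return word


def strip_all_pairs(word):
    """Repeat the single-pair removal until no surrounding pair remains."""
    while len(word) >= 2 and word[0] == "'" and word[-1] == "'":
        word = word[1:-1]
    return word


print(strip_one_pair("'hey'"))     # hey
print(strip_one_pair("''hey''"))   # 'hey'  (one pair is left behind)
print(strip_all_pairs("''hey''"))  # hey
```

In Python, `word.strip("'")` removes every leading and trailing apostrophe in one call, so the loop is not strictly needed there, although it also strips unbalanced quotes.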

Is there a limited scope of what inputs we would want to test for this problem, given its goals? (I don't actually know its goals.) You may recall that I expressed a preference in Acronym #1436 that the test cases be in a form that someone might expect to see in natural language. Although in this case of word-count, that ship has already sailed, because ",\n,one,\n ,two \n 'three'" and "car: carpet as java: javascript!!&@$%^&" are quite out there. But I think it's worth thinking about the question of what lengths we want to ask students to go to, especially if it's about handling inputs that we might not even consider realistic. Or perhaps you might consider this text realistic, in the case of someone with a limited keyboard using two pairs of single quotes to substitute for their inability to type double quotes.

Not yet fully decided but my current stance is best described as: No real desire to add it, but won't stand in the way if the majority decision is to add it.

ErikSchierboom · 2021-12-10T12:54:27Z

> Is there a limited scope of what inputs we would want to test for this problem, given its goals

We currently don't have explicit goals for exercises. We'll probably add this in the future, but that is potentially problematic as different tracks could have different goals for an exercise. What we could do is add things like you mentioned, where we restrict text to "normal" text.

ErikSchierboom · 2022-01-12T13:47:00Z

@exercism/reviewers Your thoughts on the desirability of the change in this PR?

BethanyG · 2022-01-12T19:23:36Z

Real-language wise, it feels like a bit of an edge case to me.

Or at least it feels like an edge-case when it comes to word-counting. I think it's a useful scenario to consider from the perspective of "cleaning up" language you'd then process (count, format, analyze, etc.) "downstream". So maybe this exercise has a related predecessor or extension that deals with the difficulties of sanitizing and normalizing?

I think that if we are going to add test cases like these, as well as "car: carpet as java: javascript!!&@$%^&", we should have the problem description outline what is, and is not, considered a word for counting purposes. To do otherwise is to send a student into a spiral of "what if" and (maybe) unproductive defense and checking.

All that being said -- if a majority decide this is indeed a test case worth adding here, I am not going to stand in their way.

kotp · 2022-01-12T22:10:22Z

For what it is worth, I do not think that even the big players always get this right. Do a word count from various tools, such as Lotus Notes, Word, etc, and you are likely to get different answers.

That said, if you are splitting words on apostrophes you are probably approaching the problem incorrectly.

I guess from a view point of mentoring, I do not mind if the test is there or not. I do not know why having more than one apostrophe in a word would cause a problem for counting words, because I do not think I would have ever considered an apostrophe to be a word boundary.

I also was expecting to see a test that covered multiple apostrophes, as opposed to "multiple single quotes", such as:

"You're going to count words, but you can't really do it by using the apostrophes."

Where there are two apostrophes in that sentence.
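
To illustrate the point about word boundaries, here is a minimal sketch that compares two tokenizations of the example sentence. Both regular expressions are assumptions made for illustration, not patterns from the exercise's test suite.

```python
import re

sentence = (
    "You're going to count words, but you can't really do it "
    "by using the apostrophes."
)

# Treating the apostrophe as a boundary breaks contractions apart:
# "you're" becomes "you" and "re", "can't" becomes "can" and "t".
split_on_apostrophes = re.findall(r"[a-z0-9]+", sentence.lower())

# Allowing apostrophes between alphanumeric runs keeps contractions whole.
keep_apostrophes = re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", sentence.lower())

print(len(split_on_apostrophes))  # 17
print(len(keep_apostrophes))      # 15
```

The second pattern also keeps a word such as "shouldn't've" as a single token, since it allows any number of internal apostrophes.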

glennj · 2022-01-12T22:22:25Z

It's apparently colloquial only, but

shouldn't've => "You shouldn't've done that!"
y'all's

kotp · 2022-01-12T22:25:55Z

> It's apparently colloquial only, but
>
> shouldn't've => "You shouldn't've done that!"
>
> y'all's

And both still only one word.

kotp · 2022-01-12T22:26:53Z

Also, the test does not test for this, it tests "''hey''" which are single quotes (as used) and not apostrophes (as used in "y'all"), even though they are the same character (though do not have to be).

kotp · 2022-01-12T22:29:29Z

So I would say there are two aspects here. The first is the definition of "word" with regard to words that contain one or more apostrophes. The second is that I think the test does not cover what the PR title states.

kytrinyx · 2023-04-01T09:54:44Z

As part of reworking the word-count problem description (#2247) I've taken a look at this PR, and my feeling is that this test case doesn't really fundamentally make the exercise more interesting, and while it's potentially more correct, the definition of "correct" is vague enough that I'm not convinced that it makes enough of a difference.

The new problem definition frames the exercise in terms of subtitles for TV shows, which is always going to be fuzzy. Subtitles are not always complete sentences. They're not always correct in terms of grammar or punctuation or spelling. But in the use case described, it doesn't actually matter, because the goal is just "directionally, roughly useful", not "precisely correct".

So my conclusion here is that I'd prefer to close this without merging.

Added multiple apostrophe test case

d99e9fa

yawpitch added the hold label Jan 18, 2020

kotp changed the title ~~Added multiple apostrophe test case~~ Word Count - Adds multiple apostrophe test case Jan 18, 2020

c4llmeco4ch mentioned this pull request Jan 30, 2020

Added apostrophe test case exercism/go#1329

Closed

cmccandless mentioned this pull request Jan 30, 2020

word-count: Add multiple apostrophes test case exercism/python#2169

Merged

Base automatically changed from master to main January 27, 2021 15:31

ErikSchierboom removed the hold label Oct 15, 2021

wolf99 approved these changes Dec 10, 2021

kytrinyx closed this Apr 9, 2023
