gherkin: Duplicate keywords in translations #17

ciaranmcnulty · 2021-02-01T15:46:03Z

Summary

In Behat/Gherkin we use the cucumber i18n data for translations. We have some tests that use this data to:

Generate a gherkin string using the different variations of the keywords
Generate a parsed AST that corresponds to that feature
Check the parser creates a similar AST when consuming the gherkin

This tends to fail when there are duplicate keywords inside a language's translation, so we skip some languages when doing this test.

Currently we are excluding the following from the tests:

ne: अनी means both And and Given
uz: Агар means both Given and When
en-old: Tha means both When and Then

Impact

Aside from failing tests easily skipped, in Behat our AST retains the 'type' of the node, which in some cases will now be wrong vs what the author intended

Maybe more importantly, in those languages there is a word that means two different things. This may damage education efforts around the G/W/T structure.

Possible Solution

fix the languages listed above to be unambiguous - this would have backwards compatibility issues so might not be acceptable
ensure via automated tests that duplicated words cannot be added to new or existing translations

The text was updated successfully, but these errors were encountered:

aslakhellesoy · 2021-02-01T20:24:27Z

I agree we should remove duplicates. We fixed a similar issue in cucumber/common@a54f32d - see https://github.com/cucumber/cucumber/blob/master/gherkin/CHANGELOG.md#1501---2020-08-12 where we removed keywords that only differed by case.

We need to bump the major version when/if we fix this, but that's no big deal.

stof · 2021-02-17T10:46:49Z

As the keyword would keep working for a step, I'm not even sure this creates a BC issue. They would still be usable for steps (as you would only remove one of the occurrences of the duplicate, not both). And to preserve BC with step type, it would be a matter of keeping the one corresponding to the step type being detected currently by cucumber implementations (but I think this is not even consistent between implementations, and may not cause issues)

ehuelsmann · 2022-05-25T16:48:36Z

@aslakhellesoy isn't this solved with the cucumber/common#1741 merge?

mpkorstanje transferred this issue from cucumber/common Nov 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gherkin: Duplicate keywords in translations #17

gherkin: Duplicate keywords in translations #17

ciaranmcnulty commented Feb 1, 2021

aslakhellesoy commented Feb 1, 2021

stof commented Feb 17, 2021

ehuelsmann commented May 25, 2022

gherkin: Duplicate keywords in translations #17

gherkin: Duplicate keywords in translations #17

Comments

ciaranmcnulty commented Feb 1, 2021

Summary

Impact

Possible Solution

aslakhellesoy commented Feb 1, 2021

stof commented Feb 17, 2021

ehuelsmann commented May 25, 2022