-
-
Notifications
You must be signed in to change notification settings - Fork 617
change: Do not transliterate when suggesting method names for step definitions #1633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously, most of this was done within duplicated methods in the `PatternPolicy` implementations and passed into the `Pattern` as `canonicalText`. There are lots of potential interpretations of the `canonical text` of a step but in practice we were only using this as the basis of the suggested method name when generating snippets. This suggestion was processed almost entirely in the `PatternPolicy` class with one final step in the `ContextSnippetGenerator` to lowercase the first character. It makes more sense to consolidate all that logic together, and to make explicit that this property on the `Pattern` is the suggested method name.
Expose it in the service container so that users (particularly those working in other languages) can provide a more complex implementation if they wish to do so.
Drop the behat/transliterator dependency and simply suggest step method names with the full utf8 characters used in the feature file.
Just remove any characters that are not within the allowed set for method names (including digits at the start of the name), and fall back to the generic `stepDefinition1` if the text is completely invalid. Since we may now have utf8 characters in the suggestion, we also need to make the case conversions (to lowerCamelCase) multibyte-safe. Fortunately, Behat already depends on ext-mbstring.
| #[Then('/^Добавить "([^"]*)" число$/')] | ||
| public function dobavitChislo($arg1): void | ||
| public function stepDefinition1($arg1): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will now just suggest generic names for some non-latin languages where the entire character set is outside the a-zA-Z0-9_\x80-\xff unicode range.
We could attempt to detect whether ext-intl is present and use the native Transliterator class as suggested by @stof in #1632.
However I'd be interested to hear what Russian etc speakers would actually want to have here - is a straight machine transliteration useful, or would they likely end up changing it anyway?
My gut feel is it would make more sense to leave this as-is "out of the box". It would now be easy for us (or a.n. other) to implement a Behat extension with a hard dependency on ext-intl for more complex cases like this.
| Then 5 should have value of £10 # FeatureContext::shouldHaveValueOf£() | ||
| TODO: write pending definition | ||
| And 7 should have value of £7.2 # FeatureContext::shouldHaveValueOfPs() | ||
| And 7 should have value of £7.2 # FeatureContext::shouldHaveValueOf£() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think the new ValueOf£ makes more sense than ValueOfPs anyway!
| #[Given('I have a package v2.5')] | ||
| public function iHaveAPackageV(): void | ||
| public function iHaveAPackageV25(): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be trivial to strip trailing numbers as before. However I'm not sure it's warranted (bearing in mind this number is in the hardcoded step text, it is not a placeholder).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this case is fine to me
src/Behat/Behat/Context/Snippet/Generator/ContextSnippetGenerator.php
Outdated
Show resolved
Hide resolved
| public function getCanonicalText() | ||
| public function getSuggestedMethodName(): string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unified diff looks at a glance like I've just renamed the method - in fact getCanonicalText is still present but deprecated, and getSuggestedMethodName is a new method.
| public function __construct( | ||
| private string $suggestedMethodName, | ||
| private string $pattern, | ||
| private int $placeholderCount = 0, | ||
| ) { | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These previously had PHPDoc typehints, I would also be very surprised if anyone externally is creating a Pattern instance and certainly not with arguments that won't be able to cast to these types. I think therefore it's safe to make them strict.
src/Behat/Behat/Context/Snippet/Generator/ContextSnippetGenerator.php
Outdated
Show resolved
Hide resolved
| private function getMethodName(string $contextClass, string $canonicalText, string $pattern): string | ||
| { | ||
| $methodName = $this->deduceMethodName($canonicalText); | ||
| $methodName = SimpleStepMethodNameGenerator::cleanupMethodName($canonicalText); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we trigger a deprecation here when the cleaned value is different from the provided argument, to ask using the StepMethodNameGenerator in the pattern policy ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is we have no neat way to report that deprecation to the user at the moment, and even if we did if they are consuming a PatternPolicy from an extension that could be difficult for them to resolve in the short term.
Considering all the method actually does is lowercase the first letter and backfill empty strings, I almost considered removing it now - worst case someone will get a suggested method with an uppercase first character (or perhaps no name) and have to fix it in their Context class... It's not going to actually break anything. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carlos-granados I have reflected more on @stof's comment I think the benefit of lowercasing the first character (in the unlikely situation someone is using a custom PatternPolicy) is limited for the cost of exposing this method and changing it again at some future date. I've reworked to just handle the case where the method name is empty, which avoids having to have this method at all.
src/Behat/Behat/Definition/ServiceContainer/DefinitionExtension.php
Outdated
Show resolved
Hide resolved
| * @deprecated provided only for BC, to allow the ContextSnippetGenerator to continue doing this cleanup for any | ||
| * steps that were generated by a third-party PatternPolicy that does not use our StepMethodNameGenerator | ||
| */ | ||
| public static function cleanupMethodName(string $name): string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If several steps end up with an empty name and we want to use the generic name we will get several functions named stepDefinition1 which is not ideal. Perhaps you could have a static int property which is initialised to 1 and appended to stepDefinition and then increased?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although they will all be suggested as stepDefinition1 initially ContextSnippetGenerator::getUniqueMethodName() will then convert these to stepDefinition2, stepDefinition3 etc before suggesting the actual snippets. This then also takes account of methods that are already defined in the context (rather than just from this execution).
It doesn't actually look like we have test / feature coverage of that - I could add that and a quick comment here to make the behaviour clearer / more reliable in future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, now I realise what you mean. Yes, some test coverage and a comment would certainly help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carlos-granados turns out the "numbering duplicate step definitions" behaviour is actually covered already in snippets.feature:
Behat/features/snippets.feature
Lines 104 to 114 in 2ea762d
| #[Then('/^I should have "([^"]*)"$/')] | |
| public function iShouldHave2($arg1): void | |
| { | |
| throw new PendingException(); | |
| } | |
| #[Then('/^I should get a "([^"]*)":$/')] | |
| public function iShouldGetA2($arg1, PyStringNode $string): void | |
| { | |
| throw new PendingException(); | |
| } |
I have added comments to the new code, and also renamed from StepMethodNameGenerator::generate() to StepMethodNameSuggester::suggest() to make more obvious that these are just suggestions and may not be the final result (also makes it more consistent with $pattern->getSuggestedMethodName()`
carlos-granados
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking mostly good, just added one comment
Renamed the interface and method to make it clearer that this code is only responsible for *suggesting* a method name, which may later be changed by the ContextSnippetGenerator. And added some comments to explain why suggested names do not need to be unique.
…licy The previous implementation introduced a new-but-deprecated method to attempt to handle the (unlikely) case that a user has a custom PatternPolicy suggesting step method names. In practice, the only thing that is actually *required* to produce valid code is that the name is not empty. We can easily handle that without extracting a complete internal method. Worst case, if a user's custom code is not lowercasing the first character of their method name, we may generate a method that does not follow their coding style. This feels like a reasonable trade-off for the improvement to code clarity / structure here.
95bf39a to
b40c126
Compare
carlos-granados
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, great job @acoulton
As discussed in #1632, we should just remove our behat/transliterator dependency. We use very little of it, it is out of date, and it avoids the licensing issues discussed in Behat/Transliterator#3 and #1343.
This PR starts by refactoring the duplicated code that generates method names into a single service that implements the new
StepMethodNameGeneratorinterface. This would allow people to define their own implementations if they want something more complex (or with more dependencies) than we provide out of the box.Our default
SimpleStepMethodNameGeneratorthen does the bare minimum to produce a unicode name that is syntactically valid for PHP.This is all backwards compatible at the code level (including if people have registered a custom
PatternPolicyclass).At a functional level, it obviously changes the method names we'll suggest - including that there are some cases where we'll now just suggest a generic name where previously we would have attempted to transliterate.