gherkin: an alternative syntax to <...> for interpolating values into a Scenario Outline #1004

Closed
bhreinb opened this issue May 19, 2020 · 32 comments
Labels: ⌛ stale, library: gherkin

Comments

@bhreinb

bhreinb commented May 19, 2020

Summary

Currently there are two constructs for delimiting a DocString, namely """ and ```. I'm wondering whether a similar option could be added for interpolating values into a Scenario Outline|Template. Preferably, any alternative syntax to <...> would be configurable too.

Current Behavior

The construct <...> works well in most cases. However, imo it introduces a usability issue when used in conjunction with a DocString that has XML|HTML content. For example:

Scenario Template: The parser accepts formatted text
   Given some xml text
   """<xml>
   <xml><xmlHeader>Hello world!<xmlFooter></xml>
   """
   Given some html text
   """<html>
   <html><htmlHeader>Hello world!<htmlFooter></html>
   """
   Scenarios:
   | xmlHeader     | xmlFooter      | htmlHeader | htmlFooter |
   | <soap:Header> | </soap:Header> | <head>     | </head>    |

Possible Solution

The suggestion is to add another variable syntax, for example {{xmlHeader}}, with the same functionality as <xmlHeader>. I think it reads much better in these formats. See below:

Given some html text
"""<html>
   <html>{{htmlHeader}}Hello world!{{htmlFooter}}</html>
"""

Note: the reason for {{...}} is that it's used in testing tools like Postman and in template engine libraries, for example Django, which AFAIK has implementations in Python, Java, JavaScript and C#. Some other string interpolation syntaxes to potentially consider:

${foo} - ECMAScript, Perl, PHP, Bash, Dart, Groovy
$foo - Bash, Dart
#{foo} - Ruby, CoffeeScript
{foo} - JSX, Python 3.6

Context & Motivation

Imo it fixes a usability issue that arises when using variable substitution in a Scenario Outline|Template in conjunction with a DocString whose content is HTML|XML.

@marnen

marnen commented May 19, 2020

I understand the idea of having a syntax that can deal better with literal <> characters, but I also tend to think (and the core team may well disagree with me on this!) that literal XML/HTML is better suited for unit tests...

@bhreinb
Author

bhreinb commented May 19, 2020

The context here is system testing. I don't agree with the assertion that literal XML/HTML is better suited for unit tests. In some cases maybe, but for example testing of a SOAP web service which requires an XML payload should be tested at a level higher than unit level.

I have also seen usage of XML|HTML like I presented above discussed in a PR:

#292 (review)

@marnen

marnen commented May 19, 2020

@bhreinb

In some cases maybe, but for example testing of a SOAP web service which requires an XML payload should be tested at a level higher than unit level.

Agreed, but is Cucumber the right tool for acceptance testing that? I'm not sure I think it is. In general I think Cucumber excels at modeling UI interaction in user terms, not API testing. I've used it for APIs and don't really think it's the right tool, though I could be convinced otherwise. I've had better luck with things like https://github.com/zipmark/rspec_api_documentation for APIs.

And again, I want to stress that I'm not on the core team or in any way "official". This is just my opinion as a (mostly) very happy user of Cucumber on lots of projects.

@bhreinb
Author

bhreinb commented May 19, 2020

Well, the topic has slightly diverged from what I submitted, but maybe testing of SOAP web services involves a lot of implementation detail for Cucumber. They do say Cucumber isn't an API automation tool but that it works well with other API automation tools:

https://cucumber.io/docs/guides/api-automation/

In any case, it really was just to provide an example of XML|HTML within a DocString construct. The example arguably may not have been the best one to pick, but it happened to be the first that came to mind 😬. Thanks for including a link to rspec btw.

@marnen

marnen commented May 19, 2020

@bhreinb That's not a link to RSpec; it's a link to a really amazing RSpec plugin for API testing and documentation.

I'll admit that I'm having trouble coming up with a situation where HTML or XML would be appropriate in a Cucumber scenario, so if you have a better example, I'd be really curious to see it.

@mpkorstanje
Contributor

I don't think these are examples of what most people would consider good Gherkin. However, being able to control which characters are interpreted as meta characters can improve readability. The two different ways to write a docstring are a good example of that. So in my opinion it is worth considering in more depth.

When using the docstring block we can either use the triple quotes """ so we do not have to escape triple back-ticks:

"""
It's very easy to make some words **bold** and other
words *italic* with Markdown. 

GitHub also supports something called code fencing, 
which allows for multiple lines without indentation:
```
if (isAwesome){
  return true
}
```
"""

Or we can use the triple back-ticks ``` so we do not have to escape quotes:

```
"""
Change will not come if we wait for some other person
or some other time. We are the ones we've been
waiting for. We are the change that we seek.
"""
~ Barack Obama
```

The nice thing here from a syntax perspective is that the Gherkin syntax determines which characters are considered meta characters in the doc string. The scenario outline however lacks these hints:

Scenario Outline: Famous quotes
  Given a <person>
  And a <quote>
  Then our tile is laid out as
  ```
  """
  <quote>
  """
  ~ <person>
  ```

Examples:
  | person       | quote  |
  | Barrak Obama | Change will not come if we wait for some other person\nor some other time. We are the ones we've been\n  waiting for. We are the change that we seek. |
  | 

When processing the examples for this feature file there is no information that can hint what the special characters should be. In fact, as far as Gherkin is concerned the < and > aren't even special characters. The strings <person> and <quote> are replaced as part of a post processing step. Any other bracketed words are ignored.
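
To make the post-processing step concrete, here is a minimal Python sketch of that kind of substitution. It is an illustration only, not the actual Gherkin implementation; the function name `interpolate` and the sample row are hypothetical.

```python
import re


def interpolate(text, row):
    """Replace <name> with row[name]; leave any other bracketed word as-is."""

    def substitute(match):
        name = match.group(1)
        # Unknown names (for example HTML tags) are returned unchanged.
        return row.get(name, match.group(0))

    return re.sub(r"<([^<>]+)>", substitute, text)


# Hypothetical Examples row, keyed by table header.
row = {"person": "Barack Obama", "quote": "We are the change that we seek."}

print(interpolate("~ <person>", row))
print(interpolate("<b><quote></b>", row))
```

In the second call only `<quote>` is replaced; `<b>` and `</b>` do not match a table header and pass through untouched.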

And this brings us to the impossibility of merely adding {{ and }} as special characters. Suppose that {{ and }} were introduced in addition to < and >. The example below would be rather ambiguous.

"""xml
<person>{{person}}</person>
"""

And when used the other way around, for example to test a Mustache template, it would be a breaking change:

"""mustache
{{person}} <person>
"""

So without solving the problem of hinting what symbols should bracket a replacement I don't think this can be added.
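
The ambiguity can be reproduced with a small Python sketch. It assumes a hypothetical processor, `interpolate_both`, that treats both `<...>` and `{{...}}` as placeholders at the same time; nothing like this exists in Gherkin today.

```python
def interpolate_both(text, row):
    """Hypothetical: substitute both <name> and {{name}} for every header."""
    for name, value in row.items():
        text = text.replace("<" + name + ">", value)
        text = text.replace("{{" + name + "}}", value)
    return text


row = {"person": "Barack Obama"}

xml_doc = "<person>{{person}}</person>"
mustache_doc = "{{person}} <person>"

# The opening XML tag is consumed as if it were a placeholder.
print(interpolate_both(xml_doc, row))
# The Mustache tag that was meant to stay literal is replaced as well.
print(interpolate_both(mustache_doc, row))
```

In both documents one of the two bracketed forms was meant literally, and the processor has no way to tell which.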

Note: I do not consider a configuration option to be a valid solution. Gherkin should be self contained. It is simply not practical to pass this configuration along to all sorts of different tools that might process it. I would also not consider using a heuristic to guess what the replacement characters could be.

@marnen

marnen commented May 20, 2020

@mpkorstanje I mostly agree with what you've said here, but I think a configuration option might be a possibility if it could be in the Gherkin file itself, so that the Gherkin parser could deal with it transparently to whatever Cucumber implementation it's working with.

Example of possible syntax:

@{{delimiter}} # overloading tag syntax, but perhaps something else would be better
Scenario Outline: I can use braces as delimiters!
  Given I am logged in as {{email}}
  When I go to the home page
  Then I should see "Hello, {{name}}!"
  And I should see "Today's math fact: 3 < 5 > 2" # literal < and >
  
  Examples:
    | email           | name     |
    | joe@example.com | Joe User |

I don't know if this is worth doing, but it would at least address some of the possible objections...

@bhreinb
Author

bhreinb commented May 20, 2020

Hi there,

A couple of things to address:

From a purist BDD point of view, having XML|HTML in a Gherkin document per se is probably verbose. But that would also be the case for JSON|YAML imho. I guess if it's at odds with the philosophy of Gherkin usage, then should it be allowed in the first place at all, or be referenced as a use case (albeit not a major one 😄) per this comment #292 (review)?

I don't quite follow how the pattern {{...}} would be ambiguous but maybe I'm biased on that. In addition, I'd be very surprised to see something like mustache referenced within the context of a DocString. I would think that is far more improbable than specifying a data interchange format (XML|JSON|YAML) or markup snippet like HTML.

The suggestion for a configurable syntax came about from a discussion on slack around this:

https://cucumberbdd.slack.com/archives/CEE9L148J/p1589807524014900

I suggest making it configurable, then you don’t have to have an opinion (other than <..> being the default)

which in fairness I'm in agreement with.

@vincent-psarga
Contributor

I find it pretty interesting to have another way to interpolate data coming from the examples table.

And even if I'm not a big fan of HTML in scenarios (too low level from my point of view), the reality is that lots of people use Cucumber and other BDD-related tools this way. I think it's more interesting to let people write "bad" Gherkin and help them move toward good practices than to block them at the beginning.

As for the tag annotation, on the other hand, I don't think it would be a good way of marking scenarios as interpolating from mustaches instead of <>.

Maybe a magic comment, as is done with most linting tools?

// gherkin: use-mustaches
Scenario Outline: I can use braces as delimiters!
  Given I am logged in as {{email}}
  When I go to the home page
  Then I should see "Hello, {{name}}!"
  And I should see "Today's math fact: 3 < 5 > 2" # literal < and >
  
  Examples:
    | email           | name     |
    | joe@example.com | Joe User |

But again, this adds data which may distract from the feature content.
That should be OK for developers (we're kinda used to skipping over comments), but when using the feature as a way to discuss with non-technical people, it could be a slight issue.

@mpkorstanje
Contributor

Maybe a magic comment, as is done with most linting tools?

We do this for the encoding in Cucumber-JVM, though it is not part of Gherkin.

@vincent-psarga
Contributor

It could be nicer if that's set at the feature level and not the scenario level, like setting the language (which is part of Gherkin this time).

@bhreinb
Author

bhreinb commented May 20, 2020

Command line argument perhaps?

@mpkorstanje
Contributor

Command line argument perhaps?

Gherkin should be self contained. It is simply not practical to pass this configuration along to all sorts of different tools that might process it.

@marnen

marnen commented May 20, 2020

@mpkorstanje Seems like it would be practical, but there’s another issue at work: command line (or config file) is the wrong granularity, because this should be able to vary per scenario, hence my suggestion of tags (or magic comments, as @vincent-psarga suggested).

@bhreinb
Author

bhreinb commented May 21, 2020

Having the granularity at scenario level does seem like overkill imho... I'd be surprised if an engineer used two syntax constructs such as <...> and {...} within the same feature file. In any case, if this were such an issue, the tests that use the alternative syntax could be maintained in another feature file.

@mpkorstanje
Contributor

mpkorstanje commented May 22, 2020

Okay so we'd have something like:

# language: fr
# encoding: ISO-8859-1
# interpolation: double-curly-brackets
Fonctionnalité: Concombres fractionnaires

  Plan du Scénario: dans la ventre
    Étant donné j'ai {{nombre}} concombres fractionnaires
  Exemples:
    | nombre |
    | 5,5    |

This altogether looks less hideous than what I expected.

Would anybody like to bikeshed over the naming options?

# interpolation: angle-brackets
# interpolation: double-curly-brackets

I've picked the terms used by Wikipedia, but I can imagine there are other names that work.

@marnen

marnen commented May 22, 2020

Sure, if we’re bikeshedding, I’d kind of rather that we not have prenamed interpolation strategies. Rather, accept any characters satisfying \S and \W with a directive something like $[<placeholders>] (picking a hideous example for sake of argument).

Also, I really don’t want comments used for Gherkin processing directives. Hot comments are evil. Better to have a separate syntax if we need them.

@mpkorstanje
Contributor

mpkorstanje commented May 23, 2020

I am not a fan of hot comments either. It looks just fine without them in the example below. Currently, though, hot comments are used to indicate the language and encoding. A consistent appearance makes the syntax easier. I think that is more important than avoiding hot comments, especially since that ship has sailed.

language: fr
encoding: ISO-8859-1
interpolation: double-curly-brackets

Fonctionnalité: Concombres fractionnaires

  Plan du Scénario: dans la ventre
    Étant donné j'ai {{nombre}} concombres fractionnaires
  Exemples:
    | nombre |
    | 5,5    |

I’d kind of rather that we not have prenamed interpolation strategies.

I can see that working, with a pattern like (\S\W)+(?:\w\s)+(\S\W)+ where the capture groups define the boundaries.

# language: fr
# encoding: ISO-8859-1
# interpolation: {{placeholder}}

Fonctionnalité: Concombres fractionnaires

  Plan du Scénario: dans la ventre
    Étant donné j'ai {{nombre}} concombres fractionnaires
  Exemples:
    | nombre |
    | 5,5    |

edit: The pattern needs work. Right \W matches nearly everything in Unicode. May as well match everything. So (.+)placeholder(.+) might be straight up better.
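
A rough Python sketch of how such a header could be read with the simpler `(.+)placeholder(.+)` pattern. This only illustrates the proposal: the `# interpolation:` header does not exist in Gherkin, and the names `read_delimiters` and `interpolate` are made up for the example.

```python
import re

# Hypothetical header: "# interpolation: {{placeholder}}".
# The two capture groups are the opening and closing delimiters.
DIRECTIVE = re.compile(r"^\s*#\s*interpolation:\s*(.+)placeholder(.+?)\s*$")


def read_delimiters(feature_text, default=("<", ">")):
    """Scan the leading comment lines of a feature file for the header."""
    for line in feature_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("#"):
            break  # headers stop at the first non-comment line
        match = DIRECTIVE.match(line)
        if match:
            return match.group(1), match.group(2)
    return default


def interpolate(text, row, delimiters):
    """Replace every delimited header name with its value from the row."""
    opening, closing = delimiters
    for name, value in row.items():
        text = text.replace(opening + name + closing, value)
    return text


feature = (
    "# language: fr\n"
    "# encoding: ISO-8859-1\n"
    "# interpolation: {{placeholder}}\n"
    "\n"
    "Fonctionnalité: Concombres fractionnaires\n"
)

delimiters = read_delimiters(feature)
step = "j'ai {{nombre}} concombres fractionnaires"
print(interpolate(step, {"nombre": "5,5"}, delimiters))
```

A file without the header falls back to `<` and `>`, so existing feature files would keep their current behaviour.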

@marnen

marnen commented May 23, 2020

OK, I wasn't aware that hot comments were already in use, since I've never actually needed to use them for anything. Yuck. :)

Also, I think you mean (?=\S)\W or something like that instead of \S\W. (Why yes, I do play regexcrossword.com. How did you guess? :D )

@marnen

marnen commented May 23, 2020

Oh, duh. [^\s\w].

@mpkorstanje
Contributor

mpkorstanje commented May 24, 2020

While it might be a fun exercise to get this right, I think I'd rather avoid potentially shooting ourselves in the foot by allowing everything and having to go back on that at some point in the future.

I think that limiting the set of valid characters to #${}<>()[] should allow users to construct an obvious replacement pattern in most content types. Can anyone come up with an example of a content type where this is not the case?
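
As a sketch of what that restriction could look like, the following Python check accepts a delimiter pair only if every character comes from that set. The name `valid_delimiters` is hypothetical and the rule itself is just the proposal above, not an agreed design.

```python
# Characters proposed above as the only valid delimiter characters.
ALLOWED = set("#${}<>()[]")


def valid_delimiters(opening, closing):
    """True when both delimiters are non-empty and use only allowed characters."""
    if not opening or not closing:
        return False
    return set(opening + closing) <= ALLOWED


print(valid_delimiters("{{", "}}"))
print(valid_delimiters("${", "}"))
print(valid_delimiters("«", "»"))
```

Under this rule `{{ }}`, `${ }` and `#{ }` would all be accepted, while quote-like or non-ASCII bracketing characters would be rejected.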

@marnen

marnen commented May 24, 2020

@mpkorstanje:

I think I'd rather avoid potentially shooting ourselves in the foot by allowing everything and having to go back on that at some point in the future.

What risk are you envisioning there? Personally, I don’t want to be too arbitrary...

Can anyone come up with an example of a content type where this is not the case?

Any content type at all. I basically would like the user to have free choice of the most sensible delimiters for his or her scenario. (Also, I’d add '"‘’`“”„«» to that minimal list, as well as some bracketing characters in Asian scripts...)

@bhreinb
Author

bhreinb commented Jun 24, 2020

So possibly using hot comments may solve the above feature request? I wasn't aware hot comments were used to indicate language or encoding. Interesting.

@stale

stale bot commented Oct 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs.

@stale stale bot added the ⌛ stale label Oct 11, 2020
@bhreinb
Author

bhreinb commented Oct 13, 2020

Hi There,

I'm happy to submit a PR for the above in the next few weeks (whenever I get some free time), using the hot comment approach suggested above, assuming it would get merged?

@stale stale bot removed the ⌛ stale label Oct 13, 2020
@mpkorstanje
Contributor

I currently do not have the bandwidth to attend to this. I doubt any of the other maintainers do at the moment.

@bhreinb
Author

bhreinb commented Oct 16, 2020

To accept a PR? I don't have the time to send one right now anyway, but I assume somebody could accept it in a few weeks?

@aslakhellesoy
Contributor

Gherkin is currently maintained for the following languages:

  • Go
  • JavaScript/TypeScript
  • Ruby
  • Java

In order for us to accept a pull request, it would have to update all the maintained implementations. Are you up for that task?

@bhreinb
Author

bhreinb commented Oct 16, 2020

I could do two of the items listed above: Java and JavaScript/TypeScript. Unfortunately I haven't had any exposure to the other languages listed, though 😞.

@aslakhellesoy
Contributor

I guess you could submit a PR for those languages and wait for someone else to complete it with the remaining ones. The simpler the changes are, the quicker it will be for someone else to port over.

@stale

stale bot commented Dec 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs.

@stale stale bot added the ⌛ stale label Dec 15, 2020
@stale

stale bot commented Dec 25, 2020

This issue has been automatically closed because of inactivity. You can support the Cucumber core team on opencollective.

5 participants