Initialise parser in arbitrary state #109

aslakhellesoy opened this Issue Oct 12, 2015 · 15 comments


None yet
6 participants

aslakhellesoy commented Oct 12, 2015

Cucumber will need to get a step's keyword in order to generate a stepdef snippet with the right keyword.
Pickles don't contain the step keyword because they can originate from non-gherkin documents (such as Markdown, which doesn't have step keywords).

Pickles contain the path of the source file.

It's possible to parse the source file with Gherkin, iterate over scenarios and find the step for a certain line.

But that's very cumbersome.

It would be better to ask the parser to return a Step AST node directly. This would require initialising the parser in the right state, so that parsing a step doesn't cause a parse error.

Something like this (pseudoish code):

var parser = new Parser();
parser.state = Parser.ScenarioState;
var step = parser.parse('    Given I have 4 cukes');
var keyword = step.keyword;



aslakhellesoy commented Oct 12, 2015

Or am I trying too hard to decouple Gherkin from pickles here? Is it more pragmatic to put the keyword back on PickleStep? Will we regret that later when people feed Cucumber from all sorts of source formats?


charlierudolph commented Oct 12, 2015

Since this makes things so much more complex I would vote for one of following simpler solutions.

  1. Snippets are no longer defined with a keyword. I like this as it reinforces that cucumber doesn't differentiate between "Given", "When", "Then" step definitions and also removes the need to support i18n in snippets/step definitions See cucumber/cucumber-js#248
  2. Just add the keyword to the pickled steps. It will be undefined when coming from other sources, but the snippet generator is going to have to deal with this anyway.

aslakhellesoy commented Oct 12, 2015

I'm not at all in favour of 1, since it removes valuable semantic context from a step definition. Context that is valuable to a human.

I'm fine with 2 I think. @gasparnagy what are your thoughts about this? You've also mentioned a use case for initialising the parser in a different state. What's that use case again?


brasmusson commented Oct 12, 2015

You cannot assume that it is the keyword of the step that should be used in the snippet. For "And", "But" and "*" Cucumber-Ruby uses keyword of the preceding step with a "Given", "When" or "Then".

The general problem of making lookup from locations in pickles to Gherkin/Markdown source need to be solved sooner or later - for reporting if not sooner. Currently Cucumber-Ruby solves it by using classes for both the AST and the test cases(/pickles) and having references from the test case/step classes to the AST classes. In potential use cases like sending pickles files off into a computing grid and getting back results json files, that approach will not do of course.

A somewhat related lookup problem is to run the right test cases for command line arguments like "path/name.feature:23". Line 23 may not be the line of a pickle, if it is the line of an outline step the all test cases for that Scenario Outline is executed. This particular lookup is not implemented very efficiently in Cucumber-Ruby today (see cucumber/cucumber-ruby#901).


aslakhellesoy commented Oct 13, 2015

@brasmusson could we solve this problem by putting this logic in the compiler? When the compiler sees a Given followed by a Then, it will create pickle steps with two Given keywords?


brasmusson commented Oct 13, 2015

Parsing just the step line might not be the solution to finding the keyword for the snippet, however the feature can be useful in other cases. Cucumber-Ruby is (ab)using the Gherkin3 parser to parse nested steps (feature), by setting up the parser in specific parser state, not exactly the kind of dependencies you really want (as I wrote in the PR description).


brasmusson commented Oct 13, 2015

@aslakhellesoy Yes, using the compiler to calculate the proper snippet keyword would work. The the attribute of the pickle should rather be named snippet_keyword than keyword since it would be aimed at the snippets and not for instance the pretty formatter output. The intentions, as I understand it, is then to be able to produce both the new raw result json file and the snippets, without using the Gherkin source. The Gherkin source would only be needed to produce the other (more final) reports.


koterpillar commented Nov 9, 2015

Hello there,

I'm maintaining a Python Gherkin library and this is also something I would like to see implemented. Currently to parse a step by itself I'm recreating a complete feature and then extracting the steps, so an interface to do this will be nice:


enkessler commented Mar 15, 2016

Yep. I'm having to do the same 'wrap the thing that I want to parse up in a fake feature file and extract it afterward' trick in my library as well. It was easier than trying to monkey patch this gem back when I first started using it.

Being able parse fractions of a feature file would be a nice feature to have right out of the box.


enkessler commented Apr 30, 2016

I feel like this should be as easy as tweaking the grammar to say 'here are some more valid starting rules'. Is this not the case?


enkessler commented Oct 22, 2016

@aslakhellesoy , @brasmusson I have again found myself in need of this kind of ability to parse arbitrary portions of Gherkin. Is it something that is still desired on your end?


brasmusson commented Oct 23, 2016

@enkessler I can see some places where the ability to parse arbitrary portions of Gherkin, would remove the need for the "'wrap the thing that I want to parse up in a fake feature file and extract it afterward' trick" (Cucumber-Ruby's StepsParser and DataTableParser). It would not help in the case of finding the correct keyword for a snippet though, to be sure to get that right the whole feature file is needed.

All in all, I view this a rather low priority feature, compared to other work item within the Cucumber open-source project like:

  • Using cucumber-expressions in the Cucumber implementations.
  • Upgrade Cucumber-JVM to use the new Gherkin parser and compiler.
  • Killing the legacy parsers of Cucumber-Ruby
  • Implementing command line formatters (Pretty and HTML).

jenisys commented Oct 26, 2016

I would proceed with @charlierudolph suggestion 2 (adding the keyword as optional attribute to pickles). If this optional information is lacking for the snippet generator (meaning keyword is missing), it should use some like * I have 4 cukes (to indicate a generic-step; if it is not already implemented that way).

If your first solution is still desirable, I would suggest a minor "simplification" (and hiding how the parser gets into the correct state). Add another parser entry point (or a specific StepParser/StepsParser), like:

var parser = new Parser();
var step = parser.parse_step('Given I have 4 cukes');  //< State machine init hidden from the user.
// BETTER: parser.parse_steps(text)  
// ==> Overcomes keyword ambiguities mentioned by @brasmusson (And, But, ...)

aslakhellesoy commented Oct 26, 2016

Yeah I agree the keyword should be in the pickle steps. Happy to take a pull request! Even for just a single language or two.


brasmusson commented Oct 27, 2016

If the new optional attribute is intended for snippets, it should be named snippet_keyword (or something), or both keyword (the actual keyword) and snippet_keyword should be added as optional attributes.

I have implemented snippet generation with the correct keywords from pickles in cucumber/cucumber-jvm#1035. It is a two step process where the runner (which only knows about pickles), generated snippets with **KEYWORD** as the keyword and stores them in the result object, and then the front end replaces **KEYWORD** with the correct keyword - after having reparsed the feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment