Concise step definitions #132

Closed
wants to merge 8 commits into from


atykhonov
Contributor

While I was writing integration tests (using the ecukes package), I was thinking about how to get rid of regexps in step definitions.

For example, there is this step definition:

(When "^I translate \"\\(.+\\)\" from \"\\(.+\\)\" to \"\\(.+\\)\"$"

My wish was to:

  • make it more readable;
  • avoid repetitive regexp patterns such as \"\\(.+\\)\";
  • reduce the attention needed to keep definitions correct.

This feature request is an attempt to implement these things. It is a proof of concept rather than a final version.

It makes it possible to define steps in the following way:

(When "^I translate TEXT from SOURCE-LANGUAGE to TARGET-LANGUAGE$"

The package treats such a definition as the usual one: ^I translate \"\\(.+\\)\" from \"\\(.+\\)\" to \"\\(.+\\)\"$

The possibility to define a step definition in the usual way still remains:

(When "^I translate \"\\(.+\\)\" from \"\\(.+\\)\" to \"\\(.+\\)\"$"

Special keywords are defined that get replaced by the appropriate regexps. For example, CONTENTS is replaced by the \\(?: \"\\(.*\\)\"\\|:\\) regexp, POSITION (or POS, NUMBER, NUM) by \\([0-9]+\\), and MODE (or VARIABLE, VALUE) by the \\(.+\\) pattern. A capitalized word that contains any of these keywords as a "-"-separated part is also replaced by the appropriate regexp. For example, LINE-NUM or POINT-POSITION is replaced by \\([0-9]+\\). The idea behind capitalized words separated by "-" is that the ending (NUM, POSITION, etc.) decides which regexp replaces the word.

If an undefined keyword is used in a step definition, it is replaced by the default \"\\(.+\\)\" regexp pattern, which matches any double-quoted string. For example, the SOURCE-LANGUAGE mentioned above is replaced by \"\\(.+\\)\".
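The replacement rules described above could be sketched roughly as follows. This is only an illustration, not the code from this pull request: the names my-concise-keywords, my-concise-keyword-regexp and my-concise-expand are hypothetical, the table leaves out CONTENTS, and the sketch assumes that the last "-"-separated part of a capitalized word selects the regexp. It also requires at least two capital letters so that the "I" at the start of a step is left alone.

```elisp
;; Hypothetical keyword table; entries mirror the keywords named above.
(defvar my-concise-keywords
  '(("POSITION" . "\\([0-9]+\\)")
    ("POS"      . "\\([0-9]+\\)")
    ("NUMBER"   . "\\([0-9]+\\)")
    ("NUM"      . "\\([0-9]+\\)")
    ("MODE"     . "\\(.+\\)")
    ("VARIABLE" . "\\(.+\\)")
    ("VALUE"    . "\\(.+\\)"))
  "Alist mapping a keyword ending to the regexp that replaces it.")

(defconst my-concise-default-regexp "\"\\(.+\\)\""
  "Regexp used for capitalized words whose ending is not a known keyword.")

(defun my-concise-keyword-regexp (word)
  "Return the regexp for capitalized WORD, chosen by its last dash-separated part."
  (save-match-data
    (let ((ending (car (last (split-string word "-")))))
      (or (cdr (assoc ending my-concise-keywords))
          my-concise-default-regexp))))

(defun my-concise-expand (step)
  "Replace every capitalized word of two or more characters in STEP by a regexp."
  (let ((case-fold-search nil))         ; keep [A-Z] strictly upper-case
    (replace-regexp-in-string
     "\\b[A-Z][A-Z0-9]+\\(?:-[A-Z0-9]+\\)*\\b"
     #'my-concise-keyword-regexp
     step
     t      ; FIXEDCASE: do not adjust the case of the replacement
     t)))   ; LITERAL: backslashes in the replacement are inserted as-is

(my-concise-expand "^I go to LINE-NUM$")
;; => "^I go to \\([0-9]+\\)$"
```

With this sketch, SOURCE-LANGUAGE and TARGET-LANGUAGE fall through to the default pattern because LANGUAGE is not in the table.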

What do you think?

@rejeep
Contributor

rejeep commented Jan 30, 2014

Hi,

I agree that writing those regexes by hand is a pain. I always let Ecukes do the job for me, which makes it almost painless.

Another idea is to allow the first argument to be a list. I'm thinking the way Flycheck works (see http://flycheck.readthedocs.org/en/latest/manual/extending.html#defining-new-syntax-checkers). It provides a DSL for defining checkers, with a few special keywords and support for rx. For example this:

(When "^I place the cursor between \"\\(.+\\)\" and \"\\(.+\\)\"$"
  ...)

Could be written as:

(When (line-start "I place the cursor between " (one-or-more char) " and " (one-or-more char) line-end)
  ...)

What do you think?

@atykhonov
Contributor Author

<< I always let Ecukes do the job for me, which makes it almost painless.

I treat this as good advice and will try to follow it in the future. Right now, though, I imagine it will reduce flexibility. We will see how it goes.

In the meantime, I was thinking about another idea: being able to write, besides step definitions, variable (keyword?) definitions. A keyword-pattern definition could be placed just before the step definitions, so the patterns would not have to be written out in each step. This way the test writer could describe some common keywords and use them within the tests, and step definitions could look like I described before. A keyword definition may look like the following: (def-key SOURCE-LANGUAGE "\"\\(.+\\)\"") or (def-key SOURCE-LANGUAGE TEXT), where the TEXT def-key is predefined somewhere in the core... Or (def-key API-KEY NUMBER)
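To make the def-key idea a bit more concrete, here is a minimal sketch of how such a definition form might behave. Everything in it is an assumption for illustration: my-def-key, my-def-key-1 and my-keys are hypothetical names, and nothing like this exists in ecukes. A definition is either a regexp string or the name of a keyword that has already been defined.

```elisp
(defvar my-keys nil
  "Alist of (NAME . REGEXP) for defined keywords, most recent first.")

(defun my-def-key-1 (name pattern)
  "Register keyword NAME.
PATTERN is a regexp string or the symbol of an already defined keyword."
  (let ((regexp (if (symbolp pattern)
                    (or (cdr (assq pattern my-keys))
                        (error "Unknown keyword: %s" pattern))
                  pattern)))
    (push (cons name regexp) my-keys)
    regexp))

(defmacro my-def-key (name pattern)
  "Define keyword NAME without having to quote NAME or a keyword PATTERN."
  `(my-def-key-1 ',name ,(if (symbolp pattern) (list 'quote pattern) pattern)))

;; "Core" keywords that other definitions can refer to.
(my-def-key TEXT "\"\\(.+\\)\"")
(my-def-key NUMBER "\\([0-9]+\\)")

;; Domain-specific keywords defined in terms of the core ones.
(my-def-key SOURCE-LANGUAGE TEXT)
(my-def-key API-KEY NUMBER)
```

After these forms, SOURCE-LANGUAGE maps to the same regexp as TEXT, and API-KEY to the same regexp as NUMBER.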

But! Compared to the rx approach, my ideas look like hacks :-) I definitely like that approach! It is very powerful. I don't know how Flycheck works (I need more info and investigation), but I like the example you provided. I like it because it is much more readable, and I believe it would be easier to support, change, and extend...

Could this be a part of ecukes? How?

@rejeep
Contributor

rejeep commented Feb 3, 2014

I don't think your idea is bad either. I like it more when there's an API to add your own definitions, like you suggested. Let's say, for example, that your plugin works a lot with IP addresses. It would be very nice to avoid that duplication.

(When "^When I do something with IP-ADDRESS$"
  (lambda (ip-address)
    ...
    ))

The downside with rx syntax is that it can be quite long. And you can actually start using them without any change in Ecukes, by just calling the rx macro, like this:

(When (rx line-start "I place the cursor between " (one-or-more char) " and " (one-or-more char) line-end)
  ...)
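For reference, here is a sketch of what a complete rx form for this step might look like. It rests on two assumptions that the examples in this thread leave out: that the quotes around each argument need to be matched literally, and that each argument needs a capture group so it can be passed to the step body. The constant name my-cursor-step-regexp is hypothetical.

```elisp
(require 'rx)

;; A sketch only: quotes are matched literally and each argument is captured.
(defconst my-cursor-step-regexp
  (rx line-start
      "I place the cursor between \""
      (group (one-or-more not-newline))
      "\" and \""
      (group (one-or-more not-newline))
      "\""
      line-end)
  "Regexp built with `rx' for the \"place the cursor between\" step.")

;; Check the regexp against a concrete step line.
(when (string-match my-cursor-step-regexp
                    "I place the cursor between \"foo\" and \"bar\"")
  (list (match-string 1 "I place the cursor between \"foo\" and \"bar\"")
        (match-string 2 "I place the cursor between \"foo\" and \"bar\"")))
```

Because rx returns an ordinary regexp string, the result can be used anywhere a hand-written step regexp is accepted.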

@atykhonov
Contributor Author

Great! I will try the rx macro then (if there are no other alternatives)!
Yes, I agree that it can be quite long. And plain text such as "^When I do something with IP-ADDRESS$" looks much better to me.

What if I implement the possibility to define such keywords and use them within step definitions? Could such a thing become a part of ecukes?

@rejeep
Contributor

rejeep commented Feb 3, 2014

Could such a thing become a part of ecukes?

Definitely, I'm just discussing options here.

A few comments:

I'm not sure about what we should call this feature. You say key. Template is another alternative. What would be a good name..?

How about making the templates a little more verbose, such as {IP-ADDRESS} or $ip-address. I'm not sure if we need this though. What do you think?

I don't think that Ecukes should have any predefined templates. I just think that Ecukes should provide the functionality. Then packages such as Espuds can define these. And you can also define your own domain-specific ones.

We have to figure out a good name for the function that defines a template. For example: ecukes-define-template.

@atykhonov
Contributor Author

Well, I'm here with more thoughts and comments...

What would be a good name..? Well, English is not my native language, so it is hard for me to choose the correct name. But that is of course not the only reason. I also keep thinking about the feature name and trying to capture its essence in a term :-) Key is just an abbreviation of keyword. It became key in a macro name such as def-key (similar to defun, an abbreviation of define function). Why keyword? I no longer remember why I chose that name... It seems it was a thoughtless decision, so I don't see any reason to stay with it. Template? For me a step definition is more similar to a template than... pattern? What is the difference between "pattern" and "template"? :-) Well, "pattern" is closer to me than "template". Maybe because I quite often used the term "template" when talking about html templates and "pattern" when talking about regular expressions :-) Thus I prefer "pattern". So what about ecukes-define-pattern?

About more verbose templates, such as {IP-ADDRESS} or $ip-address. I chose CAPITALIZED NAMES because they are very similar to the ARGUMENTS in the elisp doc-strings. A step definition may then look like the following:

(When "^I translate TEXT$"
  (lambda (text) ...))

You can see that TEXT and the argument text mirror each other. That is why I like this way. However, maybe this similarity is the wrong way to go? Capitalized names relate to doc-strings, but right now we are not talking about a doc-string but about a step definition... So such mirroring may be wrong, and maybe even confusing?

About {IP-ADDRESS} or $ip-address. Although such definitions seem (to me) a little alien in the elisp world (is that really a fact?), I like $ip-address (and ${IP-ADDRESS}) because this notation is widely used in the programming world (bash, php, perl). So others might readily recognize them (and they might be quite habitual for me) as parts that are going to be replaced by something else. But I dislike them because they include "$", which is a special character in regular expressions. So $ip-address may be a little confusing. I like ${IP-ADDRESS} despite the fact that it uses "$". And I have nothing against {IP-ADDRESS}. But won't it be confusing for those who use it in python for string formatting? :-)

Well, in general, I am starting to have mixed feelings about this feature :-) I still feel that I'm not able to think about it fluently and coherently. The mixing of regular expression and custom definitions in it still confuses me.

In the meantime I continue thinking about the rx approach. Well, I agree that rx definitions could be long and that they are not as readable as plain text. At the same time the rx approach is very, very powerful. Maybe there is no need for such power. So, as another idea, what if we wrap rx in functions that lower the level of rx abstraction and take away some of its power? What if we make it more concrete? (Well, after I wrote the text below I noticed that it is not even necessary to use rx. I'm leaving the text without the appropriate changes; just let me express myself with rx.)

For example,

(Then (elx "I should be in buffer" (ex 'buffer-name)) ...)

elx is an ecukes function which wraps rx and means something like: ecukes line eXpression (an expression which occupies only a single line). It wraps the expression in line-start and line-end. ex is an ecukes function which also wraps rx and takes an argument, for example the symbol buffer-name, which ex recognizes and for which it calls a function like ecukes-rx-buffer-name that returns an rx expression. ex could also take many other symbols such as region, point, line, file, word, variable, etc. (It may also be possible to implement it in such a way that functions like ecukes-rx-buffer-name can validate their input and report an error when, for example, there is no buffer with the name specified by the user in a concrete step. I'm not sure that would be useful or easy to implement, but just in case.)

or

(When (elx "I translate a text" (ex 'text) "from" (ex 'language) "to" (ex 'language)) ...)

(let's imagine that I, as a user of ecukes, am able to define an ecukes-rx-language function in my *-steps.el file, so ex will be able to call it).
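A rough sketch of how elx and ex might fit together is given below. It is purely illustrative and makes several assumptions: the helpers are named my-elx and my-ex rather than elx and ex, the per-entity functions are called my-rx-text and my-rx-language, they return plain regexp strings instead of rx forms, and the parts are joined with a single space.

```elisp
;; Hypothetical per-entity functions; a user could add more in a *-steps.el file.
(defun my-rx-text ()
  "Regexp fragment matching a double-quoted text argument."
  "\"\\(.+\\)\"")

(defun my-rx-language ()
  "Regexp fragment matching a double-quoted language name."
  "\"\\(.+\\)\"")

(defun my-ex (kind)
  "Look up the function my-rx-KIND and wrap its regexp in a tagged list."
  (let ((fn (intern-soft (format "my-rx-%s" kind))))
    (unless (fboundp fn)
      (error "No expression defined for: %s" kind))
    (list :regexp (funcall fn))))

(defun my-elx (&rest parts)
  "Build a single-line step regexp from PARTS.
Strings are matched literally; results of `my-ex' are inserted as regexps."
  (concat "^"
          (mapconcat (lambda (part)
                       (if (stringp part)
                           (regexp-quote part)
                         (plist-get part :regexp)))
                     parts
                     " ")
          "$"))

(my-elx "I translate a text" (my-ex 'text)
        "from" (my-ex 'language)
        "to" (my-ex 'language))
```

Tagging the result of my-ex is one way to tell regexp fragments apart from literal text, so that only the literal text is passed through regexp-quote.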

Other example:

(Then (emx "I insert" (ex 'contents)) ...)

or (with the same meaning)

(Then (emx "I insert" (ex 'py-string)) ...)

emx is an ecukes function which also wraps rx and means something like: ecukes multiline eXpression, which may include a py-string. emx also wraps its parameter ("I insert") with a line-start and a regular expression which matches a py-string.

Well, this probably looks a little complicated, but at least, I feel, the general picture is more consistent than with the template approach.

What do you think?

@rejeep
Contributor

rejeep commented Feb 4, 2014

Key is just an abbreviation of keyword

Ok, keyword is not bad at all actually.

About more verbose templates, such as {IP-ADDRESS} or $ip-address. I chose CAPITALIZED NAMES because they are very similar to the ARGUMENTS in the elisp doc-strings.

I agree, makes more sense. It's prettier syntax and we really don't need to be that verbose.

In the meantime I continue thinking about the rx approach...

Personally, I don't think that adding the elx feature would help very much. If you want you can always just use rx directly to create a regex string.

The mixing of regular expression and custom definitions in it still confuses me.

Yes, I'm thinking the same. Do you know if Cucumber has any similar feature? It might be worth checking that out and perhaps in such case use their approach.

@atykhonov
Contributor Author

I agree, makes more sense. It's prettier syntax and we really don't need to be that verbose.

I agree, but I am afraid of parsing issues when a user accidentally uses upper-cased text in a step definition... However, maybe this fear is unfounded.

Yes, I'm thinking the same. Do you know if Cucumber has any similar feature? It might be worth checking that out and perhaps in such case use their approach.

I'm sorry, I didn't even know that there is such a library as Cucumber (shame on me). I've read the ecukes documentation and saw the reference but didn't take a look...

So far I have had no luck looking for a similar feature. But I'm not surprised, because cucumber's step definitions look quite good to me: regexps do not need to be escaped there.

But what else I wanted to say with the rx approach and the original keyword approach is that I see data types here. I may be very wrong, but please let me express myself. I believe that cucumber and ecukes are quite different tools, because cucumber is used for a very wide range of tasks, scenarios, business logic, user activities, etc., and probably mostly for different data types. So, as I see it, there is no reason to define these data types there, as they will be very different for different projects: every project uses its own entities. Ecukes, on the other hand, is to me a tool which is very close to Emacs and its entities (such as point, word, region, etc.), which are very distinct. Also, cucumber is probably a tool intended to be used with entities at a higher level of abstraction. For example, there can be a step definition like /^When user select a book$/. You see that this step definition is about a user and a book. Ecukes, as I imagine it, will be mostly about strings, texts, points, regions, words, buffers, etc. I'm not sure; this is just how I imagine it.

Thus the original idea was to use these entities as widely as possible within ecukes. The reason is that they are quite distinct, and I believe such an approach may help users express their steps in short and precise terms which are very close to the real world (the world of emacs).

@@ -41,6 +41,13 @@
"^\\s-*|.+|"
"Regexp matching table.")

(defconst ecukes-concise-keywords-alist
'(("\\(?: \"\\(.*\\)\"\\|:\\)" . (" CONTENTS"))
Contributor

You have a space before CONTENTS

Contributor Author

Yes. I didn't want this piece of code to be a part of ecukes. I left it to be fixed later.

Contributor Author

For now keywords are removed, so this issue with CONTENTS no longer seems relevant. But in case it needs to be added back, I have no idea how it could be fixed except:

splitting step definitions with such a complex regular expression into two step definitions, so that one of them accepts "lorem ipsum strings" and the other one py-strings.

@rejeep
Contributor

rejeep commented Feb 4, 2014

Yes I agree, but we would still have to allow the user to define domain specific keywords, such as ip address.

I will look at your implementation in more detail when I have the time and try it out on my projects. I do have a few quick comments if you want to start work on it directly:

  • The function dolist is part of cl, which I will eventually try to get rid of. Please use -each instead.
  • What other keywords can we think of? Line is one that came to my mind.
  • Maybe MODE and VARIABLE should be more strict. Now it can include spaces for example. We might also want to add MACRO and FUNCTION.
  • Why set case-fold-search to nil?
  • Create function that will be available in support/env.el to create your own keywords, for example ecukes-define-keyword.

(matches))
(dolist (keyword concise-keywords)
(while (setq matches (s-match (format "[[:upper:]0-9-]+-%s" keyword) body))
(setq body (s-replace (nth 0 matches) regexp body)))
Contributor

Why not (car matches)?

Contributor Author

car is still not familiar to me. But thanks! Fixed!

@atykhonov
Contributor Author

I just committed a new version. It is better than the previous one.

  • It has no predefined keywords. The original list of predefined keywords was weird, so I decided to remove it for now. We can agree on and add better predefined keywords (now or later) if we decide we need them;
  • The previous code was parsing ANY capitalized word. Removed. A keyword must be defined first; then it can be parsed;
  • I added a new function, ecukes-define-keyword. So it is possible to add a keyword this way:
(ecukes-define-keyword "TEXT" "\"\\(.+\\)\"")
  • case-fold-search removed. It is no longer needed;
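The "define first, then replace" behaviour from the list above could be sketched like this. It is not the code committed in this pull request: my-keywords, my-define-keyword and my-expand-keywords are hypothetical names, and the sketch uses only built-in functions rather than s.el or dash.el.

```elisp
(defvar my-keywords nil
  "Alist of (KEYWORD . REGEXP) for defined keywords, most recent first.")

(defun my-define-keyword (keyword regexp)
  "Register KEYWORD so that it is replaced by REGEXP in step definitions."
  (push (cons keyword regexp) my-keywords))

(defun my-expand-keywords (step)
  "Replace every defined keyword in STEP; leave all other text untouched."
  (let ((case-fold-search nil)          ; "text" must not match keyword TEXT
        (result step))
    (dolist (entry my-keywords result)
      (setq result
            (replace-regexp-in-string
             (concat "\\b" (regexp-quote (car entry)) "\\b")
             (cdr entry)
             result
             t      ; FIXEDCASE
             t))))) ; LITERAL: insert the regexp without interpreting backslashes

(my-define-keyword "TEXT" "\"\\(.+\\)\"")
(my-define-keyword "LINE-NUMBER" "\\([0-9]+\\)")

(my-expand-keywords "^I insert TEXT at LINE-NUMBER and SHOUT$")
;; => "^I insert \"\\(.+\\)\" at \\([0-9]+\\) and SHOUT$"
```

SHOUT stays as it is because it was never defined, which is the difference from the earlier version that replaced any capitalized word.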

And I'm sorry, it seems I somehow skipped your last list of comments and didn't make the appropriate fixes. I also put ecukes-define-keyword into ecukes-parse.el, which is the wrong place according to your comments. I also didn't change dolist and didn't fix the regexp for CONTENTS, so it still has the whitespace :-( I will work on these things tomorrow. The question about MODE and VARIABLE is also still open for me.

< What other keywords can we think of? Line is one that came to my mind.

Sorry that I was boring with my last long comments... I described some ideas about keywords there, but those were just examples. Other keywords could be:

  • POINT
  • LINE
  • FILE
  • WORD
  • VARIABLE
  • VALUE
  • TEXT
  • BUFFER
  • KEY-BINDING
  • IP-ADDRESS
  • MODE
  • CONTENTS
  • MESSAGE
  • MACRO
  • FUNCTION
  • COMMAND

Here is another question: would LINE be good enough? Maybe LINE-NUMBER? The same question applies to POINT, and maybe FILENAME (or FILE-NAME), BUFFER-NAME?

@atykhonov
Contributor Author

And... I finally broke the unit tests... The appropriate fix is already committed.

(setq case-fold-search nil)

My bad. This is required because without it s-match and s-replace become case insensitive.
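The effect can be seen with the built-in string-match alone; this sketch assumes that s-match and s-replace rely on the same matching primitives and therefore follow case-fold-search in the same way. The helper name my-keyword-matches-p is hypothetical. Binding the variable with let, as below, keeps the change local instead of altering it for other code the way a plain setq would.

```elisp
(defun my-keyword-matches-p (keyword step fold)
  "Return t if KEYWORD occurs in STEP, with `case-fold-search' bound to FOLD."
  (let ((case-fold-search fold))
    (and (string-match (regexp-quote keyword) step) t)))

;; With case folding on, the keyword TEXT also matches ordinary lower-case text.
(my-keyword-matches-p "TEXT" "I translate text" t)    ; => t

;; With case folding off, only the upper-case keyword matches.
(my-keyword-matches-p "TEXT" "I translate text" nil)  ; => nil
(my-keyword-matches-p "TEXT" "I translate TEXT" nil)  ; => t
```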

@atykhonov
Contributor Author

And lastly (for today)... Again, please, any comments, suggestions, code reviews etc are much appreciated!

@atykhonov
Contributor Author

  • `dolist' replaced by `--each'
  • `ecukes-concise-keywords-alist' and `ecukes-define-keyword' moved to `ecukes-load.el'

@dickmao
Collaborator

dickmao commented Feb 2, 2021

Statute of limitations has run its course.

@dickmao dickmao closed this Feb 2, 2021