Introducing Cukes #431

steveklabnik · 2013-12-03T12:10:54Z

Please note this is not merge-able yet, this is mostly for discussion purposes.

Hey all! One of the first things I've been tasked with is improving the API specification. Over the weekend I asked around my network, did some research, and spiked out an idea, and I wanted to run it all past you.

Requirements

As always, the first step is to gather requirements. Here's what I gather we need for the API spec:

A way to define what each aspect of the API does.
An automated way to turn those definitions into executable tests.
A way to run those tests regularly to ensure conformance.

All three of these are pretty straightforward, but might as well spell them out.

These specifications must be easy to understand by a large variety of people.

This one is complicated. There are two kinds of concerns here: {in,out}side of Balanced, and across software language groups.

The first is the in/out part. Basically, while we have a very clear idea of what everything does, people who are new to Balanced won't. Just today I got an email from someone, asking if Balanced was appropriate for them, when their marketplace was basically identical to Rent My Bike. I'm happy to answer those kinds of questions, but we need something that will be easy for people who don't live and breathe payments to grok.

Secondly, software language groups. If we write our specs in Python, Pythonistas will be able to understand it, but Haskellers might not. Well, they probably will, but you know what I mean. Everyone wants their preferred language, but our product is an API, and it's language agnostic. Therefore, we need something that's not incredibly closely tied to a particular software language.

The specification has to facilitate collaboration on what the specification actually does.

This is related to #4, but is a bit different. It's not enough that the spec is legible to everyone, but that it's easy to bikeshed. ;)

What we have

Right now, the API spec looks largely like this: https://github.com/balanced/balanced-api/blob/revision1/scenarios/cards/tokenize_without_secret.yml

It's a bunch of YAML files, with a totally custom, bash/Python runner.

This situation is good in that it's language agnostic, but bad in that it's 100% custom, so it's not natively understood by anyone, and while it communicates to some degree of what the specs do, it's almost 100% code. It took me some time to be able to actually understand what's going on, which will always happen, of course, but it's still hard to turn into actual words, which impedes #4 and #5.

Solution

I've spiked out a temporary solution based on a tool called "Cucumber." Before I tell you more, let's compare:

Current: https://github.com/balanced/balanced-api/blob/revision1/scenarios/cards/tokenize_without_secret.yml

Cucumber: https://github.com/balanced/balanced-api/blob/cukes/features/cards/tokenize.feature

The 'current' has two features, and the cucumber only has one, so ignore past line 18 on 'current.'

Cucumber, if you haven't heard of it, is a tool for collaboration between stakeholders. The idea is that you write high level descriptions of what you want your software to do in psudo-English ("Gherkin"), and then that gets transformed into code to actually run and test those descriptions.

While I've implemented the scenarios in Ruby, you can write the 'code' aspect of the tests in a variety of languages. You can also write the features in a variety of languages, too. For example: https://github.com/cucumber/cucumber/blob/master/examples/i18n/el/features/addition.feature

As for how this gets turned into code, I have two versions:

JSON Schema: https://github.com/balanced/balanced-api/blob/cukes/features/step_definitions/card_steps.rb

Assertions: https://github.com/balanced/balanced-api/blob/b91d4264eaace33373606191ef4d32ca1372ac47/features/step_definitions/card_steps.rb

Since we already assert against JSON Schema, I think sticking with that makes sense, as it'll be easier to port.

I actually think the "200 code" assertion may be too low level.

Pros

Basically, this is the exact use case for Cucumber. Because the specifications are written in English, we help make it easier for people to understand, as well as making it easier from people to collaborate across language boundaries. I have some issue with English being the de facto human language that technologists communicate in, but it is what it is, at least for now. This solves #4 and #5, our biggest problems, nicely.

Furthermore, collaboration can be done on the high-level language, without worrying about the details. For example, Matin had some concern over my wording of the last part of that scenario, because 'that's not exactly what that does.' Excellent, we have some sort of communication error here. Cucumber's high-level descriptions bring that kind of thing out, and people can submit pull requests with just the Gherkin, and we can add the implementation ourselves.

I wonder if "elevator pitches for your code" is a good way of describing Cucumber. Maybe I've been in San Francisco too long. ;)

Cons

Cucumber is... not a tool without some controversy. It was originally described as an 'acceptance testing' tool, and so people got into 'test mode' and started writing tests. Cucumber is very bad when used in this way. Eventually, even some helpers were removed [1] to discourage writing tests at a low level. Basically, if you're not writing very high-level descriptions, you are Cuking It Wrong. ;)

Cucumber can be slow. There's the whole Gherkin-code 'compilation' step, so it can take longer than other frameworks. This is mitigatable, but we're already planning on making HTTP requests against our API for our tests, so they'll be slow no matter what the framework is.

As I mentioned above, some people really, really dislike cukes. I think that's mostly due to having terrible Cucumber suites, people have developed some scars. This can be overcome, but there may be some initial 'eww.' There may be with our specs as of right now. ;)

Other options

Basically, if this is a bad idea, we should just move to some sort of very basic unit testing library in a not-to-unusual language. I'd basically use Python and Requests to do this, since it's what we're most familiar with overall. I think that the downsides of being language specific is pretty big, though.

Conclusion

I'm sure I've forgotten a few things, but I'd love to hear feedback, good and bad. I think this is the right move, but I don't always have all the right answers... I would especially love feedback from @mattwynne and @aslakhellesoy .

1: http://aslakhellesoy.com/post/11055981222/the-training-wheels-came-off

Scenarios

CRUD tests

Here's all the cases we need to cover to merge. From https://docs.balancedpayments.com/1.1/api/ :

API KEYS

Create a Key
Get a Key
List API Keys
Delete API Key

BANK ACCOUNT VERIFICATIONS

Create a Bank Account Verification
Get a Bank Account Verification
Confirm a Bank Account Verification

BANK ACCOUNTS

CALLBACKS

Create a Callback
Get a Callback
List all Callbacks
Delete a Callback

CARDS

CREDITS

Create a Credit
Get a Credit
List All Credits

CUSTOMERS

DEBITS

EVENTS

Get an Event
List all Events

REFUNDS

Create a Refund
Get a Refund
List All Refunds
Update a Refund

REVERSALS

Create a Reversal
Get a Reversal
List All Reversals
Update a Reversal

non-crud tests

That's exhaustive. Let's do this.

mattwynne · 2013-12-03T12:33:58Z

I like it!

You'll have to see how it plays out to stick to this kind of high-level language in all of your scenarios. Some people might want to see more detail, yet not want to read the Ruby code underneath in the step definition. You can either address this with hand-crafted user docs (which could go in the Gherkin), or you could consider having one scenario per feature that uses more detail and has concrete examples of the JSON request / responses.

I'm interested to see how this develops.

tef · 2013-12-03T12:37:06Z

I think as it stands, your requirements are manufactured to want Cucumber, so let's look at them:

A way to define what each aspect of the API does.
An automated way to turn those definitions into executable tests.
A way to run those tests regularly to ensure conformance.

The second step is the one you'll have to justify, and I'm not convinced.

. Basically, while we have a very clear idea of what everything does, people who are new to Balanced won't. Just today I got an email from someone, asking if Balanced was appropriate for them, when their marketplace was basically identical to Rent My Bike

Writing these tests in pseudocode won't make people read them. (Unit) tests themselves do not give big picture views of your code, and should not serve as documentation, or example code.

If you have a comprehensive test suite, would you expect people to browse through them one-by-one to see if their use case works?

Secondly, software language groups.

There is an xkcd cartoon here, but instead of using a language some people know, you're going to use a language no-one knows. Although some tests become 'readable', the maintenance of tests becomes labourious. Frequently for each line of english you have to write a custom regular expression to parse it. Your test runner won't be language agnostic either.

Therefore, we need something that's not incredibly closely tied to a particular software language.

Except the implementation will be, and you will need to rewrite the ad-hoc regex rules in each language if you want to avoid this. This is a huge burden for tests and maintenance.

Basically, this is the exact use case for Cucumber

The use case of cucumber (afaik) was so that clients signing off on a project could agree to pages and pages of legalese that was executable, to enable the customer to both understand and specifiy what they want.

This is not your use case. Your use case as presented is "We need better documentation and examples to showcase features and make it easier to get started with the API". Cucumber solves this if you think reading tests is a good substitute for documentation and worked, comprehensive examples in other languages :-)

My experience with cucumber is that it made writing tests slow, and running tests slow, and debugging tests harder.

BRMatt · 2013-12-03T12:38:32Z

Sorry if I've misunderstood the problem, but have you considered using a tool like api-blueprint to document the API, then use tools that test the API against the defined documentation?

It adds some semantics on top of Markdown for documenting requests/responses/schemas etc. and it's pretty readable, even when browsing via Github.

Admittedly it probably won't capture intricate scenarios as easily as a cuke scenario/bash script, but thought it might be worth bringing up.

tef · 2013-12-03T12:39:20Z

(nb I'm not against having documentation and examples which can be tested to check they work, but I don't think that is the same as having tests that serve as documentation and examples. c.f python's doctests and the problems with using them)

steveklabnik · 2013-12-03T12:41:08Z

@BRMatt , I'm not opposed, but one of the tricky bits here is that as Balanced moves even further towards hypermedia, we want to move away from documenting exact URLs and HTTP methods, and focus more on workflows. I don't have the most experience with api-blueprint, but generally, these kinds of tools tend to focus on 'RESTish' style documentation rather than this kind, so that's something that's in here too.

steveklabnik · 2013-12-03T12:44:57Z

@tef, I think your 'use case for cukes' is the same thing, just more cynical. ;)

I actually didn't even come up with cukes until the last minute, so maybe they are described as a fit for cukes, but it happened in the other direction.

I don't think new people will go read a bunch of cukes, but it would be nice to show as a response to this kind of question.

mattwynne · 2013-12-03T12:51:50Z

@tef makes an important point that the demands of testing and documentation can come into conflict. The RSpec project have tried this, and found that the comprehensive test suite made for pretty boring user documentation. Users don't want to see every single edge case.

My advice in this situation is usually to push the edge cases down into unit tests. If you think there's still benefit in having that complete specification readable by everyone, consider writing two sets of cukes - one that provides a user manual, and another that provides your complete specification.

Kosmas · 2013-12-03T13:02:48Z

@tef regarding the issue of speed with the tests, I think that is something that will change soon but @mattwynne will be able to confirm this.

I also believe that the cukes would help to have a quick understanding of how things are supposed to work, but in order to fully understand you would have to go further deep into the code.

MattyO · 2013-12-03T13:27:29Z

The python equivalent to cucumber is called lettuce. It a little immature compared to cucumber. Its usefulness may be found in the use cases of interoperability with any current testing tools and libraries you might have or want to use in the future written in python.

Mainly wanted to let you know of its existence.

BRMatt · 2013-12-03T14:02:27Z

@BRMatt , I'm not opposed, but one of the tricky bits here is that as Balanced moves even further towards hypermedia, we want to move away from documenting exact URLs and HTTP methods, and focus more on workflows. I don't have the most experience with api-blueprint, but generally, these kinds of tools tend to focus on 'RESTish' style documentation rather than this kind, so that's something that's in here too.

@steveklabnik Fair enough, those are reasonable points. I believe the maintainers of api-blueprint are trying to work in support for hypermedia relations, but I'm not sure how that fits in with your plans.

mjallday · 2013-12-03T15:45:11Z

@steveklabnik do you see Balanced then implementing the underlying scenarios for the Gherkin in each client library to ensure they conform to a basic functionality (the spec)?

As I'm sitting here figuring out how we can implement the rev1.1 clients I really like the idea of a basic set of functional tests across all client libraries so we don't end up with the issues we had last round such as under-utliized client libraries having issues opened because they are missing esoteric functionality and we don't realise until someone actually tries to use it.

mahmoudimus · 2013-12-04T06:45:09Z

Here are my thoughts:

I think Gherkin is the way to go here, definitely +1 on it.

First and foremost, there are many Gherkin/Ragel parsers in many different languages. The official client libraries that are supported are: Ruby, Python, PHP, Java, C# and Node.js -- all of which have Gherkin/Ragel runners. So this is largely aiding #4 as well and helps with the training wheels.

We attempted to something very similar to this in our documentation: https://github.com/balanced/balanced-docs/tree/master/clients -- maybe instead of having scenarios like this, we organize tests implemented in different languages and use those to feed into our API? Ben might have more thoughts on this topic.

This is definitely a step in the right direction and however, I'd like to compare it to something like http://apiary.io or RAML.

steveklabnik · 2013-12-04T08:14:11Z

@mjallday maybe; I'm not sure how directly applicable they are, as the clients would need testing that they consume things correctly, and these are tests that stuff is produced correctly.

That said, we should integration test both together, at some point.

steveklabnik · 2013-12-04T09:39:35Z

I just realized that I wrote a blog post a few years back that's directly relevant to this conversation http://blog.steveklabnik.com/posts/2011-12-20-write-better-cukes-with-the-rel-attribute

JonRowe · 2013-12-05T05:15:04Z

What'd be great is a way to visualise the gherkin documentation online, with each step able to sho the raw response / differing language implementations...

steveklabnik · 2013-12-05T05:16:02Z

@JonRowe , that's what http://relishapp.com is for :)

JonRowe · 2013-12-05T05:25:11Z

It allows you to visualise the documentation, but it's not very non-developer friendly, I found @chrismdp's rack-usermanual idea very intriguing in this respect. It also doesn't currently offer a means to inspect what it actually does / provide sample responses.

steveklabnik · 2013-12-05T05:46:17Z

Oh that is super cool!

aslakhellesoy · 2013-12-05T09:54:11Z

@steveklabnik we'd be happy to set you up with a free https://cucumber.pro/ account once we go live. Cucumber Pro will replace Relish.

Regarding the idea to document the API with Cucumber. If I was a developer using the API, I would like to see scenarios like this:

Feature: Payment

  Scenario: tokenize
    Given some precondition here
    When I http POST to /cards with JSON body:
      """
      {
        "number": "4111111111111111",
        "expiration_month": "12",
        "expiration_year": 2016
      }
      """
    Then the response code should be 201 with:
      """
      {
        ... whatever the expected reply is...
      }
      """

This is more verbose than what you suggested, but I think this level of detail is needed for the audience you're targeting (developers using the API)

steveklabnik · 2013-12-05T10:02:27Z

@aslakhellesoy thanks! Yeah, that makes sense; with something like this, I'm never 100% sure what the right level of abstraction is

aslakhellesoy · 2013-12-05T10:45:28Z

To find the right level of abstraction, ask some stakeholders (in your case, users of the API):

Does this make sense?
Would you be able to do this yourself with this info?
Do you need more info?
Is anything puzzling you?

Looking at your feature I think most developers unfamiliar with your API would answer like this:

Yes, I think so
No. I have no idea what HTTP request to issue
Yes, can you give me an example?
I don't understand how the secret key is relevant. Can you explain? (that's what the scenario description is for)

Feature: feature name
  Scenario: scenario name
    Some descriptive text here is useful

    Given ...

mahmoudimus · 2013-12-05T16:46:18Z

@aslakhellesoy that looks really good! 👍

andrewsardone · 2013-12-05T18:00:00Z

We've been using a similar setup at @nutshellcrm. I think it has been working pretty well, though I admit it's tough to find the right level of abstraction.

chrismdp · 2013-12-06T12:07:37Z

@steveklabnik thanks. Just realised the Rack::Usermanual demo links were broken: should be fixed now.

steveklabnik · 2013-12-09T17:26:32Z

@andrewsardone would you be interested in some help and / or collaboration to turn this into a gem, maybe?

steveklabnik · 2013-12-09T18:00:31Z

@tef, does this style do anything to mitigate your concerns, or you still think it's a terrible idea?

mahmoudimus · 2013-12-09T18:09:14Z

👍

steveklabnik · 2013-12-09T18:30:16Z

@aslakhellesoy @mattwynne , one more question: We have scenarios that operate like this: https://github.com/balanced/balanced-api/blob/revision1/scenarios/cards/tokenize_without_secret.yml#L20-L23

This (kinda terrible because of the runner) test basically says "run the 'tokenize' scenario, and then make a request to the URI that's located at cards.href in its response."

This kind of thing is going to come up a lot as we move more and more towards hypermedia; "execute this thing, if it passes, execute this other thing, based on what comes back." These feel like two different tests to me, but GivenScenario was depreciated, so I'm not sure how to accomplish this kind of task.

Thoughts?

mamund · 2013-12-09T18:46:18Z

I've thought about this, too. Another possible testing challenge is that you're building a client that could possibly travel several paths to a successful outcome. So it's not just if this then that but also something like start at A and IF you eventually get to Z that's a pass ELSE that's a fail

A likely situation for hypermedia, but not something I am aware that current testing tools can handle.

matin · 2013-12-09T20:13:59Z

@steveklabnik Could you also add a scenario that debits/charges the card that you just tokenized?

One of the most painful parts of our current scenarios is referencing the result of a previous scenario. This becomes even worse if you want to use some sort of a fixture to avoid doing repetitive tasks like tokenizing a card or creating a customer for different scenarios.

Should be linked to #481

#486

#487

steveklabnik · 2014-01-23T19:50:39Z

I am currently re-running the tests over every commit to make sure that bisect works. Assuming that's successful, it's time to finally merge this sucker.

matthewfl · 2014-01-23T19:52:44Z

@steveklabnik I just noticed that there are some scenarios that do not appear to be implemented, might we double check that we have everything

https://github.com/balanced/balanced-api/blob/revision1/scenarios/debits/create_new_card.yml

Accidentally had two successes, now we have one and a fail.

steveklabnik · 2014-01-24T01:49:53Z

Okay! I have re-gone through and double checked. We were missing one or two. Let's get that CI building.

steveklabnik · 2014-01-24T02:02:37Z

🎊

Introducing Cukes

mjallday · 2014-01-24T02:03:26Z

oh snap! 💃

julianpistorius · 2014-02-05T00:25:47Z

@steveklabnik have you looked at 'behave' instead of Cucumber? It's Python and allegedly better than Lettuce. http://pythonhosted.org/behave/index.html

http://pythonhosted.org/behave/comparison.html

steveklabnik · 2014-02-05T00:48:50Z

@julianpistorius I have not, no.

matthewfl and others added 12 commits January 23, 2014 19:30

fixing bank account rest scenarios

2770646

fixed 404 error

7d774b5

canceling an order

f802d28

the end is nearing

4b13e43

one more to go

5a1a65f

I did it, its done, no more failing

bf659a8

some issues with test credit card numbers

df27867

properly tag luhn test failure.

79fd4df

Should be linked to #481

Tagging CVV is unsupported failure.

8027255

#486

Properly tagging cvv does not match

a6fa221

#487

cvv card number scenarios working

79702b8

it is all working now

f08a98d

steveklabnik added 4 commits January 24, 2014 01:16

Add feature for debiting a card directly.

ce58fea

Fix up specs related to reversals.

3a87dc0

Accidentally had two successes, now we have one and a fail.

New feature: update a customer

82540eb

Two last bank account features.

2b9b019

steveklabnik added a commit that referenced this pull request Jan 24, 2014

Merge pull request #431 from balanced/cukes

7c22c05

Introducing Cukes

steveklabnik merged commit 7c22c05 into revision1 Jan 24, 2014

steveklabnik mentioned this pull request Jan 29, 2014

Hypermedia apiaryio/api-blueprint#13

Open

steveklabnik deleted the cukes branch February 5, 2014 00:48

mjallday mentioned this pull request Feb 27, 2014

Setup cucumber integration tests balanced/balanced-dashboard#1009

Closed

bf4 mentioned this pull request Oct 22, 2015

apis stuff bf4/bf4.github.com#9

Open

bf4 mentioned this pull request Dec 11, 2015

Add best practices for building Ruby JSON APIs thoughtbot/guides#365

Closed

Introducing Cukes #431

Introducing Cukes #431

Conversation

steveklabnik commented Dec 3, 2013

Requirements

What we have

Solution

Pros

Cons

Other options

Conclusion

Scenarios

CRUD tests

non-crud tests

mattwynne commented Dec 3, 2013

tef commented Dec 3, 2013

BRMatt commented Dec 3, 2013

tef commented Dec 3, 2013

steveklabnik commented Dec 3, 2013

steveklabnik commented Dec 3, 2013

mattwynne commented Dec 3, 2013

Kosmas commented Dec 3, 2013

MattyO commented Dec 3, 2013

BRMatt commented Dec 3, 2013

mjallday commented Dec 3, 2013

mahmoudimus commented Dec 4, 2013

steveklabnik commented Dec 4, 2013

steveklabnik commented Dec 4, 2013

JonRowe commented Dec 5, 2013

steveklabnik commented Dec 5, 2013

JonRowe commented Dec 5, 2013

steveklabnik commented Dec 5, 2013

aslakhellesoy commented Dec 5, 2013

steveklabnik commented Dec 5, 2013

aslakhellesoy commented Dec 5, 2013

mahmoudimus commented Dec 5, 2013

andrewsardone commented Dec 5, 2013

chrismdp commented Dec 6, 2013

steveklabnik commented Dec 9, 2013

steveklabnik commented Dec 9, 2013

mahmoudimus commented Dec 9, 2013

steveklabnik commented Dec 9, 2013

mamund commented Dec 9, 2013

matin commented Dec 9, 2013

steveklabnik commented Jan 23, 2014

matthewfl commented Jan 23, 2014

steveklabnik commented Jan 24, 2014

steveklabnik commented Jan 24, 2014

mjallday commented Jan 24, 2014

julianpistorius commented Feb 5, 2014

steveklabnik commented Feb 5, 2014