Service testing strategy #927
By the way, I wanted to share my experimenting: https://github.com/paulmelnikow/shields-tests. This includes:
I worked up a third POC:
The vendor spec looks like this:

'use strict';

const Joi = require('joi');
const ServiceTester = require('../service-tester');

const t = new ServiceTester('CRAN', '/cran');
module.exports = t;

t.create('version')
  .get('/v/devtools.json')
  .expectJSONTypes(Joi.object().keys({
    name: Joi.equal('cran'),
    value: Joi.string().regex(/^v\d+\.\d+\.\d+$/)
  }));

t.create('specified license')
  .get('/l/devtools.json')
  .expectJSON({ name: 'license', value: 'GPL (>= 2)' });

t.create('unknown package')
  .get('/l/some-bogus-package.json')
  .expectJSON({ name: 'cran', value: 'not found' });

t.create('unknown info')
  .get('/z/devtools.json')
  .expectStatus(404)
  .expectJSON({ name: 'badge', value: 'not found' });

t.create('malformed response')
  .get('/v/foobar.json')
  .intercept(nock => nock('http://crandb.r-pkg.org')
    .get('/foobar')
    .reply(200))
  .expectJSON({ name: 'cran', value: 'invalid' });

t.create('connection error')
  .get('/v/foobar.json')
  .intercept(nock => nock('http://crandb.r-pkg.org')
    .get('/foobar')
    .replyWithError({ code: 'ECONNRESET' }))
  .expectJSON({ name: 'cran', value: 'inaccessible' });

t.create('unspecified license')
  .get('/l/foobar.json')
  // JSON without License.
  .intercept(nock => nock('http://crandb.r-pkg.org')
    .get('/foobar')
    .reply(200, {}))
  .expectJSON({ name: 'license', value: 'unknown' });

This provides close to 100% code coverage for the CRAN/METACRAN badges. Overall, I'm happy with it. I find the Joi syntax too verbose. This is the syntax from my sketch, which I prefer:

t.create('version')
  .get('/v/devtools.json')
  .expectJSON({ name: 'cran', value: v => v.should.match(/^v\d+\.\d+\.\d+$/) });

The implementation of that was more complicated than I expected, so I'm going to defer it. This is good enough to start writing tests. We can always improve it later.
Somewhat tangentially, I always thought it would be welcome to have a (non-test) script that would simply record a known-good output of all .json badges into a JSON file, along with all network outputs from using try.html. Then we could also have a (test) script that checks that all the .json badges remain the same, ensuring that PRs don't cause regressions.
That's a nice idea. It would help prevent regressions in the code, and could run fast, on every commit. It's a nice complement to tests which hit the vendors' servers, which are designed to catch issues like #939 that lie with the services themselves. A problem we'd need to solve is how to maintain the battery of recorded outputs when a dev wants to make additions or changes. When a dev adds new functionality, they need to record a few new requests, but keep the rest intact. When changing code to accommodate API changes upstream, they need to replace a few requests, but keep the rest intact.
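A rough sketch of what that record script could look like, under stated assumptions: node-fetch as a dev dependency, a hypothetical badge-paths.json listing the .json badge paths to exercise, and a locally running server on port 8080. Keeping existing entries unless --update is passed is one way to let a dev add or refresh a few badges while keeping the rest intact.

'use strict';

const fs = require('fs');
const fetch = require('node-fetch'); // assumed dev dependency

const FIXTURE_FILE = 'recorded-badges.json';
const BASE_URL = 'http://localhost:8080'; // assumed locally running server
const badgePaths = require('./badge-paths.json'); // hypothetical list of badge .json paths

async function record () {
  const update = process.argv.includes('--update');
  const recorded = fs.existsSync(FIXTURE_FILE)
    ? JSON.parse(fs.readFileSync(FIXTURE_FILE, 'utf8'))
    : {};

  for (const badgePath of badgePaths) {
    // Keep existing recordings intact unless the dev asks to refresh them.
    if (recorded[badgePath] !== undefined && !update) continue;
    const res = await fetch(`${BASE_URL}${badgePath}`);
    recorded[badgePath] = await res.json();
  }

  fs.writeFileSync(FIXTURE_FILE, JSON.stringify(recorded, null, 2));
}

record().catch(err => {
  console.error(err);
  process.exit(1);
});

A companion test script could then load the same fixture file, re-request each path, and fail on any mismatch.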
Re: #1286 (comment) ^^ @chris48s I do think it's important that we hit the real services during PRs, and also nightly, so we can detect breaking changes in the upstream APIs. The current setup is not perfect. I don't like that the build breaks on master when there are problems upstream. And we still don't have a good solution for our GitHub service tests. (#979)
Unsurprisingly, it seems like you've already thought about this quite thoroughly :) Having read this thread, I see that whereas you would usually want to mock out interactions with an external service, the nature of this project means you do want some tests in your suite that will start failing if an external service goes away or makes a backwards-incompatible API change. I guess the situation I am thinking of is not so much API breaks (which are important to detect), but if you think about this test for example: https://github.com/badges/shields/blob/master/service-tests/npm.js#L34-L36 This test could start failing because Express changes its licence, rather than because there is a regression in the code under test or NPM has changed its API. Obviously in writing tests we can try to make good choices about the packages/repos we test against, but it is always an issue. I think given where the project/test suite is now, prioritising the integration tests is most important, but adding what you've described as 'recorded tests' has value too. Do you have any services where you also have 'recorded tests' in place?
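(On the Express example above: one way to reduce that kind of flakiness, sketched here under the assumption that the route and tester setup mirror service-tests/npm.js, is to assert on the shape of the license value rather than pin the exact string, so the test keeps passing if Express relicenses while still failing on regressions or upstream API changes.)

'use strict';

const Joi = require('joi');
const ServiceTester = require('../service-tester');

const t = new ServiceTester('npm', '/npm'); // assumed tester setup
module.exports = t;

t.create('license')
  .get('/l/express.json')
  .expectJSONTypes(Joi.object().keys({
    name: Joi.equal('license'),
    // Accept any non-empty license string instead of pinning 'MIT'.
    value: Joi.string().min(1)
  }));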
We don't have any recorded tests right now. I think this issue is a good record, though I'm not sure there's anything actionable right now. Let's open a new issue when there is!
I'd like to suggest a goal for the project: automated tests of the vendor badges.
As Thaddée mentioned in #411:
I agree that tests like these would be valuable for end users. They would detect API problems when they occur, rather than hoping and waiting for someone to eventually show up and report them.
They'd be relatively simple to write. They would help badge developers, and push more of the testing responsibility from the reviewer to the contributor.
In PR reviews, I spend a significant amount of time manually testing various combinations of inputs. After requesting changes, I really should start all over again. Tests would make this faster and much more reliable.
I'm skilled enough to spot many common problems in this type of code; however, unlike Thaddée, I'm not an expert in this codebase. I don't have the eye to recognize all the problematic patterns that he would. I can't tell whether code is obviously right or obviously wrong just by looking at it.
Given how easy it is to write vendor badges, and what's sure to be endless growth of available repositories and tools, PR review will remain a bottleneck. With automated tests, testing responsibility can be scaled out to a wide net of contributors. Writing tests for existing vendors is another easy way for contributors to put love into the project, even contributors who are content with the existing feature set.
Some downsides:
There are a few important attributes of tests like these:
We could call these integration tests.
There's another, more ambitious kind of test I'd also like to see: tests which record vendor responses and play them back later. They would inject mock vendor responses to handle cases like malformed responses, which can't be tested any other way. Fundamentally, they could test that the server code is doing exactly what it is supposed to, regardless of whether a package still exists or a vendor server is temporarily down.
Important attributes:
We could run the whole suite of tests on every PR. That ensures a change doesn't break other features, speeds up reviews, and lets us merge with confidence.
Running the whole suite is especially useful for refactoring: making potentially significant changes to the implementation which should not affect most of the server's behavior.
The tool I've been experimenting with is Nock Back. Working tests are reliable and it's pretty good overall. However, it doesn't have the most active maintenance, I've run into some bugs (nock/nock#870), and I've found failing tests to be tricky to debug (nock/nock#781, nock/nock#869).
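For reference, here's roughly how Nock Back reads in isolation, recording and replaying a single upstream call. This is a sketch: node-fetch, the fixture directory, and the mode are my choices, and wiring this through the badge server itself (which has to run in-process for nock to intercept its outbound requests) is the part that gets tricky.

'use strict';

const path = require('path');
const fetch = require('node-fetch'); // assumed dev dependency
const nockBack = require('nock').back;

nockBack.fixtures = path.join(__dirname, 'fixtures');
// 'record' mode replays a fixture when it exists and records one when it
// doesn't, which roughly matches the add-new/keep-the-rest workflow above.
nockBack.setMode('record');

describe('crandb upstream response (recorded)', function () {
  it('replays the recorded devtools metadata', function (done) {
    nockBack('crandb-devtools.json', function (nockDone) {
      fetch('http://crandb.r-pkg.org/devtools')
        .then(res => res.json())
        .then(pkg => {
          // ...assert on pkg here, e.g. that a Version field is present...
          nockDone();
          done();
        })
        .catch(done);
    });
  });
});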
We could call these recorded tests.
Checking code coverage is important not just for improving the quality of the code, but also the quality of the service. In my experience writing just a few of these, I found behaviors which should be improved, code paths that never invoke their callback, and other errors in error-handling code.
However, recording responses is a really big bite to take, especially with some uncertainty about the current state of the tooling. So I'd suggest we leave recording for another day.
For now, the goals should be the following:
This would include:
a. Automatically inferring which vendors to run (see the sketch after this list)
b. Reporting code coverage
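On (a), a rough sketch of how the vendors to run could be inferred from the files changed in a branch. The git range, the service-tests/ layout, and the glob dev dependency are assumptions about the repo.

'use strict';

const { execSync } = require('child_process');
const glob = require('glob'); // assumed dev dependency

// Files touched on this branch relative to master (assumed base branch).
const changed = execSync('git diff --name-only master...HEAD', { encoding: 'utf8' })
  .split('\n')
  .filter(Boolean);

// A change to any file named <service>.js selects service-tests/<service>.js
// when such a test file exists, e.g. touching cran.js selects the CRAN tests.
const services = new Set();
changed.forEach(file => {
  const match = file.match(/(?:^|\/)([a-z0-9-]+)\.js$/);
  if (match && glob.sync(`service-tests/${match[1]}.js`).length > 0) {
    services.add(match[1]);
  }
});

// Fall back to the whole suite when nothing specific matched.
const testFiles = services.size > 0
  ? [...services].map(service => `service-tests/${service}.js`)
  : glob.sync('service-tests/*.js');

console.log(testFiles.join('\n')); // feed this list to the mocha invocation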
Thoughts? Concerns?
And importantly, would anyone else like to join in the fun? 😀