replace old ocsp and crl apiTests with new integration tests#120

Open
heskew wants to merge 5 commits into main from cert-verification-integration-tests

Member

@heskew heskew commented Jan 26, 2026

Resolves #118 and #119

@heskew heskew marked this pull request as ready for review January 26, 2026 16:43
@heskew heskew requested a review from a team as a code owner January 26, 2026 16:43
Contributor

@cb1kenobi cb1kenobi left a comment

LGTM

Member

@Ethan-Arrowood Ethan-Arrowood left a comment

This is looking great. Love to see some extensive use of the new integration testing framework.

I'm most concerned with two things:

  1. configurable ports (can introduce major flakiness)
  2. Assertions seem to be okay with 200, 400, and other status codes. Unless it really is random or undetermined, these scenarios feel like specific cases we should be asserting for. What would cause a request to sometimes 200 and other times 400 when it comes to certificate verification? Seems suspicious to me, but maybe there is something to all of that I'm not aware of or considering.

Finally, is there any additional documentation for the new utils (or tests) that needs to be added to the integration testing docs content?

[
'ocsp',
'-port',
String(port),
Member

I think the ability to specify a port should be optional. This introduces the potential for flakiness if the specified port is already in use.

If possible, can this command self-allocate an available port number and then this method resolves that port number in the response object? Something like Node's http.createServer().listen(0)?

If not, could we look into some sort of retry mechanism at least?

From my own experience, I know there is no mechanism to "reserve" a port number, at least not without some race condition. Packages like https://www.npmjs.com/package/get-port are susceptible to race conditions, since they just spawn a server, read the port number, then return the number, and you hope you can use that value before something else does.
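
For illustration, a minimal sketch of the self-allocating approach for a Node-hosted endpoint: listening on port 0 lets the OS pick a free port, and the helper reports back which one was bound. The names `EphemeralServerContext`, `startOnEphemeralPort`, and `stopServer` are hypothetical and are not the helpers from this PR.

```typescript
import { createServer } from 'node:http';
import type { Server } from 'node:http';
import type { AddressInfo } from 'node:net';

// Hypothetical context shape; the PR's real CrlServerContext is not shown in this thread.
export interface EphemeralServerContext {
  server: Server;
  port: number;
}

// Listen on port 0 so the OS assigns a free port, then report which port was bound.
export function startOnEphemeralPort(): Promise<EphemeralServerContext> {
  return new Promise((resolve, reject) => {
    const server = createServer((_req, res) => {
      res.statusCode = 200;
      res.end('ok');
    });
    server.once('error', reject);
    server.listen(0, '127.0.0.1', () => {
      // For a TCP listener, address() returns an AddressInfo object.
      const { port } = server.address() as AddressInfo;
      resolve({ server, port });
    });
  });
}

// Close the server and resolve once it has stopped accepting connections.
export function stopServer(server: Server): Promise<void> {
  return new Promise((resolve, reject) => {
    server.close((err) => (err ? reject(err) : resolve()));
  });
}
```

Because the server keeps the socket it bound, there is no gap between discovering the port and using it, which is the race that probe-then-release packages cannot avoid.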

Member Author

Yeah, I don't want to get too complex in here. I've updated what I can in here and might see what I can do about the crl and ocsp endpoints and cert generation. Have some ideas but that might be fine for a follow up.

Member Author

Eh...you've spent so much time and effort on the initial integration test setup I don't want to leave something like this for later. I'll get something pushed up in a bit to take care of the remaining potential port conflicts.... brb

Member

Thank you! I think preventing flake in our integration tests is crucial.

Member Author

Yeah, no worries. See how you feel about this. Leaving port allocation to a lower level would probably be possible, but more complex, especially across major platforms. I'm still not sure how this will run as it is across platforms and environments with the openssl dependency. I'll do further testing on my Windows and Linux machines and follow up with additional PRs if needed.

Comment on lines +7 to +9
import fs from 'node:fs';
import path from 'node:path';
import { execSync } from 'node:child_process';
Member

Nit: Mixing import patterns is funny to me. I'd expect either all method imports (import { writeFile } from 'node:fs';) or all module imports (import child_process from 'node:child_process';). We likely should get a lint rule going for this. No big deal if you don't change it here.

Member Author

Guess it depends on if you want to use the default export vs named. To me it just depends on what seems to look better in code and this felt like a fine balance. Don't think it's a big deal if tidy and 100% down for additional useful guardrails.

* @param certsPath - Path to directory containing test.crl
* @returns Promise resolving to CrlServerContext
*/
export async function startCrlServer(port: number, certsPath: string): Promise<CrlServerContext> {
Member

Similar to above, I don't like specifying ports as it can be racy or flaky. I'll include the appropriate code change below since this one is at least using Node.js' createServer API.

Member Author

I've changed to auto-assigned ports where more easily possible, but the OCSP and CRL endpoints need a known port at cert creation time, so there's a little more involved. Added some comments and hopefully made it less likely that there would be conflicts. If there are, we should get a clear error in the logs when an endpoint has a port conflict.
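
As a rough sketch of what a "clear error" on conflict could look like for an endpoint whose port must be fixed ahead of time: the helper below binds a Node HTTP server to a given port and turns `EADDRINUSE` into a message that names the endpoint. `startFixedPortServer` and its `label` parameter are assumptions for illustration, not code from this PR.

```typescript
import { createServer } from 'node:http';
import type { Server } from 'node:http';

// Hypothetical helper: bind to a port that must be known in advance (for example,
// because the endpoint URL is embedded in a certificate) and fail loudly on conflict.
export function startFixedPortServer(port: number, label: string): Promise<Server> {
  return new Promise((resolve, reject) => {
    const server = createServer((_req, res) => {
      res.end();
    });
    const onError = (err: NodeJS.ErrnoException): void => {
      if (err.code === 'EADDRINUSE') {
        // Name the endpoint and port so the log points straight at the conflict.
        reject(new Error(`${label}: port ${port} is already in use`));
      } else {
        reject(err);
      }
    };
    server.once('error', onError);
    server.listen(port, '127.0.0.1', () => {
      server.removeListener('error', onError);
      resolve(server);
    });
  });
}
```

This does not remove the possibility of a conflict; it only makes the failure obvious instead of surfacing later as an unrelated-looking connection error.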

Comment on lines 132 to 137
err.code === 'ECONNRESET' ||
err.code === 'EPROTO' ||
err.code === 'ECONNREFUSED' ||
err.message?.includes('socket hang up') ||
err.message?.includes('certificate') ||
err.message?.includes('closed'),
Member

These feel like 3 different scenarios we should be testing for. Is it random how this situation would error? Does it change based on the node version?

Member Author

Tidied up a bunch of this to have clear narrow assertions. Anything else should be treated as an unexpected failure somewhere (code or test). Thanks for calling that out. I would have wanted to do this later anyway, but why not get a clean start. 🫠
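
One way a narrow assertion could look, as a sketch only: instead of accepting any of several codes or message fragments, each scenario names the single error code it expects and anything else fails with the actual code in the message. `assertErrorCode` is a hypothetical helper, and which code a given rejection really surfaces as would need to be confirmed against the server under test.

```typescript
// Node network errors carry a string `code` alongside the message.
type CodedError = Error & { code?: string };

// Hypothetical helper: pass only when the error carries exactly the expected code.
export function assertErrorCode(err: unknown, expectedCode: string): void {
  if (!(err instanceof Error)) {
    throw new TypeError(`expected an Error, got ${typeof err}`);
  }
  const actual = (err as CodedError).code;
  if (actual !== expectedCode) {
    throw new Error(`expected error code ${expectedCode}, got ${String(actual)} (${err.message})`);
  }
}
```

A test would then call something like `assertErrorCode(err, 'ECONNRESET')` in its catch path, where the code shown is a placeholder for whatever that scenario is confirmed to produce.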

},
(res) => {
res.resume();
res.on('end', () => resolve(res.statusCode!));
Member

I wasn't aware the statusCode could be undefined or null.

Member Author

It can be undefined apparently...
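
For context, the Node.js type definitions declare `IncomingMessage.statusCode` as `number | undefined` because the same class also represents server-side requests, where a status code does not apply. A small sketch of handling that without the non-null assertion follows; `requireStatusCode` is a hypothetical name, not something from this PR.

```typescript
import type { IncomingMessage } from 'node:http';

// Hypothetical helper: narrow statusCode to a number, or fail with a clear message
// instead of silently passing undefined along via `!`.
export function requireStatusCode(res: Pick<IncomingMessage, 'statusCode'>): number {
  if (res.statusCode === undefined) {
    throw new Error('response has no status code');
  }
  return res.statusCode;
}
```

In the request callback above, this would replace `resolve(res.statusCode!)` with `resolve(requireStatusCode(res))`.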


// First request - populates cache
const status1 = await makeRequest();
ok(status1 === 200 || status1 === 404, 'First request should succeed');
Member

Why are both 200 and 404 okay here?

Comment on lines 172 to 163
if (ocspResponder) {
await stopOcspResponder(ocspResponder);
ocspResponder = null;
}
Member

This test stopping the ocsp responder that the other tests in this suite rely on is not ideal. First of all, I'm not sure if the Node.js test runner absolutely executes all tests within a suite in order. Furthermore, this would be very confusing if anyone ever added another test to this suite. In order to reduce potential issues, I'd prefer you do the following:

  • Explicitly disable concurrency for this suite, and leave a comment explaining why.
  • Switch the suite level test function to async and then await each test() call so they absolutely execute in the specified order.

So it'd look something like this:

// Concurrency is disabled by default in the Node.js test runner, and we are explicitly disabling it here
// since the last test in this suite must always run last in order to not pre-emptively disable the OCSP server.
suite('...', { concurrency: false }, async () => {
  await test('...', async () => {});
  await test('...', async () => {});
  await test('...', async () => {});
});

Member Author

Well, there's a comment now clarifying what will happen so that should be totally okay and not get us into trouble later. ;)

Yeah... explicit and clear order might be nice but it feels like it could be fragile still in a similar way. I might just add a completely separate suite file to isolate the behavior for that scenario...

Member Author

Left the suite intact but made the setup and teardown more explicit and resilient (and even better commented). See how this feels.
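
A sketch of one way setup and teardown can be made resilient when a test deliberately stops a shared responder: a small guard that makes start and stop idempotent, so both the test and a later cleanup hook can call stop safely. `Stoppable` and `ResponderGuard` are hypothetical names and do not describe the actual change in this PR.

```typescript
// Hypothetical stand-in for whatever handle the OCSP responder helper returns.
export interface Stoppable {
  stop(): Promise<void>;
}

// Tracks one responder so start and stop can be called from any test or hook
// without double-starting or double-stopping it.
export class ResponderGuard {
  private current: Stoppable | null = null;
  private readonly startFn: () => Promise<Stoppable>;

  constructor(startFn: () => Promise<Stoppable>) {
    this.startFn = startFn;
  }

  get running(): boolean {
    return this.current !== null;
  }

  async ensureStarted(): Promise<void> {
    if (this.current === null) {
      this.current = await this.startFn();
    }
  }

  async ensureStopped(): Promise<void> {
    const active = this.current;
    // Clear the reference first so a failing stop() is never retried on a dead handle.
    this.current = null;
    if (active !== null) {
      await active.stop();
    }
  }
}
```

With a guard like this, the test that simulates an outage would call `ensureStopped()`, any test that needs the responder would call `ensureStarted()` first, and the suite's cleanup hook could call `ensureStopped()` unconditionally.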

Comment on lines 229 to 231
// but no OCSP check occurs. Could get 401 if cert doesn't have a user.
ok(
res.statusCode === 200 || res.statusCode === 404 || res.statusCode === 401,
Member

Similar suggestion as above: why are we okay with all of these scenarios? Aren't there specific steps that lead to a 200 vs. a 404 or otherwise?

Member Author

Some residual ideas from the older tests, probably, but they should be explicit.

@heskew heskew marked this pull request as draft January 28, 2026 22:38
Member Author

heskew commented Jan 28, 2026

@Ethan-Arrowood thanks for all the feedback! I'll get back to this soon; I'm heads down on something else. Re: response codes, I've cared more about 'access denied' vs 'okay-ish'. I have no problem refining these tests, though, to catch any potential false passes.

Member

@kriszyp kriszyp left a comment

I haven't looked at the details and trust you are sorting that out, but I certainly approve at a high-level.

@heskew heskew force-pushed the cert-verification-integration-tests branch 3 times, most recently from c8a8dac to c52b676 Compare February 2, 2026 23:18
@heskew heskew force-pushed the cert-verification-integration-tests branch 3 times, most recently from 8dd73a5 to f23bf1a Compare February 3, 2026 18:54
@heskew heskew force-pushed the cert-verification-integration-tests branch from f23bf1a to 5f3dc41 Compare February 3, 2026 18:59
@Ethan-Arrowood
Member

Just took a read through your replies. Sounds like you're on a much better track now. I'll wait until you mark this as ready for review before going through the code again.

@heskew heskew force-pushed the cert-verification-integration-tests branch from 3efea32 to 9ce6ab2 Compare February 4, 2026 00:07
@heskew heskew marked this pull request as ready for review February 4, 2026 11:50
Member Author

heskew commented Feb 4, 2026

@Ethan-Arrowood - please give this another look whenever you get a chance

Development

Successfully merging this pull request may close these issues.

Enable OCSP and CRL certificate verification tests in CI

4 participants