Introducing AVA snapshots v3 #2685

ninevra · 2021-02-14T10:40:47Z

Implements a new snapshot file format containing all information necessary to generate the snapshot report file.

Allows the use of test.skip(), test.only(), --match, line number selection, and t.snapshot.skip() during --update-snapshots runs.

Closes #2634 by removing the "Could not update snapshots" case entirely.
Closes #2635 by allowing existing test & assertion selection mechanisms to be used with --update-snapshots.
Hopefully will assist with #1768 and #2099.

Current proposed snapshot file format:

Readable prefix: AVA Snapshot v3\n
Snapshot version (UInt16LE): 3
sha256 checksum of the following

gzip compressed data:

CBOR encoded data:

{
  blocks: [
    {
      title: "A test title",
      snapshots: [
        {
          label: "A snapshot label",
          buffer: /* Serialized concordance descriptor */
        },
        ...
      ]
    },
    ...
  ]
}

Notes:

The label field is absent unless the user supplied a label. This allows changing the default snapshot label if desired in the future.
The buffer field may be absent, if the snapshot has only ever been skipped. Such snapshots are currently rendered as <No Data>.
Currently, some data still exists only in the report file, namely the filenames of the test file and the snapshot file. This information can be derived from the name of the snapshot file, but theoretically that could change in the future.

Pros of the proposed format:

The snapshot report can now be regenerated from the snapshot file.
- Enables the use of test selection mechanisms during --update-snapshots
- Allows snapshot reports to be kept always in order, rather than the previous approach of appending changes at the end.
- Prerequisite for any feature that edits snapshots without overwriting them entirely.
AVA now maintains less byte-manipulation code
The snapshot format is easier (though not trivial, because of the gzipping and the header) for third-parties to read
Adding new properties either to snapshots, to blocks, or to the file overall should be easy and remain readable by older AVAs (though older AVAs may or may not preserve such unrecognized data when updating the .snap file)

Cons of the proposed format:

AVA can't totally guarantee that the snapshot file format is deterministic; some leeway is given to the CBOR encoder. .snap files might show spurious changes e.g. between minor versions of the CBOR encoder.
AVA's attack surface is changed. Attackers remain able to submit a malicious .snap file in a pull request, in hope of exploting potential bugs in AVA to exfiltrate repository secrets from CI runs or gain control of developers' machines. The total amount of code exposed to such an attack is significantly increased, and unless we carefully filter the result of CBOR decoding (not currently done), such an attacker can produce unexpected javascript types and values. On the other hand, the binary-parsing code now has rather more eyes on it.

Other changes:

Frontend:

Rather than append new tests to the old report, adding new tests without --update-snaphots now maintains declaration-order.
When the snapshot report is changed, tests not touched in that run (e.g. removed tests, tests which no longer use snapshots) are moved to the end of the report, maintaining their previous order among themselves.

Edge cases:

If --update-snapshots is passed and the only use of snapshots is in discarded t.try() attempts, then the snapshot file is removed. There's no snapshot data to record in this case, so I don't see any reason to keep the file.
If t.snapshot.skip(); t.snapshot(); is added at the end of a test, RangeError is no longer thrown; instead a "blank space" is recorded for the skipped snapshot. This is currently rendered in the snapshot report as <No Data>.
snapshot-manager now assumes Array.prototype.sort is stable. This is the true in all supported Node versions afaik, and is required by the standard as of ecmascript 2019, but notably is false in node v10. If Array.prototype.sort is not stable, the snapshot report order may not be deterministic.

Backend:

The snapshot file is eagerly loaded in the Runner constructor, rather than lazily loaded.
- Runner always emits a dependency on its .snap file, even in --update-snapshots mode and even if no snapshot assertions are performed.
- In some very narrow edge cases where the snapshot file has a valid version and checksum but unreadable data, this could result in throwing an unhelpful error. AVA@3 only threw such errors if snapshots were actually used.
Introduces a dependency on cbor.
Removes a dependency on md5-hex.

TODO:

This index was used to establish declaration order of beforeEach and afterEach hooks. Since these hooks no longer support snapshots, associatedTaskIndex is unnecessary.

This index was used to establish declaration order of snapshots with ids. Since snapshots with ids are no longer supported, this index is unnecessary.

This format contains all the necessary information for regenerating snapshot reports (test titles, snapshot labels, snapshot ordering) but isn't designed for robustness of any sort, only ease of implementation. Once a proof of concept is ready, the format should be revisited.

This requires deleting all the .snap files, as well as modifying contains-only.js and then putting it back after.

Still prohibit removing snapshots when any block was skipped.

This enables unit tests to continue using Runners without snapshot support, while allowing Runner to use the snapshot manager interface as though it was pre-loaded.

Including test.only(), --match, and line number selection.

Better conveys the purpose of the test suite. Clarifies the PR diff, since the old snapshot-updates is being entirely removed and replaced.

ninevra · 2021-02-16T00:01:56Z

Looks like the testing approach I've been using here and in snapshot-removal is very prone to causing timeouts in CI. Tests are taking upwards of half a minute to run in CI vs 3-4s locally. Further, since each suite contains a large number of concurrent tests in a single file, they're all kicked off at once, increasing the chance that no test finishes within AVA's global timeout period.

Most of these tests are creating a temporary directory, copying a fixture, running AVA once to set up some initial state, running AVA again to modify that state, checking that the change matches the expectation, and then cleaning up the temporary directory. (The windows runs are somehow timing out within the first fixture execution, which is very strange.) I'll have to rethink this & hopefully eliminate some of those steps and/or reorganize the tests to run in a timeout-friendly pattern.

novemberborn · 2021-03-01T19:39:33Z

md5 checksum of the following

Shall we update this to SHA-256? Just to deal with projects that rule out insecure hashes throughout the codebase.

novemberborn · 2021-03-01T19:45:39Z

Adding new properties either to snapshots, to blocks, or to the file overall should be easy and remain readable by older AVAs (though older AVAs may or may not preserve such unrecognized data when updating the .snap file)

Our only concern should be being able to read snapshots written by previous minor versions (within the same major release).

A test using a new API won't pass when used with an older version either.

novemberborn · 2021-03-01T19:46:27Z

AVA can't totally guarantee that the snapshot file format is deterministic; some leeway is given to the CBOR encoder. .snap files might show spurious changes e.g. between minor versions of the CBOR encoder.

The CBOR spec itself makes a big deal about determinism, so hopefully this will be rare.

novemberborn · 2021-03-01T19:51:31Z

The snapshot file is eagerly loaded in the Runner constructor, rather than lazily loaded.

Why does it need to be loaded eagerly? Could we load it synchronously when needed?

novemberborn · 2021-03-01T19:52:06Z

The snapshot file is eagerly loaded in the Runner constructor, rather than lazily loaded.

Why does it need to be loaded eagerly? Could we load it synchronously when needed?

Although, of course, if there is no snapshot file this is a cheap operation, and if there is, it's likely needed. So this is fine, never mind 😄

novemberborn

When the snapshot report is changed, tests not touched in that run (e.g. removed tests, tests which no longer use snapshots) are moved to the end of the report, maintaining their previous order among themselves.

What happens when tests are skipped? Are those sections moved to the end as well? We should have access to the declaration order.

docs/04-snapshot-testing.md

novemberborn

I'm going to need some more time to review this. Impressive work @ninevra.

ninevra · 2021-03-02T07:33:46Z

When the snapshot report is changed, tests not touched in that run (e.g. removed tests, tests which no longer use snapshots) are moved to the end of the report, maintaining their previous order among themselves.

What happens when tests are skipped? Are those sections moved to the end as well? We should have access to the declaration order.

Tests that are skipped should remain ordered properly. (edit: this is demonstrated by the snapshot for snapshot-workflow/selection.js, "With --update snapshots and test.skip(), other tests' snapshots are updated")

Currently, ordering info is collected for tests that are skipped (or "not selected" by other means) or that use t.snapshot() or t.snapshot.skip() (if used only in a discarded t.try(), I think things get a little inconsistent; I need to look at that case in more detail). Ordering info is not currently collected for tests that are run but don't use snapshots, nor for .todo() tests, nor for hooks.

It's possible to collect ordering info for all tests that are declared, whether they use snapshots or not; I'd add a Manager.prototype.touch(title, taskIndex) method and call it every time a test is declared. It's of course not possible to collect ordering info for tests that have been removed entirely.

The snapshot file is eagerly loaded in the Runner constructor, rather than lazily loaded.

Why does it need to be loaded eagerly? Could we load it synchronously when needed?

Although, of course, if there is no snapshot file this is a cheap operation, and if there is, it's likely needed. So this is fine, never mind smile

Manager now has a substantially larger interface to Runner (compare and save are joined by skipSnapshot and skipBlock), some of which needs to be called before the first snapshot comparison. I initially tried to keep lazyloading, which is certainly possible, but it requires basically shimming the entire Manager interface and replaying calls to skipSnapshot and skipBlock when the real Manager is finally loaded. The duplicated interface was really annoying to iterate on and debug, so I scrapped it.

Plus, in theory eager loading should make the performance characteristics of snapshot-based tests a little more consistent; before, the first t.snapshot() to run would perform sync file io and then parse a potentially-large file, while later assertions would be fast.

AVA can't totally guarantee that the snapshot file format is deterministic; some leeway is given to the CBOR encoder. .snap files might show spurious changes e.g. between minor versions of the CBOR encoder.

The CBOR spec itself makes a big deal about determinism, so hopefully this will be rare.

I expect it should be rare, yes. I'm invoking cbor.encode with {omitUndefinedProperties: true, canonical: true}, which should help. Unfortunately the cbor package doesn't yet appear to support "Deterministic CBOR", but if i recall correctly the main difference between that and "Canonical CBOR" is the choice of sort order for map keys, so it shouldn't really make a difference.

md5 checksum of the following

Shall we update this to SHA-256? Just to deal with projects that rule out insecure hashes throughout the codebase.

Sounds good to me. I'll hopefully have time to make changes in a day or two.

Adding new properties either to snapshots, to blocks, or to the file overall should be easy and remain readable by older AVAs (though older AVAs may or may not preserve such unrecognized data when updating the .snap file)

Our only concern should be being able to read snapshots written by previous minor versions (within the same major release).

A test using a new API won't pass when used with an older version either.

Makes sense.

Should we consider including a minor and/or patch version in the .snap file? Could add it to the readable prefix without issue, keep the existing version field as-is so that AVA <= 3 keep reporting the right error, and add a field version: {major: 3, minor: 0, patch: 0} (or version: [3, 0, 0] or version: "3.0.0") to the top level of the data structure. Then AVA@4.0 could issue a warning (probably not a hard error??) if it encountered a minor version > 0.

The ideal here would be to hook into AVA's global maximum concurrency, so that the total number of tests running across all test files was always at the processor count.

You could do that with shared workers actually.

A shared worker could only control parallelism across the test files where it was loaded, though, right? That is, if it was loaded in snapshot-removal and snapshot-workflow, it could limit their total concurrency to os.cpus().length, but it wouldn't know if AVA was running tests from other suites at the same time? (Shared workers are really cool, but I'm not familiar with the interface yet, sorry.)

Keep docs accurate for AVA 3. Co-authored-by: Mark Wubben <mark@novemberborn.net>

novemberborn · 2021-03-02T09:33:29Z

It's possible to collect ordering info for all tests that are declared, whether they use snapshots or not; I'd add a Manager.prototype.touch(title, taskIndex) method and call it every time a test is declared.

What scenario would this cover? Adding a new snapshot assertion to an existing test which is run selectively? Though that would still order correctly right?

Should we consider including a minor and/or patch version in the .snap file? Could add it to the readable prefix without issue, keep the existing version field as-is so that AVA <= 3 keep reporting the right error, and add a field version: {major: 3, minor: 0, patch: 0} (or version: [3, 0, 0] or version: "3.0.0") to the top level of the data structure. Then AVA@4.0 could issue a warning (probably not a hard error??) if it encountered a minor version > 0.

I don't think the warnings are worthwhile. Either it's a hard error, and then a single major version suffices, or handling in newer versions deals with missing fields. Which is needed anyway to ensure CI passes when a newer (minor / patch) AVA version is installed and snapshots haven't been updated.

A shared worker could only control parallelism across the test files where it was loaded, though, right? That is, if it was loaded in snapshot-removal and snapshot-workflow, it could limit their total concurrency to os.cpus().length, but it wouldn't know if AVA was running tests from other suites at the same time? (Shared workers are really cool, but I'm not familiar with the interface yet, sorry.)

You're right, yes. Controlling overall execution could be an interesting concept, but possibly something that AVA should provide a framework for but not provide out of the box.

The blocksByTitle format is an implementation detail of Manager, whereas the CBOR data structure is a characteristic of the snapshot file protocol. This change reduces codepath duplication between generateReport() and encodeSnapshots(), and avoids passing blockIndices around everywhere. If decodeSnapshots() and encodeSnapshots() are later exposed as a library, this may be closer to the desired interface.

ninevra · 2021-03-05T04:22:11Z

Switched to sha256 hashes and implemented the touch() method (the latter can be reverted easily if you think it's a detriment).

What scenario would this cover? Adding a new snapshot assertion to an existing test which is run selectively? Though that would still order correctly right?

It covers some narrow edge cases. For instance, if a test had its snapshot assertions removed, it's snapshot data would be retained but, since it wasn't skipped and no longer used snapshots, so the Manager would never learn its order. This woudn't matter until a new snapshot was added to the test file, at which point the report would get regenerated and the first test shunted to the end.

Collecting ordering info for every test also just seems a little easier to reason about.

novemberborn · 2021-03-07T19:09:04Z

@ninevra I've pushed some small tweaks, let me know what you think. In the snapshot format I renamed buffer to data, but I think everything else is the same.

Your tests seem very thorough. There's a lot of it though, so it's hard to review! Any tips? Or else let's just ship it 😄

Hopefully will make the snapshot-worker snapshots easier to read.

ninevra · 2021-03-12T06:27:23Z

@novemberborn all the tweaks look great to me! Especially the formatEntry() code, it looks a lot nicer now.

Your tests seem very thorough. There's a lot of it though, so it's hard to review! Any tips? Or else let's just ship it 😄

I mean, I don't think we should ship code you're not confident in, and I certainly don't intend to pressure you to do that. I am sorry that the diff ended up so massive.

What can I do to make the code and/or tests easier to review/maintain?

Ideas:

I've pushed a change to the snapshot-workflow tests that snapshots a before/after diff of the report, rather than just the final report; hopefully that'll make it easier to see what the tests are doing. Feel free to revert it if it's just making things worse.
I could consolidate all the snapshot-workflow tests in one file to make them less tedious to look through.
I could rewrite some of the test fixtures to use descriptive messages and titles instead of just 'foo' et al placeholders.

ninevra · 2021-03-12T06:34:32Z

I have a crude-but-working implementation of the shared worker concurrency limiter we discussed; it's on ninevra/ava/snapshot-regenerate-shared-worker-semaphore. I think landing that here would require updating avajs/test to expose plugin.js, though. Without that the plugin can't require('@ava/test/plugin').

Would you be interested in a PR to add semaphores to @ava/cooperate, or a separate plugin? The interface I wrote could use some more work, but eventually it'd be nice to have.

novemberborn · 2021-03-13T15:44:32Z

Would you be interested in a PR to add semaphores to @ava/cooperate, or a separate plugin? The interface I wrote could use some more work, but eventually it'd be nice to have.

That'd be perfect for @ava/cooperate!

I think landing that here would require updating avajs/test to expose plugin.js, though. Without that the plugin can't require('@ava/test/plugin').

That'd be fine, except that shared workers are still experimental and subject to breaking changes even in patch releases of AVA. I'm not sure we should be relying on that in our own test suite shipping AVA would break its tests.

novemberborn · 2021-03-13T16:06:44Z

I've pushed a change to the snapshot-workflow tests that snapshots a before/after diff of the report, rather than just the final report; hopefully that'll make it easier to see what the tests are doing.

It's very meta, but also very helpful. Thanks!

I think I got side tracked by all the fixtures. Once I skipped those it got a lot easier. Great work.

novemberborn · 2021-03-13T16:22:02Z

🎊

ninevra added 28 commits February 13, 2021 13:09

Remove associatedTaskIndex

a8b2a4f

This index was used to establish declaration order of beforeEach and afterEach hooks. Since these hooks no longer support snapshots, associatedTaskIndex is unnecessary.

Remove snapIndex

207b33d

This index was used to establish declaration order of snapshots with ids. Since snapshots with ids are no longer supported, this index is unnecessary.

Update test-tap/assert.js snapshot

0fbae43

Update test-tap/try-snapshot.js snapshot

2b40b78

Update test-tap/integration/snapshots.js snapshots

b537c46

Update test/assertions snapshot

30b02e0

Update test/snapshot-order/randomness.js snapshot

3721489

Update test/snapshot-updates snapshots

3a85c64

This requires deleting all the .snap files, as well as modifying contains-only.js and then putting it back after.

Write reports using blocksByTitle

2084726

Collect ordering info from Runner all at once

7403619

Test regenerating snapshot reports

a69ec49

Allow t.snapshot.skip() with --update-snapshots

6f80dc4

Update existing tests to allow snapshot.skip with -u

b7b6fb3

Allow test.skip() with --update-snapshots

49204dc

Still prohibit removing snapshots when any block was skipped.

Update tests to allow test.skip() with -u

69e2f62

Generalize Runner lazy-loading snapshot manager

66fcf02

This enables unit tests to continue using Runners without snapshot support, while allowing Runner to use the snapshot manager interface as though it was pre-loaded.

Allow -u with .only, --match, line number select

6e83431

Update tests to allow -u with test selection

b668f9d

Including test.only(), --match, and line number selection.

Set up snapshot workflow testing framework

7074973

More snapshot workflow tests

b542aca

Rename snapshot-updates suite to snapshot-workflow

6b8ef30

Better conveys the purpose of the test suite. Clarifies the PR diff, since the old snapshot-updates is being entirely removed and replaced.

Fix CI-dependent test failure

94bdaef

Add more snapshot workflow tests

a9222e6

Find fixture's dependencies relative to AVA_PATH

ca071cf

Add remaining todo snapshot workflow tests

90c3c01

Increase test timeout

15e226c

Further increase test timeout

a777048

Duplicate fixtures rather than use temp dirs

cf43700

novemberborn reviewed Mar 1, 2021

View reviewed changes

docs/04-snapshot-testing.md Outdated Show resolved Hide resolved

novemberborn reviewed Mar 1, 2021

View reviewed changes

Update docs/04-snapshot-testing.md

872506b

Keep docs accurate for AVA 3. Co-authored-by: Mark Wubben <mark@novemberborn.net>

ninevra added 6 commits March 4, 2021 20:11

Use sha256 instead of md5

819e4e6

Update test snapshots

0ed7a86

Explicitly remove taskIndex when encoding blocks

46b17f2

Separate taskIndices from blocks

afc8b01

Touch all non-todo tests at declaration

bb641f5

novemberborn added 2 commits March 7, 2021 20:03

Tweak snapshot manager code

22b4b98

Merge branch 'main' into snapshot-regenerate

5efbe6a

Snapshot diffs of changed reports

cff4c6a

Hopefully will make the snapshot-worker snapshots easier to read.

Upgrade concordance and rebuild snapshot reports

30f9f66

novemberborn changed the title ~~(Re)generate snapshot reports from .snap files~~ Introducing AVA snapshots v3 Mar 13, 2021

novemberborn merged commit 14ec281 into avajs:main Mar 13, 2021

ehmicky mentioned this pull request Sep 15, 2021

Some test snapshots cannot be updated netlify/build#3615

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introducing AVA snapshots v3 #2685

Introducing AVA snapshots v3 #2685

ninevra commented Feb 14, 2021 •

edited by novemberborn

ninevra commented Feb 16, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn left a comment

novemberborn left a comment

ninevra commented Mar 2, 2021 •

edited

novemberborn commented Mar 2, 2021

ninevra commented Mar 5, 2021

novemberborn commented Mar 7, 2021

ninevra commented Mar 12, 2021

ninevra commented Mar 12, 2021

novemberborn commented Mar 13, 2021

novemberborn commented Mar 13, 2021

novemberborn commented Mar 13, 2021

Introducing AVA snapshots v3 #2685

Introducing AVA snapshots v3 #2685

Conversation

ninevra commented Feb 14, 2021 • edited by novemberborn

Current proposed snapshot file format:

Pros of the proposed format:

Cons of the proposed format:

Other changes:

Frontend:

Edge cases:

Backend:

TODO:

ninevra commented Feb 16, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn commented Mar 1, 2021

novemberborn left a comment

Choose a reason for hiding this comment

novemberborn left a comment

Choose a reason for hiding this comment

ninevra commented Mar 2, 2021 • edited

novemberborn commented Mar 2, 2021

ninevra commented Mar 5, 2021

novemberborn commented Mar 7, 2021

ninevra commented Mar 12, 2021

ninevra commented Mar 12, 2021

novemberborn commented Mar 13, 2021

novemberborn commented Mar 13, 2021

novemberborn commented Mar 13, 2021

ninevra commented Feb 14, 2021 •

edited by novemberborn

ninevra commented Mar 2, 2021 •

edited