Skip to content

refactor(importer): reintroduce #2644 minus performance regression#2683

Merged
andrewpollock merged 7 commits intogoogle:masterfrom
andrewpollock:importer_refactor_perf_fix
Oct 21, 2024
Merged

refactor(importer): reintroduce #2644 minus performance regression#2683
andrewpollock merged 7 commits intogoogle:masterfrom
andrewpollock:importer_refactor_perf_fix

Conversation

@andrewpollock
Copy link
Copy Markdown
Contributor

This reverts commit 165b2fe.

Fix performance regression introduced:

  • don't create a new storage client and Data Store client for each
    thread processing a GCS blob.

This reverts commit 165b2fe.

Fix performance regression introduced:

- don't create a new storage client and Data Store client for each
  thread processing a GCS blob.
Don't create a new one for each thread processing each GCS blob.
andrewpollock and others added 2 commits October 17, 2024 09:53
…et() failure

In the end, the problem was:
- `assert_has_calls(calls, any_order=False)` being the source of the problem

This assertion doesn't mind if there's *extra* calls above and beyond
what is being checked for, but without `any_order=True`, it *will* fail
if the results are out of order.

The *ordering* was the problem, with CVE-2022-0128 always being in the
set of calls, but not always interfering with the order of the other calls
being actively checked for. The ordering of the calls is unavoidably
non-deterministic (and fine) because they're being made from separate
threads.
andrewpollock and others added 2 commits October 21, 2024 00:58
The testing Datastore client is being reused in this code path, and
these are already set for it.

Address reviewer feedback on variable naming.
@andrewpollock andrewpollock merged commit cf73474 into google:master Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants