
Attempt to improve upload performance #3501

Merged

Conversation

@webbnh (Member) commented Jul 24, 2023

The recent "Ops Review" session revealed that upload requests time out if the file approaches 7 GB in size. Profiling the execution indicates that the bulk of the time is spent extracting the metadata.log file from the results tarball. This is due to a "feature" of tar archives: the same file may appear more than once, so an extraction has to search the entire archive to ensure that the member being extracted is the most up-to-date (i.e., last) one, and this becomes punishing as the archive size grows. (Also, the Python standard library tarfile package is not as performant as the standalone tar(1) utility, which costs additional time.)
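The last-occurrence semantics can be demonstrated directly (this is an illustrative snippet, not Pbench code): tarfile resolves a member lookup to the last occurrence of that name, which is why a naive lookup must scan the whole archive.

```python
# Demonstration: a tar archive may contain the same member name more than
# once, and Python's tarfile resolves lookups to the LAST occurrence.
import io
import tarfile


def add_member(tar: tarfile.TarFile, name: str, data: bytes) -> None:
    """Append a regular-file member with the given name and contents."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    add_member(tar, "metadata.log", b"first version\n")
    add_member(tar, "metadata.log", b"second version\n")  # duplicate name

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    # getmember/extractfile scan to the end and return the last occurrence.
    data = tar.extractfile("metadata.log").read()

print(data)  # b'second version\n'
```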

This PR changes the Server to spawn a subprocess running the tar(1) utility, which lets us specify the --occurrence option to abort the search after the first instance of the member is found. Since Pbench owns the construction of the archive, we know that each member occurs only once, so there is no need to search for other instances of it. And, since the metadata.log file occurs near the beginning of the archive, aborting the search at the first instance of the member reduces the overall search time from multiple minutes to a fraction of a second.
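A minimal sketch of the approach (hypothetical names, not the actual cache manager code; assumes GNU tar is on the PATH) looks roughly like this:

```python
# Sketch: extract a single member with tar(1), stopping the archive scan
# at the first occurrence of the member (GNU tar's --occurrence option).
import subprocess
from pathlib import Path


def extract_first(tarball: Path, member: str) -> bytes:
    """Return the contents of `member` from `tarball`, aborting the
    archive scan as soon as the first occurrence is found."""
    cmd = [
        "tar",
        "-x",
        "-f", str(tarball),
        "--occurrence=1",   # stop after the first match
        "--to-stdout",      # write the member to stdout instead of disk
        member,
    ]
    proc = subprocess.run(cmd, capture_output=True, check=True)
    return proc.stdout
```

For a member stored near the front of the archive, this avoids reading the remaining gigabytes entirely, which is where the multiple-minutes-to-subsecond improvement comes from.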

This PR consists of a series of commits. Two of them are the changes to the cache manager module to add the subprocess execution and the changes to the corresponding unit tests. Two of them are passes over the cache manager module and the corresponding tests to "pick lint". The others are iterations on the respective changes to the code and the tests.

@webbnh webbnh added the Server label Jul 24, 2023
@webbnh webbnh added this to the v0.73 milestone Jul 24, 2023
@webbnh webbnh self-assigned this Jul 24, 2023
Review threads on lib/pbench/server/cache_manager.py (4, resolved)
@webbnh webbnh force-pushed the extract_first_metadata branch 2 times, most recently from 3c9b2fb to 806af82 Compare July 26, 2023 21:16
@dbutenhof previously approved these changes Jul 26, 2023
Review threads on lib/pbench/server/cache_manager.py (2, resolved)
@webbnh webbnh force-pushed the extract_first_metadata branch 3 times, most recently from 85502f5 to 828d815 Compare July 29, 2023 01:12
@webbnh webbnh force-pushed the extract_first_metadata branch 9 times, most recently from 940f84c to 9ba86ed Compare July 31, 2023 18:52
@webbnh webbnh marked this pull request as ready for review July 31, 2023 20:13
@webbnh webbnh requested review from dbutenhof and ndokos July 31, 2023 20:13
@webbnh (Member, Author) commented Jul 31, 2023

This is now ready for review.

@dbutenhof (Member) left a comment:

Mostly great, but unless I'm missing something, a few tests seem a bit weak?

Review threads on lib/pbench/server/cache_manager.py (2, resolved) and lib/pbench/test/unit/server/test_cache_manager.py (2, resolved)
@dbutenhof (Member) left a comment:

test_cache_manager.py needs to be blackened

@dbutenhof previously approved these changes Jul 31, 2023
@dbutenhof (Member) left a comment:

test_cache_manager.py needs to be blackened... but I'll grant a symbolic victory lap ...

@ndokos (Member) left a comment:

Still looking at the test code, but the rest LGTM.

@@ -1042,6 +1040,7 @@ def mock_shutil_which(
     with monkeypatch.context() as m:
         m.setattr(shutil, "which", mock_shutil_which)
         m.setattr(subprocess, "Popen", MockPopen)
+        m.setattr(Inventory, "close", MockBufferedReader.close)
Reviewer (Member) commented:
I suppose it's fine that you're mocking Inventory.close with a different close that you also don't expect to be called, and you cover all of the Inventory class below, but it looks odd, and the self-documentation here (plus the assert if it is called) could be better. Ah, well.

@webbnh (Member, Author) replied:
It should be fine to mock Inventory.close() with a different close(), given that they are all supposed to be providing the same io.IOBase.close() interface! 😆 (And, yeah, since I expect none of them to be called, using separate implementations seems silly and redundant. 😉)

Nevertheless, I don't disagree that, on the surface, it looks funny. 🙂

As for the documentation...well, Nick is still looking at the test code, so there is still a chance to fix that. 🙃 What in particular would you like to see?

Reviewer (Member) commented:
Again, the clearest documentation would be a mocked Inventory.close that doesn't claim to be a stream close. I recognize that's "wasted" code, but it's not much. Second best would be a comment that we're just being lazy here because it doesn't really matter as we don't expect it to be called. (And if you're at all like me, once you start to write that and think about how it'll look, you'll probably just write the "proper" mock instead.)
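The "proper" mock suggested above might be sketched as follows (hypothetical names, not the actual Pbench test code; the Inventory class here is a simplified stand-in): a dedicated close() replacement that fails the test loudly if it is ever called.

```python
# Sketch of a self-documenting mock: a close() that is never expected to
# run, and that fails unmistakably if it does.  Inventory is a stand-in
# for the real Pbench cache manager class.

class Inventory:
    """Stand-in for the real Inventory class."""

    def close(self) -> None:
        raise NotImplementedError("real close is not exercised here")


def mock_inventory_close(self) -> None:
    """Mocked Inventory.close: the test path should never close the
    Inventory, so any call is an immediate, explicit failure."""
    raise AssertionError("Inventory.close() was called unexpectedly")


# In a pytest test this would be installed with, e.g.:
#     monkeypatch.setattr(Inventory, "close", mock_inventory_close)
Inventory.close = mock_inventory_close
```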

@webbnh (Member, Author) replied:
> once you start to write that and think about how it'll look, you'll probably just write the "proper" mock instead.

One of the problems that I'm trying to address is to minimize the amount of code in the unit tests which is not executed per the coverage report. So, adding an additional mock which causes the test to fail if it is called actually moves away from that goal.

> the clearest documentation would be a mocked Inventory.close that doesn't claim to be a stream close.

I find this to be curious, because, AIUI, we want Inventory to act like a stream, right? (I mean, it provides all the same methods with basically the same semantics.) So, it makes sense to me to use a stream-close mock for Inventory.close() (in this weird instance, anyway)....

> Second best would be a comment that we're just being lazy here because it doesn't really matter as we don't expect it to be called.

I'll do that, if you think it's worth it (although, I don't think I'll term it as "lazy", because that's not really what it's accomplishing for me... 🙂).

Reviewer (Member) commented:
You should not wait for me to check the test code before you merge. I don't know how long it will take me 😁

Reviewer (Member) commented:
> I find this to be curious, because, AIUI, we want Inventory to act like a stream, right? (I mean, it provides all the same methods with basically the same semantics.) So, it makes sense to me to use a stream-close mock for Inventory.close() (in this weird instance, anyway)....

Inventory composes a stream and a "stream provider" so they can be managed together. That composition is critical, and eliding that to mock it as if it were a stream just seems really obscure and awkward to me. But what you've got will work, even if it might confuse a future reader who even notices it. I approved, so I'll leave the rest to you.
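The composition described above might look roughly like this (a simplified, assumed sketch, not the real class): the point is that Inventory's close() manages both the stream and its provider, so it is more than an io.IOBase.close().

```python
# Sketch: Inventory pairs a readable byte stream with the subprocess (or
# other "stream provider") that produces it, so both are torn down together.
import subprocess
from typing import IO, Optional


class Inventory:
    """Pair a readable byte stream with its producing subprocess."""

    def __init__(self, stream: IO[bytes], subproc: Optional[subprocess.Popen] = None):
        self.stream = stream
        self.subproc = subproc

    def read(self, size: int = -1) -> bytes:
        return self.stream.read(size)

    def close(self) -> None:
        # Stop the producer (if any) before closing the stream, so we
        # never leave a writer blocked with no reader on the other end.
        if self.subproc and self.subproc.poll() is None:
            self.subproc.kill()
            self.subproc.wait()
        self.stream.close()
```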

@webbnh webbnh merged commit ca36407 into distributed-system-analysis:main Aug 2, 2023
4 checks passed
@webbnh webbnh deleted the extract_first_metadata branch August 2, 2023 16:24
Labels: Server
Status: Done
3 participants