Harmonize extract with API logic #3473

Merged
dbutenhof merged 6 commits into distributed-system-analysis:main from the extract branch on Jul 6, 2023

Conversation

dbutenhof

PBENCH-1194

Rework Tarball inventory access to avoid the nascent cache map so that we can rely on this inside the Pbench Server process where tarballs haven't been unpacked.

This also fixes some incompatibilities in the Quisby APIs which relied on the older extract semantics.

In order to avoid stale streams on the inventory API I encapsulated the extractfile stream into an object that can behave as a stream but will close both the stream and tarball when done: this is deferred until Flask tears down the Response object.
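
The stream wrapper described above might look roughly like the following minimal sketch. The names `TarStream` and `open_member` are hypothetical and are not taken from this PR; the sketch assumes only the standard `tarfile` module. In the server, such an object would presumably be handed to Flask as the response body, so that `close()` runs when the Response is torn down.

```python
import tarfile
from typing import IO


class TarStream:
    """Hypothetical file-like wrapper that owns a member stream and its tarball."""

    def __init__(self, tar: tarfile.TarFile, stream: IO[bytes]):
        self.tar = tar
        self.stream = stream

    def read(self, size: int = -1) -> bytes:
        # Delegate reads to the member stream returned by extractfile().
        return self.stream.read(size)

    def close(self) -> None:
        # Close the member stream first, then the tarball it came from,
        # so neither is left open after the caller is done.
        self.stream.close()
        self.tar.close()


def open_member(tarball_path: str, member: str) -> TarStream:
    """Open one member of a tarball without unpacking the tarball (illustrative)."""
    tar = tarfile.open(tarball_path, "r:*")
    try:
        # extractfile() raises KeyError for a missing member and returns
        # None for members that are not regular files (e.g. directories).
        stream = tar.extractfile(member)
    except Exception:
        tar.close()
        raise
    if stream is None:
        tar.close()
        raise ValueError(f"{member!r} is not a regular file")
    # The tarball is deliberately left open: the caller closes both via close().
    return TarStream(tar, stream)
```

The point of the design is that the tarball must stay open while the member is being read, so ownership of both handles moves to whoever consumes the stream.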

Finally, I've added functional tests validating the contents, inventory, visualize, and compare APIs so we'll catch any further infrastructure "drift" that might affect them.

NOTE: I renamed the test_put.py functional test file to test_datasets.py. I've put this off many times to avoid over-complicating reviews, so this time I just did it in a separate commit. All the real changes are in the first commit, and (just) the rename is in the second commit.

dbutenhof added the Server, Code Infrastructure, and API labels Jun 29, 2023
dbutenhof self-assigned this Jun 29, 2023
webbnh left a comment

Still need to review the tests, but here's the first installment.

lib/pbench/server/cache_manager.py (11 review threads, 10 outdated, all resolved)
dbutenhof changed the title from "Extract" to "Harmonize extract with API logic" Jun 30, 2023
webbnh left a comment

I think that there are a couple of tests with broken mocking. Other than that, it's mostly nits and small stuff (and one suggestion).

lib/pbench/test/functional/server/test_put.py (5 review threads, 4 outdated, all resolved)
lib/pbench/test/unit/server/test_datasets_compare.py (4 review threads, all outdated and resolved)
webbnh previously approved these changes Jul 3, 2023
webbnh left a comment

Looks good. There's a nit or two, and I've got some coding suggestions (one of which might not actually be an improvement... 😊), and there are some unit test gaps/anomalies, but nothing that I feel should block the merge. So, see what you think.

lib/pbench/server/cache_manager.py (2 review threads, both outdated and resolved)
lib/pbench/test/unit/server/test_cache_manager.py (2 review threads, both outdated and resolved)
lib/pbench/test/unit/server/test_datasets_inventory.py (1 review thread, outdated and resolved)
lib/pbench/server/api/resources/datasets_inventory.py (1 review thread, outdated and resolved)
webbnh previously approved these changes Jul 3, 2023
webbnh left a comment

Looks good enough...although there are two open items from my last pass and two more from this one. 🙂

lib/pbench/test/unit/server/test_cache_manager.py (1 review thread, outdated and resolved)
The rename of test_put.py to test_datasets.py is in a separate commit. I keep
putting this off in favor of minimizing review complication, but I really want
to do it because the test grew beyond "put" way back, once I realized we
couldn't set real dependencies between test files...
webbnh left a comment

Looks good enough, as long as you're at peace with the open items above.

dbutenhof left a comment

I thought I'd posted both of these responses long ago: I wonder how that happened? In any case, I don't consider either of these to be "open issues" regardless of any minor alternative preferences you might entertain. 😆

lib/pbench/test/unit/server/test_cache_manager.py (1 review thread, outdated and resolved)
dbutenhof added a commit to dbutenhof/pbench that referenced this pull request Jul 5, 2023
PBENCH-1204

This stems from a request from the UI to help optimize a partial refresh after
using the relay upload dialog, by providing identification of the new dataset.
Although a client using the traditional `PUT /upload` already knows the name
and MD5, the returned information may be helpful.

This is a DRAFT partly because I added a validation of the new information to
the functional test, which I've renamed in distributed-system-analysis#3473 ... I'll do that merge here
after it's gone in.

It's also DRAFT because while I like the idea of including URIs (and in
particular this addresses a certain request regarding accessibility of the
tarball), I'm not really sure which URIs to include or in what form. I'm
certain that one or two people might possibly have opinions on this subject!
dbutenhof merged commit 61be9df into distributed-system-analysis:main Jul 6, 2023
4 checks passed
dbutenhof deleted the extract branch July 6, 2023 11:23
dbutenhof added a commit to dbutenhof/pbench that referenced this pull request Jul 6, 2023