-
Notifications
You must be signed in to change notification settings - Fork 393
Fix premature read on stream requests in the sys.take_raft_snapshot
method
#771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Support for system CA bundle: Basically fixing what requests should support by default * Fixing the Dummy backend * Getting more consistent about the slots returned by the job backup() * Refactoring ZFS functions into utils (more practical if we ever switch to a C binding library) * Attempting to fix slow streaming in hvac (see hvac/hvac#771) * Adding gzip checksum validation (it ensures we don't have any scripting error as well)
085b858
to
9dbfa98
Compare
3be579c
to
512f3b5
Compare
Codecov Report
@@ Coverage Diff @@
## main #771 +/- ##
==========================================
- Coverage 84.78% 84.77% -0.02%
==========================================
Files 65 65
Lines 3083 3087 +4
==========================================
+ Hits 2614 2617 +3
- Misses 469 470 +1
|
Hi @dekimsey , The Since your PR was opened before this change, I have:
Re: the PR, as I've taken a more active role in maintaining the project and become more aware of its inner workings, I've found a lot of issues with the Adapter system that I think are going to require large structural changes. This change makes some sense, but in a way it breaks the (already tenuous) contract of the The change could be made "upstream" in the It would still do the same thing in that Anyway I am happy to discuss this change and other Adapter stuff more if you are still interested, I know it's been a while. |
Hi @briantist, it has been awhile , but that's okay! I appreciate you updating the PR's target for me. I'll try to look at this tomorrow and see if I understand where to put the change. At first glance, I'm not sure. It's not clear to me how I might obtain a RawAdapter object from this point in the code. But this is just from vague memory and I'm on mobile right now. I'll see if I can dig into this better tomorrow. Thank you! PS I did identify I think this issue was caused by #537, when adapters were changed. |
This is actually one of the things that can make the adapter system we have now a little confusing, and at times, restrictive. At this point in the code, you cannot choose which adapter is in use. The library is designed that a method such as this is supposedly agnostic to the adapter. But many methods then assume things about the response, based on what the project's default Adapter is. But perhaps more to the point for this PR is that you don't have to worry about that in this case, my recommendation to move the code from If you made this change in Not sure if I've helped with this explanation or made it more confusing 😅 , happy to hash it out further. The raft snapshot endpoint is an interesting one for two reasons:
No worries, thanks very much for submitting this. Stream support was not something I had my mind at all when thinking about the Adapter system recently so it's quite helpful. |
It's here, it's just under a different link on the parent page. Makes it easy to miss.
I don't quite follow this, from my read RawAdapter passes kwargs through to requests and doesn't manipulate the response object at all. From my read, it already works as is. |
Actually, you are completely right. I am starting to think that any fix we could make right now in the Adapter system would not be great long term (due to the other issues with the system that need to be changed). Because the need to return a stream is so rare (this is the only call I'm aware of that returns raw binary data, if you know of more, that would be good to know), I'm starting to warm up to the idea of having this one method not use the What do you think about that idea? The "tricky" (annoying) part is having to pass in all the fields from the current adapter into the constructor of raw_adapter = hvac.adapters.RawAdapter(base_url=self._adapter.base_uri, token=self._adapter.token, ..., ..., ..., ...)
return raw_adapter.get(
url=api_path,
stream=True,
) Although we could do that directly in the method like that, it would be prone to accidental breakage if parameters are changed in any adapters in the chain, so I'd prefer we add a classmethod to the
class Adapter(metaclass=ABCMeta):
def __init__(
self,
base_uri=DEFAULT_URL,
token=None,
cert=None,
verify=True,
timeout=30,
proxies=None,
allow_redirects=True,
session=None,
namespace=None,
ignore_exceptions=False,
strict_http=False,
request_header=True,
):
# real implementation here
pass
@classmethod
def from_adapter(
cls,
adapter,
):
return cls(
base_uri=adapter.base_uri,
token=adapter.token,
cert=adapter.cert,
verify=adapter.verify,
timeout=adapter.timeout,
proxies=adapter.proxies,
allow_redirects=adapter.allow_redirects,
session=adapter.session,
namespace=adapter.namespace,
ignore_exceptions=adapter.ignore_exceptions,
strict_http=adapter.strict_http,
request_header=adapter.request_header,
) And then this would make the actual update for raft snapshots look more like this: raw_adapter = hvac.adapters.RawAdapter.from_adapter(self._adapter)
return raw_adapter.get(
url=api_path,
stream=True,
) Eager to hear your thoughts! If that sounds good, let me know how you'd like to proceed, I could put up a separate PR with the classmethod so that you're working off of that, or it can be implemented here as well, either by you or I could push up some commits, or let me know what works. |
Hey @dekimsey wondering if you've had a chance to look at the above. I think this is a viable fix for the issue and that we could get it out in the next release. I'm happy to implement it myself, but I appreciate your report and patience, and if you're still interested in contributing to We're actively looking for new contributors but I also understand you posted this a while ago and may not be interested or able to do that now. Either way I thank you! |
When a caller (such as `sys.take_raft_snapshot`) performs a stream request, the act of attempting to parse the response as JSON causes the entire response body to be read and the underlying connection to be closed. This renders the streaming response moot. This change addresses that issue by examining if the caller requested a stream response, and if so returning the response as is. Without the change, it is impossible to read raft snapshots that are larger than memory as the entire response is read into memory to attempt to read as JSON.
512f3b5
to
52de1b1
Compare
Co-authored-by: Daniel Kimsey <dkimsey@trustwave.com>
sys.take_raft_snapshot
method
@dekimsey I've gone ahead and implemented the changes I proposed, but pushed to this PR, hope that's ok. I would appreciate if you have a few minutes to check it out. I'd be especially grateful if you're able to actually test the raft snapshot method itself, since we don't have any raft tests. 😢 @kaypeter87 do you have a sense of how difficult it might be to get raft set up in a Vault dev server? Is it even possible with a single node? Wondering if we might be able to get minimal testing going or if that's going to be infeasible. |
@briantist that shouldn't be too difficult. Every feature of Vault is available in "dev" mode. The -dev flag just short-circuits a lot of setup to insecure defaults. All data is stored (encrypted) in-memory. I haven't personally tried enabling dev mode with raft, but it should be possible from what the documentation states. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the only potential use case for these streaming results? Should this change be more broad to cover any potential larger than system memory use cases?
@adammike right now, it is the only known use case. The original PR sought to solve it more broadly, but due to the way Adapters are defined, and the leaky abstraction we have, there was not actually a good place to make such a broad change (this cannot work with the JSONAdapter generally). Some of the issues we've discussed in the past about Adapters, and some of the changes in #898 should lead us toward a future where that might make sense, but I believe that for now this is best solved for this one method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG2M
commit 1f88016 Author: Daniel Kimsey <90741+dekimsey@users.noreply.github.com> Date: Mon Aug 14 15:22:10 2023 -0500 Fix premature read on stream requests in the `sys.take_raft_snapshot` method (hvac#771) * Fix premature read on stream requests When a caller (such as `sys.take_raft_snapshot`) performs a stream request, the act of attempting to parse the response as JSON causes the entire response body to be read and the underlying connection to be closed. This renders the streaming response moot. This change addresses that issue by examining if the caller requested a stream response, and if so returning the response as is. Without the change, it is impossible to read raft snapshots that are larger than memory as the entire response is read into memory to attempt to read as JSON. * add the Adapter.from_adapter class method * add test for adapter class method * Modify raft snapshot to always use RawAdapter Co-authored-by: Daniel Kimsey <dkimsey@trustwave.com> --------- Co-authored-by: Brian Scholer <1260690+briantist@users.noreply.github.com> commit c398774 Author: ceesios <cees@virtu-on.nl> Date: Mon Aug 14 22:17:12 2023 +0200 ldap auth method - add missing `configure` params by vault api names (hvac#975) * add missing params by vault api names * move new parameters to the end * remove duplicate keys * add ldap tests for new params * fix black formatting * fix integrtion test with raw string * remove userfilter from integration test * Revert "remove userfilter from integration test" This reverts commit 296e9f2. * fix userFilter failure on Vault < 1.9 * fix capitalization * fix conditional for other tests * fix typo in user_dn doc * add generate_parameter_deprecation_message utility function * fixup * stop suppressing deprecation errors * add aliased_parameter decorator and tests * fix asterisks in docstring * Revert "fix asterisks in docstring" This reverts commit 1a599ec. * fix docstring asterisks without side effects * add testcases to fill out coverage of alias decorator * fix lint * update LDAP configure to use alias wrapper for replaced parameter names * update test references to use canonical names * add client_tls, connection_timeout, max_page_size, dereference_aliases and order remove_nones * Revert "add client_tls, connection_timeout, max_page_size, dereference_aliases and order remove_nones" This reverts commit 55d28f8. * add client_tls, connection_timeout, max_page_size, dereference_aliases and order remove_nones * add new params to unit tests --------- Co-authored-by: Brian Scholer <1260690+briantist@users.noreply.github.com>
When a caller (such as
sys.take_raft_snapshot
) performs a streamrequest, the act of attempting to parse the response as JSON causes the
entire response body to be read and the underlying connection to be
closed. This renders the streaming response moot.
This change addresses that issue by examining if the caller requested a
stream response, and if so returning the response as is.
Without the change, it is impossible to read raft snapshots that are
larger than memory as the entire response is read into memory to attempt
to read as JSON.