Potential race condition using storage::Client::WriteObject #3184
Comments
Hrm, now I also ran into a double free. Do we have some concurrency bugs in the library? stacktrace:
If this should go into a separate issue, please let me know :-) |
Thanks again for filing detailed bug reports. Can you tell us what version of libcurl is used by NixOS? Or better yet, is:
the right way to start a shell with all the NixOS packages? Should we use something other than |
The GCS support isn't yet merged.
However, I can provide a `default.nix` in some hours, which should catapult you into an environment containing the right version of nix, and gdb :-)
|
I was just planning to compile google-cloud-cpp from source with the nix dependencies and then try to repro in that environment. |
@coryan I pushed a repro at https://github.com/flokli/nix-gcs-repro, with build instructions. From inside the |
Ah, and from a |
@coryan did you manage to reproduce using the repo provided, or by building locally? Any way I can help with debugging? |
@flokli I have not been able to work on reproducing this in your environment. I am trying to write a test with larger/longer uploads and more opportunities for any race condition to crop up. One thing you could do to help is go through the release notes for curl (and nghttp2) and see whether anything that looks like your problem has already been found. I ask because we use curl in a fairly vanilla way: we create a |
For what it is worth, I have been running the code in this branch: for hours without any crashes (the program I linked runs 2 * NCORES threads, all uploading objects between 1GiB and 2GiB). It uses the same version of libcurl (7.65.3), so I suspect that either the problem is in nghttp2 or I need more threads. Can you tell me which nghttp2 version you use? Sorry I have not had the time to learn how to work with NixOS just yet. |
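For readers who want a picture of the stress test described above, here is a minimal sketch of the threading harness only. It is not the program from the linked branch: `UploadFn`, `StressResult`, and `RunUploadStress` are hypothetical names, and the actual upload (for example through `storage::Client::WriteObject`) is replaced by a caller-supplied callback so the sketch stays self-contained.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for one upload. A real test would write
// `size_bytes` of data to `object_name` and return true on success.
using UploadFn =
    std::function<bool(std::string const& object_name, std::uint64_t size_bytes)>;

// Hypothetical summary of a stress run.
struct StressResult {
  std::size_t thread_count;
  std::size_t uploads;
  std::size_t failures;
};

// Starts 2 * NCORES threads. Each thread performs `uploads_per_thread`
// uploads with sizes drawn uniformly from [min_size, max_size].
StressResult RunUploadStress(UploadFn const& upload, int uploads_per_thread,
                             std::uint64_t min_size, std::uint64_t max_size) {
  // hardware_concurrency() may return 0 when the value is unknown.
  unsigned const cores = std::max(1u, std::thread::hardware_concurrency());
  std::size_t const thread_count = 2 * static_cast<std::size_t>(cores);

  std::atomic<std::size_t> uploads{0};
  std::atomic<std::size_t> failures{0};
  std::vector<std::thread> workers;
  workers.reserve(thread_count);

  for (std::size_t t = 0; t != thread_count; ++t) {
    workers.emplace_back([&, t] {
      // Per-thread generator, so the threads share no mutable state
      // other than the two atomic counters.
      std::mt19937_64 gen(t);
      std::uniform_int_distribution<std::uint64_t> size(min_size, max_size);
      for (int i = 0; i != uploads_per_thread; ++i) {
        auto const name =
            "stress-" + std::to_string(t) + "-" + std::to_string(i);
        if (!upload(name, size(gen))) ++failures;
        ++uploads;
      }
    });
  }
  for (auto& w : workers) w.join();
  return StressResult{thread_count, uploads.load(), failures.load()};
}
```

To approximate the setup described in the comment, one would pass a callback that performs a real upload and use 1GiB and 2GiB as the size bounds. A crash or double free under such a load would point at shared state in the library or its dependencies, while a clean run only shows that the race was not hit.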
All good! Thanks for taking a look into this, very much appreciated! That specific repro runs with That specific flavour of nghttp2 (without hpack/jansson, asiolib/boost, getassets/libxml2, jemalloc), you can obtain the build log via |
@coryan so you're not able to reproduce in the test suite, but are able to in the provided repro? I should add the bucket is in |
Can you confirm that I did the right things to get your repro setup?
> docker run --rm -it nixos/nix /bin/sh
# nix-channel --update
# nix-env -i git
# git clone https://github.com/flokli/nix-gcs-repro.git
# cd nix-gcs-repro/
# nix-shell .
# nix copy --to gs://coryan-test-nix /nix/store/*
querying source.drv on gs://coryan-test-nix
Segmentation fault (core dumped)
Do those look reasonable? |
yes, this looks right :-) |
Hey, nothing obvious comes to mind when looking at the stack traces or valgrind. The client library has a relatively simple strategy for locking: a download (or upload) creates a I did notice that your application creates two https://github.com/NixOS/nix/pull/3021/files#diff-b1e603a1ea8f9c5a0d38f7caf7079cdaR46 |
ping @andir |
Any updates here? Is this still an issue? Thanks. |
No sorry. AFAICT this is something in the particular version of libcurl used by that build, I was unable to repro in any other platform. I am changing the priority for now. |
Sorry for not getting back recently - the underlying Nix PR using I propose closing this for now. If this gets picked up and is still an issue, we can always open a new issue, but there is little reason to keep an issue open that might already be fixed. |
Answering these questions before submitting your bug report will help us give
you a quicker answer. Thank you!
If one or more of these questions are not applicable, feel free to remove them.
Does this issue affect the `google-cloud-cpp` project?
If the problem is with the service exposed by the `google-cloud-cpp` APIs
instead of the client libraries you may consider opening a support request
instead. The `google-cloud-cpp` developers cannot help you troubleshoot
problems with the service itself.
What component of `google-cloud-cpp` is this related to?
Remove the ones that do not apply.
What version of `google-cloud-cpp` are you using?
v0.14.0
Please include the output from `git rev-parse HEAD` if you are compiling from
source, or the version number from the applicable `*/version.h` file.
What compiler and version are you using?
Please include the output of `g++ -v` or the equivalent command-line flag.
What operating system and version are you using?
NixOS, NixOS/nixpkgs@5e225b7
If you are using a Linux distribution include the name and version of the
distribution too.
What were you trying to do?
If possible, produce a recipe for reproducing the problem.
NixOS/nix#3021
nix copy --to gs://mybucket /nix/store/...
What did you expect to see?
The passed nix outputs uploaded successfully.
What was the behavior you expected from the library?
To upload the outputs
What did you see instead?
What was the behavior you actually observed?
Anything else you would like us to know?
Include here information about your environment that is not captured above, or
any other information you think might be relevant.
This seems to be a snowflake.
I couldn't reproduce it reliably, and it succeeded after running a second time. Stacktrace attached.