New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Continue localstack test harness development #111
Conversation
Heh. I just submitted a few updates which conflict with this--it separates out the localstack tests from the other tests, so your changes will need to target the new file. |
@@ -59,15 +59,15 @@ using ::tensorstore::internal_http::HttpResponse; | |||
using ::tensorstore::internal_kvstore_s3::S3Credentials; | |||
using ::tensorstore::internal_kvstore_s3::S3RequestBuilder; | |||
|
|||
ABSL_FLAG(std::string, localstack_binary, "", "Path to the localstack"); | |||
ABSL_FLAG(std::string, localstack_binary, "", "Path to the localstack binary"); | |||
ABSL_FLAG(std::string, localstack_endpoint, "http://localhost:4566", "Localstack endpoint"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a command line flag in localstack to set the port, or does it require a config file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like its configurable via environment variables
- https://docs.localstack.cloud/references/configuration/
GATEWAY_LISTEN
LOCALSTACK_HOST
I guess its possible to try PickUnusedPortOrDie
as discussed here: #91 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if it's unset (and the env var is unset), that would be nice.
As mentioned in #91 (comment), this test tensorstore/tensorstore/kvstore/test_util.cc Lines 442 to 450 in 78b93c9
fails as follows:
I think this is due to Its not clear to me whether
I'll maybe take a look at how nginx handles this, but this will probably require a localstack change to align it's behaviour with s3/gcs. Do you know of any compelling arguments here? |
Setting tensorstore/tensorstore/kvstore/test_util.cc Lines 442 to 450 in 78b93c9
to produce a
|
@@ -173,12 +173,16 @@ tensorstore_cc_test( | |||
name = "s3_key_value_store_test", | |||
size = "small", | |||
srcs = ["s3_key_value_store_test.cc"], | |||
# args = ["--localstack_binary=$(location @pypa_localstack//:localstack)"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the idea, but I don't know enough bazel to reference the localstack
script install in the python environment's bin directory. Perhaps I need to declare py_binary
that depends on @pypa_localstack
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To do that you'd need a workspace for localstack with a py_binary, similar to what we have for storagebench:
Then reference it like here, similar to what the gcs_grpc_testbench_test:
args = ["--testbench_binary=$(location @com_google_storagetestbench//:rest_server)"], |
I'd call the directory com_github_localstack or perhaps cloud_localstack.
@@ -59,15 +59,15 @@ using ::tensorstore::internal_http::HttpResponse; | |||
using ::tensorstore::internal_kvstore_s3::S3Credentials; | |||
using ::tensorstore::internal_kvstore_s3::S3RequestBuilder; | |||
|
|||
ABSL_FLAG(std::string, localstack_binary, "", "Path to the localstack"); | |||
ABSL_FLAG(std::string, localstack_binary, "", "Path to the localstack binary"); | |||
ABSL_FLAG(std::string, localstack_endpoint, "http://localhost:4566", "Localstack endpoint"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like its configurable via environment variables
- https://docs.localstack.cloud/references/configuration/
GATEWAY_LISTEN
LOCALSTACK_HOST
I guess its possible to try PickUnusedPortOrDie
as discussed here: #91 (comment)
:-) We should probably coordinate more -- apologies I was out on PTO mid-August. Thanks for merging #91 and adding the docs. I can devote a day or so a week to working on tensorstore so the velocity is not as high as it could be. I'd imagine the context switches aren't ideal for either of us. If it's desirable to develop the remaining S3 functionality more speedily, I'd be fine with stepping back and letting others take the helm. Alternatively, I can continue at the current pace, aiming to get the localstack testharness into good shape. Then, the remaining work should be easier for others to deal with piecemeal in smaller PRs. Also, In terms of communication, I think it would be valuable if I converted the TODO in #91 into an umbrella issue detailing the remaining S3 work. |
@@ -33,14 +33,17 @@ TEST(ValidateTest, ClassifyBucketName) { | |||
EXPECT_EQ(ClassifyBucketName("sthree-configurator.bucket"), | |||
BucketNameType::kInvalid); | |||
|
|||
// Standard bucket names are < 63 characters | |||
// Standard bucket names are < 63 characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split this test change into a separate pull request, please.
There are a few cleanups that I was going to do, mostly related to using Context for AWS auth, and perhaps a very minimal mock based test (similar to the gcs driver). But since it's submitted, the s3 kvstore will also be included in our continuous build and will see updates that affect it in that way, which might include updates which affect all of our drivers. But other than that, I don't have immediate plans to add features and am happy let you handle those improvements. Hopefully we won't step on each other's toes too much. |
w.r.t the range issue, I had to fix a few of those in the gcs testbench, it seems very likely that range requests are not well tested in localstack as well. If AWS respond correctly, it seems like a pull request for localstack is worthwhile. You could see what happens running the localstack test against minio instead of localstack. As for what's the right thing to do, according to this: https://www.rfc-editor.org/rfc/rfc9110.html#name-range-requests
|
Just a note about localstack_test vs other possibilities. With the addition of Then the following will work to access an actual s3 bucket:
With that, I see the following sequence for the basic test (some entries elided), where it appears to fail to write the initial value:
... so I may have mixed up some of the endpoint and host stuff while getting this to be async. Stay tuned for a followup cl or 2. |
Thanks, I'll need to take the time to read this carefully.
I think the examples above are fine, but for the Briefly, it looks like AWS behaviourly treats 1-0 as a request for the entire content. Actually, skimming the RFC above, it looks like this might technically be invalid:
Maybe a clamp is needed here for the inclusive_min=1, exclusive_max=1 case? tensorstore/tensorstore/internal/http/http_request.cc Lines 37 to 39 in c01efdc
That concerns me as from memory it was working in July, I'll have to set it up with my AWS bucket to test, but I need to switch to other tasks for the rest of the week. |
How does GCS respond to this? |
For localstack, the most important thing is that it mimics AWS responses, and AWS responds like this:
Which is the entire value. |
The localstack devs have indicated on the issue they agree that it should mimic AWS and return the entire response. |
I have a pending update for http-style backends to avoid sending invalid Range: headers. |
I had a closer look at the rfc and it looks like a server may choose to ignore an invalid range request:
|
The AWS range issue fix was merged:
I guess it will be included in the 2.3 release, which does not seem to have many remaining issues: I'll likely recreate this PR once 2.3 is released. |
FWIW I've run the localstack test case locally with the Looks like localstack 2.3 is targeting a 26th September release: |
You should be able to use even the older version now since I've updated the range request logic across the board to avoid the bad requests. |
Based in part on #111 When provided with --localstack_binary, localstack_test will start localstack in host mode (via package localstack[runtime]). When provided with --localstack_endpoint, localstack_test will connect to a running localstack instance. In order to integrate it into the CI, we need to install localstack[runtime], however there is presently a pypa conflict between apache-beam and localstack. PiperOrigin-RevId: 574269409 Change-Id: I511f6a7c28b9d14d9836030988f6d9bcb68a0549
With my latest commit it's possible to do this (on a mac with python 3.10) (in two terminals):
It's almost possible to do this, however there are two outstanding issues.
|
I remember trying a |
Follows on from these comments in #91