Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-5397: [FlightRPC] Add TLS certificates for testing Flight #4510

Closed
wants to merge 3 commits into from
Closed

ARROW-5397: [FlightRPC] Add TLS certificates for testing Flight #4510

wants to merge 3 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jun 10, 2019

@codecov-io
Copy link

codecov-io commented Jun 10, 2019

Codecov Report

Merging #4510 into master will decrease coverage by 13.21%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #4510       +/-   ##
===========================================
- Coverage   88.55%   75.34%   -13.22%     
===========================================
  Files         796       56      -740     
  Lines      103239     3192   -100047     
  Branches     1253        0     -1253     
===========================================
- Hits        91425     2405    -89020     
+ Misses      11569      787    -10782     
+ Partials      245        0      -245
Impacted Files Coverage Δ
python/pyarrow/ipc.pxi
cpp/src/arrow/csv/chunker-test.cc
cpp/src/parquet/column_page.h
cpp/src/parquet/bloom_filter-test.cc
cpp/src/arrow/array/builder_decimal.cc
cpp/src/plasma/client.cc
cpp/src/arrow/io/test-common.h
cpp/src/arrow/util/int-util-test.cc
cpp/src/arrow/python/io.cc
python/pyarrow/hdfs.py
... and 830 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 25b4a46...5eff724. Read the comment docs.

@ghost
Copy link
Author

ghost commented Jun 10, 2019

The submodule is now up-to-date, let's make sure the tests run in CI...

@ghost
Copy link
Author

ghost commented Jun 10, 2019

C++ with conda-forge appears to run the tests, but not Python with conda-forge (even though it sources travis_env_common.sh); I'll take a look at that when I can.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good on the principle, just a couple comments.

cpp/src/arrow/flight/test-util.cc Outdated Show resolved Hide resolved
cpp/src/arrow/flight/test-util.cc Outdated Show resolved Hide resolved
cpp/src/arrow/flight/flight-test.cc Outdated Show resolved Hide resolved
cpp/src/arrow/flight/flight-test.cc Outdated Show resolved Hide resolved
python/pyarrow/tests/test_flight.py Outdated Show resolved Hide resolved
) as server_location:
options = flight.FlightCallOptions(timeout=0.5)
client = flight.FlightClient.connect(server_location)
with pytest.raises(pa.ArrowIOError, match="Deadline Exceeded"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a point in testing per-call timeout here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an explanatory comment, but the purpose of the timeout is to make sure gRPC doesn't block there for a long time trying to reconnect over and over. (Not sure why it doesn't fail-fast in these cases.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looks like we set gRPC's wait_for_ready option, which is why it doesn't fail-fast. But unsetting that causes C++ tests to fail due to a race condition between starting the server subprocess and creating the client. I'll convert the tests to not use the subprocess (should make them run faster too), then drop the option and the timeout here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not using a subprocess is nice, but we'll still need to allow retries, otherwise the race condition will still be there AFAIK.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to ping the server (with a short timeout) until a request succeeds, rather than rely on experimental gRPC options that we don't have much control over (again, in my opinion, blocking for some unbounded amount of time is a rather pathological way to handle connection failure...it also leads to more cases where Ctrl-C doesn't work in Python)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually as implemented, creating a server immediately starts it, so there's no need for even that...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It starts the thread, but is the listening socket already created before?

Copy link
Author

@ghost ghost Jun 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - FlightServerBase::Init calls BuildAndStart, so all Serve actually does is set up signals and block until the server shuts down. (maybe we shouldn't rely on that..., but as-is, it's a little annoying to tell what the underlying error in a call is except by comparing strings)

python/pyarrow/tests/test_flight.py Outdated Show resolved Hide resolved
@ghost
Copy link
Author

ghost commented Jun 13, 2019

Looks like one of the C++ builds flaked when downloading deps, but otherwise tests run in CI now.

@pitrou
Copy link
Member

pitrou commented Jun 13, 2019

I restarted the failed build.

@ghost
Copy link
Author

ghost commented Jun 13, 2019

Thanks, looks like tests pass!

@pitrou
Copy link
Member

pitrou commented Jun 13, 2019

Gonna do a final review.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just two nits.

@@ -620,5 +659,25 @@ TEST_F(TestAuthHandler, CheckPeerIdentity) {
ASSERT_EQ(result->body->ToString(), "user");
}

TEST_F(TestTls, DoAction) {
if (!client_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this actually occur?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is left over from when I made the TLS tests optional, I've removed it.

cpp/src/arrow/flight/test-util.cc Outdated Show resolved Hide resolved
@wesm
Copy link
Member

wesm commented Jun 13, 2019

Did the Thrift download fail (I just opened https://issues.apache.org/jira/browse/ARROW-5576)?

@pitrou
Copy link
Member

pitrou commented Jun 13, 2019

@wesm Yeah, that was it.

@wesm
Copy link
Member

wesm commented Jun 13, 2019

OK, I looked into this and we're using the Apache dist system inappropriately, I commented at https://issues.apache.org/jira/browse/ARROW-5576?focusedCommentId=16863112&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16863112

@pitrou pitrou closed this in b7e8ed7 Jun 13, 2019
@pitrou
Copy link
Member

pitrou commented Jun 13, 2019

Thanks @lihalite !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants