-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added XCTestExpectation to wait for all attemted fallback hosts (test-092) #1361
Conversation
@maratal I'll review the code in a sec, but I was wondering how you confirmed that this fixes the issue? |
931b43b
to
c921f6b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I think that whilst these changes do make the tests pass, they change the test's expectations about the order of events, and they do so in a way that I think doesn't match the expected behaviour of the library according to the features spec.
I added some logging on top of your changes here (sorry, I accidentally pushed to your PR branch, didn't mean to, and have reverted that!).
Usually, running the test causes the following sequence of events:
Test Suite 'Selected tests' started at 2022-04-20 16:26:31.301
Test Suite 'Ably-iOS-Tests.xctest' started at 2022-04-20 16:26:31.303
Test Suite 'RealtimeClientConnectionTests' started at 2022-04-20 16:26:31.304
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' started.
2022-04-20 16:26:31.319350-0300 xctest[31330:3399770] LAWRENCE: got wss://realtime.ably.io:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:31.530950-0300 xctest[31330:3399982] [boringssl] boringssl_metrics_log_metric_block_invoke(153) Failed to log metrics
2022-04-20 16:26:31.833247-0300 xctest[31330:3399982] LAWRENCE: got wss://d.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:31.946389-0300 xctest[31330:3399984] LAWRENCE: got wss://e.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.057573-0300 xctest[31330:3399981] LAWRENCE: got wss://a.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.161498-0300 xctest[31330:3399981] LAWRENCE: got wss://c.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.268037-0300 xctest[31330:3399982] LAWRENCE: got wss://b.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.268988-0300 xctest[31330:3399770] LAWRENCE: fulfilling expectation
2022-04-20 16:26:32.378366-0300 xctest[31330:3399770] LAWRENCE: got disconnected
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' passed (1.081 seconds).
Test Suite 'RealtimeClientConnectionTests' passed at 2022-04-20 16:26:32.387.
Executed 1 test, with 0 failures (0 unexpected) in 1.081 (1.083) seconds
Test Suite 'Ably-iOS-Tests.xctest' passed at 2022-04-20 16:26:32.388.
Executed 1 test, with 0 failures (0 unexpected) in 1.081 (1.084) seconds
Test Suite 'Selected tests' passed at 2022-04-20 16:26:32.389.
Executed 1 test, with 0 failures (0 unexpected) in 1.081 (1.088) seconds
Program ended with exit code: 0
But, if I run the test (can just run that test in isolation locally) enough times, I eventually get:
Test Suite 'Selected tests' started at 2022-04-20 16:20:17.991
Test Suite 'Ably-iOS-Tests.xctest' started at 2022-04-20 16:20:17.993
Test Suite 'RealtimeClientConnectionTests' started at 2022-04-20 16:20:17.994
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' started.
2022-04-20 16:20:18.012794-0300 xctest[31036:3393941] LAWRENCE: got wss://realtime.ably.io:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:18.267933-0300 xctest[31036:3393965] [boringssl] boringssl_metrics_log_metric_block_invoke(153) Failed to log metrics
2022-04-20 16:20:18.706577-0300 xctest[31036:3393965] LAWRENCE: got wss://d.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:18.811340-0300 xctest[31036:3393965] LAWRENCE: got wss://e.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:18.921586-0300 xctest[31036:3393967] LAWRENCE: got wss://a.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:19.032072-0300 xctest[31036:3393966] LAWRENCE: got wss://c.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:19.138814-0300 xctest[31036:3393941] LAWRENCE: got disconnected
2022-04-20 16:20:34.219855-0300 xctest[31036:3393968] LAWRENCE: got wss://realtime.ably.io:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:34.333597-0300 xctest[31036:3393967] LAWRENCE: got wss://b.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10 0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:34.335247-0300 xctest[31036:3393941] LAWRENCE: fulfilling expectation
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' passed (16.346 seconds).
Test Suite 'RealtimeClientConnectionTests' passed at 2022-04-20 16:20:34.341.
Executed 1 test, with 0 failures (0 unexpected) in 16.346 (16.348) seconds
Test Suite 'Ably-iOS-Tests.xctest' passed at 2022-04-20 16:20:34.343.
Executed 1 test, with 0 failures (0 unexpected) in 16.346 (16.350) seconds
Test Suite 'Selected tests' passed at 2022-04-20 16:20:34.344.
Executed 1 test, with 0 failures (0 unexpected) in 16.346 (16.353) seconds
Program ended with exit code: 0
That is, the .disconnected
event is received before the b.ably-realtime.com
host is tried. But RSC15a says that the connection should only fail (which I believe to mean, emit a .disconnected
event and set an errorReason
on the connection) when all fallback hosts have been tried.
It seems like this might need deeper investigation?
Well, that's interesting. What happened there, it's library has a very short timeout set in the beginning of this test, so it fails earlier than it can test all the hosts. Increasing said timeout solves this problem, but I need to think how to fulfill expectations properly. Quick fix that I've tried looks ugly. @lawrence-forooghian |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @maratal, I took a look at the new commit f4cf782. But if, as I observed sometimes happening in my previous comment,
the
.disconnected
event is received before theb.ably-realtime.com
host is tried.
then wouldn't we expect the test to still fail?
See the results of testing in a loop here #1399 @lawrence-forooghian @lukasz-szyszkowski |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @maratal, did you have any thoughts about this question?
Also, I see that in #1399 you've run the tests 28 times. Given that in the original data used to generate the list in #1330, this particular test only failed in 1.2% of uploads, we'd expect to need to run the tests about 84 times in order to expect to see even a single failure. So I don't think that the data in #1399 yet gives me confidence that this test is fixed, especially bearing my above question in mind.
I've created #1419, which will run the tests in 3 parallel workflows for a day. So that should gather quite a lot more data. Let's see how that turns out and take it from there?
Sorry I didn't respond to this. Probably because I would never able to reproduce a single failure locally, after f4cf782 fix this test always passes. But I agree that we should have more data on it. I, tbh, didn't understand well those proportions, now I see the math, thanks! @lawrence-forooghian |
Closes #1334