Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added XCTestExpectation to wait for all attemted fallback hosts (test-092) #1361

Merged
merged 3 commits into from
May 23, 2022

Conversation

maratal
Copy link
Collaborator

@maratal maratal commented Apr 17, 2022

Closes #1334

@maratal maratal changed the title Added XCTestExpectation to wait for all attemted fallback hosts Added XCTestExpectation to wait for all attemted fallback hosts (test-092) Apr 17, 2022
@lawrence-forooghian
Copy link
Collaborator

@maratal I'll review the code in a sec, but I was wondering how you confirmed that this fixes the issue?

Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think that whilst these changes do make the tests pass, they change the test's expectations about the order of events, and they do so in a way that I think doesn't match the expected behaviour of the library according to the features spec.

I added some logging on top of your changes here (sorry, I accidentally pushed to your PR branch, didn't mean to, and have reverted that!).

Usually, running the test causes the following sequence of events:

Test Suite 'Selected tests' started at 2022-04-20 16:26:31.301
Test Suite 'Ably-iOS-Tests.xctest' started at 2022-04-20 16:26:31.303
Test Suite 'RealtimeClientConnectionTests' started at 2022-04-20 16:26:31.304
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' started.
2022-04-20 16:26:31.319350-0300 xctest[31330:3399770] LAWRENCE: got wss://realtime.ably.io:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:31.530950-0300 xctest[31330:3399982] [boringssl] boringssl_metrics_log_metric_block_invoke(153) Failed to log metrics
2022-04-20 16:26:31.833247-0300 xctest[31330:3399982] LAWRENCE: got wss://d.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:31.946389-0300 xctest[31330:3399984] LAWRENCE: got wss://e.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.057573-0300 xctest[31330:3399981] LAWRENCE: got wss://a.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.161498-0300 xctest[31330:3399981] LAWRENCE: got wss://c.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.268037-0300 xctest[31330:3399982] LAWRENCE: got wss://b.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:26:32.268988-0300 xctest[31330:3399770] LAWRENCE: fulfilling expectation
2022-04-20 16:26:32.378366-0300 xctest[31330:3399770] LAWRENCE: got disconnected
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' passed (1.081 seconds).
Test Suite 'RealtimeClientConnectionTests' passed at 2022-04-20 16:26:32.387.
	 Executed 1 test, with 0 failures (0 unexpected) in 1.081 (1.083) seconds
Test Suite 'Ably-iOS-Tests.xctest' passed at 2022-04-20 16:26:32.388.
	 Executed 1 test, with 0 failures (0 unexpected) in 1.081 (1.084) seconds
Test Suite 'Selected tests' passed at 2022-04-20 16:26:32.389.
	 Executed 1 test, with 0 failures (0 unexpected) in 1.081 (1.088) seconds
Program ended with exit code: 0

But, if I run the test (can just run that test in isolation locally) enough times, I eventually get:

Test Suite 'Selected tests' started at 2022-04-20 16:20:17.991
Test Suite 'Ably-iOS-Tests.xctest' started at 2022-04-20 16:20:17.993
Test Suite 'RealtimeClientConnectionTests' started at 2022-04-20 16:20:17.994
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' started.
2022-04-20 16:20:18.012794-0300 xctest[31036:3393941] LAWRENCE: got wss://realtime.ably.io:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:18.267933-0300 xctest[31036:3393965] [boringssl] boringssl_metrics_log_metric_block_invoke(153) Failed to log metrics
2022-04-20 16:20:18.706577-0300 xctest[31036:3393965] LAWRENCE: got wss://d.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:18.811340-0300 xctest[31036:3393965] LAWRENCE: got wss://e.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:18.921586-0300 xctest[31036:3393967] LAWRENCE: got wss://a.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:19.032072-0300 xctest[31036:3393966] LAWRENCE: got wss://c.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:19.138814-0300 xctest[31036:3393941] LAWRENCE: got disconnected
2022-04-20 16:20:34.219855-0300 xctest[31036:3393968] LAWRENCE: got wss://realtime.ably.io:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:34.333597-0300 xctest[31036:3393967] LAWRENCE: got wss://b.ably-realtime.com:443/?key=xxxx:xxxx&echo=true&agent=ably-cocoa/1.2.10                   0OS/15.4.0&v=1.2&format=msgpack
2022-04-20 16:20:34.335247-0300 xctest[31036:3393941] LAWRENCE: fulfilling expectation
Test Case '-[Ably_iOS_Tests.RealtimeClientConnectionTests test__092__Connection__Host_Fallback__should_retry_hosts_in_random_order_after_checkin_if_an_internet_connection_is_available]' passed (16.346 seconds).
Test Suite 'RealtimeClientConnectionTests' passed at 2022-04-20 16:20:34.341.
	 Executed 1 test, with 0 failures (0 unexpected) in 16.346 (16.348) seconds
Test Suite 'Ably-iOS-Tests.xctest' passed at 2022-04-20 16:20:34.343.
	 Executed 1 test, with 0 failures (0 unexpected) in 16.346 (16.350) seconds
Test Suite 'Selected tests' passed at 2022-04-20 16:20:34.344.
	 Executed 1 test, with 0 failures (0 unexpected) in 16.346 (16.353) seconds
Program ended with exit code: 0

That is, the .disconnected event is received before the b.ably-realtime.com host is tried. But RSC15a says that the connection should only fail (which I believe to mean, emit a .disconnected event and set an errorReason on the connection) when all fallback hosts have been tried.

It seems like this might need deeper investigation?

@maratal
Copy link
Collaborator Author

maratal commented Apr 20, 2022

It seems like this might need deeper investigation?

Well, that's interesting. What happened there, it's library has a very short timeout set in the beginning of this test, so it fails earlier than it can test all the hosts. Increasing said timeout solves this problem, but I need to think how to fulfill expectations properly. Quick fix that I've tried looks ugly. @lawrence-forooghian

Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @maratal, I took a look at the new commit f4cf782. But if, as I observed sometimes happening in my previous comment,

the .disconnected event is received before the b.ably-realtime.com host is tried.

then wouldn't we expect the test to still fail?

@QuintinWillison QuintinWillison removed their request for review May 4, 2022 08:51
@maratal
Copy link
Collaborator Author

maratal commented May 12, 2022

See the results of testing in a loop here #1399 @lawrence-forooghian @lukasz-szyszkowski

Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @maratal, did you have any thoughts about this question?

Also, I see that in #1399 you've run the tests 28 times. Given that in the original data used to generate the list in #1330, this particular test only failed in 1.2% of uploads, we'd expect to need to run the tests about 84 times in order to expect to see even a single failure. So I don't think that the data in #1399 yet gives me confidence that this test is fixed, especially bearing my above question in mind.

I've created #1419, which will run the tests in 3 parallel workflows for a day. So that should gather quite a lot more data. Let's see how that turns out and take it from there?

@maratal
Copy link
Collaborator Author

maratal commented May 16, 2022

Hey @maratal, did you have any thoughts about this question?

Sorry I didn't respond to this. Probably because I would never able to reproduce a single failure locally, after f4cf782 fix this test always passes. But I agree that we should have more data on it. I, tbh, didn't understand well those proportions, now I see the math, thanks! @lawrence-forooghian

@lawrence-forooghian lawrence-forooghian merged commit a387c46 into main May 23, 2022
@lawrence-forooghian lawrence-forooghian deleted the fix/1334-test-092-Host-Fallback branch May 23, 2022 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants