Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include additional error information in emulator log #34

Merged
merged 7 commits into from
Oct 30, 2023

Conversation

aarsilv
Copy link
Contributor

@aarsilv aarsilv commented Oct 27, 2023

Ticket: FF-1133 - Harden waiting for RAC request in Android SDK Test

Motivation: still getting weird failures (see: https://github.com/Eppo-exp/sdk-test-data/actions/runs/6661079552)

I want more information to make it in the uploaded log, and I noticed we only allow 2 seconds for the RAC request to finish and then silently continue on our way. This change increases the timeout to be more generous and throws an error if it does time out, so we're not left scratching our heads.

I added a bit of extra logging, too, which revealed that sometimes the action cannot reach the Google Cloud Function. So I added a little loop to wait up to 15 seconds for DNS to resolve before proceeding.

image

I also saw the above check pass but android emulator fail, so I switched to HTTPS protocol thinking that may play nicer with DNS.

Comment on lines +158 to +160
if(!lock.await(10000, TimeUnit.MILLISECONDS)) {
throw new RuntimeException("Request for RAC did not complete within timeout");
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core change. If CountDownLatch.await() hits it's timeout, it returns false but then continues execution. This can be confusing to the developer, so now we'll throw an error.

First testing this on my bad home wifi, I hit the error 😅 . So, I've extended the timeout to a generous 10 seconds. However, since it's using this await, most of the time we won't actually wait that long.

@@ -179,8 +179,6 @@ public void testAssignments() {
public void testErrorGracefulModeOn() {
try {
initClient(TEST_HOST, false, true, true);
System.out.println("Sleeping for a bit to wait for cache population to complete");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need these sleeps since we are no formally waiting for the request to complete.

System.out.println("Sleeping for a bit to wait for cache population to complete");
Thread.sleep(1000);
Thread.sleep(2000);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do still need this sleep though, for asynchronous cache repopulation, but I'm doubling it given how unimpressed I am with GitHub action performance.

e.printStackTrace();
}
Log.e(TAG, "Http request failure", e);
Log.e(TAG, "Http request failure: "+e.getMessage()+" "+Arrays.toString(e.getStackTrace()), e);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More helpful failure log.

Comment on lines +46 to +47
private static final String TEST_HOST = "https://us-central1-eppo-prod-312905.cloudfunctions.net/serveGithubRacTestFile";
private static final String INVALID_HOST = "https://thisisabaddomainforthistest.com";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made these HTTPS

Comment on lines +38 to +46
this.client = buildOkHttpClient();
}

private static OkHttpClient buildOkHttpClient() {
OkHttpClient.Builder builder = new OkHttpClient().newBuilder()
.connectTimeout(10, TimeUnit.SECONDS)
.readTimeout(10, TimeUnit.SECONDS);

return builder.build();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this more obvious it was happening in constructor and set it up for dropping in IpV4-only DNS, which may be needed (but hopefully not 🤞)

@aarsilv aarsilv marked this pull request as ready for review October 27, 2023 23:57
@aarsilv aarsilv merged commit 8957ca5 into main Oct 30, 2023
1 check passed
@aarsilv aarsilv deleted the aaron/include-http-failure-message branch October 30, 2023 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants