Skip to content
This repository has been archived by the owner on May 30, 2024. It is now read-only.

Lack of programmatic feedback on connection issues on service startup #184

Closed
mightyguava opened this issue Jan 17, 2020 · 6 comments
Closed
Assignees

Comments

@mightyguava
Copy link
Contributor

mightyguava commented Jan 17, 2020

Is your feature request related to a problem? Please describe.
If LaunchDarkly (or the relay) is unreachable on service startup, there isn't any feedback mechanism for the app to do anything about it (for example if we want the app to crash and page loudly).

There is a startWaitMillis() option on the builder that will attempt to block until the connection is established, but if it times out or fails, the startup process will just proceed silently. Additionally, this option causes the constructor to LDClient to block, which plays poorly with dependency injection frameworks like Guice.

Logs are generated by the SDK, but logs aren't useful to the application itself.

Describe the solution you'd like

Add a method to LDClient, like

public void awaitInitialized(Duration timeout) throws Exception {
  ...
}

This method blocks until the initial streaming connection to the backend is established. If there are connection errors, or bad HTTP status codes returned, it will throw immediately with the underlying exception (potentially after configurable retries). If the initial connection is successful, the method returns. If the initial connection takes longer than timeout, then it will throw TimeoutException.

Additionally, add a callback mechanism to LDClient to propagate subsequent connection issues, like

public void onConnectionError(... some callback interface...)

This would notify the applicable during steady state that the connection to LaunchDarkly is broken, and the application can choose what it wants to do. The argument could potentially be a com.launchdarkly.eventsource.ConnectionErrorHandler, but maybe the app shouldn't be able to decide the Action for the event source.

Describe alternatives you've considered
That's all I got.

Additional context
This is in context to a connection issue we saw that took a while to pinpoint: #183. Our services take the practice of crashing if LaunchDarkly isn't ready after a certain timeout after startup, and it'd be really useful if that crash could include why LaunchDarkly wasn't ready.

@gwhelanLD gwhelanLD self-assigned this Jan 25, 2020
@gwhelanLD
Copy link
Contributor

Hi @mightyguava,

Thanks for the suggestions. For background, we have similar APIs to what you've described in our mobile SDKs, where we consider the chance of network failures much higher. We will definitely consider this request to bring similar features to our server SDKs, but we would like to try to synchronize bringing new APIs across our range of supported platforms (with exception to our different models of clients platforms and server platforms) to avoid too much disparity in our platform support. It may take us some time to fully evaluate this API change when considering the impact across all our SDKs.

Thanks,
@gwhelanLD

LaunchDarklyCI pushed a commit that referenced this issue Jan 30, 2020
fix generation of CA certs for TLS tests
@eli-darkly
Copy link
Contributor

Late follow-up: we haven't forgotten about this feature request, it just didn't make the cut for the 5.0.0 release that we've been putting together (in beta some time this week). However, other design changes we made for 5.0.0 should make it a bit easier to implement such a thing in the future.

@eli-darkly
Copy link
Contributor

@mightyguava - Actually, we did manage to get this feature into the second beta, 5.0.0-rc2, in case you'd like to check that out. The relevant functionality is all accessed through client.getDataSourceStatusProvider(). Here are some examples of the intended usage:

  1. Get a snapshot of the current connection status
  DataSourceStatusProvider.Status status = client.getDataSourceStatusProvider().getStatus();
  if (status.getState() == DataSourceStatusProvider.State.VALID) {
    System.out.println("we're currently connected")
  } else {
    System.out.println("not currently connected - last error was: " + status.getLastError());
  }
  1. Get callbacks when the connection status changes
  client.getDataSourceStatusProvider().addStatusListener(newStatus -> {
    if (status.getState() == DataSourceStatusProvider.State.VALID) {
      System.out.println("the connection has just become valid");
    } else {
      System.out.println("there's been a connection failure: " + status.getLastError());
    }
  });

The object returned by getLastError() includes further details about whether it was a network error, an HTTP error response from the LaunchDarkly services, etc.

This doesn't currently include a "wait until initialized" method per se, as you had suggested, but that would be fairly simple to build on top of this API and we might go ahead and add it as a built-in method.

@mightyguava
Copy link
Contributor Author

Great! This is going to be really useful for us. A “wait until initialized” method would be awesome to have too as that’s how we’ll probably use it.

@eli-darkly
Copy link
Contributor

Here's a proposed new method for DataSourceStatusProvider:

  /**
   * A synchronous method for waiting for a desired connection state.
   * <p>
   * If the current state is already {@code desiredState} when this method is called, it immediately returns.
   * Otherwise, it blocks until 1. the state has become {@code desiredState}, 2. the state has become
   * {@link State#OFF} (since that is a permanent condition), 3. the specified timeout elapses, or 4.
   * the current thread is deliberately interrupted with {@link Thread#interrupt()}.
   * <p>
   * A scenario in which this might be useful is if you want to create the {@code LDClient} without waiting
   * for it to initialize, and then wait for initialization at a later time or on a different thread:
   * <pre><code>
   *     // create the client but do not wait
   *     LDConfig config = new LDConfig.Builder().startWait(Duration.ZERO).build();
   *     client = new LDClient(sdkKey, config);
   *     
   *     // later, possibly on another thread:
   *     boolean inited = client.getDataSourceStatusProvider().waitFor(
   *         DataSourceStatusProvider.State.VALID, Duration.ofSeconds(10));
   *     if (!inited) {
   *         // do whatever is appropriate if initialization has timed out
   *     }       
   * </code></pre>
   * 
   * @param desiredState the desired connection state (normally this would be {@link State#VALID}) 
   * @param timeout the maximum amount of time to wait-- or {@link Duration#ZERO} to block indefinitely
   *   (although it will still return if the thread is explicitly interrupted) 
   * @return true if the connection is now in the desired state; false if it timed out, or if the state
   *   changed to {@link State#OFF} and that was not the desired state
   * @throws InterruptedException if {@link Thread#interrupt()} was called on this thread while blocked
   */
  public boolean waitFor(State desiredState, Duration timeout) throws InterruptedException;

This is slightly different from your original suggestion, in terms of the timeout behavior: if it times out, it doesn't throw an exception, it just returns false. This is in keeping with the SDK's general strategy of avoiding exceptions unless something truly unexpected happens. Java usage is somewhat inconsistent about this, but for instance the APIs in java.util.concurrent only throw a TimeoutException if there's some reason that they can't just use their return value for that purpose (like, BlockingQueue.poll() returns null on timeout because null is not a valid queue item, whereas Future.get() throws an exception on timeout because null could be a valid result of a Future).

@eli-darkly
Copy link
Contributor

OK, the 5.0.0 release is out now. It includes the new waitFor feature described above. For more information, see DataSourceStatusProvider.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants