Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for high level client APIs for best practices #99

Closed
allenxwang opened this issue Jun 12, 2014 · 104 comments
Closed

Proposal for high level client APIs for best practices #99

allenxwang opened this issue Jun 12, 2014 · 104 comments

Comments

@allenxwang
Copy link

Note: the latest design is here

Goals

In the past two years, Ribbon has been used as an inter-processing communication client with load balancing capabilities. Some best practices have been observed when using Ribbon together with other open source libraries. We feel it is best to offer such best practices in a form of a set of APIs in a new Ribbon module.

Here is that we want to achieve:

  • Use annotation/template to drive the process of a request creation as opposed to hand-craft the request
  • Use Hystrix for fault tolerance, fallback and call collapsing and make Ribbon client properties visible to Hystrix to close the gap of the two layers
  • Use cache, (e.g. EVCache) for low latency data access before reaching out to the actual resource
  • Built on top of reactive and non-blocking networking layer RxNetty
  • Protocol agnostic

The ultimate goal is that the clients created with this set of API should be more consistent, agile, reliable, efficient, and resilient.

Design proposal

The center pieces of the new API are AsyncRequest and RibbonRequest

public interface AsyncRequest<T> {
    public T execute();

    public Future<T> queue();

    public Observable<T> observe();

    public Observable<T> toObservable();
}

public interface RibbonRequest<T> extends AsyncRequest<T> {
    public RibbonRequest<HystrixResponse<T>> withHystrixInfo();
}

RequestTemplate is used to represent all the information, including cache providers, Hystrix fallback and any protocol specific information (e.g. URI for HTTP) required to create RibbonRequest. Information supplied to create the RequestTemplate may include variables which should be substituted with real values at the time of request creation.

/**
 * @param <I> request input entity type
 * @param <O> response entity type
 * @param <R> response meta data, e.g. HttpClientResponse
 */
public interface RequestTemplate<I, O, R> {

    RequestBuilder<O> requestBuilder();

    RequestTemplate<I, O, R> withFallbackProvider(FallbackProvider<O> fallbackProvider);

    RequestTemplate<I, O, R> withFallbackDeterminator(FallbackDeterminator<R> fallbackDeterminator);

    RequestTemplate<I, O, R> addCacheProvider(CacheProvider<O> cacheProvider, String cacheKeyTemplate);

    RequestTemplate<I, O, R> withHystrixCommandPropertiesDefaults(HystrixCommandProperties.Setter setter);

    RequestTemplate<I, O, R> withHystrixThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter setter);

    RequestTemplate<I, O, R> withHystrixCollapserPropertiesDefaults(HystrixCollapserProperties.Setter setter);

    public abstract class RequestBuilder<O> {
        public abstract RequestBuilder<O> withValue(String key, Object value);

        public abstract RibbonRequest<O> build();
    }
}

FallbackDeterminator is used to determine whether a response with protocol specific meta data should cause throw an error during Hystrix command execution and cause a fallback:

public interface FallbackDeterminator<T> {
    public boolean shouldTriggerFallback(T responseMetaData);
}

For example:

Ribbon.from(httpClient)
        .newRequestTemplate()
        .withFallbackDeterminator(new FallbackDeterminator<HttpClientResponse<ByteBuf>>() {
            @Override
            public boolean shouldTriggerFallback(
                    HttpClientResponse<ByteBuf> response) {
                return response.getStatus().code() >= 500;
            }
        })   

Once FallbackDeterminator determines that a protocol specific response should trigger Hystrix fallback, HystrixFallbackProvider is used to get the actual fallback:

public interface FallbackProvider<T> extends Func1<HystrixCommand<T>, Observable<T>> {
}

CacheProvider is the interface to provide access to cache of the resource

public interface CacheProvider<T> {
    Observable<T> get(String key);
}

Ribbon is the factory that creates

  • clients whose responsibility is to create protocol specific template
  • service object from the interface definition of a service contract defined by annotations
public final class Ribbon {

    private Ribbon() {
    }

    public static <I, O> RibbonHttpClient<I, O> from(HttpClient<I, O> transportClient) {
        // ...
    }

    public static <I, O, T> T create(Class<T> contract, HttpClient<I, O> transportClient) {
        // ...
    }
}

/**
 * @param <I> Request input entity type
 * @param <O> Response entity type
 * @param <T> Type of RequestTemplate
 */
public interface RibbonClient<I, O, T extends RequestTemplate<I, O, ?>> {
    public T newRequestTemplate();
}

The ribbon-transport module in Ribbon will provide clients used by Ribbon for different protocol with load balancing capabilities. You can also use any client created directly RxNetty if load balancing is not required.

The RibbonTransport in ribbon-transport module is the factory to create load balancing clients:

public final class RibbonTransport {

    private RibbonTransport() {
    }

    public static RxClient<ByteBuf, ByteBuf> newTcpClient(ILoadBalancer loadBalancer, IClientConfig config) {
        // ...
    }

    public static <I, O> RxClient<I, O> newTcpClient(ILoadBalancer loadBalancer, PipelineConfigurator<O, I> pipelineConfigurator, 
            IClientConfig config) {
        // ...
    }

    public static <I, O> RxClient<I, O> newTcpClient(PipelineConfigurator<O, I> pipelineConfigurator, 
            IClientConfig config) {
        // ...
    }

    // ...

    public static HttpClient<ByteBuf, ByteBuf> newHttpClient(IClientConfig config) {
        // ...
    }

    public static <I, O> HttpClient<I, O> newHttpClient(PipelineConfigurator<HttpClientResponse<O>, HttpClientRequest<I>> pipelineConfigurator, 
            ILoadBalancer loadBalancer, IClientConfig config) {
       // ...
    }

    public static <I, O> HttpClient<I, O> newHttpClient(PipelineConfigurator<HttpClientResponse<O>, HttpClientRequest<I>> pipelineConfigurator, 
            IClientConfig config) {
        // ...
    }
}
@benjchristensen
Copy link

This is great @allenxwang, thank you for kicking off this discussion with a well thought-out proposal.

@benjchristensen
Copy link

Regarding the AsyncRequest name, it seems it should be something without Synchronous or Asynchronous in the name since both execution modes are supported.

What is the reason for not having just RibbonRequest?

@benjchristensen
Copy link

The FallbackDeterminator seems to solve a valid need, but the semantics seem more narrow than desirable.

For example, right now with it just returning a boolean all Ribbon can now do is throw a standard Exception of some kind to trigger failure.

Shouldn't we either let the function throw an Exception or return one of its own making?

In other words, if we instead just had a callback such as onNetworkResponse or mapNetworkResponse then a user can do anything they want, including failing.

A signature could be:

public void onNetworkResponse(Func1<T response, R returnResponse> f);

Then it could be used like this:

onNetworkResponse(HttpClientResponse<ByteBuf> response -> {
     if(response.getStatus().code() >= 500) {
          throw MyCustomException("message here");
     } else {
          return response; // pass-thru (but could decorate or modify if wanted)
     }
});

I don't know the input and return types well enough to get this quite right, but what do you think of this kind of approach that is more generic and allows both success and error handling with increased flexibility?

@benjchristensen
Copy link

The signature of HystrixFallbackProvider doesn't seem to be a provider, but a function. It's not a factory providing a fallback function, it is just the function isn't it? And I believe that's correct, all it needs to be is a function of this signature:

Func1< HystrixExecutableInfo<T>, Observable<T>>

So I think it should just be called HystrixFallback or FallbackHandler or something like that, but not "provider".

Note also how I change it from HystrixCommand to HystrixExecutableInfo. We need something that both HystrixCommand and HystrixObservableCommand implement (though we could probably get away with just HystrixObservableCommand as everything here should be non-blocking). The other benefit to HystrixExecutableInfo is that it has none of the execution methods on it, just the metadata. The only thing I don't like about HystrixExecutableInfo is its name.

So I suggest either HystrixObservableCommand or HystrixExecutableInfo, but not HystrixCommand as that is blocking.

@benjchristensen
Copy link

For caching there are 2 types of cache keys we need to support:

  1. Hystrix getCacheKey() for in-memory caching (request-scoped, de-duping)
  2. EVCache/Memcached/Cassandra/whatever caching

The first is nothing but a simple function that takes the input args and returns the cache key.
The second is I think what the CacheProvider is solving for, correct?

@benjchristensen
Copy link

Regarding withHystrixInfo:

public RibbonRequest<HystrixResponse<T>> withHystrixInfo();

It seems this is overly precise when there could be other types of metadata we want to make available, including the HTTP response codes, headers, etc (even when type T is returned).

How about:

public RibbonRequest<RibbonResponse<T>> withMetadata();

public interface RibbonResonse<T>  {
  public HystrixExecutableInfo getHystrixInfo();
  public HttpHeaders getHttpHeaders(); // or whatever we want here
  ... etc ...
  public T getResponse();
}

@benjchristensen
Copy link

On the Ribbon factory, what is the difference between from and create that has a "contract" class?

https://github.com/allenxwang/ribbon/blob/2.x-commonclient/ribbon-client-extensions/src/main/java/com/netflix/ribbonclientextensions/Ribbon.java#L14

@benjchristensen
Copy link

The RibbonClient piece looks like it will support TCP, HTTP, etc. Does that mean the FallbackDeterminator functionality (and other similar things) will get the type via the transport chosen such that it could get an HttpResponse or TcpResponse object?

I suggest we consider having two different object models, one for request/response like HTTP (including SSE), and another for bi-directional such as TCP and WebSockets. This is similar to the decisions we've made in RxNetty since request/response and bi-directional streaming are very different.

How do you think that will affect this design?

@NiteshKant
Copy link

Thanks @benjchristensen for comments!

Regarding the AsyncRequest name, it seems it should be something without Synchronous or Asynchronous in the name since both execution modes are supported.

I agree, the AsyncRequest name does not represent what it does. The intent of having a base interface for RibbonRequest was that HystrixResponse interface also had the same methods execute(), queue(), observe() and toObservable() but the name is not correct. Also, we can just keep it package private and not expose it outside as the intent is just to share the methods.

Shouldn't we either let the function throw an Exception or return one of its own making?

We discussed about this point, but decided against it because the use case seems to be of transforming a response and it fits in naturally more at the transport level instead of the Ribbon API level. I was intending to move the karyon interceptor abstraction into rx-netty and make it available for both client and server. Finagle calls this abstraction as filters.
At the hystrix level, we would want to handle entities as opposed to transport responses as the transport is already done at the time of making these decisions.

The signature of HystrixFallbackProvider doesn't seem to be a provider, but a function.

Agree with this comment entirely.

For caching there are 2 types of cache keys we need to support

The reason why this does not fit in as two CacheProviders is because the key can be different?

It seems this is overly precise when there could be other types of metadata we want to make available, including the HTTP response codes, headers, etc (even when type T is returned).

Agree with the name being too precise.

On the Ribbon factory, what is the difference between from and create that has a "contract" class?

They should both be named from()

The RibbonClient piece looks like it will support TCP, HTTP, etc. Does that mean the FallbackDeterminator functionality (and other similar things) will get the type via the transport chosen such that it could get an HttpResponse or TcpResponse object?

The purpose of RibbonTransport is to marry ribbon's load balancing feature with RxNetty's transport.
RibbonClient as it stands today is catering to a request-response protocol and it tries to reflect that by having the only method newRequestTemplate(). Currently, there is only HTTP but the other protocols would be Thrift, ProtoBuf, custom binary request-response protocol, etc.
I am not very sure how will hystrix fallback, cache, etc. will fit into a full-duplex protocol like websockets, my current understanding being that those will not make sense. eg; what will you cache or fallback for a websocket/tcp connection. What would be handy is the hystrix traffic shaping functionalities which we would have to work out when we get to the full duplex protocol level.

However, from this point I am starting to think that we can completely eliminate the RibbonClient abstraction. Instead we may want to only have

public final class Ribbon {

    private Ribbon() {
    }

    public static <I, O> HttpRequestTemplate<I, O> newHttpRequestTemplate(HttpClient<I, O> transportClient) {
        // ...
    }

    public static <I, O, T> T from(Class<T> contract, HttpClient<I, O> transportClient) {
        // ...
    }
}

@NiteshKant
Copy link

@benjchristensen are the withHystrix*PropertiesDefault() methods sufficient?
Should programmatically defining these properties in the template be the only way to specify hystrix properties or there should also be a convention, eg:

<request-template-identifier>.HystrixCommand.execution.isolation.thread.timeoutInMilliseconds

@khawes
Copy link

khawes commented Jun 12, 2014

Building on Ben's comments regarding:
Regarding

public RibbonRequest<HystrixResponse<T>> withHystrixInfo();

becoming

public RibbonRequest<RibbonResponse<T>> withMetadata();

I Agree and believe it would be best not to name Hystrix by name at this API level. Just as we don't expose Netty vs. another transport, lets not expose Hystrix vs. another (as yet undefined) resiliency mechanism.

@khawes
Copy link

khawes commented Jun 12, 2014

Please don't loose the ability to create templates based on templates that is in the original RibbonResource prototype that I wrote. it is extremely useful when you get down to creating actual thin clients on top of this.

@allenxwang
Copy link
Author

Thanks to @NiteshKant for great design suggestions and @benjchristensen for the comments!

Regarding the AsyncRequest name, it seems it should be something without Synchronous or Asynchronous in the name since both execution modes are supported.

I agree, the AsyncRequest name does not represent what it does. The intent of having a base interface for RibbonRequest was that HystrixResponse interface also had the same methods execute(), queue(), observe() and toObservable() but the name is not correct. Also, we can just keep it package private and not expose it outside as the intent is just to share the methods.

What about something like RxRequest?

Shouldn't we either let the function throw an Exception or return one of its own making?

We discussed about this point, but decided against it because the use case seems to be of transforming a response and it fits in naturally more at the transport level instead of the Ribbon API level. I was intending to move the karyon interceptor abstraction into rx-netty and make it available for both client and server. Finagle calls this abstraction as filters.

We choose to have it return a simple Boolean under the assumption that it is simple enough for majority of use cases.

I see the point that having a response transforming API in Ribbon can reduce the burden of client creator for creating his own transport client. But if the interceptor in rx-netty can make it simple then it might be a better option.

It seems this is overly precise when there could be other types of metadata we want to make available, including the HTTP response codes, headers, etc (even when type T is returned).

We thought about this. The reason it only returns Hystrix metadata is that at the point this information is available, Hystrix execution is already done and it is hardly useful to check the protocol specific meta data, which is the layer below Hystrix. Also it would be cumbersome from design point of view (especially generics) to include protocol specific data here.

The first is nothing but a simple function that takes the input args and returns the cache key.
The second is I think what the CacheProvider is solving for, correct?

That is my intention. Let me know if Hystrix cache should be reflected in the design in a different way.

However, from this point I am starting to think that we can completely eliminate the RibbonClient abstraction. Instead we may want to only have

Seems like a good idea.

@NiteshKant
Copy link

Please don't loose the ability to create templates based on templates that is in the original RibbonResource prototype that I wrote.

@khawes can you explain the usecase more?

@NiteshKant
Copy link

What about something like RxRequest?

@allenxwang yeah that could work too but I would vote for making the interface package private as it is just for defining common methods and making it public introduces confusion as to why is there a RibbonRequest and RxRequest? What is the difference between them?

@khawes
Copy link

khawes commented Jun 12, 2014

Here's some sample code for being able to define a template using another template. Note the ResourceTemplateSettings differs from the ResourceTemplate class in that it has no generic return classes so that it can be reused in many templates:

   private final RESTClient restClient;

    private static final ResourceTemplateSettings defaults = new ResourceTemplateSettings()
            .withClientVersion(DMSClient.class)
            .withRESTClientName(NIWSClientName)
            .withErrorClass(RESTResult.class);


    private static final ResourceTemplate<RendezvousEntity> POST_RendezvousStart =
            new ResourceTemplate<RendezvousEntity>(RendezvousEntity.class,
                                                   RestClient.Verb.POST,
                                                  "/rendezvous",
                                                   defaults)
                    .withSuccessHttpStatus(HttpStatus.SC_CREATED)
                    .withErrorHttpStatus(HttpStatus.SC_CONFLICT,
                                         HttpStatus.SC_UNAUTHORIZED);

and later it's called like this:

    public Resource.Builder<RendezvousEntity> rendezvousStart(String esn) {
        RendezvousEntity entity = new RendezvousEntity();
        entity.esn = esn;
        entity.state = DeviceEntity.State.Init;
        return POST_RendezvousStart.resourceBuilder().withEntity(entity);
    }

@benjchristensen
Copy link

What does "success status" and "error status" end up causing to occur?

@khawes
Copy link

khawes commented Jun 12, 2014

Success will cause the return of the RendezvousEntity.class in this
example, error is the RESTResult.class (again in this example) and other
HTTP codes would result in fallback handling. We've change how we specify
when to fallback. This is not an example of that but an example of the use
for nesting templates.

-- Keith


Don’t listen to me, I play with Christmas lights
http://www.christmasoutloud.com/ all year long.

On Thu, Jun 12, 2014 at 10:52 AM, Ben Christensen notifications@github.com
wrote:

What does "success status" and "error status" end up causing to occur?


Reply to this email directly or view it on GitHub
#99 (comment).

@benjchristensen
Copy link

How does this differ from just having a function that lets you do whatever is needed?

onNetworkResponse(HttpClientResponse<ByteBuf> response -> {
     if(response.getStatus().code() >= 500) {
          throw MyCustomException("message here");
     } else {
          return response; // pass-thru (but could decorate or modify if wanted)
     }
});

@khawes
Copy link

khawes commented Jun 12, 2014

It does not. We evolved into that function. Fon't focus on the details of the example but on the use case for defining one template using another template, which contains a set of defaults for the overall client that would be tedious and error prone to copt into every resource template in your client.

@benjchristensen
Copy link

This discussion is about the details – we are designing the API :-)

It does not. We evolved into that function.

Cool, that's good to know ... as we want to have something as generic as possible without making it useless.

contains a set of defaults for the overall client

Makes sense to make reuse easy, basically "partial application" (http://en.wikipedia.org/wiki/Partial_application and http://www.ibm.com/developerworks/library/j-jn9/).

@NiteshKant
Copy link

Makes sense to make reuse easy, basically "partial application"

Does it translate to having a method copy() on RequestTemplate which creates a new copy and then since the builder is additive, you start building from the copy.

@benjchristensen
Copy link

Does it translate to having a method copy()

Probably something like that as I doubt we can afford the object allocation overhead of making the builder immutable. If we could do it as an immutable builder (like Rx sequences) then you can take any Template at any point and just embed inside something else without worrying about mutation.

@khawes
Copy link

khawes commented Jun 12, 2014

I made a copy of the builder in the containing object and passed the get methods down to the default if there were no existing values (e.g. null). For headers they were merged on build. I don't recall my reasoning as it's been a few years now since I wrote it.

allenxwang pushed a commit to allenxwang/ribbon that referenced this issue Jun 12, 2014
@allenxwang
Copy link
Author

Changes after the discussion reflected in commit: allenxwang@8b74aff

Summary of changes:

  • FallbackProvider renamed to FallbackHandler
  • HystrixCommand changed to HystrixObservableCommand in FallbackHandler
  • addCacheProvider(CacheProvider<O> provider, String keyTemplate) changed to withCache(String keyTemplate, List<CacheProvider<O>> providers) and added javadoc to call out its relationship to Hystrix internal cache
  • RibbonClient removed
  • public RibbonRequest<HystrixResponse<T>> withHystrixInfo(); changed to public RibbonRequest<RibbonResponse<T>> withMetadata();
  • AsyncRequest renamed to RxRequest and made package private
  • added RequestTemplate.copy() API
  • Ribbon APIs updated to add newHttpRequestTemplate() and rename create(Class<T>, ...) to from(Class<T>, ...)
  • Example updated to show creating request from the same template

@NiteshKant
Copy link

@allenxwang can you update the design proposal in this issue with the latest code?

withCache(String keyTemplate, List<CacheProvider<O>> providers) can this be broken down into two methods:
withCacheKey(String keyTemplate)
and
withCacheProvider(CacheProvider<O> provider)

Passing a list would prohibit use of lambdas/closures.

@NiteshKant
Copy link

Probably something like that as I doubt we can afford the object allocation overhead of making the builder immutable. If we could do it as an immutable builder (like Rx sequences) then you can take any Template at any point and just embed inside something else without worrying about mutation.

Yeah it would be good to have the template thread-safe (by immutability) but the cost would be large as it means every mutation copies the template. I think we can just say that the template is not threadsafe from the point of view of mutations to the template, however each template.requestBuilder() call creates a new RequestBuilder and this method specifically is safe to be called concurrently.
This model will aid people to create a shared RequestTemplate instance which can be used across threads to create new RequestBuilder and hence RibbonRequest instances.

@NiteshKant
Copy link

@khawes If I understand correctly, you are fine with the approach of providing a copy() method on the template to support the usecase you were mentioning.

@khawes
Copy link

khawes commented Jun 13, 2014

Yes

@khawes https://github.com/khawes If I understand correctly, you are
fine with the approach of providing a copy() method on the template to
support the usecase you were mentioning.

@allenxwang
Copy link
Author

@khawes You will have access to HystrixExecutionInfo to see if the empty result is from fallback. The network response (Http headers) are not exposed after Hystrix execution is done since it is hardly useful once the final result (after fallbacks) is available.

@khawes
Copy link

khawes commented Jun 17, 2014

Let me add context to the question that I though was obvious but apparently is not.

From a user of my client library, who eventually calls get() to get the object he/she is expecting from the service being called.
What do they see for outcome #3 (non fall back error e.g. 404)?
How do they differentiate a success from a non-fallback failure? e.g. the login situation where a bad password does not result in a customer object.

@benjchristensen
Copy link

A HystrixBadRequestException (or whatever we extend that with) is thrown to the user with whatever message/cause you gave it.

@khawes
Copy link

khawes commented Jun 17, 2014

that's what I was missing, and is also missing from your example above.

Now back to our regularly scheduled debate: withFallback & withFallbackDeterminator vs withResponseTransformer:

I believe that the most common use cases can be handled by withFallback & withFallbackDeterminator and that by not having a single method doing both, that the road to adoption will be easier and less error prone. Remember we are asking service owners to write these clients and they are not well versed in RX. Making them learn RX puts an overturned big rig on our road to adoption.
We can even have an HTTP FallbackDeterminatorBuilder that we can use:

   class FallbackDeterminatorBuilder() {
       .withSuccessHttpStatus(int... httpStatusCodes)
       .withErrorHttpStatus(int... httpStatusCodes);
// or
       .withNonFallbackHttpStatus(int... httpStatusCodes);
// take your pick
   }

which makes it even easier for adoption.

@allenxwang
Copy link
Author

Given comments from @benjchristensen, the choices to implement null/empty (in case of 404) are (in the ResponseTransformer):

  • throws HystrixBadRequestException with your customized cause (you can deserialize the content from response body). This will cause Observer.onError() invoked or RuntimeException thrown from execute()
  • give back empty Observable if user does not expect visible error in this case

@NiteshKant
Copy link

I believe that the most common use cases can be handled by withFallback & withFallbackDeterminator and that by not having a single method doing both, that the road to adoption will be easier and less error prone. Remember we are asking service owners to write these clients and they are not well versed in RX. Making them learn RX puts an overturned big rig on our road to adoption.

I do not agree that providing this helper method will make the path to adoption of ribbon easier or less error prone. I do not think it's a mainstream usecase either.
The provided ResponseTransformer (we are going to change it to the original proposal of onNetworkResponse(Func1<HttpClientResponse<T>, HttpClientResponse<T>> func) instead of the ResponseTransformer interface) has nothing to do with Rx, it takes an HttpClientResponse and returns HttpClientResponse with optional modification.

The approach of onNetworkResponse is generic enough to handle all these usecases. On the other hand, providing these methods:

 class FallbackDeterminatorBuilder() {
       .withSuccessHttpStatus(int... httpStatusCodes)
       .withErrorHttpStatus(int... httpStatusCodes);
// or
       .withNonFallbackHttpStatus(int... httpStatusCodes);
// take your pick
   }

addresses specific use cases that we now think are reasonable. Anything out of these cases will require a different function. eg: I can not do a check based on a range of error codes (> 200, 400-500, etc.)

@khawes
Copy link

khawes commented Jun 17, 2014

@NiteshKant one can absolutely do range of error codes with the FallbackDeterminator approach. I showed one (and only one) possible implementation of it.

If I'm looking at using RibbonClient for the first time How do I know that fallback logic goes in some object that needs to implement onNetworkResponse(...)? It's not intuitive., therefore I may not correctly implement fallback logic.

RibbonClient should be simple and easy to use for most use cased yet be able to handle the more advanced cases by dropping down a layer. If you conflate these two layers you end up with a high barrier to entry, and teams writing the high level layer over and over, and we loose best practices again.

@khawes
Copy link

khawes commented Jun 17, 2014

Fundamentally, as a client writer I must always specify when to fallback. However as a client writer I cannot always specify how to fallback.
Here are two examples:

  • A/B: can specify when and how to fallback because fallback uses the test metadata for falling back.
  • DMS: can specify when to fall back but cannot specify how, because It is called by apps that have cookies, c-tickets and/or MSL data on which to fallback.

By combining the two disparate actions into one method it makes it very difficult to separate the two concerns. Those that cannot provide code for fallback will end up not providing for fallback, and users will not only have to provide fallback but also define when. We have this situation today because Hystrix already has this conflated view of falling back, and because of it we wound up with at least 3 separate implementations per DMS end point, and each and every one of them got it wrong on several occasions, and in addition others opted to not implement Hystrix at all.

We need to keep determining when to fallback separate form the act of falling back; so that we can have users of the client easily specify fallback actions, while leaving when to fallback to the client writer.

@benjchristensen
Copy link

Remember we are asking service owners to write these clients and they are not well versed in RX. Making them learn RX puts an overturned big rig on our road to adoption.

The function being discussed has little to do with Rx so please don't conflate them. The function is nothing more than a callback to process from HttpClientResponse<T> to either Throwable or T.

@NiteshKant
Copy link

By combining the two disparate actions into one method it makes it very difficult to separate the two concerns.

The two concerns of when to provide fallback and the fallback itself are expressed as:

  • onNetworkResponse()
  • withFallbackProvider() (@allenxwang I think we are changing the name to withFallback() ?)

@benjchristensen
Copy link

How does the onNetworkResponse have anything to do with defining "how" to fallback? It is only about processing a network response and deciding if you want to throw an exception.

The fallback, if you can implement one, is passed into the withFallback method. If you can't provide one, then a consumer will either receive an exception and must handle it, or must provide you a function that can be injected into the withFallback method.

The two use cases you are referring to are handled in two different places.

Here is the code again to show the two places:

Ribbon.from(httpClient)
        .newRequestTemplate()
        .onNetworkResponse((HttpClientResponse<ByteBuf> response) -> {
             // use case 1: when to fallback
            // handle use case to determine a failure based on network or data 
            return ...;
         })
         .withFallback((HystrixExecutableInfo info) -> {
            // use case 2: how to fallback
         })

@khawes
Copy link

khawes commented Jun 17, 2014

How does the onNetworkResponse have anything to do with defining "how" to fallback? It is only about processing a network response and deciding if you want to throw an exception.

The two use cases you are referring to are handled in two different places.

So that's not clear at all based on the name of the method onNetworkResponse and these comments, which lead me down the path that the "right palace" to replace the fallback response object is in the onNetworkResponse method:

it takes an HttpClientResponse and returns HttpClientResponse with optional modification.

and

The function is nothing more than a callback to process from HttpClientResponse to either Throwable or T.

and

All three scenarios are handled by the ResponseTransformer

  1. ResponseTransformer returns the same response
  2. ResponseTransformer throws an exception
  3. ResponseTransformer transforms the response. For example, return a new HttpClientResponse where the the content (Observable) represent empty result

All of these say I need to to return T (my result object) so it looks like the fallback response object needs to be returned at this point. There is no contract between onNetworkResponse and FallbackHandler. FallbackDeterminator had an implicit contract with FallbackHandler. via the naming convention which tells you the relationship, and that the determinator shouldn't be where you are supposed to be messing with T.

The result of determining fallback should be either:

  • Don't
  • Do & Why.

not one of:

  • Don't
  • Do & Why
  • Don't - yet use this alternate value for the result instead of the real value. Which is what I'm reading doResponseHandler is supposed to enable.

If you walk down this path you will find people handling the fallback internal to doResponseHandler and your metrics will not be counting fallbacks and the circuit breakers won't fire.

If you need to have something to mess with then by all means create a way to do so, but it shouldn't be related to fallback or its confusing.

@benjchristensen
Copy link

If I understand correctly your arguments are:

  1. You don't find the name onNetworkResponse clear and are concerned it will result in developer error.
  2. You don't think onNetworkResponse should be able to both throw and transform data.

For 2, even if we split it into two functions, I don't see how that prevents misuse. For example, we could code it like this"

Ribbon.from(httpClient)
        .newRequestTemplate()
        .throwIfBad((HttpClientResponse<ByteBuf> response) -> {
             // throw an exception if bad
             if(x) throw new RuntimeException(); // this will go to fallback 
         })
        .transformIfNeeded((HttpClientResponse<ByteBuf> response) -> {
             return response or transformed response;
         })
         .withFallback((HystrixExecutableInfo info) -> {
            // use case 2: how to fallback
         })

The transformIfNeeded function can still be misused as you suggest. And what if they throw exceptions from there, since they can achieve everything about throwIfBad from transformIfNeeded.

So, we can split them up, but I don't like doing so. The onNetworkResponse(Func<NetworkResponse, T>) approach is the simplest, most flexible way of doing it - let the user process the response and do anything they need or want to do to it:

a) do nothing, it's fine
b) throw a RuntimeException, it's bad, go to fallback
c) throw a HystrixBadRequestException, it's bad, but the users fault so don't go to fallback
d) manipulate the data (such as mapping keys), return to user

If we split these it is purely arbitrary, artificially limiting, and less efficient (causing the response to be processed twice).

Semantically throwIfBad takes (b) and (c) and transformIfNeeded takes (a) and (d) and then withFallback if (b) happens, but in practice I can do all of it within the transformIfNeeded method.

Considering all of that, let's determine whether (d) is necessary. If it's not then this debate is moot as we don't need the ability to transform and return type T. If it is needed I don't see any reason to split it into two functions, just let it stay simple as if it is a map function from T -> R.

The premise of (d) has been that some data comes off the wire and there is a reason to allow modifying it somehow.

Thinking about this further though, I'm not sure that will work very well due to serialization and static typing. We can't mess with static types all that well. It could work with dictionaries (mapping keys for example), but that is a runtime decision by the user, not something that can be coded into this by the developer providing the client.

So perhaps onResponse is an Action1<HttpClientResponse> that doesn't have a return type and can only throw?

Ribbon.from(httpClient)
        .newRequestTemplate()
        .onResponse((HttpClientResponse<ByteBuf> response) -> {
             // throw an exception if bad
             if(x) throw new RuntimeException(); // this will go to fallback 
         })
         .withFallback((HystrixExecutableInfo info) -> {
            // use case 2: how to fallback
         })

and if that's the case, would a name like validateResponse be preferable?

Ribbon.from(httpClient)
        .newRequestTemplate()
        .validateResponse((HttpClientResponse<ByteBuf> response) -> {
             // throw an exception if bad
             if(x) throw new RuntimeException(); // this will go to fallback 
         })
         .withFallback((HystrixExecutableInfo info) -> {
            // use case 2: how to fallback
         })

However, the thing I don't like about validateResponse is it implies it's only for validation, but any side-effecting work can be done there. This is why I prefer things like map(T -> R) as it does not limit what can be done by its name.

Should transforming of the response be permitted or not? It looks like it may be problematic.

@mikeycohen
Copy link
Contributor

I just spent the last 45 mins catching up on this thread. I think at this point it may warrant an in person discussion on these points. I think we may run the risk of adding complexity to this design to offer too much flexibility. Rather than making this design fit more and more complex use cases, maybe we should question the validity of the use case. Should we ask ourselves if it's easier in terms of the design and reduction of complexity to have the server side absorb some of this complexity? For Keith's use case, I wonder if there is a flag that can be passed to the server to determine success vs failure. In general, I favor less specific methods that allow ease of understanding and use.
A lot of this task should be narrowing the scope of what is possible for a client to do, and make that simple.

@mikeycohen
Copy link
Contributor

Here are two examples:

A/B: can specify when and how to fallback because fallback uses the test metadata for falling back.
DMS: can specify when to fall back but cannot specify how, because It is called by apps that have cookies, c-tickets and/or MSL data on which to fallback.

For these examples, these are great cases for flipping the logic;

AB can have the test metadata passed in; for DMS requests, this data can be passed in to the client as well. This simplifies the logic within the client, and makes the caller need to explicitly pass in the appropriate data for the call.

@khawes
Copy link

khawes commented Jun 18, 2014

@mikeycohen

For Keith's use case, I wonder if there is a flag that can be passed to the server to determine success vs failure.

Which use case are we talking about?

There is a face to face on Friday @11:30

@khawes
Copy link

khawes commented Jun 18, 2014

@mikeycohen Flipping is a great way to explain what I'm trying to get across. Extending your example to it's logical conclusion we get:

  • In the A/B case there can be a singleton implementation of the fall back handler and A/B can just set it at initialization time, freeing it's users from the burden.
  • DMS uses will still have the burden of creating a fall back handler with the appropriate data for every call.

@allenxwang
Copy link
Author

A change to the design regarding API for get metadata (e.g. HystrixExecutableInfo):

public interface RibbonRequest<T> {

    public T execute();

    public Future<T> queue();

    public Observable<T> observe();

    public Observable<T> toObservable();

    public RequestWithMetaData<T> withMetadata();
}

public abstract class RibbonResponse<T> {
    public abstract T content();

    public abstract HystrixExecutableInfo<T> getHystrixInfo();   
}

public interface RequestWithMetaData<T> {
    Observable<RibbonResponse<Observable<T>>> observe();

    Observable<RibbonResponse<Observable<T>>> toObservable();

    Future<RibbonResponse<T>> queue();

    RibbonResponse<T> execute();
}

With this change, one is possible to do

        request.withMetadata().observe()
            .flatMap(new Func1<RibbonResponse<Observable<ByteBuf>>, Observable<String>>() {
                @Override
                public Observable<String> call(RibbonResponse<Observable<ByteBuf>> t1) {
                    if (t1.getHystrixInfo().isResponseFromFallback()) {
                        return Observable.empty();
                    } 
                    return t1.content().map(new Func1<ByteBuf, String>(){
                        @Override
                        public String call(ByteBuf t1) {
                            return t1.toString();
                        }

                    });
                }
            });

@NiteshKant
Copy link

Should transforming of the response be permitted or not? It looks like it may be problematic.

I think response transformation belong to the transport layer. Once it comes to the ribbon API layer, there should not be any transport response change done (atleast not evidently from the API). So in this case, instead of Func1 we should use Action1

@allenxwang
Copy link
Author

Latest design is in gist:

https://gist.github.com/allenxwang/9f290dc863705bb4903f

Changes:

  • Enabling Hystrix cache via a separate API and cache key
  • Having a different cache key template for each provider
  • Renamed ResponseTransformer to ResponseValidator which extends Action1 and withResponseValidator(ResponseValidator). I feel onResponse sounds like reactive, but in the case of template building, we want user to actively set this piece.

@NiteshKant
Copy link

Renamed ResponseTransformer to ResponseValidator which extends Action1 and withResponseValidator(ResponseValidator). I feel onResponse sounds like reactive, but in the case of template building, we want user to actively set this piece.

ResponseValidator does not do justice to the intention. Its not validating, its asserting whether response is a success or not.
I would like to stick to onResponse(Action1) API rather than a first class interface for ResponseValidator.

@khawes
Copy link

khawes commented Jun 20, 2014

@NiteshKant the name ResponseValidator is closer to the intention of asserting success, failure, or error than the name onResponse is. We're not trying to assert success or failure but success, authoritative error, and failure. It's the 'authoritative error' that is important. because you don't wan to fall back on these, and yet it is not successful in the sense that the expected result was returned. Lets try to find a name that mans decide if the call was successful and if not if we should fallback.

As for making it a first class interface... I believe that it needs to be. We want to be encouraging the best practice of specifying when fallback is necessary and The builder pattern is great for calling this out.


All that aside I think we are diverging from our original intention of having clients built on top of this that are so thin we can almost "test" them with our eyeballs. e.g. code so simple it's hard to introduce non-obvious errors.

Are we adding functionality that is useful for a few edge cases? I think we are. Take a step back an look at NIWSRestClient. It was very flexible and required a lot of boilerplate be created by client libraries for the common use cases. So much so that we have rest-client-utils that wraps it to allow all that boilerplate to be pushed down a layer. The result is straight forward to use and difficult to get wrong. It is also more restrictive, but that's how it gets it's simplicity. If you need the flexibility you can always drop down a layer and have at it. That's a long winded way of saying we need move the edge cases down a layer?


I like to think about the common client from the client library users perspective. And think it should look like this:

SubscribedObservable<DmsDevice> deviceCall = dmsClient.getCustomerDevice(esn, 
               customerid, 
               new myFallbackhandler(cTicket));

if fallback must be provided by the user and:

SubscribedObservable<DmsDevice> deviceCall = dmsClient.getCustomerDevice(esn, customerid);

if not.

Don't dump all over the word SubscribedObservable It's only trying to communicate that the remote call behind the Observable is already in progress at this point (e.g. the remote call has started) so that later the user can call:
DmsDeivce device = device.get(); The point is that the object I get back from my call to a client library is an observable that is already in progress so I can call several client libraries and then start doing the .get()s.

Oh and don't make me type deviceCall.toBlockingObservable.get() just give me deviceCall..get() and do that extra ensuring the object is ready for consumption bit behind the scenes. The problem with boilerplate (no matter how small) is that right after you copy and past it is that it's out of date.

In my head the implementation of ` dmsClient.getCustomerDevice(esn, customerId, myFallbackhandler(cTicket));' looks something like this:

SubscribedObservable<DmsDevice> deviceCall(String esn, Long customerId, FallbackHandler fallback) {
        GetCustomerDeviceRequestTemplate.getResource("esn", esn, "custId", customerId)
        .withResponseValidator(deviceCallValidator)
        .withFallback(fallbackHandler)
        .build() // the act of building fires off the remote call (subscribes)
         }
}

Don't make me type .build().subscribe() because at some point I will forget, or it will get deleted by a stray key stoke. Since Observable is the result of both, and If I leave out .subscribe() It will be an error I won't see unless I test that my call is executing in parallel upon return. which will also get neglected.

deviceCallValidator is a singleton dedicated to this endpoint, and since this is a REST call it will look at the http response codes. (As will all REST calls so we can and should provide a builder for these)

GetCustomerDeviceRequestTemplate defines the rest endpoint and getResource is the fully formed resource that is for this one call, because it now contains data for this one call to the resource.

Finally, if as a user if I feel I need to get at the builder before it's built I can go talk to the client library owner to have a method exposing the builder, or if the owner sees benefit for all their users it gets built into the client and everybody gets the enhanced functionality.

End of line.

@allenxwang
Copy link
Author

The design is updated here

  • Added ResourceGroup and HttpResourceGroup
  • Added ClientOptions to provide simple interface for common client configurations
  • Changed the generic type for RequestTemplate to be the entity type instead of the raw type from HttpClient
  • ResponseValidator and FallbackHandler no longer extends from Rx functions
  • Other minor renaming

The reason why ClientOptions is not offered as a setter for ResourceGroup is that the underlying client should be immutable (given that it is shared for all templates) once the group is created, so it does not make sense to have mutable client options.

@khawes
Copy link

khawes commented Jun 23, 2014

withConnIdleEvictTimeMilliSeconds was hard for me to read/parse. Maybe use:
withIdleEvictTimeMilliseconds or withConnectionIdleEvictTimeMilliseconds
the latter seems to cause me to infer this is related to making a connection rather than a dealing with a connection pool, so I prefer the former. Maybe go so far as withConnectionPoolIdleEvictTimeMilliseconds

PS Milliseconds is one word so no need to camel case the S in seconds.

@khawes
Copy link

khawes commented Jun 23, 2014

in HttpRequestTemplate and we change withUri to withUriTemplate since the URI needs to look like this: "/v1/dms/Device/{Esn}/{CustomerId}" and not "/v1/dms/Device/TIVO-123456756/1234567845" the Former is a template and the latter is a URI.

@allenxwang
Copy link
Author

Minor naming changes in ClientOptions to reflect comments from @khawes and @benjchristensen

https://gist.github.com/allenxwang/9f290dc863705bb4903f

@allenxwang
Copy link
Author

After discussion with @mikeycohen, the RequestTemplate.addCacheProvider() is reverted back to RequestTemplate.withCacheProvider() to make sure that only one cache provider can be specified. User can implement the multiple cache providers scenario with composition.

@allenxwang
Copy link
Author

Close this as the design is reflected in new ribbon module in 2.0-RC1. New issues will be created for further improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants