
merge master #1

Conversation


@mosesn mosesn commented May 10, 2017

No description provided.

Daniel Schobel and others added 30 commits February 24, 2017 04:49
Problem / Solution

Fixup the reflection in one test and rm another for more reliable
test runs.

RB_ID=908703
In ConnectionManager, the connection starts off with "isKeepAlive = false".
Because the connection is initially idle, ConnectionManager returns "true"
for shouldClose(), which is called by Retries before a request is made (if the
client is constructed without FactoryToService) to get the status of the connection.
We don't want a connection to start off in a closed state because service acquisition
will be retried until exhausted.

Solution

Initialize "isKeepAlive" to "true".
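
The initial-state problem can be sketched in a few lines (hypothetical names, not ConnectionManager's actual fields): with `isKeepAlive` starting `false`, an idle connection reports itself as closable before any request is made.

```scala
object ConnectionStateSketch {
  // `busy` is true while a request is outstanding; an idle connection
  // without keep-alive is reported as closable.
  final class ConnState(var isKeepAlive: Boolean, var busy: Boolean = false) {
    def shouldClose: Boolean = !busy && !isKeepAlive
  }
}
```

Initializing the flag to `true` means a fresh idle connection no longer looks closed.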

RB_ID=908622
…havior

Problem

Some protocols have Java clients which live in
com.twitter.finagle.$protocol.java packages. SBT globs both java and scala
sources together during compilation, leading to package naming conflicts when
importing any java.* packages that only show up with SBT. This breaks builds.

Solution

Add a false circular dependency on Java sources in BUILD files of those
protocols.

RB_ID=908577
Problem

HSCAN and SCAN commands take an optional argument for pattern matching.
This argument is called MATCH and not PATTERN.

Solution

Replace PATTERN with MATCH in SCAN and HSCAN.

Result

Redis clients can execute SCAN and HSCAN with an optional pattern match
argument.
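
As a rough illustration of the fix (a hypothetical helper, not the finagle-redis API), the optional glob belongs under the `MATCH` keyword:

```scala
object ScanArgs {
  // Assemble the argument list for a Redis SCAN command; the optional
  // glob is emitted under MATCH (the keyword Redis expects), not PATTERN.
  def scan(cursor: Long, pattern: Option[String], count: Option[Long]): Seq[String] = {
    val base = Seq("SCAN", cursor.toString)
    val withMatch = pattern.fold(base)(p => base ++ Seq("MATCH", p))
    count.fold(withMatch)(c => withMatch ++ Seq("COUNT", c.toString))
  }
}
```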

Signed-off-by: Ryan O'Neill <ryano@twitter.com>

RB_ID=908817
Problem

Accidentally left logging turned up.

Solution

Turn down the volume to a more manageable amount.

RB_ID=908857
Problem

TravisCI barfs on ALPN. Let's make sure to disable it when we don't have
a usable OpenSSL.

Solution

Check if we have OpenSSL before actually sending it out.

RB_ID=908850
Summary:
Problem / Solution

ConcurrentLoadBalancerFactory is only used in one place – finagle-memcached.
It doesn't seem generally useful since it is specific to pipelined protocols that
have stateless sessions, of which we only have one (finagle-memcached). This patch
moves it into the finagle-memcached directory and fixes up source to reflect
the change.

Test Plan: ./pants test <relevant projects>

Reviewers: O22826 source:/src/scala/com/twitter/mediascience/!, banderson, O22963 source:/src/scala/com/twitter/taxi/!, vkostyukov, O1553 source:/media-analysis-service/!, mnakamura, O1805 source:/notifications-delivery/!, jillianc, O5308 source:/user-image-service/!, koliver, O1567 source:/mediaservices/!, yao, O5146 source:/thumbingbird/!, ryano, O1802 source:/mynahbird/!, dschobel, O948 source:/iesource/!, jdonham, O523 source:/bouncer/!, O5276 source:/udaan/!, O536 source:/cache/!, O8689 source:/ads/review/common/!, cflanagan

Reviewed By: jillianc, koliver, dschobel

Subscribers: ee-phabricator-cron, #rb_traffic, #rb_notifications-infra-team, #rb_sss, #rb_media-platform, #rb_csl, #rb_media-science-eng, #rb_growth-bangalore-eng, #rb_te-eng, #rb_cache-team, #rb_core-data-metrics, #rb_abuse-eng

JIRA Issues: CSL-4067

TBR:

TBR=true

Differential Revision: https://phabricator.twitter.biz/D29215
Problem

ConcurrentLoadBalancerFactory was moved to finagle-memcached, but
the tests depend on StringClient/StringServer, which are in the finagle-core
tests. We are missing this dependency for sbt.

Solution

Add a test dependency on finagle-core to finagle-memcached.

RB_ID=908995
TBR=true
Problem

Netty SSL barfs in TravisCI if you include the static BoringSSL jar.

Solution

In the long term, it would be great if it didn't barf.  In the short term, move
the dep to finagle-http tests, and make sure it doesn't touch SSL in CI in
finagle-http.

RB_ID=909011
TBR=true
Problem / Solution

OnReady is not used anymore at the creation site of load balancers
(i.e. the TrafficDistributor) and is therefore no longer needed.

RB_ID=908863
Problem

The `ServiceIfaces` generated by Scrooge for Scala's Service
per-endpoint are a `c.t.f.Service[ThriftMethod.Args, ThriftMethod.Result]`.
Unfortunately, `Result` here is not a very pleasant interface to
work with, as it is a `ThriftResponse[ThriftMethod.Success]` and
this means you have to do some odd pattern matching to use it.

For example, using an example IDL:

    exception AnException {
      1: i32 errorCode
    }
    exception AnotherException {
      1: i32 errorCode
    }
    service SomeService {
      i32 TheMethod(
        1: i32 input
      ) throws (
        1: AnException ex1,
        2: AnotherException ex2
      )
    }

This is roughly what pattern matching against a
`SomeService.TheMethod.Result` looks like:

    result match {
      case Return(SomeService.TheMethod.Result(Some(v), _, _)) =>
        // Successful
      case Return(SomeService.TheMethod.Result(_, Some(e), _)) =>
        // AnException
      case Return(SomeService.TheMethod.Result(_, _, Some(e))) =>
        // AnotherException
      case Throw(_) =>
        // an exception not from Thrift
    }

Solution

The `ServiceIface` is better modelled as a `Service` from `ThriftMethod.Args`
to `ThriftMethod.SuccessType`. This is far more idiomatic as the result
of the `Service` is a `Future[SuccessType]` and thus it already is a `Try`.

This turns that pattern match into roughly:

    result match {
      case Return(success) =>
      case Throw(AnException(_)) =>
      case Throw(AnotherException(_)) =>
      case Throw(_) =>
    }

Result

While this is a breaking API change, it should be fairly superficial.
Translating from a `Result` to a `SuccessType` is fairly
straightforward and an example can be seen in Scrooge's
`ThriftServiceIface.resultFilter`.

RB_ID=908846
TBR=true
This reverts commit 0f7373b831748eb8f37e2d53c113a3a959b24db2

RB_ID=909404
TBR=true
NO-QUEUE=true
…henticator

Problem/Solution

log.debug will swallow a stack trace if the exception is placed after a string.
This orders them so the trace will get written to the appropriate handler and
debugging will be possible.

RB_ID=909304
Problem

The `loadbalancer` package co-located implementations with
traits and helpers. This made it difficult to read and understand
and most importantly expand.

Solution

This patch cleans up the directory in several ways:

1. Each algorithm gets its own sub package.
2. Algorithms were renamed to more clearly reflect their selection algorithm and load metric.
3. The test package was restructured in the same way.
4. Other small cleanups.

RB_ID=909245
TBR=true
Problem

The ServiceIfaces generated by Scrooge for Scala's Service
per-endpoint are a c.t.f.Service[ThriftMethod.Args, ThriftMethod.Result].
Unfortunately, Result here is not a very pleasant interface to
work with, as it is a ThriftResponse[ThriftMethod.Success] and
this means you have to do some odd pattern matching to use it.

For example, using an example IDL:

    exception AnException {
      1: i32 errorCode
    }
    exception AnotherException {
      1: i32 errorCode
    }
    service SomeService {
      i32 TheMethod(
        1: i32 input
      ) throws (
        1: AnException ex1,
        2: AnotherException ex2
      )
    }

This is roughly what pattern matching against a
SomeService.TheMethod.Result looks like:

    result match {
      case Return(SomeService.TheMethod.Result(Some(v), _, _)) =>
        // Successful
      case Return(SomeService.TheMethod.Result(_, Some(e), _)) =>
        // AnException
      case Return(SomeService.TheMethod.Result(_, _, Some(e))) =>
        // AnotherException
      case Throw(_) =>
        // an exception not from Thrift
    }

Solution

The ServiceIface is better modelled as a Service from ThriftMethod.Args
to ThriftMethod.SuccessType. This is far more idiomatic as the result
of the Service is a Future[SuccessType] and thus it already is a Try.

This turns that pattern match into roughly:

    result match {
      case Return(success) =>
      case Throw(AnException(_)) =>
      case Throw(AnotherException(_)) =>
      case Throw(_) =>
    }

Result

While this is a breaking API change, it should be fairly superficial.
Translating from a Result to a SuccessType is fairly
straightforward and an example can be seen in Scrooge's
ThriftServiceIface.resultFilter.

RB_ID=909434
TBR=true
NO-QUEUE=true
Problem / Solution

The "aperture" gauge was exported twice under the "loadbalancer"
scope with different semantics. In order to avoid future name
conflicts, the gauge used to indicate which lb algorithm a client
is using was moved under "algorithm".

RB_ID=909309
…esponse

Problem

The Finagle HTTP model includes a number of protected[finagle]
methods exposing the underlying Netty 3 types that back the model,
keeping us coupled to Netty 3. These can also be used from Java
and make it difficult to encapsulate the Netty 3 implementation
details.

Solution

Remove most of these or make them legitimate protected so that
they can easily be removed later as we transition away from an
HTTP model which is a proxy to a Netty type.

RB_ID=908409
TBR=true
Problem / Solution

Failure accrual should be based on recent success rate rather than the past 5
consecutive failures. To facilitate a switchover, this adds a toggle. To
ensure that tests pass, several changes needed to be made to various tests.
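
A minimal sketch of the success-rate idea, assuming a fixed-size window (illustrative only, not Finagle's FailureAccrualFactory):

```scala
object SuccessRateAccrual {
  // Windowed success-rate policy: only mark a node dead once the window
  // is full and the observed rate falls below the required threshold,
  // rather than after N consecutive failures.
  final class Window(size: Int, requiredSuccessRate: Double) {
    private val results = scala.collection.mutable.Queue.empty[Boolean]
    def record(success: Boolean): Unit = {
      results.enqueue(success)
      if (results.size > size) results.dequeue()
    }
    def successRate: Double =
      if (results.isEmpty) 1.0 else results.count(identity).toDouble / results.size
    def markDead: Boolean = results.size == size && successRate < requiredSuccessRate
  }
}
```

Unlike consecutive-failure accrual, a single success interleaved with failures does not reset the policy.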

RB_ID=907355
Problem / Solution

The Netty4 toggle was mistakenly turned up in the local configuration
location. Turn it down to 0.

RB_ID=909513
Problem

The `LoadBand` trait implemented a Node in order to keep track
of the total number of pending requests. This made it less reusable and,
in fact, it need not know anything about a Node type.

Solution

Create a new entry point in `Balancer` that allows traits to have access
to the underlying `ServiceFactory`. This allows `LoadBand` to do the
necessary bookkeeping to track the total number of pending requests.

RB_ID=909474
Problem

We want to be able to force the jdk ssl engine for testing.

Solution

Add a param to Netty4ClientEngineFactory/Netty4ServerEngineFactory,
forceJdk, to set the SSL provider to JDK if true.

RB_ID=909339
Problem

In order to have a useful `MethodBuilder` for Thrift/ThriftMux, which
have different request and response types per-endpoint, the APIs
needed some refactoring.

Solution

`MethodBuilder` now works in terms of `Filter.TypeAgnostic` so
that the filters will be usable by Thrift and ThriftMux clients.

Result

This clears the path for a future commit that will allow Thrift and
ThriftMux clients to utilize `MethodBuilder`.

RB_ID=909497
Problem/Solution

New versions of the commons libraries have been published so the dependencies
should be updated accordingly.

RB_ID=908725
…igurations

Problem

There is no direct analogy of `expHttpProxy`/`expSocksProxy` configurations
in Netty 4 transports. However, there is a better alternative available via
the `.withTransport.httpProxyTo` API that allows reusing naming/load balancing
over the proxy servers.

Solution

Given that `expHttpProxy`/`expSocksProxy` were experimental and intended
only as a temporary solution (for using a single HTTP proxy server in a client),
we're removing them now to unblock the Netty 4 migration.

RB_ID=909739
Problem

The HTTP protocols (1.1 and 2) treat responses with 500 status codes
as successful for the purposes of metrics and failure accrual. This is
surprising to users.

Solution

Change the behavior to classify these as unsuccessful. Users can opt
out of this via the
`com.twitter.finagle.http.serverErrorsAsFailuresV2` toggle.
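
The new default can be sketched as a response classifier over status codes (illustrative types, not the exact c.t.f.service API):

```scala
object ServerErrorClassifier {
  sealed trait ResponseClass
  case object Success extends ResponseClass
  case object NonRetryableFailure extends ResponseClass

  // New default: 5xx responses count as failures for metrics and
  // failure accrual; everything else remains a success.
  def classify(statusCode: Int): ResponseClass =
    if (statusCode >= 500 && statusCode <= 599) NonRetryableFailure else Success
}
```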

RB_ID=909315
Problem

HTTP clients should be able to use the MethodBuilder API.

Solution

Introduce `Http.Client.methodBuilder` as the entry point.

Result

Users get an API with retries, timeouts, and logical success metrics.
As an example, this produces a `Service` that has timeouts and retries
on a 418 status code.

    import com.twitter.conversions.time._
    import com.twitter.finagle.Http
    import com.twitter.finagle.service.{ReqRep, ResponseClass}
    import com.twitter.util.Return

    val client: Http.Client = ???
    client.methodBuilder("inet!example.com:80")
     .withTimeoutPerRequest(50.milliseconds)
     .withTimeoutTotal(100.milliseconds)
     .withRetryForClassifier {
       case ReqRep(_, Return(rep)) if rep.statusCode == 418 => ResponseClass.RetryableFailure
     }
     .newService("an_endpoint_name")

RB_ID=909793
Problem / Solution

The logic for skipping tests in `AbstractEndToEndTest` was too easy to
get wrong. Move it into `test` so it can't be missed.

RB_ID=909966
Problem

Netty's leak detection model distinguishes between composite byte
buffers with unpooled offheap components and unpooled off heap byte
buffers. It considers unreleased instances of the former to be leaks
but not so with the latter. This is a problem for finagle protocols
that use netty's decoders (such as http) because despite our pipelines
being set up to copy all data on heap, leaks can be declared after
passing the onheap unpooled buffers through a netty decoder.

Solution

Introduce a composite byte buffer which is aware of its components
and only allows GC-releasable resources. We also introduce a
LeakDetectingAllocator so that we can produce instances of our new
buffer type.

Result

We bend netty's leak detection to the model we actually want, where
unreleased pooled or direct buffers are leaks and anything unpooled
and onheap is not.

RB_ID=908588
…tion

Problem

Aperture used a relatively complex Ring implementation to pick two
distinct nodes from its collection of endpoints. The added complexity
was necessary when load balancers supported weights, since each endpoint
could support a range of serving units. However, this is no longer the
case. We want to simplify this, especially with the advent of deterministic
aperture which will use a Ring topology to determine ordering.

Solution

Extract the stock pick from the P2C implementation and share it between P2C
and Aperture.
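
The shared pick is the classic power-of-two-choices selection; a self-contained sketch (not the actual Finagle code):

```scala
object P2CPick {
  // Choose two distinct indices uniformly at random and keep the one
  // with the lesser load; assumes at least two entries.
  def pick(loads: IndexedSeq[Int], rng: scala.util.Random): Int = {
    val n = loads.size
    val a = rng.nextInt(n)
    var b = rng.nextInt(n - 1)
    if (b >= a) b += 1 // shift so that b != a
    if (loads(a) <= loads(b)) a else b
  }
}
```

Sharing this pick lets Aperture drop its Ring-based two-node selection while keeping the same load-distribution behavior as P2C.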

RB_ID=909718
Problem

Some HTTP proxy servers (including Squid) send their error responses (handshake errors)
within fancy-formatted HTML pages. This makes Finagle throw N4's `TooLongFrameException`
back to the user whenever the proxy handshake fails, since no payload is expected with
a handshake response at this point.

Solution

Accumulate up to 10kb of handshake responses to make sure we're not leaking N4 exceptions
into user space, and report something meaningful instead.

RB_ID=909972
nepthar and others added 27 commits April 28, 2017 19:11
Problem/Solution

DeadlineFilter uses a generic Failure as the exception given when a deadline is
exceeded, leading to it being hidden in stats. By providing a named exception
instead, DeadlineExceededException, it can be counted.

RB_ID=915634
Problem / Solution

c.t.f.thrift.transport.netty3.ThriftServerBufferedCodec and
c.t.f.thrift.transport.netty3.ThriftServerBufferedCodecFactory
are deprecated. Remove them.

RB_ID=915656
Problem / Solution

We need a default, singleton `Netty4HashedWheelTimer` instance that we can share
across the JVM process running Finagle clients/servers.

Implementation Notes

Please note that all the stats machinery is basically stolen from the Netty 3
timer implementation, but it's now embedded into the timer itself.

I was on the fence trying to decide if we needed to
make it abstract, but given that the Netty 3 timer is going away, I don't
think it's worth the effort.

Instead of exposing configuration knobs as constructor params, I made
them global flags. I think this approach makes more sense given the timer
is shared across the process (flags are great for configuring singletons).

RB_ID=915557
…tocolFactory}

Problem / Solution

c.t.f.ThriftMux.withClientId and
c.t.f.ThriftMux.withProtocolFactory are deprecated. Remove them.

RB_ID=915655
TBR=true
Problem

The semicolon at the end of the line is unnecessary.

Solution

Removed semicolons.

Signed-off-by: Yufan Gong <yufang@twitter.com>

RB_ID=915723
Problem

When using finagle-netty4 from the develop branch, peer certificate is not
available in the context after successful TLS handshake.
twitter#600

Solution

- Change ChannelTransport.peerCertificate from a val to a def.
- SslServerConnectHandler delays the channelRegistered event until
  handshake success. That means ServerBridge's initChannel will be delayed.

Result

Service instantiation is delayed until the TLS handshake finishes.

Signed-off-by: Yufan Gong <yufang@twitter.com>

RB_ID=915263
…luster} methods

Problem / Solution

com.twitter.finagle.builder.ClientBuilder.{group, cluster} are
deprecated. Remove them.

RB_ID=915098
TBR=true
Problem / Solution

Finagle protocols depending on Netty 4 (literally every single protocol)
must be using `Netty4HashedWheelTimer` as their timer. Let's override the
`Timer` stack param for them explicitly.

RB_ID=915703
Problem / Solution

There were some Netty 3 imports in finagle-redis that turned out to be
dead code. Remove them.

RB_ID=916015
TBR=true
Problem

Now that all created Finagle Memcached clients are pipelined,
there's no need to record this in the registry.

Solution

Remove it.

RB_ID=916005
…ream IDs

Problem

When receiving HTTP/2 frames with stream IDs that aren't recognized, finagle
only logs debug messages. According to the spec, an RST may be sent when an
incoming message is for a stream that has been closed and if enough time has
passed to assume that the peer is misbehaving (opposed to receiving in-flight
messages on a stream that was recently closed). If an incoming message is
received for an idle stream, a GOAWAY/PROTOCOL_ERROR should be sent.
Additionally, the spec states that an endpoint must not send a RST in
response to an RST to avoid a loop.

Solution

If no child exists for a stream, send RST(badId, error=STREAM_CLOSED) if the
stream has been closed. Send GOAWAY(lastId, error=PROTOCOL_ERROR) if the
stream is idle (doesn't exist yet).
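
The dispatch rule above can be sketched as a small decision function (hypothetical names, not finagle-http2's actual types):

```scala
object H2StreamRule {
  sealed trait Action
  case object Rst extends Action    // RST_STREAM(STREAM_CLOSED)
  case object GoAway extends Action // GOAWAY(PROTOCOL_ERROR)

  // Decision for a frame whose stream ID has no live child: per the spec,
  // never answer an RST with an RST; RST a recently closed stream; send
  // GOAWAY for an idle (never-opened) stream.
  def onUnknownStream(streamWasClosed: Boolean, incomingIsRst: Boolean): Option[Action] =
    if (incomingIsRst) None
    else if (streamWasClosed) Some(Rst)
    else Some(GoAway)
}
```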

RB_ID=915507
Signed-off-by: Yufan Gong <yufang@twitter.com>

RB_ID=916210
Signed-off-by: Yufan Gong <yufang@twitter.com>

RB_ID=916212
Problem / Solution

Clarify that the resources directory needs to be packaged with
the application.

RB_ID=916196
Problem / Solution

com.twitter.finagle.builder.ClientBuilder.tracerFactory is
deprecated, as is com.twitter.finagle.tracing.Tracer.Factory.
Remove them.

RB_ID=915481
TBR=true
NO_USER_HOOK=1
Problem / Solution

Netty 4.1.10-Final has several features and fixes that we'd like to use, so
this upgrades netty. Additionally, Netty typically recommends a particular
tcnative version, so we bump that to 2.0.1 as well. Finally, the new
version of netty removes DnsNameResolverBuilder#nameServerAddresses, so
that functionality has been re-created.

RB_ID=916056
Problem

The ByteReader represents an intrinsically mutable representation
of raw data, with an implicit notion of owner, that codec
implementations can consume. However, it doesn't contain a notion of
resource management, making it difficult to represent the semantics
of data sources like the Netty 4 ByteBuf, which needs to manage a
reference count.

Solution

Make ByteReader extend AutoCloseable to give ByteReader a hook for
managing underlying resources.
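
A simplified sketch of the shape this gives ByteReader (hypothetical types; in Finagle the close hook would release the backing Netty 4 ByteBuf's reference count):

```scala
object CloseableReader {
  // A reader over reference-counted data: close() is the hook through
  // which the underlying resource gets released.
  final class ByteReaderLike(bytes: Array[Byte], onClose: () => Unit)
      extends AutoCloseable {
    private var pos = 0
    def readByte(): Byte = { val b = bytes(pos); pos += 1; b }
    def close(): Unit = onClose()
  }
}
```

Extending AutoCloseable also lets callers use try-with-resources (or `scala.util.Using`) to guarantee the release.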

RB_ID=916086
Problem / Solution

In a previous review, there were some formatting/visibility issues that slipped
by. This fixes them.

RB_ID=916319
Problem / Solution

TrafficDistributor is an implementation detail of LoadBalancerFactory,
it should be co-located with the loadbalancer code.

RB_ID=916346
Problem

MultiCategorizingExceptionStatsHandler, as configured, ignored flags on
exceptions that were not Failures, leading to missing stats.

Solution

Use FailureFlags.flagsOf to generate flag name sets for arbitrary throwables.

RB_ID=916301
Problem

The Node type is typically defined by a load metric, in which case
only the `load` and `pending` fields are needed. However, different
balancers may need to attach additional data to the Node type. For
example, Aperture needs a `token` to sort the nodes. This was added
to the `NodeT` trait but only used by Aperture, forcing the load
metrics to know the internal details of Aperture.

Looking forward, we will want to refine the `NodeT` type to implement
expiration but don't necessarily want these fields directly on `NodeT`.

Solution

Load metrics now can refine the `NodeT` type, but the final binding
is left to the implementation. That is, each implementation has to
implement the `Node` type, `newNode` and `failingNode`.
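
The refinement pattern can be sketched with a Scala abstract type member (illustrative only, not Finagle's actual trait hierarchy):

```scala
object NodeRefinement {
  trait BalancerLike {
    trait NodeT { def load: Double; def pending: Int }
    // Each implementation binds its own concrete Node type.
    type Node <: NodeT
    def newNode(load: Double): Node
  }

  // An Aperture-style balancer refines Node with an extra `token` used
  // for ordering, without the base trait ever knowing about it.
  final class ApertureLike extends BalancerLike {
    final case class Node(load: Double, pending: Int, token: Int) extends NodeT
    def newNode(load: Double): Node =
      Node(load, pending = 0, token = scala.util.Random.nextInt())
  }
}
```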

RB_ID=915772
Problem

Netty http/2 support already does a graceful shutdown, but we don't configure
how long the shutdown should be.

Solution

Since we want to be able to late-bind the shutdown timeout, we should pass a
preparatory closable which we can ensure closes before anything else.

RB_ID=915234
…r methods

Problem / Solution

c.t.f.exception.Reporter.{clientReporter, sourceReporter} are
deprecated. Remove them.

RB_ID=916403
Problem / Solution

BindingFactory is a naming implementation detail and as such should
be in Finagle's naming package.

RB_ID=916350
TBR=true
Problem

Historically, Finagle's default timer was Netty 3 `HashedWheelTimer` and was
defined in `finagle-core`. This makes it difficult to decouple `finagle-core`
from Netty 3. We need a default timer implementation w/o Netty dependency.

Solution

Given that all Finagle protocols are running Netty 4 with Netty 4 `HashedWheelTimer`
by default, we can safely service-load the timer and fall back to just `JavaTimer`
when no timers are available on the classpath.
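
The service-load-with-fallback scheme can be sketched with java.util.ServiceLoader (interface names here are illustrative, not Finagle's):

```scala
object TimerLoader {
  trait TimerService { def name: String }
  final class JavaTimerFallback extends TimerService { def name = "JavaTimer" }

  import scala.jdk.CollectionConverters._

  // Pick the first implementation registered on the classpath via
  // META-INF/services; fall back to the JavaTimer-like default when
  // no provider is available.
  def default(): TimerService = {
    val loaded = java.util.ServiceLoader.load(classOf[TimerService]).iterator.asScala
    if (loaded.hasNext) loaded.next() else new JavaTimerFallback
  }
}
```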

RB_ID=915924
@monkey-mas monkey-mas merged commit 2dfc051 into monkey-mas:fix-body-and-header-in-204-and-304 May 12, 2017
@monkey-mas
Owner

Thanks!!! :)
