Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clients.rst – Fix StatsReceiver src link #321

Closed
wants to merge 49 commits into from
Closed

Clients.rst – Fix StatsReceiver src link #321

wants to merge 49 commits into from

Conversation

orrsella
Copy link
Contributor

@orrsella orrsella commented Nov 7, 2014

Had a missing .scala file extension.

Brian Degenhardt and others added 30 commits October 29, 2014 15:31
Fixes to some of the more trivial compiler warnings in finagle and
util.

RB_ID=473425
Problem

finagle's implementation of SSL does not use Server Name Indication
(SNI).

Solution

finagle must use SSLEngine, not SSLSockets, because SSLSockets have
mostly blocking operations.  In order to use SNI with SSLEngine,
we must supply a peerHost and peerPort to SSLContext.  This means
threading a host and port through to SSL connection establishment
time.  The only reference that we have to the downstream is the
SocketAddress, so we use that by default--however, if the client
specifies a hostname to verify against, we override the host with
that one.

Result

SNI is automatically turned on for all clients that enable TLS.
When host verification is also turned on, we use the same host for
host verification and SNI.

RB_ID=481449
Signed-off-by: Travis Brown <tbrown@twitter.com>

RB_ID=483177
 Problem

 finagle-kestrelx's java src is failing checkstyle

 Solution

 fix the style issues

RB_ID=484231
tl;dr: 0.5.0 has a bug, 0.5.0-1 doesn't.  See THRIFT-223 for the explanation of the problem.

Exceptions:
scrooge-generator (0.8, only used by eventbus-thrift and some leaves)
scrooge-benchmark (0.8)
ostrichagg (0.8, I don't know why, but I believe it's a leaf)

Note this *downgrades* scrooge-core from 0.8 to 0.5.

libthrift 0.5.0 and 0.5.0-1 were published by twitter, not Apache.
They're publicly available on maven.twttr.com.

Note the 0.5.0 pom.xmls (thrift and org.apache.thrift) are empty.  The
0.5.0-1 pom.xml has dependencies on slf4j-api and commons-lang.

RB_ID=474051
We need to backport changes from finagle-http to finagle-httpx. This
change comes from 81e920d.

From RB=431323:

> Problem
>
> How and where Finagle initializes a its Tracing infrastructure is
> unlcear and difficult to reason about. Each Codec is charged with
> decoding trace info, pushing the Tracer, tracing send/recv events, and
> service name annotations. This caused issues where traces were
> recorded before a tracer was pushed to the stack or before the proper
> TraceId was decoded out of the protocol, thus creating broken traces.
>
> Solution
>
> Charge the codecs with simply initializing the trace id by decoding it
> from the protocol. Let finagle handle tracing service name and send/recv
> events.
>
> Result
>
> Traces work the right way by default

RB_ID=484289
* Adds missing DefaultServer override for newTraceInitializer.
* In the future, we should switch to Stack (see RB).

RB_ID=484875
…eterize resolver discovery

 Problem

 Finagle-memcached and finagle-memcachedx each define a resolver,
 'TwitterCacheResolver', for the twcache scheme. This leads to an
 unrecoverable runtime failure for packages that pick up both libraries
 due to the conflicting resolvers registering for the same scheme.

 Solution

 Special-case the 'twcache' scheme to whitelist it for finagle-core's
 Resolver duplicate checking.

 The reason for this hack is that no other solution is deployable
 in the multi-mono-repo world without one-shotting Science in a
 branch (as confirmed in extensive discussion with jw and moses).
 The plan for an incremental rollout for finagle-memcachedx precludes
 a 2.10 style upgrade so that leads us to the proposed bandaid. The
 good news is that this bandaid has clearly defined lifespan, the
 duration of the netty4 migration.

 Trying to write tests to verify the new filtering logic was, as far
 as I could determine, impossible because the Resolver hardcodes a
 ServiceLoader call to discover resolvers defined in resource files,
 all of which happens at static init time of a private member. As a
 result this RB also introduces a generalized version of the Resolver
 object which is paramterized by a Seq[Resolver] producing thunk
 which allows us to assert signifcant properties of Resolver such
 as: defining multiple resolvers for the same schemes throws and
 that twcache is in fact a special case. That change isn't essential
 and if gets too much pushback I'm okay to drop it as the important
 result here is to unblock the finagle-memcachedx rollout.

 Result

 1) Netty4 migration is unblocked and we can proceed with porting
 services to finagle-memcachedx incrementally without threat of
 runtime TwitterCacheResolver scheme collision related failures.
 2) Implicit assumption that the TwitterCacheResolver defined in
 finagle-memcached and finagle-memcachedx will not diverge.
 3) More testable Resolver

RB_ID=484357
 syncing

 commit a403a0a7d6689a60905a6b39ff860673905bc815
 Author: Evan Jones <ejones@twitter.com>
 Date:   Thu Oct 9 22:28:48 2014 +0000

     finagle-memcached: KetamaClient: Add test for concelled requests

     Adds unit test for commit 4c29ba95b3

     RB_ID=473329

 to finagle-memcachedx

RB_ID=485881
TBR=true
In order to get the request context change[1] shipped, we need to
introduce a transitional API so that we can introduce it to science
without having an intermediate broken state.

I have a separate branch in science that makes changes according to
this, and they seem to be all that's required.

[1] https://reviewboard.twitter.biz/r/469081/

RB_ID=480925
Problem

This defines a stat per host/port endpoint in the emwa load balancer,
which has GC characteristics of per-host stats.  IE: large cluster
rollouts to mesos will GC and repromote a ton of stats.

Solution

Deleting this stat is acceptable since it's mostly for load balancer
debugging.

RB_ID=483375
…statsReceiver work in action

RB_ID=487717
TBR=true
NO-QUEUE=true
 define sbt builds for finagle-memcachedx, finagle-kestrelx and finagle-httpx

RB_ID=485655
Problem

We're no longer cross-building against 2.9.2, but that's what's in the Travis
configuration. The SBT build is also missing the ScalaCheck dependency.

Solution

Update the version, add the dependency.

Result

Tests run as expected on Travis.

RB_ID=484635
Problem

util-logging exported a dependency on scalatest. This causes broken
builds when used in Science, which is still on Scalatest 1.x.

Solution

Factor this out into its own package, and use selectively.

before% ./pants goal dependencies twitter-server/src/main/scala | grep scalatest
3rdparty:scalatest
org.scalatest:scalatest_2.10:2.2.2
before%

after% ./pants goal dependencies twitter-server/src/main/scala | grep scalatest
after%

RB_ID=482011
…uest headers

Problem:

Finagle encodes Dtab.local into request headers. However, it's possible, for instance if a request object is accidentally reused, to add many Dtabs to a request.

Solution:

Ensure that a request is stripped of Dtab headers before appending Dtab.local to Http requests.

RB_ID=478527
A few (largely cosmetic) aspects of the file ServerSet2.scala are
reworked:

- The hierarchy of implementations of the ServerSet2 trait is
  flattened. The reasoning for this is that it simplifies the interface
  one uses to construct ServerSet2 instances. Rather than a mismatched
  variety of constructors and factory functions, you simply pass a
  `Var[Zk]` to the ServerSet2 constructor. This has the downside of
  foundering the separation of concerns of Var and Zk from the
  ServerSet2 trait, which arguably goes against the desire to separate
  the service discovery library from ZooKeeper. However, I think that an
  easier-to-reason-about ServerSet2 class outweighs the benefits of
  wholly separating each of these concepts. Especially given plans for
  follow-on passes over this code to factor out the ZK bits.

- com.twitter.util.Memoize is used to remove the need for a `var` to
  cache ServerSet2 instances

- More documentation is added, specifically the help message of the
  `com.twitter.newZk` flag and a number of public APIs' Scaladocs

RB_ID=481107
Problem

Working with P2C was becoming unwieldy. It does not separate concerns
very well.

Solution

Split up the responsibilities of balancing, updating, node
maintenance, and state maintenance.

RB_ID=471979
Problem

A lack of documentation on `RetryPolicy.tries` leaves the
exercise to readers.

Solution

Improve the documentation and fix some minor broken windows.

RB_ID=488821
Problem

finagle-serversets currently emits four histograms for each path that
it's asked to look up. For services that resolve large numbers of
serversets (namely TFE and the Wily Nameserver), this explodes into
thousands of metrics. There are no known usages of these stats for
monitoring purposes, so it's a huge waste and memory pressure burden.

Solution

Remove the per-path histograms in favor of four aggregated read- and
parse-latency histograms for entries and vectors.

Result

The number of stats emitted under the "zk2" scope is vastly
reduced. The per-path "size" and "limbo" stats are maintained, as are
those emitted by the InetResolver that is embedded within the
Zk2Resolver.

RB_ID=484443
…tten test in finagle-kestrel Problem: finagle-kestrel test in MultiReaderTest fails because the empirically determined order of messages is not always true.

  Solution:
  Keep track of the messages in a Set instead of assuming any specific
  order.

  Result:
  Test passes.

RB_ID=488487
….rollup statsReceiver work in action"

RB_ID=490665
Problem

We need a way to inject the ewma load metric per client.

Solution

Create a WeightedLoadBalancerFactory that instantiates a
P2CBalancerPeakEwma. This now allows us to inject it via
LoadBalancerFactory.Param (and ClientBuilder).

RB_ID=490021
mosesn and others added 19 commits October 29, 2014 15:31
Problem

Services that are configured with TimeoutAndWriteExceptionsOnly
expect to retry on request timeouts, but don't.

Solution

Change RetryPolicy behavior to retry on request timeouts properly.

RB_ID=491721
RB_ID=493423
TBR=true
NO-QUEUE=true
RB_ID=493469
TBR=true
NO-QUEUE=true
RB_ID=493887
TBR=true
NO-QUEUE=true
Problem

Stacks are inherently stateless and thus shared between
finagle clients and servers. The way we inferred the
parameters that a module used to configure itself
attached state to a Stack and violated this guarantee.

Solution

Introduce a `parameters` field in Stack.Head which servers
as documentation and can be dynamically introspected to
determine the fields and values that a module is interested
in. To make it easy to do the right thing, abstract definitions
for `Stackables` now take type parameters that map to the
parameters a module is interested in.

Result

We can now collect interests per-module:

val stk = ...
val params = ...

val parameters = for {
 n <- stk.tails
 interest <- n.head.parameters
} yield params(interest)

RB_ID=484747
Problem

PartialFunction has an isDefinedAt which must be called before
proper invocation.  However, PartialFunction#isDefinedAt and
PartialFunction#apply often entail many of the same costs (ie object
allocation due to unapply for literal PartialFunctions).

Solution

PartialFunction has a helper PartialFunction#applyOrElse so that
we only have to call it once.  However, applyOrElse takes a function,
so we must make singleton all-purpose functions to avoid extra
allocations.

Result

PartialFunction evaluation is cheaper.

RB_ID=492081
TwitterRPC is a library for constructing RPC clients and servers. Its
APIs facilitate high-level interprocess communication based on
objectives which are used to ensure timely service and resiliency.

By providing a narrow surface area of configuration, TwitterRPC aims
to remove much of the configuration burden from service owners and to
simplify application code. It does so via service level objectives
[1], which allow applications to define their expectations in
high-level terms without having to fret about specific
technology-dependent configuration parameters.

See [2] for more information.

[1] http://en.wikipedia.org/wiki/Service_level_objective
[2] http://go/twitterrpc

RB_ID=454939
…e size()

Problem
ConcurrentRingBuffer lacks useful features like the ability to to peek() at the head element and find out the size().

Solution
- Add size() as an approximate number of elements, which is correct in absence of modifications (quiescently consistent) and is approximately correct at other times,
- Add tryPeek() that works just like tryGet() except has no side effects

RB_ID=491185
Problem

The `{Client,Server}Builder.stack` functions that aren't deprecated
don't have scaladocs. Also, the old `ServerBuilder.stack` that takes a
`Params => Server` isn't @deprecated, which I'm assuming was an
oversight.

Solution

Add appropriate scaladocs for `{Client,Server}Builder.stack` and
@deprecate `stack(Params => Server)`.

RB_ID=490235
Problem

FailFastFactory is a bit light on stats, as it only tracks for
pathological cases (i.e. marked_dead, unhealthy_{for_ms,num_retries}).

Solution

Add a marked_available stat to track the dead->available state change.

RB_ID=488429
 Problem

 Finagle-stats tests are failing non-deterministically because of
 weak-reference semantics in the Metric's registry.

 Solution

 Keep strong reference to metrics in test code.

 Result

 Tests deterministically succeed.

RB_ID=495295
Problem

There was 1) inaccuracy in package doc for mux, 2) code duplicate, and 3) inconsistency in stat format

Solution

1) Correct the doc (finagle/finagle-mux/src/main/scala/com/twitter/finagle/mux/package.scala)
2) Remove the duplicate by calling a proper function
3) Make it consistent with the other stats

Result

discountWin is now logged as "21923064.bytes" rather than "21923064" (no other change in behavior.)
now the doc is consistent with the code, and the code looks a little better.

RB_ID=476213
Problem

We do not record stats on FailureAccrual removals (and revivals)

Solution

Modify failure accrual to take a statsReceiver and use that to log
stats.  This highlights the problem that FailureAccrual is essentially a
factory Transformer that can do whatever it wants, so I additionally
deprecated that part of the API because I feel that it's too
open-ended (and certainly impossible to add stats to).  Instead, users
are encouraged to just supply failure accrual params and let us take
care of the transforming.

RB_ID=487093
Problem

The junit test runner for scala needs classes to have
the `@RunWith(classOf[JUnitRunner])` annotation attached
to classes in order to run them.

Solution

Add the annotation.

RB_ID=495921
Problem:

Weighted unions of names cannot be expressed in NameTrees.

Solution:

Change Union to contain (Double, NameTree[T]) pairs, fix parser and code
which walks NameTrees.

Result:

Weighted unions can be expressed in NameTrees.

RB_ID=494141
Problem

Content-Length headers are not removed by Netty's compressors. This
causes content which is streamed, chunked, compressed, and also has a
Content-Length header to respond with the wrong value.

Solution

Always remove content length headers when streaming replies.

Discussion

This is a bad violation of modularity. Arguably the issue lies in
Netty here, and we should follow up.

RB_ID=496263
Had a missing `.scala` file extension.
@mosesn
Copy link
Contributor

mosesn commented Dec 29, 2014

Hmm, I think I might be able to handle this myself actually, since github exposes a patch file for each commit. I'll try using https://github.com/orrsella/finagle/commit/e2c3521a567ba9bc2c27dc4a5d5a265c2f350614.patch and report back.

@mosesn
Copy link
Contributor

mosesn commented Feb 6, 2015

This made it in too!

@mosesn
Copy link
Contributor

mosesn commented Mar 3, 2015

Finally made it in! bed163e

@mosesn mosesn closed this Mar 3, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

9 participants