Performance regression with 'overlay' driver and privileged containers #1404

Closed
jeanbza opened this issue Jul 20, 2017 · 24 comments

Comments

@jeanbza

jeanbza commented Jul 20, 2017

Bug Report

  • Concourse version: 3.3.1
  • Deployment type (BOSH/Docker/binary): binary
  • Infrastructure/IaaS: AWS
  • Browser (if applicable): chrome
  • Did this used to work? Yes

We are noticing that various tasks hang for 1-10 minutes before executing. (The UI doesn't show the 'loading' icon spinning; it just hangs.)

Our setup:

  • 2 workers with 16 GB RAM and 4 cores, 1 web VM with 8 GB RAM and 4 cores (IIRC)
  • Postgres db lives on the web VM
  • Workers and web never get above 10% CPU
  • Three pipelines with < 5 jobs, 1 pipeline with ~20 jobs, some of which have 3000+ builds (most are low hundreds though)

Things we have tried:

This was also seen by @krishicks in the Slack concourse#general channel, I believe.

@krishicks
Contributor

I have seen this issue repeatedly.

@pms1969

pms1969 commented Jul 21, 2017

I've seen this as well. I was just on Slack and was directed here.

Concourse configuration: binary deployment to AWS clusters. 2 ATCs, 2 Linux workers, and 2 Windows workers. Internal LB for workers to talk to the ATCs, external LB for hitting the UI. CodeDeploy over regular AWS AMIs.

Slack discussion begins here: https://concourseci.slack.com/archives/C07RY25QF/p1500654086732765

As per the Slack discussion, I tend to see this leading up to problems with the Linux workers. The number of volumes grows until the workers just start giving up, and then I end up having to rebuild the workers.

Happy to try and get any other information that may be required when it's available.

@jeanbza
Author

jeanbza commented Jul 27, 2017

Dropping to v3.1.1 and recreating our db did nothing. Dropping to 2.x next; I bet it's that file system change.

@jeanbza
Author

jeanbza commented Jul 28, 2017

Dropping to 2.7.7 fixed our issues.

@markstgodard

markstgodard commented Aug 7, 2017

I was using the following and DID NOT see this issue:

  • Concourse 3.0.1 (standalone binaries)
  • Google Compute Engine VMs (Ubuntu 14.04)
  • baggage claim driver: btrfs

I recently upgraded to Concourse 3.3.4, and I DO see this issue.

I upgraded to Concourse 3.1.1 in between 3.0.1 and 3.3.4 and I don't believe I saw the issue, but I could be mistaken.

Note: we are using the default baggage claim driver: overlay fs in 3.3.4

It seems like certain Jobs are affected by it (more so than others).

I have one job that builds a docker image (see below)

All the tasks run quickly, with no lag, but when it gets to the docker-image-resource "put" task for "app-image" it takes 6+ minutes for it to start spinning.

During that 6+ minutes, it visually just looks like it has not started.

I also see that there is a container listed for that task when I run fly containers.
However, I cannot intercept into it (I get an SSH bad handshake, so I assume it's because it has NOT yet spun up the container?).

- name: build-image-backend
  serial_groups: [build-docker]
  plan:
  - aggregate:
    - get: ci-runner-image
    - get: ci-scripts
    - get: ci-secrets
    - get: backend-version
      params: {bump: patch}
    - get: backend-dev
      passed: [unit-tests-backend]
      trigger: true
  - aggregate:
    - get: base-backend-image
      params: {save: true}
      passed: [build-base-backend]
    - task: prep-docker-image
      image: ci-runner-image
      file: ci-scripts/tasks/backend/prep-docker-backend.yml
  - put: backend-image
    params:
      load_base: base-backend-image
      build: output
      tag: backend-version/number
      tag_as_latest: true
    get_params:
      skip_download: true
  - put: backend-version
    params:
      file: backend-version/number

Before:
image

After:
image

During these 6+ minutes, I ran a "watch" on fly volumes for the container handle that is associated with that put task. I see a few entries in fly containers for that handle; then, after about 6 minutes, a new volume entry shows up in watch fly volumes | grep the-container-handle. So it seems like it's related to the filesystem or volumes?
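
The manual watch-and-grep check described above could be scripted roughly as follows. This is only a sketch: the fly target name "ci", the helper names, and the polling interval are assumptions, and it looks for the handle as a plain substring of the command's text output rather than relying on any particular column layout.

```python
import subprocess
import time


def handle_listed(output: str, handle: str) -> bool:
    """Return True if any line of the command output mentions the handle."""
    return any(handle in line for line in output.splitlines())


def fly_volumes(target: str = "ci") -> str:
    """Run `fly -t <target> volumes` and return its text output.

    The target name "ci" is a placeholder for whatever target you logged in with.
    """
    result = subprocess.run(
        ["fly", "-t", target, "volumes"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def seconds_until_listed(handle, list_volumes=fly_volumes,
                         interval=5.0, timeout=900.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll until the handle shows up; return elapsed seconds, or None on timeout.

    list_volumes is injectable so the polling logic can be exercised
    without a real Concourse deployment.
    """
    start = clock()
    while True:
        if handle_listed(list_volumes(), handle):
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

Calling seconds_until_listed("the-container-handle") while the put step appears stuck would give a rough measurement of how long the volume takes to appear, which can then be compared across drivers or versions.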

It would be great to get this fixed. Also, if there is something I could look into (logs or anything else) to help track this down, that would be appreciated.

We got a lot of gains with the new caches feature, but lost them in some CI pipelines because of this noticeable lag :)

Cheers

@vito
Member

vito commented Aug 14, 2017

With the switch to overlay by default in 3.1, the tradeoff is that privileged tasks/resources (i.e. the Docker Image resource) have a performance penalty.

Unfortunately with container tech as it is today, you either get instability (btrfs) or slowness (overlay). We chose the latter.

Here are a few paths forward:

  • Improve the UI feedback, so you at least know what it's doing and can know it's not stuck (so more than just a spinner). We've been generally in need of finer-grained progress indicators on the build page. (/cc @jma @Lindsayauchin)
  • Look for optimizations in the runtime (/cc @topherbullock).
  • Pray to the kernel gods that shiftfs gets merged and we can kill this nasty performance overhead.
  • Add the ability to limit the inputs to a put, if it's the case that it's taking so long to transfer data that isn't actually needed by the task/put. See #1202 ("I should be able to control which artifacts a put receives").

@krishicks
Contributor

I'd prefer the instability over the slowness.

This slowness means we end up waiting 3-5 minutes regularly for tasks to start.

When you add to that the fact that volumes in 3.0.1+ often get reaped sooner than they should, and we have to download 7GB files again from the Internet, the time it takes jobs to run becomes absurd.

@vito
Member

vito commented Aug 15, 2017 via email

@krishicks
Contributor

I'll switch our driver back to btrfs for now.

I've added #1475. I was told previously to make a reproducible test case, but I haven't prioritized doing that.

@jeanbza
Author

jeanbza commented Aug 23, 2017

Also switched ours to btrfs. Hope that shiftfs gets in!

@vito vito changed the title from "Tasks hanging" to "Performance regression with 'overlay' driver and privileged containers" Oct 2, 2017
@vito vito removed the web-ui label Oct 2, 2017
@topherbullock topherbullock added this to Backlog in Runtime Nov 3, 2017
@topherbullock topherbullock moved this from Backlog to Icebox in Runtime Nov 3, 2017
@timrchavez
Contributor

Did the folks who switched back to btrfs notice a performance improvement? Have you noticed increased instability? cc: @krishicks, @jadekler

FWIW the slowness and the lack of feedback drive dev-folk here bonkers.

@krishicks
Contributor

krishicks commented Dec 28, 2017 via email

@jeanbza
Author

jeanbza commented Dec 31, 2017

@timrchavez We definitely saw performance improvements switching back. I'm a couple months removed from managing concourse now so I'm unfortunately unable to remember specifics :(

@vito
Member

vito commented Jan 29, 2018

For anyone not following along in #1966, we found and fixed the source of a lot of the btrfs instability that led us to switching to overlay in the first place. We can now more easily recommend that people switch back to it with the next release of Concourse (3.9), and we'll consider changing the default in the future. Until we either change the default or find a way to improve the overlay performance (not likely), I'll leave this issue open.

@timrchavez
Contributor

timrchavez commented Feb 13, 2018

I can now confirm that we see a dramatic performance improvement by switching back to btrfs. No more long pauses before a dind task starts running. Looking forward to the v3.9.0 release to get those stability fixes.

@vito vito moved this from Icebox to Backlog in Runtime Mar 19, 2018
xtremerui pushed a commit that referenced this issue Mar 22, 2018
#1404

Submodule src/github.com/concourse/baggageclaim a6dbd0b..71a864b:
  > Change the default driver to btrfs

Signed-off-by: Shash Reddy <sreddy@pivotal.io>
@shashwathi shashwathi moved this from Backlog to In Flight in Runtime Mar 22, 2018
@shashwathi shashwathi moved this from In Flight to Done in Runtime Mar 22, 2018
@topherbullock topherbullock moved this from Done to Accepted in Runtime Mar 28, 2018
@vito vito added the accepted label Mar 29, 2018
@vito vito closed this as completed Mar 29, 2018
Runtime automation moved this from Accepted to Done Mar 29, 2018
@vito vito modified the milestone: v3.10.0 Mar 29, 2018
@ezraroi

ezraroi commented Apr 3, 2018

So after upgrading to 3.10.0 I removed the CONCOURSE_BAGGAGECLAIM_DRIVER='btrfs' env var from our workers, and it looks like the issue has returned: we see build steps spinning for 10+ minutes.

@xtremerui
Contributor

@ezraroi refer to https://github.com/concourse/baggageclaim/blob/68422aa9cea1e86cf925e4aacbf696bfd81a0545/baggageclaimcmd/driver_linux.go#L43

Is it possible that for some reason your worker falls into the kernelSupportsOverlay branch and picks up the overlay driver again? By default, if it detects that your file system supports btrfs, it will use that.
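
To make the question above concrete, the detection order can be sketched like this. It illustrates the behaviour described in this comment and is not a copy of the linked baggageclaim code: the function names, the "detect" value, and the "naive" fallback are assumptions, and the filesystem lookup simply parses /proc/mounts-style text.

```python
from typing import Optional


def fs_type_for_path(mounts_text: str, path: str) -> Optional[str]:
    """Return the filesystem type of the longest mount point containing path.

    mounts_text uses the /proc/mounts layout:
    "device mountpoint fstype options ...".
    """
    best_mount, best_type = "", None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Later entries with the same mount point shadow earlier ones.
        if wanted.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def pick_driver(requested: str, volumes_fs_type: Optional[str],
                kernel_supports_overlay: bool) -> str:
    """Pick a volume driver: btrfs first, then overlay, then a plain fallback."""
    if requested != "detect":
        return requested  # an explicit setting always wins (assumed)
    if volumes_fs_type == "btrfs":
        return "btrfs"
    if kernel_supports_overlay:
        return "overlay"
    return "naive"  # hypothetical name for the fallback driver
```

Under these assumptions, a worker whose volumes directory sits on ext4 with an overlay-capable kernel ends up on overlay once the explicit btrfs setting is removed, which would be consistent with the slowdown coming back.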

@adarshaj

I still see this issue in 3.14.1 even after switching to the btrfs driver. The job just keeps waiting and sometimes shows 'waiting for docker to come up..' for more than 15 minutes.

@krishicks
Contributor

I feel the "waiting for docker to come up" error is unrelated. I have, however, also seen that on 3.14.1. The job was waiting for 15 hours.

@adarshaj

Ah okay, maybe it was just a coincidence in my case then? I ran 3.14.1 with overlay a few times and hadn't seen it; after switching to btrfs I saw it in every run, so I probably made the mistake of inferring causation from correlation.

Any workaround for it?

vito added a commit that referenced this issue Aug 15, 2018
  < Write should fail when the stream was done but context wasn't cancelled. (#1792)
  < Explain target format in DialContext's documentation (#1785)
  < gzip: add Name const to avoid typos in usage (#1804)
  < remove .please-update (#1800)
  < Documentation: update broken wire.html link in metadata package. (#1791)
  < Document that all errors from RPCs are status errors (#1782)
  < update const order (#1770)
  < Don't set reconnect parameters when the server has already responded. (#1779)
  < credentials: return Unavailable instead of Internal for per-RPC creds errors (#1776)
  < Avoid copying headers/trailers in unary RPCs unless requested by CallOptions (#1775)
  < Update version to 1.10.0-dev (#1777)
  < compare atomic and mutex performance for incrementing/storing one variable (#1757)
  < Fix flakey test. (#1771)
  < grpclb: Remove duplicate init() (#1764)
  < server: fix bug preventing Serve from exiting when Listener is closed (#1765)
  < Fix TestGracefulStop flakiness (#1767)
  < server: fix race between GracefulStop and new incoming connections (#1745)
  < Notify parent ClientConn to re-resolve in grpclb (#1699)
  < Add dial option to set balancer (#1697)
  < Fix test: Data race while resetting global var. (#1748)
  < status: add Code convenience function (#1754)
  < vet: run golint on _string files (#1749)
  < examples: fix concurrent map accesses in route_guide server (#1752)
  < grpc: fix deprecation comments to conform to standard (#1691)
  < Adjust keepalive paramenters in the test such that scheduling delays don't cause false failures too often. (#1730)
  < fix typo (#1746)
  < fix stats flaky test (#1740)
  < relocate check for shutdown in ac.tearDown() (#1723)
  < fix flaky TestPickfirstOneAddressRemoval (#1731)
  < bufconn: allow readers to receive data after writers close (#1739)
  < After sending second goaway close conn if idle. (#1736)
  < Make sure all goroutines have ended before restoring global vars. (#1732)
  < client: fix race between server response and stream context cancellation (#1729)
  < In gracefull stop close server transport only after flushing status of the last stream. (#1734)
  < Deflake tests that rely on Stop() then Dial() not reconnecting (#1728)
  < Switch balancer to grpclb when at least one address is grpclb address (#1692)
  < Merge pull request #1724 from grpc/jtattermusch-patch-1
  < codes: Add UnmarshalJSON support to Code type (#1720)
  < naming: Fix build constraints for go1.6 and go1.7 (#1718)
  < remove stringer and go generate (#1715)
  < Add WithResolverUserOptions for custom resolver build options (#1711)
  < Fix grpc basics link in route_guide example (#1713)
  < Optimize codes.String() method using a switch instead of a slice of indexes (#1712)
  < Disable ccBalancerWrapper when it is closed (#1698)
  < Refactor roundrobin to support custom picker (#1707)
  < Change parseTimeout to not handle non-second durations (#1706)
  < make load balancing policy name string case-insensitive (#1708)
  < protoCodec: avoid buffer allocations if proto.Marshaler/Unmarshaler (#1689)
  < Add comments to ClientConn/SubConn interfaces to indicate new methods may be added (#1680)
  < client: backoff before reconnecting if an HTTP2 server preface was not received (#1648)
  < use the request context with net/http handler (#1696)
  < transport: fix race sending RPC status that could lead to a panic (#1687)
  < Fix misleading default resolver scheme comments (#1703)
  < Eliminate data race in ccBalancerWrapper (#1688)
  < Re-resolve target when one connection becomes TransientFailure (#1679)
  < New grpclb implementation (#1558)
  < Fix panics on balancer and resolver updates (#1684)
  < Change version to 1.9.0-dev (#1682)
  < set context timeout when Timeout value >= 0 (#1678)
  < switch balancer based on service config info (#1670)
  < Add proper support for 'identity' encoding type (#1664)
  < update code_string.go for new stringer changes (#1674)
  < addrConn: set ac.state to TransientFailure upon non-temporary errors (#1657)
  < Eliminate race on ac.acbw (#1666)
  < Corrected documentation on Server.Serve (#1668)
  < Update picker doc when returned SubConn is not ready (#1659)
  < travis: fix GOARCH=386 and add misspell check (#1658)
  < Add context benchmarks (#1610)
  < Add protoc command to example/readme (#1653)
  < Implement transparent retries for gRFC A6 (#1597)
  < server: add EXPERIMENTAL tag to grpc.ConnectTimeout (#1652)
  < *: replace deprecated grpc.Errorf calls with status.Errorf (#1651)
  < server: apply deadline to new connections until all handshaking is completed (#1646)
  < codec_benchmark_test: fix racy unmarshal behavior and make some cleanups (#1642)
  < Speed-up quota pools. (#1636)
  < Check ac state shutdown before setting it to TransientFailure (#1643)
  < vet.sh: don't check git status when doing -install (#1641)
  < latency: Listen on localhost:0 instead of :0 in test (#1640)
  < reduce timeout for tests to 5m (7m for testrace) (#1635)
  < Introduce new Compressor/Decompressor API (#1428)
  < Fix settings ack race (#1630)
  < Update examples/README.md (#1629)
  < Get method string from stream (#1588)
  < fix max msg size type issues on different arch (#1623)
  < Deflake roundrobin TestOneServerDown, and fix test error messages (#1622)
  <  Remove self-imposed limit on max concurrent streams if the server doesn't impose any. (#1624)
  < Acquire all stream related quota and cache it locally since no more than one write can happen in parallel on stream (#1614)
  < Make travis 32-bit actually work (#1621)
  < balancer: reduce chattiness (#1608)
  < Revert "cap max msg size to min(max_int, max_uint32) (#1598)" (#1619)
  < cap max msg size to min(max_int, max_uint32) (#1598)
  < Fix parseTarget for unix socket address without scheme (#1611)
  < Fix connectivity state transitions when dialing (#1596)
  < Update go_package declarations (#1593)
  < ClientHandshake should get the dialing endpoint as the authority (#1607)
  < Add functions to ClientConn so it satisfies an interface for generated code (#1599)
  < Re-add support for Go1.6 (#1603)
  < Make passthrouth resolver the default instead of dns (#1606)
  < Fix goroutine leak in grpclb_test (#1595)
  < Add go report card (#1594)
  < Parse ServiceConfig JSON string (#1515)
  < Register and use default balancers and resolvers (#1551)
  < fix misspell (#1592)
  < Serve() should not return error on Stop() or GracefulStop() (#1485)
  < Remove single-entry var blocks (#1589)
  < update fail fast documentation to remove retry language (#1586)
  < Create versioning and release policy document (#1583)
  < Skip proxy_test in race mode (#1584)
  < transport: minor cleanups (comment and error text) (#1576)
  < Use proto3 in interop tests and end2end tests (#1574)
  < Change version to 1.8.0-dev (#1573)
  < Make resolver Build() take a target struct (#1567)
  < Revert "Temporary disable staticcheck" (#1568)
  < Update UnknownServiceHandler comment to be clearer about interceptor behavior (#1566)
  < transport: fix racey send to writes channel in WriteStatus (#1546)
  < fix stats test race (#1560)
  < Run tests without -v (#1562)
  < Remove Go1.6 support (#1492)
  < Temporary disable staticcheck (#1561)
  < fix TestServerCredsDispatch and stats test race (#1554)
  < Make interop client dial blocking (#1559)
  < benchmark: add type assertion benchmarks (#1556)
  < fix typo and lint (#1553)
  < transport: refactor of error/cancellation paths (#1533)
  < New implementation of roundrobin and pickfirst (#1506)
  < Update format string to match type (#1548)
  < add comment to dns package (#1545)
  < Make IO Buffer size configurable. (#1544)
  < Use the same hpack encoder on a transport and share it between RPCs. (#1536)
  < DNS with new API (#1513)
  < update markdown render (#1542)
  < Revert "Added localhost to net.Listen() calls to avoid macOS firewall dialog." (#1541)
  < Added localhost to net.Listen() calls to avoid macOS firewall dialog. (#1539)
  < transport: remove some defers (#1538)
  < Use Type() method for OAuth tokens instead of accessing TokenType field. (#1537)
  < benchmark: add primivites benchmark for Unlocking via defer vs. inline (#1534)
  < benchmain: format output of benchmark to a table (#1493)
  < Fix misspells (#1531)
  < vet.sh: set PATH to force downloaded binaries to be run (#1529)
  < Fix format error on travis (#1527)
  < Move primitives benchmarks to package primitives_test (#1522)
  < Speed up end to end tests by removing an unnecessary sleep (#1521)
  < Change quota version to uint32 instead on uint64 (#1517)
  < Fix deadline error on grpclb streams (#1511)
  < Dedicated goroutine for writing. (#1498)
  < benchmark: add primitives benchmarks for informational purposes (#1501)
  < Truncate payload trace string, and turn trace off by default (#1509)
  < Add leak goroutine checking to grpc/balancer tests (#1497)
  < Add RegisterIgnoreGoroutine to leakcheck package (#1507)
  < remove a debug print that causes deadlock (#1505)
  < vet.sh: fix protoc installation (#1502)
  < Add new Resolver and Balancer APIs (gRFC L9) (#1408)
  < Fix to avoid annoying firewall dialog on macOS (#1499)
  < Move leak check into a separate leakcheck package (#1445)
  < Change version to 1.7.0-dev (#1496)
  < Run Go1.9 and 386 on Travis (#1475)
  < Check "x/net/context" with `go vet` like "context" (#1490)
  < benchmain: add nop compressor and other usability tweaks (#1489)
  < Fix context warnings from govet. (#1486)
  < benchmain: minor bug fixes (#1488)
  < Update proto generation commands in example doc (#1481)
  < Remove expiration_interval from grpclb message (#1477)
  < balancer_test: possible ctx leak, cancel before break (#1479)
  < Merge pull request #1476 from dfawley/pkg
  < Fix for 32-bit architectures (#1471)
  < When sending a non heads-up goaway close the connection if there are no active streams. (#1474)
  < Remove unnecessary function handleStreamSuspension (#1468)
  < fix grpclb protos to not cause re-registration of types (#1466)
  < transport: fix handling of InTapHandle's returned context (#1461)
  < the cancel function should be called to avoid ctx leak (#1465)
  < add comment (#1464)
  < Remove buf copy when the compressor exist (#1427)
  < transport: Fix deadlock in client keepalive. (#1460)
  < benchmark: add benchmain/main.go to run benchmark with flag set (#1352)
  < stats: add methods to allow setting grpc-trace-bin and grpc-tags-bin headers (#1404)
  < deduplicate dns record in lookup (#1454)
  < Add -u to  installation command (#1451)
  < addrConn: change address to slice of address (#1376)
  < go-generate pb.go files and check in Travis to make sure they don't change (#1426)
  < Fix host string passed to PerRPCCredentials (#1433)
  < metadata: Remove NewContext and FromContext for gRFC L7 (#1392)
  < Add status details support to server HTTP handler (#1438)
  < put *gzip.Writer back to pool (#1441)
  < Automatic WriteStatus for RecvMsg/SendMsg error on server side (#1409)
  < Update ServerInHandle comments (#1437)
  < Server should send 2 goaway messages to gracefully shutdown the connection. (#1403)
  < Add and use connectivity package for states (#1430)
  < Add 'experimental' note to ServeHTTP godoc (#1429)
  < Document Server.ServeHTTP (#1406)
  < Set peer before sending request (#1423)
  < Fix missing and wrong license (#1422)
  < Fix a goroutine leak in DialContext (#1424)
  < Use `NewOutgoingContext ` in the metadata doc (#1425)
  < Fix typo
  < Add flags for tls file path (#1419)
  < Change comment on stats.End.Error (#1418)
  < Call cancel on contexts in tests (#1412)
  < Don't use 64-bit integers with atomic. (#1411)
  < benchmark: don't stop timer until after workers are done (#1407)
  < Validate send quota again after acquiring writable channel (#1367)
  < Use log instead of grpclog in routeguide example (#1395)
  < Revert "Make all "grpc-" metadata field names reserved (#1391)" (#1400)
  < Enabling client process multiple GoAways (#1393)
  < Assign testdata path to correct variable (#1397)
  < Do not call testdata.Path when defining flags (#1394)
  < Make all "grpc-" metadata field names reserved (#1391)
  < remove defer funtion in recvBufferReader Read method (#1031)
  < Add testdata package and unify testdata to only one dir (#1297)
  < DNS resolver (#1300)
  < Expose ConnectivityState of a ClientConn. (#1385)
  < status: Add WithDetails and Details functions (#1358)
  < benchmark: remove multi-layer for loop (#1339)
  < transport: fix minor typo in http2_server.go (#1383)
  < Add doc in default implementation fatal functions on os.Exit() (#1365)
  < Fix bufconn.Close to not be blocking. (#1377)
  < Do not create new addrConn when connection error happens (#1369)
  < Change version to 1.6.x (#1382)
  < Revert "Use bufconn in end2end tests." (#1381)
  < Fix logging method (#1375)
  < Use bufconn in end2end tests.
  < Create bufconn package for a local, buffered net.Conn and dialer/listener
  < Fix a typo in examples/gotutorial.md (#1374)
  < Use log severity and verbosity level (#1340)
  < fix deadlock of roundrobin balancer (#1353)
  < Ignore goroutines spanwned by log.init during leakcheck. (#1368)
  < Populate callInfo.peer object for streaming RPCs (#1356)
  < BDP estimation and window update. (#1310)
  < Canonicalize https://grpc.io as the preferred URL prefix
  < Update leckCheck to ignore non-gRPC goroutine introduced in Go1.9 (#1351)
  < Do not flush NewStream header on client side for unary RPCs and streaming RPCs with requests. (#1343)
  < adjust import order (#1311)
  < add license for some proto files (#1322)
  < latency: sleep in Write when BDP is exceeded to avoid buffer bloat (#1330)
  < Add documentation to deprecate WithTimeout dial option (#1333)
  < change objects in recvBuffer queue from interface to concrete type to reduce allocs (#1029)
  < Catch invalid use of Server.RegisterService after Register.Serve (#828)
  < benchmark: add latency/MTU/bandwidth into testcases (#1304)
  < Updated documentation of ClientStream. (#1320)
  < Add support for grpc.SupportPackageIsVersion3 back (#1331)
  < Deflake TestServerGoAway (#1321)
  < dont create new reader in recvMsg (#940)
  < Make Apache 2.0 LICENSE file a verbatim copy (#1329)
  < Protect bytesSent and bytesReceived with mutex to avoid datarace (#1318)
  < Add Severity and VerboseLevel to grpclog. (#922)
  < update LICENSE (#1312)
  < fix spell (#1314)
  < Add goroutine safety doc on stream (#1313)
  < replace 127.0.0.1 with localhost for ipv6 only environment (#1306)
  < transport: fix error handling on Stream deletion (#1275)
  < Behaviour Change: transport errors should be coded Unavailable instead of internal. (#1307)
  < Support ipv6 addresses in grpclb (#1303)
  < Return header in Stream.Header() if available (#1281)
  < add license for some files (#1296)
  < Make RPCs non-failfast in grpclb_test. (#1302)
  < Specify characters allowed in metadata keys (#1299)
  < use subtests for the benchmark_test and add it into the Makefile (#1278)
  < update the path of guide (#950)
  < Create latency package for realistically simulating network latency (#1286)
  < Deflake TestFlowContolLogicalRace (#1279)
  < Merge pull request #1290 from jtattermusch/apache_license
  < Change version to 1.5.0-dev (#1288)
  < transport: fix minor typo in 'GoAway' godoc (#1284)
  < Piggyback window updates for connection with those of a stream. (#1273)
  < Reopening: Server shouldn't Fatalf in case it fails to encode. (#1276)
  < Avoid int32 overflow when applying initial window size setting
  < Revert "Server shouldn't Fatalf in case it fails to encode. (#1251)" (#1274)
  < Server shouldn't Fatalf in case it fails to encode. (#1251)
  < Decouple transport flow control from application read. (#1265)
  < Update references to route_guide.proto to use new directory name (#1270)
  < add MaxConcurrentStreams to benchmark_test when start the server (#1271)
  < Merge pull request #1267 from jtattermusch/improve_contributing
  < re-enable handler_server in end2end test, and fix some failed tests (#1259)
  < Avoid panic caused by stdlib context package errors (#1258)
  < Initialize stream properly in handler_server. (#1260)
  < Expand stream's flow control in case of an active read. (#1248)
  < Suppress server log message when EOF without receiving data for preface (#1052)
  < Fixed comment spelling (#1254)
  < Merge pull request #1165 from lyuxuan/service_config_pr
  < clientconn, server: replace time.After with time.NewTimer (#998)
  < grpclb balancer.Close() should not panic if called more than once (#1250)
  < Add doc and example for mocking streaming RPCs (#1230)
  < Test for EmptyCallOption
  < Implement `EmptyCallOption`
  < Reuse Token for serviceAccount credentials (#1238)
  < Travis: add staticcheck (#1019)
  < Defined GA and add pointer to benchmarks (#1239)
  < call listen with "localhost:port" instead of ":port" in tests (#1237)
  < fix server panic trying to send on stream as client disconnects #1111 (#1115)
  < Eagerly set a pointer to nil to help GC (#1232)
  < add logs to grpclb on send and recv (#1235)
  < Add stats test for client streaming and server streaming RPCs (#1140)
  < Adding dial options for PerRPCCredentials (#1225)
  < Pass custom dialer to balancer (#1205)
  < Http status to grpc status conversion (#1195)
  < Calling handleRPC with context derived from the original (#1227)
  < Use pooled gzip.{Writer,Reader} in gzip{Compressor,Decompressor} (#1217)
  < tentative fix to a flow control over-give-back bug (#1170)
  < Ensure that RoundRobin.Close() does not panic. (#1139)
  < Log the actual error when inTapHandle fails in http2Server (#1185)
  < make ServerOption panic messages more clear. (#1194)
  < Make window size configurable. (#1210)
  < Reset proto before unmarshalling (#1222)
  < Merge pull request #1221 from adelez/doc_fixit
  < Fix go buildable source file problem (#1213)
  < don't add defer func if stats handler is nil (#1214)
  < Change version to 1.4.0-dev (#1212)
  < Fix nil pointer dereferences from status.FromProto(nil) (#1211)
  < Split grpclb client load report test to deflake test. (#1206)
  < Use unpadded base64 encoding for binary metadata headers; handle padded or unpadded input (#1209)
  < Never encode binary metadata within the metadata map (#1188)
  < Client load report for grpclb. (#1200)
  < Use proto.Equal for equalities on Go proto messages (#1204)
  < Update grpclb proto and move grpclb into package grpc (#1186)
  < Revert "temporary disable 1.6 on travis (#1198)" (#1199)
  < temporary disable 1.6 on travis (#1198)
  < Revert "To adhere with protocol the server should send RST_STREAM on observing timeout on a strea, (#1130)"
  < Make sure all in-flight streams close when ClientConn.Close() is called. (#1136)
  < To adhere with protocol the server should send RST_STREAM on observing timeout on a strea, (#1130)
  < Fix broken Markdown headings in examples/gotutorial.md (#1189)
  < Support proxy with dialer (#1098)
  < grpclb should connect to the second balancer (#1181)
@0815fox

0815fox commented Nov 13, 2018

I see that issue on my concourse setup, version 4.2.1. Is there any information that I should provide?

@sbkg0002

We have the same issue on 4.2.1 with the binary release.

@xtremerui
Contributor

@0815fox @sbkg0002 what filesystem are you using?

Concourse's own CI also uses overlay now, and we don't see this issue there. So your performance degradation might be caused by something other than the filesystem.

@sbkg0002
Copy link

sbkg0002 commented Dec 13, 2018

We switched the driver from overlay to btrfs this morning and the speed is back again!
Binary release on Amazon Linux 2.

@0815fox change the fs driver.
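
For anyone landing here later, a minimal sketch of what that driver change might look like on a binary-release worker. It assumes the worker reads its baggageclaim volume driver from the `CONCOURSE_BAGGAGECLAIM_DRIVER` environment variable (the `--baggageclaim-driver` flag), and that the worker's work directory already sits on a btrfs-capable filesystem. All other worker settings are omitted, and this is not necessarily the exact setup used above.

```shell
# Assumption: the worker is started from this shell or from a service unit
# that inherits this environment.
export CONCOURSE_BAGGAGECLAIM_DRIVER=btrfs

# Equivalent command-line form (other required worker flags omitted):
#   concourse worker --baggageclaim-driver btrfs ...

# Print the selected driver so it is visible in the startup logs.
echo "baggageclaim driver: ${CONCOURSE_BAGGAGECLAIM_DRIVER}"
```

The worker has to be restarted for the change to take effect, and volumes created under the old driver will likely need to be recreated.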
