runtime: aggressive GC completion is disruptive to co-tenants

### What version of Go are you using (`go version`)?

go version `devel +1e3c57c Wed Nov 16 20:31:40 2016 +0000` with CL 33093 PS 4 cherry-picked on top, with GOEXPERIMENT=preemptibleloops

### What operating system and processor architecture are you using (`go env`)?

linux/amd64

### What did you do?

I have a program that reads ~1MB records from stdin, decodes the records, and sends UDP datagrams. The program runs with around 20 goroutines. Around 40 copies of the program run on an 8-core host.

### What did you expect to see?

I expected the 95th percentile of mark termination and sweep termination pauses to be 100µs or less.

### What did you see instead?

The program's 95th percentile sweep termination time is around 60ms, and 95th percentile mark termination time is around 30ms. Here's an example gctrace line:

```
gc 11249 @74577.976s 0%: 11+185+35 ms clock, 90+143/358/582+280 ms cpu, 112->114->56 MB, 115 MB goal, 8 P
```

The mark termination pauses are very easy to identify in the execution trace—if this bug needs to be about either sweep term pauses or mark term pauses, let's use it for mark term.

Sweep termination pause distribution:

```
N 10000  sum 115454  mean 11.5454  gmean 0.602555  std dev 22.7221  variance 516.294

     min 0.011
   1%ile 0.014
   5%ile 0.016
  25%ile 0.023
  median 0.34
  75%ile 12
  95%ile 59
  99%ile 107
     max 199

⠀⠀⠀⠀⠀⠀⠰⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡖ 0.274
⠀⠀⠀⠀⠀⠀⡂⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇
⠠⠤⠤⠤⠤⠤⠄⠰⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠄⠧ 0.000
⠈⠉⠉⠉⠉⠉⠙⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠙⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠙⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁
      0                     50                     100
```

Mark termination pause distribution:

```
N 10000  sum 61391.3  mean 6.13913  gmean 0.93492  std dev 12.9254  variance 167.066

     min 0.04
   1%ile 0.053
   5%ile 0.064
  25%ile 0.1
  median 0.82
  75%ile 6.6
  95%ile 29
  99%ile 64
     max 209

⠀⠀⠀⠀⠀⠀⠰⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡖ 0.447
⠀⠀⠀⠀⠀⠀⡁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇
⠠⠤⠤⠤⠤⠤⠄⠑⠒⠦⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠄⠧ 0.000
⠈⠉⠉⠉⠉⠉⠙⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠋⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁
      0                                   50
```

And here's the execution trace of a slow mark termination pause, taking over 16ms. Proc 6 organizes the mark termination phase. Procs 1, 2, 3, 4, and 5 execute it quickly at around t=1007ms. Proc 0 does mark termination around t=1014ms, and Proc 7 delays until t=1024ms at which point the global pause concludes.

<img width="1651" alt="screen shot 2016-11-17 at 1 58 11 pm" src="https://cloud.githubusercontent.com/assets/230685/20410040/14b8049c-acd0-11e6-856c-40548a9ca0be.png">



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

runtime: aggressive GC completion is disruptive to co-tenants #17969

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

What did you do?

What did you expect to see?

What did you see instead?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

runtime: aggressive GC completion is disruptive to co-tenants #17969

Description

What version of Go are you using (go version)?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?