fortio update #460

Merged
merged 10 commits into from Jul 20, 2017

@ldemailly ldemailly commented Jul 9, 2017

  • build and use fortio in perf/standalone scripts
  • fix for having more connections than requested
  • disabling compression by default (saves CPU); re-enable it with the new -compression flag
  • updating sample in README.md to a test using echosrv (slightly lower latency)
  • (misc/minor) also changing threshold for sleep time histogram to 5%
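
The compression item in the list above can be illustrated with Go's standard `net/http` transport. This is a minimal sketch, assuming the client sits on top of `http.Transport`; only the `-compression` flag name comes from this PR, while `newClient` and `acceptEncodingSeen` are hypothetical helpers written for the example and are not fortio's code.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newClient (hypothetical helper) returns a client that only asks for gzip
// when compression is true. With DisableCompression set, net/http does not
// add "Accept-Encoding: gzip" and does not spend CPU decompressing bodies.
func newClient(compression bool) *http.Client {
	return &http.Client{
		Transport: &http.Transport{DisableCompression: !compression},
	}
}

// acceptEncodingSeen starts a throwaway local server, makes one request with
// the given setting, and reports the Accept-Encoding header the server saw.
func acceptEncodingSeen(compression bool) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("Accept-Encoding")
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	resp, err := newClient(compression).Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return "", err
	}
	return <-seen, nil
}

func main() {
	// Off by default, matching the behavior described in the list above.
	compression := flag.Bool("compression", false, "enable HTTP compression")
	flag.Parse()

	seen, err := acceptEncodingSeen(*compression)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("compression=%v Accept-Encoding=%q\n", *compression, seen)
}
```

With the default setting the server sees no `Accept-Encoding` header, so it has no reason to compress the response.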

- build and use fortio in perf/standalone scripts
- (win) fix for having more connections than requested
Also changing threshold for sleep time histogram to 5%
Now getting exactly the expected number of sockets

But performance is still significantly lower than wrk
@ldemailly

this fixes 2 entries from #453

@istio istio deleted a comment from istio-testing Jul 15, 2017
ldemailly commented Jul 15, 2017

this version now uses the exact/expected number of sockets

It still uses more CPU than wrk, but it can now drive 60k qps out of echosrv (up from 40k, but still lower than wrk's 100k qps). The profile doesn't make the cause clear:

44.44s of 47.82s total (92.93%)
Dropped 287 nodes (cum <= 0.24s)
Showing top 10 nodes out of 72 (cum >= 0.25s)
      flat  flat%   sum%        cum   cum%
    27.61s 57.74% 57.74%     27.76s 58.05%  syscall.Syscall
     7.37s 15.41% 73.15%      7.37s 15.41%  runtime.kevent
     3.06s  6.40% 79.55%      3.06s  6.40%  runtime.usleep
     2.32s  4.85% 84.40%      2.32s  4.85%  runtime.mach_semaphore_wait
     2.06s  4.31% 88.71%      2.06s  4.31%  runtime.mach_semaphore_signal
     1.15s  2.40% 91.11%      1.19s  2.49%  runtime.freedefer
     0.47s  0.98% 92.10%      1.23s  2.57%  runtime.selectgoImpl
     0.18s  0.38% 92.47%      0.55s  1.15%  net/http.setRequestCancel.func3
     0.13s  0.27% 92.74%      0.60s  1.25%  runtime.mallocgc
     0.09s  0.19% 92.93%      0.25s  0.52%  runtime.lock

Probably some locking is getting in the way (I tried to avoid all locking by having separate objects in each goroutine, but the http client still seems to share and lock things).
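
The "separate objects in each goroutine" idea can be sketched as follows. This is an illustration only, assuming the standard `net/http` client: each worker owns its own `http.Client` and `http.Transport`, so no connection pool is shared between goroutines. `runWorkers` and `demo` are hypothetical names, not fortio's code.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// runWorkers starts `workers` goroutines. Each one builds its own client and
// transport, so goroutines never share a connection pool.
// It returns the number of calls that came back with status 200.
func runWorkers(url string, workers, callsPerWorker int) int64 {
	var ok int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tr := &http.Transport{MaxIdleConnsPerHost: 1}
			defer tr.CloseIdleConnections()
			client := &http.Client{Transport: tr}
			for j := 0; j < callsPerWorker; j++ {
				resp, err := client.Get(url)
				if err != nil {
					continue
				}
				// Drain and close the body so the connection can be reused.
				_, _ = io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				if resp.StatusCode == http.StatusOK {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	wg.Wait()
	return ok
}

// demo runs the workers against a local test server and also counts how many
// TCP connections that server accepted.
func demo(workers, callsPerWorker int) (okCalls, conns int64) {
	var accepted int64
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	srv.Config.ConnState = func(_ net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&accepted, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	okCalls = runWorkers(srv.URL, workers, callsPerWorker)
	return okCalls, atomic.LoadInt64(&accepted)
}

func main() {
	ok, conns := demo(4, 25)
	fmt.Printf("ok calls: %d, connections accepted: %d\n", ok, conns)
}
```

When keep-alive works as intended, the number of accepted connections should match the number of workers. Separate transports remove the shared pool but not the locking inside each transport, which is consistent with the observation above that the http client still locks things.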

ldemailly commented Jul 16, 2017

With #470 it's now very close to wrk

@istio istio deleted a comment from istio-testing Jul 16, 2017
	url,
	req,
	&http.Client{
		Timeout: 3 * time.Second,
Curious: any special reason for the 3s timeout?

It means (I think) that a call gives up after at most 3000 ms. Given that we expect responses in the single-digit ms range most of the time, that seemed like enough, but I don't really have a good reason. It should probably be configurable, or not set at all.

@@ -170,7 +170,7 @@ func (r *periodicRunner) Run() {
 	fmt.Printf("Ended after %v : %d calls. qps=%.5g\n", elapsed, functionDuration.Count, actualQPS)
 	if useQPS {
 		percentNegative := 100. * float64(sleepTime.hdata[0]) / float64(sleepTime.Count)
-		if percentNegative > 3 {
+		if percentNegative > 5 {
worth documenting choice of 5 here?

I'll add a comment, thanks. I think it's somewhat arbitrary: it measures how often we fall behind the target qps, and 5% is in the noise (actually maybe 10%). The only effect is more verbose debugging output and the warning.
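
The check under discussion can be sketched in isolation. This is a simplified illustration, not the runner's code: it assumes the sleep times are available as a plain slice (the real code reads the first histogram bucket, `sleepTime.hdata[0]`), and `percentNegative`, `fallingBehind` and `warnThresholdPercent` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// warnThresholdPercent mirrors the 5% cut-off from the diff above. It is
// somewhat arbitrary: below it, being late is treated as noise.
const warnThresholdPercent = 5.0

// percentNegative returns the share (0 to 100) of sleep times that were
// negative, i.e. iterations where the runner was already behind schedule.
func percentNegative(sleepTimes []time.Duration) float64 {
	if len(sleepTimes) == 0 {
		return 0
	}
	negative := 0
	for _, d := range sleepTimes {
		if d < 0 {
			negative++
		}
	}
	return 100 * float64(negative) / float64(len(sleepTimes))
}

// fallingBehind reports whether the "can't keep up with the target qps"
// warning should be shown.
func fallingBehind(sleepTimes []time.Duration) bool {
	return percentNegative(sleepTimes) > warnThresholdPercent
}

// samples builds `total` sleep times of which `late` are negative.
func samples(total, late int) []time.Duration {
	out := make([]time.Duration, 0, total)
	for i := 0; i < total; i++ {
		d := time.Millisecond
		if i < late {
			d = -time.Millisecond
		}
		out = append(out, d)
	}
	return out
}

func main() {
	for _, late := range []int{3, 5, 6} {
		s := samples(100, late)
		fmt.Printf("%d late of 100: %.1f%% negative, warn=%v\n", late, percentNegative(s), fallingBehind(s))
	}
}
```

Because the comparison is strict, exactly 5% late iterations does not trigger the warning.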

@ldemailly ldemailly merged commit 01fadb6 into master Jul 20, 2017
@istio-testing

Jenkins job istio/presubmit passed

@ldemailly ldemailly deleted the ldemailly-bench6 branch July 20, 2017 21:01
ldemailly added a commit that referenced this pull request Jul 21, 2017
And misc comment/leftover from #460 etc
ldemailly added a commit that referenced this pull request Jul 27, 2017
* fortio update

- build and use fortio in perf/standalone scripts
- fix for having more connections than requested

* Updating sample to test using echosrv (slightly lower latency)

Also changing threshold for sleep time histogram to 5%

* Adding -gomaxprocs flag

* One client per goroutine

Now getting exactly the expected number of sockets

But performance is still significantly lower than wrk

* Adding profiler option

* Homegrown http client

On my Mac:
Old net/http : 19.7k qps with 2 threads (-http -1)
New -http 1 : 30.5k qps with 2 threads
And 1/8th of the user cpu

Interface to select between the 2 clients

* More comments and backtrack test

* http_test was missing from bazel

ran gazelle

* Added benchmark and fix folding bugs

Folding is 4x faster than built in, 1/3rd of allocs
Search is 10x faster, no alloc instead of a lot

```
BenchmarkASCIIFoldNormalToLower-8      	 3000000	       416 ns/op	      96 B/op	       3 allocs/op
BenchmarkASCIIFoldCustomToLowerMap-8   	 5000000	       371 ns/op	      96 B/op	       3 allocs/op
BenchmarkASCIIToUpper-8                	20000000	       104 ns/op	      32 B/op	       1 allocs/op
BenchmarkFoldFind0-8                   	  200000	      7621 ns/op	    2112 B/op	       3 allocs/op
BenchmarkFoldFind-8                    	 2000000	       669 ns/op	       0 B/op	       0 allocs/op
```

* chunked decoding

unit test for parsing, reads the first chunk

* fasthttp client wrapper (-http 3)

- added some doc in WORKSPACE about wtool
- initial support for fasthttp

* Make linter happy

though I might drop fasthttp as it’s not faster (but it’s more
complete) and adds a lot of dependencies

* Split connect out

* Fixing merge error

* adding fortio Logger

* Fixing bazel build

And misc comment/leftover from #460 etc

* Shorter logger names

* Switched to new logging

* Eradicate Verbosity leftovers

* Added -logprefix and -logcaller; Refactor start on http.go

also fixed a couple of calls missing % codes

* Minor edits, saving to GitHub wip

* Woot http1.1 works now

Still unwieldy but working !

* Some profiler optimizations

* More profiler opt

* Make lint happy by making checking connection closed header a flag

* Dropping fasthttp dependency

* Moved 3 changes to #495

* Minor: use version in echosrv, use 3 digits

* Code review updates (thx doug!) + serious chunked bug fix

Wasn’t working with last chunk (0\r\n\r\n) being in a separate frame

* Code review comment

thx doug!
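
The "Added benchmark and fix folding bugs" entry in the commit message above reports a case-insensitive search that does not allocate. The actual implementation is not shown in this thread, so the following is only a sketch of one way to get that property for ASCII input: compare byte by byte, lowercasing on the fly, instead of lowercasing a copy of the buffer first. `toLowerASCII` and `foldFind` are hypothetical names.

```go
package main

import "fmt"

// toLowerASCII lowercases a single ASCII letter and leaves every other
// byte untouched.
func toLowerASCII(c byte) byte {
	if c >= 'A' && c <= 'Z' {
		return c + ('a' - 'A')
	}
	return c
}

// foldFind returns the index of the first ASCII case-insensitive match of
// lowerNeedle in haystack, or -1 if there is none. lowerNeedle must already
// be lower case. The comparison happens in place, so nothing is allocated.
func foldFind(haystack, lowerNeedle []byte) int {
	n := len(lowerNeedle)
	if n == 0 {
		return 0
	}
	for i := 0; i+n <= len(haystack); i++ {
		j := 0
		for j < n && toLowerASCII(haystack[i+j]) == lowerNeedle[j] {
			j++
		}
		if j == n {
			return i
		}
		// On a partial match, restart from the next position (backtrack),
		// so overlapping candidates such as "aab" in "AAAB" are found.
	}
	return -1
}

func main() {
	headers := []byte("HTTP/1.1 200 OK\r\nContent-Length: 42\r\n\r\n")
	fmt.Println(foldFind(headers, []byte("content-length:")))
}
```

The straightforward alternative, `bytes.Index(bytes.ToLower(haystack), needle)`, copies the whole buffer on every call, which is the kind of allocation the benchmark numbers above are measuring.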
mandarjog pushed a commit to mandarjog/istio that referenced this pull request Oct 30, 2017
…#460)

* Implement validation using descriptors for the metrics aspect.

* Add back in default for latency expr


Former-commit-id: f229ffdf83e68abaf35666e365988e1ebced060f
rshriram pushed a commit that referenced this pull request Oct 30, 2017
* fortio update

- build and use fortio in perf/standalone scripts
- fix for having more connections than requested

* Updating sample to test using echosrv (slightly lower latency)

Also changing threshold for sleep time histogram to 5%

* Adding -gomaxprocs flag

* One client per goroutine

Now getting exactly the expected number of sockets

* Adding profiler option

* http_test was missing from bazel

ran gazelle


Former-commit-id: 01fadb6
rshriram pushed a commit that referenced this pull request Oct 30, 2017
Former-commit-id: e09dcaf
mandarjog pushed a commit that referenced this pull request Oct 31, 2017
* Implement validation using descriptors for the metrics aspect.

* Add back in default for latency expr


Former-commit-id: 28b5ce56493e385b065f1e7c50d7cab335bae37c
vbatts pushed a commit to vbatts/istio that referenced this pull request Oct 31, 2017
Former-commit-id: 01fadb6
vbatts pushed a commit to vbatts/istio that referenced this pull request Oct 31, 2017
Former-commit-id: e09dcaf
mandarjog pushed a commit that referenced this pull request Oct 31, 2017
* Default to POD_NAMESPACE for -n flag

* Fix bad merge
mandarjog pushed a commit that referenced this pull request Nov 2, 2017
Former-commit-id: 01fadb6
mandarjog pushed a commit that referenced this pull request Nov 2, 2017
Former-commit-id: e09dcaf
guptasu pushed a commit to guptasu/istio that referenced this pull request Jun 11, 2018
* Update global dictionary for new attributes.

* Move new attributes to the bottom of the file for backward compatible.
kyessenov pushed a commit to kyessenov/istio that referenced this pull request Aug 13, 2018
luksa pushed a commit to luksa/istio that referenced this pull request Sep 20, 2022
Co-authored-by: maistra-bot <null>