[common][linux]: improve readlines performance. #1515

Closed
wants to merge 2 commits into from

shirou
Owner

@shirou shirou commented Aug 27, 2023

This PR makes the following two improvements. The idea is based on #1514. Thank you so much.

  1. Use os.ReadFile() and strings.Split() instead of ReadLinesOffsetN() in common.ReadLines().
  2. Read /proc/stat from the tail.

For 1, ReadLines() previously called ReadLinesOffsetN(), but that function is more complicated than necessary when reading the entire contents of a file. So this PR just uses the standard library.
For 2, the btime line is usually found near the bottom of the stat file, so I tried reading the lines in reverse order. It has little effect on my 4-CPU machine, but I suspect it will help with more CPUs.
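
As a rough illustration of the two changes described above, here is a minimal sketch. It is not the code from this PR: the names `readLines` and `bootTimeFromTail` are hypothetical, and the sketch assumes the btime line has the form `btime <seconds>`.

```go
package main

import (
	"errors"
	"os"
	"strconv"
	"strings"
)

// readLines (hypothetical name) reads the whole file in one call and splits
// it on newlines. One trailing newline is trimmed first so that it does not
// produce an extra empty last element.
func readLines(filename string) ([]string, error) {
	content, err := os.ReadFile(filename)
	if err != nil {
		return nil, err
	}
	return strings.Split(strings.TrimSuffix(string(content), "\n"), "\n"), nil
}

// bootTimeFromTail (hypothetical name) walks the lines from last to first
// and parses the first "btime <seconds>" line it meets.
func bootTimeFromTail(lines []string) (uint64, error) {
	for i := len(lines) - 1; i >= 0; i-- {
		fields := strings.Fields(lines[i])
		if len(fields) == 2 && fields[0] == "btime" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	return 0, errors.New("btime line not found")
}
```

On Linux, the two could be combined as `lines, err := readLines("/proc/stat")` followed by `bootTimeFromTail(lines)`. Reading the whole file at once trades a single larger allocation for fewer small ones, so it is worth checking B/op as well as allocs/op when benchmarking.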

goos: linux
goarch: amd64
pkg: github.com/shirou/gopsutil/v3/process
cpu: AMD Ryzen 7 5800HS with Radeon Graphics
            │  before.txt  │              after.txt               │
            │    sec/op    │    sec/op     vs base                │
Processes-4   19.15m ± 12%   12.91m ± 11%  -32.61% (p=0.000 n=10)

            │  before.txt   │              after.txt               │
            │     B/op      │     B/op      vs base                │
Processes-4   3.901Mi ± 10%   2.642Mi ± 2%  -32.27% (p=0.000 n=10)

            │  before.txt  │              after.txt              │
            │  allocs/op   │  allocs/op   vs base                │
Processes-4   21.79k ± 10%   13.51k ± 2%  -38.01% (p=0.000 n=10)

BenchmarkBootTimeWithManyCPUs

This PR adds BenchmarkBootTimeWithManyCPUs, which uses the stat file from #1514 with 80 CPU stat lines. The result shows a significant performance improvement.
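
To show why the number of CPU lines matters for this benchmark, here is a small sketch that counts how many lines are inspected before btime is found. It is not the benchmark added in this PR: `syntheticStat`, `linesVisited`, and `BenchmarkBtimeScan` are hypothetical names, and the stat contents are made-up values laid out in the usual /proc/stat order.

```go
package main

import (
	"strconv"
	"strings"
	"testing"
)

// syntheticStat builds /proc/stat-like lines: one aggregate "cpu" line,
// nCPU per-CPU lines, then the trailing lines including btime.
// All numeric values are invented for illustration.
func syntheticStat(nCPU int) []string {
	lines := make([]string, 0, nCPU+8)
	lines = append(lines, "cpu  100 0 100 1000 0 0 0 0 0 0")
	for i := 0; i < nCPU; i++ {
		lines = append(lines, "cpu"+strconv.Itoa(i)+" 1 0 1 10 0 0 0 0 0 0")
	}
	return append(lines,
		"intr 12345 0 0",
		"ctxt 67890",
		"btime 1693100000",
		"processes 4242",
		"procs_running 1",
		"procs_blocked 0",
		"softirq 999 0 0",
	)
}

// linesVisited reports how many lines are inspected before the btime line
// is found, scanning from the head or from the tail. It returns 0 if there
// is no btime line.
func linesVisited(lines []string, fromTail bool) int {
	n := len(lines)
	for k := 0; k < n; k++ {
		i := k
		if fromTail {
			i = n - 1 - k
		}
		if strings.HasPrefix(lines[i], "btime ") {
			return k + 1
		}
	}
	return 0
}

// sink keeps the compiler from optimizing the benchmark loop away.
var sink int

// BenchmarkBtimeScan compares both scan directions on 80 CPU lines.
// Place it in a _test.go file to run it with `go test -bench`.
func BenchmarkBtimeScan(b *testing.B) {
	lines := syntheticStat(80)
	for _, tc := range []struct {
		name     string
		fromTail bool
	}{{"head", false}, {"tail", true}} {
		b.Run(tc.name, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = linesVisited(lines, tc.fromTail)
			}
		})
	}
}
```

With 80 CPU lines in this synthetic layout, the forward scan inspects 84 lines and the reverse scan inspects 5. This only covers the scan itself, so it says nothing about the cost of reading and splitting the file.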

goos: linux
goarch: amd64
pkg: github.com/shirou/gopsutil/v3/internal/common
cpu: AMD Ryzen 7 5800HS with Radeon Graphics
                       │ before.txt  │              after.txt              │
                       │   sec/op    │   sec/op     vs base                │
BootTimeWithManyCPUs-4   19.39µ ± 4%   13.74µ ± 5%  -29.15% (p=0.000 n=10)

                       │  before.txt   │              after.txt               │
                       │     B/op      │     B/op      vs base                │
BootTimeWithManyCPUs-4   12.609Ki ± 0%   9.789Ki ± 0%  -22.37% (p=0.000 n=10)

                       │ before.txt  │             after.txt              │
                       │  allocs/op  │ allocs/op   vs base                │
BootTimeWithManyCPUs-4   103.00 ± 0%   10.00 ± 0%  -90.29% (p=0.000 n=10)

TODO

We should add an err check in process_linux.go, but it might break applications. So I will open a separate PR so that it can be reverted easily.

@atoulme
Contributor

atoulme commented Aug 28, 2023

I see different results on my machine with 80 cores. Here they are:

goos: linux
goarch: arm64
pkg: github.com/shirou/gopsutil/v3/process
             │ old1515.txt │             new1515.txt             │
             │   sec/op    │   sec/op     vs base                │
Processes-80   249.4m ± 2%   297.3m ± 2%  +19.19% (p=0.000 n=10)

This is with go 1.21.0.

Details of benchmarks.
Before:

goos: linux
goarch: arm64
pkg: github.com/shirou/gopsutil/v3/process
BenchmarkProcesses-80    	       4	 250318539 ns/op
BenchmarkProcesses-80    	       5	 244556303 ns/op
BenchmarkProcesses-80    	       4	 256936076 ns/op
BenchmarkProcesses-80    	       5	 249262586 ns/op
BenchmarkProcesses-80    	       4	 250864103 ns/op
BenchmarkProcesses-80    	       5	 249589844 ns/op
BenchmarkProcesses-80    	       5	 244691265 ns/op
BenchmarkProcesses-80    	       5	 245573886 ns/op
BenchmarkProcesses-80    	       5	 233379835 ns/op
BenchmarkProcesses-80    	       5	 251390408 ns/op
PASS
ok  	github.com/shirou/gopsutil/v3/process	24.693s

After:

goos: linux
goarch: arm64
pkg: github.com/shirou/gopsutil/v3/process
BenchmarkProcesses-80    	       4	 305503645 ns/op
BenchmarkProcesses-80    	       4	 298868258 ns/op
BenchmarkProcesses-80    	       4	 286883520 ns/op
BenchmarkProcesses-80    	       4	 300195966 ns/op
BenchmarkProcesses-80    	       4	 300386758 ns/op
BenchmarkProcesses-80    	       4	 292164111 ns/op
BenchmarkProcesses-80    	       4	 290098780 ns/op
BenchmarkProcesses-80    	       4	 300040136 ns/op
BenchmarkProcesses-80    	       4	 295726462 ns/op
BenchmarkProcesses-80    	       4	 294468716 ns/op
PASS
ok  	github.com/shirou/gopsutil/v3/process	23.578s

@shirou
Owner Author

shirou commented Aug 29, 2023

Thank you for your testing, @atoulme.

I tested on an AWS c5a.16xlarge (64 CPUs) and saw no overall performance improvement. Hmm...

goos: linux
goarch: amd64
pkg: github.com/shirou/gopsutil/v3/process
cpu: AMD EPYC 7R32
             │  base.txt   │             after.txt              │
             │   sec/op    │   sec/op     vs base               │
Processes-64   37.55m ± 1%   38.32m ± 0%  +2.06% (p=0.002 n=10)

             │   base.txt   │               after.txt               │
             │     B/op     │     B/op       vs base                │
Processes-64   9.266Mi ± 0%   14.602Mi ± 0%  +57.59% (p=0.000 n=10)

             │  base.txt   │              after.txt              │
             │  allocs/op  │  allocs/op   vs base                │
Processes-64   64.68k ± 0%   25.49k ± 0%  -60.59% (p=0.000 n=10)

@shirou shirou closed this Aug 29, 2023
@shirou shirou deleted the feature/improve_readlines branch August 29, 2023 12:49