Regression: System time is twice as slow vs 1.7.0 #45929
Comments
Thanks, would be interesting to bisect. |
For me, user time is a little slower.
woclass@wos-PC:/mnt/v/tmp$ hyperfine --runs 20 'julia-latest -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-latest -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.836 s ± 0.038 s [User: 7.066 s, System: 0.321 s]
Range (min … max): 7.703 s … 7.890 s 20 runs
woclass@wos-PC:/mnt/v/tmp$ hyperfine --runs 30 'julia-latest -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-latest -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.849 s ± 0.029 s [User: 7.055 s, System: 0.342 s]
Range (min … max): 7.806 s … 7.910 s 30 runs
woclass@wos-PC:/mnt/v/tmp$ hyperfine --runs 30 'julia-1.8 -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-1.8 -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.345 s ± 0.067 s [User: 6.541 s, System: 0.350 s]
Range (min … max): 7.278 s … 7.579 s 30 runs
woclass@wos-PC:/mnt/v/tmp$ hyperfine 'julia-1.7 -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-1.7 -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.577 s ± 0.040 s [User: 6.785 s, System: 0.346 s]
Range (min … max): 7.527 s … 7.673 s 10 runs
woclass@wos-PC:/mnt/v/tmp$ hyperfine 'julia-1.6 -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-1.6 -O3 --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.435 s ± 0.090 s [User: 6.674 s, System: 0.343 s]
Range (min … max): 7.368 s … 7.657 s 10 runs
test steps
env
|
Your best timing, "7.278 s" on 1.8.0-rc4, is good to know. Your worst, on 1.9.0-DEV.1131, is only 5.8% slower (and 7.8% for "User", though that figure is likely based on the "mean", not the "min"). Could you just be measuring noise? Were web browsers etc. definitely closed? At least you do not have the "2x slower" System time regression I reported; at most it is a different, minor regression (on WSL2, which is not a supported platform). Windows is a supported platform, so it's helpful to test it without WSL2 (and perhaps also with it, even though WSL2 is not officially supported?). Actually, I see I posted numbers for "--cpu-target=ivybridge" (because it is used in Debian's benchmarks game), but not for the non-regressed numbers... I doubt the flag affected System time, but it's best to test both with and without it. |
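The relative slowdowns discussed here can be re-derived from the hyperfine output earlier in the thread. A minimal sketch, assuming `awk` is available; `pct_slower` is a hypothetical helper name for illustration, not part of hyperfine or Julia:

```shell
# pct_slower SLOW FAST: how much slower SLOW is than FAST, in percent.
# Hypothetical helper for illustration; not part of hyperfine or Julia.
pct_slower() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", (slow / fast - 1) * 100 }'
}

# Min wall time: julia-latest (7.703 s, 20-run series) vs julia-1.8 (7.278 s).
pct_slower 7.703 7.278
# Mean user time: julia-latest (7.055 s, 30-run series) vs julia-1.8 (6.541 s).
pct_slower 7.055 6.541
```

The first call prints 5.8, matching the figure quoted for wall time; the second lands close to the 7.8 figure quoted for "User". Comparing min against min (rather than mean against mean) is one common way to reduce the influence of background load, though it does not rule out noise on its own.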
On Windows
PS V:\tmp> hyperfine --runs 30 `
>> 'julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' `
>> 'julia +1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' `
>> 'julia +1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' `
>> 'julia +1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.644 s ± 0.023 s [User: 0.005 s, System: 0.008 s]
Range (min … max): 7.614 s … 7.749 s 30 runs
Benchmark 2: julia +1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.261 s ± 0.043 s [User: 0.006 s, System: 0.019 s]
Range (min … max): 7.231 s … 7.477 s 30 runs
Benchmark 3: julia +1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.225 s ± 0.092 s [User: 0.005 s, System: 0.018 s]
Range (min … max): 7.175 s … 7.669 s 30 runs
Benchmark 4: julia +1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.050 s ± 0.045 s [User: 0.006 s, System: 0.020 s]
Range (min … max): 6.991 s … 7.235 s 30 runs
Summary
'julia +1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' ran
1.02 ± 0.01 times faster than 'julia +1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
1.03 ± 0.01 times faster than 'julia +1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
1.08 ± 0.01 times faster than 'julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Note: I'm using
PS V:\tmp> hyperfine --runs 30 `
>> 'julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -e "VERSION"' `
>> 'julia +1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -e "VERSION"' `
>> 'julia +1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -e "VERSION"' `
>> 'julia +1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -e "VERSION"' `
>> 'julia +1.0 -O3 --cpu-target=ivybridge --math-mode=ieee -e "VERSION"'
Benchmark 1: julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION
Time (mean ± σ): 223.8 ms ± 4.6 ms [User: 5.8 ms, System: 10.7 ms]
Range (min … max): 218.3 ms … 238.4 ms 30 runs
Benchmark 2: julia +1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION
Time (mean ± σ): 257.5 ms ± 2.9 ms [User: 6.3 ms, System: 22.1 ms]
Range (min … max): 254.1 ms … 267.5 ms 30 runs
Benchmark 3: julia +1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION
Time (mean ± σ): 244.9 ms ± 10.3 ms [User: 4.9 ms, System: 23.7 ms]
Range (min … max): 231.5 ms … 272.7 ms 30 runs
Benchmark 4: julia +1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION
Time (mean ± σ): 236.6 ms ± 2.5 ms [User: 4.4 ms, System: 25.8 ms]
Range (min … max): 232.3 ms … 241.4 ms 30 runs
Benchmark 5: julia +1.0 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION
Time (mean ± σ): 243.7 ms ± 3.2 ms [User: 6.8 ms, System: 23.7 ms]
Range (min … max): 237.0 ms … 251.3 ms 30 runs
Summary
'julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION' ran
1.06 ± 0.02 times faster than 'julia +1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION'
1.09 ± 0.03 times faster than 'julia +1.0 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION'
1.09 ± 0.05 times faster than 'julia +1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION'
1.15 ± 0.03 times faster than 'julia +1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -e VERSION'
WSL (Ubuntu)
$ hyperfine --runs 30 \
> 'julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' \
> 'julia-1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' \
> 'julia-1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' \
> 'julia-1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Benchmark 1: julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.937 s ± 0.100 s [User: 7.158 s, System: 0.347 s]
Range (min … max): 7.811 s … 8.292 s 30 runs
Benchmark 2: julia-1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.444 s ± 0.102 s [User: 6.598 s, System: 0.382 s]
Range (min … max): 7.364 s … 7.931 s 30 runs
Benchmark 3: julia-1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.650 s ± 0.045 s [User: 6.854 s, System: 0.356 s]
Range (min … max): 7.520 s … 7.757 s 30 runs
Benchmark 4: julia-1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt
Time (mean ± σ): 7.536 s ± 0.106 s [User: 6.749 s, System: 0.348 s]
Range (min … max): 7.410 s … 7.965 s 30 runs
Summary
'julia-1.8 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt' ran
1.01 ± 0.02 times faster than 'julia-1.6 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
1.03 ± 0.02 times faster than 'julia-1.7 -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
1.07 ± 0.02 times faster than 'julia-latest -O3 --cpu-target=ivybridge --math-mode=ieee -- knucleotide.jl 0 < fasta.txt'
Note: My CPU is |
The doubling of system time is still there on real (Ubuntu) Linux (as opposed to WSL2), but at least the overall time isn't higher:
so I didn't test 1.9. I'm not sure whether this is a worry. In general I want to get system time down, but I'm most concerned with the combined time (and startup cost). I'm not up to speed on bisecting, if that is still relevant. |
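For anyone who wants to try the bisect suggested in the first comment, the following is a rough sketch of a helper for `git bisect run`, not a tested recipe. It assumes bash, a Julia source checkout as the working directory, a plain `make` build that produces `./julia`, and the benchmark files in a directory named by `BENCH_DIR`. The names `exceeds`, `bisect_step`, `BENCH_DIR`, and the `THRESHOLD` default are all hypothetical placeholders, and the threshold would need to be set between the good and bad system times measured on the machine in question.

```shell
# Sketch of a `git bisect run` helper (bash). All names and defaults here
# are assumptions for illustration, not values taken from this issue.
THRESHOLD=${THRESHOLD:-0.5}        # seconds of system time; placeholder value
BENCH_DIR=${BENCH_DIR:-..}         # where knucleotide.jl and fasta.txt live

# exceeds VALUE LIMIT: succeed (exit 0) when VALUE is greater than LIMIT.
exceeds() {
    awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t > limit) }'
}

# One bisect step: build, time the benchmark, report good/bad/skip.
# git bisect run conventions: 0 = good, 1 = bad, 125 = skip this commit.
bisect_step() {
    make -j"$(nproc)" >/dev/null 2>&1 || return 125   # cannot build: skip
    local sys
    TIMEFORMAT=%S                                     # bash: print system CPU seconds only
    sys=$( { time ./julia -O3 --math-mode=ieee -- "$BENCH_DIR/knucleotide.jl" 0 \
                 < "$BENCH_DIR/fasta.txt" >/dev/null 2>&1; } 2>&1 )
    if exceeds "$sys" "$THRESHOLD"; then
        return 1                                      # system time regressed: bad
    fi
    return 0                                          # within threshold: good
}
```

One would start with `git bisect start`, mark a known-bad and a known-good revision, and then hand `git bisect run` a small wrapper script that sources this file and calls `bisect_step`. A single timing per commit is noisy, so repeating the measurement and taking the minimum may be needed before trusting the result.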
[You must first run the other benchmark, fasta, to get the input file.]
https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/knucleotide-julia-8.html
https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/fasta-julia-8.html
This benchmark is usually run multi-threaded, and there it is also 2x slower, with the total 7% slower:
Actually, the benchmarking above was done with one line changed to: