-
Notifications
You must be signed in to change notification settings - Fork 18.8k
Open
Labels
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.compiler/runtimeIssues related to the Go compiler and/or runtime.Issues related to the Go compiler and/or runtime.
Milestone
Description
What version of Go are you using (go version)?
$ go version go version go1.17.6 windows/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env set GO111MODULE= set GOARCH=amd64 set GOBIN= set GOCACHE=C:\Users\a3367\AppData\Local\go-build set GOENV=C:\Users\a3367\AppData\Roaming\go\env set GOEXE=.exe set GOEXPERIMENT= set GOFLAGS= set GOHOSTARCH=amd64 set GOHOSTOS=windows set GOINSECURE= set GOMODCACHE=C:\Users\a3367\go\pkg\mod set GONOPROXY= set GONOSUMDB= set GOOS=windows set GOPATH=C:\Users\a3367\go set GOPRIVATE= set GOPROXY=https://proxy.golang.org,direct set GOROOT=C:\Program Files\Go set GOSUMDB=sum.golang.org set GOTMPDIR= set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64 set GOVCS= set GOVERSION=go1.17.6 set GCCGO=gccgo set AR=ar set CC=gcc set CXX=g++ set CGO_ENABLED=1 set GOMOD=NUL set CGO_CFLAGS=-g -O2 set CGO_CPPFLAGS= set CGO_CXXFLAGS=-g -O2 set CGO_FFLAGS=-g -O2 set CGO_LDFLAGS=-g -O2 set PKG_CONFIG=pkg-config set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\a3367\AppData\Local\Temp\go-build2334633044=/tmp/go-build -gno-record-gcc-switches gdb --version: GNU gdb (GDB) 8.1
What did you do?
I profiled the standard library's png.Encode function and looked at this attached profile:
I then ran both weblist and list on the png.filter function.
What did you expect to see?
I expect both weblist and list to show the same result.
What did you see instead?
I saw list showing the correct result, with the sum of the flat time of png.abs8 in png.filter tallying with the total time spent on png.abs8.
In particular, the below log shows that png.abs8 took 1.56 seconds in total.
PS C:\Users\a3367\work\misc\seg\output> go tool pprof .\png_encode.pprof
Type: cpu
Time: Mar 29, 2022 at 1:28pm (CST)
Duration: 31.43s, Total samples = 17.53s (55.77%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 16.92s, 96.52% of 17.53s total
Dropped 91 nodes (cum <= 0.09s)
Showing top 10 nodes out of 29
flat flat% sum% cum cum%
7.96s 45.41% 45.41% 13.07s 74.56% image/png.filter
2s 11.41% 56.82% 2.01s 11.47% image/png.abs
1.74s 9.93% 66.74% 1.74s 9.93% runtime.memmove
1.56s 8.90% 75.64% 1.56s 8.90% image/png.abs8
1.44s 8.21% 83.86% 3.46s 19.74% image/png.paeth
0.90s 5.13% 88.99% 0.90s 5.13% hash/adler32.update
0.80s 4.56% 93.55% 0.81s 4.62% compress/flate.(*deflateFast).matchLen
0.29s 1.65% 95.21% 0.31s 1.77% compress/flate.(*huffmanEncoder).bitCounts
0.12s 0.68% 95.89% 0.12s 0.68% runtime.asyncPreempt
0.11s 0.63% 96.52% 1.04s 5.93% compress/flate.(*deflateFast).encode
(pprof)
Adding the time spent on png.abs8 in png.filter also shows the same 1.56 seconds.
(pprof) list png.filter
Total: 17.53s
ROUTINE ======================== image/png.filter in C:\Program Files\Go\src\image\png\writer.go
7.96s 13.07s (flat, cum) 74.56% of Total
. . 202: return len(b), nil
. . 203:}
. . 204:
. . 205:// Chooses the filter to use for encoding the current row, and applies it.
. . 206:// The return value is the index of the filter and also of the row in cr that has had it applied.
10ms 10ms 207:func filter(cr *[nFilter][]byte, pr []byte, bpp int) int {
. . 208: // We try all five filter types, and pick the one that minimizes the sum of absolute differences.
. . 209: // This is the same heuristic that libpng uses, although the filters are attempted in order of
. . 210: // estimated most likely to be minimal (ftUp, ftPaeth, ftNone, ftSub, ftAverage), rather than
. . 211: // in their enumeration order (ftNone, ftSub, ftUp, ftAverage, ftPaeth).
. . 212: cdat0 := cr[0][1:]
110ms 110ms 213: cdat1 := cr[1][1:]
. . 214: cdat2 := cr[2][1:]
. . 215: cdat3 := cr[3][1:]
. . 216: cdat4 := cr[4][1:]
. . 217: pdat := pr[1:]
. . 218: n := len(cdat0)
. . 219:
. . 220: // The up filter.
. . 221: sum := 0
310ms 310ms 222: for i := 0; i < n; i++ {
1.02s 1.03s 223: cdat2[i] = cdat0[i] - pdat[i]
230ms 940ms 224: sum += abs8(cdat2[i])
. . 225: }
. . 226: best := sum
. . 227: filter := ftUp
. . 228:
. . 229: // The Paeth filter.
. . 230: sum = 0
. . 231: for i := 0; i < bpp; i++ {
. . 232: cdat4[i] = cdat0[i] - pdat[i]
. . 233: sum += abs8(cdat4[i])
. . 234: }
790ms 790ms 235: for i := bpp; i < n; i++ {
3.49s 6.99s 236: cdat4[i] = cdat0[i] - paeth(cdat0[i-bpp], pdat[i], pdat[i-bpp])
240ms 760ms 237: sum += abs8(cdat4[i])
180ms 180ms 238: if sum >= best {
. . 239: break
. . 240: }
. . 241: }
40ms 40ms 242: if sum < best {
. . 243: best = sum
. . 244: filter = ftPaeth
. . 245: }
. . 246:
. . 247: // The none filter.
. . 248: sum = 0
. . 249: for i := 0; i < n; i++ {
. . 250: sum += abs8(cdat0[i])
10ms 10ms 251: if sum >= best {
. . 252: break
. . 253: }
. . 254: }
. . 255: if sum < best {
. . 256: best = sum
. . 257: filter = ftNone
. . 258: }
. . 259:
. . 260: // The sub filter.
. . 261: sum = 0
. . 262: for i := 0; i < bpp; i++ {
. . 263: cdat1[i] = cdat0[i]
. . 264: sum += abs8(cdat1[i])
. . 265: }
190ms 210ms 266: for i := bpp; i < n; i++ {
990ms 1s 267: cdat1[i] = cdat0[i] - cdat0[i-bpp]
80ms 390ms 268: sum += abs8(cdat1[i])
150ms 150ms 269: if sum >= best {
. . 270: break
. . 271: }
. . 272: }
. . 273: if sum < best {
. . 274: best = sum
. . 275: filter = ftSub
. . 276: }
. . 277:
. . 278: // The average filter.
. . 279: sum = 0
. . 280: for i := 0; i < bpp; i++ {
70ms 80ms 281: cdat3[i] = cdat0[i] - pdat[i]/2
. . 282: sum += abs8(cdat3[i])
. . 283: }
. . 284: for i := bpp; i < n; i++ {
30ms 30ms 285: cdat3[i] = cdat0[i] - uint8((int(cdat0[i-bpp])+int(pdat[i]))/2)
10ms 30ms 286: sum += abs8(cdat3[i])
10ms 10ms 287: if sum >= best {
. . 288: break
. . 289: }
. . 290: }
. . 291: if sum < best {
. . 292: filter = ftAverage
(pprof)
However, weblist shows the wrong result, it seems to exagerate the result.
Pprof listing.zip
For example, on line 268, list shows only 80ms spent, whereas weblist shows 360ms spent.
Metadata
Metadata
Assignees
Labels
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.compiler/runtimeIssues related to the Go compiler and/or runtime.Issues related to the Go compiler and/or runtime.
Type
Projects
Status
Todo