Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

image/draw: increase performances by applying special case if mask is *image.Alpha #46395

Open
owulveryck opened this issue May 26, 2021 · 5 comments

Comments

@owulveryck
Copy link

@owulveryck owulveryck commented May 26, 2021

What version of Go are you using (go version)?

$ go version
go version devel go1.17-e4615ad74d Wed May 26 13:25:43 2021 +0000 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="off"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/olivierwulveryck/Library/Caches/go-build"
GOENV="/Users/olivierwulveryck/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/olivierwulveryck/GOPROJECTS/pkg/mod"
GONOPROXY="github.com/dktunited"
GONOSUMDB="github.com/dktunited"
GOOS="darwin"
GOPATH="/Users/olivierwulveryck/GOPROJECTS"
GOPRIVATE="github.com/dktunited"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/olivierwulveryck/dev/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/olivierwulveryck/dev/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="devel go1.17-e4615ad74d Wed May 26 13:25:43 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/j3/t2_lsqzd5mgbv7tqvxg9ty5r0000gn/T/go-build4289980978=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I am building a tool that is manipulating a lot of images. The process is a bit slow.
I made some performance analysis, and it appears that the drawRGBA method is really time-consuming.
Specially the call to mask.At(mx,my).RGBA() is causing performance penalty.

I think that most of the use cases of DrawMask function are using an *image.Alpha structure as a mask. Therefore doing a special treatment if the mask is *image.Alpha could be valuable for most of the usage.

I made a simple test and bench for the drawRGBA method (it should be done for other methods as well).

func Benchmark_drawRGBA(b *testing.B) {
	r := image.Rect(0, 0, 500, 500)
	dst := image.NewRGBA(r)
	mask := image.NewAlpha(r)
	src := image.NewRGBA(r)
	for i := 0; i < b.N; i++ {
		drawRGBA(dst, r, src, image.Point{}, mask, image.Point{}, Over)
	}
}

Then I made a patch as an experiment:

diff --git a/src/image/draw/draw.go b/src/image/draw/draw.go
index 8f96aa2d18..205e8cfb43 100644
--- a/src/image/draw/draw.go
+++ b/src/image/draw/draw.go
@@ -530,7 +530,13 @@ func drawRGBA(dst *image.RGBA, r image.Rectangle, src image.Image, sp image.Poin
                for i, sx, mx := i0, sx0, mx0; sx != sx1; i, sx, mx = i+di, sx+dx, mx+dx {
                        ma := uint32(m)
                        if mask != nil {
-                               _, _, _, ma = mask.At(mx, my).RGBA()
+                               if mask, ok := mask.(*image.Alpha); ok {
+                                       off := mask.PixOffset(mx, my)
+                                       ma = uint32(mask.Pix[off])
+                                       ma |= ma << 8
+                               } else {
+                                       _, _, _, ma = mask.At(mx, my).RGBA()
+                               }
                        }
                        sr, sg, sb, sa := src.At(sx, sy).RGBA()
                        d := dst.Pix[i : i+4 : i+4] // Small cap improves performance, see https://golang.org/issue/27857

The performance enhancement is not negligible:

benchmark                 old ns/op     new ns/op     delta
Benchmark_drawRGBA-12     13178981      8599585       -34.75%

benchmark                 old allocs     new allocs     delta
Benchmark_drawRGBA-12     500000         250000         -50.00%

benchmark                 old bytes     new bytes     delta
Benchmark_drawRGBA-12     2025807       1016234       -49.84%

This is a test for opening the discussion, and I guess that other controls should be added as it makes some tests of the draw package fail. Specifically, the test.mask causes panic, and I cannot figure out what test.mask is nor where it is defined.

What did you expect to see?

N/A

What did you see instead?

N/A

@seankhliao
Copy link
Contributor

@seankhliao seankhliao commented May 26, 2021

@nigeltao
Copy link
Contributor

@nigeltao nigeltao commented May 27, 2021

I'm open in principle to the change, with two comments.

  1. Please be aware of the https://github.com/golang/go/wiki/Go-Release-Cycle
  2. The right fix probably involves touching this switch statement (below) in func DrawMask, and probably involves writing a new func drawMaskOver function instead of tweaking the existing func drawRGBA function.

switch src0 := src.(type) {

@owulveryck
Copy link
Author

@owulveryck owulveryck commented May 27, 2021

Thanks for your reply.

I thought about adding a private function to replace all the calls to img.At(x,y).RGBA(). This would be less invasive and would benefit all functions of the package.

func rgbaAt(x, y int, img image.Image) (r, g, b, a uint32) {
	switch img0 := img.(type) {
	case *image.RGBA:
		off := img0.PixOffset(x, y)
		r = uint32(img0.Pix[off])
		g = uint32(img0.Pix[off+1])
		b = uint32(img0.Pix[off+2])
		a = uint32(img0.Pix[off+3])
		r |= r << 8
		g |= g << 8
		b |= b << 8
		a |= a << 8
		return
	case *image.Gray:
		off := img0.PixOffset(x, y)
		y := uint32(img0.Pix[off])
		y |= y << 8
		return y, y, y, uint32(0xffff)
	case *image.Alpha:
		off := img0.PixOffset(x, y)
		a := uint32(img0.Pix[off])
		a |= a << 8
		return a, a, a, a
	default:
		return img0.At(x, y).RGBA()

	}
}

I tested it, and the results look very promising. Besides the computation time, it is allocating less memory.
I will submit a patch if it is ok for you.

/tmp/initial.bench/tmp/hack.bench
time/opdelta
FillOver-12995µs ± 0%973µs ± 0%~(p=1.000 n=1+1)
FillSrc-1227.1µs ± 0%25.5µs ± 0%~(p=1.000 n=1+1)
CopyOver-12715µs ± 0%727µs ± 0%~(p=1.000 n=1+1)
CopySrc-1221.8µs ± 0%21.6µs ± 0%~(p=1.000 n=1+1)
NRGBAOver-12863µs ± 0%864µs ± 0%~(p=1.000 n=1+1)
NRGBASrc-12471µs ± 0%472µs ± 0%~(p=1.000 n=1+1)
YCbCr-12433µs ± 0%435µs ± 0%~(p=1.000 n=1+1)
Gray-12151µs ± 0%150µs ± 0%~(p=1.000 n=1+1)
CMYK-12471µs ± 0%509µs ± 0%~(p=1.000 n=1+1)
GlyphOver-12236µs ± 0%240µs ± 0%~(p=1.000 n=1+1)
RGBA-122.96ms ± 0%2.99ms ± 0%~(p=1.000 n=1+1)
PalettedFill-126.47µs ± 0%6.77µs ± 0%~(p=1.000 n=1+1)
PalettedRGBA-121.63ms ± 0%1.63ms ± 0%~(p=1.000 n=1+1)
GenericOver-128.05ms ± 0%8.29ms ± 0%~(p=1.000 n=1+1)
GenericMaskOver-124.21ms ± 0%3.29ms ± 0%~(p=1.000 n=1+1)
GenericSrc-123.34ms ± 0%4.20ms ± 0%~(p=1.000 n=1+1)
GenericMaskSrc-126.22ms ± 0%6.21ms ± 0%~(p=1.000 n=1+1)
_drawRGBA-126.24ms ± 0%3.81ms ± 0%~(p=1.000 n=1+1)
_drawRGBAGray-125.83ms ± 0%3.46ms ± 0%~(p=1.000 n=1+1)
 
alloc/opdelta
FillOver-120.00B 0.00B ~(all equal)
FillSrc-120.00B 0.00B ~(all equal)
CopyOver-120.00B 0.00B ~(all equal)
CopySrc-120.00B 0.00B ~(all equal)
NRGBAOver-120.00B 0.00B ~(all equal)
NRGBASrc-120.00B 0.00B ~(all equal)
YCbCr-120.00B 0.00B ~(all equal)
Gray-120.00B 0.00B ~(all equal)
CMYK-120.00B 0.00B ~(all equal)
GlyphOver-120.00B 0.00B ~(all equal)
RGBA-12960kB ± 0%960kB ± 0%~(p=1.000 n=1+1)
PalettedFill-120.00B 0.00B ~(all equal)
PalettedRGBA-1240.0B ± 0%40.0B ± 0%~(all equal)
GenericOver-122.88MB ± 0%2.88MB ± 0%~(p=1.000 n=1+1)
GenericMaskOver-121.04MB ± 0%0.72MB ± 0%~(p=1.000 n=1+1)
GenericSrc-12960kB ± 0%960kB ± 0%~(all equal)
GenericMaskSrc-122.16MB ± 0%1.20MB ± 0%~(p=1.000 n=1+1)
_drawRGBA-12256kB ± 0%4kB ± 0%~(p=1.000 n=1+1)
_drawRGBAGray-12254kB ± 0%2kB ± 0%~(p=1.000 n=1+1)
 
allocs/opdelta
FillOver-120.00 0.00 ~(all equal)
FillSrc-120.00 0.00 ~(all equal)
CopyOver-120.00 0.00 ~(all equal)
CopySrc-120.00 0.00 ~(all equal)
NRGBAOver-120.00 0.00 ~(all equal)
NRGBASrc-120.00 0.00 ~(all equal)
YCbCr-120.00 0.00 ~(all equal)
Gray-120.00 0.00 ~(all equal)
CMYK-120.00 0.00 ~(all equal)
GlyphOver-120.00 0.00 ~(all equal)
RGBA-12120k ± 0%120k ± 0%~(all equal)
PalettedFill-120.00 0.00 ~(all equal)
PalettedRGBA-122.00 ± 0%2.00 ± 0%~(all equal)
GenericOver-12360k ± 0%360k ± 0%~(all equal)
GenericMaskOver-12210k ± 0%90k ± 0%~(p=1.000 n=1+1)
GenericSrc-12120k ± 0%120k ± 0%~(all equal)
GenericMaskSrc-12270k ± 0%150k ± 0%~(p=1.000 n=1+1)
_drawRGBA-12250k ± 0%0k ~(p=1.000 n=1+1)
_drawRGBAGray-12250k ± 0%0k ~(p=1.000 n=1+1)
 
@nigeltao
Copy link
Contributor

@nigeltao nigeltao commented May 28, 2021

I thought about adding a private function to replace all the calls to img.At(x,y).RGBA().

We don't want to end up with a function call and type switch per pixel. Better to hoist it out of the loop, hence writing a new top-level function.

See also https://go-review.googlesource.com/c/go/+/311129 "image: add RGBA64Image interface".

@gopherbot
Copy link

@gopherbot gopherbot commented May 31, 2021

Change https://golang.org/cl/323749 mentions this issue: image/draw: improve performances if mask is *image.Alpha

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants