image/draw: increase performances by applying special case if mask is *image.Alpha #46395
Comments
I'm open in principle to the change, with two comments.
Line 148 in 6ff0ae2
Thanks for your reply. I thought about adding a private function to replace all the calls to
// rgbaAt returns the 16-bit-per-channel color of img at (x, y), reading Pix
// directly for the common concrete image types.
func rgbaAt(x, y int, img image.Image) (r, g, b, a uint32) {
	switch img0 := img.(type) {
	case *image.RGBA:
		off := img0.PixOffset(x, y)
		r = uint32(img0.Pix[off])
		g = uint32(img0.Pix[off+1])
		b = uint32(img0.Pix[off+2])
		a = uint32(img0.Pix[off+3])
		// Widen each 8-bit channel to 16 bits.
		r |= r << 8
		g |= g << 8
		b |= b << 8
		a |= a << 8
		return
	case *image.Gray:
		off := img0.PixOffset(x, y)
		v := uint32(img0.Pix[off]) // named v so it does not shadow the y parameter
		v |= v << 8
		return v, v, v, 0xffff
	case *image.Alpha:
		off := img0.PixOffset(x, y)
		a = uint32(img0.Pix[off])
		a |= a << 8
		return a, a, a, a
	default:
		// Unknown image type: fall back to the generic interface call.
		return img0.At(x, y).RGBA()
	}
}
I tested it, and the results look very promising. Besides reducing the computation time, it also allocates less memory.
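The `r |= r << 8` lines in rgbaAt widen an 8-bit channel to the 16-bit range that `Color.RGBA()` returns, by replicating the byte. A minimal sketch of that step, compared against the standard `color.RGBA` type; the helper name `widen8` is hypothetical and only used for illustration:

```go
package main

import (
	"fmt"
	"image/color"
)

// widen8 (hypothetical helper) expands an 8-bit channel value to 16 bits by
// replicating the byte, so 0x00 maps to 0x0000 and 0xff maps to 0xffff.
func widen8(v uint8) uint32 {
	w := uint32(v)
	w |= w << 8
	return w
}

func main() {
	c := color.RGBA{R: 0x12, G: 0x80, B: 0xff, A: 0xff}

	// Channels as reported by the standard library.
	r, g, b, a := c.RGBA()
	fmt.Printf("%#x %#x %#x %#x\n", r, g, b, a)

	// The same channels widened by hand from the raw 8-bit fields.
	fmt.Printf("%#x %#x %#x %#x\n", widen8(c.R), widen8(c.G), widen8(c.B), widen8(c.A))
}
```

Both lines should print the same four values, which is why reading `Pix` directly and widening by hand can stand in for the `At(x, y).RGBA()` call on 8-bit image types.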
We don't want to end up with a function call and type switch per pixel. Better to hoist it out of the loop, hence writing a new top-level function. See also https://go-review.googlesource.com/c/go/+/311129 "image: add RGBA64Image interface".
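To illustrate the hoisting idea, here is a minimal sketch, not the code from any of the linked changes. It sums the alpha of a mask over a rectangle: the type switch runs once per call, and each concrete type gets its own loop. All names (`sumAlpha`, `sumAlphaFast`, `sumAlphaGeneric`, `opaque`, `demoMask`) are hypothetical.

```go
package main

import (
	"fmt"
	"image"
)

// sumAlpha adds up the 16-bit alpha of every pixel of mask inside r.
// The type switch happens once here, not once per pixel.
func sumAlpha(mask image.Image, r image.Rectangle) uint64 {
	r = r.Intersect(mask.Bounds())
	switch m := mask.(type) {
	case *image.Alpha:
		return sumAlphaFast(m, r)
	default:
		return sumAlphaGeneric(m, r)
	}
}

// sumAlphaFast reads the Pix slice directly, one row at a time.
func sumAlphaFast(m *image.Alpha, r image.Rectangle) uint64 {
	var sum uint64
	for y := r.Min.Y; y < r.Max.Y; y++ {
		off := m.PixOffset(r.Min.X, y)
		for _, p := range m.Pix[off : off+r.Dx()] {
			a := uint64(p)
			sum += a | a<<8 // widen 8-bit alpha to 16 bits
		}
	}
	return sum
}

// sumAlphaGeneric works for any image.Image but pays an interface call
// (and possibly an allocation) per pixel.
func sumAlphaGeneric(m image.Image, r image.Rectangle) uint64 {
	var sum uint64
	for y := r.Min.Y; y < r.Max.Y; y++ {
		for x := r.Min.X; x < r.Max.X; x++ {
			_, _, _, a := m.At(x, y).RGBA()
			sum += uint64(a)
		}
	}
	return sum
}

// opaque hides the concrete type so that the generic path is taken.
type opaque struct{ image.Image }

// demoMask builds a small 4x3 alpha mask with distinct pixel values.
func demoMask() *image.Alpha {
	m := image.NewAlpha(image.Rect(0, 0, 4, 3))
	for i := range m.Pix {
		m.Pix[i] = uint8(i * 17)
	}
	return m
}

func main() {
	m := demoMask()
	fmt.Println(sumAlpha(m, m.Bounds()))         // fast path
	fmt.Println(sumAlpha(opaque{m}, m.Bounds())) // generic path, same total
}
```

The per-pixel helper from the previous comment would instead repeat the switch for every pixel; splitting the loops keeps the inner loop free of interface calls for the `*image.Alpha` case.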
Change https://golang.org/cl/323749 mentions this issue:
I'd be interested in new "compare to the status quo" benchmark numbers now that "image/draw: add RGBA64Image fast path" https://go-review.googlesource.com/c/go/+/340049 has been submitted. BTW the "delta" column in the tables from your earlier comment has no numbers.
I made a bench comparing the master branch (commit ab7c904) (that holds the change you mention including the
The last benchmarks are:
package draw

import (
	"image"
	"testing"
)

// Benchmark_drawRGBA draws an RGBA source onto an RGBA destination
// through an *image.Alpha mask.
func Benchmark_drawRGBA(b *testing.B) {
	r := image.Rect(0, 0, 500, 500)
	src := image.NewRGBA(r)
	dst := image.NewRGBA(r)
	mask := image.NewAlpha(r)
	for i := 0; i < b.N; i++ {
		DrawMask(dst, r, src, image.Point{}, mask, image.Point{}, Over)
	}
}

// Benchmark_drawRGBAGray is the same benchmark with a grayscale source.
func Benchmark_drawRGBAGray(b *testing.B) {
	r := image.Rect(0, 0, 500, 500)
	src := image.NewGray(r)
	dst := image.NewRGBA(r)
	mask := image.NewAlpha(r)
	for i := 0; i < b.N; i++ {
		DrawMask(dst, r, src, image.Point{}, mask, image.Point{}, Over)
	}
}
I am not sure that is what you want.
Change https://golang.org/cl/351852 mentions this issue:
Thanks for the new numbers. I overlooked a code path in https://golang.org/cl/340049 which should be addressed by https://golang.org/cl/351852. Once 351852 lands, I'd be curious if you could re-run your benchmarks. I'll repeat that the "delta" column in your tables has no numbers, so it's hard to tell which rows I should focus on.
https://golang.org/cl/351852 has landed
This should have been part of https://golang.org/cl/340049 but I overlooked it. That commit added fast path code when the destination image was *not* an *image.RGBA. This commit edits func drawRGBA.

name      old time/op  new time/op  delta
RGBA1-4   5.11ms ± 1%  1.12ms ± 1%  -78.01%  (p=0.008 n=5+5)
RGBA2-4   8.69ms ± 1%  2.98ms ± 1%  -65.77%  (p=0.008 n=5+5)

Updates #44808.
Updates #46395.

Change-Id: I899d46d985634fc81ea47ff4f0d436630e8a961c
Reviewed-on: https://go-review.googlesource.com/c/go/+/351852
Trust: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Here are the results of my tests:
❯ cat ~/go.516d75ccf1.bench
goos: darwin
goarch: amd64
pkg: image/draw
cpu: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
BenchmarkFillOver-12 1240 919849 ns/op 0 B/op 0 allocs/op
BenchmarkFillSrc-12 50834 24552 ns/op 0 B/op 0 allocs/op
BenchmarkCopyOver-12 1705 699634 ns/op 0 B/op 0 allocs/op
BenchmarkCopySrc-12 55306 20768 ns/op 0 B/op 0 allocs/op
BenchmarkNRGBAOver-12 1422 841424 ns/op 0 B/op 0 allocs/op
BenchmarkNRGBASrc-12 2649 464700 ns/op 0 B/op 0 allocs/op
BenchmarkYCbCr-12 2835 436423 ns/op 0 B/op 0 allocs/op
BenchmarkGray-12 8022 152508 ns/op 0 B/op 0 allocs/op
BenchmarkCMYK-12 2582 461211 ns/op 0 B/op 0 allocs/op
BenchmarkGlyphOver-12 5244 228985 ns/op 0 B/op 0 allocs/op
BenchmarkRGBA1-12 1944 611994 ns/op 0 B/op 0 allocs/op
BenchmarkRGBA2-12 746 1626558 ns/op 0 B/op 0 allocs/op
BenchmarkPalettedFill-12 185510 6629 ns/op 0 B/op 0 allocs/op
BenchmarkPalettedRGBA-12 771 1607542 ns/op 40 B/op 2 allocs/op
BenchmarkGenericOver-12 745 1711477 ns/op 0 B/op 0 allocs/op
BenchmarkGenericMaskOver-12 1329 1115717 ns/op 0 B/op 0 allocs/op
BenchmarkGenericSrc-12 1329 899930 ns/op 0 B/op 0 allocs/op
BenchmarkGenericMaskSrc-12 982 1181788 ns/op 0 B/op 0 allocs/op
Benchmark_drawRGBA-12 355 3524321 ns/op 6392 B/op 0 allocs/op
Benchmark_drawRGBAGray-12 432 2776462 ns/op 3508 B/op 0 allocs/op
PASS
ok image/draw 28.130s
❯ cat ~/go.25a0774.bench
goos: darwin
goarch: amd64
pkg: image/draw
cpu: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
BenchmarkFillOver-12 1226 937482 ns/op 0 B/op 0 allocs/op
BenchmarkFillSrc-12 50300 24057 ns/op 0 B/op 0 allocs/op
BenchmarkCopyOver-12 1728 688587 ns/op 0 B/op 0 allocs/op
BenchmarkCopySrc-12 56661 21351 ns/op 0 B/op 0 allocs/op
BenchmarkNRGBAOver-12 1413 844938 ns/op 0 B/op 0 allocs/op
BenchmarkNRGBASrc-12 2599 459764 ns/op 0 B/op 0 allocs/op
BenchmarkYCbCr-12 2786 425523 ns/op 0 B/op 0 allocs/op
BenchmarkGray-12 8143 147877 ns/op 0 B/op 0 allocs/op
BenchmarkCMYK-12 2365 469891 ns/op 0 B/op 0 allocs/op
BenchmarkGlyphOver-12 4761 233635 ns/op 0 B/op 0 allocs/op
BenchmarkRGBA1-12 1956 604964 ns/op 0 B/op 0 allocs/op
BenchmarkRGBA2-12 1064 1131638 ns/op 0 B/op 0 allocs/op
BenchmarkPalettedFill-12 182772 6491 ns/op 0 B/op 0 allocs/op
BenchmarkPalettedRGBA-12 786 1536837 ns/op 40 B/op 2 allocs/op
BenchmarkGenericOver-12 775 1532559 ns/op 0 B/op 0 allocs/op
BenchmarkGenericMaskOver-12 1360 884728 ns/op 0 B/op 0 allocs/op
BenchmarkGenericSrc-12 1386 868615 ns/op 0 B/op 0 allocs/op
BenchmarkGenericMaskSrc-12 1164 1060872 ns/op 0 B/op 0 allocs/op
Benchmark_drawRGBA-12 500 2335115 ns/op 4538 B/op 0 allocs/op
Benchmark_drawRGBAGray-12 1375 882834 ns/op 1102 B/op 0 allocs/op
PASS
ok image/draw 26.094s
❯ benchstat -delta-test none ~/go.516d75ccf1.bench ~/go.25a0774.bench
name old time/op new time/op delta
FillOver-12 920µs ± 0% 937µs ± 0% +1.92%
FillSrc-12 24.6µs ± 0% 24.1µs ± 0% -2.02%
CopyOver-12 700µs ± 0% 689µs ± 0% -1.58%
CopySrc-12 20.8µs ± 0% 21.4µs ± 0% +2.81%
NRGBAOver-12 841µs ± 0% 845µs ± 0% +0.42%
NRGBASrc-12 465µs ± 0% 460µs ± 0% -1.06%
YCbCr-12 436µs ± 0% 426µs ± 0% -2.50%
Gray-12 153µs ± 0% 148µs ± 0% -3.04%
CMYK-12 461µs ± 0% 470µs ± 0% +1.88%
GlyphOver-12 229µs ± 0% 234µs ± 0% +2.03%
RGBA1-12 612µs ± 0% 605µs ± 0% -1.15%
RGBA2-12 1.63ms ± 0% 1.13ms ± 0% -30.43%
PalettedFill-12 6.63µs ± 0% 6.49µs ± 0% -2.08%
PalettedRGBA-12 1.61ms ± 0% 1.54ms ± 0% -4.40%
GenericOver-12 1.71ms ± 0% 1.53ms ± 0% -10.45%
GenericMaskOver-12 1.12ms ± 0% 0.88ms ± 0% -20.70%
GenericSrc-12 900µs ± 0% 869µs ± 0% -3.48%
GenericMaskSrc-12 1.18ms ± 0% 1.06ms ± 0% -10.23%
_drawRGBA-12 3.52ms ± 0% 2.34ms ± 0% -33.74%
_drawRGBAGray-12 2.78ms ± 0% 0.88ms ± 0% -68.20%
name old alloc/op new alloc/op delta
FillOver-12 0.00B 0.00B 0.00%
FillSrc-12 0.00B 0.00B 0.00%
CopyOver-12 0.00B 0.00B 0.00%
CopySrc-12 0.00B 0.00B 0.00%
NRGBAOver-12 0.00B 0.00B 0.00%
NRGBASrc-12 0.00B 0.00B 0.00%
YCbCr-12 0.00B 0.00B 0.00%
Gray-12 0.00B 0.00B 0.00%
CMYK-12 0.00B 0.00B 0.00%
GlyphOver-12 0.00B 0.00B 0.00%
RGBA1-12 0.00B 0.00B 0.00%
RGBA2-12 0.00B 0.00B 0.00%
PalettedFill-12 0.00B 0.00B 0.00%
PalettedRGBA-12 40.0B ± 0% 40.0B ± 0% 0.00%
GenericOver-12 0.00B 0.00B 0.00%
GenericMaskOver-12 0.00B 0.00B 0.00%
GenericSrc-12 0.00B 0.00B 0.00%
GenericMaskSrc-12 0.00B 0.00B 0.00%
_drawRGBA-12 6.39kB ± 0% 4.54kB ± 0% -29.01%
_drawRGBAGray-12 3.51kB ± 0% 1.10kB ± 0% -68.59%
name old allocs/op new allocs/op delta
FillOver-12 0.00 0.00 0.00%
FillSrc-12 0.00 0.00 0.00%
CopyOver-12 0.00 0.00 0.00%
CopySrc-12 0.00 0.00 0.00%
NRGBAOver-12 0.00 0.00 0.00%
NRGBASrc-12 0.00 0.00 0.00%
YCbCr-12 0.00 0.00 0.00%
Gray-12 0.00 0.00 0.00%
CMYK-12 0.00 0.00 0.00%
GlyphOver-12 0.00 0.00 0.00%
RGBA1-12 0.00 0.00 0.00%
RGBA2-12 0.00 0.00 0.00%
PalettedFill-12 0.00 0.00 0.00%
PalettedRGBA-12 2.00 ± 0% 2.00 ± 0% 0.00%
GenericOver-12 0.00 0.00 0.00%
GenericMaskOver-12 0.00 0.00 0.00%
GenericSrc-12 0.00 0.00 0.00%
GenericMaskSrc-12 0.00 0.00 0.00%
_drawRGBA-12 0.00 0.00 0.00%
_drawRGBAGray-12 0.00 0.00 0.00%
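The delta column in the benchstat output above is the relative change from the old to the new measurement. As a rough check, and assuming one sample per benchmark as in the two files shown, the percentages can be reproduced from the raw ns/op values. The function name `deltaPercent` is hypothetical:

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldNs to newNs, in percent.
// Negative values mean the new measurement is faster.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

func main() {
	// Raw ns/op values copied from the two benchmark files above.
	fmt.Printf("RGBA2          %+.2f%%\n", deltaPercent(1626558, 1131638))
	fmt.Printf("_drawRGBA      %+.2f%%\n", deltaPercent(3524321, 2335115))
	fmt.Printf("_drawRGBAGray  %+.2f%%\n", deltaPercent(2776462, 882834))
}
```

With a single run per file there is no variance information, so small deltas of a few percent in the other rows may just be noise; running each benchmark several times (for example with `-count`) would let benchstat apply its significance test.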
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
What did you do?
I am building a tool that manipulates a lot of images. The process is a bit slow.
I did some performance analysis, and it appears that the drawRGBA function is really time-consuming. In particular, the call to mask.At(mx,my).RGBA() causes a performance penalty.
I think that most uses of the DrawMask function pass an *image.Alpha structure as the mask. Therefore, adding a special case for an *image.Alpha mask could be valuable for most users.
I made a simple test and benchmark for the drawRGBA function (it should be done for other functions as well).
Then I made a patch as an experiment:
The performance enhancement is not negligible:
This is a test for opening the discussion, and I guess that other checks should be added, as it makes some tests of the draw package fail. Specifically, test.mask causes a panic, and I cannot figure out what test.mask is nor where it is defined.
What did you expect to see?
N/A
What did you see instead?
N/A