Speed. Hot nasty bad-ass speed! #8
Comments
From @mavenius on March 16, 2016 10:11 Might want to check out Scientist.Net for comparing performance and Thank you, Mark Avenius
|
From @rold2007 on March 17, 2016 8:35 The first task is obviously to identify the slow parts. I have never used BenchmarkDotNet but I heard about it recently, so I guess this is a good opportunity to try it! I had never heard of Scientist.Net before, but maybe it could even be used to compare against other similar libraries (AForge.Net, etc.)? I'll do some tests with BenchmarkDotNet to see if it does the job, and report back here. |
That would be great if you could. I had a read through the instructions but I am terrible when it comes to following them. |
From @rold2007 on March 23, 2016 8:9 I haven't been able to use BenchmarkDotNet with DNX so far, even on the most basic code. I'll report my issues to the BenchmarkDotNet team and see what can be done. Sorry about the delay. |
@rold2007 No worries, thanks for trying. I'm sure someone there will be happy to give us pointers. |
From @rold2007 on March 23, 2016 8:50 @JimBobSquarePants It is now resolved. I'll be quite busy until next week but I'll continue with this asap. |
@rold2007 Awesome! Here's hoping we can get some great info to find bottlenecks. |
From @voidstar69 on May 1, 2016 9:4 I have tried installing BenchmarkDotNet into ImageProcessorCore.Tests via Nuget, both the official release and the prerelease. I cannot use either - the prerelease appears to need the NETStandard libraries, and the official release appears to not support DNXCore 5.0. @JimBobSquarePants @rold2007 Have you managed to get BenchmarkDotNet working to benchmark an existing unit test in ImageProcessorCore? |
@voidstar69 Truth be told I've not tried, I've only used it outside for testing. I'll have a look as soon as I can. It's probably best, however, to put the benchmark tests into a separate .NET 4.6 console project so that I can do direct comparisons against System.Drawing. |
@voidstar69 Just added it to the repo now 😄 |
From @voidstar69 on May 19, 2016 21:12 @JimBobSquarePants Thanks for adding BenchmarkDotNet to the repo, looks like it will be very useful to compare against System.Drawing. I have started off by looking at the performance of Crop. On my laptop the System.Drawing version takes about 800us, while the Core version takes about 1380us.
to this:
This appears to speed up the crop from 1380us to about 1330us. A small but significant improvement. You should test for a similar performance improvement on your PC, in case this is specific to my PC. One downside is that the array index calculation is now exposed in Crop.cs, whereas before it was hidden away in ImageBase.this[].
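The code from this comment did not survive, so the following is only a rough sketch of the general idea being discussed: copying each cropped row with a single `Array.Copy` call instead of reading every pixel through an indexer. The flat `float[]` layout with four components per pixel, the `CropSketch` class, and the method signature are all assumptions made for illustration, not the library's actual types.

```csharp
using System;

public static class CropSketch
{
    // Assumed layout: row-major flat array, 4 components (e.g. RGBA) per pixel.
    private const int Components = 4;

    public static float[] Crop(float[] source, int sourceWidth, int x, int y, int width, int height)
    {
        float[] target = new float[width * height * Components];
        int rowLength = width * Components;

        for (int row = 0; row < height; row++)
        {
            // Index of the first component of the first cropped pixel in this source row.
            int sourceIndex = (((y + row) * sourceWidth) + x) * Components;
            int targetIndex = row * rowLength;

            // One bulk copy per row replaces width * Components indexer calls.
            Array.Copy(source, sourceIndex, target, targetIndex, rowLength);
        }

        return target;
    }
}
```

As the comment notes, the trade-off is that the index arithmetic now lives in the crop routine itself rather than behind an indexer.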
|
Hi @voidstar69 Thanks for having a look at this. Have you got the latest version of the codebase? I have a test against Crop here which resulted in Crop taking half the time of
I'll need to add a unit test to double check against the target. The current code was submitted as a bug fix.
I do call What I do need to know is whether I should be checking and calling |
From @mattwarren on May 20, 2016 8:53 Hi, I'm one of the developers for BenchmarkDotNet and it's cool to see you using it, I added you to our list. BTW there's a Diagnostics NuGet package that might help you out. It gives you information about memory allocations and method inlining that might be useful to have in your benchmarks. |
Hi @mattwarren thanks for dropping by. That package sounds like it would definitely be very useful thank you! Happy to have made your list. 😄 |
From @voidstar69 on May 21, 2016 18:17 Hi @JimBobSquarePants, I got the latest code from the Core branch before I tried my changes, and I did test them in Release mode. That is the same test that I ran on my machine, but the performance profile is different for me (i5 CPU, 4 logical processors). What were your timings in us (microseconds)? If you try out my Array.Copy code, does it make Crop faster or slower on your machine? You are right about As for |
@voidstar69 I've taken your advice on board re As for timings. Here's the timing on my main laptop. It's got waaay fewer cores than my work machine which has twelve. On that machine ImageProcessorCore is 0.46 when scaled which is good and quick.
Using your code addition
I ran the tests a few times and the difference was negligible so I don't want to change the code yet (big emphasis on the yet) You have got me thinking though which is awesome. |
One thing that we should definitely have a look at which affects all processes is how I manage threading. in I determine the number of tasks based on multiplying the processor count by 2 which I will freely admit was a number plucked out from the air. I would love to have someone review this process to see if I am falling short somewhere and taking the wrong approach. |
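To make the question above concrete, here is a minimal sketch of one way rows could be split into a fixed number of contiguous ranges and processed in parallel. It is not the library's implementation; `RowPartitionSketch` and `ForEachRowRange` are hypothetical names, and the task count is simply passed in so that different multipliers of `Environment.ProcessorCount` can be benchmarked against each other.

```csharp
using System;
using System.Threading.Tasks;

public static class RowPartitionSketch
{
    // Splits [0, height) into at most taskCount contiguous ranges
    // and invokes processRows(startInclusive, endExclusive) for each in parallel.
    public static void ForEachRowRange(int height, int taskCount, Action<int, int> processRows)
    {
        if (height <= 0)
        {
            return;
        }

        // Never create more ranges than there are rows.
        int count = Math.Max(1, Math.Min(taskCount, height));

        // Ceiling division so that every row is covered.
        int rowsPerTask = (height + count - 1) / count;

        Parallel.For(0, count, i =>
        {
            int start = i * rowsPerTask;
            int end = Math.Min(start + rowsPerTask, height);

            // Trailing ranges can be empty when height is not a multiple of rowsPerTask.
            if (start < end)
            {
                processRows(start, end);
            }
        });
    }
}
```

A call such as `RowPartitionSketch.ForEachRowRange(height, Environment.ProcessorCount * 2, (start, end) => { /* process rows */ });` mirrors the "processor count times 2" choice described above. Whether that multiplier beats letting `Parallel.For` partition the rows on its own is something only measurement can settle.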
Well ain't this nice... 😄
|
From @alexmbaker on June 7, 2016 9:10 You may find something useful in https://github.com/Microsoft/Microsoft.IO.RecyclableMemoryStream |
@mattwarren Just to follow up on your package recommendation. My project threw a wobbler citing a missing .dll when I tried to use the diagnostic package. Are there known issues on .NET Core or am I doing something daft? |
From @mattwarren on June 29, 2016 8:40 Have you added the BenchmarkDotNet.Diagnostics.Windows package to your project as well? |
@mattwarren Yeah I do. Here's the message. I'll open an issue if you like?
|
From @danijel-peric on July 1, 2016 22:9 My test for resize used a 960x540 JPEG image and resized it to 320x240. Code:
using (MemoryStream inStream = new MemoryStream(Image))
using (MemoryStream outStream = new MemoryStream())
using (ImageFactory imageFactory = new ImageFactory())
    imageFactory.Load(inStream)
        .Resize(new Size(320, 240))
        .Format(new JpegFormat { Quality = 50 })
        .Save(outStream);
If you want the code used for Windows.Media.Imaging or System.Drawing.Graphics, let me know. Note: I tested your library because I was trying to find a fast resize library that doesn't use much CPU. In this test Windows.Media.Imaging used 15% CPU on my machine, while your resize used 40-50%.
|
Thanks for your info but you're actually benchmarking the wrong codebase. This issue has been raised specifically for ImageProcessor.Core. There's going to be some overhead using the legacy |
From @rold2007 on July 3, 2016 1:0 @danijel-peric : Also, make sure you compare apples with apples. Resize methods like nearest neighbor and bicubic have a very different speed, and quality. But I'm not even sure what ImageProcessor is using. |
From @danijel-peric on July 3, 2016 7:25 :), it looks like I tested with the old ImageFactory from https://www.nuget.org/packages/ImageProcessor, sorry. I will do another test with ImageProcessor.Core. To get it, do I need to download the source and build it, or do you have binary files somewhere for download? |
Hi @NeelBhatt Thanks for the publicity! 😄 I'd just like to suggest some updates to your blog if that's ok to correct a few details. Just to clarify. The ImageProcessor version (2.4.4) published in Nuget is the legacy version that runs on the full .NET Framework. That version is in a separate branch in this repo called Framework. What you are looking for to work on .NET Standard 1.1 + is the ImageProcessorCore package which is only on MyGet for the time being. That is hosted here. https://www.myget.org/gallery/imageprocessor I hope that clears things up a little. Cheers James |
From @NeelBhatt 14th July 2016 Oh okay, so you mean for .NET Core we need to download it from the link you mentioned in your comment, correct? Also, can it be downloaded from your GitHub repository? I will surely change that. Thanks :) |
Just to note. Focus is now on Encoders/Decoders. Bmp is great! The rest.... |
Hi, I "ported" your projects to .NET 4.6 & managed to play with DotPeek. |
Hey @antonfirsov That's great! Which classes in particular were you looking at? Please say the Jpeg decoder/encoder... I would ❤️ if you could have a look at that if you could? I'm focusing on png just now. |
Yes, it's the jpeg :) |
Excellent stuff! I got memory usage down on the encoder but the decoder is a mess! |
Lost some of my initial optimism. Only managed to gain ~10% speed growth by reducing allocations + array flattening in the jpeg decoder. Even with ArrayPool-s! The main bottlenecks are the After analyzing the libjpeg code, I have the following conclusions:
I can play a bit more this weekend, but if you want to go 100% managed, someone has to become a jpeg expert sooner or later. |
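For readers unfamiliar with the allocation-reduction technique mentioned above, this is a minimal sketch of renting a scratch buffer from `System.Buffers.ArrayPool<T>` once and reusing it for every block, rather than allocating a new array per block. The `PooledBufferSketch` class and its "double and sum" work are placeholders assumed for illustration; they stand in for real per-block decoder work and are not taken from the jpeg decoder.

```csharp
using System;
using System.Buffers;

public static class PooledBufferSketch
{
    // Processes the input in blocks of blockSize bytes, reusing one pooled scratch buffer.
    public static int[] SumBlocks(byte[] input, int blockSize)
    {
        int blockCount = input.Length / blockSize;
        int[] sums = new int[blockCount];

        // Rent may return an array longer than requested; only the first blockSize entries are used.
        int[] scratch = ArrayPool<int>.Shared.Rent(blockSize);
        try
        {
            for (int b = 0; b < blockCount; b++)
            {
                // Placeholder for real per-block work: fill the scratch buffer, then consume it.
                for (int i = 0; i < blockSize; i++)
                {
                    scratch[i] = input[(b * blockSize) + i] * 2;
                }

                int sum = 0;
                for (int i = 0; i < blockSize; i++)
                {
                    sum += scratch[i];
                }

                sums[b] = sum;
            }
        }
        finally
        {
            // Always hand the buffer back, even if processing throws.
            ArrayPool<int>.Shared.Return(scratch);
        }

        return sums;
    }
}
```

Pooling removes garbage collector pressure but not the arithmetic itself, which fits the modest gain reported above when the real cost is in the transform and decoding loops.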
@antonfirsov That's still progress so don't lose heart yet. I have a big book on the jpeg spec at home and am investigating the different implementations. Already forked libjpeg-turbo 😄 Translating to the floating point version shouldn't be beyond us. There's a faster, slightly less accurate version here also https://github.com/libjpeg-turbo/libjpeg-turbo/blob/master/jidctflt.c FluxJpeg has implementations we can adapt that i'd like to investigate also. Intrigued by their inverse... It seems very simplistic. |
I tried the FluxJpeg implementation ... actually it's slower. Maybe we need a floating point IDCT with System.Numerics. |
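As a reference point for what a floating point IDCT with System.Numerics could look like, here is a deliberately naive sketch: an 8-point one-dimensional inverse DCT where each output sample is computed with two `Vector4.Dot` calls against precomputed basis weights. This is the direct O(N²) definition, not the fast factored algorithm found in libjpeg-style implementations, and `IdctSketch` is a hypothetical name. It is mainly useful as a correctness baseline to test a faster version against.

```csharp
using System;
using System.Numerics;

public static class IdctSketch
{
    // For output sample n, BasisLow[n] holds the weights for coefficients 0..3
    // and BasisHigh[n] holds the weights for coefficients 4..7.
    private static readonly Vector4[] BasisLow = new Vector4[8];
    private static readonly Vector4[] BasisHigh = new Vector4[8];

    static IdctSketch()
    {
        for (int n = 0; n < 8; n++)
        {
            float[] w = new float[8];
            for (int k = 0; k < 8; k++)
            {
                // Standard DCT scaling: 1/sqrt(2) for the DC term, 1 otherwise.
                double scale = k == 0 ? Math.Sqrt(0.5) : 1.0;
                w[k] = (float)(0.5 * scale * Math.Cos(((2 * n) + 1) * k * Math.PI / 16.0));
            }

            BasisLow[n] = new Vector4(w[0], w[1], w[2], w[3]);
            BasisHigh[n] = new Vector4(w[4], w[5], w[6], w[7]);
        }
    }

    // Naive 8-point 1D inverse DCT: output[n] = sum over k of weight[n][k] * coefficients[k].
    public static void Idct8(float[] coefficients, float[] output)
    {
        Vector4 low = new Vector4(coefficients[0], coefficients[1], coefficients[2], coefficients[3]);
        Vector4 high = new Vector4(coefficients[4], coefficients[5], coefficients[6], coefficients[7]);

        for (int n = 0; n < 8; n++)
        {
            output[n] = Vector4.Dot(BasisLow[n], low) + Vector4.Dot(BasisHigh[n], high);
        }
    }
}
```

A full 8x8 block transform would apply this to every row and then every column. Whether the `Vector4` operations are actually hardware accelerated depends on the runtime and JIT in use, so any speed-up should be confirmed with a benchmark.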
This library contains a libjpeg port in C#: Shouldn't we adapt this instead of the golang port? |
It's libjpeg.net. You don't wanna see the source, it's brutal and super slow. |
Could you PR with what you have please and I'll have a tinker from there. |
As an initial exercise I'll try to PR scanline buffer reuse in the PNG decoder filters, it seems there are redundant allocations happening there. |
@boguscoder Thanks! |
Just sent a PR, but conflicted :( I'm not sure if it's worth merging in its current form. Currently I'm experimenting with SIMD IDCT implementations. Trying to port this: |
I have good news with JpegDecoder!
Host Process Environment Information:
BenchmarkDotNet-Dev.Core=v0.9.9.0
OS=Microsoft Windows NT 6.1.7601 Service Pack 1
Processor=Intel(R) Core(TM) i7-4810MQ CPU 2.80GHz, ProcessorCount=8
Frequency=2728115 ticks, Resolution=366.5535 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1076.0
Type=DecodeJpeg Mode=Throughput
Current master branch:
My (experimental) master with SIMD optimizations:
Are you interested in pulling this in? :) There are lots of additional classes & test cases this time. |
Awwwwwww yeaaaaaaaaah! That's a brilliant improvement! Definitely looking to pull the changes in though I'll need you to strip your PR down to the minimum changes please first. You've got 28 files changed so far, some of which are completely unrelated (png filters) |
@JimBobSquarePants I'd like to help out on perf investigations, any pointers on where I can start/what to focus on? Should we start with some documentation on benchmarking in ImageSharp and an overview of the areas and if anyone is actively looking into them? 😄 Btw, I ran Also, I searched a bit on benchmarking with xUnit, found NBench and xUnit Perf, not sure if anyone has already looked into these. |
@olivif benchmarks are very good starters, but you don't get too far in performance analysis & optimization without a profiler. There is a built-in profiler in VS, but it's quite slow, it's better to grab JetBrains dotTrace if possible. And here is the tricky part: If the infrastructure is ready, it's really easy to profile .NET 4.6 code with dotTrace or with VS profiler. Finding and eliminating bottlenecks is an exciting game, think of a detective movie! :) |
Ahh I think I see now. Benchmarking will give you the overall numbers and help you compare against a baseline (be it a different lib or an earlier version of your lib), whereas the profiler will actually tell you what is slow (or provide enough info on all the parts so you can hunt yourself). Hmm I'm surprised the VS profiler didn't work with a core project, maybe the tooling for core is not all fully out there. |
Closing this as we have come a long way performance wise and it does no good to keep this issue open. |
prevent stylecop.json form being included in package
From @JimBobSquarePants on March 16, 2016 9:44
We've got a pretty good feature set for a V1 release.
Now we need to make it fast.
Things to look at:
ArrayPool, Slice<T>
For benchmarking we can use BenchmarkDotNet now that the prerelease supports CoreFX.
Add your thoughts below.
Copied from original issue: JimBobSquarePants/ImageProcessor#347