New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert image to grayscale #7

Closed
pfertyk opened this Issue Nov 20, 2016 · 15 comments

Comments

Projects
None yet
3 participants
@pfertyk

pfertyk commented Nov 20, 2016

I'm new to crystal. I tried to use stumpy_png to convert an image to grayscale. The code I created looks like this:

require "stumpy_png"

canvas = StumpyPNG.read("image.png")

(0...canvas.width).each do |x|
  (0...canvas.height).each do |y|
    color = canvas[x, y]
    g = 0_u32
    g = (g + color.r + color.g + color.b) / 3
    canvas[x, y] = StumpyPNG::RGBA.from_gray_n(g, 16)
  end
end

StumpyPNG.write(canvas, "output.png")

Note: I know that grayscale convertion is a bit more complicated than the average value of 3 colors, but this is a simple example.

As you can see, the code looks very complicated. It seems that I cannot assign the value to the color directly (canvas[x, y].r = 0 doesn't work), so I need to assign the color to a variable and then create a new color. I also need to specify the type of the grayscale value (UINT32), otherwise I get incorrect values for bright colors. Moreover, the performance is very unsatisfactory (a full HD image was converted in around 4s).

All this makes me think that I'm doing something wrong. I found some occurences of the term 'grayscale' in stumpy_png repo, but I don't really know how to use it. Could you please provide an example of how to correctly (that is: efficiently and maybe in a less verbose way) solve this problem?

Some things I would like to know are:

  • is there a predefined grayscale filter in stumpy_png?
  • can I write a color directly (without creating new RGBA object)?
  • is there an easier way to apply an operation to each pixel?
@l3kn

This comment has been minimized.

Collaborator

l3kn commented Nov 20, 2016

Hi,

I'll try to answer your questions in order:

  1. When creating this lib (and stumpy_gif),
    I tried to come up with some kind of base infrastructure
    for canvases and colors (stumpy_core)
    so the functionality is very minimal (basically setting & getting pixels + creating colors),
    so what you are doing is pretty much the best way ;)

    Color filters would definitely be useful,
    but I'm not sure in what part of this abstraction they would belong

  2. In crystal there are classes (like in Java / Ruby / etc. ) and structs,
    structs are immutable but allocated in a different way
    which makes them (in most cases) much faster,
    so I chose to implement the canvas as a class
    and the single colors as structs.

    As a result every time you "change" a color,
    you need to create a new object.

  3. Currently not.

About the bad performance:

Can you upload the image your are using somewhere
or tell me its dimensions so I can do some benchmark?

Are you compiling your program with the --release flag?
If not, maybe that will make it a lot faster.

Possible solutions:

Maybe it is possible to implement the Enumerable
interface for the Canvas class, so it could provide
methods like canvas.map!(&block).

canvas.map! do |color|
    g = (g + color.r + color.g + color.b) / 3
    StumpyPNG::RGBA.from_gray_n(g, 16)
end

Additionally, if you come up with a nice function to convert colors to grayscale,
I could add it to the RGBA struct, so the above code could be simplified even more:

canvas.map!(&.to_grayscale)

(map!(&.to_grayscale) is equivalent to map! { |c| c.to_grayscale })

Would you find those changes useful?
If so, would you like to try to implement them yourself
and open up a pull request or shall I do that?

Edit: Enumerable does not seem to be the way to go,
because it provides a lot of functions that would not make sense in this context,
like .select, .product, etc.

@pfertyk

This comment has been minimized.

pfertyk commented Nov 21, 2016

Thanks for the --release flag hint, now the program runs in around 1s :) Still, a program in Python runs in 0.2s for the same image:

from PIL import Image

im = Image.open('image.png')
im_gray = im.convert('L')
im_gray.save('output.png')

As I mentioned, the image size is full HD (1920x1080). Here it is:
image

I am disappointed by the performance. I guess it is the result of using colors as structs and creating a new one for each pixel.

Please don't get me wrong. Your shard creates a very useful abstraction for images, and it's great that you help Crystal community by creating tools like this one. But still, Crystal was supposed to be efficient like C while still having a high level syntax like Ruby (I am not a Rubyist, but I could appreciate a language with both these advantages). It seems however that, at least in this case, Crystal is neither high level (the code is quite verbose) nor efficient (compared with Python, which is supposed to be slower than anything you can compile!).

Let's leave the verbosity of the code for a moment. Do you think something can be done here about the performance? Some preferably small change to stumpy_png itselt or to my code that would boost it up a bit?

Regarding your question, for the moment I don't want to create a PR for grayscale function, since I want to focus on improving the performance of my solution (not the readability).

@l3kn

This comment has been minimized.

Collaborator

l3kn commented Nov 21, 2016

I inserted some code to get the elapsed time after each of the operations
(t1: load image, t2: convert to grayscale, t3: save image)
and it seem like the most time is wasted loading and saving the image

Crystal:

t2 - t1 # => 00:00:00.4599181
t3 - t2 # => 00:00:00.0112788
t4 - t3 # => 00:00:00.2979503

Python:

0.11043930053710938
0.052173614501953125
0.08035922050476074

This is not very scientific but by the looks of it,
the grayscale conversion in crystal is faster than the one in python.

On the other hand loading and saving the image is much faster in python,
maybe PIL uses bindings to libpng or their code is more efficient.

Additionally the output png of the python version is only ~240kb in size
and stored in the 8bit grayscale png format,
this shard currently only supports writing images in 16bit rgb+alpha format,
so the resulting size is ~830kb.

It seems however that, at least in this case, Crystal is neither high level (the code is quite verbose) nor efficient

I would say that is definitely the fault of my code
and not of the language itself.

This lib it's still in a very early stage of development
and there are a lot of areas where performance could be improved.

@pfertyk

This comment has been minimized.

pfertyk commented Nov 21, 2016

Your measurements seem to be consistent with my observations. Initially I was printing the width and height of the loaded image (to check if the loading process was successful) and it took a moment for the output to appear. So it seems that the loading process itself takes a lot of time.

Your shard is one of two that enable image processing. The other one still has TODO in the readme file :) Also, your project is the only one listed here, so that makes you the author of the most popular Crystal image processing lib :)

Crystal is already 2 years old (5 if you count the initial version). I don't have much experience with young programming languages, but I thought that the number of available tools would be higher at this stage of development. Perhaps I was wrong and it takes more time. Or perhaps I chose the wrong topic (I guess HTTP server is something that Crystal programmers need more and it would be more intensively developed).

I just wanted a quick dive in Crystal to check how it works, how the code looks like and what is the performance. It seems though that this language is a bit too young to be fully useful. I've had some issues with installing shards, and now the most popular image processing lib turns out to be in a very early stage of development. It seems that Crystal is still a language that needs a lot of contributions. I might help in the future, but that was not the purpose of this little experiment.

Thank you very much for your help and quick answers! It seems that great community is something that Crystal does not lack :) I hope that you will continue working on this lib and make it faster and easier to use than PIL :)

@l3kn

This comment has been minimized.

Collaborator

l3kn commented Nov 21, 2016

Wow, judging from some comments in the source code
PIL is seems to be more than 20 years old.

I guess these kinds of performance optimizations
just take time (or manpower), I'm pretty new to the language myself,
maybe there are some performance tricks I don't know of (yet).

Some Benchmarks

Because stumpy_png can only write 16bit-RGBA images,
I used one as a input, too.

Go image/png

  • reading: 110ms
  • writing: 650ms

The resulting file is a 16bit-RGB image, ~800kb

Python Pillow

  • reading: 90ms
  • writing: 360ms

The resulting file is a 8bit-RGBA image, ~720kb

Crystal stumpy_png

  • reading: 680ms
  • writing: 320ms

The resulting file is a 16bit-RGBA image, ~1100kb

Ruby chunky_png

  • reading: 2200ms
  • writing: 1600ms

The resulting file is a 8bit-RGB image, ~660kb

Ruby oily_png

  • reading: 110ms
  • writing: 320ms

The resulting file is a 8bit-RGB image, ~660kb

So the writing speed is actually not that bad,
I only need to find out why reading takes so long...

Possible solutions

  • Make reading faster
  • Allow writing other png color types (hard)
@asterite

This comment has been minimized.

Contributor

asterite commented Nov 21, 2016

(It seems @l3kn beat me for a few seconds :-D)

@pfertyk Hi! I've profiled your program and timed each of the parts (read, transform, write), here are the results:

Read: 427.93ms (61.14%)
Transform: 15.98ms (2.28%)
Write: 256.06ms (36.58%)
Total: 699.96ms

That means that transforming the PNG isn't the bottleneck. I didn't think it was because an RGBA is a struct of four UInt16, so in total it's 64 bytes so it's a word and passing that around should be very fast.

We still need to optimize the read and write parts. I believe it's possible, I think there are many intermediate memory allocations that could be avoided. As with every language, you can write slow code in it, using a language doesn't guarantee good efficiency.

I'll try to see how this can be done (I really like optimizing programs and efficient code, and I really enjoy programs where you can see something nice as the output, such as an image ^_^)

@pfertyk

This comment has been minimized.

pfertyk commented Nov 21, 2016

After this comment (by @l3kn) I had no doubt that the processing itself is not a problem here. But the problem for me (as a developer) is that the library does not perform as well as expected. It doesn't matter if the problem is processing the image or reading/writing it. I wanted to convert images efficiently and I couldn't do it.

That is nothing that a good contributor couldn't fix, of course :) I can't wait to see how stumpy_png will work after your modifications, @asterite. Image processing is a domain that could really benefit from high performance.

@l3kn

This comment has been minimized.

Collaborator

l3kn commented Nov 29, 2016

Now there are some ways to make the code a bit less verbose

require "stumpy_png"

canvas = StumpyPNG.read("image.png")

canvas.map! do |color|
  g = (color.r.to_u32 + color.g + color.b) / 3
  StumpyPNG::RGBA.new(g.to_u16)
end

StumpyPNG.write(canvas, "output.png")

and (thank to @asterite) the performance is a lot better.

@l3kn l3kn closed this Nov 29, 2016

@asterite

This comment has been minimized.

Contributor

asterite commented Nov 29, 2016

@l3kn Nice!

Would it help to have a to_gray method on RBGA that does that? That way you could do:

require "stumpy_png"

canvas = StumpyPNG.read("image.png")
canvas.map! &.to_gray
StumpyPNG.write(canvas, "output.png")

That way all the "verbosity" would be gone and it would be almost the same as the python version. I don't know if to_gray makes sense by itself, though, mostly because of the note in the first comment in this thread:

Note: I know that grayscale convertion is a bit more complicated than the average value of 3 colors, but this is a simple example.

@pfertyk

This comment has been minimized.

pfertyk commented Nov 29, 2016

@l3kn I tried your code, but I got this error:

$ crystal build --release file.cr 
Error in ./file.cr:3: instantiating 'StumpyPNG:Module#read(String)'

canvas = StumpyPNG.read("image.png")
                   ^~~~

in ./libs/stumpy_png/stumpy_png.cr:9: instantiating 'StumpyPNG::PNG:Class#new()'

    png = PNG.new
              ^~~

instantiating 'StumpyPNG::PNG#initialize()'

in ./libs/stumpy_png/stumpy_png/png.cr:39: undefined constant IO::Memory

      @idat_buffer = IO::Memory.new
                     ^~~~~~~~~~

Could you help me with this?

@l3kn

This comment has been minimized.

Collaborator

l3kn commented Nov 29, 2016

Which crystal version are you using?

IO::Memory was added with 0.20.0 (as far as I know)
which is pretty new, so hopefully a update would fix your error.

@pfertyk

This comment has been minimized.

pfertyk commented Nov 29, 2016

That was it, thanks! The code works and takes around 0.48s to execute.

I've got to say, I'm impressed. In just a few days you guys doubled the speed and improved the readability of the code. Crystal has a really amazing community :)

Do you think it would be possible to make this program work faster than the one in Python (so less than 0.2s)? It is a bit weird for a compiled language to lose the speed competition with Python ;)

@l3kn

This comment has been minimized.

Collaborator

l3kn commented Nov 29, 2016

@asterite One of the next features I'd like to implement
would the ability to save images in different color types and bit depths
(currently only RGBA with 16bit is supported).

This would require methods like .to_gray etc,
so (unless someone wants to have this right now)
I'd add all of that in one go.

@pfertyk Thanks!

I'm pretty sure it is possible (and going to happen soon) ;)

One of the reasons the python version is faster
is that it supports writing images as RGBA 8-bit and grayscale
so the amount of bytes it has to process when writing
is a little bit lower.

If I take out the grayscale conversion in both programs
the python one runs in 0.400s (and outputs RGBA-8bit instead of 16bit)
and the crystal one in 0.480s, so depending on what you are doing
with the images in your program, the speed difference might not be that great.

@asterite

This comment has been minimized.

Contributor

asterite commented Nov 30, 2016

@pfertyk Also remember that PIL (or Pillow, as I understand) has most of the core of the processing code written in C. Check the project, click on the colored bar bellow "commits", "branches", etc, you'll see: Python 59.2%, C 39.8%, etc. The conversion code is here. So this is Crystal vs. C, really, so it's no wonder C is faster right now, specially with 15 years ahead of us :-)

@l3kn

This comment has been minimized.

Collaborator

l3kn commented Dec 2, 2016

I added the ability specify the color mode and bit depth of the resulting image,
now all you need to do to get a grayscale image is:

require "stumpy_png"

canvas = StumpyPNG.read("image.png")
# No need to convert to grayscale here
StumpyPNG.write(canvas, "output.png", color_type: :grayscale, bit_depth: 8)

The python version seems to be using some other method (not averaging red, green and blue)
to calculate the grayscale value and outputs a smaller (file size) image
but besides that the results are pretty much the same (in terms of png file types).

The updated (very scientific) benchmarks are:

Crystal

  • read: 140ms (90ms if the input image has a color depth of 16)
  • transform: 0ms (done when writing)
  • write: 40ms
  • total: 180ms (130ms if the input image has a color depth of 16)

Python

  • read: 90ms
  • transform: 50ms
  • write: 80ms
  • total: 220ms

So the crystal version is now (for this specific use case)
faster than the python version 🎉

Edit:

Currently I'm not 100% sure why the 16bit version in crystal is faster,
one reason is that the 16bit input file uses the simplest filtertype (0)
for all scanlines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment