Fast access of bitmap buffer with numpy #45

Open
jiong3 opened this issue Feb 5, 2017 · 38 comments

Comments

@jiong3
Contributor

jiong3 commented Feb 5, 2017

Hi,

Currently the bitmap buffer can be accessed using freetype.Bitmap.buffer, which returns a Python list of all the bytes. I can then use np.fromiter to get a numpy array; however, because of the Python loop over all the bytes, this is really slow.

Is there a way to access the memory that the buffer points to directly with numpy? Anything I have to consider if I try to do that?

@rougier
Owner

rougier commented Feb 5, 2017

Good point. I think numpy.frombuffer might be useful in such a case, but I've never really experimented with it. It might be a good starting point, though.

@jiong3
Contributor Author

jiong3 commented Feb 7, 2017

So I had a look around on the internet and found different ways to do that:

import ctypes
import numpy as np

@staticmethod
def get_np_array0(bitmap, num_bytes):
    # 34.625 / 36.874
    return np.fromiter(bitmap.buffer, dtype=np.uint8)

@staticmethod
def get_np_array1(bitmap, num_bytes):
    # 19.933 / 21.485
    return np.fromiter(bitmap._FT_Bitmap.buffer, dtype=np.uint8, count=num_bytes)

@staticmethod
def get_np_array2(bitmap, num_bytes):
    # 0.037 / 1.158, int_asbuffer is not documented
    return np.core.multiarray.int_asbuffer(ctypes.addressof(bitmap._FT_Bitmap.buffer.contents), num_bytes)

@staticmethod
def get_np_array3(bitmap, num_bytes):
    # 0.418 / 1.540, potential memory leak according to github issue 6511
    return np.ctypeslib.as_array(bitmap._FT_Bitmap.buffer, (num_bytes,))

@staticmethod
def get_np_array4(bitmap, num_bytes):
    # 0.072 / 1.242
    bfm = ctypes.pythonapi.PyBuffer_FromMemory
    bfm.restype = ctypes.py_object
    buffer = bfm(bitmap._FT_Bitmap.buffer, num_bytes)
    return np.frombuffer(buffer, dtype=np.uint8)

@staticmethod
def get_np_array5(bitmap, num_bytes):
    # 0.079 / 1.145
    buffer = ctypes.cast(bitmap._FT_Bitmap.buffer, ctypes.POINTER(ctypes.c_ubyte * num_bytes))
    return np.frombuffer(buffer.contents, dtype=np.uint8)

The numbers in the comments are from cProfile (cumtime of get_np_arrayX) / (cumtime of main function), just to get an idea of the performance. I rendered 10000 characters.

Two things I am not sure about and that might be relevant:
When is the memory of the buffer freed?
When is bitmap.pitch different from bitmap.width, and when is it negative?
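
For readers on Python 3, here is a minimal sketch along the lines of method 5 above. `PyBuffer_FromMemory` (method 4) belongs to the Python 2 C API, so it is not an option there. The helper name `buffer_to_array` and the stand-in ctypes array are assumptions made for illustration only; with freetype-py the pointer would presumably be `bitmap._FT_Bitmap.buffer`, as in the snippets above. Copying by default sidesteps the open question of when the buffer memory is freed.

```python
import ctypes
import numpy as np


def buffer_to_array(ptr, num_bytes, copy=True):
    """Expose num_bytes behind a ctypes POINTER(c_ubyte) as a uint8 array.

    Hypothetical helper, not part of freetype-py.
    """
    if num_bytes == 0:
        # Some glyphs (e.g. a space) may have an empty bitmap.
        return np.zeros(0, dtype=np.uint8)
    # Reinterpret the byte pointer as a pointer to a fixed-size array;
    # the array object supports the buffer protocol that np.frombuffer needs.
    array_ptr = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_ubyte * num_bytes))
    view = np.frombuffer(array_ptr.contents, dtype=np.uint8)
    # A copy stays valid even if the original memory is reused or freed.
    return view.copy() if copy else view


# Stand-in for the memory FreeType would own.
backing = (ctypes.c_ubyte * 6)(0, 64, 128, 255, 32, 16)
ptr = ctypes.cast(backing, ctypes.POINTER(ctypes.c_ubyte))

copied = buffer_to_array(ptr, 6)              # independent copy
aliased = buffer_to_array(ptr, 6, copy=False)  # shares memory with `backing`
```

With `copy=False` the array is only a view, so it must not outlive the memory it points to.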

@rougier
Owner

rougier commented Feb 7, 2017

Nice! But your last question reminds me that we may have a problem with the width/pitch difference.

The explanation can be found here:
https://www.freetype.org/freetype2/docs/reference/ft2-basic_types.html#FT_Bitmap

I'm not quite sure I understand it correctly.

@jiong3
Contributor Author

jiong3 commented Feb 8, 2017

Here's another explanation of the pitch:
https://www.freetype.org/freetype2/docs/glyphs/glyphs-7.html

The way I understand it, for just reading the buffer into a numpy array, num_bytes = rows * abs(pitch) should work correctly in all cases. If the pitch is negative, the order of the rows has to be reversed (easy to do in numpy). Since the pitch is the number of bytes per row and the width is the number of pixels, both are the same for a normal grayscale bitmap (1 pixel = 1 byte); for a black and white image (1 pixel = 1 bit), however, you have to unpack the pixels. That should be equally easy on a numpy array.
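
The steps described above could be sketched roughly as follows. This follows the reading of pitch given in this comment (abs(pitch) bytes per row, rows reversed for a negative pitch, bits unpacked for 1-bit bitmaps) and has not been checked against real FreeType output. The function name `rows_from_buffer` and its parameters are made up for the example.

```python
import numpy as np


def rows_from_buffer(flat, rows, width, pitch, mono=False):
    """Turn a flat byte sequence of rows * abs(pitch) bytes into a 2D image.

    Hypothetical helper: `flat` is any uint8 array-like holding the raw bytes.
    """
    stride = abs(pitch)
    image = np.asarray(flat, dtype=np.uint8).reshape(rows, stride)
    if pitch < 0:
        # Assumed "up" flow: the first bytes belong to the bottom row.
        image = image[::-1]
    if mono:
        # 1 pixel = 1 bit, most significant bit first: expand to 1 byte per pixel.
        image = np.unpackbits(image, axis=1)
    # Drop padding bytes (or padding bits) at the end of each row.
    return image[:, :width]


# Grayscale: 2 rows, 3 pixels wide, 1 padding byte per row.
gray = rows_from_buffer([1, 2, 3, 0, 4, 5, 6, 0], rows=2, width=3, pitch=4)

# Same bytes with a negative pitch: row order is reversed.
flipped = rows_from_buffer([1, 2, 3, 0, 4, 5, 6, 0], rows=2, width=3, pitch=-4)

# Monochrome: 1 row, 10 pixels packed into 2 bytes.
mono = rows_from_buffer([0b10100000, 0b01000000], rows=1, width=10, pitch=2, mono=True)
```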

I think it would make sense to include something that can be used directly with np.frombuffer into the library, maybe method number 4 or 5.

The remaining question is: should the user immediately create a copy of the array? I am not sure how and when the memory of the buffer will be freed.

@rougier
Owner

rougier commented Feb 8, 2017

We can also directly return a copy (just in case). I think freetype can free the glyph anytime so it might be safer to return a copy.

@rougier
Owner

rougier commented Mar 18, 2017

@StephewZ Can you open a new issue for this problem ?

@HinTak
Collaborator

HinTak commented Apr 18, 2017

Sigh. You guys don't understand what 'pitch' is. It is not the same as width, nor number of pixels in gray. It is a memory offset. It is the same concept as what is called 'stride' in numpy lingo. (see https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.ctypes.html , or whatever else is available on numpy).

The idea is that computers are a lot more efficient when dealing with, say, 4-byte or 8-byte chunks. So when you want to fast-forward or backward in memory, you want to do so in such units, instead of bytes.
For bits, it is obvious that pitch is AT LEAST (the number of bits rounded up to multiple of 8)/8, since you can't fast-forward by half a byte. But for grays, you might have stride being width rounded up to multiple of 4, or 8, depending on whether you are on a 32-bit or a 64-bit platform.

Pitch is the distance between the two memory locations of the beginning of row1 and row2, etc. It is always larger than (bit-depth * pixel width) /8 , because memory locations like to be aligned to multiple of 4 or 8, depends on platform. i.e. if you have 17 pixels of gray per row, it is possible that stride can be 20 or 24.

It is called pitch by some, but called stride in numpy's multi-dimensional array type's documentation.
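
To make the pitch/stride correspondence concrete, here is a small sketch that hands the pitch to numpy as the row stride, using the 17-pixel, 20-byte example from above. It assumes 1 byte per pixel and a positive pitch; `view_with_pitch` is a made-up name and the bytearray stands in for a real bitmap buffer.

```python
import numpy as np


def view_with_pitch(buf, rows, width, pitch):
    """View a 1-byte-per-pixel buffer as (rows, width) without copying.

    Hypothetical helper; only handles a positive pitch.
    """
    # strides=(pitch, 1): step `pitch` bytes to reach the next row,
    # 1 byte to reach the next pixel. Padding bytes are simply skipped.
    return np.ndarray(shape=(rows, width), dtype=np.uint8,
                      buffer=buf, strides=(pitch, 1))


# 2 rows of 17 gray pixels, each row padded to 20 bytes.
raw = bytearray(range(40))
image = view_with_pitch(raw, rows=2, width=17, pitch=20)
```

Here `image[1, 0]` reads byte 20 of the buffer, not byte 17, which is exactly the difference between pitch and width.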

@jiong3
Contributor Author

jiong3 commented Apr 19, 2017

> Sigh. You guys don't understand what 'pitch' is.

?

As I wrote above, the pitch is the number of bytes per row. According to the documentation, "FreeType functions normally align to the smallest possible integer value". So for grayscale bitmaps width and pitch are likely equal, unless the alignment is changed. In the common case of accessing the buffer as a whole an alignment of the rows to 2 or 4 bytes wouldn't be faster anyway.

@HinTak
Collaborator

HinTak commented Apr 20, 2017

No, pitch is not the number of bytes per row. It is the distance between two rows in bytes. Can you not read?

In cairo lingo, it is also called stride. Cairo even has a special function for converting/calculating stride from width. This tells you stride is not the same as width.
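
As a rough illustration of how a stride can be derived from a width, the sketch below rounds the row size up to a multiple of an alignment. It is a hypothetical helper, not cairo's or FreeType's actual function, and the alignment a given library picks is an assumption here.

```python
def aligned_stride(width, bits_per_pixel, alignment):
    """Smallest multiple of `alignment` bytes that can hold one row.

    Hypothetical helper for illustration only.
    """
    # Bytes needed for the pixels themselves, rounded up to whole bytes.
    row_bytes = (width * bits_per_pixel + 7) // 8
    # Round up to the next multiple of the alignment.
    return (row_bytes + alignment - 1) // alignment * alignment


gray_tight = aligned_stride(17, 8, 1)   # no extra alignment
gray_4 = aligned_stride(17, 8, 4)       # aligned to 4 bytes
gray_8 = aligned_stride(17, 8, 8)       # aligned to 8 bytes
mono_tight = aligned_stride(17, 1, 1)   # 17 bits packed into whole bytes
```

With an alignment of 1 the stride equals the byte width of the row; with 4 or 8 it gives the 20 and 24 mentioned earlier in the thread for 17 gray pixels.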

I am concerned that you are proposing fast but wrong code. Code that is wrong, is wrong, whatever the speed.

@HinTak
Collaborator

HinTak commented Apr 21, 2017

You also do not seem to be able to read documentation - "normally" means "most of the time" . It is meaningless to quote that sentence in this context.

@jiong3
Contributor Author

jiong3 commented Apr 21, 2017

It says so in the documentation:

> The pitch's absolute value is the number of bytes taken by one bitmap row [...]

I never suggested not to test for pitch != width, but since they are equal in the most common case this is what should be optimized for.

> It is always larger than (bit-depth * pixel width) /8, [...]

That's wrong.

@jiong3
Contributor Author

jiong3 commented Apr 21, 2017

In general, how should the buffer be handed to the user? As a raw buffer, numpy (dependency) or python array, with or without padding, bits unpacked to bytes?

@rougier
Owner

rougier commented Apr 21, 2017

Going back to the numpy handling, I think it would be good to return a copy by default. We could provide an option to not make a copy, but we don't have real control over when the buffer will be freed.
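
One way to get such a copy without numpy and without a Python-level loop is `ctypes.string_at`, which copies a given number of bytes from an address into a `bytes` object. The sketch below uses a stand-in ctypes array instead of a real glyph bitmap, and the helper name `copy_buffer` is made up.

```python
import ctypes


def copy_buffer(ptr, num_bytes):
    """Copy num_bytes from a ctypes pointer into an independent bytes object.

    Hypothetical helper, not part of freetype-py.
    """
    return ctypes.string_at(ptr, num_bytes)


# Stand-in for memory owned by FreeType (e.g. the current glyph slot).
slot = (ctypes.c_ubyte * 4)(10, 20, 30, 40)
ptr = ctypes.cast(slot, ctypes.POINTER(ctypes.c_ubyte))

snapshot = copy_buffer(ptr, 4)

# Simulate the library reusing the memory for the next glyph.
slot[0] = 99
```

After the simulated reuse, `snapshot` still holds the original bytes, while anything that only viewed the memory would now see the new value.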

@HinTak
Collaborator

HinTak commented Apr 21, 2017 via email

@rougier
Owner

rougier commented Apr 21, 2017

The reason for using numpy in the wordle example was mostly to have an easy way to test for collision. It does not pretend to be anything else. I agree cairo (or the antigrain library) would be a better solution for manipulating/compositing images and drawing, but that's a separate problem. The examples are really only illustrations of how to use the library.

@HinTak
Collaborator

HinTak commented Apr 21, 2017 via email

@rougier
Owner

rougier commented Apr 21, 2017

A stand-alone cairo example would be a nice addition.

@HinTak
Collaborator

HinTak commented Apr 23, 2017

So much for trying to extract the cairo surface code from the other freetype binding - it is simply wrong :
ldo/python_freetype#1
ldo/python_freetype_examples#1

That said, my corrected version is a hell lot faster than the numpy versions... Yes, I am already timing my standalone cairo example. I think numpy is just slow.

@HinTak
Collaborator

HinTak commented Apr 25, 2017

I have rewritten 6 of the samples with pycairo: glyph-{monochrome,alpha,color}, hello-world, example1, and wordle. The last one is the most difficult one: I needed to use a feature newly added in pycairo 1.11 (released two weeks ago), and cannot pack as tightly as the original. OTOH, cairo can paint partly off-buffer, so you can see the difference.

And it is a hell lot faster too...

@HinTak
Collaborator

HinTak commented Apr 25, 2017

wordle-cairo

The cairo based wordle drawing. I cannot pack as tight, but can draw partly off screen.

@HinTak
Collaborator

HinTak commented Apr 25, 2017

glyph-cairo-gray

cairo-based glyph-alpha

@HinTak
Collaborator

HinTak commented Apr 25, 2017

glyph-cairo-mono

cairo-based glyph mono-chrome

@HinTak
Collaborator

HinTak commented Apr 25, 2017

glyph-color-cairo

glyph-color

@HinTak
Collaborator

HinTak commented Apr 25, 2017

example_1-cairo

The boring example1, no visual difference other than it being a lot faster.

@HinTak
Collaborator

HinTak commented Apr 25, 2017

hello-world-cairo

The hello world example.

@HinTak
Collaborator

HinTak commented Apr 25, 2017

Since they are proper drawings rather than plots, there are no axes or padding around the figure, nor any grid lines.

glyph-outline.py is essentially half of glyph-color, so I'm not going to do it; glyph-vector-2.py has grid lines. I can't really do glyph-lcd. So the above covers all the numpy-based plot examples. (The gl example also uses numpy, but I'll let you figure that out...)

When I get the samples cleaned up, and add some comments on limitations, etc., I'll issue a pull.

@rougier
Owner

rougier commented Apr 25, 2017

@HinTak Thanks, nice results. For the PR, it would make sense to add all of them with the "-cairo.py" suffix and to keep the old ones (or to have a dedicated cairo subdir), because it requires an extra dependency. For the wordle example, I think the difference comes from the collision test. Cairo probably uses bounding boxes, and this prevents one text from being drawn over another even if the glyphs do not collide.

@jiong3 Do you think you're ready to make a PR from your tests and our discussion?

@HinTak
Collaborator

HinTak commented Apr 25, 2017

Yes, that's what I have been doing: 5 *-cairo.py, and an extra bitmap_to_surface.py which consists of extracted, adjusted and bug-fixed routines from the other freetype binding. There is at the moment no separate glyph-monochrome vs glyph-alpha; they differ by only one line (TARGET_MONO/TARGET_NORMAL), so I just comment/uncomment the alternatives at the moment.

@HinTak
Collaborator

HinTak commented Apr 25, 2017

Also I found some of the numpy examples doing y-direction flips; wordle does it at least twice :-(. And also the arrays having width and height in Fortran indexing style... haven't seen them in a while...

@rougier
Owner

rougier commented Apr 25, 2017

The Y-flip is an error; matplotlib can take care of that, actually. Numpy arrays are C-order, but indexing is row (=y) / column (=x).

@HinTak
Collaborator

HinTak commented Apr 25, 2017

Having the displayed and saved images differ is a bit painful. The original wordle draws things upside down and displays them upside down, then saves them the correct way up. That numpy/matplotlib can cope isn't quite the point. Anyway, the cairo-based ones all have things drawn the same way up as they are saved. Actually I don't display with any of cairo's display backends, but just save to a file and then launch Python Pillow's image viewer.

@HinTak
Collaborator

HinTak commented Apr 26, 2017

glyph-outline-cairo

I have decided to add a cairo version of glyph-outline anyway, quite trivial since it is just half of glyph-color.

The pull is at
#55

@HinTak
Collaborator

HinTak commented Apr 26, 2017

BTW, the outline example has a transparent background, whereas I paint most of the others' backgrounds grey first. PIL displays transparent as black; another viewer I have displays it as white. Gimp shows a checkerboard pattern for transparent pixels.

I have also changed my mind about editing to change between mono or alpha modes of the combined mono+alpha example. It defaults to alpha but if you put any argument to it, it draws mono. Explained in the comments at its top.

@jiong3
Contributor Author

jiong3 commented Apr 27, 2017

@rougier No, but if anyone wants to make a PR I would suggest option 4 or 5, or maybe something using a python array (which I haven't tested so far).

@HinTak
Collaborator

HinTak commented Apr 27, 2017

wordle-cairo-collison

Here is an example from when I got the stride/pitch wrong. Notice how some of the tiles collide? (Only a few.)
The corrected code/figures are at #55 (comment) #55 (comment) (two, depending on whether one's pycairo is the latest). Drawing partly over the edge requires the latest pycairo.

@HinTak
Collaborator

HinTak commented Apr 29, 2017

I thought I couldn't do the LCD example in cairo, but it gets better as I get more familiar with it. So I have added the LCD_V case side-by-side too:
#55 (comment)

The cairo LCD example is about 4 times faster than the old one; with two panels, it probably means 8x.

As I get more familiar with pycairo, I feel like I could probably rewrite glyph-vector-2 also. It is a vector drawing on top of a bitmap. After that, there is only one file which uses the slow data.extend(bitmap.buffer... idiom: texture_font.py, which is used by the gl example.

@HinTak
Collaborator

HinTak commented Apr 29, 2017

To answer an earlier question: I think you can get a negative pitch if you use a reflecting transform, i.e. if you call FT_Set_Transform with a matrix which has a negative determinant. I haven't tested this, but it could happen e.g. if you set up example_1 to use matrix = (-1, 0, 0, 1) or (1, 0, 0, -1) instead.
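
The determinant condition itself is easy to check. The sketch below only tests whether a 2x2 matrix, given in the xx, xy, yx, yy order of FT_Matrix, is a reflection. Whether FreeType then actually produces a negative pitch is the untested guess above, not something this code shows. The helper names are made up; `to_fixed` converts to the 16.16 fixed-point values FT_Matrix stores.

```python
def to_fixed(value):
    """Convert a float to 16.16 fixed point, as stored in FT_Matrix fields."""
    return int(round(value * 0x10000))


def is_reflection(xx, xy, yx, yy):
    """True if the matrix [[xx, xy], [yx, yy]] has a negative determinant.

    Hypothetical helper; works for plain floats or 16.16 fixed-point ints.
    """
    return xx * yy - xy * yx < 0


mirror_x = is_reflection(-1, 0, 0, 1)     # horizontal mirror
mirror_y = is_reflection(1, 0, 0, -1)     # vertical mirror
identity = is_reflection(1, 0, 0, 1)      # no transform
rotate_180 = is_reflection(-1, 0, 0, -1)  # rotation, not a reflection

# Same check on fixed-point values.
mirror_fixed = is_reflection(to_fixed(-1.0), 0, 0, to_fixed(1.0))
```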

Only two examples do FT_Set_Transform at the moment. So, example_1 and wordle would break if they ever get extended to use FT_Set_Transform that way.

@HinTak
Collaborator

HinTak commented Apr 30, 2017

I am done with converting/rewriting all the examples from the slow numpy/matplotlib drawing over to cairo:
#55

A side-effect is that none of my code uses the stupid data.extend(... idiom; the *-cairo.py versions are all a lot faster. There is only one data.extend(... left, in texture_font, which is used by subpixel-positioning; that uses opengl for drawing, so I did not touch it.

So I am going to look at the perl-binding of freetype now. It should be obvious by now that I know freetype well and am just looking to use it with a language other than C.
