Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitmaptools: Add readinto, arrayblit #4403

Merged
merged 13 commits into from
Mar 17, 2021
Merged

Conversation

jepler
Copy link
Member

@jepler jepler commented Mar 14, 2021

When reading uncompressed bitmap data directly, readinto can work much more quickly than a Python-coded loop.

On a Raspberry Pi Pico, I benchmarked a modified version of adafruit_bitmap_font's pcf reader which uses readinto instead of the existing code. My test font was a 72-point file created from Arial.

This decreased the time to load all the ASCII glyphs from 4.9 seconds to just 0.44 seconds.

While this attempts to support many pixel configurations (1/2/4/8/16/24/32 bpp; swapped words and pixels) only the single ombination used by PCF fonts was tested.

When reading uncompressed bitmap data directly, readinto can work
much more quickly than a Python-coded loop.

On a Raspberry Pi Pico, I benchmarked a modified version of
adafruit_bitmap_font's pcf reader which uses readinto instead of
the existing code. My test font was a 72-point file created from Arial.

This decreased the time to load all the ASCII glyphs from 4.9 seconds to
just 0.44 seconds.

While this attempts to support many pixel configurations (1/2/4/8/16/24/32
bpp; swapped words and pixels) only the single combination used by
PCF fonts was tested.
@FoamyGuy
Copy link
Collaborator

I get this error when I attempt to make the uf2 for PyPortal with this branch:

❯ make BOARD=pyportal
Use make V=1, make V=2 or set BUILD_VERBOSE similarly in your environment to increase build verbosity.
QSTR updated
../../shared-module/bitmaptools/__init__.c: In function 'common_hal_bitmaptools_readinto':
../../shared-module/bitmaptools/__init__.c:365:35: error: comparison of integer expressions of different signedness: 'int' and 'size_t' {aka 'unsigned int'} [-Werror=sign-compare]
  365 |                     for(int i=0; i< rowsize_in_u16; i++) {
      |                                   ^
../../shared-module/bitmaptools/__init__.c:371:31: error: comparison of integer expressions of different signedness: 'int' and 'size_t' {aka 'unsigned int'} [-Werror=sign-compare]
  371 |                 for(int i=0; i< rowsize_in_u32; i++) {
      |                               ^
cc1: all warnings being treated as errors
make: *** [../../py/mkrules.mk:55: build-pyportal/shared-module/bitmaptools/__init__.o] Error 1

I noticed that rp2040 feather does build successfully for me though. I will this try this out on that device.

@jepler
Copy link
Member Author

jepler commented Mar 14, 2021

Thanks @FoamyGuy , the CI saw it too. I've attempted to correct it.

Copy link
Collaborator

@FoamyGuy FoamyGuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice speedup! I tested this successfully on Feather RP2040 and a PyPortal. I saw a 4x speedup loading the yasahsi-ascii-200.pcf font with the bitmap_font library that utilizes this.

@kmatch98
Copy link
Collaborator

kmatch98 commented Mar 15, 2021

This looks like a cool addition. I haven’t tried yet but have a few questions since I couldn’t follow all the code with all the bit operations.

  • Does this require that the dimensions of the destination bitmap and the file bitmap are the same?
  • If the color palettes are different bit sizes or number of colors, how does it handle this?
  • Also how do I get the palette colors from the file into a new pixel_shader palette object?

If I’m missing something important about the main goal of this function, please ascribe that to ignorance rather than malice.

@jepler
Copy link
Member Author

jepler commented Mar 15, 2021

Does this require that the dimensions of the destination bitmap and the file bitmap are the same?

Yes, it always reads all width×height pixels. (oh, and rows of pixels always go from top to bottom, which is different from at least some BMP files)

If the color palettes are different bit sizes or number of colors, how does it handle this?

Whatever displayio_bitmap_write_pixel does.

Also how do I get the palette colors from the file into a new pixel_shader palette object?

This code does not deal with palettes at all, just with pixels. If you had a hypothetical file type with a header, followed by palette entries, followed by uncompressed bitmap data, you would do the header & palette processing in Python, then position the open file at the first byte of the uncompressed bitmap data, and then use this routine to get the bitmap data loaded efficiently.

@tannewt
Copy link
Member

tannewt commented Mar 16, 2021

@FoamyGuy and @kmatch98 (just invited you) please do the review.

@jepler jepler changed the title bitmaptools: Add readinto bitmaptools: Add readinto, arrayblit Mar 16, 2021
@jepler
Copy link
Member Author

jepler commented Mar 16, 2021

I've now added arrayblit.

You can blit the whole bitmap:

>>> b = displayio.Bitmap(8, 8, 256)
>>> a = array.array('H', range(64))
>>> bitmaptools.arrayblit(b, a)
>>> print([b[i] for i in range(64)])
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]

You can blit just part of it:

>>> bitmaptools.fill_region(b, 0, 0, 8, 8, 0)
>>> bitmaptools.arrayblit(b, a, 2, 2, 5, 7)
>>> print([b[i] for i in range(64)])
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 3, 4, 5, 0, 0, 0, 0, 0, 6, 7, 8, 0, 0, 0, 0, 0, 9, 10, 11, 0, 0, 0, 0, 0, 12, 13, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

You can specify a color index that is treated as transparent:

>>> bitmaptools.fill_region(b, 0, 0, 8, 8, 16)
>>> bitmaptools.arrayblit(b, array.array('B', [0, 1, 2]) * 22, skip_index=2)
>>> print([b[i] for i in range(64)])
[0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0, 1, 16, 0]

@kmatch98
Copy link
Collaborator

I spent some time on the readinto function and it's working great. It shows a big speedup when loading PCF fonts, and a similar speedup for loading other BMP files. As @jepler mentioned above, some bitmap files are loaded in different row direction. I submitted a PR to give the option to reverse the row loading and an associated PR in the adafruit_imageload library to use this new .readinto function.

It looks good to me! I'll spend some time on the .arrayblit function later today.

@jepler
Copy link
Member Author

jepler commented Mar 16, 2021

I've updated this on top of main, let's see if it allows CI to pass now.

@danielwarnersmith
Copy link

This is awesome! I’m working on a project that sends rgb565 values from Max/MSP over tcp to the MatrixPortal and would love to use this.

//| """Inserts pixels from ``data`` into the rectangle of width×height pixels with the upper left corner at ``(x,y)``
//|
//| The values from ``data`` are taken modulo the number of color values
//| avalable in the destintaion bitmap.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo: destination.

Copy link
Collaborator

@kmatch98 kmatch98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing of the readinto function worked well. Just one minor type in the docstrings.

I briefly tested the arrayblit, function and it operated as described.

Any functions that use this will want to watch out for if the array values exceed the bitmap color depth. In this case the value copied is array_value % color_depth. If used with the skip_index value and the color depth is too large, you can still get the skip_index value copied into your bitmap (since it is the result of the % mod operation). Users just need to proceed with caution, and there is adequate warning in the document strings.

I just realized there is no exposed function for querying a Bitmap's color depth. That would be useful to watch out for the above situation. I'll add an issue for an extension request.

@kmatch98
Copy link
Collaborator

Docs pass. Yay! Looks good to me.

@jepler
Copy link
Member Author

jepler commented Mar 17, 2021

@kmatch98 thanks, can you leave an approving review?

@kmatch98 kmatch98 self-requested a review March 17, 2021 12:50
Copy link
Collaborator

@kmatch98 kmatch98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks for the addition.

Sorry I was holding up the show. Didn’t realize the procedure for getting this released. All looks good to me.

@jepler jepler merged commit bfc8c89 into adafruit:main Mar 17, 2021
@v923z
Copy link

v923z commented Mar 17, 2021

@jepler Jeff, would you interface it to ulab like so (pseudocode)?

b = displayio.Bitmap(8, 8, 256)
a = array.array('H', range(64))
bitmaptools.arrayblit(b, a)

arr = ulab.frombuffer(a)

What would be really great is, if this worked in the other direction, too. I.e., if you could pipe the data into a bitmap object, like so

arr = ulab.linspace(0, 100, num=64, dtype=ulab.uint8)

b = displayio.Bitmap(8, 8, 256, buffer=arr.tobytes())

i.e., without having to reserve space. If there was a way of sharing the memory between bitmap and ulab objects, then you could, e.g., run cool animations at C speed without having to implement a single function.

@jepler
Copy link
Member Author

jepler commented Mar 17, 2021

@v923z the goal was to have it work with a ulab array too, though I didn't test it before.

Here's what you can write now that this PR has been merged, freshly tested:

>>> arr = ulab.linspace(0, 100, num=64, dtype=ulab.uint8)
>>> b = displayio.Bitmap(8, 8, 256)
>>> bitmaptools.arrayblit(b, arr)
>>> print([b[i] for i in range(64)])
[0, 1, 3, 4, 6, 7, 9, 11, 12, 14, 15, 17, 19, 20, 22, 23, 25, 26, 28, 30, 31, 33, 34, 36, 38, 39, 41, 42, 44, 46, 47, 49, 50, 52, 53, 55, 57, 58, 60, 61, 63, 65, 66, 68, 69, 71, 73, 74, 76, 77, 79, 80, 82, 84, 85, 87, 88, 90, 92, 93, 95, 96, 98, 100]

similar to how your ndarray works with memoryview() now.

Because of "dirty rectangles" (the bitmap needs to track which portions of the itself have changed, to make display refreshes more efficient) we can't enable using the same storage for a writable ulab array and a bitmap, so a copy will be necessary. Unless someone sees a clever way around this little detail. (a solution which "gets it wrong" if a user forgets to make a call to some function is probably not going to be satisfactory)

@v923z
Copy link

v923z commented Mar 17, 2021

@v923z the goal was to have it work with a ulab array too, though I didn't test it before.

Here's what you can write now that this PR has been merged, freshly tested:

Thanks!

Because of "dirty rectangles" (the bitmap needs to track which portions of the itself have changed, to make display refreshes more efficient) we can't enable using the same storage for a writable ulab array and a bitmap, so a copy will be necessary. Unless someone sees a clever way around this little detail. (a solution which "gets it wrong" if a user forgets to make a call to some function is probably not going to be satisfactory)

What is a bit unclear to me is the distinction between a bitmap object, and what you actually display. Displays can be arbitrarily large, and it would be problematic to store the contents of a 240x320 display in RAM. So, that is probably not the bitmap object.

Perhaps, it could make more sense to hook into the function that transfers the data (however they are generated) to the display. In that case, you wouldn't have problems with the forgotten function call, because you could just do something like this:

arr = ulab.linspace(0, 100, num=256)

display.draw(buffer=arr.tobytes(), x0=0, y0=0, width=16, height=16)

and you would definitely call the display.draw function, if you wanted to show something.

@dglaude
Copy link

dglaude commented Mar 18, 2021

I used the copy in my thermal camera experiment, but noticed the image was not refreshing when updating the bitmap.
Right now I am doing the following, but I don't know if this is the optimal way:

    image_group.remove(image_tile)
    image_group.append(image_tile)

Thanks for the feature, I am not sure I fully understand everything that is happening under the hood, but it works.

@kmatch98
Copy link
Collaborator

kmatch98 commented Mar 18, 2021 via email

@jepler
Copy link
Member Author

jepler commented Mar 18, 2021

@v923z ah, the mental model of displayio is different than what you're thinking of .. There are what I think folks call "immediate mode" graphics engines where to redraw you run through sections of code which say "draw THIS, over THERE" and when it's done the new content of the screen is ready. If the content of a bitmap changes, you have to also run the code which says to draw the bitmap.

By contrast, I'd say that a big element of displayio is that you declare that some graphical element like a TileGrid will appear at a particular location on the display. A TileGrid can show rectangular pieces out of a Bitmap.

Now say you have a bitmap B and you update its pixel (X,Y). If that bitmap is shown in some TileGrid, then any particular tile which contains the (X,Y) pixel will need to be updated. That only happens when displayio can control additional code that runs whenever you do the equivalent of bitmap[X,Y] = 0. If you can modify the bitmap via its buffer, memoryview(bitmap)[X+Y*bitmap.width] = 0 there is not a facility in micropython to notify the bitmap that its content has been modified.

This is sort of the bug that @dglaude experienced above, because I forgot the call to displayio_bitmap_set_dirty_area in arrayblit. [we'll correct the bug in the core of course]

@tannewt I assume you're not interested in bending this rule of bitmaps even if a hypothetical speed improvement can be had due to avoiding memory copies.

@tannewt
Copy link
Member

tannewt commented Mar 18, 2021

@tannewt I assume you're not interested in bending this rule of bitmaps even if a hypothetical speed improvement can be had due to avoiding memory copies.

I'm not sure exactly what you are proposing.

Remember that displayio was designed to 1) reduce the in-memory footprint by not having one big bitmap and 2) reduce the SPI traffic by tracking dirty rectangles. It (barely) works on the SAMD21 for the original Hallowing. I think the TileGrid model does a good job for 2d widget style layouts.

As we grow into larger and faster chips we may want to allow for displaying bitmaps directly. When we support displays that don't have their own framebuffer, then we likely just want to store one ourselves. This could better support the "I just want to draw shapes myself" case.

@dglaude
Copy link

dglaude commented Mar 18, 2021

@kmatch98 I believe that was in my code already but did not have the expected effect.

May need to do display.refresh()

And @jepler explanation is likely right, no need to compare previous content and new content, if something was copied over into the bitmap, it likely has changed.

This is sort of the bug that @dglaude experienced above, because I forgot the call to displayio_bitmap_set_dirty_area in arrayblit.
[we'll correct the bug in the core of course]

I did try to trigger some dirtyness by doing a bitmap[0,0] = 0 just before the copy, but it did not seems to trigger the refresh either. But I quickly gave up when I found the work-around.

I remember doing something similar (but maybe also due to a bug) when I wanted to change the palette of an existing bitmap (to do palette rotation animation and the like). And even if far from perfect, sometime removing and adding the object from the group is just more simple.

@v923z
Copy link

v923z commented Mar 20, 2021

@jepler

If you can modify the bitmap via its buffer, memoryview(bitmap)[X+Y*bitmap.width] = 0 there is not a facility in micropython to notify the bitmap that its content has been modified.

If I understand correctly, your only concern here is that this refresh/update function has to be called manually, otherwise, while the data will be "updated", this fact will not be reflected on the display itself.

But is this really a problem? If someone is using this very low-level buffer utility, then it is expected that they would explicitly call refresh. I reckon, this is probably not for the faint of heart, and then the only question is, whether all functions in circuitpython should be high-level, or is it OK to allow the more experienced users to directly tamper with the hardware. Perhaps, this would set some precedence, so I understand your reluctance to open a potential Pandora's box.

@v923z
Copy link

v923z commented Mar 20, 2021

@tannewt I assume you're not interested in bending this rule of bitmaps even if a hypothetical speed improvement can be had due to avoiding memory copies.

I'm not sure exactly what you are proposing.

Remember that displayio was designed to 1) reduce the in-memory footprint by not having one big bitmap and 2) reduce the SPI traffic by tracking dirty rectangles. It (barely) works on the SAMD21 for the original Hallowing. I think the TileGrid model does a good job for 2d widget style layouts.

The idea that I put forward is trying to solve the very problem of having to reserve extra space in the RAM.

I would also point out that, while this came up in a discussion about interfacing numerical arrays to the display, this concept is not confined to this particular issue. E.g., it could prove useful, if an ADC could directly manipulate the display (this would most probably not write characters, but having the option to update a small segment of the display with raw bytes is still meaningful, cheap, and fast), or you could "share" the display between two microcontrollers, if the serial buffer went directly to the display, etc.

Also, I don't see the difference between a screen, and a DAC: both are peripheral devices, they just, well, display data in different ways. So, if you can pipe data into a DAC, or serial port, then why should you not be able to do the same with a display? I understand that the display requires extra functions, but that is not really a conceptual difference.

@jepler
Copy link
Member Author

jepler commented Mar 20, 2021

@tannewt Hypothetically, if a ulab array can be a writable view onto a bitmap, you would have to call a new dirty() method on the bitmap after modifying it outside the knowledge of the Bitmap itself:

b = displayio.Bitmap(8, 8, 256)
a = ulab.frombuffer(b) # create a ulab.array that views onto b, but see below
a[y,x] = p
b.dirty(x, y, x+1, y+1) # This call doesn't exist yet, because we've worked to prevent the need for it

If you omitted the call to the new dirty() function, or specified the wrong rectangle, the display might not be correct. If you're OK with adding a "displayio.Bitmap.dirty" method that users would have to call if they used this advanced feature, then it might be worth proceeding.

Separate from that, I've noticed that (at least for ulab legacy) ulab.frombuffer is copying rather than being a view on the underlying storage, such that the following prints 1 on standard python and 0 in circuitpython 6.2-beta:

try:
    import numpy as np
except:
    import ulab as np

import array

a = array.array('B', [0] * 36)
u = np.frombuffer(a, dtype=np.uint8)
u[0] = 1
print(a[0])

ulab's main branch behavior is aligned with numpy, so maybe we should wait for CP7 to revisit this idea.

@v923z
Copy link

v923z commented Mar 20, 2021

b = displayio.Bitmap(8, 8, 256)
a = ulab.frombuffer(b) # create a ulab.array that views onto b, but see below
a[y,x] = p
b.dirty(x, y, x+1, y+1) # This call doesn't exist yet, because we've worked to prevent the need for it

If you omitted the call to the new dirty() function, or specified the wrong rectangle, the display might not be correct. If you're OK with adding a "displayio.Bitmap.dirty" method that users would have to call if they used this advanced feature, then it might be worth proceeding.

I see that this a thorny issue, but this is exactly what I had in mind.

Separate from that, I've noticed that (at least for ulab legacy) ulab.frombuffer is copying rather than being a view on the underlying storage,

ulab's main branch behavior is aligned with numpy, so maybe we should wait for CP7 to revisit this idea.

Here is the culprit (legacy): https://github.com/v923z/micropython-ulab/blob/b64fa6d4c73287edef9ccf09cfd6ec5009f9628b/code/ulab_create.c#L711

versus the same code chunk from the master branch: https://github.com/v923z/micropython-ulab/blob/714a3b872747332de9600fda4be07bbcc5f99626/code/ulab_create.c#L703-L704

We changed this in connection to v923z/micropython-ulab#306 . We can definitely backport the change, if that is what you want.

@jepler
Copy link
Member Author

jepler commented Mar 20, 2021

It looks like numpy tracks whether the original storage was mutable or not, so e.g., if the origin is a bytes object it is immutable but if it's a bytearray then it's mutable:

>>> import numpy as np
>>> a = np.frombuffer(b'abcd', dtype=np.uint16)
>>> a
array([25185, 25699], dtype=uint16)
>>> a[0] = 1
ValueError: assignment destination is read-only

versus

>>> a = np.frombuffer(bytearray(b'abcd'), dtype=np.uint16)
>>> a[0] = 1
>>> a
array([    1, 25699], dtype=uint16)

I can help show you what is needed to accomplish the initial check for whether the storage is mutable, but it may cost quite a bit of code to add all the necessary checks .. or maybe it's only a few places (assignments and slice assignments?)?

@v923z
Copy link

v923z commented Mar 20, 2021

I can help show you what is needed to accomplish the initial check for whether the storage is mutable,

I think you brought this up in the thread that I linked to above, and I don't know what the right solution is. On the one hand, numpy-compatibility is great, and I would advocate that. On the other hand, a microcontroller is simply not a computer, and we have to be careful about how we squander bytes both in flash, and RAM. If you judge compatibility to be important, then I am OK with that.

but it may cost quite a bit of code to add all the necessary checks .. or maybe it's only a few places (assignments and slice assignments?)?

I believe, this should be only a couple of places in https://github.com/v923z/micropython-ulab/blob/master/code/ndarray.c, and frombuffer in https://github.com/v923z/micropython-ulab/blob/master/code/ulab_create.c. Otherwise, all other functions produce a new ndarray.

@jepler
Copy link
Member Author

jepler commented Mar 20, 2021

$ ./micropython/ports/unix/micropython 
MicroPython v1.14-120-g42cf77f48 on 2021-03-20; linux version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> from ulab import numpy as np
>>> a = np.frombuffer('ulab', dtype=np.uint16)
>>> a[0] = 0x5555
Segmentation fault

ow, I didn't realize it would do that (ulab master branch + micropython unix port)

@v923z
Copy link

v923z commented Mar 21, 2021

@jepler It seems to me that we have somewhat diverged from the original issue, so I opened a dedicated one here: v923z/micropython-ulab#350

@tannewt
Copy link
Member

tannewt commented Mar 22, 2021

But is this really a problem? If someone is using this very low-level buffer utility, then it is expected that they would explicitly call refresh. I reckon, this is probably not for the faint of heart, and then the only question is, whether all functions in circuitpython should be high-level, or is it OK to allow the more experienced users to directly tamper with the hardware. Perhaps, this would set some precedence, so I understand your reluctance to open a potential Pandora's box.

I'm totally ok adding advanced APIs to CircuitPython. I don't think Adafruit funded folks generally should but I want us to support folks who do want to add it. Generally, I just want to ensure that those APIs are in their own module so we can enable and disable them for builds of different sizes. If they mirror CPython APIs, then they should be a strict subset under the same module name. Also, if the low level access is "unsafe" and cause crashes I think they should give guard rails and have explicit ways around them as needed. That way folks will know they went off into unsupported territory.

Also, I don't see the difference between a screen, and a DAC: both are peripheral devices, they just, well, display data in different ways. So, if you can pipe data into a DAC, or serial port, then why should you not be able to do the same with a display? I understand that the display requires extra functions, but that is not really a conceptual difference.

You can with busio.SPI. displayio is meant to be higher level and also allow display sharing with the CircuitPython core for "instant-on" displays and serial output display.

@tannewt Hypothetically, if a ulab array can be a writable view onto a bitmap, you would have to call a new dirty() method on the bitmap after modifying it outside the knowledge of the Bitmap itself:

Ya, I'm ok adding a way to add a dirty area to a rectangle manually. It should be an advanced API that isn't required in most cases.

@v923z
Copy link

v923z commented Mar 23, 2021

Also, I don't see the difference between a screen, and a DAC: both are peripheral devices, they just, well, display data in different ways. So, if you can pipe data into a DAC, or serial port, then why should you not be able to do the same with a display? I understand that the display requires extra functions, but that is not really a conceptual difference.

You can with busio.SPI. displayio is meant to be higher level and also allow display sharing with the CircuitPython core for "instant-on" displays and serial output display.

I think busio.SPI is probably too low-level in this case: usually, if you want to write to a display, you have to send some sort of metadata in addition to the actual payload. Something's got to hold the information that tells the display, where the contents of the buffer should end up.

@jepler jepler deleted the bitmap-read-2 branch November 3, 2021 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fast copy from ulab to displayio.Bitmap
7 participants