py/modmicropython: Add micropython.memmove() and micropython.memset(). #12487
Codecov Report
@@ Coverage Diff @@
## master #12487 +/- ##
==========================================
- Coverage 98.38% 98.18% -0.20%
==========================================
Files 158 158
Lines 20940 20981 +41
==========================================
- Hits 20602 20601 -1
- Misses 338 380 +42
This was based on a discussion about providing a more optimal way to copy data between buffers, however based on the benchmarking so far it seems like it might not be worth the overhead.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
bbc7eaf to b55ac53
After applying the thread-local slice optimisation and poking around with This is another very short-lived heap allocation, but it looks like it would be much harder to optimise than the thread-local slice case.
This is an automated heads-up that we've just merged a Pull Request See #13763 A search suggests this PR might apply the STATIC macro to some C code. If it Although this is an automated message, feel free to @-reply to me directly if
This was based on a discussion about providing a more optimal way to copy data between buffers, however based on benchmarks so far it seems like it might not be worth it compared to optimising "copy to/from slice" code paths written in idiomatic Python.
Summary
Adds two functions to the `micropython` module, gated behind a new config option:

- `micropython.memmove(dest, dest_idx, src, src_idx, [len])` - an optimised equivalent of `dest[dest_idx:dest_idx+len] = src[src_idx:src_idx+len]`. Copies memory contents with semantics of C `memmove`, hence the name. The `len` argument is optional, length defaults to the minimum of the length of the source and destination regions.
- `micropython.memset(dest, dest_idx=0, c=0, len=len(dest)-dest_idx)` - an optimised equivalent of `dest[dest_idx:] = bytes([c]*len)`. Modelled on C's `memset`.

Unlike assigning to a slice, the destination buffer size never changes as a result of calling either of these functions. Out of bounds assignment raises an exception.
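To make the semantics above concrete, here is a rough pure-Python model of the two functions. It is only a sketch of the behaviour described in this summary, not the C implementation from the PR. The exception type (`ValueError`) and the parameter name `length` (used instead of `len` to avoid shadowing the builtin) are assumptions made for illustration.

```python
def memmove(dest, dest_idx, src, src_idx, length=None):
    """Model of the proposed memmove(): copy without ever resizing dest."""
    d = memoryview(dest)
    s = memoryview(src)
    if dest_idx < 0 or src_idx < 0 or dest_idx > len(d) or src_idx > len(s):
        raise ValueError("index out of bounds")  # assumed exception type
    if length is None:
        # Default: the smaller of the two remaining regions.
        length = min(len(d) - dest_idx, len(s) - src_idx)
    if length < 0 or dest_idx + length > len(d) or src_idx + length > len(s):
        raise ValueError("copy out of bounds")  # assumed exception type
    # Snapshot the source first so overlapping regions behave like C memmove.
    d[dest_idx:dest_idx + length] = bytes(s[src_idx:src_idx + length])


def memset(dest, dest_idx=0, c=0, length=None):
    """Model of the proposed memset(): fill part of dest with byte value c."""
    d = memoryview(dest)
    if dest_idx < 0 or dest_idx > len(d):
        raise ValueError("index out of bounds")  # assumed exception type
    if length is None:
        length = len(d) - dest_idx
    if length < 0 or dest_idx + length > len(d):
        raise ValueError("fill out of bounds")  # assumed exception type
    d[dest_idx:dest_idx + length] = bytes([c]) * length
```

For example, `memmove(buf, 2, buf, 0, 4)` on `bytearray(b"abcdefgh")` copies an overlapping region within one buffer and leaves the buffer length unchanged.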
Benchmarks - memmove
Comparing memmove to current MicroPython "best practices" (unix port, i5-1248P CPU):
Honestly I found this a little underwhelming! Admittedly, `slice_copy-6-memmove.py` can do the equivalent of `slice_copy-5-lvalue_rvalue_memoryview.py` (slices on both sides of the assignment) and it's almost twice as fast, but it's only twice as fast (in a tight loop that does nothing else, working with pretty short buffers.)

Maybe the C implementation of memmove() needs some tweaks to streamline the error checking 🤷 .
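For readers who have not seen the idiom, "slices on both sides of the assignment" can be sketched as below. This is an illustration only: the contents of the benchmark files are not shown here, so the function names and buffer sizes are hypothetical and do not reproduce the actual tests.

```python
def copy_plain_slices(dest, dest_idx, src, src_idx, n):
    # Slicing a bytearray on the right-hand side builds a temporary copy
    # of the source data before it is assigned into dest.
    dest[dest_idx:dest_idx + n] = src[src_idx:src_idx + n]


def copy_memoryview_slices(dest_mv, dest_idx, src_mv, src_idx, n):
    # Slicing a memoryview creates a small view object, the underlying
    # bytes are not duplicated. Both slices must have the same length.
    dest_mv[dest_idx:dest_idx + n] = src_mv[src_idx:src_idx + n]


src = bytearray(range(16))
dest = bytearray(16)
copy_memoryview_slices(memoryview(dest), 4, memoryview(src), 0, 8)
```

The memoryview variant still creates short-lived view objects for each slice, which is the per-call overhead a dedicated function would avoid.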
When rebased against PR #10160 things get even closer:
Now `slice_copy-6-memmove.py` is only 1.6x faster than `slice_copy-5-lvalue_rvalue_memoryview.py`, and no faster than assigning a buffer to an lvalue slice...

Benchmarks - memset
Kind of the same story with memset: writing out a bytes array (which can be frozen to flash) is basically as fast as using the `memset()` function. The naive versions of this are a lot slower, though!

Disclaimer: The new test file names take some liberties with the meaning of `lvalue` and `rvalue`, happy to take suggestions for a more accurate term to use.

This work was funded through GitHub Sponsors.
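As a rough picture of the memset comparison above, the variants might look something like the following. The benchmark sources are not included here, so what counts as "naive" is a guess, and all function names and the buffer size are hypothetical.

```python
# Written-out constant; in frozen code a bytes literal like this can live in flash.
ZEROS_8 = b"\x00\x00\x00\x00\x00\x00\x00\x00"


def fill_loop(buf, c):
    # Guess at a naive version: one store per byte, driven from Python.
    for i in range(len(buf)):
        buf[i] = c


def fill_temp_bytes(buf, c):
    # Guess at another naive version: builds a fresh bytes object on every call.
    buf[:] = bytes([c]) * len(buf)


def fill_from_constant(buf):
    # Assign a prebuilt constant. The lengths must match, otherwise
    # slice assignment would resize the bytearray.
    buf[:] = ZEROS_8
```

All three leave an 8-byte `bytearray` zeroed; they differ only in how much work and allocation happens per call.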