mem: Align data to natural alignment #416
8f3c320
to
f3c6dac
Compare
Also, a question not directly related to this PR, but to mem. Would you be interested in a patch that makes |
On June 26, 2022 8:43:02 AM Sebastian Reimers ***@***.***> wrote:
@sreimers commented on this pull request.
> @@ -120,6 +155,11 @@ static inline void mem_unlock(void)
void *mem_alloc(size_t size, mem_destroy_h *dh)
{
struct mem *m;
+ size_t capacity;
+
+ capacity = align_up(size);
is it enough to align only the mem_header_size?
The data starts at mem_header_size offset, and malloc returns a pointer
aligned to 8 or 16, depending on the target platform. So yes, it should be
enough.
|
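The point about aligning only the header offset can be sketched as follows. This is a minimal illustration, not the code from the PR: `struct hdr`, `MEM_ALIGN` and `hdr_size` are hypothetical names, and the sketch assumes that `malloc` returns memory aligned for `max_align_t`, which may not match every target's actual guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout; the real struct mem in re may differ. */
struct hdr {
	uint32_t nrefs;          /* reference count */
	void (*dh)(void *data);  /* destructor handler */
};

/* Assumed alignment of pointers returned by malloc. */
#define MEM_ALIGN (_Alignof(max_align_t))

static_assert((MEM_ALIGN & (MEM_ALIGN - 1)) == 0,
	      "MEM_ALIGN must be a power of two");

/* Round n up to the next multiple of MEM_ALIGN. */
static inline size_t align_up(size_t n)
{
	return (n + MEM_ALIGN - 1) & ~(MEM_ALIGN - 1);
}

/* Offset of the user data from the start of the malloc'ed block. */
static inline size_t hdr_size(void)
{
	return align_up(sizeof(struct hdr));
}
```

If the block start is aligned to `MEM_ALIGN` and the data offset is a multiple of `MEM_ALIGN`, the data pointer is aligned as well, regardless of the requested size.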
OK, then I would say we should only align the header offset. It looks like the maximum difference between size and capacity is only 15 bytes, and I think mem_realloc is mostly called with much larger differences. |
Sorry, I don't understand. The
Do you mean |
Yes I think so. |
I think we have to be very careful with making changes here. The mem system is used by almost all the code in re. Please split up the PR into logical PRs. |
Ok. But then there's a correctness issue in |
f3c6dac
to
62dfd6b
Compare
I've removed |
we also have this problem that we should try to fix:
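the vidframe data is sometimes allocated by `mem_alloc` and it is used by FFmpeg. FFmpeg recommends that a buffer is aligned to fit the SIMD optimizations. The alignment here is typically 16, 32 or 64 bytes. Either we make mem_alloc aligned to this, or we add a new function for special alignment needs.
Regarding the `struct mem` overhead:
1. it should be as small as possible in order to avoid wasting memory
2. it should be as close to the natural alignment as possible (i.e. 8 bytes, 16 bytes, etc).
Regarding changes in the mem modules, we have to be very careful here. We could also make a new testcase in |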
On July 9, 2022 11:53:22 AM "Alfred E. Heggestad" ***@***.***> wrote:
we also have this problem that we should try to fix:
```
swscale: created SwsContext: 'yuv420p' 320 x 240 --> 'yuv420p' 642 x 488
[swscaler @ 0x7fc47af52000] Warning: data is not aligned! This can lead to
a speed loss
```
the vidframe data is sometimes allocated by `mem_alloc` and it is used by
FFmpeg.
FFmpeg recommends that a buffer is aligned to fit the SIMD optimizations.
The alignment here is typically 16, 32, 64 bytes
Either we make mem_alloc aligned to this, or a new function for special
alignment needs.
There are dedicated memory allocation functions in ffmpeg that ensure the
required alignment that should be used when interacting with ffmpeg APIs. I
think the alignment they enforce is 64 bytes currently, which is a bit too
much to have in libre by default.
Adding mem API for overaligned allocations is possible, but would require
additional metadata in the header. I'd prefer not to make it part of this
PR. It can be added as a separate PR.
Regarding the `struct mem` overhead:
1. it should be as small as possible in order to avoid wasting memory
2. it should be as close to the natural alignment as possible (i.e. 8
bytes, 16 bytes, etc).
So are you saying you agree with the `uint32_t` idea?
|
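For the overaligned case discussed above, a standalone sketch using C11 `aligned_alloc` might look like this. `alloc_aligned` and `is_aligned` are hypothetical helpers and not part of re or FFmpeg; a real mem API would additionally have to place its header in front of the data and record enough metadata to find the block start again, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate size bytes aligned to align bytes.
 * align must be a power of two. Release the result with free(). */
static void *alloc_aligned(size_t size, size_t align)
{
	size_t padded;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* C11 aligned_alloc expects size to be a multiple of align */
	padded = (size + align - 1) & ~(align - 1);

	return aligned_alloc(align, padded);
}

/* Returns non-zero if p is aligned to align bytes. */
static int is_aligned(const void *p, size_t align)
{
	return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller that needs 64-byte aligned video planes could request `alloc_aligned(len, 64)` while ordinary allocations keep the default alignment.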
After some thinking, I think we can allocate aligned memory for struct vidframe. For mem_alloc, the alignment should be the same as that of the malloc implementation. Most of the supported CPUs are either 32-bit or 64-bit. On 32-bit we get 4-byte alignment by using types like function pointers and size_t in struct mem. For 64-bit the current overhead is 2x8=16 bytes. I don't understand the mem_realloc problem. Could you please write some demonstration code? |
This is not true, at least not on x86.
It doesn't matter which types you use,
Not at the end of struct but after nrefs. Compilers do align members of the struct according to their alignment requirements, which means padding is introduced between members as needed. The idea was to fit 32-bit
|
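The padding behaviour described in this comment can be checked with `offsetof` and `sizeof`. The three layouts below are hypothetical and only illustrate the effect; they are not the actual `struct mem`, and the field names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (destroy_h)(void *data);

/* Pointer-sized counter followed by a handler pointer. */
struct hdr_wide {
	size_t nrefs;
	destroy_h *dh;
};

/* 32-bit counter followed by a pointer: the compiler inserts padding
 * after nrefs when pointers are wider than 4 bytes. */
struct hdr_padded {
	uint32_t nrefs;
	destroy_h *dh;
};

/* Two 32-bit fields occupy the first 8 bytes, so the second field
 * uses the space that would otherwise be padding on 64-bit targets. */
struct hdr_mixed {
	uint32_t nrefs;
	uint32_t size;
	destroy_h *dh;
};

/* Members always start at an offset that satisfies their alignment. */
static_assert(offsetof(struct hdr_padded, dh) % _Alignof(destroy_h *) == 0,
	      "dh must be placed at a pointer-aligned offset");
```

On a typical 64-bit target `struct hdr_padded` and `struct hdr_mixed` have the same size, which is why an extra 32-bit field next to a 32-bit counter costs nothing there.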
A quick test with retest and baresip shows we never call
diff --git a/src/mem/mem.c b/src/mem/mem.c
index 1f08e18c..74b33952 100644
--- a/src/mem/mem.c
+++ b/src/mem/mem.c
@@ -194,6 +194,11 @@ void *mem_realloc(void *data, size_t size)
MAGIC_CHECK(m);
+ if (m->nrefs > 1) {
+ DEBUG_WARNING("realloc: called with multiple nrefs!\n");
+ return NULL;
+ }
+
#if MEM_DEBUG
mem_lock(); |
I think we did call it like that at some point (which originated the bug fix and the optimization), but we have removed all uses of Returning |
Of course, yet another option would be to remove |
Ensure that the pointer returned by mem_alloc & co. is properly aligned, similar to malloc on a given platform. This is important e.g. for SIMD processing, including libc string operations.
Also, fixed mem_realloc behavior when there are multiple references to the memory buffer being reallocated. In this case, the implementation now allocates a new buffer, copies the contents and decrements the reference counter on the original memory buffer. This way the other references to the old buffer remain valid.
Also, fixed formatting in mem_status.
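The mem_realloc behavior described in this commit message can be sketched with a toy reference-counted allocator. Everything below (`toy_alloc`, `toy_ref`, `toy_deref`, `toy_nrefs`, `toy_realloc` and the header layout) is hypothetical and much simpler than the real mem module: there is no locking, no destructor handler and no debug bookkeeping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy header placed in front of the user data. */
struct hdr {
	uint32_t nrefs;  /* number of references */
	size_t size;     /* size of the user data in bytes */
};

static struct hdr *hdr_of(const void *data)
{
	return (struct hdr *)data - 1;
}

static void *toy_alloc(size_t size)
{
	struct hdr *h = malloc(sizeof(*h) + size);

	if (!h)
		return NULL;

	h->nrefs = 1;
	h->size = size;

	return h + 1;
}

static void *toy_ref(void *data)
{
	hdr_of(data)->nrefs++;
	return data;
}

static void toy_deref(void *data)
{
	struct hdr *h = hdr_of(data);

	if (--h->nrefs == 0)
		free(h);
}

static uint32_t toy_nrefs(const void *data)
{
	return hdr_of(data)->nrefs;
}

static void *toy_realloc(void *data, size_t size)
{
	struct hdr *h = hdr_of(data);

	if (h->nrefs > 1) {
		/* Shared buffer: copy, so other holders stay valid */
		void *copy = toy_alloc(size);

		if (!copy)
			return NULL;

		memcpy(copy, data, size < h->size ? size : h->size);
		toy_deref(data);  /* drop only the caller's reference */

		return copy;
	}

	/* Sole owner: the block may be resized or moved freely */
	h = realloc(h, sizeof(*h) + size);
	if (!h)
		return NULL;

	h->size = size;

	return h + 1;
}
```

With two references to a buffer, `toy_realloc` returns a fresh block with one reference and leaves the old block alive with one reference, so the other holder never sees a dangling pointer.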
20e2ab2
to
051de66
Compare
051de66
to
93c81eb
Compare
This allows the mem header to fit in 16 bytes on 64-bit targets in release builds, saving 16 bytes of overhead due to data alignment.
93c81eb
to
f6dd72a
Compare
Ensure that the pointer returned by `mem_alloc` & co. is properly aligned, similar to `malloc` on a given platform. This is important e.g. for SIMD processing, including libc string operations.
Also, made `mem_realloc` optimize away the `realloc` call if there is enough storage in the allocated buffer. This only works if the input pointer is the only reference to the memory block.
Also, fixed formatting in `mem_status`.