Conversation
|
I realize there's almost nothing to this PR but I'm pinging you for feedback before putting into a PR to the main repo owned by Zeiss. Any feedback is greatly appreciated. |
Src/libCZI/stdAllocator.cpp
Outdated
| return malloc((size_t)size); | ||
| #else | ||
| #if defined(__GNUC__) | ||
| // it's not clear to me why the code needs the returned pointer to be at a memory boundary |
There was a problem hiding this comment.
I would not submit to libCZI with this comment. It would be ideal to completely understand the code change you are submitting.
There was a problem hiding this comment.
For the sake of discussion:
I'm not an expert on this but here is my limited understanding:
- the cpu fetches memory from RAM into local caches at run time
- so when you need to access a big data structure, the cpu actually delivers it in chunks in the local cache
- the local cache is small but super fast to access
- so it is a net gain to move chunks mem into local cache at run time rather than access the big blob in global ram
- the local cache speed is super fast
- so if your data structure has an element half in and half out of the cache, then you will pay a cost while iterating
- it will have to evict cache and load another chunk
- which is why you want your structures to align to cache boundaries
|
One thing is somewhat unclear to me from your description. Is this a C vs C++ issue or a Ubuntu vs CentOS/Manylinux issue? |
Src/libCZI/stdAllocator.cpp
Outdated
| void* pv = _aligned_malloc((size_t)size, 32); | ||
| return pv; | ||
| #endif | ||
| return ::operator new((size_t)size); |
There was a problem hiding this comment.
can you confirm that the use of operator new will give the 32bit alignment?
There was a problem hiding this comment.
it won't, this code is replacing a malloc but malloc is pure C not C++.
This is functionally equivalent to malloc but if library enhancements have been made they'll be in new not in malloc.
There was a problem hiding this comment.
oh yes! but malloc and new are not exactly equivalent. this is a subtle behavior change that I don't think you intend.
|
It's a C11 vs C++ thing but the waters are muddied by some distros of Linux adding the C11 functions to their cstdlib.h. |
The initial code assumes that aligned_alloc is defined in C++ but aligned_alloc is part of the C11 standard and isn't incorporated into C++ until C++17. To the best of my knowledge, the latest version of C++ that libCZI compiles against is C++14 because Eigen has a conflict with C++17. To complicate matters further some variants of Linux define aligned_alloc in stdlib.h for their platform and it may also be defined in g++XX but I haven't investigated that.
What this means:
libCZI builds on Ubuntu, Arch-Linux, and Linux distro's that add aligned_alloc to cstdlib.h
Manylinux2010 which I believe is based off Centos 6 does not have aligned_alloc defined resulting in a failed build for any c++ based python extension.
To remedy this issue I've simplified the conditional logic added and the condition
defined(_ISOC11_SOURCE)to the#ifstatement in stdAllocator.cpp and defaulting to new and delete usage if aligned_alloc or WIN32's equivalent isn't defined. For the manylinux2010 case this seems to resolve the issue and allows compilation to work.