Add proposed alignment specifier #33

Steve132 · 2021-12-26T12:30:58Z

This patch demonstrates a proposal to add a new alignment tag that makes alignment safer and more user-friendly across platforms and facilitates generic code.

Specifically, when specifying alignment, the current options are only overaligned<N>, vector_aligned, or element_aligned.

The difficulty comes from the fact that correctly writing this code is actually impossible in a cross-platform way. For example, suppose we take the following code:

float* data=(float*)aligned_alloc(16, 64*sizeof(float)); //aligned_alloc takes (alignment, size)
stdx::native_simd<float> vec(data,stdx::vector_aligned);

This code only works on NEON and SSE. It is incorrect on AVX and AVX-512, where native-size vectors require 32-byte and 64-byte alignment respectively.
Fixing it is impossible without changing the allocation code:

float* data=(float*)aligned_alloc(stdx::memory_alignment_v<stdx::native_simd<float>>, 64*sizeof(float));
stdx::native_simd<float> vec(data,stdx::vector_aligned);

But this code might not always be possible if the data buffer is allocated by a library.
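When the allocation is under your control, the fix above can be wrapped in a small helper. The following is a minimal sketch, not part of the patch: `alloc_aligned_floats` is a hypothetical name, and the alignment is passed in as a plain number so the example does not depend on `<experimental/simd>`. With the real library you would pass `stdx::memory_alignment_v<stdx::native_simd<float>>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not part of std::simd or this patch): allocate `count`
// floats whose start address is a multiple of `alignment` bytes.
// The caller releases the buffer with std::free.
inline float* alloc_aligned_floats(std::size_t count, std::size_t alignment) {
    // std::aligned_alloc expects a power-of-two alignment.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    std::size_t bytes = count * sizeof(float);
    // The size passed to std::aligned_alloc should be a multiple of the
    // alignment, so round the byte count up.
    bytes = (bytes + alignment - 1) / alignment * alignment;
    return static_cast<float*>(std::aligned_alloc(alignment, bytes));
}
```

Requesting 64-byte alignment covers SSE, NEON, AVX, and AVX-512 at once, at the cost of over-aligning on the narrower targets.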

Furthermore, with simd::copy_to it becomes impossible to write back to a variable defined on the stack in a cross-platform way:

std::array<float,64> get_vecdata(....){
	....
	std::array<float,64> output; //this is 16-byte aligned by default
	for(int i=0;i<64;i+=stdx::native_simd<float>::size())
	{
		stdx::native_simd<float> vec_result=...;
		vec_result.copy_to(&output[i],vector_aligned); //correct on NEON, SSE2, incorrect on AVX
		vec_result.copy_to(&output[i],element_aligned); //correct on all platforms, but slow.
	}
	return output;
}

Similarly, reading from a stack variable fails as well:

	float read_vecdata(const std::array<float,64>& data)
	{	
		stdx::native_simd<float> vec_result(&data[0],vector_aligned); //is it aligned on this platform? Maybe!
	}
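To make the "Maybe!" concrete: a callee that only sees a `const std::array<float,64>&` can rely on no more than the alignment of the type, and anything stronger has to be checked at run time or encoded in the type. The sketch below is an illustration only. `is_aligned` and `aligned_floats` are hypothetical names, not part of std::simd or this patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// The alignment the type system guarantees for a std::array<float, 64>.
// On common implementations this equals alignof(float), not the vector width.
inline constexpr std::size_t array_alignment = alignof(std::array<float, 64>);

// Hypothetical wrapper that bakes a stronger, 32-byte alignment into the type,
// so every object of this type satisfies an AVX-sized aligned load.
struct alignas(32) aligned_floats {
    std::array<float, 64> values;
};
```

A run-time check such as `is_aligned(&data[0], 32)` can guard an aligned load, but it costs a branch per call, which is part of what a compile-time tag is meant to avoid.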

So writing code that is correct on all platforms becomes impossible, and it is the programmer's responsibility to know which alignment requirements are satisfied on every target.
This might involve writing lots of ifdefs, falling back to the slow case, or resorting to template metaprogramming,
which is exactly the kind of code std::simd is supposed to prevent!
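For illustration, the ifdef approach might look like the sketch below. It keys on the predefined macros that GCC and Clang set for each instruction set. The names `native_vector_alignment` and `can_use_aligned_path` are hypothetical, and the scalar fallback value is an assumption.

```cpp
#include <cstddef>

// Hand-maintained table of native vector alignments, selected by the
// instruction-set macros that GCC and Clang predefine.
#if defined(__AVX512F__)
inline constexpr std::size_t native_vector_alignment = 64;
#elif defined(__AVX__)
inline constexpr std::size_t native_vector_alignment = 32;
#elif defined(__SSE2__) || defined(__ARM_NEON)
inline constexpr std::size_t native_vector_alignment = 16;
#else
// Assumed scalar fallback: only element alignment is required.
inline constexpr std::size_t native_vector_alignment = alignof(float);
#endif

// Hypothetical helper: may a buffer with the given known alignment use the
// aligned load/store path on this target?
constexpr bool can_use_aligned_path(std::size_t buffer_alignment) {
    return buffer_alignment >= native_vector_alignment;
}
```

Every new target or element type means another branch in this table, and every call site has to remember to consult it.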

Writing generic code becomes even more difficult. Consider:

template<class T>
std::array<T,64> get_vecdata(....){
	....
	std::array<T,64> output; //this is min(16,alignof(T))-byte aligned by default
	for(int i=0;i<64;i+=stdx::native_simd<T>::size())
	{
		stdx::native_simd<T> vec_result=...;
		vec_result.copy_to(&output[i],vector_aligned);   //which platforms and types does this work on without causing an unaligned access?  Who knows!
	}
	return output;
}

This proposed patch fixes that problem. By giving the programmer a tag that states the exact alignment of the data, generic and cross-platform code become possible again.
Internally, the new "stdx::aligned" tag selects an aligned or an unaligned vector load at compile time, based on the platform architecture, the element type of the vector being loaded, and the byte alignment passed in by the user. It also triggers a compile-time assertion if the given byte alignment is not even element-aligned (an easy mistake to make with structure types). This allows platform-agnostic, generic code that never performs a misaligned vector access, and it relieves client code of deciding what works and what doesn't.
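The compile-time selection described above can be sketched roughly as follows. This is not the code in the patch: `toy_vec4f`, `load_kind`, `select_load`, `aligned_tag`, and `load` are hypothetical stand-ins, and both branches use `std::memcpy` so the example stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a 4-lane float vector that needs 16-byte alignment.
struct toy_vec4f {
    static constexpr std::size_t required_alignment = 16;
    float lanes[4];
};

enum class load_kind { aligned, unaligned };

// Decide at compile time which load to use for vector type V with element
// type T, given that the pointer is known to be N-byte aligned.
template <class V, class T, std::size_t N>
constexpr load_kind select_load() {
    static_assert(N != 0 && (N & (N - 1)) == 0, "alignment must be a power of two");
    static_assert(N >= alignof(T), "pointer must be at least element-aligned");
    return N >= V::required_alignment ? load_kind::aligned : load_kind::unaligned;
}

// Hypothetical tag type carrying the user-supplied alignment.
template <std::size_t N>
struct aligned_tag {};

template <std::size_t N>
toy_vec4f load(const float* p, aligned_tag<N>) {
    assert(p != nullptr);
    toy_vec4f v;
    if constexpr (select_load<toy_vec4f, float, N>() == load_kind::aligned) {
        // A real backend would emit an aligned load here (e.g. _mm_load_ps on SSE).
        std::memcpy(v.lanes, p, sizeof v.lanes);
    } else {
        // A real backend would emit an unaligned load here (e.g. _mm_loadu_ps on SSE).
        std::memcpy(v.lanes, p, sizeof v.lanes);
    }
    return v;
}
```

With this shape, `load(p, aligned_tag<4>{})` compiles everywhere and takes the unaligned path for this vector type, `load(p, aligned_tag<16>{})` takes the aligned path, and `aligned_tag<2>` fails to compile for `float`.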

Example of reading from a pre-allocated buffer with a fixed alignment:

float* data=(float*)aligned_alloc(16, 64*sizeof(float)); //aligned_alloc takes (alignment, size)
stdx::native_simd<float> vec(data,stdx::__proposal::aligned<16>);

Example of reading from a reference:

float read_vecdata(const std::array<float,64>& data)
{	
	stdx::native_simd<float> vec_result(&data[0],stdx::__proposal::aligned<alignof(data)>);
}

Example of writing back to an output in generic code:

template<class T>
std::array<T,64> get_vecdata(....){
	....
	std::array<T,64> output; //this is min(16,alignof(T))-byte aligned by default
	for(int i=0;i<64;i+=stdx::native_simd<T>::size())
	{
		stdx::native_simd<T> vec_result=...;
		vec_result.copy_to(&output[i],stdx::__proposal::aligned<alignof(output)>); 
	}
	return output;
}

This pull request is meant to be incorporated by @mattkretz into the standards proposal and into the libstdc++ implementation.

Add proposed alignment<

0ebe8db
