Alignment for sendBuffer32, recvBuffer32 and HalfSpinor32 #172

Open
kostrzewa opened this Issue Oct 15, 2012 · 4 comments

2 participants

@kostrzewa
European Twisted Mass Collaboration member

A halfspinor32 has a size of 48 bytes, so with the way HalfSpinor32 is currently defined:

(unsigned long int) &HalfSpinor32[1] - (unsigned long int) &HalfSpinor32[0] = 48

This means that every second element of HalfSpinor32 is not 32-byte aligned. Strangely, on BG/Q this doesn't seem to affect the performance of the sloppy hopping matrix in practice. In fact, introducing a remapping [1] reduces nocomm performance by 4 Mflops...

[1] Allocating sizeof(halfspinor32)+padding bytes of memory for each element, redeclaring HalfSpinor32 as halfspinor32**, allocating memory for a pointer array, and pointing its entries into HalfSpinor32_ at intervals of sizeof(halfspinor32)+padding.
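
For reference, the remapping described in [1] could look roughly like the sketch below. It is not the code that was benchmarked: `halfspinor32_sketch`, `padded_stride` and `alloc_padded` are hypothetical names, the struct is only a 48-byte stand-in for the real halfspinor32, and the use of C11 `aligned_alloc` is an assumption about the toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for halfspinor32: 12 floats = 48 bytes.
   The real tmLQCD definition may be laid out differently. */
typedef struct { float c[12]; } halfspinor32_sketch;

#define ALIGN 32u

/* Element stride: sizeof rounded up to the next multiple of ALIGN,
   i.e. sizeof(halfspinor32) + padding (48 -> 64 bytes here). */
static size_t padded_stride(void) {
  return (sizeof(halfspinor32_sketch) + ALIGN - 1u) / ALIGN * ALIGN;
}

/* Allocates one 32-byte aligned block holding n padded elements and a
   pointer table whose entry i points at element i inside that block.
   On success the block is returned through block_out so that the caller
   can free both the table and the block. Returns NULL on failure. */
static halfspinor32_sketch **alloc_padded(size_t n, void **block_out) {
  assert(block_out != NULL);
  if (n == 0) return NULL;

  size_t stride = padded_stride();
  /* n * stride is a multiple of ALIGN, as aligned_alloc requires. */
  unsigned char *block = aligned_alloc(ALIGN, n * stride);
  halfspinor32_sketch **table = malloc(n * sizeof *table);
  if (block == NULL || table == NULL) {
    free(block);
    free(table);
    return NULL;
  }

  for (size_t i = 0; i < n; ++i) {
    table[i] = (halfspinor32_sketch *)(block + i * stride);
  }
  *block_out = block;
  return table;
}
```

With this layout every `table[i]` is 32-byte aligned, at the price of 16 padding bytes per element and one extra pointer dereference per access. Whether either of those explains the slowdown reported above is not something this sketch can show.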

@urbach
urbach commented Oct 16, 2012

For HalfSpinor32 I simply reuse the memory for HalfSpinor in the NewHalfspinor branch. I do the same for the communication buffers. This doesn't solve the alignment problem, of course...

@kostrzewa
European Twisted Mass Collaboration member

I don't really understand, though, why there doesn't really seem to be an alignment problem...

@urbach
urbach commented Oct 16, 2012

The loads that are used (vec_ld2) need only 16-byte alignment, which is always the case for HalfSpinor32. Otherwise, you could never do a sloppy precision Dirac operator. I think it's slower, but to profit from 32-byte alignment you'd need to change the load operation, too...
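
The arithmetic behind both observations in this thread can be checked with a small sketch. It assumes the array base itself is at least 32-byte aligned and uses a hypothetical 48-byte stand-in struct (`halfspinor32_sketch`) rather than the real halfspinor32; `offset_mod` is likewise an illustrative helper, not part of the code base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for halfspinor32: 12 floats = 48 bytes. */
typedef struct { float c[12]; } halfspinor32_sketch;

/* Byte offset of element i from the array base, reduced modulo align.
   A result of 0 means element i has that alignment, provided the base
   of the array does. */
static size_t offset_mod(size_t i, size_t align) {
  assert(align != 0);
  return (i * sizeof(halfspinor32_sketch)) % align;
}
```

For a 48-byte element, `offset_mod(i, 32)` alternates between 0 and 16, so every second element misses 32-byte alignment, while `offset_mod(i, 16)` is 0 for every `i` because 48 is a multiple of 16.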

@kostrzewa
European Twisted Mass Collaboration member

I see, that makes sense. Perhaps we should just leave it be then until a problem shows up somewhere?
