Skip to content

Conversation

@v-klochkov
Copy link
Contributor

@v-klochkov v-klochkov commented Nov 6, 2023

  1. Fix the big mess in E2E test for block_load(). Test did not really
    check the mask variant. It also used wrong alignments.

  2. Fix the comments for USM and ACC block_load implementations.

  3. Minor optimization for ACC block_load functions that do not accept
    the byte_offset operand. We can assume align16 for them.

…c,...)

1) Fix the big mess in E2E test for block_load(). Test did not really
   check the mask variant. It also used wrong alignments.

2) Fix the comments for USM and ACC block_load implementations.
3) Minor optimization for ACC block_load functions that do not accept
   the byte_offset operand. We can assume align16 for them.

Signed-off-by: Klochkov, Vyacheslav N <vyacheslav.n.klochkov@intel.com>
@v-klochkov v-klochkov requested a review from a team as a code owner November 6, 2023 17:53
static_assert(!PropertyListT::template has_property<cache_hint_L3_key>(),
"L3 cache hint is reserved. The old/experimental L3 LSC cache "
"hint is cache_level::L2 now.");
properties Props{cache_hint_L1<L1Hint>, cache_hint_L2<L2Hint>, alignment<16>};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we want to ignore the properties-given alignment in this and the other cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buffers/accessors should reference aligned memory on device. If byte_offset is zero we load from aligned device-buffer.
If user says the alignment is 4 it is unnecessarily pessimistic, which will cause less efficient code-gen.
If alignment is more than 16, e.g. 256, it will not give any extra benefit to code-gen.

Copy link
Contributor

@sarnex sarnex Nov 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

silently ignoring it seems somewhat strange to me, could we static assert if it is specified or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having something like:

static_assert(!IsUserAlignedmentSpecified || UserAlignment >= 16, "Alignment is too pessimistic, specify 16 or more for more efficient code-gen");

seems too ... (annoying?).
Compiler reserves the right to optimize the code. This ignoring of smaller alignment is the optimization.

Copy link
Contributor

@sarnex sarnex Nov 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry here is that we're silently ignoring explicit user specification that a user might expect to applied. We already silently ignore any other properties in the properties vector not related to ESIMD and just warn people in a comment but that seems reasonable to me since it's a totally different thing, if we have some ESIMD APIs that honor alignment and some that silently ignore it, that seems like it might be confusing to the user. If the alignment property spec was "aligned by at least X bytes" that's one thing, but I think the spec is exactly X bytes.

Copy link
Contributor Author

@v-klochkov v-klochkov Nov 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

passing alignment means "the address is aligned at least to "N"-bytes", but it may NOT be aligned at less than N-bytes (only N-bytes aligned guaranteed).

It is same as if you call USM aligned_alloc(align=16, size); which means the requested memory must be aligned at at least 16-bytes, but the returned address may be 256-bytes aligned too (accidentally or intentionally, depending on implementation).

In this case user's call of block_load<float, N>(acc, {alignement<4>}); means that user guarantees only 4-byte alignment. If by some heuristics we can guarantee better alignment, then why not use it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that convinces me, thanks

@v-klochkov v-klochkov merged commit f54f61d into intel:sycl Nov 6, 2023
@v-klochkov v-klochkov deleted the esimd_block_load_slm_unrelated_changes branch November 6, 2023 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants