You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Store operation on a part of a Tensor, what I want:
accum_outs=tl.zeros([N], dtype=tl.float32)
forcol_offinrange(0, N, BLOCK_SIZE):
cols=col_off+tl.arange(0, BLOCK_SIZE)
mask=cols<Na_eles=tl.load(a_ptr+cols, mask=mask, other=0.0)
b_eles=tl.load(b_ptr+cols, mask=mask, other=0.0)
# How to implement the following: accum_outs[col_off: col_off+BLOCK_SIZE] =a_eles*b_eles
Because it's going to use "accum_outs" later and I don't want to store it back like following:
forcol_offinrange(0, N, BLOCK_SIZE):
a_eles=tl.load(a_ptr+cols, mask=mask, other=0.0)
b_eles=tl.load(b_ptr+cols, mask=mask, other=0.0)
tl.store(accum_res_ptr+cols, a_eles*b_eles, mask=mask)
# tl.ops that must be done outside the loopforcol_offinrange(0, N, BLOCK_SIZE):
eles=tl.load(accum_res_ptr+cols, mask=mask, other=0.0)
...
The text was updated successfully, but these errors were encountered:
At the moment you can't perform stores on part of tensors as far as I know. You can load a and b using 2d indexing and unravelling the output something like this for your use-case. NUM_BLOCKS should be passed as a tl.constexpr and should be the next power of 2 for (N + BLOCK_SIZE - 1//BLOCK_SIZE)
num_blocks = tl.cdiv(N, BLOCK_SIZE)
block_offs = tl.arange(0, NUM_BLOCKS)
per_block_offs = tl.arange(0, BLOCK_SIZE)
all_offs = block_offs[None, :] * BLOCK_SIZE + per_block_offs[:, None]
# 2d pointers of shape [NUM_BLOCKS, BLOCK_SIZE]
a = tl.load(a_ptr + all_offs,
mask=all_offs < N and block_offs[None, :] < num_blocks,
other=0.0)
b = tl.load(b_ptr + all_offs,
mask=all_offs < N and block_offs[None, :] < num_blocks,
other=0.0)
# [NUM_BLOCKS, BLOCK_SIZE] --> NUM_BLOCKS * BLOCK_SIZE in which N elements will be filled others = 0
accum_outs = tl.ravel(a * b)
Store operation on a part of a Tensor, what I want:
Because it's going to use "accum_outs" later and I don't want to store it back like following:
The text was updated successfully, but these errors were encountered: