This is a capture of some of the discussion in our meeting on 1/10.
This refers to #1334.
Please add anything I forgot to mention.
Break the work into separate PRs for the forall changes and the kernel changes.
Add separate policies for scan and sort, because the new forall policies contain thread mapping information that is not relevant to scan and sort. It could become relevant if we ever implemented scan and sort ourselves.
Kernel policies can specify a thread mapping without a block mapping, but both should always be mapped.
Use a global policy with grid_size set to 1 to specify one block, instead of using just a thread policy.
The kernel launch policy can be inconsistent with how it is used.
Require the higher block size in the policies inside the launch.
Kernel policies can contradict each other, leading to incorrect mapping behavior.
For example, this policy snippet would launch a kernel with a block size of 256, but the first for would map incorrectly because it expects a block size of 128: for< direct_thread_x<128>, lambda<0> >, for< direct_thread_x<256>, lambda<0> >
Optimize policies with grid_size or block_size set to 1, since the index is always 0 in that case.
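To make the last two points concrete, here is a minimal sketch in plain C++. It is not RAJA code: `direct_thread_x_sketch`, `same_block_size`, and `mapped_index_sketch` are hypothetical names made up for illustration. It assumes each thread policy carries its expected block size as a compile-time constant. The first part shows how contradictory block sizes could be caught at compile time, and the second shows how a size of 1 could skip the index computation.

```cpp
#include <cstddef>

// Hypothetical stand-in for a direct thread mapping policy that carries
// its expected block size as a compile-time constant. Not a RAJA type.
template <std::size_t BlockSize>
struct direct_thread_x_sketch {
  static constexpr std::size_t block_size = BlockSize;
};

// Hypothetical trait: true when every policy in the list expects the same
// block size, so the loops inside one kernel cannot contradict each other.
template <typename First, typename... Rest>
struct same_block_size;

// A single policy is trivially consistent with itself.
template <typename Only>
struct same_block_size<Only> {
  static constexpr bool value = true;
};

// Compare neighbouring policies pairwise, then recurse on the tail.
template <typename First, typename Second, typename... Rest>
struct same_block_size<First, Second, Rest...> {
  static constexpr bool value =
      First::block_size == Second::block_size &&
      same_block_size<Second, Rest...>::value;
};

// Hypothetical index mapping. raw_index stands in for a hardware index
// such as threadIdx.x or blockIdx.x.
template <std::size_t Size>
struct mapped_index_sketch {
  static constexpr std::size_t get(std::size_t raw_index) { return raw_index; }
};

// When the size is 1 the index is always 0, so the raw index is ignored.
template <>
struct mapped_index_sketch<1> {
  static constexpr std::size_t get(std::size_t) { return 0; }
};

// The contradictory pair from the notes above is detected as inconsistent.
static_assert(!same_block_size<direct_thread_x_sketch<128>,
                               direct_thread_x_sketch<256>>::value,
              "128 and 256 must be reported as inconsistent");
```

A real policy could put a `static_assert` on such a trait to reject the contradictory snippet at compile time instead of mapping incorrectly at run time. Whether that suits the actual policy design is an open question for the follow-up PRs.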