MLIR emitters: Vectorize column reductions. #68425

Merged
merged 1 commit into from
May 22, 2024

copybara-service[bot]
MLIR emitters: Vectorize column reductions.

Special thanks to GitHub user lingzhi98, who experimented with this in
openxla/xla#11018.

I tried to make the logic as similar for vectorized and non-vectorized
reductions as I could. The vectorized logic looks like this:

  • produce N reduced elements per thread and store the intermediate results
    in a vector V
  • loop over the N elements of V, writing each one to shared memory (shmem)
  • loop over the N elements again, reading them from shmem and writing the
    results to global memory
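
The three steps above can be sketched as a sequential Python simulation of one
possible thread layout. This is not the emitter's code: the function name, the
parameters (`vector_size` for N, `threads_x`, `threads_y`), the strided row
assignment, and the shmem indexing are all assumptions made for illustration.
In particular, the description only says the values are read back from shmem
and written out; combining the partial results of the threads that share a
column during that read is an assumption here.

```python
def column_reduce_vectorized(matrix, vector_size=2, threads_x=4, threads_y=4):
    """Sum each column of `matrix` (a list of equal-length rows).

    Hypothetical layout: thread (tx, ty) owns `vector_size` adjacent columns
    and every `threads_y`-th row starting at row `ty`.
    """
    num_rows = len(matrix)
    num_cols = len(matrix[0]) if matrix else 0
    cols_per_block = threads_x * vector_size
    out = [0] * num_cols  # stands in for global memory

    # Each iteration plays the role of one thread block.
    for block_start in range(0, num_cols, cols_per_block):
        # shmem[i][tx][ty]: one slot per vector element per thread.
        shmem = [[[0] * threads_y for _ in range(threads_x)]
                 for _ in range(vector_size)]

        for ty in range(threads_y):
            for tx in range(threads_x):
                # Step 1: produce N partially reduced elements in a vector V.
                v = [0] * vector_size
                for row in range(ty, num_rows, threads_y):
                    for i in range(vector_size):
                        col = block_start + tx * vector_size + i
                        if col < num_cols:
                            v[i] += matrix[row][col]
                # Step 2: loop over the N elements of V, writing each to shmem.
                for i in range(vector_size):
                    shmem[i][tx][ty] = v[i]

        # On a GPU, a barrier would separate the writes above from the reads below.

        # Step 3: loop over the N elements, read them from shmem, and write
        # the result to global memory (assumed: partials are summed here).
        for tx in range(threads_x):
            for i in range(vector_size):
                col = block_start + tx * vector_size + i
                if col < num_cols:
                    out[col] = sum(shmem[i][tx])
    return out
```

Setting `vector_size=1` in this sketch reduces it to the non-vectorized shape,
which mirrors the stated goal of keeping the two code paths similar.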

@copybara-service copybara-service bot force-pushed the exported_pr_636130464 branch 3 times, most recently from cd3853d to d84b712 Compare May 22, 2024 18:41
PiperOrigin-RevId: 636243118
@copybara-service copybara-service bot closed this May 22, 2024
@copybara-service copybara-service bot merged commit aa12419 into master May 22, 2024
1 check passed
@copybara-service copybara-service bot deleted the exported_pr_636130464 branch May 22, 2024 20:20