[DLight] Update GEMV rule to support Adreno outer reduction #15730
Merged
Hzfengsy merged 1 commit into apache:unity on Sep 13, 2023
Conversation
spectrometerHBH approved these changes on Sep 12, 2023
junrushao approved these changes on Sep 12, 2023
As Adreno has poor shared memory performance, we prefer `q4f16_0` over `q4f16_1` for LLM workloads. This PR adds GEMV rule support for Adreno outer reduction. Compared with `q4f16_1`, we get a performance gain:
- Prefill (not related to this PR's changes): 3636.6501 ms -> 1241.9469 ms
- Decode: 211.0834 ms -> 174.9357 ms
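Below is a minimal sketch (not part of this PR's diff) of how the DLight GEMV rule can be applied to an outer-reduction GEMV when targeting an Adreno-class OpenCL GPU. It assumes the public `tvm.dlight` Python API on the unity branch at the time; the TIR workload and the target string are illustrative, with the weight stored reduction-major (`(k, n)`), as in `q4f16_0`.

```python
# Minimal sketch (illustrative, not from this PR): apply the DLight GEMV rule
# to an outer-reduction GEMV for an Adreno-class OpenCL GPU.
import tvm
from tvm import dlight as dl
from tvm.script import tir as T


@tvm.script.ir_module
class GEMVModule:
    @T.prim_func
    def main(
        A: T.Buffer((1, 4096), "float16"),
        W: T.Buffer((4096, 4096), "float16"),  # reduction-major (k, n) layout, as in q4f16_0
        C: T.Buffer((1, 4096), "float16"),
    ):
        T.func_attr({"global_symbol": "main", "tir.noalias": True})
        for i, j, k in T.grid(1, 4096, 4096):
            with T.block("gemv"):
                vi, vj, vk = T.axis.remap("SSR", [i, j, k])
                with T.init():
                    C[vi, vj] = T.float16(0)
                # The reduction axis vk indexes the outer dimension of W,
                # i.e. the "outer reduction" pattern this PR targets.
                C[vi, vj] = C[vi, vj] + A[vi, vk] * W[vk, vj]


# Hypothetical Adreno target string; adjust for the actual device.
target = tvm.target.Target("opencl -device=adreno")
with target:
    scheduled = dl.ApplyDefaultSchedule(dl.gpu.GEMV())(GEMVModule)
print(scheduled.script())
```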
Force-pushed from 64e50dc to 3faa86b
Hzfengsy pushed a commit to Hzfengsy/mlc-llm that referenced this pull request on Sep 14, 2023
As apache/tvm#15730 has been merged, there is no need to dispatch pre-tuned kernels anymore. This PR disables the dispatch.
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this pull request on Sep 14, 2023
PR apache#15730 introduced outer_reduction for Adreno GEMV. This PR fixes the length issue when applied to dynamic workloads.
Hzfengsy pushed a commit to Hzfengsy/mlc-llm that referenced this pull request on Sep 14, 2023
As apache/tvm#15730 has been merged, there is no need to dispatch pre-tuned kernels anymore. This PR disables the dispatch.
MasterJH5574 pushed a commit to mlc-ai/mlc-llm that referenced this pull request on Sep 14, 2023
As apache/tvm#15730 has been merged, there is no need to dispatch pre-tuned kernels anymore. This PR disables the dispatch.
junrushao pushed a commit that referenced this pull request on Sep 16, 2023
PR #15730 introduced outer_reduction for Adreno GEMV. This PR fixes the length issue when applied to dynamic workloads.
smickey040404 added a commit to smickey040404/mlc-llm that referenced this pull request on Feb 11, 2025
As apache/tvm#15730 has been merged, there is no need to dispatch pre-tuned kernels anymore. This PR disables the dispatch.
tristankincaid added a commit to tristankincaid/mlc-llm that referenced this pull request on Feb 16, 2025
As apache/tvm#15730 has been merged, there is no need to dispatch pre-tuned kernels anymore. This PR disables the dispatch.