Conversation

@Shixiaowei02
Collaborator

No description provided.

@juney-nvidia juney-nvidia merged this pull request into release/0.5.0 Oct 18, 2023
@sjtu-cz sjtu-cz mentioned this pull request Nov 7, 2023
@poweiw poweiw added invalid and removed invalid labels Jun 3, 2025
greg-kwasniewski1 pushed a commit to greg-kwasniewski1/TensorRT-LLM that referenced this pull request Jun 10, 2025
…DIA#7)

* example of inductor pattern matcher for RoPE with explicit cos/sin matcher

Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>

* move to utils

Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>

* add usage of scalar_workaround, support op_ignore_type

Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>

* minor

Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>

* update all 3 types of RoPE matcher to use inductor pattern matcher

Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>

* address feedback and refine code/doc

Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>

* minor

Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>

* fix e2e for llama4 and ds rope, remove legalize_graph in canonicalize_graph, update ds rope impl to match the exported graph

Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>

* deprecate previous rope matcher

Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>

---------

Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Ubuntu <201670829+Fridah-nv@users.noreply.github.com>
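
For readers unfamiliar with the technique the commit above refers to, here is a minimal sketch of registering a RoPE pattern with the PyTorch Inductor pattern matcher, assuming explicit cos/sin inputs. The pattern/replacement functions, tensor shapes, and the fused op `torch.ops.rope.fused_apply` are illustrative placeholders, not the code added in this commit.

```python
# Minimal sketch: match an eager-mode RoPE subgraph and replace it with a
# (hypothetical) fused custom op using the Inductor pattern matcher.
import torch
from torch._inductor.pattern_matcher import (
    PatternMatcherPass,
    fwd_only,
    register_replacement,
)

patterns = PatternMatcherPass()

def _rope_pattern(q, cos, sin):
    # Eager-mode RoPE expression; the matcher traces this and searches the
    # FX graph for an equivalent subgraph.
    x1, x2 = torch.chunk(q, 2, dim=-1)
    rotated = torch.cat((-x2, x1), dim=-1)
    return q * cos + rotated * sin

def _rope_replacement(q, cos, sin):
    # Placeholder for a fused RoPE custom op registered elsewhere; a real
    # pass would point this at an actual kernel.
    return torch.ops.rope.fused_apply(q, cos, sin)

example_inputs = [
    torch.randn(2, 8, 16, 64, dtype=torch.float16),  # q: [batch, heads, seq, head_dim]
    torch.randn(2, 1, 16, 64, dtype=torch.float16),  # cos
    torch.randn(2, 1, 16, 64, dtype=torch.float16),  # sin
]

# scalar_workaround=... can be passed here when the pattern closes over
# Python scalars (e.g. an epsilon), which is what the "scalar_workaround"
# commit above refers to.
register_replacement(
    _rope_pattern,
    _rope_replacement,
    example_inputs,
    fwd_only,
    patterns,
)

# Later, the pass is applied to an exported FX graph:
#   patterns.apply(graph_module.graph)
```
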
danielafrimi added a commit to danielafrimi/TensorRT-LLM that referenced this pull request Jun 30, 2025
# This is the 1st commit message:

kernel

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

remove prints

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

test pass

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

test refactor with more use cases

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

refactor

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

refactor_2

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

add tuner wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

autotuner works

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

bfloat16 works; more changes to the thop file

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

when tuning is enabled for the autotuner, it gets real tactic configs

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

zeros + quant mode works

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

act int8

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

removed fp8 for now

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

w4a16 linear module

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

changed cutlass for sm==89

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

test linear work

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

add license

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

works!

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

refactor + linear test pass

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

preprocess in load weights

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

refactor + rebase

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

Blackwell not supported

Signed-off-by: Daniel Afrimi <dafrimi@nvidia.com>

wip

Signed-off-by: Daniel Afrimi <dafrimi@nvidia.com>

skip blackwell

Signed-off-by: Daniel Afrimi <dafrimi@nvidia.com>

wip

Signed-off-by: Daniel Afrimi <dafrimi@nvidia.com>

works

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

# This is the commit message #2:

rebased

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

# This is the commit message #3:

align with my previously working version of linear

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

# This is the commit message #4:

wip

Signed-off-by: Ubuntu <dafrimi@nvidia.com>

# This is the commit message #5:

refactor

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

# This is the commit message #6:

refactor

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

# This is the commit message #7:

refactor

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

# This is the commit message #8:

refactor

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

# This is the commit message #9:

sys path

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

# This is the commit message #10:

sys path

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
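
As a rough illustration of what the "w4a16 linear module" commits above are building toward, here is a hedged sketch of a weight-only int4 linear layer in plain PyTorch. The class name, nibble-packing layout, and group size are assumptions for illustration only; a production path such as TensorRT-LLM's fuses dequantization into the GEMM kernel (selected by the autotuner) rather than materializing fp16 weights, and would also handle zero points and quant modes.

```python
# Illustrative W4A16 linear: int4 weights packed two-per-byte, fp16 activations,
# per-group dequantization scales. Not the TensorRT-LLM implementation.
import torch
import torch.nn as nn

class W4A16Linear(nn.Module):
    def __init__(self, in_features: int, out_features: int, group_size: int = 128):
        super().__init__()
        self.group_size = group_size
        # Two int4 values packed per uint8: shape [out, in // 2].
        self.register_buffer(
            "qweight", torch.zeros(out_features, in_features // 2, dtype=torch.uint8)
        )
        # Per-group dequantization scales: shape [out, in // group_size].
        self.register_buffer(
            "scales", torch.ones(out_features, in_features // group_size, dtype=torch.float16)
        )

    def _dequantize(self) -> torch.Tensor:
        # Unpack the two int4 nibbles and shift them to the signed range [-8, 7].
        low = (self.qweight & 0x0F).to(torch.int8) - 8
        high = (self.qweight >> 4).to(torch.int8) - 8
        w = torch.stack((low, high), dim=-1).flatten(1).to(torch.float16)
        # Apply the per-group scales.
        return w * self.scales.repeat_interleave(self.group_size, dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dequantize-then-GEMM; a real kernel fuses these two steps.
        return x @ self._dequantize().t()
```
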
litaotju pushed a commit to litaotju/TensorRT-LLM that referenced this pull request Jul 19, 2025
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
litaotju pushed a commit to litaotju/TensorRT-LLM that referenced this pull request Jul 24, 2025
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
yuxianq added a commit to yuxianq/TensorRT-LLM that referenced this pull request Jul 28, 2025
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
zongfeijing pushed a commit to zongfeijing/TensorRT-LLM that referenced this pull request Jul 31, 2025
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>