
[Question] P40 support (most cost-effective hardware) #2100

Closed
kripper opened this issue Apr 7, 2024 · 3 comments
Labels
question Question about the usage

Comments

@kripper
Contributor

kripper commented Apr 7, 2024

❓ General Questions

What's the performance of the P40 using mlc-llm + CUDA?

mlc-llm is the fastest inference engine, since it compiles the LLM to take advantage of hardware-specific optimizations, and the P40 is the most cost-effective hardware.

The P40 has 3840 CUDA cores:
https://resources.nvidia.com/en-us-virtualization-and-gpus/p40-datasheet

Did you have difficulties using P40 via CUDA?

@kripper kripper added the question Question about the usage label Apr 7, 2024
@Hzfengsy
Member

Hzfengsy commented Apr 8, 2024

I think it should work if you turn off FlashInfer and CUTLASS support. However, we do not have the resources to optimize for such an old device.

@Nero10578

> I think it should work if you turn off FlashInfer and CUTLASS support. However, we do not have the resources to optimize for such an old device.

If it is reported to work on other Pascal-generation cards like the GTX 1060, then it should work on the P40 too, right?

@tqchen
Contributor

tqchen commented May 28, 2024

We have seen several examples working on older cards. Likely you just need to turn off FlashInfer and CUTLASS, and also follow the instructions to build TVM from source.
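The gating condition behind this advice can be sketched in a few lines: FlashInfer and CUTLASS kernel paths generally target recent GPU architectures, while Pascal cards like the P40 report CUDA compute capability 6.1. The helper below is hypothetical (not part of mlc-llm), and the SM 8.0 threshold is an assumption for illustration, not the libraries' documented requirement:

```python
# Hypothetical helper, NOT part of the mlc-llm API: decide which optional
# kernel libraries to enable from the GPU's CUDA compute capability.
# Assumption: the FlashInfer/CUTLASS paths here are treated as requiring
# SM 8.0+ (Ampere or newer); the exact cutoff is an illustrative guess.

def optional_kernels(compute_cap: tuple) -> dict:
    major, minor = compute_cap
    modern = (major, minor) >= (8, 0)  # assumed threshold, see lead-in
    return {"flashinfer": modern, "cutlass": modern}

# Tesla P40 is Pascal, compute capability 6.1 -> both paths disabled.
print(optional_kernels((6, 1)))  # {'flashinfer': False, 'cutlass': False}
```

On a real system you would read the compute capability from the driver (e.g. `nvidia-smi --query-gpu=compute_cap --format=csv`) rather than hard-coding it.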

@tqchen tqchen closed this as completed May 28, 2024