
Conversation

@tjtanaa
Member

@tjtanaa tjtanaa commented Nov 22, 2023

No description provided.

tjtanaa and others added 20 commits October 27, 2023 00:27
* port dtype_float16.cuh and cache_kernels.cu

* port dtype_bfloat16.cuh

* port attention_utils.cuh

* port more kernels

* fix typo

* add cuda_compat.h

* sync branches

* update

* update

* fixes

* cleanup

* update

* update

* update

* fmt

* cleanup

* refactor

* update

* detecting rocm and adding flag for compiling

* using asm volatile instead of hip api

* using asm volatile for type casting of f16

---------
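The commit trail above centres on two porting techniques: guarding divergent code paths behind a ROCm compile flag (the `cuda_compat.h` and "detecting rocm" commits) and emitting inline `asm volatile` for f16 conversions instead of going through the HIP API. A minimal sketch of that pattern, assuming a `USE_ROCM` define and a hypothetical `to_half` helper (the names and exact asm are illustrative, not copied from the PR's diff):

```cuda
#include <cuda_fp16.h>

// Hypothetical compatibility macro: AMD wavefronts are 64 lanes wide,
// NVIDIA warps are 32 (illustrative of the cuda_compat.h approach).
#ifdef USE_ROCM
  #define WARP_SIZE 64
#else
  #define WARP_SIZE 32
#endif

// Hypothetical f32 -> f16 cast. On the ROCm path, inline asm emits the
// GCN convert instruction directly; on the CUDA path, PTX does the same.
__device__ __forceinline__ uint16_t to_half(float f) {
  uint16_t h;
#ifdef USE_ROCM
  asm volatile("v_cvt_f16_f32 %0, %1;" : "=v"(h) : "v"(f));
#else
  asm volatile("cvt.rn.f16.f32 %0, %1;" : "=h"(h) : "f"(f));
#endif
  return h;
}
```

Writing the conversion as inline asm keeps the ROCm branch structurally symmetric with the existing PTX branch and sidesteps HIP intrinsic headers whose representations of f16 differ across ROCm versions, which appears to be the motivation behind the "using asm volatile instead of hip api" commits.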

Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Amir Balwel <amoooori04@gmail.com>
@tjtanaa tjtanaa requested review from iAmir97 and kliuae November 22, 2023 04:34
@fxmarty

fxmarty commented Nov 30, 2023

Hi @tjtanaa, I am wondering if the work being done in this repo is different (kernel-wise) from vllm-project#1313?

Thank you!

@kliuae kliuae closed this Dec 1, 2023
@kliuae kliuae deleted the v0.2.1.post1-rocm branch December 1, 2023 17:11
@kliuae
Collaborator

kliuae commented Dec 4, 2023

Hi @fxmarty, the kernels in the v0.2.x ports were built upon vllm-project#1313, with some modifications so that they build in our environments, plus the inclusion of the SqueezeLLM quantization kernels. Thank you.

@fxmarty

fxmarty commented Dec 4, 2023

I see, thank you!


3 participants