-
Notifications
You must be signed in to change notification settings - Fork 8.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] refactor #6408
base: master
Are you sure you want to change the base?
[SYCL] refactor #6408
Conversation
@slaren Since we can put SYCL related code under a directory instead of a single file, I might introduce headers-only library for performance optimization, as well as simplifying our effort too (my job during work time 😁 ) @ggerganov @mingfeima for aware |
I think that's good, I plan to start using CUTLASS in the CUDA backend as well. |
It's great to see the new structure. |
This is a good refactoring, and would be helpful for debug . I would suggest waiting for some iq quant prs and then resume work on this. |
yes, drop a note when you finished. |
|
for sub folder: dpct
|
There was a lot of changes which caused conflict. Unable to see how to easily resolve it. @airMeng can you see if there is much that needs to be fixed? |
never mind. just more time needed. |
seperate mmq, mmvq, dmmv from the main files avoid g_sycl_gpu_mgr null fix no new line fix backend no new line fix fp16 issues no final newlines backup backup backup backup backup
according to #5277 (reply in thread), the PR does the following:
- [ ] let the common backend to handle H2D/D2H memcpy.let the PR as simple as possible