enable OffloadedCache on XPU from PyTorch 2.7 #36654
Conversation
…tionalGeneration model
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the "Ready for review" button.
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
The CI failures seem unrelated to my changes.
Hi @yao-matrix, thank you for adding this support. Hi @n17s, are you interested in taking a first look? cc @gante
Looks fine to me overall!
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Looks good to me
LGTM, thank you for adding support! 🤗
I added a minor nit suggesting a more recent import-guard practice; happy to merge once it's sorted.
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
This reverts commit acf1484.
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Thanks! LGTM!
Hi @yao-matrix and @SunMarc - it looks like when running this PR with torch 2.5.0a0+b465a5843b.nv24.9 (from nvcr.io/nvidia/pytorch:24.09-py3) I see the following error:
Perhaps the guards are on the wrong version of pytorch?
It's weird; my version check is against 2.7, meaning that if the version is >= 2.7 it takes the new API path, and otherwise the old one (roughly the sketch below). But I can see that in your PR you changed PyTorch from 2.5 to 2.6, and both of those versions take the old path.
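For reference, a minimal sketch of the kind of guard being described, assuming the device-agnostic `torch.Stream` is usable on XPU only from torch 2.7; this is not the exact transformers code, and the helper name is hypothetical:

```python
# Sketch of a torch-version guard for the device-agnostic stream API.
from packaging import version

import torch

# New API path only for torch >= 2.7 (dev builds of 2.7 also satisfy this).
_TORCH_2_7_PLUS = version.parse(torch.__version__) >= version.parse("2.7.0.dev0")

def _make_prefetch_stream(device: str):
    """Return a stream for cache prefetching on the given device (hypothetical helper)."""
    if _TORCH_2_7_PLUS:
        # New path: device-agnostic torch.Stream, works for "cuda" and "xpu".
        return torch.Stream(device=device)
    # Old path: CUDA-specific stream, as before this PR.
    return torch.cuda.Stream()
```

If the guard works as intended, a torch 2.5/2.6 environment should never reach the `torch.Stream` branch.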
@yao-matrix - yes, that is quite odd, but I was able to bisect the failure to this PR, so perhaps it is another code path that this PR enables that I'm hitting? It does seem to be resolved by updating the torch version, though.
Changes from huggingface/transformers#36654 in transformers cause issues with the torch 2.5 version we were using. This just updated us to use a newer version. --------- Signed-off-by: Logan Adams <loadams@microsoft.com>
@yao-matrix I'm going to revert part of the changes in this PR. To enable your use case, I'm going to add an …
XPU is aligning its features in PyTorch with CUDA. Since PyTorch 2.6, a device-agnostic `torch.Stream` is supported, and XPU supports this API, so I enabled OffloadedCache on XPU.

Why start from 2.7? OffloadedCache needs StreamContext, and the PR adding the `__enter__` attribute to StreamContext was not merged in 2.6; it will be in 2.7.

Tested with the PyTorch 2.7 dev package (`pip install --pre torch==2.7.0.dev20250306 --index-url https://download.pytorch.org/whl/nightly/xpu`).
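As an illustrative usage (not from the PR itself), the offloaded cache can be requested through `generate` via `cache_implementation="offloaded"`; the model choice and the `"xpu"` device below are assumptions for the example:

```python
# Hedged sketch: generation with the offloaded KV cache on an XPU device,
# assuming an XPU machine with a torch >= 2.7 build installed as above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # illustrative model choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("xpu")

inputs = tokenizer("Offloaded caches trade speed for memory by", return_tensors="pt").to("xpu")
# The offloaded cache keeps only the current layer's KV tensors on device,
# offloading the rest to CPU and prefetching the next layer on a side stream.
output = model.generate(**inputs, max_new_tokens=32, cache_implementation="offloaded")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```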