Provide a way to debug explicit CPU fallback #262

Closed
dvrogozh opened this issue May 17, 2024 · 2 comments · Fixed by #318

@dvrogozh
Contributor

@fengyuan14 - Commit 5bf9e0c muted the debug logs for "explicit" CPU fallbacks. This complicates debugging for third-party contributors trying to evaluate XPU backend capabilities: right now I am forced to revert the noted commit just to see which operations are not yet implemented for XPU. Please:

  1. Explain what "explicit CPU fallback" means; this appears to be an internal xpu-team classification that is unclear and confusing to outsiders.
  2. Extend PYTORCH_DEBUG_XPU_FALLBACK=1 to track any CPU fallback happening in the XPU backend. Note: I am fine with the "explicit" fallback being muted by default, but I really need a way to track it.
commit 5bf9e0cc768f7a3b13d829118683275f324399f1 (origin/meng_max_2d)
Author: Feng Yuan <feng1.yuan@intel.com>
Date:   Mon Apr 29 13:05:51 2024 +0800

    Register operator's implementation lazily. (#177)

    1. Avoid dangling operator's implementation (m.impl(torchvision::nms) is
    ahead of `import torchvision` sometime)
    2. Mute debug log of explicit CPU fallback.
    3. Add torchvision.roi_align/_roi_align_backward example case

CC: @jgong5 @mingfeima @XiaobingSuper @ashokei @jingxu10 @gujinghui @EikanWang @fengyuan14 @guangyey
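For context, the requested workflow is to enable the fallback debug flag before the backend initializes. A minimal sketch (the exact logging behavior for "explicit" fallbacks is what this issue asks for, not something the flag guarantees today):

```python
import os

# The variable must be in the environment before torch / torch-xpu-ops
# initializes the backend, so set it prior to `import torch`.
os.environ["PYTORCH_DEBUG_XPU_FALLBACK"] = "1"

# In an XPU build of PyTorch, running an op that lacks an XPU kernel
# would then emit a CPU-fallback warning; whether "explicit" fallbacks
# are also reported is exactly the behavior requested here.
```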

@dvrogozh
Contributor Author

Also filed pytorch/pytorch#126488 for visibility at the PyTorch level.

@fengyuan14
Contributor

Thanks for your feedback. Replied in pytorch/pytorch#126488.

dvrogozh added a commit to dvrogozh/torch-xpu-ops that referenced this issue May 24, 2024
Fixes: intel#262
Fixes: 5bf9e0c ("Register operator's implementation lazily")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
dvrogozh added a commit to dvrogozh/torch-xpu-ops that referenced this issue Jun 3, 2024
Fixes: intel#262
Fixes: 5bf9e0c ("Register operator's implementation lazily")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
github-merge-queue bot pushed a commit that referenced this issue Jun 5, 2024
Fixes: #262, pytorch/pytorch#126488

5bf9e0c ("Register operator's implementation lazily") disabled the warning printout on explicit CPU fallback. I believe users and customers benefit from these warnings in all cases. Note that "explicit fallback" appears to be an internal Intel classification term for supported/unsupported operations that is unlikely to be known to others; non-Intel users will care about CPU fallback in general, regardless of its type. This PR adds the warning back for all CPU fallback cases. We discussed in pytorch/pytorch#126488 whether the printout is needed in Release builds; I think it is, and have enabled it for all build modes. Let's discuss this in the PR review.
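Independent of the backend's own log channel, any fallback warnings that do get emitted can be captured programmatically with Python's standard warnings machinery. This is a generic stdlib sketch, not the torch-xpu-ops mechanism itself, and the warning text below is purely illustrative:

```python
import warnings

def collect_fallback_warnings(fn):
    """Run fn() and return the messages of any warnings mentioning 'fallback'."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even duplicates
        fn()
    return [str(w.message) for w in caught
            if "fallback" in str(w.message).lower()]

# Synthetic stand-in for a real CPU-fallback notice from an XPU workload:
msgs = collect_fallback_warnings(
    lambda: warnings.warn("op aten::nms: falling back to CPU (fallback)"))
```

In practice `fn` would run the XPU workload with PYTORCH_DEBUG_XPU_FALLBACK=1 set, letting a test suite assert that no unexpected fallbacks occurred.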

CC: @gujinghui @EikanWang @fengyuan14 @guangyey @jgong5

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: Feng Yuan <feng1.yuan@intel.com>