Construct `c10::Half` from `float16_t` on ARMv8 #120425

malfet · 2024-02-22T19:20:51Z

This hides the float32 constructors and exposes float16 ones, which lets the compiler perform implicit conversions as needed and, in safe cases, optimize out unneeded upcasts to fp32. See the example below:

#include <arm_neon.h>

#ifndef __ARM_FEATURE_FP16_SCALAR_ARITHMETIC
#error Ieeee
#endif

float16_t sum1(float16_t x, float16_t y) {
    return x + y;
}

float16_t sum2(float16_t x, float16_t y) {
    return static_cast<float>(x) + static_cast<float>(y);
}

Both sum variants compile to a scalar fp16 add when built for a platform that supports fp16 arithmetic:

sum1(half, half):                            // @sum1(half, half)
        fadd    h0, h0, h1
        ret
sum2(half, half):                            // @sum2(half, half)
        fadd    h0, h0, h1
        ret

Fixes a build error introduced by #119483 on some aarch64 configurations that are defined as supporting FP16 but do not define `_Float16`.

cc @snadampal

And let compiler do implicit conversions as needed
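For readers who want to see the constructor swap in isolation, here is a minimal sketch of the idea. It is not the actual `c10::Half` implementation: `HalfSketch`, `SKETCH_NATIVE_FP16`, `arith_t`, `add_halves` and the two software conversion helpers are hypothetical names, and the fallback path is deliberately simplified (it flushes subnormals to zero and collapses NaN to infinity). On aarch64 only the `float16_t` constructor and conversion operator are exposed, so a `float` argument reaches the struct through an implicit `float` to `float16_t` conversion chosen by the compiler.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__aarch64__)
#include <arm_neon.h>            // provides float16_t on ARMv8
#define SKETCH_NATIVE_FP16 1     // hypothetical macro, used only in this sketch
#endif

// Simplified software conversions, used only on the fallback path.
inline uint16_t float_to_half_bits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  uint32_t sign = (bits >> 16) & 0x8000u;
  int32_t exp = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
  uint32_t mant = bits & 0x7FFFFFu;
  if (exp >= 31) return static_cast<uint16_t>(sign | 0x7C00u);  // overflow, inf, NaN
  if (exp <= 0) return static_cast<uint16_t>(sign);             // flush to zero
  uint32_t half = sign | (static_cast<uint32_t>(exp) << 10) | (mant >> 13);
  uint32_t rem = mant & 0x1FFFu;                                // dropped bits
  if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) ++half; // nearest even
  return static_cast<uint16_t>(half);
}

inline float half_bits_to_float(uint16_t h) {
  uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t mant = h & 0x3FFu;
  uint32_t bits;
  if (exp == 0) bits = sign;                                    // zero or flushed subnormal
  else if (exp == 31) bits = sign | 0x7F800000u | (mant << 13); // inf or NaN
  else bits = sign | ((exp + 112u) << 23) | (mant << 13);       // rebias 15 -> 127
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

struct HalfSketch {
  uint16_t x = 0;  // raw IEEE 754 binary16 bits
#ifdef SKETCH_NATIVE_FP16
  using arith_t = float16_t;
  // Only float16_t is exposed; float arguments convert implicitly first.
  HalfSketch(float16_t v) { std::memcpy(&x, &v, sizeof(x)); }
  operator float16_t() const {
    float16_t v;
    std::memcpy(&v, &x, sizeof(v));
    return v;
  }
#else
  using arith_t = float;
  HalfSketch(float v) : x(float_to_half_bits(v)) {}
  operator float() const { return half_bits_to_float(x); }
#endif
};

// Adds two values through whichever arithmetic type the platform exposes.
inline HalfSketch add_halves(HalfSketch a, HalfSketch b) {
  return HalfSketch(static_cast<HalfSketch::arith_t>(a) +
                    static_cast<HalfSketch::arith_t>(b));
}
```

With the native path enabled, `add_halves` never names `float` explicitly, which leaves the compiler free to pick a scalar fp16 add where the target supports it.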

pytorch-bot · 2024-02-22T19:20:54Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120425

✅ You can merge normally! (5 Unrelated Failures)

As of commit 734f0b3 with merge base 65627cf:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

linux-binary-manywheel / manywheel-py3_10-cpu-cxx11-abi-build / build (gh)
../c10/util/C++17.h:13:2: error: #error "You're trying to build PyTorch with a too old version of GCC. We need GCC 9 or later."
linux-binary-manywheel / manywheel-py3_11-cpu-cxx11-abi-build / build (gh)
../c10/util/C++17.h:13:2: error: #error "You're trying to build PyTorch with a too old version of GCC. We need GCC 9 or later."
linux-binary-manywheel / manywheel-py3_12-cpu-cxx11-abi-build / build (gh)
../c10/util/C++17.h:13:2: error: #error "You're trying to build PyTorch with a too old version of GCC. We need GCC 9 or later."
linux-binary-manywheel / manywheel-py3_8-cpu-cxx11-abi-build / build (gh)
../c10/util/C++17.h:13:2: error: #error "You're trying to build PyTorch with a too old version of GCC. We need GCC 9 or later."
linux-binary-manywheel / manywheel-py3_9-cpu-cxx11-abi-build / build (gh)
../c10/util/C++17.h:13:2: error: #error "You're trying to build PyTorch with a too old version of GCC. We need GCC 9 or later."

This comment was automatically generated by Dr. CI and updates every 15 minutes.

mikekgfb

LGTM!

atalman · 2024-02-22T20:39:43Z

@malfet Since #119483 was reverted, should we merge both PRs, test and land them together?

malfet · 2024-02-22T21:18:17Z

> @malfet Since #119483 was reverted, should we merge both PRs, test and land them together?

If this can be landed, it potentially achieves the same results as the other PR, but with fewer lines of code

malfet · 2024-02-23T00:45:14Z

@snadampal FYI

malfet · 2024-02-23T03:47:22Z

@pytorchbot merge -i

snadampal · 2024-02-23T03:52:43Z

Looks good to me!

If I understand correctly, this is an extension to the fp16<->fp32 acceleration feature added as part of this PR. Now it's reusing the same feature via the Half operator, isn't it?

I'm wondering what the major use cases for fp16 (half) datatype kernels are.

pytorchmergebot · 2024-02-23T03:55:07Z

Merge started

Your change will be merged while ignoring the following 5 checks: linux-binary-manywheel / manywheel-py3_12-cpu-cxx11-abi-build / build, linux-binary-manywheel / manywheel-py3_11-cpu-cxx11-abi-build / build, linux-binary-manywheel / manywheel-py3_8-cpu-cxx11-abi-build / build, linux-binary-manywheel / manywheel-py3_9-cpu-cxx11-abi-build / build, linux-binary-manywheel / manywheel-py3_10-cpu-cxx11-abi-build / build

Construct c10::Half from float16_t on ARMv8

e28c627

And let compiler do implicit conversions as needed

mikekgfb approved these changes Feb 22, 2024

malfet added the ciflow/binaries_wheel label Feb 22, 2024

Fix lint

734f0b3

malfet added the topic: not user facing label Feb 22, 2024

atalman approved these changes Feb 22, 2024

malfet added the ciflow/mps and module: arm labels Feb 23, 2024

malfet added the ciflow/trunk label Feb 23, 2024

snadampal self-requested a review February 23, 2024 03:34

pytorchmergebot added the merging label Feb 23, 2024

snadampal approved these changes Feb 23, 2024

pytorchmergebot added the Merged label Feb 23, 2024

pytorchmergebot closed this in 2240018 Feb 23, 2024

pytorchmergebot removed the merging label Feb 23, 2024

malfet deleted the malfet/construct-Half-from-float16-on-armv8 branch February 23, 2024 14:14
