[LLParser] Support identifiers like `nan` and `pinf` for special FP values #102790

mshockwave · 2024-08-11T05:22:20Z

Instead of hexadecimal values, users can now write:

nan for positive quiet NaN with zero payload
qnan(payload) for positive quiet NaN with custom payload
snan(payload) for positive signaling NaN with (non-zero) custom payload
pinf for positive infinity
ninf for negative infinity
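
For reference, here is a minimal standalone sketch in plain C++ (not LLVM code) of the IEEE-754 binary64 bit patterns these identifiers stand in for when the type is `double`. The names `bitsOf`, `fromBits`, `kPInf`, `kNInf`, `kDefaultQNaN`, and `checkPatterns` are hypothetical. The default quiet NaN pattern assumes the common convention of setting only the top fraction bit, and the check assumes an IEEE-754 host.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");

// Hypothetical helper: reinterpret a double as its raw 64-bit pattern.
inline std::uint64_t bitsOf(double D) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}

// Hypothetical helper: build a double from a raw 64-bit pattern.
inline double fromBits(std::uint64_t Bits) {
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}

// binary64 patterns that the identifiers replace for type double.
constexpr std::uint64_t kPInf = 0x7FF0000000000000ULL;        // pinf
constexpr std::uint64_t kNInf = 0xFFF0000000000000ULL;        // ninf
constexpr std::uint64_t kDefaultQNaN = 0x7FF8000000000000ULL; // nan

// Compare the constants against the host's own special values.
inline void checkPatterns() {
  const double Inf = std::numeric_limits<double>::infinity();
  assert(bitsOf(Inf) == kPInf);
  assert(bitsOf(-Inf) == kNInf);
  assert(std::isnan(fromBits(kDefaultQNaN)));
}
```

Calling `checkPatterns()` on an IEEE-754 host should pass silently; the hexadecimal constants are what a user would otherwise have to spell out in the IR.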

Right now only the parser supports these new identifiers; the printer still emits the hexadecimal values. AsmWriter support for printing some of them will come in a follow-up patch.

Users can now write `qnan`, `snan`, `pinf`, and `ninf` for certain special floating point constants, instead of the hexadecimal values.

llvmbot · 2024-08-11T05:22:54Z

@llvm/pr-subscribers-llvm-ir

Author: Min-Yih Hsu (mshockwave)

Changes

Users can now write qnan, snan, pinf, and ninf for certain special floating point constants, instead of their hexadecimal values.

Full diff: https://github.com/llvm/llvm-project/pull/102790.diff

3 Files Affected:

(modified) llvm/docs/LangRef.rst (+25-11)
(modified) llvm/lib/AsmParser/LLParser.cpp (+15)
(modified) llvm/unittests/AsmParser/AsmParserTest.cpp (+40)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 0ee4d7b444cfcf..899474d2cc413b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -4387,12 +4387,12 @@ Simple Constants
     zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1.
 **Floating-point constants**
     Floating-point constants use standard decimal notation (e.g.
-    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
-    hexadecimal notation (see below). The assembler requires the exact
-    decimal value of a floating-point constant. For example, the
-    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
-    decimal in binary. Floating-point constants must have a
-    :ref:`floating-point <t_floating>` type.
+    123.421), exponential notation (e.g. 1.23421e+2), identifiers for special
+    values like ``qnan``, or a more precise hexadecimal notation (see below).
+    The assembler requires the exact decimal value of a floating-point
+    constant. For example, the assembler accepts 1.25 but rejects 1.3
+    because 1.3 is a repeating decimal in binary. Floating-point constants
+    must have a :ref:`floating-point <t_floating>` type.
 **Null pointer constants**
     The identifier '``null``' is recognized as a null pointer constant
     and must be of :ref:`pointer type <t_pointer>`.
@@ -4403,13 +4403,12 @@ Simple Constants
 The one non-intuitive notation for constants is the hexadecimal form of
 floating-point constants. For example, the form
 '``double    0x432ff973cafa8000``' is equivalent to (but harder to read
-than) '``double 4.5e+15``'. The only time hexadecimal floating-point
-constants are required (and the only time that they are generated by the
-disassembler) is when a floating-point constant must be emitted but it
+than) '``double 4.5e+15``'. Hexadecimal floating-point
+constants are used when a floating-point constant must be emitted but it
 cannot be represented as a decimal floating-point number in a reasonable
 number of digits. For example, NaN's, infinities, and other special
-values are represented in their IEEE hexadecimal format so that assembly
-and disassembly do not cause any bits to change in the constants.
+values are represented in their IEEE hexadecimal format. This ensures that
+assembly and disassembly do not cause any bits to change in the constants.
 
 When using the hexadecimal form, constants of types bfloat, half, float, and
 double are represented using the 16-digit form shown above (which matches the
@@ -4426,6 +4425,21 @@ represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
 hexadecimal formats are big-endian (sign bit at the left).
 
+Some of the special floating point values can be represented by the following
+identifiers:
+
+    +-----------+---------------------------------------------------+
+    | Name      | Description                                       |
+    +===========+===================================================+
+    | ``qnan``  | Positive quiet NaN w/ payload equals to zero      |
+    +-----------+---------------------------------------------------+
+    | ``snan``  | Positive signaling NaN w/ payload equals to zero  |
+    +-----------+---------------------------------------------------+
+    | ``pinf``  | Positive infinity                                 |
+    +-----------+---------------------------------------------------+
+    | ``ninf``  | Negative infinity                                 |
+    +-----------+---------------------------------------------------+
+
 There are no constants of type x86_amx.
 
 .. _complexconstants:
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index f41907f0351257..fe909415eeab5b 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -3833,6 +3833,21 @@ bool LLParser::parseValID(ValID &ID, PerFunctionState *PFS, Type *ExpectedTy) {
   case lltok::kw_poison: ID.Kind = ValID::t_Poison; break;
   case lltok::kw_zeroinitializer: ID.Kind = ValID::t_Zero; break;
   case lltok::kw_none: ID.Kind = ValID::t_None; break;
+  case lltok::kw_qnan:
+    ID.APFloatVal = APFloat::getQNaN(APFloat::IEEEdouble());
+    ID.Kind = ValID::t_APFloat;
+    break;
+  case lltok::kw_snan:
+    ID.APFloatVal = APFloat::getSNaN(APFloat::IEEEdouble());
+    ID.Kind = ValID::t_APFloat;
+    break;
+  case lltok::kw_pinf:
+  case lltok::kw_ninf:
+    ID.APFloatVal =
+        APFloat::getInf(APFloat::IEEEdouble(),
+                        /*Negative=*/Lex.getKind() == lltok::kw_ninf);
+    ID.Kind = ValID::t_APFloat;
+    break;
 
   case lltok::lbrace: {
     // ValID ::= '{' ConstVector '}'
diff --git a/llvm/unittests/AsmParser/AsmParserTest.cpp b/llvm/unittests/AsmParser/AsmParserTest.cpp
index a70c061d3e3044..7e1b000f17f922 100644
--- a/llvm/unittests/AsmParser/AsmParserTest.cpp
+++ b/llvm/unittests/AsmParser/AsmParserTest.cpp
@@ -82,6 +82,46 @@ TEST(AsmParserTest, TypeAndConstantValueParsing) {
   ASSERT_TRUE(isa<ConstantFP>(V));
   EXPECT_TRUE(cast<ConstantFP>(V)->isExactlyValue(3.5));
 
+  // Special floating point constants.
+  const APFloat *APFloatVal;
+  V = parseConstantValue("double qnan", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isNaN() && !APFloatVal->isSignaling());
+
+  V = parseConstantValue("double snan", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isNaN() && APFloatVal->isSignaling());
+
+  V = parseConstantValue("double pinf", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isInfinity() && !APFloatVal->isNegative());
+
+  V = parseConstantValue("double ninf", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isInfinity() && APFloatVal->isNegative());
+
+  // We always parse special values into IEEEdouble first before converting
+  // them into the right semantics once the type info is available.
+  // The following tests whether this conversion works as expected.
+  V = parseConstantValue("bfloat pinf", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isBFloatTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isInfinity() && !APFloatVal->isNegative());
+
   V = parseConstantValue("i32 42", Error, M);
   ASSERT_TRUE(V);
   EXPECT_TRUE(V->getType()->isIntegerTy());
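
The comment in the unit test above says special values are first parsed as IEEEdouble and only later converted to the target semantics. As a rough analogy in plain C++ (not the LLParser or APFloat code path), narrowing a `double` to `float` keeps infinities and NaNs in their class. `parseThenNarrow` and `checkSpecialValuesSurvive` are hypothetical names, and the sketch assumes an IEEE-754 host.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for "parse as double, then convert to the real type".
inline float parseThenNarrow(double ParsedAsDouble) {
  return static_cast<float>(ParsedAsDouble);
}

// Infinities keep their sign and NaN stays NaN after narrowing.
inline void checkSpecialValuesSurvive() {
  const double Inf = std::numeric_limits<double>::infinity();
  assert(std::isinf(parseThenNarrow(Inf)) && !std::signbit(parseThenNarrow(Inf)));
  assert(std::isinf(parseThenNarrow(-Inf)) && std::signbit(parseThenNarrow(-Inf)));
  assert(std::isnan(parseThenNarrow(std::numeric_limits<double>::quiet_NaN())));
}
```

This mirrors what the `bfloat pinf` test checks: the value class and sign are preserved across the conversion.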

nikic

Nice idea!

What do you think about also printing them as such?

nikic · 2024-08-12T14:01:15Z

llvm/docs/LangRef.rst

+    +-----------+---------------------------------------------------+
+    | Name      | Description                                       |
+    +===========+===================================================+
+    | ``qnan``  | Positive quiet NaN w/ payload equals to zero      |


nikic · 2024-08-12T14:01:23Z

llvm/docs/LangRef.rst

+    +===========+===================================================+
+    | ``qnan``  | Positive quiet NaN w/ payload equals to zero      |
+    +-----------+---------------------------------------------------+
+    | ``snan``  | Positive signaling NaN w/ payload equals to zero  |


There is no sNaN with an all-zero payload; that bit pattern would be an infinity instead. The payload needs to have some bit set to not be an infinity. qNaNs always have one bit set (the quiet bit), but there is no requisite bit set for the sNaN.

Yeah, it's a bit tricky to describe what we want here. Maybe we could just say "value equal to APFloat::getSNaN()"? Though I'm not sure we should rely on an API, which is subject to change at any time, to describe a specification. What do you think?

Maybe it should just be explicit about the payload, e.g. nan(payload) instead of qnan/snan.

I haven't looked at how hard the logic would be to implement, but making nan by itself return the preferred qNaN and nan(payload) be used for other qNaN/sNaN payloads seems the best way forward to me.
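
The point about the all-zero payload can be checked with a small standalone sketch in plain C++ (not LLVM code). It assumes the IEEE-754 binary64 layout, where the top fraction bit is the quiet bit and the remaining 51 bits carry the payload. `makeNaNBits`, `fromRawBits`, `isNaNBits`, and the mask constants are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

constexpr std::uint64_t kExpMask = 0x7FF0000000000000ULL;     // all-ones exponent
constexpr std::uint64_t kFracMask = 0x000FFFFFFFFFFFFFULL;    // 52 fraction bits
constexpr std::uint64_t kQuietBit = 0x0008000000000000ULL;    // top fraction bit
constexpr std::uint64_t kPayloadMask = 0x0007FFFFFFFFFFFFULL; // low 51 bits

// Hypothetical helper: assemble a positive NaN-shaped bit pattern.
inline std::uint64_t makeNaNBits(bool Quiet, std::uint64_t Payload) {
  assert((Payload & ~kPayloadMask) == 0 && "payload must fit in 51 bits");
  return kExpMask | (Quiet ? kQuietBit : 0) | Payload;
}

// Hypothetical helper: build a double from a raw 64-bit pattern.
inline double fromRawBits(std::uint64_t Bits) {
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}

// A pattern is a NaN only if the exponent is all ones AND the fraction is
// non-zero; with a zero fraction the same exponent encodes an infinity.
inline bool isNaNBits(std::uint64_t Bits) {
  return (Bits & kExpMask) == kExpMask && (Bits & kFracMask) != 0;
}
```

With these helpers, `makeNaNBits(false, 0)` yields the pattern for positive infinity rather than a NaN, while `makeNaNBits(true, 0)` and `makeNaNBits(false, 1)` are both NaNs. That is why a bare `snan` with a zero payload has no meaningful encoding.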

nikic · 2024-08-12T14:03:32Z

llvm/unittests/AsmParser/AsmParserTest.cpp

Can you please also add a normal lit test in llvm/test/Assembler? Just an llvm-as | llvm-dis round-trip.

mshockwave · 2024-08-13T23:50:01Z

What do you think about also printing them as such?

I'm fine doing that as well.

And address review feedback.

mshockwave · 2024-08-22T05:25:13Z

I've updated the patch so that nan now represents the default QNaN with no payload; snan(payload) and qnan(payload) allow you to customize the payload. I decided to keep separate snan and qnan forms for the payload version because it's easier to create the value using the existing APFloat APIs, which always ask for the signaling bit up front and "sanitize" the payload accordingly.

There are some rough edges in specifying the payload value, because LLParser always uses an IEEEdouble to carry the value before converting it into the actual type, which might truncate the payload in an unexpected way. I found this pretty difficult to fully fix, so instead I documented the issue in the LangRef and asked users to be careful.
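
To illustrate the truncation hazard described above, here is a standalone sketch in plain C++. It does not reproduce LLParser or APFloat behavior. It assumes a narrowing that keeps only the high fraction bits, using binary64 to binary32 as the example pair, and `narrowNaNFractionToFloat` and `losesPayloadBits` are hypothetical helpers. Under that assumption, payload bits stored in the low 29 fraction bits of a binary64 NaN cannot survive in a binary32 NaN.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kDoubleFracBits = 52; // binary64 fraction width
constexpr int kFloatFracBits = 23;  // binary32 fraction width
constexpr int kDroppedBits = kDoubleFracBits - kFloatFracBits; // 29

// Hypothetical helper: map a binary64 fraction field to a binary32 fraction
// field by keeping only the high 23 bits (the low 29 bits are dropped).
inline std::uint32_t narrowNaNFractionToFloat(std::uint64_t DoubleFraction) {
  assert((DoubleFraction >> kDoubleFracBits) == 0 && "fraction is 52 bits");
  return static_cast<std::uint32_t>(DoubleFraction >> kDroppedBits);
}

// True if the narrowing above would discard non-zero payload bits.
inline bool losesPayloadBits(std::uint64_t DoubleFraction) {
  const std::uint64_t LowMask = (std::uint64_t{1} << kDroppedBits) - 1;
  return (DoubleFraction & LowMask) != 0;
}
```

For example, a fraction of `0x1` narrows to `0`, so a signaling NaN whose only payload bit is the lowest one would no longer look like a NaN under this scheme. The narrower the target type, the fewer payload bits can be carried over.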

A patch for the AsmWriter support, namely, printing nan, pinf, and ninf as well, is on the way. I put it in a separate patch because there are tons of changes in the tests.

Update: AsmWriter patch is #105618

[LLParser] Support identifiers like qnan and pinf for FP constants

56afe10

Users can now write `qnan`, `snan`, `pinf`, and `ninf` for certain special floating point constants, instead of the hexadecimal values.

mshockwave requested review from nikic, jyknight and andykaylor August 11, 2024 05:22

llvmbot added the llvm:ir label Aug 11, 2024

nikic reviewed Aug 12, 2024

Allow custom payload for SNaN and QNaN

a87596e

And address review feedback.

mshockwave changed the title ~~[LLParser] Support identifiers like qnan and pinf for special FP values~~ [LLParser] Support identifiers like nan and pinf for special FP values Aug 22, 2024

mshockwave requested review from nikic and jcranmer-intel August 22, 2024 05:25

mshockwave mentioned this pull request Aug 22, 2024

[AsmWriter] Print nan, pinf, and ninf when applicable #105618

Open

dtcxzyw requested a review from arsenm August 22, 2024 05:47


nikic left a comment

nikic Aug 12, 2024
mshockwave Aug 22, 2024
nikic Aug 12, 2024
jcranmer-intel Aug 13, 2024
mshockwave Aug 13, 2024
jyknight Aug 14, 2024
jcranmer-intel Aug 14, 2024
nikic Aug 12, 2024
mshockwave Aug 22, 2024


-    | ``qnan``  | Positive quiet NaN w/ payload equals to zero      |
+    | ``qnan``  | Positive quiet NaN w/ payload equal to zero       |

-    | ``snan``  | Positive signaling NaN w/ payload equals to zero  |
+    | ``snan``  | Positive signaling NaN w/ payload equal to zero   |
