Skip to content

Conversation

mshockwave
Copy link
Member

@mshockwave mshockwave commented Aug 11, 2024

Instead of hexadecimal values, users can now write:

  • nan for positive quiet NaN with zero payload
  • qnan(payload) for positive quiet NaN with custom payload
  • snan(payload) for positive signaling NaN with (non-zero) custom payload
  • pinf for positive infinity
  • ninf for negative infinity

Right now only the parser support these new directives, while still printing the hexadecimal values. The AsmWriter support for printing some of them will be in a follow-up patch.

Users can now write `qnan`, `snan`, `pinf`, and `ninf` for certain
special floating point constants, instead of the hexidecimal values.
@llvmbot
Copy link
Member

llvmbot commented Aug 11, 2024

@llvm/pr-subscribers-llvm-ir

Author: Min-Yih Hsu (mshockwave)

Changes

Users can now write qnan, snan, pinf, and ninf for certain special floating point constants, instead of their hexadecimal values.


Full diff: https://github.com/llvm/llvm-project/pull/102790.diff

3 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+25-11)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+15)
  • (modified) llvm/unittests/AsmParser/AsmParserTest.cpp (+40)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 0ee4d7b444cfcf..899474d2cc413b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -4387,12 +4387,12 @@ Simple Constants
     zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1.
 **Floating-point constants**
     Floating-point constants use standard decimal notation (e.g.
-    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
-    hexadecimal notation (see below). The assembler requires the exact
-    decimal value of a floating-point constant. For example, the
-    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
-    decimal in binary. Floating-point constants must have a
-    :ref:`floating-point <t_floating>` type.
+    123.421), exponential notation (e.g. 1.23421e+2), identifiers for special
+    values like ``qnan``, or a more precise hexadecimal notation (see below).
+    The assembler requires the exact decimal value of a floating-point
+    constant. For example, the assembler accepts 1.25 but rejects 1.3
+    because 1.3 is a repeating decimal in binary. Floating-point constants
+    must have a :ref:`floating-point <t_floating>` type.
 **Null pointer constants**
     The identifier '``null``' is recognized as a null pointer constant
     and must be of :ref:`pointer type <t_pointer>`.
@@ -4403,13 +4403,12 @@ Simple Constants
 The one non-intuitive notation for constants is the hexadecimal form of
 floating-point constants. For example, the form
 '``double    0x432ff973cafa8000``' is equivalent to (but harder to read
-than) '``double 4.5e+15``'. The only time hexadecimal floating-point
-constants are required (and the only time that they are generated by the
-disassembler) is when a floating-point constant must be emitted but it
+than) '``double 4.5e+15``'. Hexadecimal floating-point
+constants are used when a floating-point constant must be emitted but it
 cannot be represented as a decimal floating-point number in a reasonable
 number of digits. For example, NaN's, infinities, and other special
-values are represented in their IEEE hexadecimal format so that assembly
-and disassembly do not cause any bits to change in the constants.
+values are represented in their IEEE hexadecimal format. This ensures that
+assembly and disassembly do not cause any bits to change in the constants.
 
 When using the hexadecimal form, constants of types bfloat, half, float, and
 double are represented using the 16-digit form shown above (which matches the
@@ -4426,6 +4425,21 @@ represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
 hexadecimal formats are big-endian (sign bit at the left).
 
+Some of the special floating point values can be represented by the following
+identifiers:
+
+    +-----------+---------------------------------------------------+
+    | Name      | Description                                       |
+    +===========+===================================================+
+    | ``qnan``  | Positive quiet NaN w/ payload equals to zero      |
+    +-----------+---------------------------------------------------+
+    | ``snan``  | Positive signaling NaN w/ payload equals to zero  |
+    +-----------+---------------------------------------------------+
+    | ``pinf``  | Positive infinity                                 |
+    +-----------+---------------------------------------------------+
+    | ``ninf``  | Negative infinity                                 |
+    +-----------+---------------------------------------------------+
+
 There are no constants of type x86_amx.
 
 .. _complexconstants:
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index f41907f0351257..fe909415eeab5b 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -3833,6 +3833,21 @@ bool LLParser::parseValID(ValID &ID, PerFunctionState *PFS, Type *ExpectedTy) {
   case lltok::kw_poison: ID.Kind = ValID::t_Poison; break;
   case lltok::kw_zeroinitializer: ID.Kind = ValID::t_Zero; break;
   case lltok::kw_none: ID.Kind = ValID::t_None; break;
+  case lltok::kw_qnan:
+    ID.APFloatVal = APFloat::getQNaN(APFloat::IEEEdouble());
+    ID.Kind = ValID::t_APFloat;
+    break;
+  case lltok::kw_snan:
+    ID.APFloatVal = APFloat::getSNaN(APFloat::IEEEdouble());
+    ID.Kind = ValID::t_APFloat;
+    break;
+  case lltok::kw_pinf:
+  case lltok::kw_ninf:
+    ID.APFloatVal =
+        APFloat::getInf(APFloat::IEEEdouble(),
+                        /*Negative=*/Lex.getKind() == lltok::kw_ninf);
+    ID.Kind = ValID::t_APFloat;
+    break;
 
   case lltok::lbrace: {
     // ValID ::= '{' ConstVector '}'
diff --git a/llvm/unittests/AsmParser/AsmParserTest.cpp b/llvm/unittests/AsmParser/AsmParserTest.cpp
index a70c061d3e3044..7e1b000f17f922 100644
--- a/llvm/unittests/AsmParser/AsmParserTest.cpp
+++ b/llvm/unittests/AsmParser/AsmParserTest.cpp
@@ -82,6 +82,46 @@ TEST(AsmParserTest, TypeAndConstantValueParsing) {
   ASSERT_TRUE(isa<ConstantFP>(V));
   EXPECT_TRUE(cast<ConstantFP>(V)->isExactlyValue(3.5));
 
+  // Special floating point constants.
+  const APFloat *APFloatVal;
+  V = parseConstantValue("double qnan", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isNaN() && !APFloatVal->isSignaling());
+
+  V = parseConstantValue("double snan", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isNaN() && APFloatVal->isSignaling());
+
+  V = parseConstantValue("double pinf", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isInfinity() && !APFloatVal->isNegative());
+
+  V = parseConstantValue("double ninf", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isDoubleTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isInfinity() && APFloatVal->isNegative());
+
+  // We always parse special values into IEEEdouble first before converting
+  // them into the right semantics once the type info is available.
+  // The following tests whether this conversion works as expected.
+  V = parseConstantValue("bfloat pinf", Error, M);
+  ASSERT_TRUE(V);
+  EXPECT_TRUE(V->getType()->isBFloatTy());
+  ASSERT_TRUE(isa<ConstantFP>(V));
+  APFloatVal = &cast<ConstantFP>(V)->getValueAPF();
+  EXPECT_TRUE(APFloatVal->isInfinity() && !APFloatVal->isNegative());
+
   V = parseConstantValue("i32 42", Error, M);
   ASSERT_TRUE(V);
   EXPECT_TRUE(V->getType()->isIntegerTy());

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea!

What do you think about also printing them as such?

+-----------+---------------------------------------------------+
| Name | Description |
+===========+===================================================+
| ``qnan`` | Positive quiet NaN w/ payload equals to zero |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| ``qnan`` | Positive quiet NaN w/ payload equals to zero |
| ``qnan`` | Positive quiet NaN w/ payload equal to zero |

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

+===========+===================================================+
| ``qnan`` | Positive quiet NaN w/ payload equals to zero |
+-----------+---------------------------------------------------+
| ``snan`` | Positive signaling NaN w/ payload equals to zero |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| ``snan`` | Positive signaling NaN w/ payload equals to zero |
| ``snan`` | Positive signaling NaN w/ payload equal to zero |

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no sNaN with an all-0 payload, that would be an infinity instead--the payload needs to have some bit set to not be an infinity. qNaNs always have one bit set, but there is no requisite bit set for the sNaN.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's a bit tricky to describe what we want here...maybe we could just say "value equal to APFloat::getSNaN()"? Though I'm not sure if we should rely on an API, which is subject to change anytime, to describe a specification. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should just be explicit about the payload, e.g. nan(payload) instead of qnan/snan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at how hard the logic would be to implement, but making nan by itself return the preferred qNaN and nan(payload) be used for other qNaN/sNaN payloads seems the best way forward to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also add a normal lit test in llvm/test/Assembler? Just an llvm-as | llvm-dis round-trip.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's done.

@mshockwave
Copy link
Member Author

What do you think about also printing them as such?

I'm fine doing that as well.

And address review feedbacks.
@mshockwave mshockwave changed the title [LLParser] Support identifiers like qnan and pinf for special FP values [LLParser] Support identifiers like nan and pinf for special FP values Aug 22, 2024
@mshockwave
Copy link
Member Author

mshockwave commented Aug 22, 2024

I've updated the patch so now nan represent the default QNaN with no payload; snan(payload) and qnan(payload) allow you to customize the payload. I decided to have a separate snan and qnan for the payload version because it's easier to create the value using the existing APFloat APIs, which always ask the signaling bit up ahead and "sanitize" the payload accordingly.

There are some rough edges on specifying the payload value, because LLParser always use a IEEEdouble to carry the value before converting them into the actual types, which might truncate the payload in an unexpected way. I found it pretty difficult to fully fix so instead, I stated this issue in the LangRef and asked users to be careful.

A patch for the AsmWriter support, namely, printing nan, pinf, and ninf as well, is on the way. I put it in a separate patch because there are tons of changes in the tests.

Update: AsmWriter patch is #105618

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants