Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] LWG2381: Inconsistency in parsing floating point numbers #77948

Merged
merged 9 commits into from
May 21, 2024

Conversation

frederick-vs-ja
Copy link
Contributor

@frederick-vs-ja frederick-vs-ja commented Jan 12, 2024

This PR implements LWG2381 by rejecting 'i', 'I', 'n', 'N' in FP parsing, as inf and NaN are intendedly rejected by that LWG issue.

The source character array used for parsing is "0123456789abcdefABCDEFxX+-pPiInN", whose first 26 or 28 characters
are used for parsing integers or floating-point values respectively. Previously, libc++ used 32 characters, including 'i', 'I', 'n', 'N', for FP parsing, which was inconsistent with LWG2381. This PR also replaces magic numbers 26 and 28 (formerly 32) with named constants.

Drive-by change: when the first character (possibly after the leading '+' or '-') is not a decimal digit but an acceptable character (e.g., 'p' or 'e'), the character is not accumulated now (per Stage 2 in [facet.num.get.virtuals]/3).

#65168 may be rendered invalid, see #65168 (comment).

Apple back-deployment targets remain broken, likely due to dylib. XFAIL is marked in related tests.

@frederick-vs-ja frederick-vs-ja requested a review from a team as a code owner January 12, 2024 17:13
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jan 12, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2024

@llvm/pr-subscribers-libcxx

Author: A. Jiang (frederick-vs-ja)

Changes

This PR implements LWG2381 by rejecting 'i', 'I', 'n', 'N' in FP parsing, as inf and NaN are intendedly rejected by that LWG issue. Also replaces magic numbers 26 and 28 (formerly 32) are with named constants.

As a driven-by change, some parts of get_long_double.pass.cpp are guarded with LDBL_MANT_DIG >= 64, because these parts fail on platforms (e.g. MSVC) where long double is of binary64 format.

#65168 may be rendered invalid, see #65168 (comment).


Full diff: https://github.com/llvm/llvm-project/pull/77948.diff

5 Files Affected:

  • (modified) libcxx/docs/Status/Cxx23Issues.csv (+1-1)
  • (modified) libcxx/include/locale (+15-12)
  • (modified) libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_double.pass.cpp (+18-18)
  • (modified) libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_float.pass.cpp (+18-18)
  • (modified) libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long_double.pass.cpp (+21-18)
diff --git a/libcxx/docs/Status/Cxx23Issues.csv b/libcxx/docs/Status/Cxx23Issues.csv
index b24ecc5525a149..a835a57ba90c54 100644
--- a/libcxx/docs/Status/Cxx23Issues.csv
+++ b/libcxx/docs/Status/Cxx23Issues.csv
@@ -98,7 +98,7 @@
 `3555 <https://wg21.link/LWG3555>`__,"``{transform,elements}_view::iterator::iterator_concept`` should consider const-qualification of the underlying range","June 2021","","","|ranges|"
 "","","","","",""
 `2191 <https://wg21.link/LWG2191>`__,"Incorrect specification of ``match_results(match_results&&)``","October 2021","|Nothing To Do|",""
-`2381 <https://wg21.link/LWG2381>`__,"Inconsistency in parsing floating point numbers","October 2021","",""
+`2381 <https://wg21.link/LWG2381>`__,"Inconsistency in parsing floating point numbers","October 2021","|Complete|","18.0"
 `2762 <https://wg21.link/LWG2762>`__,"``unique_ptr operator*()`` should be ``noexcept``","October 2021","",""
 `3121 <https://wg21.link/LWG3121>`__,"``tuple`` constructor constraints for ``UTypes&&...`` overloads","October 2021","",""
 `3123 <https://wg21.link/LWG3123>`__,"``duration`` constructor from representation shouldn't be effectively non-throwing","October 2021","","","|chrono|"
diff --git a/libcxx/include/locale b/libcxx/include/locale
index 70d22ff95e1ee1..839b7045fc3c30 100644
--- a/libcxx/include/locale
+++ b/libcxx/include/locale
@@ -366,9 +366,11 @@ _LIBCPP_HIDE_FROM_ABI _ForwardIterator __scan_keyword(
 
 struct _LIBCPP_EXPORTED_FROM_ABI __num_get_base {
   static const int __num_get_buf_sz = 40;
+  static const size_t __int_chr_cnt = 26;
+  static const size_t __fp_chr_cnt  = 28;
 
   static int __get_base(ios_base&);
-  static const char __src[33];
+  static const char __src[33]; // "0123456789abcdefABCDEFxX+-pPiInN"
 };
 
 _LIBCPP_EXPORTED_FROM_ABI void
@@ -431,7 +433,7 @@ private:
   template <typename _Tp>
   const _Tp* __do_widen_p(ios_base& __iob, _Tp* __atoms) const {
     locale __loc = __iob.getloc();
-    use_facet<ctype<_Tp> >(__loc).widen(__src, __src + 26, __atoms);
+    use_facet<ctype<_Tp> >(__loc).widen(__src, __src + __int_chr_cnt, __atoms);
     return __atoms;
   }
 
@@ -447,7 +449,7 @@ private:
 template <class _CharT>
 string __num_get<_CharT>::__stage2_int_prep(ios_base& __iob, _CharT* __atoms, _CharT& __thousands_sep) {
   locale __loc = __iob.getloc();
-  std::use_facet<ctype<_CharT> >(__loc).widen(__src, __src + 26, __atoms);
+  std::use_facet<ctype<_CharT> >(__loc).widen(__src, __src + __int_chr_cnt, __atoms);
   const numpunct<_CharT>& __np = std::use_facet<numpunct<_CharT> >(__loc);
   __thousands_sep              = __np.thousands_sep();
   return __np.grouping();
@@ -458,7 +460,7 @@ template <class _CharT>
 string __num_get<_CharT>::__stage2_float_prep(
     ios_base& __iob, _CharT* __atoms, _CharT& __decimal_point, _CharT& __thousands_sep) {
   locale __loc = __iob.getloc();
-  std::use_facet<ctype<_CharT> >(__loc).widen(__src, __src + 32, __atoms);
+  std::use_facet<ctype<_CharT> >(__loc).widen(__src, __src + __fp_chr_cnt, __atoms);
   const numpunct<_CharT>& __np = std::use_facet<numpunct<_CharT> >(__loc);
   __decimal_point              = __np.decimal_point();
   __thousands_sep              = __np.thousands_sep();
@@ -490,7 +492,7 @@ __num_get<_CharT>::__stage2_int_loop(_CharT __ct, int __base, char* __a, char*&
     }
     return 0;
   }
-  ptrdiff_t __f = std::find(__atoms, __atoms + 26, __ct) - __atoms;
+  ptrdiff_t __f = std::find(__atoms, __atoms + __int_chr_cnt, __ct) - __atoms;
   if (__f >= 24)
     return -1;
   switch (__base) {
@@ -546,8 +548,8 @@ int __num_get<_CharT>::__stage2_float_loop(
     }
     return 0;
   }
-  ptrdiff_t __f = std::find(__atoms, __atoms + 32, __ct) - __atoms;
-  if (__f >= 32)
+  ptrdiff_t __f = std::find(__atoms, __atoms + __num_get_base::__fp_chr_cnt, __ct) - __atoms;
+  if (__f >= __num_get_base::__fp_chr_cnt)
     return -1;
   char __x = __src[__f];
   if (__x == '-' || __x == '+') {
@@ -846,7 +848,7 @@ _InputIterator num_get<_CharT, _InputIterator>::__do_get_signed(
   int __base = this->__get_base(__iob);
   // Stage 2
   char_type __thousands_sep;
-  const int __atoms_size = 26;
+  const int __atoms_size = __num_get_base::__int_chr_cnt;
 #ifdef _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
   char_type __atoms1[__atoms_size];
   const char_type* __atoms = this->__do_widen(__iob, __atoms1);
@@ -895,7 +897,7 @@ _InputIterator num_get<_CharT, _InputIterator>::__do_get_unsigned(
   int __base = this->__get_base(__iob);
   // Stage 2
   char_type __thousands_sep;
-  const int __atoms_size = 26;
+  const int __atoms_size = __num_get_base::__int_chr_cnt;
 #ifdef _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
   char_type __atoms1[__atoms_size];
   const char_type* __atoms = this->__do_widen(__iob, __atoms1);
@@ -942,7 +944,7 @@ _InputIterator num_get<_CharT, _InputIterator>::__do_get_floating_point(
     iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _Fp& __v) const {
   // Stage 1, nothing to do
   // Stage 2
-  char_type __atoms[32];
+  char_type __atoms[__num_get_base::__fp_chr_cnt];
   char_type __decimal_point;
   char_type __thousands_sep;
   string __grouping = this->__stage2_float_prep(__iob, __atoms, __decimal_point, __thousands_sep);
@@ -996,10 +998,11 @@ _InputIterator num_get<_CharT, _InputIterator>::do_get(
   // Stage 1
   int __base = 16;
   // Stage 2
-  char_type __atoms[26];
+  char_type __atoms[__num_get_base::__int_chr_cnt];
   char_type __thousands_sep = char_type();
   string __grouping;
-  std::use_facet<ctype<_CharT> >(__iob.getloc()).widen(__num_get_base::__src, __num_get_base::__src + 26, __atoms);
+  std::use_facet<ctype<_CharT> >(__iob.getloc())
+      .widen(__num_get_base::__src, __num_get_base::__src + __num_get_base::__int_chr_cnt, __atoms);
   string __buf;
   __buf.resize(__buf.capacity());
   char* __a     = &__buf[0];
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_double.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_double.pass.cpp
index c802ab787682aa..ce56bf1c1877b5 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_double.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_double.pass.cpp
@@ -116,9 +116,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == INFINITY);
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0);
     }
     {
         const char str[] = "INF";
@@ -128,9 +128,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == INFINITY);
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0);
     }
     {
         const char str[] = "-inf";
@@ -140,9 +140,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == -INFINITY);
+        assert(base(iter) == str+1);
+        assert(err == ios.failbit);
+        assert(v == 0.0);
     }
     {
         const char str[] = "-INF";
@@ -152,9 +152,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == -INFINITY);
+        assert(base(iter) == str+1);
+        assert(err == ios.failbit);
+        assert(v == 0.0);
     }
     {
         const char str[] = "nan";
@@ -164,9 +164,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(std::isnan(v));
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v = 0.0);
     }
     {
         const char str[] = "NAN";
@@ -176,9 +176,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(std::isnan(v));
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v = 0.0);
     }
     {
         v = -1;
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_float.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_float.pass.cpp
index 79c8480d0699b0..38d713a46076e2 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_float.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_float.pass.cpp
@@ -105,9 +105,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == INFINITY);
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0f);
     }
     {
         const char str[] = "INF";
@@ -117,9 +117,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == INFINITY);
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0f);
     }
     {
         const char str[] = "-inf";
@@ -129,9 +129,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == -INFINITY);
+        assert(base(iter) == str+1);
+        assert(err == ios.failbit);
+        assert(v == 0.0f);
     }
     {
         const char str[] = "-INF";
@@ -141,9 +141,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == -INFINITY);
+        assert(base(iter) == str+1);
+        assert(err == ios.failbit);
+        assert(v == 0.0f);
     }
     {
         const char str[] = "nan";
@@ -153,9 +153,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(std::isnan(v));
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0f);
     }
     {
         const char str[] = "NAN";
@@ -165,9 +165,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(std::isnan(v));
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0f);
     }
     {
         v = -1;
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long_double.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long_double.pass.cpp
index e2b2aeafd1ef92..406187f46121e6 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long_double.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long_double.pass.cpp
@@ -16,6 +16,7 @@
 #include <locale>
 #include <ios>
 #include <cassert>
+#include <cfloat>
 #include <streambuf>
 #include <cmath>
 #include "test_macros.h"
@@ -105,9 +106,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == INFINITY);
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0l);
     }
     {
         const char str[] = "INF";
@@ -117,9 +118,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == INFINITY);
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0l);
     }
     {
         const char str[] = "-inf";
@@ -129,9 +130,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == -INFINITY);
+        assert(base(iter) == str+1);
+        assert(err == ios.failbit);
+        assert(v == 0.0l);
     }
     {
         const char str[] = "-INF";
@@ -141,9 +142,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == -INFINITY);
+        assert(base(iter) == str+1);
+        assert(err == ios.failbit);
+        assert(v == 0.0l);
     }
     {
         const char str[] = "nan";
@@ -153,9 +154,9 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(std::isnan(v));
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0l);
     }
     {
         const char str[] = "NAN";
@@ -165,10 +166,11 @@ int main(int, char**)
             f.get(cpp17_input_iterator<const char*>(str),
                   cpp17_input_iterator<const char*>(str+sizeof(str)),
                   ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(std::isnan(v));
+        assert(base(iter) == str);
+        assert(err == ios.failbit);
+        assert(v == 0.0l);
     }
+#if LDBL_MANT_DIG >= 64
     {
         const char str[] = "1.189731495357231765021264e+49321";
         std::ios_base::iostate err = ios.goodbit;
@@ -229,6 +231,7 @@ int main(int, char**)
         assert(err != ios.failbit);
         assert(v == 304888344611713860501504000000.0L);
     }
+#endif // LDBL_MANT_DIG >= 64
     {
         v = -1;
         const char str[] = "1.19973e+4933"; // unrepresentable

Copy link

github-actions bot commented Jan 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@mordante mordante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. I've a few questions before I want to look deeper in the changes.

libcxx/include/locale Show resolved Hide resolved
libcxx/include/locale Outdated Show resolved Hide resolved
This PR implements LWG2381 by rejecting 'i', 'I', 'n', 'N' in FP
parsing, as inf and NaN are intendedly rejected by that LWG issue.

The source character array used for parsing is
"0123456789abcdefABCDEFxX+-pPiInN", whose first 26 or 28 characters
are used for parsing integers or floating-point values respectively.
Previously, libc++ used 32 characters, including 'i', 'I', 'n', 'N',
for FP parsing, which was inconsistent with LWG2381. This PR also
replaces magic numbers 26 and 28 (formerly 32) with named constants.

As a driven-by change, some parts of get_long_double.pass.cpp are
guarded with `LDBL_MANT_DIG >= 64`, because these parts fail on
platforms where long double is of binary64 format (e.g. MSVC).

Apple back-deployment targets remain broken due to dylib. XFAIL is
marked in related tests.
@frederick-vs-ja
Copy link
Contributor Author

frederick-vs-ja commented Mar 23, 2024

@mordante ping. It seems that this missed for 18.0.

Copy link

✅ With the latest revision this PR passed the Python code formatter.

@mordante
Copy link
Member

@mordante ping. It seems that this missed for 18.0.

Thanks for the reminder, it slipped under radar. I'll try to have a look soon.

Copy link
Member

@mordante mordante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch! In general this looks good. Could you apply the changes to the double test to float and long double too? After these changes are applied things look good to me.

- reorder declarations and expand comments
- `UNSUPPORTED: using-built-library-before-llvm-19`
- test `p00` and its friends
Copy link
Member

@mordante mordante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM with some minor nits. I've posted one question.

Copy link
Member

@mordante mordante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@mordante mordante merged commit 3e15c97 into llvm:main May 21, 2024
9 of 10 checks passed
@frederick-vs-ja frederick-vs-ja deleted the lwg-2381 branch May 21, 2024 17:07
mub-at-arm pushed a commit to mub-at-arm/llvm-project that referenced this pull request May 22, 2024
…vm#77948)

This PR implements [LWG2381](https://cplusplus.github.io/LWG/issue2381)
by rejecting `'i'`, `'I'`, `'n'`, `'N'` in FP parsing, as inf and NaN
are intendedly rejected by that LWG issue.

The source character array used for parsing is
`"0123456789abcdefABCDEFxX+-pPiInN"`, whose first 26 or 28 characters
are used for parsing integers or floating-point values respectively.
Previously, libc++ used 32 characters, including `'i'`, `'I'`, `'n'`,
`'N'`, for FP parsing, which was inconsistent with LWG2381. This PR also
replaces magic numbers 26 and 28 (formerly 32) with named constants.

Drive-by change: when the first character (possibly after the leading
`'+'` or `'-'`) is not a decimal digit but an acceptable character
(e.g., `'p'` or `'e'`), the character is not accumulated now (per Stage
2 in [facet.num.get.virtuals]/3).

llvm#65168 may be rendered invalid, see
llvm#65168 (comment).

Apple back-deployment targets remain broken, likely due to dylib. XFAIL
is marked in related tests.

---------

Co-authored-by: Mark de Wever <koraq@xs4all.nl>
VyacheslavLevytskyy pushed a commit to VyacheslavLevytskyy/llvm-project that referenced this pull request May 22, 2024
…vm#77948)

This PR implements [LWG2381](https://cplusplus.github.io/LWG/issue2381)
by rejecting `'i'`, `'I'`, `'n'`, `'N'` in FP parsing, as inf and NaN
are intendedly rejected by that LWG issue.

The source character array used for parsing is
`"0123456789abcdefABCDEFxX+-pPiInN"`, whose first 26 or 28 characters
are used for parsing integers or floating-point values respectively.
Previously, libc++ used 32 characters, including `'i'`, `'I'`, `'n'`,
`'N'`, for FP parsing, which was inconsistent with LWG2381. This PR also
replaces magic numbers 26 and 28 (formerly 32) with named constants.

Drive-by change: when the first character (possibly after the leading
`'+'` or `'-'`) is not a decimal digit but an acceptable character
(e.g., `'p'` or `'e'`), the character is not accumulated now (per Stage
2 in [facet.num.get.virtuals]/3).

llvm#65168 may be rendered invalid, see
llvm#65168 (comment).

Apple back-deployment targets remain broken, likely due to dylib. XFAIL
is marked in related tests.

---------

Co-authored-by: Mark de Wever <koraq@xs4all.nl>
jameshu15869 pushed a commit to jameshu15869/llvm-project that referenced this pull request May 31, 2024
…vm#77948)

This PR implements [LWG2381](https://cplusplus.github.io/LWG/issue2381)
by rejecting `'i'`, `'I'`, `'n'`, `'N'` in FP parsing, as inf and NaN
are intendedly rejected by that LWG issue.

The source character array used for parsing is
`"0123456789abcdefABCDEFxX+-pPiInN"`, whose first 26 or 28 characters
are used for parsing integers or floating-point values respectively.
Previously, libc++ used 32 characters, including `'i'`, `'I'`, `'n'`,
`'N'`, for FP parsing, which was inconsistent with LWG2381. This PR also
replaces magic numbers 26 and 28 (formerly 32) with named constants.

Drive-by change: when the first character (possibly after the leading
`'+'` or `'-'`) is not a decimal digit but an acceptable character
(e.g., `'p'` or `'e'`), the character is not accumulated now (per Stage
2 in [facet.num.get.virtuals]/3).

llvm#65168 may be rendered invalid, see
llvm#65168 (comment).

Apple back-deployment targets remain broken, likely due to dylib. XFAIL
is marked in related tests.

---------

Co-authored-by: Mark de Wever <koraq@xs4all.nl>
@bgra8
Copy link
Contributor

bgra8 commented May 31, 2024

@mordante this patch causes strings like ".5" to no longer parse and this is most likely a bug.

Here's a small reproducer:

#include <sstream>
#include <cassert>

int main() {
  std::istringstream strm(".5");
  float val = 0;
  strm >> val;

  assert(!strm.fail());
  assert(strm.eof());
  assert(val == .5);
  return 0;
}

Can you please take a look?

@frederick-vs-ja
Copy link
Contributor Author

It seems that I made a wrong change. Decimal point '.' should also be allowed to be the leading character.

// the leading character excluding the sign must be a decimal digit
if (!__is_leading_parsed) {
if (__a_end - __a >= 1 && __a[0] != '-' && __a[0] != '+') {
if ('0' <= __a[0] && __a[0] <= '9')
__is_leading_parsed = true;
else
break;
} else if (__a_end - __a >= 2 && (__a[0] == '-' || __a[0] == '+')) {
if ('0' <= __a[1] && __a[1] <= '9')
__is_leading_parsed = true;
else
break;
}

I'll fix the regression later today.

@bgra8
Copy link
Contributor

bgra8 commented May 31, 2024

@frederick-vs-ja can you please revert first to unblock us? You can then work without pressure to fix the issue and avoid introducing other new issues.

@mordante sorry for misattributing this to you!

frederick-vs-ja added a commit to frederick-vs-ja/llvm-project that referenced this pull request May 31, 2024
PR llvm#77948 mistakenly rejected floating-point representation with a
leading decimal point, e.g. ".5".

This PR fixes the regression.
@frederick-vs-ja
Copy link
Contributor Author

I'm not a big fan of revert-and-reland. Given both reverting and a direct fix require review, I'd like to open a PR for direct fix.

If maintainers prefers reverting, I'll change the PR to revert.

frederick-vs-ja added a commit to frederick-vs-ja/llvm-project that referenced this pull request May 31, 2024
PR llvm#77948 mistakenly rejected floating-point representation with a
leading decimal point, e.g. ".5".

This PR fixes the regression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants