Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul realfloat #637

Open
wants to merge 35 commits into
base: master
Choose a base branch
from

Conversation

BebeSparkelSparkel
Copy link
Contributor

@BebeSparkelSparkel BebeSparkelSparkel commented Jan 7, 2024

@BebeSparkelSparkel
Copy link
Contributor Author

I tested performance with ghc-9.2.7 comparing this pull with branch realfloat-bench t

Benchmark bytestring-bench: RUNNING...
All
  Data.ByteString.Builder
    Non-bounded encodings
      RealFloat
        FGeneric
          Positive
            Float (100000):                     OK
              83.8 ms ± 3.4 ms, 33% less than baseline
            Double (100000):                    OK
              59.2 ms ± 2.8 ms, 46% less than baseline
            DoubleSmall (100000):               OK
              213  ms ±  18 ms, 20% less than baseline
          Negative
            Float (100000):                     OK
              82.8 ms ± 6.7 ms, 32% less than baseline
            Double (100000):                    OK
              62.0 ms ± 4.5 ms, 42% less than baseline
            DoubleSmall (100000):               OK
              62.2 ms ± 5.6 ms, 43% less than baseline
          Special
            Float  Average (100000):            OK
              28.4 ms ± 2.2 ms,       same as baseline
            Double Average (100000):            OK
              29.4 ms ± 2.8 ms,       same as baseline
        FScientific
          Positive
            Float (100000):                     OK
              65.5 ms ± 4.5 ms, 34% less than baseline
            Double (100000):                    OK
              60.0 ms ± 3.4 ms, 41% less than baseline
            DoubleSmall (100000):               OK
              58.6 ms ± 2.8 ms, 47% less than baseline
          Negative
            Float (100000):                     OK
              65.8 ms ± 4.5 ms, 34% less than baseline
            Double (100000):                    OK
              57.0 ms ± 4.5 ms, 45% less than baseline
            DoubleSmall (100000):               OK
              61.9 ms ± 2.2 ms, 45% less than baseline
          Special
            Float  Average (100000):            OK
              29.1 ms ± 2.2 ms, 37% less than baseline
            Double Average (100000):            OK
              29.1 ms ± 2.2 ms, 29% less than baseline
        FStandard
          Positive
            without
              Float (100000):                   OK
                341  ms ±  22 ms,       same as baseline
              Double (100000):                  OK
                890  ms ±  72 ms,       same as baseline
              DoubleSmall (100000):             OK
                436  ms ±  16 ms, 11% less than baseline
            precision
              Float-Preciaion-1 (100000):       OK
                419  ms ±  22 ms,       same as baseline
              Double-Preciaion-1 (100000):      OK
                129  ms ± 7.8 ms, 28% less than baseline
              DoubleSmall-Preciaion-1 (100000): OK
                390  ms ±  13 ms, 15% less than baseline
              Float-Preciaion-6 (100000):       OK
                520  ms ±  27 ms, 10% less than baseline
              Double-Preciaion-6 (100000):      OK
                252  ms ±  13 ms, 17% less than baseline
              DoubleSmall-Preciaion-6 (100000): OK
                512  ms ±  18 ms, 12% less than baseline
          Negative
            without
              Float (100000):                   OK
                335  ms ±  13 ms, 10% less than baseline
              Double (100000):                  OK
                842  ms ±  45 ms,  9% less than baseline
              DoubleSmall (100000):             OK
                424  ms ±  22 ms, 13% less than baseline
            precision
              Float-Preciaion-1 (100000):       OK
                408  ms ±  18 ms, 10% less than baseline
              Double-Preciaion-1 (100000):      OK
                128  ms ±  11 ms, 29% less than baseline
              DoubleSmall-Preciaion-1 (100000): OK
                399  ms ±  22 ms, 12% less than baseline
              Float-Preciaion-6 (100000):       OK
                542  ms ±  54 ms,       same as baseline
              Double-Preciaion-6 (100000):      OK
                252  ms ±  18 ms, 18% less than baseline
              DoubleSmall-Preciaion-6 (100000): OK
                517  ms ±  18 ms, 10% less than baseline
          Special
            Float  Average (100000):            OK
              27.1 ms ± 2.2 ms,       same as baseline
            Double Average (100000):            OK
              29.2 ms ± 2.8 ms,       same as baseline

Comment on lines +85 to +86
import Data.ByteString.Builder.RealFloat.Internal (FloatFormat(..), fScientific, fGeneric)
import Data.ByteString.Builder.RealFloat.Internal (positiveZero, negativeZero)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import Data.ByteString.Builder.RealFloat.Internal (FloatFormat(..), fScientific, fGeneric)
import Data.ByteString.Builder.RealFloat.Internal (positiveZero, negativeZero)
import Data.ByteString.Builder.RealFloat.Internal (FloatFormat(..), fScientific, fGeneric, SpecialStrings(..))

Importing positiveZero and negativeZero directly by name doesn't work so well due to NoFieldSelectors/DuplicateRecordFields

@@ -735,12 +787,12 @@ packWord16 l h =
#endif

-- | Unpacks a 16-bit word into 2 bytes [lsb, msb]
unpackWord16 :: Word# -> (# Word#, Word# #)
unpackWord16 :: Word# -> (# Word8#, Word8# #)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Word8# doesn't exist until ghc-8.8, and the conversion operators were renamed in ghc-9.2, with no compatibility wrappers left behind that I could quickly find.

If you really want to track Word8#-versus-Word# in the types, I guess the necessary CPP looks like this:

#if !MIN_VERSION_base(4,13,0)
type Word8# = Word#
wordToWord8# :: Word# -> Word8#
wordToWord8# = narrow8Word#

word8ToWord# :: Word8# -> Word#
word8ToWord# x = x
#elif !MIN_VERSION_base(4,16,0)
wordToWord8# :: Word# -> Word8#
wordToWord8# = narrowWord8#

word8ToWord# :: Word8# -> Word#
word8ToWord# = extendWord8#
#endif

Or the compatibility story gets easier with boxed types, with the trade-off being that we need to make sure GHC can unbox everything for us in worker/wrapper.

{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE NoFieldSelectors #-}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoFieldSelectors only exists since ghc-9.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants