Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: sconcat and stimes. #580

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions benchmarks/haskell/Benchmarks/Pure.hs
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,8 @@ import qualified Data.Text.Encoding as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as TB
import qualified Data.Text.Lazy.Encoding as TL
import Data.Semigroup
import Data.List.NonEmpty (NonEmpty((:|)))

data Env = Env
{ bsa :: !BS.ByteString
Expand Down Expand Up @@ -83,6 +85,14 @@ benchmark kind ~Env{..} =
[ benchT $ nf T.concat tl
, benchTL $ nf TL.concat tll
]
, bgroup "sconcat"
[ benchT $ nf sconcat (T.empty :| tl)
, benchTL $ nf sconcat (TL.empty :| tll)
]
, bgroup "stimes"
[ benchT $ nf (stimes (10 :: Int)) ta
, benchTL $ nf (stimes (10 :: Int)) tla
]
, bgroup "cons"
[ benchT $ nf (T.cons c) ta
, benchTL $ nf (TL.cons c) tla
Expand Down
2 changes: 2 additions & 0 deletions src/Data/Text.hs
Original file line number Diff line number Diff line change
Expand Up @@ -361,6 +361,8 @@ instance Read Text where
-- | @since 1.2.2.0
instance Semigroup Text where
(<>) = append
stimes = replicate . P.fromIntegral
Copy link
Contributor

@Bodigrim Bodigrim Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat cautious about potential overflow in fromIntegral. Let's throw an error if Text is non-empty and n does not fit into Int, same as base does for ByteArray: https://hackage.haskell.org/package/base-4.19.1.0/docs/src/Data.Array.Byte.html#stimesPolymorphic

(We do not need to check that len * n fits into Int, because this is validated by Data.Text.replicate itself)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not follow why we should evaluate to the empty text when given a negative number but evaluate to an error when given a number that is bigger than maxBound ∷ Int, but your wish is my command.

Copy link
Contributor Author

@kindaro kindaro Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, we are not going to check the input in the same way as base because we should stay abstract. The way integers are represented is an implementation detail that had changed in the past and may change in the future. Supporting all versions of GHC we test with will need a lot of CPP without significant improvement in performance. So, I shall do an equality check instead of matching on the constructors of Integer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not follow why we should evaluate to the empty text when given a negative number but evaluate to an error when given a number that is bigger than maxBound ∷ Int, but your wish is my command.

Ah, I did not remember this peculiarity of replicate.

I'd be in favor of stimes throwing an error on negative arguments, even if replicate does not mind to swallow it silently. @Lysxia how do you feel about it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the behaviour on lists:

ghci> replicate (-1) [1..10]
[]
ghci> import Data.Semigroup
ghci> stimes (-1) [1.. 10]
*** Exception: stimes: [], negative multiplier

So, stimes is not defined even while replicate is defined on lists and negative numbers. We can do the same. Let me patch it up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bodigrim   So why would you be in favour of stimes throwing an error on negative arguments?

@Lysxia   So why does this sound good to you?

I should like to have the underlying reasoning recorded for posterity.

Copy link
Contributor Author

@kindaro kindaro Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is a shocking revelation — the property checks of text overall compare the behaviour of functions on Text to their correspondents on String, but the check for stimes is missing both for strict and lazy Text. It would be consistent to add such a check for stimes for both strict and lazy Text and adjust the definitions as needed for it to pass. @Bodigrim Should I add a check for lazy Text and make it pass, or should I only add a check for strict Text?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Asking to "replicate a string n times" with negative n is nonsense. There must be an error in the definition of n, so throwing an exception lets users be aware of that error and fix it.
  2. The default definition of stimes already throws an exception for n <= 0. People haven't complained about it. Extending the definition for n = 0 is reasonable for a monoid.
  3. There could still be a case made in favor of making stimes less partial and more similar to replicate (I think "replicate a string n times" is nonsense as a sentence in natural language, but I don't have a strong argument that code must follow natural language). Until someone makes a good case for extending stimes, throwing an exception for negative arguments is forward-compatible: we can extend the function later (it would only break code that catches the exception, a fishy thing to do). If we made it total now and changed our minds later, that would be a more breaking change.

You can add the stimes test for strict Text now and add lazy stimes in another PR (small PR = good!)

sconcat = concat . NonEmptyList.toList

instance Monoid Text where
mempty = empty
Expand Down
Loading