Stop using filter to define unicode. (#303)
Rather than filter unicodeAll, we can construct the set of characters we want.
This has the added benefit of removing the `GenBase m ~ Identity` constraint.
ajmcmiddlin authored and jacobstanley committed Jun 14, 2019
1 parent 079c0f7 commit de401e9
Showing 1 changed file with 11 additions and 3 deletions.
14 changes: 11 additions & 3 deletions hedgehog/src/Hedgehog/Internal/Gen.hs
@@ -1004,11 +1004,19 @@ latin1 =
   enum '\0' '\255'

 -- | Generates a Unicode character, excluding noncharacters and invalid standalone surrogates:
--- @'\0'..'\1114111' (excluding '\55296'..'\57343')@
+-- @'\0'..'\1114111' (excluding '\55296'..'\57343', '\65534', '\65535')@
 --
-unicode :: (MonadGen m, GenBase m ~ Identity) => m Char
+unicode :: (MonadGen m) => m Char
 unicode =
-  filter (not . isNoncharacter) $ filter (not . isSurrogate) unicodeAll
+  let
+    s1 =
+      (55296, enum '\0' '\55295')
+    s2 =
+      (8190, enum '\57344' '\65533')
+    s3 =
+      (1048576, enum '\65536' '\1114111')
+  in
+    frequency [s1, s2, s3]

 -- | Generates a Unicode character, including noncharacters and invalid standalone surrogates:
 -- @'\0'..'\1114111'@
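
The three weights passed to `frequency` in this change appear to be the sizes of the three ranges, which would keep every remaining character equally likely, much as filtering a uniform draw from `unicodeAll` did. The sketch below only checks that arithmetic and does not depend on Hedgehog. The names `segments`, `rangeSize`, `segmentWeights` and `nthChar` are hypothetical helpers introduced for illustration and are not part of the library.

```haskell
import Data.Char (chr, ord)

-- | Inclusive code point ranges that the new 'unicode' draws from,
-- copied from the diff above.
segments :: [(Char, Char)]
segments =
  [ ('\0', '\55295')        -- everything below the surrogate block
  , ('\57344', '\65533')    -- above the surrogates, below '\65534' and '\65535'
  , ('\65536', '\1114111')  -- the supplementary planes
  ]

-- | Number of characters in an inclusive range.
rangeSize :: (Char, Char) -> Int
rangeSize (lo, hi) = ord hi - ord lo + 1

-- | Weights proportional to range size: picking a range with these weights
-- and then a character uniformly inside it is uniform over the union.
segmentWeights :: [Int]
segmentWeights = map rangeSize segments

-- | Hypothetical helper (an assumption, not Hedgehog code): the character
-- at a given position when the three ranges are laid end to end.
nthChar :: Int -> Maybe Char
nthChar n
  | n < 0 = Nothing
  | otherwise = go n segments
  where
    go _ [] = Nothing
    go i (seg@(lo, _) : rest)
      | i < rangeSize seg = Just (chr (ord lo + i))
      | otherwise = go (i - rangeSize seg) rest
```

Evaluating `segmentWeights` in GHCi gives `[55296,8190,1048576]`, matching the literals in `s1`, `s2` and `s3`. Their sum, 1112062, is the full range of 1114112 code points minus the 2048 surrogates and the two excluded noncharacters.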