Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Fake a monad transformer #32

Merged
merged 5 commits into from
Jun 10, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
64 changes: 64 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,70 @@ main = do
For seeing the full list of combinators, see the module documentation of
`Faker.Combinators`.


Using the `FakeT` transformer
-----------------------------

When generating values, you may want to perform some side-effects.

```haskell
import Control.Monad.IO.Class
import Control.Monad.Logger
import Data.Text
import Data.Text.IO
import Faker.ChuckNorris

logQuote :: (MonadIO m, MonadLogger m) => m ()
logQuote = do
userName <- liftIO getLine
quote <- generateNonDeterministic fact
$(logInfo) $ "Chuck Norris" userName quote
```

This works fine for one-off generation - but if you try to repeatedly
generate values, you will run into performance trouble.

```haskell
import Control.Monad (replicateM)

slowFunction :: (MonadIO m, MonadLogger m) => m ()
slowFunction = replicateM 1000 logQuote
```

This is because generating a `Fake` parses the data files and builds a
cache for future use. Using the `Monad` instance on `Fake` shares that
cache between `Fake`s, making faking fast. But in the above code, a
new `Fake` is generated each time - so the cache is discarded, and
performance is much worse.

It's better to use the `FakeT` monad transformer when writing such code,
to get the benefits of sharing the cache, as well as being able to
perform side effects. `FakeT` comes with the `mtl`-style `MonadFake`
class, for easy use with your monad stack, which lets you lift `Fake`s
with `liftFake`.

```haskell
import Faker.Class

betterLogQuote :: (MonadIO m, MonadLogger m, MonadFake m) => m ()
betterLogQuote = do
userName <- liftIO getLine
quote <- liftFake fact
$(logInfo) $ "Chuck Norris" userName quote
```

`slowFunction` can be rewritten to be much faster, because the `FakeT`
is shared between all the calls to `fact`.

```haskell
fastFunction :: (MonadIO m, MonadLogger m) => m ()
fastFunction = generateNonDeterministic go
where
go :: FakeT m ()
go = replicateM 1000 logQuote
```
psibi marked this conversation as resolved.
Show resolved Hide resolved


Comparision with other libraries
--------------------------------

Expand Down
22 changes: 14 additions & 8 deletions fakedata.cabal
Original file line number Diff line number Diff line change
@@ -1,10 +1,8 @@
cabal-version: 1.12

-- This file has been generated from package.yaml by hpack version 0.33.0.
-- This file has been generated from package.yaml by hpack version 0.34.4.
--
-- see: https://github.com/sol/hpack
--
-- hash: 47d937a2bbf041ef0cb0b882d0bf7d68d1eb9504988a12d676e9e699288f2bda

name: fakedata
version: 0.8.0
Expand Down Expand Up @@ -273,6 +271,7 @@ library
Faker.Cannabis
Faker.Chiquito
Faker.ChuckNorris
Faker.Class
Faker.Code
Faker.Coffee
Faker.Coin
Expand Down Expand Up @@ -595,7 +594,8 @@ library
hs-source-dirs:
src
build-depends:
attoparsec
QuickCheck
, attoparsec
, base >=4.7 && <5
, bytestring
, containers
Expand All @@ -605,10 +605,12 @@ library
, filepath
, hashable
, random
, regex-tdfa
, string-random
, template-haskell
, text
, time
, transformers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do the changes in package.yaml instead since we generate the cabal file from there ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I commit the updated cabal file as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the generated cabal file needs to be updated.

Copy link
Member

@psibi psibi Jun 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you add a performance note about this (similar to what you mentioned in the PR description) in README.md:

Using this monad transformer, we were able to share a single FakerSettings, including the cache, across lots of Fake values, without having to rewrite everything in terms of IO. This was a massive performance improvement. Since it was useful to us, I thought it would be good to contribute back to upstream.

And also give an example of a good performance and a non good performance code using this MR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've moved those changes to package.yaml, including a couple of dependencies that were only in the .cabal file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now added a README section which describes how to use FakeT for better performance - let me know what you think.

, unordered-containers
, vector
, yaml
Expand Down Expand Up @@ -689,7 +691,8 @@ test-suite fakedata-test
test
ghc-options: -threaded -rtsopts -with-rtsopts=-N
build-depends:
attoparsec
QuickCheck
, attoparsec
, base >=4.7 && <5
, bytestring
, containers
Expand All @@ -702,15 +705,15 @@ test-suite fakedata-test
, hspec
, hspec-discover
, random
, regex-tdfa
, string-random
, template-haskell
, text
, time
, transformers
, unordered-containers
, vector
, yaml
, regex-tdfa
, QuickCheck
default-language: Haskell2010

benchmark fakebench
Expand All @@ -722,7 +725,8 @@ benchmark fakebench
bench
ghc-options: -O2
build-depends:
attoparsec
QuickCheck
, attoparsec
, base >=4.7 && <5
, bytestring
, containers
Expand All @@ -735,10 +739,12 @@ benchmark fakebench
, gauge
, hashable
, random
, regex-tdfa
, string-random
, template-haskell
, text
, time
, transformers
, unordered-containers
, vector
, yaml
Expand Down
3 changes: 3 additions & 0 deletions package.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,9 @@ dependencies:
- string-random
- fakedata-parser
- attoparsec
- regex-tdfa
- QuickCheck
- transformers

library:
source-dirs: src
Expand Down
74 changes: 42 additions & 32 deletions src/Faker.hs
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,13 @@
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE PatternSynonyms #-}

module Faker
(
-- * Types
Fake(..)
Fake
, FakeT(.., Fake)
, FakerSettings
, FakerException(..)
, defaultFakerSettings
Expand Down Expand Up @@ -172,58 +174,66 @@ replaceCacheFile cache fs = do
ref <- newIORef cache
pure $ fs {fsCacheFile = ref}

newtype FakeT m a = FakeT
{ runFakeT :: FakerSettings -> m a
}

-- | Fake data type. This is the type you will be using to produce
-- fake values.
newtype Fake a = Fake
{ unFake :: FakerSettings -> IO a
}
type Fake a = FakeT IO a

instance Functor Fake where
pattern Fake :: (FakerSettings -> IO a) -> Fake a
pattern Fake f = FakeT f

unFake :: Fake a -> FakerSettings -> IO a
unFake = runFakeT

instance Monad m => Functor (FakeT m) where
{-# INLINE fmap #-}
fmap :: (a -> b) -> Fake a -> Fake b
fmap f (Fake h) =
Fake
fmap :: (a -> b) -> FakeT m a -> FakeT m b
fmap f (FakeT h) =
FakeT
(\r -> do
a <- h r
let b = f a
pure b)

instance Applicative Fake where
instance Monad m => Applicative (FakeT m) where
{-# INLINE pure #-}
pure x = Fake (\_ -> pure x)
pure x = FakeT (\_ -> pure x)
{-# INLINE (<*>) #-}
(<*>) = ap

instance Monad Fake where
instance Monad m => Monad (FakeT m) where
{-# INLINE return #-}
return :: a -> Fake a
return x = Fake (\_ -> return x)
return :: a -> FakeT m a
return x = FakeT (\_ -> return x)
{-# INLINE (>>=) #-}
(>>=) :: Fake a -> (a -> Fake b) -> Fake b
(>>=) :: FakeT m a -> (a -> FakeT m b) -> FakeT m b
f >>= k = generateNewFake f k

generateNewFake :: Fake a -> (a -> Fake b) -> Fake b
generateNewFake (Fake h) k = Fake (\settings -> do
generateNewFake :: Monad m => FakeT m a -> (a -> FakeT m b) -> FakeT m b
generateNewFake (FakeT h) k = FakeT (\settings -> do
let deterministic = getDeterministic settings
currentStdGen = getRandomGen settings
newStdGen = if deterministic
then currentStdGen
else fst $ split currentStdGen
item <- h settings
let (Fake k1) = k item
let (FakeT k1) = k item
k1 (setRandomGen newStdGen settings))
{-# SPECIALIZE INLINE generateNewFake :: Fake Text -> (Text -> Fake Text) -> Fake Text #-}

instance MonadIO Fake where
liftIO :: IO a -> Fake a
liftIO xs = Fake (\_ -> xs >>= pure)
instance MonadIO m => MonadIO (FakeT m) where
liftIO :: IO a -> FakeT m a
liftIO xs = FakeT (\_ -> liftIO xs)

-- | @since 0.6.1
instance Semigroup a => Semigroup (Fake a) where
instance (Semigroup a, Monad m) => Semigroup (FakeT m a) where
mx <> my = (<>) <$> mx <*> my

-- | @since 0.6.1
instance Monoid a => Monoid (Fake a) where
instance (Monoid a, Monad m) => Monoid (FakeT m a) where
mempty = pure mempty
mappend mx my = mappend <$> mx <*> my

Expand All @@ -234,10 +244,10 @@ instance Monoid a => Monoid (Fake a) where
-- λ> generate FN.name
-- "Antony Langosh"
-- @
generate :: Fake a -> IO a
generate (Fake f) = do
cacheField <- newIORef HM.empty
cacheFile <- newIORef HM.empty
generate :: MonadIO m => FakeT m a -> m a
generate (FakeT f) = do
cacheField <- liftIO $ newIORef HM.empty
cacheFile <- liftIO $ newIORef HM.empty
f $ defaultFakerSettings {fsCacheField = cacheField, fsCacheFile = cacheFile}

-- | Generate fake value with 'defaultFakerSettings' but with non
Expand All @@ -252,7 +262,7 @@ generate (Fake f) = do
-- λ> generateNonDeterministic FN.name
-- "Savannah Buckridge"
-- @
generateNonDeterministic :: Fake a -> IO a
generateNonDeterministic :: MonadIO m => FakeT m a -> m a
generateNonDeterministic = generateWithSettings $ setNonDeterministic defaultFakerSettings

-- | Generate fake value with supplied 'FakerSettings'
Expand All @@ -261,14 +271,14 @@ generateNonDeterministic = generateWithSettings $ setNonDeterministic defaultFak
-- λ> generateWithSettings defaultFakerSettings FN.name
-- "Antony Langosh"
-- @
generateWithSettings :: FakerSettings -> Fake a -> IO a
generateWithSettings settings (Fake f) = do
generateWithSettings :: MonadIO m => FakerSettings -> FakeT m a -> m a
generateWithSettings settings (FakeT f) = do
let deterministic = getDeterministic settings
stdGen <-
if deterministic
then pure $ getRandomGen settings
else newStdGen
else liftIO newStdGen
let newSettings = setRandomGen stdGen settings
cacheField <- newIORef HM.empty
cacheFile <- newIORef HM.empty
cacheField <- liftIO $ newIORef HM.empty
cacheFile <- liftIO $ newIORef HM.empty
f $ newSettings {fsCacheField = cacheField, fsCacheFile = cacheFile}
2 changes: 1 addition & 1 deletion src/Faker/Address.hs
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
module Faker.Address where

import Data.Text (Text)
import Faker (Fake(..), FakerSettings(..), getLocale)
import Faker (Fake, FakeT(..), FakerSettings(..), getLocale)
import Faker.Internal (cachedRandomUnresolvedVec, cachedRandomVec, cachedRegex, RegexFakeValue(..))
import Faker.Provider.Address
import Faker.TH
Expand Down
33 changes: 33 additions & 0 deletions src/Faker/Class.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
module Faker.Class
( MonadFake(..)
) where

import Faker (FakeT(..), Fake)

import Control.Monad.IO.Class
import Control.Monad.Trans.Class (MonadTrans(lift))
import Control.Monad.Trans.Except (ExceptT)
import Control.Monad.Trans.Identity (IdentityT)
import Control.Monad.Trans.Maybe (MaybeT)
import Control.Monad.Trans.Reader (ReaderT)
import Control.Monad.Trans.State (StateT)
import Control.Monad.Trans.Writer (WriterT)

class Monad m => MonadFake m where
liftFake :: Fake a -> m a

instance MonadIO m => MonadFake (FakeT m) where
liftFake (Fake f) = FakeT (liftIO . f)

instance MonadFake m => MonadFake (ReaderT r m) where
liftFake = lift . liftFake
instance (Monoid w, MonadFake m) => MonadFake (WriterT w m) where
liftFake = lift . liftFake
instance MonadFake m => MonadFake (StateT s m) where
liftFake = lift . liftFake
instance MonadFake m => MonadFake (IdentityT m) where
liftFake = lift . liftFake
instance MonadFake m => MonadFake (ExceptT e m) where
liftFake = lift . liftFake
instance MonadFake m => MonadFake (MaybeT m) where
liftFake = lift . liftFake
Loading