Make Fake a monad transformer #32
Conversation
LGTM. We can ignore the CI issue for the nightly builds. I have left a minor comment, and the changelog also needs an update. Thank you!
@@ -609,6 +610,7 @@ library
    , template-haskell
    , text
    , time
+   , transformers
Can you do the changes in package.yaml instead, since we generate the cabal file from there?
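For reference, the dependency would go in the `dependencies` list of package.yaml (a sketch; this project's exact package.yaml layout may differ), and hpack then regenerates the .cabal file from it:

```yaml
# package.yaml - hpack reads this and regenerates the .cabal file,
# so the dependency only needs to be declared here.
library:
  dependencies:
    - transformers
```

Running `hpack` (or `stack build`, which invokes it automatically when a package.yaml is present) regenerates the .cabal file.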
Should I commit the updated cabal file as well?
Yes, the generated cabal file needs to be updated.
Also, can you add a performance note about this (similar to what you mentioned in the PR description) in README.md:
Using this monad transformer, we were able to share a single FakerSettings, including the cache, across lots of Fake values, without having to rewrite everything in terms of IO. This was a massive performance improvement. Since it was useful to us, I thought it would be good to contribute back to upstream.
And also give an example of both performant and non-performant code using this MR.
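For illustration, the contrast being asked for might look something like the following sketch against fakedata's `generate` API (this assumes `Faker.Name.name :: Fake Text`; adjust to whichever fakes you actually use):

```haskell
import Control.Monad (replicateM)
import Data.Text (Text)
import Faker (Fake, generate)
import qualified Faker.Name as Name

-- Non-performant: every 'generate' call parses the YAML data files and
-- rebuilds the cache from scratch, so the setup cost is paid 1000 times.
slowNames :: IO [Text]
slowNames = replicateM 1000 (generate Name.name)

-- Performant: a single 'generate' call shares one FakerSettings (and its
-- cache) across all 1000 fakes, so the setup cost is paid only once.
fastNames :: IO [Text]
fastNames = generate (replicateM 1000 Name.name)
```

The point is purely about where the sequencing happens: `replicateM` outside `generate` re-runs the expensive setup per value, while `replicateM` inside a single `Fake` amortises it.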
Alright, I've moved those changes to package.yaml, including a couple of dependencies that were only in the .cabal file.
I've now added a README section which describes how to use FakeT for better performance - let me know what you think.
This is a use-case that came up in my work, where we were generating lots of 'Fake' values. The trouble with running 'Fake' is that parsing the YAML and building the cache are very expensive, but all of that effort is thrown away after the 'Fake' is run - so it makes sense to generate as much data as possible within a single 'Fake'. But if you're using a monad stack, you would have to put 'Fake' at the bottom - when instead it makes more sense as a transformer, which can share a single YAML parser and cache between many 'Fake' runs. This commit adds the 'FakeT' monad transformer and makes 'Fake' a synonym for @FakeT IO@, preserving as much backwards compatibility as possible with functions and pattern synonyms.
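The shape being described could be sketched as below. This is illustrative only: the constructor name `MkFakeT`, the field name, and the stand-in `FakerSettings` type are assumptions, not the library's actual definitions.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Stand-in for the library's settings type, for illustration only.
data FakerSettings = FakerSettings

-- 'FakeT' is a reader-like transformer: a computation that consumes
-- the shared settings (and cache) and runs in any base monad 'm'.
newtype FakeT m a = MkFakeT { unFakeT :: FakerSettings -> m a }

-- 'Fake' stays available as a synonym for @FakeT IO@ ...
type Fake = FakeT IO

-- ... and a pattern synonym stands in for the old 'Fake' constructor,
-- so existing code that constructs or matches on 'Fake' keeps working.
pattern Fake :: (FakerSettings -> IO a) -> Fake a
pattern Fake f = MkFakeT f
```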
Following on from the 'FakeT' transformer, it makes sense to be able to "lift" the default 'Fake' values into the monad stack, in order to allow them to actually share the transformer cache and generator. This is done by the 'MonadFake' class, which comes with an instance for 'FakeT', as well as some of the other transformers defined in the `transformers` package.
This allows users to use combinators on any 'FakeT', instead of just the base 'Fake' monad.
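A minimal sketch of that lifting class, using local stand-in types (the real library's definitions and full instance set may differ):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT (..))
import Control.Monad.Trans.State (StateT)

-- Stand-in for the library's settings type, for illustration only.
data FakerSettings = FakerSettings

-- Same shape as before: a reader of the shared settings.
newtype FakeT m a = FakeT (ReaderT FakerSettings m a)
  deriving (Functor, Applicative, Monad, MonadIO)

type Fake = FakeT IO

-- 'MonadFake' lifts a default 'Fake' value into any stack that can
-- supply the shared settings and generator.
class Monad m => MonadFake m where
  liftFake :: Fake a -> m a

-- The 'FakeT' instance reuses the settings it already carries ...
instance MonadIO m => MonadFake (FakeT m) where
  liftFake (FakeT (ReaderT f)) = FakeT (ReaderT (liftIO . f))

-- ... and transformer instances simply delegate down the stack.
instance MonadFake m => MonadFake (StateT s m) where
  liftFake = lift . liftFake
```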
Force-pushed from 8c4d2cb to 494201d.
Thanks!
Motivation
At my work, we were running lots of `Fake`s in a monad stack to generate very large amounts of data, but it was quite slow. When I did some profiling, I found that the biggest bottlenecks were work that `fakedata` does every time you generate a `Fake` - namely, parsing the YAML files for the data and building a cache for accessing it.

That meant that we should have been running all of our code inside a single `Fake`, to take advantage of the cache - but then we would have had to run our monad code in `IO`, since that's what's required by `Fake`. So instead, we created this monad transformer, which works just like `Fake`, but with any monad stack we want, and still shares the cache.

We then lifted our `Fake`s to `FakeT`s, to allow us to run them using the `FakerSettings` from `FakeT`.

Using this monad transformer, we were able to share a single `FakerSettings`, including the cache, across lots of `Fake` values, without having to rewrite everything in terms of `IO`. This was a massive performance improvement. Since it was useful to us, I thought it would be good to contribute back to upstream.

Breaking change
This is a breaking change, since importing `Fake(..)` is no longer enough to import the `Fake` constructor (which is now actually a pattern synonym).

Future work
~~I still want to go through `Faker.Combinators` to generalise them to `FakeT m` instead of `Fake`, but I thought I would share these changes first.~~ This is done now.