New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce basic approach to concurrent log writing. #45
Conversation
I'm not sure about the best approach here, so I came with two, both seem to make sense and can be further extended. It's really hard to get into the trap of overgeneralization here :/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much, @qnikst! This looks sooo great! I highly appreciate your effort on this issue! I left some comments, but mostly they are questions. I'm totally okay with having different strategies for concurrency. The co-log
library is very young, so it's okay to have different ways for doing things and figure out the best one later.
My another concern is about function like fileLogAction
:
I'm not sure it will work properly if multiple threads try to write concurrently to the same file using the same Handler
. Though, I assume we could add one bracket
-style combinator on top of the forkBackgroundLogger
:
withBackgroundLogger :: Natural -> LogAction IO msg -> (LogAction IO msg -> IO r) -> IO r
co-log/src/Colog/Concurrent.hs
Outdated
@@ -0,0 +1,129 @@ | |||
-- | | |||
-- Sometimes an application may need to dump logs asynchonously. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation for this module is super cool!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
co-log/src/Colog/Concurrent.hs
Outdated
-- @ | ||
-- | ||
-- This code will run a background thread and process different | ||
-- loggers in that thread. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add some note like This can be useful if logging a single message requires some computation? Or general guideline when people might want to use concurrent logging. I personally aware only of the cases when people might not want to use concurrent logging due to some subtle concurrency issues. But it's still very useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will add a note and more comments about use-cases and warnings.
co-log/src/Colog/Concurrent.hs
Outdated
-- $worker-thread | ||
-- Worker thread approach allows to create a single worker | ||
-- thread that will perform all logging. This approach allows | ||
-- user to use different loggers and change them on the flight. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and change them on the flight
That's a really cool feature — to be able to replace logging function with different one. Am I right that it's possible to do something like this: implement simple program that prints messages every second in some format A
and listens input and after user gives some input to this program it instantly starts to print messages in format B
?
Also, it's not that clear from existing documentation and code, how to replace current logging function with different one on the flight...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see a problem, will discuss.
But I think I'll discuss the direct passing of the log action, and leave Colog.Monad
approach aside.
co-log/src/Colog/Concurrent.hs
Outdated
-- @ | ||
-- bracket ('mkBackgroundThread' 4096) | ||
-- ('killBackgroundThread') | ||
-- $ \thread -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like thread
variable in the documentation should be named as worker
according to the code below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+
co-log/src/Colog/Concurrent.hs
Outdated
-- write logs will be blocked. | ||
mkBackgroundThread :: Int -> IO (BackgroundWorker (IO ())) | ||
mkBackgroundThread capacity = do | ||
queue <- newTBQueueIO (fromIntegral capacity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that newTBQueueIO
function takes Natural
in stm-2.5.0.0
. Maybe it's a good idea to have stm ^>= 2.5.0.0
in build-depends
in that case and pass Natural
to the mkBackgroundThread
function as well? Also, probably it's not enough to have stm >= 2.4
because stm-2.4.5.1
contains very important bugfix for TBQueue
:
I even expect from GHC to give you warning about fromIntegral
usage after you change type from Int
to Natural
because we have -Widentities
warning enabled for co-log
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's up to you. As far as I understand stm-2.5.0.0
will cut many older GHC's, in the case if you want to support 3 major versions then it's not an option. And I find it's easer to keep the same interface for all versions then doing CPP magic to provide the best option for each version.
co-log/src/Colog/Concurrent.hs
Outdated
-- | Create a background worker with a given capacity. | ||
-- If capacity is reached, then the thread that tries to | ||
-- write logs will be blocked. | ||
mkBackgroundThread :: Int -> IO (BackgroundWorker (IO ())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Input message as IO ()
looks interesting 🤔 I wonder how useful such simple LogAction
could be:
idLogAction :: LogAction m (m ())
idLogAction = LogAction id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very good at CT to find what that object will be and derive a use-case for that based on that knowledge. Right now I don't see how it can be used :/
co-log/src/Colog/Concurrent.hs
Outdated
-- Logger is executed in the other thread entirely, so if | ||
-- logger takes any thread related context it will be | ||
-- read from the other thread. | ||
runInBackgroundThread :: BackgroundWorker (IO ()) -> LogAction IO msg -> LogAction IO msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to have the result type of runInBackgroundThread
function to be MonadIO m => LogAction m msg
but I wonder how useful it can be. Whole current API is built on top of the plain IO
and the only sensible generalisation that I know involves MonadUnliftIO
but I'm not sure it's a good idea to bring MonadUnliftIO
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be useful, because that way you may provide useful capabilities there. But I'm uneasy with that change, currently, there are a number of different approaches that allow such generalizations:
- MonadUnliftIO and unliftio ecosystem in general
- monad-control that is still widely used
- MTL-heavy libraries that are using MTL as final tagless system
- a family of the effects systems.
- direct handle approach
Making a preference to the one system other the others may be a bad thing, as it increases dependencies set and makes people used to other systems uneasy when using your library.
And IO
on it's own is very powerful and composable type, in general with any of the approaches mentioned above you can write LogAction IO msg
and "unlift" your system to those. So basically you need unlift :: SomeConstraints m => LogAction m msg -> LogAction IO msg
that will be called in the parent thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can just add liftLogIO :: MonadIO m => LogAction IO msg -> LogAction m msg
to co-log-core
to make life of users easier. So they have some way accessible without extra dependencies introduced.
co-log/src/Colog/Concurrent.hs
Outdated
-- takes a 'LogAction' that should describe how to write | ||
-- logs. | ||
-- | ||
-- @capability@ - parameter tells how many in flight messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable called capacity
in multiple places, so probably it's a good idea to name it capacity
here as well instead of capability
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
co-log/src/Colog/Concurrent.hs
Outdated
-- @capability@ - parameter tells how many in flight messages | ||
-- are allowed, if that value is reached then user's thread | ||
-- that emits logs will be blocked until any message will be | ||
-- written. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think alternative strategy is to drop all log messages after reaching the limit. But I agree with making current behavior the default one. Probably in production it's more important to lose some performance and hang for a while rather than losing important logs about what is actually going on within the running system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll add some docs and expose more internal so users will be able to define such strategies on their own. I don't like dropping messages, it should be some very special use-case. If you have too many logs so you can't dump them, it means that your environment has some severe problems that should be solved anyway.
co-log/src/Colog/Concurrent.hs
Outdated
queue <- newTBQueueIO (fromIntegral capacity) | ||
tid <- forkIO $ forever $ do | ||
msg <- atomically $ readTBQueue queue | ||
unLogAction logAction msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, just general question. Would you appreciate some fancy operator for unwrapping LogAction
newtype? So instead of
unLogAction logAction msg
you could write something like:
logAction <| msg
Or probably have shorter alias for unLogAction
to partially apply logAction
in more convenient way?
toLogFun logAction :: msg -> m ()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that may be useful, but in internals, I like to use the full name. So it was not difficult for me to write unLogAction
With any of those approaches Just a bit of discussion while I'm trying to prepare myself for addressing the comments :). In general, we have two types of concurrency possible, I'll call them depth and width concurrency.
|
@chshersh, please take another look, I've reworked my branch, fixed few concurrency issues and improved documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example documentation is super cool! I really liked the ASCII schema ❤️ I have some comments to the Haddock, but mostly looks great 👍
|
||
-- | A wrapper type that carries capacity. The internal | ||
-- type may be differrent for the different GHC versions. | ||
#if MIN_VERSION_base_noprelude(4,12,0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably here you should specify MIN_VERSION_stm_(2,5,0)
. Currently Travis CI is failing for cabal-install and GHC 8.4.3.
co-log/src/Colog/Concurrent.hs
Outdated
-- affected or not. | ||
-- | ||
-- 1. Unbounded memory usage - if there is no backpressure mechanism | ||
-- the the user threads, they may generate more logs that can be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double the
. Probably a typo
co-log/src/Colog/Concurrent.hs
Outdated
-- | ||
-- 1. Unbounded memory usage - if there is no backpressure mechanism | ||
-- the the user threads, they may generate more logs that can be | ||
-- written in the same about of time. In those cases messages will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe here you wanted to say in the same amount of time?
co-log/src/Colog/Concurrent.hs
Outdated
-- $ 'Colog.Monad.usingLoggerT' $ __do__ | ||
-- 'Colog.Monad.logMsg' "Starting application..." | ||
-- where | ||
-- logger = 'Colog.Action.withLogByteStringFile' "/var/log/myapp/log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
co-log/src/Colog/Concurrent.hs
Outdated
-- main = | ||
-- 'withBackgroundLogger' | ||
-- 'defCapacity' | ||
-- (Aeson.encode `'cmap'` logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow cmap
doesn't render properly with Haddock 🤔
I found this blog post with common suggestions for Haddock extremely useful!
But there's no such case for this situation...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\``Core.Action.cmap`\`
seems to work
co-log/src/Colog/Concurrent.hs
Outdated
(killBackgroundLogger) | ||
(action . convertToLogAction) | ||
|
||
-- | Default capacity size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe say in the documentation that it's equal to 4096? Because on Hackage you can't see the source code without clicking Source
link. This might be useful piece of information. We only need to not forget to update it after the value changes.
co-log/src/Colog/Concurrent.hs
Outdated
-- $background-logger | ||
-- | ||
-- Background logger is specialized version of the 'BackgroundWorker' process. | ||
-- Instead of running any jobu it will accept @msg@ type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jobu probably a typo
Introduce two approaches for concurrent log writing and basic utilities for working with concurrent logs. Two approaches are needed in order to allow user to chose between generality and speed, though approaches were not benchmarked to check the hypothesys.
I hope I've fixed CI and typos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks super cool!
Introduce two approaches for concurrent log writing
and basic utilities for working with concurrent logs.
Two approaches are needed in order to allow user to
chose between generality and speed, though approaches
were not benchmarked to check the hypothesys.