Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread safety changes #2

Closed
wants to merge 45 commits into from
Closed

Thread safety changes #2

wants to merge 45 commits into from

Conversation

mleise
Copy link

@mleise mleise commented Sep 20, 2014

This pull request addresses the mentioned basic thread safety issues while keeping the API mostly the same. In particular I changed the following:

  • All methods documented as overridable (writeLogMsg, beginLogMsg, logMsgPart and finishLogMsg) are now protected. They are the ones who no longer need to implement thread-safety themselves and never get called by user code. They are only called by library code that synchronizes with the Logger's mutex before calling them. When implementing your own loggers, remember to not accidentally make the overrides public.
  • stdlog when set to null will swap the default logger back in.
  • The unittesting symbols like randomString() no longer leak into user code.
  • The function isLoggingActive is now split into an compile time constant and a template (isLoggingActiveAt) for a specific level. The return values are the same, but code & symbols for the functions are no longer emitted.
  • Compiler now warns about unimplemented Logger methods (instead of ending with linker errors).
  • All methods except the ones documented as overridable have been made final. This is to implement the thread-safety measures inside these other methods and be safe from later overrides that break them.

A word on Logger chaining: A special method for logging chains has been added to replace the now inaccessible writeLogMsg() with the name forwardMsg(). As public a method it will do the proper synchronization before calling writeLogMsg() internally and not check the globalLogLevel again.

Note: This pull request does not address the recursive logger call issue.

burner and others added 30 commits September 16, 2014 00:08
some more docu fixes

fatel -> fatal

changed the way of calling

some rework

some better comments and win32.mak fix

more documentation

trying to figure out why win32 fails

test before you commit

another try to work around windows files and processes

maybe this time

maybe this merges

update 1 mostly soenke

MultiLogger and Logger have names

more docu

unittest fix

some better comments and win32.mak fix

Conflicts:
	std/logger.d

antoher doc fix

less code dup and more log functions

docs are now generated

some more fixing

speedup

some threading

forgot some

better docu gen and more testing

two new LogLevel, most functions are now at least @trusted

Tracer functionality

fatal delegate

some phobos style fixes

another bug bites the dust

version disable enhancements

default global LogLevel set to LogLevel.all

logLevel compare bugfix

delete of dead code

tab to whitespace

bugfixes, reduandency removal, and docu

multilogger was logging to all it childs without checking there LogLevel in
relation to the LogLevel of the LoggerPayload.

Some constructors where redundant.

more examples and more documentation

some fixes

NullLogger and moduleName

wrong doc

I splitted the multi logger implementations out

loglevelF to loglevelf in order to make phobos style think writefln

document building

win makefile fixes

some optimizations thanks to @elendel-
some whitespace

some updates from the github logger project

* stdio- and filelogger now use a templatelogger base to reduce code

makefile fixes

makefile

fixed the unittest

made sure the stdiologger is set up at the end

some comment fixes

finding the filelogger segfault

closed another file

a lookup fix

darwin unittest fail output

more diagnostics

* more documentation for the templatelogger and multilogger
* it breaks the log function api
 * log and logf now behave as their write and writef counterparts
 * for logging with an explicit LogLevel call logl loglf
 * for logging with an explicit condition call logc logcf
 * for logging with an explicit LogLevel and explicit condition call
 * loglc loglcf
 * the log function with an explicit LogLevel like info, warning, ...
 * can be extended into c, f or cf and therefore require a condition
 * and/or are printf functions

something is still interesting

rebase and lazy Args

saver file handling

whitespace

some updates

tracer is gone and more doc updates
* StdIOLogger prints now threadsafe
* concurrentcy hang fix
* makefile fix
@ChronosWS
Copy link

That was my concern as well. I understand the original design was intended to be simple, with loggers invoked essentially directly with configuration happening essentially statically. Such a model comes with a number of design restrictions if it is intended that the client can be agnostic about which logger is being used (a reasonable expectation in most scenarios as logger configuration frequently occurs through configuration files and runtime selection rather than compile-time.) It's fine to have such restrictions in the design as long as they are deliberate, well communicated and consistent.

@ChronosWS
Copy link

An example of limiting by design could be the following:

  1. Logging is assumed to be configured statically, once, at the beginning of the program and not changed during the program execution. Attempts to reconfigure the logging system fail, or can only be done when the system is in a shut-down state. Configuration is all-or-nothing.
  2. Logging is not allowed to be re-entrant - that is, a logger in operation cannot cause log messages to be generated. The system should enforce this.
  3. Logging methods are synchronized such that the logger's processing of any top-level call does not overlap any other top-level call. If re-entrancy is forbidden, mutex synchronization is sufficient here, but note that this also makes all logging calls synchronous.
  4. If clients can make any assumptions about the synchronous/asynchronous nature of the logger, those assumptions need to be verifiable by the client (through a query mechanism, or by definition in the API.) This is particularly important if some part of the code receives a logger but is unaware of its configuration (such as library code which leans on the host-supplied logger implementation.)

Using such a design greatly simplifies the kinds of inter-thread issues we have to deal with. I feel like that is more like what Robert had originally intended. Once you allow arbitrary reconfiguration of the logging system, re-entrancy or multiple loggers a-la the MultiLogger the system must become larger and more complex, perhaps beyond the original intentions.

@mleise
Copy link
Author

mleise commented Sep 21, 2014

I know some web server folks wont be happy at all. But message passing is not really what I want to get my hands into. I have no experience in the internals of big logging frameworks like log4j and no understanding of web server requirements. As far as this PR is concerned logging is now thread-safe, which is already a net gain.

@burner
Copy link
Owner

burner commented Sep 21, 2014

yeah message passing is way to heavyweight. This is something that should be build on top of std.logger

@ChronosWS
Copy link

One question i have about the mutex in Logger - it appears its purpose is to protect unsynchronized access to two member variables: logLevel_ and fatalHandler_. As a result, all log methods must acquire this mutex in order to perform any logging since they all use both members. However, they hold this mutex long enough to make calls to user code (Logger impls.) This should not be necessary and can be tidied up.

First, as it stands there is no guarantee of the state of the Logger at the time the log function is entered with respect to the content of logLevel_ or fatalHandler_. They are what they are once we enter the mutex. Further, they are not synchronized with respect to each other either, as they can be set individually and each setting takes the mutex on its own. With this behavior, acquiring the mutex for the whole duration of each logging call (spanning the Logger impl) is unnecessary - instead, hold the mutex just long enough to snapshot the logLevel_ and fatalHandler_ values into locals, then proceed with the impl calls without the mutex.

This can be improved even further by moving logLevel_ and _fatalHandler into an immutable LoggerConfiguration class which may be exchanged any time the settings get updated. Again, interactions between the logging system on different threads do not have any ordering guarantees (nor can the logging system make such guarantees) so we are free to change that configuration at any time so long as within the logging functions we operate on only a single configuration (which we can snapshot simply by taking a reference to it.)

Once you have done this, the only purpose mutex serves now is to prevent Logger impls (that is, begin, part and finish) from being entered simultaneously on several threads. To my mind this is functionality best deferred to the Logger derivation, which may have its own strategy for dealing with multithreading (a-la Robert's suggestion that those who need additional functionality can extend Logger.) This can be done by having beginLogMessage setup a context in which the rest of the message is accumulated and dispatched. Some D'ish pseudocode follows for the various logImpl()s :

{
auto callConfig = this.loggerConfiguration_;
auto logCtx = this.beginLogMsg(... callConfig.logLevel ...);
static if(ll == LogLevel.fatal)
scope(exit): callConfig.fatalHandler();
scope(exit): logCtx.finishLogMsg();
various MsgRange(...); or logMsgPart(...) as required which invoke on logCtx. e.g. logCtx.logMsgPart(...)
}

(if I have misconstrued some D functionality please let me know)

beginLogMsg is now responsible for setting up whatever synchronization may be necessary based on the requirements of the logger. The "standard" loggers could inherit from MutexSynchronizedLogger whose beginLogMsg call acquires a mutex and whose finishLogMsg call releases that mutex. Asynchornous loggers may return a context which simply accumulates the message parts then posts the completed request to the queue without acquiring any longer-standing mutex. By moving the calls to logMsgPart and finishLogMsg() to a returned context, Logger impls are free to efficiently support logging from multiple threads without necessarily requiring those threads to synchronize across each other for the duration of the call. Simultaneously anyone who does want this simpler behavior can simply derive from MutexSynchronizedLogger. A trivial logCtx implementation might either be a forwarder to the logger impl or simply a reference to the logger itself with those methods implemented.

I think the above will give us all the simplicity of the original design without inadvertently burdening extensions with unnecessary synchronization.

Thoughts?

@ChronosWS
Copy link

I'd be happy to try and create a pull request if that would be more useful/clearer...

@@ -1228,14 +1212,14 @@ abstract class Logger
l.log(1337);
--------------------
*/
void log(int line = __LINE__, string file = __FILE__,
final void log(int line = __LINE__, string file = __FILE__,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please elaborate why these methods should be final.
If the user tries to shoot his foot he should be able to, as this logger should be the bases for all possible impls.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I previously replied to that in the conversation thread (#2 (comment)). There are several motivations for that modification:

  • Final methods can get inlined by the compiler.
  • Even if they don't, there is a small performance benefit from not going through the vftable of the Logger instance.
  • Virtual methods should be used sparingly and documented as extension points. Making a final method virtual later is possible, but making a virtual method final is a breaking change, which became apparent when a method in std.zip that was not designed to be overridable, was changed to final and someone's code broke. To cite Walter from the "final by default" discussion: "The argument for final by default […] is a good one. Even Andrei agrees with it."
  • As shown in FileLogger, bottom-up synchronization spread out over 3 methods is easy to get wrong. In this PR I followed the idea that thread-safety is easy when you lock all mutable state at the top most level. By making the public methods final and have them take the lock, the four protected, overridable methods are safely encapsulated.
  • log and other methods of its kind encapsulate the basic behavior of a Logger as documented (checking its log level and the condition and calling the fault handler), which are customizable through logLevel and faultHandler, and pass their arguments as is into beginLogMsg, which can be overridden. They don't contain anything worth overriding in my opinion, and this change alone should not preclude any use cases.

See also commit mleise@f757f2b and the PR description #2 (comment)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I must have overlooked that

@mleise
Copy link
Author

mleise commented Sep 21, 2014

@burner It is explained in the pull request description.
The related commit was mleise@f757f2b .
Also see the discussion about final-by-default. A library that makes methods virtual must support this actively, because later taking it back will break code. I thought this was also your intention, to overwrite writeLogMsg and/or the other three methods, while the other methods provide forwarding and - in this PR - thread safety.
P.S.: Also this particular method forwards all its arguments into user overridable methods already. The only thing unmodifiable about it is calling fatalHandler, but that is in line with the documentation and the delegate can be set to an empty function.

@mleise
Copy link
Author

mleise commented Sep 21, 2014

@ChronosWS Yes, that mutex protects unsynchronized access to logLevel_ and fatalHandler_, but also the other fields, from outside access and in descendants of Logger. That setup was intentional: Final methods that are called by the logging infrastructure lock out all other threads, so the overridable methods can freely use all object state including logLevel_, fatalHandler_, header and appender. Or to put the cart before the horse: If you don't span the mutex lock over the whole user code, you don't gain single-threaded performance. (I measured std.logger with and without my modifications to perform the same.) But you lose multi-threaded safety. In particular you have to build it from bottom up and here is why that is bad and I decided for pulling it up into those final methods:

There are 12(!) log methods in the Logger class that call beginLogMsg and finishLogMsg, which are not thread-safe.
We could either override all 12 log methods, which have a complex signature and a lot of boiler plate code or just those two plus logMsgPart. The implementations would look very similar if all they want to be is thread safe:

protected void beginLogMsg(string file, int line, string
                          funcName, string prettyFuncName,
                          string moduleName, LogLevel
                          logLevel, Tid threadId, SysTime
                          timestamp, Logger logger) @trusted
{
    // Don't lock anything if we statically disabled logs
    static if (isLoggingActive())
    {
        // no 2nd thread may begin logging
        this.mutex.lock();
        // something may go wrong in super's method
        scope (failure) this.mutex.unlock();

        super.beginLogMsg(file, line, funcName,
                          prettyFuncName, moduleName,
                          logLevel, threadId,
                          timestamp, logger);
    }
}

protected void logMsgPart(const(char)[] msg)
{
    // don't mess with the mutex outside of enabled logging
    static if (isLoggingActive())
       {
        // something may go wrong in super's method
        scope (failure) this.mutex.unlock();
        super.logMsgPart(msg);
    }
}

protected void finishLogMsg()
{
    // don't mess with the mutex outside of enabled logging
    static if (isLoggingActive())
    {
        // in any case unlock the mutex afterwards
        scope (exit) this.mutex.unlock();
        super.finishLogMsg()
    }
}

The above is a minimum requirement to make Logger thread safe and I think it is easy to forget a static-if or the conditions under which to unlock the mutex. It would be nicer if std.logger was already thread safe to begin with, because it is today common to deal with more than one thread.

Also the implementation above has two issues:

  1. It cannot be used as a base class, because as an implementation detail it already encapsulates the use of Appender which will use the GC.
  2. I think it is bad to split up locking and unlocking a mutex over the course of three methods if it is avoidable.

So my conclusion is to be able to have a reusable thread safe Logger, which doesn't depend on the GC, it is necessary to pull the thread safety code one level up into the 12 log methods.

  • Only after that I took a look at FileLogger to find that it missed the case where logMsgPart() needs to unlock the mutex in case of an Exception, reassuring me that my gut feeling that this is too easy to get wrong was correct.
  • It is impractical to extend from such a bottom-up synchronized class, since for every state you add to a derived class you need to add another layer of synchronization before calling e.g. super.writeLogMsg(...);. There is no such thing as a ready to use MutexLogger base class with bottom-up.
  • In other words: Critical sections really should be around the code they protect, not interspersed with it.

Do you think we can merge this now (since it also contains two or three bug fixes in the original code) and then iterate further on this? The discussion is really shifting into another iteration on the whole API and I would feel better if we could first get this iteration merged. All in all the changes are still simple and manageable, and they get thread-safety done now without hurting single-threaded performance.
I think we can even continue discussing here after a merge until the next PR. :-)

@ChronosWS
Copy link

Sounds good to me. I'll have to review your original case against my proposal in more detail (in local code I expect.) Iterating on a later PR if needed sounds like a swell idea.

@mleise
Copy link
Author

mleise commented Sep 22, 2014

Reviewing and testing is fine, but my fear is that you will end up with a counter-proposal and Robert ends up in a deadlock, where he has to decide which version to merge based on empathy more than on software engineering reasons.
I looked at log4j and found they implemented it the same way I did, with more or less global locking and locking on every property:

  public synchronized void setErrorHandler(ErrorHandler eh) {
    if(eh == null) {
      LogLog.warn("You have tried to set a null error-handler.");
    } else {
      this.errorHandler = eh;
      if(this.qw != null) {
    this.qw.setErrorHandler(eh);
      }
    }
  }

(We could copy that assertion on !is null.)

So my goal with this PR and all threading code I write is to deliver 100% sequential correctness, in the transactional meaning of the expression. I think this was the original design goal for log4j as well. For this Logger that means, while the stdlog is in use it is safe to assume it wont change. While a Logger writes back the message, its logLevel will stay fixed until it is done. (And only then can it be set to something that would have filtered out the message.) This is safe and easy to reason about.

But, there have been several reports and questions about why log4j can only one log message in flight at any given time. How can anyone so stupid and place a global lock there? Well, it should be clear now, that it was the technically correct decision, but that it became impractical for some applications. [If we could agree on that part I'd be happy. That would mean we are on the same page now.]

I am open now for cutting back on sequential consistency for the "greater good". But we should only do that where it actually helps implementing some use case. Let's assume in a first step, that I remove the global lock securing stdlog. (I'll prepare a commit soon). We lose the sequential consistency guarantee that stdlog is the same before and after a call to it. But we get rid of the global lock that prevents any parallel logging to the stdlog. So this is an absolutely necessary change.
Anything in the sort of "I need to quickly dump 10.000 messages into a log file and can't wait for the serial write to finish" is a task for message passing to another thread. We already discussed that and removing locks wont buy us anything.
What are the remaining use cases? As far as I can tell it would have to be one log call, that ends in any one of a set of outputs. Each of these outputs is eventually serial by nature, but by delegating incoming log calls to only a specific one at a time they achieve some level of parallelization. E.g. an "requests.log" and a "memory.log". Is that the right mental image?

@mleise
Copy link
Author

mleise commented Sep 22, 2014

Lol, for a moment I was just considering making synchronization a public property. If set to off, the methods that would usually synchronize would to the same job bypassing the mutex.

@mleise
Copy link
Author

mleise commented Sep 22, 2014

Or rather protected, so implementations can decide to only make it public when they are prepared to deal with loosing the "protection from above".

class MyLogger : Logger
{
    override protected @property bool defaultSynchronization()
    {
        // Yes, we want the ctor of Logger to set up the default mutex protection
        // for our `writeLogMsg()` call.
        return true;
    }
    ...
}

I think it is kind of neat: It only creates the Mutex when needed according to the overridable property defaultSynchronization (which could default to true in Logger). So whenever you don't need to deal with managing multiple requests at once you get 100% thread safety by default.

@mleise
Copy link
Author

mleise commented Sep 22, 2014

Ah no, stupid idea. We still cannot read/write delegates atomically, so the mutex has to be used either way.

…andard logging call in an exchange of sequential consinstency for throughput.
(The template constraint is not required in this case.)
@mleise
Copy link
Author

mleise commented Sep 22, 2014

@ChronosWS Sorry, I misunderstood "swell idea" as something negative. (Non-native speaker.) The urban dictionary cleared it up for me. So if you want to pick up and continue from here, I'm fine with that. I just saw my invested time over the weekend on this go down the drain and felt a little uneasy, especially since I also had fixed a few subtle bugs along the way.

@ChronosWS
Copy link

No problem. I happen to be rather passionate about asynchronous and lock-free programming, so any time I see a blocking synchronization point I want to try to eliminate it. My experience has been that every time you can do this you benefit (except for the very few cases where such locks are truly needed.) However, I am not willing to throw a fit to block this PR, and I'm not going to make someone else do it for me. It's best for me to give it a shot (since I need more practice with D anyway) and then submit the PR myself.

Certainly the work you have done here is not wasted, whether or not I can come up with another proposal that satisfies all of the requirements and eliminates the locks. Iteration is the key to making good software quickly - I have no desire to sit here and spin our wheels designing ad infinitum. :)

@ghost
Copy link

ghost commented Sep 22, 2014

My attempt to achieve something similar: https://github.com/klamonte/logger/commit/4391484fcdbfaf69c1776da93a36c792c6ff405c

It has the advantages of not synchronizing any of the functions, by way of removing the appender and header fields from Logger. However it feels a bit "non-D-like" in how it uses MsgRange to effectively act as a Java-style Interface.

More discussion of this is issue is over at burner/logger#18 for anyone interested. I thought at the time that the path forward was to leave Loggers thread-unsafe and just have subclasses implement thread safety if they want to.

Log4D's solution is to use thread-local Loggers that dispatch to global �log4j-like Appenders. and a global configuration that is applied once in each new Thread that calls Log4D.getLogger() or Log4D.getRootLogger(). stdlog is made thread-safe by specific overrides that use header and msgAppender fields keyed to the current Thread. It's a performance waste, but only for stdlog.

@mleise
Copy link
Author

mleise commented Sep 23, 2014

My attempt to achieve something similar: klamonte/logger@ 4391484

@klamonte Of course I'm interested! I implemented point 3. in your list and performance for single-threaded logging stays the same even with the synchronization. The actual formatting and writing are much more heavy weight operations than locking an uncontended mutex. (Windows' critical sections and pthreads' mutexes are "lightweight", in that they can operate in userspace for the uncontended case.)

It has the advantages of not synchronizing any of the functions, by way of removing the appender and header fields from Logger.

Looking at your implementation, it only addresses thread-safety in a rather narrow window (the appender). This PR is about fixing all the threading bugs (and some others I found while testing): http://forum.dlang.org/thread/vbotavcclttrgvzcjjia@forum.dlang.org?page=20#post-20140919073336.6b38ec45:40marco-leise.homedns.org.

However it feels a bit "non-D-like" in how it uses MsgRange to effectively act as a Java-style Interface.

In fact it doesn't work like that. If you don't turn MsgRange from a struct into an interface/class, you wont be able create new types of MsgRanges to remove the dependency on the GC, as Robert already pointed out. And if you do, you allocate a new object no the GC every time you call getMsgRange(). Catch 22.

Log4D's solution is to use thread-local Loggers that dispatch to global log4j-like Appenders. and a global configuration that is applied once in each new Thread that calls Log4D.getLogger() or Log4D.getRootLogger(). stdlog is made thread-safe by specific overrides that use header and msgAppender fields keyed to the current Thread. It's a performance waste, but only for stdlog.

You need to create a global mutex here:

        synchronized {
            auto tlsAppender = tlsMsgAppender[Thread.getThis().toHash()];
            tlsAppender.put(msg);
        }
        […]
        synchronized {
            auto theHeader = tlsHeader[Thread.getThis().toHash()];
            auto tlsAppender = tlsMsgAppender[Thread.getThis().toHash()];
            theHeader.msg = tlsAppender.data;
            this.writeLogMsg(theHeader);
        }

Otherwise you get races when one thread changes the global hash map, while another reads from it:

Synchronized allows only one thread at a time to execute ScopeStatement by using a mutex.
What mutex is used is determined by the Expression. If there is no Expression, then a global mutex is created, one per such synchronized statement. Different synchronized statements will have different global mutexes.

A hash map will call .toHash() on its own. If you pre-hash, the identity of the thread is lost (i.e. potential of two threads getting the same hash and overwriting each other's appenders).

@ghost
Copy link

ghost commented Sep 23, 2014

@mleise Thank you for the feedback, you've seen much more detail than I did. This is great, I'll get it fixed in Log4D.

Do you think it would be fair to say that once this PR is figured out and merged, than any subclass of Logger could override ONLY writeLogMsg() and still get a valid (thread-safe) payload? (Even if the default uses the GC?) That's really the only thing I need for Log4D, my overrides on beginLogMsg/logMsgPart/finishLogMsg are just trying to ensure that races in std.logger haven't corrupted the LogEntry.

@mleise
Copy link
Author

mleise commented Sep 23, 2014

@klamonte Yes, this PR splits Logger's methods in a set of final methods that perform the synchronization and then call the 4 overridable methods in this context. So whatever you override you are already fully protected from races.
But I'm not in the position to say this will stay so. Maybe it can stay a default or we can make your proposal work better. Collecting use cases might be a good idea. Today Manu told me about another type of logger that needs new API.

@burner
Copy link
Owner

burner commented Sep 26, 2014

merged

@mleise mleise closed this Oct 2, 2014
@mleise mleise deleted the logger-thread-safety branch October 2, 2014 18:43
@mleise mleise restored the logger-thread-safety branch October 9, 2014 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants