Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new MultiWriters design that doesn't require parsing levels #8

Merged
merged 7 commits into from Aug 15, 2020

Conversation

suzaku
Copy link
Contributor

@suzaku suzaku commented Aug 13, 2020

In the current implementation, MultiWriter is used as Logger.Writer. Each time a message is written, we need to first parse the level from the input.

I can't help wondering if this can be avoided. In this pull request (still WIP), I want to propose a new design to avoid the level parsing step.

Instead of making MultiWriter an io.Writer, we just make it an optional field for Logger. If it's set, we use it instead of Writer.
When creating an Event, we use the level to decide what writers to use by calling MultiWriter.CombineWriters, which collects the matched writers into one io.Writer implementation called CombinedWriter:

type CombineWriter []io.Writer

func (cw CombineWriter) Write(p []byte) (n int, firstErr error) {
	for _, w := range cw {
		var err error
		n, err = w.Write(p)
		if err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return n, firstErr
}

var _ io.Writer = CombineWriter{}

Please let me know what you think about this design. Thanks.

@phuslu
Copy link
Owner

phuslu commented Aug 13, 2020

It's hard for me answer the question/issue.

Please give me a chance to talk about this log motivation. As we have seen, a traditional logging design has 3 layers.

Logger
     |-------Formatter
                      |---------Writer

In this log, I removed/cancelled the Formatter layer to try make life easier :)

But, the real world is really complicated, I must face to a number of disadvantages of it,
A internal user has complained to me that the ConsoleWriter performance is not good. His case is try simulating glog file output to integrate with legacy components,

var glog = (&log.Logger{
	Level:      log.InfoLevel,
	Caller:     1,
	TimeFormat: "0102 15:04:05.999999",
	Writer: &log.ConsoleWriter{
		Template: template.Must(template.New("").Parse(`{{.Level.First}}{{.Time}} {{.Goid}} {{.Caller}}] {{.Message}}`)),
		Writer: FileWriter{
			Filename:   "main.log",
			MaxSize:    2 * 1024 * 1024 * 1024,
			MaxBackups: 2,
		},
	},
}).Sugar(nil)

I explained why ConsoleWriter performance is not good, because it always do json unmarshal/marshal during writing. The root cause is that we didn't have(lost) the Formatter and Logger already formatted output to JSON, so we have to do some re-formating work in Writers.

I think MultiWriter shares the some root cause in this issue. Do you agree this opinion ?

If It is, so the strategy dispatching leveled output is a trade off to us.

  1. dispatching in Writer, like current MultiWriter does.
    Pros: keep logger simple, users will not be surprised about 2 Writers, one is single and another is combined.
    Cons: level guessing in writer is not serious and the slow fallback(level finding) is really slow. Plus, the ParseLevel callback is a little surprise.
  2. dispatching in Logger, like this PR
    Pros: make MultiWriter simple and efficient.
    Cons: 2 Writer fields in Logger will make non-MultiWriter users confused and bring performance cost for non-MultiWriter scenarios

(Please correct me if there're some mistakes in above comparison)

So what will we choose?
Currently, this log lib is still fresh & young, we don't known the users requirements well, I'd like let future users voting for this.

  1. Keep this PR open
  2. Link this PR to multi.go comments, so users who about this issue can be involved to this discussion.

Let's see what will be happen, make sense?

Repository owner deleted a comment from sonarcloud bot Aug 13, 2020
@suzaku
Copy link
Contributor Author

suzaku commented Aug 14, 2020

Thanks for your explanation.

I agree with you that with an extra "formatter" layer, the problem of writing to console with
a different format can be solved more efficiently.
That said, we might also decide to not provide that perfect solution at all, maybe just provide a ConsoleLogger that's optimized for writing to console.

The problem with MultiWriter is a little similar: we are working at the right abstraction layer. That's why we have to parse for something that's encoded in upper layers to get things work.
My proposed solution tried to pull the solution up to the Logger layer so that we know the level without parsing anything.

As to the cons you listed:

  1. It's just one more pointer field in Logger and one extra if l.MultiWriters != nil { check in header, so the performance cost is very small for use cases that don't need multiple writers;
  2. Adding a field may or may not confuse users, it depends on how we name the field and whether we give it a reasonable default value.

By the way, I've added two benchmarks to see how the new and old solution compared:

goos: darwin
goarch: amd64
pkg: github.com/phuslu/log
BenchmarkOldMultiWriter-12         57088             19269 ns/op             648 B/op          6 allocs/op
BenchmarkOldMultiWriter-12         61423             19219 ns/op             648 B/op          6 allocs/op
BenchmarkOldMultiWriter-12         61670             19632 ns/op             648 B/op          6 allocs/op
PASS
ok      github.com/phuslu/log   4.550s

goos: darwin
goarch: amd64
pkg: github.com/phuslu/log
BenchmarkNewMultiWriter-12         92205             11290 ns/op             744 B/op          9 allocs/op
BenchmarkNewMultiWriter-12         89068             11454 ns/op             744 B/op          9 allocs/op
BenchmarkNewMultiWriter-12        103980             11342 ns/op             744 B/op          9 allocs/op
PASS
ok      github.com/phuslu/log   4.433s

@phuslu
Copy link
Owner

phuslu commented Aug 14, 2020

Sorry, I haven't go through your code carefully.

The performance cost in my thought:

  1. we could introduce a new interface named LeveledWriter, and MultiWriter implements the interface
 type LeveledWriter interface {
	WriteLevel(Level, []byte) (int, error)
}
  1. In Event.Msg(string), a type assertion will detect the implemention,
func (e *Event) Msg(msg string) {
        // blah blah
	if lw, ok := e.w.(LeveledWriter); ok {    // performance hit
		lw.LevelWrite(e.level, e.buf)
	} else {
		e.w.Write(e.buf)
	}
}

This is my initial thought and I think above performance cost does not make sense. (Because it's unfair for non-MultiWriter users)

Now, combine my thought and your PR, maybe we could

  1. still provide a LeveledWriter interface and implements it in MultiWriter
  2. adds a optional LeveledWriter field to Logger,
// A Logger represents an active logging object that generates lines of JSON output to an io.Writer.
type Logger struct {
        // blah blah blah

	// Writer specifies the writer of output. It uses os.Stderr in if empty.
	Writer io.Writer

	// a good document to clarify what LeveledWriter for
	LeveledWriter LeveledWriter
}

Please comment.

@suzaku
Copy link
Contributor Author

suzaku commented Aug 14, 2020

Sure, we can use some more interfaces.

By the way, I find the semantics of InfoWriter a little confusing. If I didn't read the source code of MultiWriter.Write, I would assume it's for logging messages with a level >= InfoLevel , but it turns out to be more like a DebugWriter. Do you think we should keep this behavior or come up with a more clear design.

@phuslu
Copy link
Owner

phuslu commented Aug 14, 2020

Yes, it's undecided. When I adding MultiWriter, the original requirement comes from glog simulation.
As we known, glog has no "Debug" level in its design, it use a sub level(named severity) instead. So Its output files only contains 3 names (Info/Warn/Error). The "Debug" level log will log to InfoWriter by using Info+severity strategy.

I'm not sure how to do a well compromise between this log level desgin and glog design.

  1. Add a DebugWriter field to let "glog users" assign some writer value as InfoWriter to it.
  2. Or, add a document description to clarify DebugLevel log will log to InfoWriter

@suzaku
Copy link
Contributor Author

suzaku commented Aug 14, 2020

Then let's defer the decision making around these levels. I'll keep the new design compatible with the original one.

@phuslu
Copy link
Owner

phuslu commented Aug 14, 2020

A fair(to non-MultiWiter user) draft,

diff --git a/logger.go b/logger.go
index b73696c..f1131af 100644
--- a/logger.go
+++ b/logger.go
@@ -42,6 +42,9 @@ type Logger struct {
 
 	// Writer specifies the writer of output. It uses os.Stderr in if empty.
 	Writer io.Writer
+
+	// a good document needed.
+	LeveledWriter LeveledWriter
 }
 
 const (
@@ -288,6 +291,9 @@ func (l *Logger) header(level Level) *Event {
 	} else {
 		e.w = os.Stderr
 	}
+	if l.LeveledWriter != nil {
+		e.w = wrapLeveledWriter{level, l.LeveledWriter}
+	}
 	// time
 	if l.TimeField == "" {
 		e.buf = append(e.buf, "{\"time\":"...)
diff --git a/multi.go b/multi.go
index 85fe0df..de7c376 100644
--- a/multi.go
+++ b/multi.go
@@ -1,10 +1,13 @@
 package log
 
 import (
-	"bytes"
 	"io"
 )
 
+type LeveledWriter interface {
+	WriteLevel(Level, []byte) (int, error)
+}
+
 // MultiWriter is an io.WriteCloser that log to different writers by different levels
 type MultiWriter struct {
 	// InfoWriter specifies the level large than info logs writes to
@@ -21,16 +24,6 @@ type MultiWriter struct {
 
 	// StderrLevel specifies the minimal level logs it will be writes to stderr
 	StderrLevel Level
-
-	// ParseLevel specifies an optional callback for parse log level in output
-	//    log.DebugLogger.Writer = &log.MultiWriter {
-	//        InfoWriter:   &log.FileWriter{Filename: "main-info.log"},
-	//        ErrorWriter:  &log.FileWriter{Filename: "main-error.log"},
-	//        StderrWriter: &log.ConsoleWriter{ColorOutput: true},
-	//        StderrLevel:  log.ErrorLevel,
-	//        ParseLevel:   func([]byte) log.Level { return log.ParseLevel(string(p[49])) },
-	//  }
-	ParseLevel func([]byte) Level
 }
 
 // Close implements io.Closer, and closes the underlying Writers.
@@ -54,52 +47,10 @@ func (w *MultiWriter) Close() (err error) {
 	return
 }
 
-var levelBegin = []byte(`"level":"`)
-
-// Write implements io.Writer.
-func (w *MultiWriter) Write(p []byte) (n int, err error) {
-	var level = noLevel
-	if w.ParseLevel != nil {
-		level = w.ParseLevel(p)
-	} else {
-		var l byte
-		// guess level by fixed offset
-		lp := len(p)
-		if lp > 49 {
-			_ = p[49]
-			switch {
-			case p[32] == 'Z' && p[42] == ':' && p[43] == '"':
-				l = p[44]
-			case p[32] == '+' && p[47] == ':' && p[48] == '"':
-				l = p[49]
-			}
-		}
-		// guess level by "level":" beginning
-		if l == 0 {
-			if i := bytes.Index(p, levelBegin); i > 0 && i+len(levelBegin)+1 < lp {
-				l = p[i+len(levelBegin)]
-			}
-		}
-		// convert byte to Level
-		switch l {
-		case 't':
-			level = TraceLevel
-		case 'd':
-			level = DebugLevel
-		case 'i':
-			level = InfoLevel
-		case 'w':
-			level = WarnLevel
-		case 'e':
-			level = ErrorLevel
-		case 'f':
-			level = FatalLevel
-		case 'p':
-			level = PanicLevel
-		}
-	}
-
+// WriteLevel implements log.LeveledWriter.
+func (w *MultiWriter) WriteLevel(level Level, p []byte) (n int, err error) {
 	var err1 error
+
 	switch level {
 	case noLevel, PanicLevel, FatalLevel, ErrorLevel:
 		if w.ErrorWriter != nil {
@@ -132,3 +83,13 @@ func (w *MultiWriter) Write(p []byte) (n int, err error) {
 
 	return
 }
+
+type wrapLeveledWriter struct {
+	Level         Level
+	LeveledWriter LeveledWriter
+}
+
+// WriteLevel implements log.LeveledWriter.
+func (w wrapLeveledWriter) Write(p []byte) (int, error) {
+	return w.LeveledWriter.WriteLevel(w.Level, p)
+}

Motivation:

  1. I think we should not introduce a hidden/private field to Logger, because it will break Logger "value semantics"
  2. I also dont want to introduce a LeveledWriter field to Event, because it's unfaire to "non-MultiWriter" users.

Could you take a look this draft? Maybe it's a good new base for this PR.

I admit the draft has deeper interfaces may be hit performance. Perhaps we need a bench for it 😄

@suzaku
Copy link
Contributor Author

suzaku commented Aug 14, 2020

Cool. It seems to integrate better with your existing code. I'll give it a try.

@phuslu
Copy link
Owner

phuslu commented Aug 14, 2020

Thanks! I think the existing code of MultiWriter fields already aligned to glog -alsologtostderr and -stderrthreshold options explicitly , so I'd not like change it.

https://github.com/golang/glog/blob/master/glog.go#L44

@suzaku
Copy link
Contributor Author

suzaku commented Aug 14, 2020

I've only used zap before, so I didn't recognize it.

@phuslu
Copy link
Owner

phuslu commented Aug 14, 2020

yep, we're zerolog and glog heavy user. the two logging libs impact this log deeply.

@phuslu
Copy link
Owner

phuslu commented Aug 14, 2020

I just found some similar requirements from logrus and zap

sirupsen/logrus#618
sirupsen/logrus#1068
uber-go/zap#581
uber-go/zap#586
uber-go/zap#718

@suzaku
Copy link
Contributor Author

suzaku commented Aug 15, 2020

I've changed the implementation to use the interface you shown.

I noticed that I've made a off-by-one mistake when running benchmarks for the new implementation. It's actually a little slower than the original implementation:

goos: darwin
goarch: amd64
pkg: github.com/phuslu/log
BenchmarkNewMultiWriter-12         56396             19657 ns/op             744 B/op          9 allocs/op
BenchmarkNewMultiWriter-12         60616             19556 ns/op             744 B/op          9 allocs/op
BenchmarkNewMultiWriter-12         60768             19912 ns/op             744 B/op          9 allocs/op
PASS
ok      github.com/phuslu/log   4.489s

@phuslu phuslu changed the title WIP, Proposal: A new MultiWriters design that doesn't require parsing levels A new MultiWriters design that doesn't require parsing levels Aug 15, 2020
@phuslu phuslu merged commit cba483d into phuslu:master Aug 15, 2020
@phuslu
Copy link
Owner

phuslu commented Aug 15, 2020

Thanks and Cheers ! 🍻

@phuslu
Copy link
Owner

phuslu commented Aug 15, 2020

Will push/release a new tag soon :D

@phuslu
Copy link
Owner

phuslu commented Aug 19, 2020

Currently, ConsoleWriter performance is acceptable (#10), I have received a mixed usecase as below

var glog = (&log.Logger{
	Level:      log.InfoLevel,
	Caller:     1,
	TimeFormat: "0102 15:04:05.999999",
	Writer: &log.ConsoleWriter{
		Template: template.Must(template.New("").Parse(`{{.Level.First}}{{.Time}} {{.Goid}} {{.Caller}}] {{.Message}}`)),
		Writer: &log.MultiWriter{
			InfoWriter: &log.FileWriter{Filename: "server.INFO", MaxSize: 1610612736, MaxBackups: 2},
			WarnWriter: &log.FileWriter{Filename: "server.WARNING", MaxSize: 1610612736, MaxBackups: 2},
			InfoWriter: &log.FileWriter{Filename: "server.ERROR", MaxSize: 1610612736, MaxBackups: 2},
			StderrWriter: os.Stderr,
			StderrLevel: log.WarnLevel,
			ParseLevel: func(p []data) log.Level { return log.ParseLevel(string(p[0])); },
		},
	},
}).Sugar(nil)

This is a completely glog simulation(includes format & logfile) but a little harder to adapt to current design. 😞

@suzaku
Copy link
Contributor Author

suzaku commented Aug 19, 2020

Currently, ConsoleWriter performance is acceptable (#10), I have received a mixed usecase as below

var glog = (&log.Logger{
	Level:      log.InfoLevel,
	Caller:     1,
	TimeFormat: "0102 15:04:05.999999",
	Writer: &log.ConsoleWriter{
		Template: template.Must(template.New("").Parse(`{{.Level.First}}{{.Time}} {{.Goid}} {{.Caller}}] {{.Message}}`)),
		Writer: &log.MultiWriter{
			InfoWriter: &log.FileWriter{Filename: "server.INFO", MaxSize: 1610612736, MaxBackups: 2},
			WarnWriter: &log.FileWriter{Filename: "server.WARNING", MaxSize: 1610612736, MaxBackups: 2},
			InfoWriter: &log.FileWriter{Filename: "server.ERROR", MaxSize: 1610612736, MaxBackups: 2},
			StderrWriter: os.Stderr,
			StderrLevel: log.WarnLevel,
			ParseLevel: func(p []data) log.Level { return log.ParseLevel(string(p[0])); },
		},
	},
}).Sugar(nil)

This is a completely glog simulation(includes format & logfile) but a little harder to adapt to current design. 😞

Cool, I'll take a look later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants