Skip to content

Optimize mtlog performance to achieve near-zero allocations #52

@willibrandon

Description

@willibrandon

Description:
Implement performance optimizations to reduce allocations for property-based logging. While mtlog already achieves zero allocations for simple string logging (76.87 ns/op), operations with properties currently allocate 6-12 times per call. The goal is to achieve near-zero allocations for common logging patterns.

Current Performance:

| Operation | Time | Allocations | Memory |
|-----------|------|-------------|--------|
| Simple String | 76.87 ns | 0 allocs ✓ | 0 B |
| Two Properties | 515.8 ns | 6 allocs | 504 B |
| Ten Properties | 1891 ns | 12 allocs | 1704 B |
| With Context | 450.8 ns | 3 allocs | 416 B |
| Complex Object | 841.6 ns | 13 allocs | 804 B |
| Console Output | 1980 ns | 29 allocs | 1104 B |
| Typical Web Request | 701.4 ns | 8 allocs | 568 B |
| Typical Error | 637.5 ns | 8 allocs | 552 B |

Performance Issues:

  • Property extraction allocates once per property plus overhead (6 allocs for 2 properties, 12 for 10)
  • Console output performs 29 allocations per event
  • Template parsing happens at runtime for each log call
  • New property map allocated for every log event

Proposed Optimizations:

1. Template Pre-Compilation

// Pre-compile templates with property extraction plan
type CompiledTemplate struct {
    segments    []segment
    propertyMap map[string]int  // Property name -> argument index
    argCount    int
}

// Global template cache
var templateCache = &TemplateCompiler{
    maxSize: 10000, // Bounded to prevent memory leaks
}

2. Object Pooling

// Separate pools for events and buffers
var eventPool = sync.Pool{
    New: func() any {
        return &LogEvent{
            Properties: make(map[string]any, 16),
        }
    },
}

// Stack-allocated properties for common cases
type PropertySet struct {
    fixed    [16]Property  // No allocation for ≤16 properties
    overflow map[string]any // Only if needed
}

3. Zero-Allocation Formatting

// Direct strconv operations instead of fmt.Sprintf
func formatInt(w io.Writer, n int64) {
    b := strconv.AppendInt(scratch[:0], n, 10)
    w.Write(b)
}

4. Console Sink Optimization

// Pre-compute ANSI sequences at initialization
func NewConsoleSink(w io.Writer) *ConsoleSink {
    return &ConsoleSink{
        writer: w,
        buffer: bytes.NewBuffer(make([]byte, 0, 512)),
        colors: map[LogLevel][]byte{
            VerboseLevel: []byte("\033[37m"),  // Pre-computed
            DebugLevel:   []byte("\033[36m"),
            // ...
        },
    }
}

Target Performance:

| Operation | Current → Target | Allocation Reduction |
|-----------|-----------------|---------------------|
| Two Properties | 6 allocs → 1 alloc | 83% reduction |
| Ten Properties | 12 allocs → 2 allocs | 83% reduction |
| Console Output | 29 allocs → 3 allocs | 90% reduction |
| Typical Web Request | 8 allocs → 1 alloc | 87% reduction |

Implementation Phases:

  1. Template pre-compilation with decoupled rendering
  2. Object pooling with PropertySet for stack allocation
  3. Bounded string interning with LRU eviction
  4. Console sink optimization with pre-computed ANSI codes
  5. Zero-allocation number formatting using strconv

Memory Considerations:

  • Fixed overhead: ~200KB for caches and pools
  • Bounded caches with LRU eviction to prevent memory leaks
  • String intern cache limited to 10,000 entries
  • Template cache with random eviction at capacity

Compatibility:

  • 100% backward compatible
  • No API changes required
  • All optimizations transparent to users
  • Optional performance tuning via options

Testing Requirements:

  • Benchmark suite for each optimization phase
  • Memory leak detection tests
  • Cache eviction stress tests
  • Concurrent access verification
  • Hit rate monitoring for all caches

Use Cases:

  • High-throughput services needing minimal GC pressure
  • Latency-sensitive applications
  • Systems with strict memory constraints
  • Cloud-native applications where every allocation counts

Success Criteria:

  • Achieve ≤2 allocations for common property logging
  • Reduce console output allocations by 90%
  • Maintain zero allocations for simple string logging
  • Keep memory overhead under 500KB total
  • Pass all existing tests without modification

Progress Tracking

Benchmark Current Performance Target Status Progress
SimpleString 76.87 ns, 0 allocs ✅ Maintain - Baseline
TwoProperties 515.8 ns, 6 allocs <300 ns, 1 alloc 🔄 In Progress -
TenProperties 1891 ns, 12 allocs <1000 ns, 2 allocs ⏳ Planned -
WithContext 450.8 ns, 3 allocs <300 ns, 1 alloc ⏳ Planned -
ComplexObject 841.6 ns, 13 allocs <500 ns, 3 allocs ⏳ Planned -
ConsoleOutput 1980 ns, 29 allocs <1000 ns, 3 allocs ⏳ Planned -
TypicalWebRequest 701.4 ns, 8 allocs <400 ns, 1 alloc ⏳ Planned -
TypicalError 637.5 ns, 8 allocs <400 ns, 1 alloc ⏳ Planned -

Optimization Status

  • 🔄 Template Pre-compilation: Started - targeting TwoProperties/TenProperties
  • Object Pooling: Planned - for property maps and events
  • PropertySet: Planned - stack allocation for ≤16 properties
  • Console Optimization: Planned - pre-computed ANSI, buffer reuse
  • Zero-alloc Formatting: Planned - strconv instead of fmt
  • String Interning: Planned - for property names

Legend: ✅ Complete | 🔄 In Progress | ⏳ Planned

Metadata

Metadata

Assignees

Labels

enhancementNew feature or requestperformancePerformance improvements and optimizations. Zero-allocation promise territory. 🚀

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions