Improve MatchTemplate to pick best template instead of first match #11

@STRRL

Summary

MatchTemplate currently returns the first template whose wildcard tokens fit a log line. When Drain produces overlapping templates (e.g. two patterns that both match the same line), this first-match rule can assign lines to the wrong cluster depending on template iteration order.

Since BuildWorkspace relies on MatchTemplate for counts and samples, overlapping templates can produce incorrect summary/error statistics.
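To make the order dependence concrete, here is a minimal sketch of a first-match rule. It is not the project's actual `MatchTemplate`: the `<*>` wildcard token, the equal-length positional matching, and the `matches` and `firstMatch` helpers are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcard is an assumed placeholder token; the real marker may differ.
const wildcard = "<*>"

// matches reports whether the template fits the line: same token count, and
// every non-wildcard template token equals the line token at that position.
func matches(template, line []string) bool {
	if len(template) != len(line) {
		return false
	}
	for i, tok := range template {
		if tok != wildcard && tok != line[i] {
			return false
		}
	}
	return true
}

// firstMatch mimics a first-match rule: it returns the first template that
// fits the line (joined with spaces), or "" when none fits.
func firstMatch(templates [][]string, line []string) string {
	for _, t := range templates {
		if matches(t, line) {
			return strings.Join(t, " ")
		}
	}
	return ""
}

func main() {
	line := strings.Fields("connection to db failed")
	general := strings.Fields("connection to <*> <*>")
	specific := strings.Fields("connection to <*> failed")

	// Same line, same templates; only the iteration order changes.
	fmt.Println(firstMatch([][]string{general, specific}, line)) // connection to <*> <*>
	fmt.Println(firstMatch([][]string{specific, general}, line)) // connection to <*> failed
}
```

Under these assumptions the same line lands in a different cluster purely because of template order, which is how counts and samples derived from the match could drift.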

Proposed Improvement

Instead of returning the first match, score all matching templates and pick the best one. A simple heuristic: prefer the template with the fewest wildcards (most specific match). Ties could be broken by template count (prefer higher-count clusters).
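The heuristic could look roughly like the sketch below. Everything here is hypothetical: the `Template` type, its `Tokens` and `Count` fields, the `<*>` wildcard token, and the `bestMatch` name are stand-ins, and the positional matching is an assumption rather than the current implementation.

```go
package main

import "fmt"

// wildcard is an assumed placeholder token; the real marker may differ.
const wildcard = "<*>"

// Template is a hypothetical stand-in for a Drain cluster.
type Template struct {
	Tokens []string
	Count  int // number of lines already assigned to this cluster
}

// score reports whether the template fits the line and, if so, how many
// wildcard tokens it contains (fewer wildcards means a more specific match).
func score(t Template, line []string) (wildcards int, ok bool) {
	if len(t.Tokens) != len(line) {
		return 0, false
	}
	for i, tok := range t.Tokens {
		switch {
		case tok == wildcard:
			wildcards++
		case tok != line[i]:
			return 0, false
		}
	}
	return wildcards, true
}

// bestMatch scans every template and keeps the one with the fewest
// wildcards, breaking ties by the higher Count. If a tie remains, the
// earliest template in the slice is kept.
func bestMatch(templates []Template, line []string) (Template, bool) {
	best, bestWild := -1, 0
	for i, t := range templates {
		w, ok := score(t, line)
		if !ok {
			continue
		}
		if best == -1 || w < bestWild ||
			(w == bestWild && t.Count > templates[best].Count) {
			best, bestWild = i, w
		}
	}
	if best == -1 {
		return Template{}, false
	}
	return templates[best], true
}

func main() {
	line := []string{"connection", "to", "db", "failed"}
	templates := []Template{
		{Tokens: []string{"connection", "to", "<*>", "<*>"}, Count: 10},
		{Tokens: []string{"connection", "to", "<*>", "failed"}, Count: 3},
	}
	if t, ok := bestMatch(templates, line); ok {
		fmt.Println(t.Tokens, t.Count) // [connection to <*> failed] 3
	}
}
```

With this rule the result no longer depends on iteration order unless two matching templates have both the same wildcard count and the same count; that remaining case would need a further deterministic tie-breaker.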

Reference

Originated from PR review: #9 (comment)
