The regex is different #87

Closed
LeeWangWang opened this issue Oct 25, 2022 · 3 comments

Comments

@LeeWangWang

The regex in Drain_demo.py is

regex = [
    r'blk_(|-)[0-9]+',  # block id
    r'(/|)([0-9]+\.){3}[0-9]+(:[0-9]+|)(:|)',  # IP
    r'(?<=[^A-Za-z0-9])(\-?\+?\d+)(?=[^A-Za-z0-9])|[0-9]+$',  # Numbers
]

but the regex in Drain_benchmark.py is

'regex': [r'blk_-?\d+', r'(\d+\.){3}\d+(:\d+)?']

I wonder why they differ.

@zhujiem
Member

zhujiem commented Oct 25, 2022

The demo file is for testing only. Please refer to the benchmark file for the accuracy numbers.
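
To make the difference between the two lists concrete, here is a minimal sketch that applies each list to one log line by replacing every match with a wildcard. The patterns are the ones quoted in this issue, with regex escapes such as `\.` written out explicitly. The `mask` helper, the `<*>` token, and the sample line are assumptions made for illustration, not code taken from the repository.

```python
import re

# Patterns as quoted in this issue (escapes written out explicitly).
DEMO_REGEX = [
    r'blk_(|-)[0-9]+',                                        # block id
    r'(/|)([0-9]+\.){3}[0-9]+(:[0-9]+|)(:|)',                 # IP
    r'(?<=[^A-Za-z0-9])(\-?\+?\d+)(?=[^A-Za-z0-9])|[0-9]+$',  # numbers
]
BENCHMARK_REGEX = [r'blk_-?\d+', r'(\d+\.){3}\d+(:\d+)?']


def mask(line, patterns, token='<*>'):
    """Hypothetical helper: replace each pattern's matches, in order, with a token."""
    for pattern in patterns:
        line = re.sub(pattern, token, line)
    return line


# Made-up HDFS-style line, used only to show the effect of each list.
sample = 'Received block blk_-1608999687919862906 of size 91178 from /10.250.19.102'

demo_masked = mask(sample, DEMO_REGEX)
bench_masked = mask(sample, BENCHMARK_REGEX)

print(demo_masked)   # Received block <*> of size <*> from <*>
print(bench_masked)  # Received block <*> of size 91178 from /<*>
```

On this sample the demo list also masks the bare number and the leading slash of the address, while the benchmark list leaves both in place, so the two lists hand different token sequences to the parser.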

@LeeWangWang
Author

When I run the benchmark on the Windows dataset, the results differ from your Windows_2k.log_templates.csv.

This is mine:
(1) Loaded Servicing Stack v6.1.7601.23505 with Core: C:\Windows\winsxs\amd64_microsoft-windows-servicingstack_31bf3856ad364e35_6.1.7601.23505_none_681aa442f6fed7f0\cbscore.dll"
(2) <*> WcpInitialize (wcp.dll version 0.0.0.6) called (stack <*>

This is yours:
(1) Loaded Servicing Stack <*> with Core: <*>\cbscore.dll
(2) <*>@<*>/<*>/<*>:<*>:<*>:<*>.<*> WcpInitialize (wcp.dll version <*>) called (stack @<*>)
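
One way to check how two such templates relate is to turn the more general one into a regular expression and test whether it matches the more specific one. The sketch below does this for the first pair above, assuming `<*>` is the wildcard token and dropping the trailing quotation mark from my line; `template_to_regex` is a hypothetical helper, not part of the benchmark code.

```python
import re


def template_to_regex(template, wildcard='<*>'):
    """Hypothetical helper: compile a template so each wildcard matches any text."""
    literal_parts = [re.escape(part) for part in template.split(wildcard)]
    return re.compile('^' + '.*?'.join(literal_parts) + '$')


reference = r'Loaded Servicing Stack <*> with Core: <*>\cbscore.dll'
mine = (r'Loaded Servicing Stack v6.1.7601.23505 with Core: '
        r'C:\Windows\winsxs\amd64_microsoft-windows-servicingstack_'
        r'31bf3856ad364e35_6.1.7601.23505_none_681aa442f6fed7f0\cbscore.dll')

# True when every literal part of the first template appears, in order, in the second.
reference_covers_mine = template_to_regex(reference).match(mine) is not None
mine_covers_reference = template_to_regex(mine).match(reference) is not None

print(reference_covers_mine)  # True
print(mine_covers_reference)  # False
```

Here the reference template covers mine but not the other way round, which suggests my run left the version string and the path unmasked rather than producing an unrelated template.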

@JinYang88
Member

Different methods can produce different results. May I know which algorithm you are using?

@xpai xpai closed this as completed Sep 2, 2023