Display original log with results? #8

Open
jmlane8 opened this issue May 28, 2020 · 6 comments

Comments

@jmlane8

jmlane8 commented May 28, 2020

Do you have anything that displays the original log records, their ground-truth labels (normal or abnormal), and the corresponding logdeep predictions?

@cherishwsx

I can help with this one. :)
I think you can find the original log data in loghub. The HDFS (specifically HDFS_1) data and the BGL data are there, along with their labels, in the corresponding folders. For the logdeep predictions, do you mean the evaluation results? Those are shown in the Benchmark results section of README.md.

@cherishwsx

Forgot to put the link. loghub

@jmlane8
Author

jmlane8 commented Jun 3, 2020

Thank you. I wanted to run the abnormal and normal predictions, and be able to point back to the original unstructured log records, and say: the neural network picked up something abnormal here.

@cherishwsx

> Thank you. I wanted to run the abnormal and normal predictions, and be able to point back to the original unstructured log records, and say: the neural network picked up something abnormal here.

I think you can print the block_id (the event sequence identifier in the HDFS dataset) or the row number whenever an abnormal record is detected. Looking at the inference script, predict.py, might help.
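Once the flagged block_ids are printed, pointing back to the raw records is mostly a lookup. Below is a minimal sketch of that lookup, not part of logdeep: it assumes the raw HDFS lines mention their block as `blk_` followed by an optionally negative integer, and the helper names (`group_lines_by_block`, `raw_lines_for`) and the sample lines are hypothetical, made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: HDFS log lines reference a block as "blk_" plus an
# optionally negative integer, e.g. "blk_-1608".
BLOCK_ID = re.compile(r"blk_-?\d+")


def group_lines_by_block(lines):
    """Map each block_id to the (line_number, raw_line) pairs that mention it."""
    by_block = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        # A line may mention the same block twice; count it once per line.
        for block_id in set(BLOCK_ID.findall(line)):
            by_block[block_id].append((line_no, line.rstrip("\n")))
    return by_block


def raw_lines_for(flagged_block_ids, by_block):
    """Return the raw records behind each sequence flagged as abnormal."""
    return {b: by_block.get(b, []) for b in flagged_block_ids}


# Made-up lines in the HDFS style, for illustration only.
sample = [
    "081109 203518 143 INFO dfs.DataNode$DataXceiver: Receiving block blk_-1608 src: /10.0.0.1:54106 dest: /10.0.0.1:50010",
    "081109 203519 145 INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_7503 terminating",
    "081109 203520 146 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.0.0.1:50010 is added to blk_-1608 size 91178",
]

by_block = group_lines_by_block(sample)
flagged = ["blk_-1608"]  # hypothetical: block_ids printed by the prediction step

for block_id, records in raw_lines_for(flagged, by_block).items():
    print(f"{block_id}: {len(records)} raw record(s)")
    for line_no, raw in records:
        print(f"  line {line_no}: {raw}")
```

Keeping the line numbers next to the raw text makes it possible to open the original file at the right place; for a large log file the same grouping can be built by iterating over the open file object instead of a list.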

@cherishwsx

When I was thinking about the "tracking back to raw log records" problem, it seemed to me that there is no way to track anomalies record by record (which would be more of a streaming analysis), since we train and predict on event sequences rather than on individual log records/single events. So I guess we can only know which event sequence is abnormal, right? And that makes it more suitable for batch log analysis?
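Because the verdict is per sequence, a side-by-side view of ground truth and prediction naturally has one row per block_id rather than one row per log line. A rough sketch of such a report follows; the function name `sequence_report`, the two input dictionaries, and the label strings "Normal"/"Anomaly" are all assumptions for illustration, not logdeep's actual output format.

```python
def sequence_report(ground_truth, predictions):
    """Build one row per block_id: (block_id, truth, predicted, agree)."""
    rows = []
    for block_id in sorted(ground_truth):
        truth = ground_truth[block_id]
        # A sequence with no prediction is reported as "Unknown".
        predicted = predictions.get(block_id, "Unknown")
        rows.append((block_id, truth, predicted, truth == predicted))
    return rows


# Hypothetical inputs: labels from the dataset's label file, and
# per-sequence verdicts collected while running the prediction step.
ground_truth = {"blk_-1608": "Anomaly", "blk_7503": "Normal"}
predictions = {"blk_-1608": "Anomaly", "blk_7503": "Anomaly"}

report = sequence_report(ground_truth, predictions)
for block_id, truth, predicted, agree in report:
    marker = "ok" if agree else "MISMATCH"
    print(f"{block_id}\ttruth={truth}\tpredicted={predicted}\t{marker}")
```

Each row describes a whole event sequence, so a mismatch says which block to inspect, not which individual log line triggered the verdict.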

Correct me if I'm wrong and any ideas are welcome! @donglee-afar

@d0ng1ee
Owner

d0ng1ee commented Jun 11, 2020

You are right, @cherishwsx
I think if you understand the pipeline of log anomaly detection, this is a very simple job ...
