[Algorithm] Drain3 raw log parsing potential enhancement #9

Open
2 tasks
Superskyyy opened this issue Jul 18, 2022 · 2 comments
Labels: Algorithm (The work is on the algorithm side), analysis: log, enhancement (New feature or request), upstream (An issue that could be submitted to upstream repos first)

@Superskyyy
Member

Superskyyy commented Jul 18, 2022

Background: Drain log parsing works best when it ingests only the log content, meaning we trim the rest with a simple regex or rule. Slicing the content accurately from

Dec 10 07:28:08 LabSZ sshd[24247]: Received disconnect from 112.95.230.3: 11: Bye Bye [preauth]
to the content below requires prior knowledge of the delimiter, which I am 99% sure users won't care to provide. So we need to adapt Drain to be more robust.

Received disconnect from 112.95.230.3: 11: Bye Bye [preauth]
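
For reference, the prior-knowledge slicing described above could look roughly like the sketch below. The header layout (month, day, time, host, process[pid], then a colon) is an assumption taken from this one sample line, and the names `SYSLOG_HEADER` and `slice_content` are hypothetical; this is exactly the kind of rule users would otherwise have to supply.

```python
import re

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> <process>[<pid>]: <content>"
SYSLOG_HEADER = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<process>[^\s\[]+)\[(?P<pid>\d+)\]:\s+(?P<content>.*)$"
)


def slice_content(line: str) -> str:
    """Return only the message content, or the whole line if the header does not match."""
    match = SYSLOG_HEADER.match(line)
    return match.group("content") if match else line


raw = "Dec 10 07:28:08 LabSZ sshd[24247]: Received disconnect from 112.95.230.3: 11: Bye Bye [preauth]"
print(slice_content(raw))
```

A line that does not follow the assumed header falls through unchanged, which is where a fixed rule like this stops helping.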
I found a potentially(?) major enhancement to the algorithm for RAW log parsing.

The current test, shown below, yields much better clustering than the original unreadable results (over-convergence), but it also requires a tiny adjustment to the global similarity threshold. So the idea is that each cluster should have its own standard for accepting new templates, rather than a global constraint. (This is mentioned in the updated version of the research paper; it is not my invention.)
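
To make the per-cluster idea concrete, here is a minimal sketch. It is not Drain3's API and not the patch itself: the `Cluster` class, its `sim_th` field, and the position-wise `similarity` function are hypothetical stand-ins in the spirit of Drain's token comparison, and the threshold values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

WILDCARD = "<*>"


@dataclass
class Cluster:
    template: List[str]
    sim_th: float  # hypothetical per-cluster acceptance threshold


def similarity(template: List[str], tokens: List[str]) -> float:
    """Fraction of positions where a non-wildcard template token equals the log token."""
    if len(template) != len(tokens) or not tokens:
        return 0.0
    same = sum(1 for t, tok in zip(template, tokens) if t != WILDCARD and t == tok)
    return same / len(tokens)


def best_match(clusters: List[Cluster], tokens: List[str]) -> Optional[Cluster]:
    """Return the most similar cluster among those whose own threshold accepts the line."""
    best, best_sim = None, -1.0
    for cluster in clusters:
        sim = similarity(cluster.template, tokens)
        if sim >= cluster.sim_th and sim > best_sim:
            best, best_sim = cluster, sim
    return best


clusters = [
    Cluster("Connection closed by <IP> [preauth]".split(), sim_th=0.9),  # strict cluster
    Cluster("Invalid user <*> from <IP>".split(), sim_th=0.4),           # lenient cluster
]
accepted = best_match(clusters, "Invalid user admin from <IP>".split())
rejected = best_match(clusters, "Connection reset by <IP> [preauth]".split())
```

In this sketch the second line is 0.8 similar to the strict cluster, so a global threshold of 0.4 would have folded it in, while the cluster's own 0.9 threshold rejects it and a new cluster would be created instead.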

I will attempt to submit a patch to the upstream IBM/Drain3 repo and see if it's accepted.

BUT! To get the most accurate results, we still need to implement a dynamic threshold calculation and a cluster merger for the similarity function:

  • Dynamic Threshold Calculator
  • Log Group Merger
ID=7     : size=177730    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=8     : size=141046    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*> <*>
ID=6     : size=122118    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*>
ID=2     : size=120488    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*>
ID=3     : size=35308     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*>
ID=5     : size=20241     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=1     : size=18909     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=4     : size=15645     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=9     : size=1331      : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*> <*> <*>
ID=11    : size=932       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*> [preauth]
ID=15    : size=497       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=10    : size=493       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> [preauth]
ID=13    : size=154       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*> <*> <*>
ID=18    : size=108       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*>
ID=20    : size=92        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=14    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=12    : size=14        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> <*>
ID=16    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=17    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=19    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>

Threshold 0.4 (Default, not best)

ID=10    : size=140950    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> password for <*> from <IP> port <NUM> ssh2
ID=9     : size=140701    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=7     : size=68958     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Connection closed by <IP> [preauth]
ID=8     : size=46608     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> [preauth]
ID=14    : size=37963     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM service(sshd) ignoring max retries; <NUM> > <NUM>
ID=12    : size=37298     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Too many authentication failures for <*> [preauth]
ID=13    : size=37029     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=11    : size=36967     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] message repeated <NUM> times <Averylonglist[]>
ID=6     : size=20241     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=4     : size=19852     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) check pass; user unknown
ID=1     : size=18909     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=2     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> from <IP>
ID=3     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> [preauth]
ID=5     : size=14356     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=18    : size=1289      : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=24    : size=952       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Read from socket failed Connection reset by peer [preauth]
ID=19    : size=930       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> No more user authentication methods available. [preauth]
ID=15    : size=838       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Did not receive identification string from <IP>
ID=17    : size=592       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Closed due to user request. [preauth]
ID=32    : size=497       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=20    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session opened for user <*> by (uid=<NUM>)
ID=22    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session closed for user <*>
ID=16    : size=177       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException Auth <*> [preauth]
ID=33    : size=108       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> [preauth]
ID=63    : size=92        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=47    : size=81        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Client disconnecting normally [preauth]
ID=52    : size=60        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnect [preauth]
ID=21    : size=34        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnected by user
ID=29    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> <*> [preauth]
ID=30    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=38    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user from <IP>
ID=39    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user [preauth]
ID=40    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user from <IP> port <NUM> ssh2
ID=31    : size=11        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> admin from <IP>
ID=35    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> <*> from <IP> port <NUM>
ID=37    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=41    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> from <IP> port <NUM>
ID=34    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal no hostkey alg [preauth]
ID=49    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error connect to <IP> port <NUM> failed.
ID=57    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnected by user [preauth]
ID=28    : size=5         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user myapn cen from <IP>
ID=46    : size=4         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user ftp <*> from <IP>
ID=56    : size=4         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> Authentication cancelled by user. [preauth]
ID=23    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Write failed Connection reset by peer [preauth]
ID=48    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=54    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on <IP> port <NUM>.
ID=55    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on port <NUM>.
ID=61    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Disconnect requested by Windows SSH Client.
ID=25    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> User request [preauth]
ID=36    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user back newshops from <IP>
ID=45    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user bash spm from <IP>
ID=51    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> org.vngx.jsch.userauth.AuthCancelException User authentication canceled by user [preauth]
ID=59    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user lcap oracle from <IP>
ID=60    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user zxdbm epg from <IP>
ID=26    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad packet length <NUM>. [preauth]
ID=27    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Packet corrupt [preauth]
ID=42    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user ram k from <IP>
ID=43    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> java.net.SocketTimeoutException Read timed out [preauth]
ID=44    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM>
ID=50    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Corrupted MAC on input. [preauth]
ID=53    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user sugon test from <IP>
ID=58    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException reject HostKey <IP> [preauth]
ID=62    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=64    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] syslogin perform logout logout() returned an error

Threshold 0.3: compared to the baseline result below, this looks almost perfect

--- Done processing file in 45.46 sec. Total of 655147 lines, rate 14411.4 lines/sec, 53 clusters
ID=10    : size=140950    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> password for <*> from <IP> port <NUM> ssh2
ID=16    : size=140668    : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=7     : size=68958     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Connection closed by <IP> [preauth]
ID=8     : size=46608     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> [preauth]
ID=13    : size=37963     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM service(sshd) ignoring max retries; <NUM> > <NUM>
ID=12    : size=37298     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Too many authentication failures for <*> [preauth]
ID=11    : size=36967     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] message repeated <NUM> times <Averylonglist[]>
ID=47    : size=36803     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=6     : size=20241     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=4     : size=19852     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) check pass; user unknown
ID=1     : size=18909     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=2     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> from <IP>
ID=3     : size=14551     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> [preauth]
ID=5     : size=14356     : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=18    : size=1289      : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=24    : size=952       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Read from socket failed Connection reset by peer [preauth]
ID=19    : size=932       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*> [preauth]
ID=14    : size=838       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Did not receive identification string from <IP>
ID=17    : size=592       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Closed due to user request. [preauth]
ID=31    : size=497       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=9     : size=259       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] <*> <*> <*> authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> user=root
ID=20    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session opened for user <*> by (uid=<NUM>)
ID=22    : size=182       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] pam unix(sshd session) session closed for user <*>
ID=15    : size=177       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException Auth <*> [preauth]
ID=32    : size=108       : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> [preauth]
ID=52    : size=92        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=42    : size=87        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> <*> <*> <*> [preauth]
ID=46    : size=60        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnect [preauth]
ID=21    : size=34        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> disconnected by user
ID=28    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user <*> <*> from <IP>
ID=29    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user <*> <*> [preauth]
ID=30    : size=30        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=36    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Invalid user from <IP>
ID=37    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] input userauth request invalid user [preauth]
ID=38    : size=13        : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Failed <*> for invalid user from <IP> port <NUM> ssh2
ID=34    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> <*> from <IP> port <NUM>
ID=35    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=39    : size=7         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification <*> from <IP> port <NUM>
ID=33    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal no hostkey alg [preauth]
ID=40    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> <*> <*> <*> <*> [preauth]
ID=44    : size=6         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error connect to <IP> port <NUM> failed.
ID=23    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] fatal Write failed Connection reset by peer [preauth]
ID=43    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=48    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on <IP> port <NUM>.
ID=49    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Server listening on port <NUM>.
ID=50    : size=3         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM> Disconnect requested by Windows SSH Client.
ID=25    : size=2         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] error Received disconnect from <IP> <NUM> User request [preauth]
ID=26    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad packet length <NUM>. [preauth]
ID=27    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Disconnecting Packet corrupt [preauth]
ID=41    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Received disconnect from <IP> <NUM>
ID=45    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Corrupted MAC on input. [preauth]
ID=51    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=53    : size=1         : <Month> <NUM> <NUM> <NUM> <NUM> LabSZ sshd[<NUM>] syslogin perform logout logout() returned an error

Original version without my patch, but sliced with prior knowledge, threshold 0.4 default

--- Done processing file in 25.76 sec. Total of 655147 lines, rate 25432.1 lines/sec, 51 clusters
ID=10    : size=140768    : Failed password for <*> from <IP> port <NUM> ssh2
ID=9     : size=140701    : pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=7     : size=68958     : Connection closed by <IP> [preauth]
ID=8     : size=46642     : Received disconnect from <IP> <NUM> <*> <*> <*>
ID=14    : size=37963     : PAM service(sshd) ignoring max retries; <NUM> > <NUM>
ID=12    : size=37298     : Disconnecting Too many authentication failures for <*> [preauth]
ID=13    : size=37029     : PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*> <*>
ID=11    : size=36967     : message repeated <NUM> times <Averylonglist[]>
ID=6     : size=20241     : Failed <*> for invalid user <*> from <IP> port <NUM> ssh2
ID=4     : size=19852     : pam unix(sshd auth) check pass; user unknown
ID=1     : size=18909     : reverse mapping checking getaddrinfo for <*> [<IP>] failed - POSSIBLE BREAK-IN ATTEMPT!
ID=2     : size=14551     : Invalid user <*> from <IP>
ID=3     : size=14551     : input userauth request invalid user <*> [preauth]
ID=5     : size=14356     : pam unix(sshd auth) authentication failure; logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=18    : size=1289      : PAM <NUM> more authentication <*> logname= uid=<NUM> euid=<NUM> tty=ssh ruser= <*>
ID=24    : size=952       : fatal Read from socket failed Connection reset by peer [preauth]
ID=19    : size=932       : error Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*> [preauth]
ID=15    : size=838       : Did not receive identification string from <IP>
ID=17    : size=595       : Received disconnect from <IP> <NUM> <*> <*> <*> <*> <*> <*>
ID=31    : size=497       : Address <IP> maps to <*> but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
ID=20    : size=182       : Accepted password for <*> from <IP> port <NUM> ssh2
ID=21    : size=182       : pam unix(sshd session) session opened for user <*> by (uid=<NUM>)
ID=22    : size=182       : pam unix(sshd session) session closed for user <*>
ID=16    : size=177       : error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException Auth <*> [preauth]
ID=32    : size=108       : Received disconnect from <IP> <NUM> [preauth]
ID=50    : size=92        : Received disconnect from <IP> <NUM> Normal Shutdown, Thank you for playing [preauth]
ID=42    : size=87        : Received disconnect from <IP> <NUM> <*> <*> <*> [preauth]
ID=46    : size=60        : Received disconnect from <IP> <NUM> disconnect [preauth]
ID=28    : size=30        : Invalid user <*> <*> from <IP>
ID=29    : size=30        : input userauth request invalid user <*> <*> [preauth]
ID=30    : size=30        : Failed password for invalid user <*> <*> from <IP> port <NUM> ssh2
ID=36    : size=13        : Invalid user from <IP>
ID=37    : size=13        : input userauth request invalid user [preauth]
ID=38    : size=13        : Failed <*> for invalid user from <IP> port <NUM> ssh2
ID=34    : size=7         : Bad protocol version identification <*> <*> from <IP> port <NUM>
ID=35    : size=7         : Bad protocol version identification 'GET <*> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=39    : size=7         : Bad protocol version identification <*> from <IP> port <NUM>
ID=33    : size=6         : fatal no hostkey alg [preauth]
ID=40    : size=6         : error Received disconnect from <IP> <NUM> <*> <*> <*> <*> [preauth]
ID=44    : size=6         : error connect to <IP> port <NUM> failed.
ID=23    : size=3         : fatal Write failed Connection reset by peer [preauth]
ID=43    : size=3         : error Received disconnect from <IP> <NUM> com.jcraft.jsch.JSchException timeout in waiting for rekeying process. [preauth]
ID=47    : size=3         : Server listening on <IP> port <NUM>.
ID=48    : size=3         : Server listening on port <NUM>.
ID=25    : size=2         : error Received disconnect from <IP> <NUM> User request [preauth]
ID=26    : size=1         : Bad packet length <NUM>. [preauth]
ID=27    : size=1         : Disconnecting Packet corrupt [preauth]
ID=41    : size=1         : Received disconnect from <IP> <NUM>
ID=45    : size=1         : Corrupted MAC on input. [preauth]
ID=49    : size=1         : Bad protocol version identification 'CONNECT xui.ptlogin2.qq.com <NUM> HTTP/<NUM>.<NUM>' from <IP> port <NUM>
ID=51    : size=1         : syslogin perform logout logout() returned an error
@Superskyyy added the enhancement, Algorithm, analysis: log, and upstream labels Jul 18, 2022
@Superskyyy added this to the 0.1.0 milestone Jul 18, 2022
@Superskyyy self-assigned this Jul 18, 2022
@Superskyyy
Member Author

Some additional information on the log merger: we need to merge clusters with a very small number of log hits into similar clusters, based on a similarity threshold calculated with some equation. I don't know which one yet; I'm going to figure that out using some heuristics.
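
A rough sketch of what such a merger might look like follows. Since the equation is still undecided, `min_size` and `merge_th` are placeholder parameters rather than a proposed formula, and all names here are hypothetical. The templates mirror the shape of ones in the listings above; the sizes are illustrative.

```python
from dataclasses import dataclass
from typing import List

WILDCARD = "<*>"


@dataclass
class Cluster:
    template: List[str]
    size: int


def similarity(a: List[str], b: List[str]) -> float:
    """Fraction of positions with identical tokens; 0.0 when lengths differ."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def merge_templates(a: List[str], b: List[str]) -> List[str]:
    """Keep tokens that agree and replace disagreeing positions with a wildcard."""
    return [x if x == y else WILDCARD for x, y in zip(a, b)]


def merge_small_clusters(clusters: List[Cluster], min_size: int, merge_th: float) -> List[Cluster]:
    """Fold clusters smaller than min_size into the most similar large cluster, if any qualifies."""
    big = [Cluster(list(c.template), c.size) for c in clusters if c.size >= min_size]
    leftovers = []
    for small in (c for c in clusters if c.size < min_size):
        scored = [(similarity(b.template, small.template), b) for b in big]
        scored = [(sim, b) for sim, b in scored if sim >= merge_th]
        if scored:
            _, target = max(scored, key=lambda pair: pair[0])
            target.template = merge_templates(target.template, small.template)
            target.size += small.size
        else:
            leftovers.append(Cluster(list(small.template), small.size))  # nothing similar enough
    return big + leftovers


clusters = [
    Cluster("Invalid user <*> <*> from <IP>".split(), 30),
    Cluster("Invalid user myapn cen from <IP>".split(), 5),
    Cluster("Invalid user ftp <*> from <IP>".split(), 4),
    Cluster("Bad packet length <NUM>. [preauth]".split(), 1),
]
merged = merge_small_clusters(clusters, min_size=10, merge_th=0.6)
for c in merged:
    print(c.size, " ".join(c.template))
```

Here the two tiny "Invalid user" clusters are absorbed into the larger one, while the unrelated single-hit cluster stays separate because no large cluster is similar enough.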

@Superskyyy removed their assignment Jul 19, 2022
@Superskyyy
Member Author

The algorithm enhancements have turned out to be effective. In light of issue #14, which requires further structural modification of the Drain3 code base, we have heavily modified the MIT-licensed Drain3 implementation and intend to host it in our repo (algorithm files will respect the original MIT license header).

@Superskyyy self-assigned this Sep 9, 2022