A machine-learning-based log parser
Dataset | Log size | Description | Source |
---|---|---|---|
HDFS | 11197705 | Hadoop runtime log | W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan, “Detecting large-scale system problems by mining console logs,” in Proc. of ACM SOSP, 2009, pp. 117–132. |
BGL | 4747963 | HPC Blue Gene/L runtime log | A. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in Proc. of IEEE/IFIP DSN, 2007, pp. 575–584. |
Spark | 98671134 | Spark runtime log | Our 3-node cluster server that hosts our lab’s calculation tasks, [Download] |
Apache | 1643182 | Apache server access log | Open source Apache server access log set, [Download] |
UofS | 2408625 | University of Saskatchewan's WWW server access log | M. F. Arlitt and C. L. Williamson, “Web server workload characterization: The search for invariants,” ACM SIGMETRICS Perform. Evaluation Rev., vol. 24, no. 1, pp. 126–137, 1996. |
Jul95 | 1891714 | NASA Kennedy Space Center WWW server access log | M. F. Arlitt and C. L. Williamson, “Web server workload characterization: The search for invariants,” ACM SIGMETRICS Perform. Evaluation Rev., vol. 24, no. 1, pp. 126–137, 1996. |
Nginx | 1067357 | Nginx server access log | Open source Nginx server access log set, [Download] |
Openstack | 189386 | OpenStack runtime log | M. Du, F. Li, G. Zheng, and V. Srikumar, “DeepLog: Anomaly detection and diagnosis from system logs through deep learning,” in Proc. of ACM SIGSAC, 2017, pp. 1285–1298. |
Security Log | 22694356 | Connection log from Security Data Analysis Labs | M. Sconzo and D. Dorsey, “Connection log,” 2014, data retrieved from Security Data Analysis Labs, https://github.com/sooshie/Security-Data-Analysis |
Thunderbird | 211212192 | HPC Thunderbird runtime log | A. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in Proc. of IEEE/IFIP DSN, 2007, pp. 575–584. |
Big Brother | 68666 | Big Brother diagnostic log data | I. V. C. Committee, “Big brother data,” 2013, data retrieved from VAST Challenge 2013, http://vacommunity.org/VAST+Challenge+2013 |
Web Log | 1125760 | Security Repo access log | Samples of security related data, https://www.secrepo.com/ |
ThingWorx | 1849361 | ThingWorx platform log | Platform data collected from https://developer.thingworx.com/en/platform, [Download] |
4SICS-151020 | 246137 | 4SICS Geek Lounge Pcaps | The industrial cyber security conference 4SICS, “Capture files from 4SICS Geek Lounge,” 2015, data retrieved from 4SICS 2015, https://www.netresec.com/index.ashx?page=PCAP4SICS |
4SICS-151021 | 1253100 | 4SICS Geek Lounge Pcaps | The industrial cyber security conference 4SICS, “Capture files from 4SICS Geek Lounge,” 2015, data retrieved from 4SICS 2015, https://www.netresec.com/index.ashx?page=PCAP4SICS |
4SICS-151022 | 2274747 | 4SICS Geek Lounge Pcaps | The industrial cyber security conference 4SICS, “Capture files from 4SICS Geek Lounge,” 2015, data retrieved from 4SICS 2015, https://www.netresec.com/index.ashx?page=PCAP4SICS |
EMSD | 5764522 | Energy Management System log | https://sites.google.com/a/uah.edu/tommy-morris-uah/ics-data-sets |
IoT sentinel | 129371 | IoT device captures | M. Miettinen, S. Marchal, I. Hafeez, N. Asokan, A.-R. Sadeghi, and S. Tarkoma, “IoT Sentinel: Automated device-type identification for security enforcement in IoT,” in Proc. of IEEE ICDCS, 2017, pp. 2177–2184. |