abubakarp789/Python_Log_Analyzer


Automated Log Analysis Using Python

A comprehensive Blue Team cybersecurity tool for detecting anomalies and extracting suspicious activities from system and network logs.

Created by: Abu Bakar and Raqib Hayat


📚 Academic Project Information

Course: CS-332 Information Security

Semester: 6 - Spring 2025

Instructor: Dr. Mudassar Raza

Project Type: Semester Project

CLO Alignment: CLO-3, CLO-4

Team Classification: Blue Team Operations Tool

Course Learning Outcomes (CLOs)

  • CLO-3: Apply security mechanisms and protocols to protect information systems
  • CLO-4: Evaluate and implement cybersecurity tools and techniques for threat detection and response

This project demonstrates practical application of cybersecurity concepts through automated log analysis, focusing on defensive security operations and threat detection methodologies taught in the Information Security curriculum.


🎯 Project Overview

The Python Log Analyzer is an automated cybersecurity tool designed to parse, analyze, and detect suspicious activities in various log formats. This tool addresses the critical need for efficient log monitoring in cybersecurity operations, where manual analysis is time-consuming and error-prone.

Blue Team Focus Areas

This tool specifically supports Blue Team defensive security operations by:

  • Threat Detection: Automated identification of malicious activities in log data
  • Incident Response: Rapid analysis capabilities for security event investigation
  • Continuous Monitoring: Batch processing that can be scheduled for ongoing security surveillance (live streaming analysis is a planned feature)
  • Evidence Collection: Forensic-ready reporting for incident documentation
  • Compliance Support: Structured analysis supporting regulatory requirements

Key Features

  • Multi-format Support: Handles Syslog, Apache, and custom log formats
  • Automated Parsing: Uses regex patterns to extract structured data from unstructured logs
  • Anomaly Detection: Identifies suspicious patterns like brute-force attacks and frequent errors
  • Threshold-based Alerting: Customizable thresholds for flagging suspicious IPs
  • Visual Reports: Generates both text reports and graphical charts
  • Command-line Interface: Easy integration into security workflows and automation scripts

๐Ÿ›ก๏ธ Cybersecurity Framework Integration

Blue Team Operations Alignment

This tool follows industry-standard Blue Team methodologies:

  1. Detection Engineering: Automated pattern recognition for threat indicators
  2. Log Analysis: Systematic examination of system and network logs
  3. Incident Response: Rapid identification and documentation of security events
  4. Threat Hunting: Proactive search for indicators of compromise (IOCs)
  5. Security Operations Center (SOC) Support: Integration-ready for SOC workflows

Academic Learning Integration

The project demonstrates practical application of course concepts:

  • Information Security Principles: Confidentiality, Integrity, Availability through monitoring
  • Threat Landscape Understanding: Recognition of common attack patterns
  • Defense-in-Depth: Multi-layered security approach through comprehensive log monitoring
  • Risk Assessment: Categorization and prioritization of security threats
  • Security Tools Development: Hands-on experience with cybersecurity tool creation

🚀 Installation and Setup

Prerequisites

Ensure you have Python 3.6+ installed on your system:

python --version
# or
python3 --version

Required Libraries

Install the necessary Python packages:

pip install pandas matplotlib regex

Note: The regex library provides enhanced pattern matching. If unavailable, the script falls back to Python's built-in re module.
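The fallback described in the note can be sketched as a guarded import. This is a minimal illustration, not the script's actual code; the names `USING_REGEX_PACKAGE` and `IPV4_PATTERN` are hypothetical and chosen only for this example.

```python
# Prefer the third-party `regex` package; fall back to the standard library.
try:
    import regex as re
    USING_REGEX_PACKAGE = True   # hypothetical flag, for illustration only
except ImportError:
    import re
    USING_REGEX_PACKAGE = False

# For simple patterns like this one, both modules behave the same way.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

match = IPV4_PATTERN.search("Failed password for root from 192.168.1.100")
found_ip = match.group(0) if match else None
print(found_ip)
```

Because the alias is `re` in both branches, the rest of a script written this way does not need to know which library was loaded.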

Download the Tool

  1. Save the log_analyzer.py script to your working directory
  2. Download the sample log files:
    • syslog_failed_ssh.log
    • apache_access_errors.log
  3. Make the script executable (Linux/Mac):
    chmod +x log_analyzer.py

📖 Usage Guide

Basic Command Structure

python log_analyzer.py <log_file> <log_type> [options]

Parameters

  • log_file (required): Path to the log file to analyze
  • log_type (required): Type of log file (syslog, apache, or custom)
  • --threshold (optional): Minimum event count at which an IP is flagged as suspicious (default: 5)
  • --status-code (optional): Filter Apache logs by specific HTTP status code
  • --output (optional): Save report to specified file
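One way the parameters above could be declared is with the standard library's argparse module. This is a sketch under the assumption that the script uses argparse; the function name `build_parser` and the help strings are illustrative, and only the argument names and the default threshold come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented parameters."""
    parser = argparse.ArgumentParser(
        prog="log_analyzer.py",
        description="Detect suspicious activity in log files.",
    )
    parser.add_argument("log_file", help="Path to the log file to analyze")
    parser.add_argument("log_type", choices=["syslog", "apache", "custom"],
                        help="Type of log file")
    parser.add_argument("--threshold", type=int, default=5,
                        help="Minimum count to flag an IP as suspicious")
    parser.add_argument("--status-code", type=int, default=None,
                        help="Filter Apache logs by HTTP status code")
    parser.add_argument("--output", default=None,
                        help="Save the report to this file")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(
    ["apache_access_errors.log", "apache", "--status-code", "404", "--threshold", "10"]
)
print(args.log_type, args.status_code, args.threshold, args.output)
```

argparse converts `--status-code` to the attribute `status_code`, so downstream code would read `args.status_code`.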

Example Commands

1. Analyze Syslog for Failed SSH Logins

python log_analyzer.py syslog_failed_ssh.log syslog --threshold 5

What it does:

  • Parses syslog entries for "Failed password" messages
  • Groups failed attempts by IP address
  • Flags IPs with 5+ failed login attempts
  • Generates a security report with flagged IPs
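The steps above can be sketched with a regular expression and a counter. This is a minimal illustration, not the tool's implementation: `FAILED_LOGIN` and `flag_failed_logins` are hypothetical names, the second sample line is invented for the demonstration, and the comparison uses "at least the threshold" to match the "5+" wording here.

```python
import re
from collections import Counter

# Capture the source IPv4 address from sshd "Failed password" messages.
FAILED_LOGIN = re.compile(r"Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3})")


def flag_failed_logins(lines, threshold=5):
    """Return {ip: count} for IPs with at least `threshold` failed logins."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    # ASSUMPTION: ">=" mirrors the "5+" wording; the real tool may use ">".
    return {ip: n for ip, n in counts.items() if n >= threshold}


sample = (
    ["May 15 08:32:10 server1 sshd[12345]: Failed password for root from 192.168.1.100"] * 7
    + ["May 15 08:40:00 server1 sshd[12346]: Failed password for admin from 10.0.0.8"] * 2
)
flagged = flag_failed_logins(sample, threshold=5)
print(flagged)
```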

2. Analyze Apache Logs for 404 Errors

python log_analyzer.py apache_access_errors.log apache --status-code 404 --threshold 10

What it does:

  • Parses Apache access logs
  • Filters for 404 "Not Found" errors
  • Identifies IPs generating 10+ 404 errors
  • Useful for detecting reconnaissance activities

3. General Apache Error Analysis with Report Output

python log_analyzer.py apache_access_errors.log apache --threshold 3 --output security_report.txt

What it does:

  • Analyzes all HTTP errors (4xx, 5xx status codes)
  • Flags IPs with 3+ error-causing requests
  • Saves detailed report to security_report.txt

4. Custom Log Analysis

python log_analyzer.py custom.log custom --threshold 2 --output custom_report.txt

What it does:

  • Applies keyword-based detection for custom log formats
  • Looks for suspicious terms like "failed", "error", "denied", "unauthorized"
  • Extracts IP addresses and timestamps where possible

๐Ÿ” Understanding the Output

Console Output Example

🚀 Starting Automated Log Analysis...
Target: syslog_failed_ssh.log (syslog format)
📖 Parsing syslog log file: syslog_failed_ssh.log
✅ Processed 20 lines, parsed 18 entries
🔍 Analyzing suspicious activities...
🎯 Found 17 suspicious entries out of 18 total entries
🚨 Flagged 2 IP addresses exceeding threshold of 5

============================================================
📊 AUTOMATED LOG ANALYSIS REPORT
============================================================
Log File: syslog_failed_ssh.log
Log Type: SYSLOG
Analysis Date: 2024-05-15 14:30:25
Threshold: 5

--- SUMMARY STATISTICS ---
Total Log Entries: 18
Suspicious Entries: 17
Flagged IP Addresses: 2

--- FLAGGED IP ADDRESSES (Exceeding Threshold) ---
  🚨 192.168.1.100: 7 suspicious events [HIGH RISK]
  🚨 203.0.113.50: 6 suspicious events [MEDIUM RISK]

--- RECOMMENDATIONS ---
  🔒 Consider blocking or monitoring flagged IP addresses
  📈 Investigate high-risk IPs for potential security threats
  🕐 Review logs for unusual timing patterns

Generated Files

  1. Text Report: Detailed analysis saved to the specified output file
  2. Visualization Chart: Bar chart showing the top suspicious IPs (suspicious_ips_chart_[logtype].png)

๐Ÿ›ก๏ธ Cybersecurity Applications

1. Incident Detection

  • Brute-force Attacks: Identifies repeated failed login attempts
  • Web Reconnaissance: Detects scanning for common vulnerabilities
  • System Intrusions: Flags unusual system events and errors

2. Compliance Monitoring

  • Audit Trails: Provides evidence of security monitoring
  • Regulatory Requirements: Supports the log-review activities expected by standards such as PCI DSS and HIPAA
  • Forensic Analysis: Creates detailed records for incident investigation

3. Operational Security

  • Recurring Monitoring: Can be automated (for example, as a scheduled job) for continuous log analysis; live streaming is listed under planned features
  • Threshold Alerting: Customizable rules for different threat levels
  • Integration Ready: Outputs can feed into SIEM systems

🔧 Technical Details

Supported Log Formats

Syslog Format

  • Pattern Detection: Failed SSH logins, system errors, unauthorized access
  • Regex Patterns: Extracts timestamps, hostnames, process names, and messages
  • Example: May 15 08:32:10 server1 sshd[12345]: Failed password for root from 192.168.1.100
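A pattern along the following lines could extract the fields listed above from the example entry. It is a sketch that assumes the traditional BSD-style syslog layout shown in the example; `SYSLOG_LINE` and `parse_syslog_line` are hypothetical names, and the tool's own patterns may differ.

```python
import re

# Named groups for each field of a BSD-style syslog line.
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<hostname>\S+)\s"
    r"(?P<process>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = SYSLOG_LINE.match(line.strip())
    return match.groupdict() if match else None


entry = parse_syslog_line(
    "May 15 08:32:10 server1 sshd[12345]: Failed password for root from 192.168.1.100"
)
print(entry)
```

Returning None for unmatched lines is one way to account for the difference between lines processed and entries parsed in the console output.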

Apache Combined/Common Log Format

  • HTTP Analysis: Status codes, request methods, user agents
  • Error Detection: 4xx client errors, 5xx server errors
  • Example: 192.168.1.50 - - [15/May/2024:10:30:15 +0000] "GET /admin.php HTTP/1.1" 404 512
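The example entry can be parsed with a pattern for the leading Common Log Format fields, which the Combined format shares. The sketch below is illustrative only: `APACHE_LINE`, `parse_apache_line`, and `is_http_error` are hypothetical names, and the optional status filter is modeled on the --status-code option described earlier.

```python
import re

# Leading fields of Common/Combined Log Format; trailing fields are ignored.
APACHE_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)(?: (?P<protocol>[^"]+))?" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_apache_line(line):
    """Return a dict of fields with an integer status, or None if unparsable."""
    match = APACHE_LINE.match(line)
    if not match:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def is_http_error(entry, status_code=None):
    """Match one specific status code if given, otherwise any 4xx or 5xx."""
    if status_code is not None:
        return entry["status"] == status_code
    return 400 <= entry["status"] <= 599


entry = parse_apache_line(
    '192.168.1.50 - - [15/May/2024:10:30:15 +0000] "GET /admin.php HTTP/1.1" 404 512'
)
print(entry["ip"], entry["status"], is_http_error(entry))
```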

Custom Logs

  • Flexible Parsing: Keyword-based suspicious activity detection
  • IP Extraction: Identifies IP addresses in various log formats
  • Extensible: Easy to modify for specific log structures
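Keyword-based detection with IP extraction might look like the sketch below. The keyword list comes from the custom-log usage example in this README; the function name `scan_custom_line`, the result layout, and the sample log line are assumptions made for illustration.

```python
import re

# Keywords named in the usage guide for custom logs.
SUSPICIOUS_KEYWORDS = ("failed", "error", "denied", "unauthorized")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scan_custom_line(line):
    """Return matched keywords and the first IPv4 address, or None if clean."""
    lowered = line.lower()
    hits = [kw for kw in SUSPICIOUS_KEYWORDS if kw in lowered]
    if not hits:
        return None
    ip_match = IPV4.search(line)
    return {
        "keywords": hits,
        "ip": ip_match.group(0) if ip_match else None,
        "raw": line,
    }


# Hypothetical custom log line, invented for this example.
result = scan_custom_line(
    "2024-05-15 10:00:01 app: Access DENIED for user bob from 10.1.2.3"
)
print(result)
```

Extending the detection for a specific log structure would then amount to editing the keyword tuple or adding a format-specific pattern.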

Analysis Algorithms

  1. Pattern Matching: Uses compiled regex patterns for efficient parsing
  2. Statistical Analysis: Counts events by IP, time windows, and event types
  3. Threshold-based Detection: Flags IPs exceeding configurable limits
  4. Risk Assessment: Categorizes threats as HIGH, MEDIUM, or LOW risk
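The counting and categorization steps can be sketched as follows. The README does not state the cutoffs between risk levels, so the rule below (HIGH at two or more events above the threshold, MEDIUM at the threshold, LOW otherwise) is a placeholder assumption chosen only because it reproduces the labels in the sample report. `assess_risk`, `summarize`, and `high_margin` are hypothetical names, and the third IP address is invented.

```python
from collections import Counter


def assess_risk(count, threshold, high_margin=2):
    """Map an event count to a risk label."""
    # ASSUMPTION: these cutoffs are illustrative, not the tool's actual rule.
    if count >= threshold + high_margin:
        return "HIGH"
    if count >= threshold:
        return "MEDIUM"
    return "LOW"


def summarize(ips, threshold=5):
    """Count events per IP and attach a risk label, most active IP first."""
    counts = Counter(ips)
    return [(ip, n, assess_risk(n, threshold)) for ip, n in counts.most_common()]


events = ["192.168.1.100"] * 7 + ["203.0.113.50"] * 6 + ["198.51.100.7"] * 2
report = summarize(events)
for ip, n, risk in report:
    print(f"{ip}: {n} suspicious events [{risk} RISK]")
```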

📊 Sample Data Analysis

Demo 1: SSH Brute-force Detection

Input: syslog_failed_ssh.log with multiple failed login attempts

Command: python log_analyzer.py syslog_failed_ssh.log syslog --threshold 5

Result: Identifies 192.168.1.100 with 7 failed attempts and 203.0.113.50 with 6 attempts

Demo 2: Web Attack Detection

Input: apache_access_errors.log with reconnaissance attempts

Command: python log_analyzer.py apache_access_errors.log apache --status-code 404 --threshold 3

Result: Flags 203.0.113.75 for excessive 404 errors (potential vulnerability scanning)

🎓 Educational Value & Learning Outcomes

Skills Demonstrated

Technical Skills:

  • Python programming for cybersecurity applications
  • Regular expression pattern matching for log parsing
  • Data analysis and visualization techniques
  • Command-line interface development
  • File I/O and error handling

Cybersecurity Concepts:

  • Log analysis methodologies
  • Threat detection and anomaly identification
  • Incident response procedures
  • Risk assessment and categorization
  • Security monitoring best practices

Blue Team Operations:

  • Defensive security tool development
  • Automated threat detection systems
  • Security operations center (SOC) workflows
  • Continuous monitoring implementation
  • Evidence collection and documentation

Real-world Applications

This project prepares students for careers in:

  • SOC Analyst: Monitoring and analyzing security events
  • Incident Response Specialist: Investigating and responding to security incidents
  • Security Engineer: Developing and implementing security tools
  • Cybersecurity Consultant: Assessing and improving organizational security posture

🔮 Future Enhancements

Planned Features

  • Machine Learning Integration: Advanced anomaly detection using ML algorithms
  • Real-time Processing: Live log monitoring with streaming analysis
  • Web Interface: GUI for easier configuration and visualization
  • SIEM Integration: Direct integration with Security Information and Event Management systems
  • Geo-IP Analysis: Location-based threat intelligence
  • Threat Intelligence Feeds: Integration with IOC (Indicators of Compromise) databases

Contributing

This tool is designed to be extensible. Key areas for enhancement:

  • Additional log format parsers
  • Advanced visualization options
  • Custom rule engines
  • Performance optimizations for large log files

📚 References and Resources

Official Documentation (Primary Sources)

Cybersecurity Standards and Frameworks

  • NIST Cybersecurity Framework: Guidelines for log monitoring and incident response
  • OWASP Logging Cheat Sheet: Best practices for security logging
  • SANS Blue Team Operations: Defensive security methodologies
  • MITRE ATT&CK Framework: Threat detection and response strategies

Academic References

  • Course textbook and lecture materials (CS-332 Information Security)
  • Industry white papers on log analysis and threat detection
  • Research papers on automated security monitoring systems

โš ๏ธ Important Notes

Academic Integrity

This project was developed following academic guidelines and represents original work by the project team. All external resources and references have been properly cited.

Limitations

  • Performance: Large log files (>1GB) may require optimization
  • Complex Attacks: May miss sophisticated, low-and-slow attacks
  • False Positives: Requires tuning for specific environments
  • Context: Provides detection but limited contextual analysis

Security Considerations

  • Log Privacy: Ensure compliance with data protection regulations
  • Access Control: Restrict access to log files and analysis reports
  • Data Retention: Follow organizational policies for log data storage
  • Alert Fatigue: Balance sensitivity with practicality

🆘 Troubleshooting

Common Issues

File Not Found Error

โŒ Error: Log file 'logfile.log' not found.

Solution : Verify the file path and ensure read permissions

No Entries Parsed

โŒ No valid log entries found. Please check the log file format.

Solution : Ensure the log type matches the actual file format

Permission Denied

โŒ Error: Permission denied accessing '/var/log/secure'.

Solution : Run with appropriate permissions or copy logs to accessible location
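The file-related errors above correspond to Python's built-in FileNotFoundError and PermissionError exceptions. The following sketch shows one way such messages could be produced; `read_log_lines` is a hypothetical helper, and the exact wording and handling in the script may differ.

```python
import sys


def read_log_lines(path):
    """Return the file's lines, or None after printing a readable error."""
    try:
        # errors="replace" keeps parsing going if the log has odd bytes.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            return handle.read().splitlines()
    except FileNotFoundError:
        print(f"Error: Log file '{path}' not found.", file=sys.stderr)
    except PermissionError:
        print(f"Error: Permission denied accessing '{path}'.", file=sys.stderr)
    return None


# A path that should not exist, used only to exercise the error branch.
missing = read_log_lines("no_such_file_for_demo.log")
print(missing)
```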

Getting Help

For questions or issues:

  1. Check the error messages for specific guidance
  2. Verify log file format matches the specified type
  3. Test with provided sample files first
  4. Review regex patterns for custom log formats
  5. Consult official documentation for libraries used
  6. Contact project team or course instructor for academic support

๐Ÿ“ Project Submission Details

Project Team: Abu Bakar (NUM-BSCS-2022-41) and Raqib Hayat (NUM-BSCS-2022-40)

Course: CS-332 Information Security

Submission Date: 27/05/2025

Tool Version: 1.0

Last Updated: May 2025

Project Deliverables

  • Complete Python log analysis tool
  • Comprehensive documentation (this README)
  • Sample log files for testing
  • Usage examples and demonstrations
  • Technical architecture documentation

This tool is developed for educational purposes as part of the CS-332 Information Security course curriculum. It demonstrates practical application of cybersecurity concepts and is designed for legitimate security monitoring purposes. Always ensure compliance with applicable laws and organizational policies when analyzing log data.

Academic Disclaimer: This project follows all academic integrity guidelines and represents original work by the development team under the supervision of Dr. Mudassar Raza for the Spring 2025 semester of CS-332 Information Security.

About

An automated Python-based cybersecurity tool for analyzing system and network logs to detect anomalies and suspicious activities. Developed by Abu Bakar and Raqib Hayat for CS-332 Information Security, focused on Blue Team threat detection and defensive operations.
