A SQL-like query processor that provides a powerful and intuitive interface for searching and retrieving logs from multiple storage backends.
LogX Query Processor is a custom-built log querying system that allows you to search through logs using a SQL-like query language. Built with ANTLR4, it parses custom query syntax and processes log searches across different data stores. The system provides two main query types:
- SELECT queries: Retrieve logs within a specific time range
- FIND queries: Search for specific patterns or keywords in logs
The processor reads serialized Java log objects containing metadata such as:
- Log data/message
- Timestamp (Unix epoch)
- Thread ID and Thread Name
- Severity level (CRITICAL, HIGH, MEDIUM, LOW, WARN, UNDEFINED)
- Stack trace information
Traditional log searching can be cumbersome and often requires:
- Direct file system access and parsing
- Complex Elasticsearch queries
- Writing custom code for each search scenario
- Understanding underlying storage mechanisms
LogX Query Processor solves these problems by:
- Abstraction: Provides a unified query interface regardless of underlying storage (file system or Elasticsearch)
- Simplicity: SQL-like syntax that's familiar and easy to learn
- Flexibility: Support for multiple storage backends with easy extensibility
- Efficiency: Built-in binary search for time-range queries using timestamps
- Type Safety: Strongly-typed AST (Abstract Syntax Tree) representation of queries
- Interactive CLI: User-friendly command-line interface with help system
# Clone the repository
cd query-processor
# Build the project
mvn clean compile
# Run the application
mvn exec:java
The query processor supports two types of queries:
SELECT field1, field2, field3 FROM XXXXXXXXXX.log BETWEEN YYYYYYYYYY AND ZZZZZZZZZZ;
Example:
SELECT error, exception, failed FROM 1234567890.log BETWEEN 1234567890 AND 1234569999;
- field1, field2, ...: Keywords/patterns to search for in log messages
- XXXXXXXXXX: 10-digit timestamp identifier for the log file
- YYYYYYYYYY: Start timestamp (Unix epoch, 10 digits)
- ZZZZZZZZZZ: End timestamp (Unix epoch, 10 digits)
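The time-range lookup behind a SELECT query can be sketched as two binary searches over timestamps sorted in ascending order. This is a minimal sketch, not the project's Occurrence class: the TimeRange class, its method names, and the choice to treat both bounds as inclusive are assumptions made for illustration.

```java
import java.util.List;

// Hypothetical helper; names and inclusive bounds are assumptions.
final class TimeRange {

    // Index of the first timestamp >= target, or timestamps.size() if none.
    static int lowerBound(List<Long> timestamps, long target) {
        int lo = 0;
        int hi = timestamps.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (timestamps.get(mid) < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Index of the first timestamp > target, or timestamps.size() if none.
    static int upperBound(List<Long> timestamps, long target) {
        int lo = 0;
        int hi = timestamps.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (timestamps.get(mid) <= target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // View of all timestamps t with start <= t <= end.
    // Requires the input to be sorted in ascending order.
    static List<Long> between(List<Long> timestamps, long start, long end) {
        if (start > end) {
            return List.of();
        }
        return timestamps.subList(lowerBound(timestamps, start), upperBound(timestamps, end));
    }
}
```

Each bound is found in O(log n) comparisons, so only the matching slice of the log list has to be scanned for keywords afterwards.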
FIND field1, field2, field3 IN XXXXXXXXXX.log;
Example:
FIND error, warning, critical IN 1234567890.log;
- field1, field2, ...: Keywords/patterns to search for
- XXXXXXXXXX: 10-digit timestamp identifier for the log file
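The multi-keyword matching behind a FIND query can be sketched as grouping log messages under each keyword they contain. This is an illustrative sketch only: the KeywordFinder class is hypothetical, and case-sensitive substring matching is an assumption about how keywords are compared.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the project's FindQueryProcessor may work differently.
final class KeywordFinder {

    // Maps each keyword to the messages containing it, in query order.
    // Assumption: a match is a case-sensitive substring match.
    static Map<String, List<String>> find(List<String> keywords, List<String> messages) {
        Map<String, List<String>> hits = new LinkedHashMap<>();
        for (String keyword : keywords) {
            List<String> matched = new ArrayList<>();
            for (String message : messages) {
                if (message.contains(keyword)) {
                    matched.add(message);
                }
            }
            hits.put(keyword, matched);
        }
        return hits;
    }
}
```

Keeping one result group per keyword mirrors the per-element output shown by the CLI, where each keyword gets its own header.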
When you run the application:
mvn exec:java
You'll be prompted to:
- Select data source (elasticsearch or file)
  - For Elasticsearch: Provide host URL and API key
  - For File: Uses the default ~/logs directory
- Enter queries at the interactive prompt
  - Type ? for help
  - Type exit to quit
Example Session:
Enter source type (elasticsearch/file):
file
(Press ? for help)
Enter your query: FIND NullPointerException IN 1696118400.log;
-------------------------ELEMENT: NullPointerException-------------------------
Thread-1
Worker-Thread
java.lang.NullPointerException at line 42
2023-10-01 00:00:00.0
... stack trace ...
- Java 17: Primary programming language with modern features (records, switch expressions)
- Apache Maven: Build automation and dependency management
- ANTLR4 (v4.13.2): Parser generator for creating the custom query language
  - Grammar definition in Query.g4
  - Generates lexer, parser, and visitor classes
  - Visitor pattern for AST construction
- Elasticsearch Java Client (v9.0.1): For Elasticsearch integration
- SLF4J NOP (v2.0.9): Logging facade (silent mode for Elasticsearch logs)
- JUnit Jupiter (v5.11.0): Unit testing framework
- Factory Pattern: ReaderFactory and QueryProcessorFactory for creating appropriate instances
- Strategy Pattern: Different readers (File, Elasticsearch) implementing a common Reader interface
- Visitor Pattern: ANTLR-generated visitors for AST traversal
- Record Classes: Immutable data carriers for AST nodes (SelectBetween, FindIn)
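The way record-based AST nodes can drive dispatch is sketched below. The record components and the QueryDescriber class are assumptions for illustration; the project's ASTQuery, SelectBetween, and FindIn may declare different fields, and its QueryProcessorFactory may select processors differently.

```java
import java.util.List;

// Assumed shapes: real field names and types in the project may differ.
interface ASTQuery {}

record SelectBetween(List<String> fields, String logFile, long start, long end) implements ASTQuery {}

record FindIn(List<String> fields, String logFile) implements ASTQuery {}

// Hypothetical class showing type-based dispatch over the AST nodes.
final class QueryDescriber {

    // Branches on the concrete record type, as a factory might when
    // choosing which query processor to create.
    static String describe(ASTQuery query) {
        if (query instanceof SelectBetween s) {
            return "SELECT " + s.fields() + " FROM " + s.logFile()
                    + " BETWEEN " + s.start() + " AND " + s.end();
        }
        if (query instanceof FindIn f) {
            return "FIND " + f.fields() + " IN " + f.logFile();
        }
        throw new IllegalArgumentException("Unsupported query type: " + query);
    }
}
```

Because records are immutable and generate their own accessors, a parsed query cannot be altered between AST construction and execution.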
Configuration:
- Default location: ~/logs/ (auto-created if it does not exist)
- File format: Serialized Java objects (.log files)
- File naming: {10-digit-timestamp}.log
How it works:
- Reads serialized Log objects using Java's ObjectInputStream
- Stores logs as binary files for fast deserialization
- Suitable for local development and small-scale deployments
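Reading serialized objects with ObjectInputStream can be sketched as follows. StoredLog is a hypothetical stand-in for the project's Log class, and the sketch assumes the stream holds a single serialized List of logs; the actual on-disk layout is not specified here and may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the project's Log class; real fields may differ.
record StoredLog(String data, long timestamp, String threadName) implements Serializable {}

final class SerializedLogs {

    // Serializes the logs as one List object (assumed layout).
    static byte[] write(List<StoredLog> logs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(logs));
        }
        return bytes.toByteArray();
    }

    // Deserializes the List written by write(...).
    @SuppressWarnings("unchecked")
    static List<StoredLog> read(InputStream source) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(source)) {
            return (List<StoredLog>) in.readObject();
        }
    }
}
```

To read from disk instead of memory, the same read method could be given a stream opened with java.nio.file.Files.newInputStream.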
Example:
# File location
/home/user/logs/1696118400.log
Configuration:
- Host URL: Elasticsearch server endpoint
- API Key: Authentication token
How it works:
- Connects to Elasticsearch cluster using Java client
- Index name derived from filename (timestamp portion)
- Performs a match_all query with a size limit of 1000 documents
- Maps Elasticsearch documents to the Log POJO
Example:
Enter source type (elasticsearch/file):
elasticsearch
Enter the host URL:
https://localhost:9200
Enter the API key:
your-api-key-here
Adding a new storage backend is straightforward:
- Implement the Reader interface
- Add logic to ReaderFactory to instantiate your reader
- Your reader should return List<Log> from the read(String logFile) method
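The steps above can be sketched with an in-memory backend. Only the read(String logFile) method returning List<Log> comes from this document; the Log fields, the InMemoryReader and ReaderFactorySketch classes, and the "memory" source type are hypothetical, and the real Reader interface may declare checked exceptions.

```java
import java.util.List;
import java.util.Map;

// Assumed shapes: the project's Log and Reader definitions may differ.
record Log(String data, long timestamp) {}

interface Reader {
    List<Log> read(String logFile);
}

// Hypothetical backend that serves logs from memory, keyed by log file name.
final class InMemoryReader implements Reader {
    private final Map<String, List<Log>> store;

    InMemoryReader(Map<String, List<Log>> store) {
        this.store = Map.copyOf(store);
    }

    @Override
    public List<Log> read(String logFile) {
        // Unknown files yield an empty list rather than an error (a design assumption).
        return store.getOrDefault(logFile, List.of());
    }
}

// Hypothetical factory; the "memory" source type is illustrative.
final class ReaderFactorySketch {
    static Reader create(String sourceType, Map<String, List<Log>> store) {
        if ("memory".equals(sourceType)) {
            return new InMemoryReader(store);
        }
        throw new IllegalArgumentException("Unknown source type: " + sourceType);
    }
}
```

Since query processors depend only on the Reader interface, a backend like this can also serve as a test double without touching the file system or Elasticsearch.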
query-processor/
├── src/main/java/tech/thedumbdev/
│ ├── antlr4/ # ANTLR grammar definition
│ │ └── Query.g4
│ ├── ast/ # Abstract Syntax Tree nodes
│ │ ├── ASTQuery.java
│ │ ├── FindIn.java
│ │ └── SelectBetween.java
│ ├── enums/ # Enumerations
│ │ └── Severity.java
│ ├── gen/ # ANTLR generated classes
│ ├── parser/ # AST builder
│ │ └── ASTBuilder.java
│ ├── pojo/ # Plain Old Java Objects
│ │ └── Log.java
│ ├── queryprocessor/ # Query execution logic
│ │ ├── FindQueryProcessor.java
│ │ ├── SelectQueryProcessor.java
│ │ ├── QueryProcessor.java
│ │ └── QueryProcessorFactory.java
│ ├── reader/ # Data source readers
│ │ ├── Reader.java
│ │ ├── FileReader.java
│ │ ├── ElasticReader.java
│ │ └── ReaderFactory.java
│ ├── util/ # Utility classes
│ │ └── Occurrence.java # Binary search for timestamps
│ └── App.java # Main entry point
└── pom.xml # Maven configuration
- ✅ Custom Query Language: SQL-like syntax for log searching
- ✅ Multi-backend Support: File system and Elasticsearch
- ✅ Time-range Queries: Efficient binary search on timestamps
- ✅ Multi-field Search: Search for multiple keywords simultaneously
- ✅ Interactive CLI: User-friendly command-line interface
- ✅ Detailed Log Output: Thread info, timestamps, stack traces
- ✅ Error Handling: Comprehensive error messages and validation
- ✅ Type Safety: Compile-time type checking with records and enums
This project is part of a logging-notification-system.