Mutation testing is a fault-based software testing technique used to evaluate the quality of a test suite. It involves modifying a program in small ways (creating "mutants"), such as changing a logical operator or deleting a statement, and then running the test suite against these mutants.
If the test suite fails (detects the error), the mutant is considered killed. If the test suite passes despite the code change, the mutant survived, indicating a gap in test coverage. It essentially tests the tests.
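To make the killed/survived distinction concrete, here is a minimal sketch. The `isAdult` method, its hand-written mutant, and the two test inputs are hypothetical illustrations and are not part of TinySQL. The mutant changes `>=` to `>`, which only a boundary-value test can detect.

```java
import java.util.function.IntPredicate;

// Hypothetical example, not TinySQL code: one original method and one hand-written mutant.
class MutantDemo {
    // Original logic under test.
    static boolean isAdult(int age) { return age >= 18; }

    // Mutant: the relational operator >= has been changed to >.
    static boolean isAdultMutant(int age) { return age > 18; }

    // A "test" is modeled as: does the implementation return the expected value for this input?
    static boolean testPasses(IntPredicate impl, int input, boolean expected) {
        return impl.test(input) == expected;
    }

    public static void main(String[] args) {
        // Input 30 does not exercise the boundary, so the test still passes and the mutant survives.
        System.out.println("weak test kills mutant: "
                + !testPasses(MutantDemo::isAdultMutant, 30, true));     // false
        // Input 18 distinguishes >= from >, so the test fails and the mutant is killed.
        System.out.println("boundary test kills mutant: "
                + !testPasses(MutantDemo::isAdultMutant, 18, true));     // true
    }
}
```

A surviving mutant like the first case points to a missing boundary test rather than to a bug in the code itself.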
The primary objective of this project is to build a functional, lightweight SQL database engine ("TinySQL") and apply rigorous software testing techniques. Specifically, we aim to:
- Develop a source code rich in control flow and logic (Java).
- Implement Unit and Integration Testing using JUnit 5.
- Perform Mutation Testing using PITest to analyze test suite efficacy at the unit level.
- Demonstrate Integration Level Mutation by manually modeling specific integration faults (IMCD, IPEX, IREM).
- Apply Fuzz Testing using JQF to detect edge-case failures in the tokenizer and aggregator logic.
- Language: Java 21
- Build Tool: Maven 3.11+
- Unit/Integration Testing: JUnit 5 (Jupiter)
- Mutation Testing: PITest (v1.15.0)
- Fuzz Testing: JQF (Java QuickCheck Fuzzing v2.1)
TinySQL is a relational database engine with disk persistence. It supports essential SQL functionality, including SELECT, CREATE, INSERT, JOIN, and WHERE clause filtering. The project consists of more than 1,000 lines of code.
The architecture is modularized into the following packages:
- Engine (`com.tinysql.engine`): Contains the core processing logic.
  - `Executor`: The central controller that orchestrates data operations (`SELECT`, `INSERT`, `CREATE`) and manages memory/disk interaction.
  - `JoinProcessor`: Implements a Nested-Loop algorithm to handle Equi-Joins between tables.
  - `Aggregator`: Performs mathematical aggregations on datasets (`SUM`, `AVG`, `MAX`, `MIN`).
  - `ConditionEvaluator`: A logic-heavy component responsible for parsing and processing `WHERE` clause conditionals against row data.
- Storage (`com.tinysql.storage`): Manages Input/Output operations, specifically saving and loading tables as CSV files to ensure data persistence.
- Tokenizer (`com.tinysql.tokenizer`): A lexical analyzer that breaks raw SQL input strings into distinct tokens for parsing.
- Model (`com.tinysql.model`): Defines the data structures representing the database schema: `Table`, `Row`, `Column`, and `DataType`.
- Transaction (`com.tinysql.transaction`): Includes basic stubs for transaction management and ACID property support.
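As an illustration of the lexical analysis step described above, the following is a minimal regex-based sketch. The class name `SketchTokenizer`, the token pattern, and the choice to silently skip unrecognized characters are all assumptions; the project's actual tokenizer may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a hypothetical stand-in for the tokenizer, not the project's implementation.
class SketchTokenizer {
    // Two-character operators come first so ">=" is not split into ">" and "=".
    // The last alternative matches words, numbers, and other literal runs.
    private static final Pattern TOKEN =
            Pattern.compile(">=|<=|!=|[(),*=<>]|[^\\s(),*=<>!]+");

    static List<String> tokenize(String sql) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(sql);
        // find() skips anything that matches no alternative (whitespace, a lone '!').
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }
}
```

For example, `SketchTokenizer.tokenize("SELECT * FROM users WHERE balance >= 100.0")` yields the eight tokens `SELECT`, `*`, `FROM`, `users`, `WHERE`, `balance`, `>=`, `100.0`.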
The engine supports a strict set of data types for defining table schemas:
- `INT`: Integer values (mapped to Java `Integer`).
- `TEXT`: String literals (mapped to Java `String`).
- `BOOL`: Boolean values (`true`/`false`).
- `FLOAT`: Floating-point numbers (mapped to Java `Float`).
- `DOUBLE`: Double-precision numbers (mapped to Java `Double`).
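The type mapping above could be expressed as an enum that converts raw input text into the corresponding Java value. This is a sketch under assumptions: the name `SketchDataType` and the `parse` method are hypothetical, and the real `DataType` class may handle invalid input differently.

```java
// Sketch only: hypothetical enum mirroring the five supported types.
enum SketchDataType {
    INT, TEXT, BOOL, FLOAT, DOUBLE;

    // Converts a raw textual value into the Java type this column type maps to.
    // Numeric parse failures surface as NumberFormatException.
    Object parse(String raw) {
        switch (this) {
            case INT:
                return Integer.valueOf(raw);
            case TEXT:
                return raw;
            case BOOL:
                // Strict check (an assumption): reject anything other than true/false.
                if (raw.equalsIgnoreCase("true")) return Boolean.TRUE;
                if (raw.equalsIgnoreCase("false")) return Boolean.FALSE;
                throw new IllegalArgumentException("Not a BOOL: " + raw);
            case FLOAT:
                return Float.valueOf(raw);
            case DOUBLE:
                return Double.valueOf(raw);
            default:
                throw new IllegalStateException("Unhandled type: " + this);
        }
    }
}
```

Rejecting non-boolean text explicitly, rather than relying on `Boolean.parseBoolean` (which maps any unknown string to `false`), keeps the type check strict.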
The TinySQL engine supports a strictly typed subset of SQL commands. Below are the valid functionalities and syntax patterns derived from the source code.
Defines a new table schema in memory and initializes a CSV file for persistence.
- Functionality: Creates a new table with specified column names and enforced data types.
- Supported Types: `INT`, `DOUBLE`, `FLOAT`, `TEXT`, `BOOL`.
- Syntax:
CREATE TABLE <table_name> (<col_name> <type>, <col_name> <type>, ...)
- Example:
CREATE TABLE users (id INT, name TEXT, active BOOL, balance DOUBLE)
Adds a single row of data to the specified table.
- Functionality: Parses input values, matches them to the schema, and appends them to the table storage. Internal IDs are auto-incremented.
- Syntax:
INSERT INTO <table_name> VALUES <val1> <val2> <val3> ...
- Example:
INSERT INTO users VALUES 1 Alice true 500.50
Retrieves rows from a table, optionally filtering them with a WHERE clause.
- Functionality: Fetches all columns (`*`) for rows matching the specified criteria.
- Supported Operators: `=`, `!=`, `>`, `<`, `>=`, `<=`.
- Syntax:
SELECT * FROM <table_name> [WHERE <column> <operator> <value>]
- Example:
SELECT * FROM users WHERE balance > 100.0
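The WHERE filtering described above reduces to comparing a row value against a literal with one of the six operators. The sketch below shows one way to do that. `SketchCondition` and its numeric-versus-string comparison rule are assumptions, not the actual `ConditionEvaluator` logic.

```java
// Sketch only: hypothetical evaluator for a single <column> <operator> <value> condition.
class SketchCondition {
    static boolean evaluate(Object rowValue, String operator, Object literal) {
        int cmp = compare(rowValue, literal);
        switch (operator) {
            case "=":  return cmp == 0;
            case "!=": return cmp != 0;
            case ">":  return cmp > 0;
            case "<":  return cmp < 0;
            case ">=": return cmp >= 0;
            case "<=": return cmp <= 0;
            default:
                throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
    }

    // Assumption: numbers compare numerically, everything else by string order.
    private static int compare(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            return Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue());
        }
        return String.valueOf(a).compareTo(String.valueOf(b));
    }
}
```

Code of this shape is a frequent target for boundary mutants (for example `>=` replaced by `>`), so tests that use a row value equal to the literal are the ones that kill them.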
Performs a join operation between two tables.
- Functionality: Combines rows from two tables using a Nested-Loop Join algorithm where the specified columns match (Equi-Join).
- Syntax:
JOIN <table1> <table2> ON <col1_in_t1> <col2_in_t2>
- Example:
JOIN users orders ON id user_id
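A nested-loop equi-join of the kind described above can be sketched as follows. Rows are modeled here as `Map<String, Object>` purely for illustration; `SketchJoin` and this row representation are assumptions and do not reflect the project's `Row` or `JoinProcessor` classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Sketch only: hypothetical nested-loop equi-join over map-based rows.
class SketchJoin {
    static List<Map<String, Object>> join(List<Map<String, Object>> left, String leftCol,
                                          List<Map<String, Object>> right, String rightCol) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> l : left) {           // outer loop: every left row
            Object key = l.get(leftCol);
            for (Map<String, Object> r : right) {      // inner loop: every right row
                // Equi-join condition; rows with a missing (null) key never match.
                if (key != null && Objects.equals(key, r.get(rightCol))) {
                    Map<String, Object> merged = new LinkedHashMap<>(l);
                    merged.putAll(r);                  // on a column-name clash, the right value wins
                    result.add(merged);
                }
            }
        }
        return result;
    }
}
```

Swapping `leftCol` and `rightCol` in a call to such a method is exactly the kind of fault that a parameter-exchange mutant models, and it is only detected if the test data gives the two columns different values.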
Performs calculations on a specific column across all matching rows.
- Functionality: Computes a single result value based on the selected mathematical function.
- Supported Functions: `COUNT`, `SUM`, `AVG`, `MIN`, `MAX`.
- Syntax:
SELECT <FUNCTION>(<column>) FROM <table_name> [WHERE <column> <operator> <value>]
- Examples:
SELECT COUNT(id) FROM users
SELECT AVG(balance) FROM users WHERE active = true
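The five aggregate functions can be computed in a single pass over the column values, as in the sketch below. `SketchAggregator`, its `double` return type, and the decision to reject empty input for everything except `COUNT` are assumptions rather than the behavior of the real `Aggregator` class.

```java
import java.util.List;

// Sketch only: hypothetical single-pass aggregation over one numeric column.
class SketchAggregator {
    static double aggregate(String function, List<? extends Number> values) {
        if (function.equals("COUNT")) {
            return values.size();
        }
        if (values.isEmpty()) {
            // Assumption: other aggregates over zero rows are treated as an error.
            throw new IllegalArgumentException(function + " over no rows is undefined in this sketch");
        }
        double sum = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (Number n : values) {
            double v = n.doubleValue();
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        switch (function) {
            case "SUM": return sum;
            case "AVG": return sum / values.size();
            case "MIN": return min;
            case "MAX": return max;
            default:
                throw new IllegalArgumentException("Unsupported function: " + function);
        }
    }
}
```

Accepting only `Number` values at the method boundary is one way to avoid a `ClassCastException` when a column unexpectedly contains text.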
We performed comprehensive JUnit-based unit testing across all core packages.
Unit tests were written for the following components:
- `com.tinysql.engine`
  - Execution logic, query operators, aggregators, and evaluator classes.
- `com.tinysql.model`
  - Table, Row, Column, Schema, and in-memory data representations.
- `com.tinysql.storage`
  - StorageManager, file I/O, page-level read/write operations, and persistence logic.
- `com.tinysql.tokenizer`
  - SQL Tokenizer, token classification, keyword detection, literal parsing, and operator handling.
- `com.tinysql.transaction`
  - Transaction lifecycle, locks, commit/rollback behavior, and isolation assumptions.
- `com.tinysql.util`
  - Utility helpers including type conversion, comparisons, and general-purpose helpers.
Integration tests were designed to validate how major subsystems interact when executing complete SQL workflows.
Targets:
- `com.tinysql.engine.Executor`
- `com.tinysql.engine.JoinProcessor`
- `com.tinysql.storage.StorageManager`
- `com.tinysql.model.Table` (as part of end-to-end flow)
These enhanced Unit and Mutation Testing strategies ensure both correctness of individual components and robustness of the overall test suite.
- Unit Level (Automated via PITest): We ran PITest against the `com.tinysql.*` packages. PITest automatically generates mutants by altering bytecode.
- Integration Level (Manual Modeling): As per project requirements, we manually designed integration mutants to verify whether our integration tests could catch faults in component interfaces:
  - IMCD (Integration Method Call Deletion): We created a mutant where the `storage.saveTable()` call was deleted to test persistence verification.
  - IPEX (Integration Parameter Exchange): We swapped column parameters in the `executeJoin` method to test if the test suite detects mismatched join keys.
  - IREM (Integration Return Expression Modification): We modified `executeSelect` to return empty lists to test data flow verification.
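The IMCD case can be illustrated with a small stand-alone model. Everything in this sketch is hypothetical except the `saveTable()` name taken from the text: the `Storage` interface, the recording fake, and the two insert methods stand in for the real `Executor` and `StorageManager` interaction.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: hypothetical model of an IMCD mutant and the integration check that kills it.
class ImcdDemo {
    interface Storage {
        void saveTable(String tableName);
    }

    // Test double that records which tables were saved.
    static class RecordingStorage implements Storage {
        final List<String> saved = new ArrayList<>();
        @Override
        public void saveTable(String tableName) { saved.add(tableName); }
    }

    // Original behavior: inserting a row also persists the table.
    static void insertOriginal(Storage storage, List<String> rows, String row) {
        rows.add(row);
        storage.saveTable("users");
    }

    // IMCD mutant: the storage.saveTable() call has been deleted.
    static void insertMutant(Storage storage, List<String> rows, String row) {
        rows.add(row);
    }

    // Integration check: after one insert the row exists in memory AND the table was saved once.
    static boolean persistenceTestPasses(boolean useMutant) {
        RecordingStorage storage = new RecordingStorage();
        List<String> rows = new ArrayList<>();
        if (useMutant) {
            insertMutant(storage, rows, "1 Alice");
        } else {
            insertOriginal(storage, rows, "1 Alice");
        }
        return rows.size() == 1 && storage.saved.equals(List.of("users"));
    }
}
```

A test that only inspected the in-memory rows would pass for both versions, so the mutant would survive. Asserting on the persistence side effect is what kills it.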
We used JQF (Java QuickCheck Fuzzing) to perform generative testing.
- Tokenizer Fuzzing: Generates random strings to ensure the tokenizer does not crash on malformed SQL.
- Aggregator Fuzzing: Generates random `Row` objects with mixed types (`Double`, `String`, `Integer`) to test the robustness of math functions.
- CLI Fuzzing: Feeds random command inputs to the `Main` class.
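The idea behind these fuzz drivers can be approximated without any dependency by a plain random-input loop. Note that this is not JQF: JQF is coverage-guided, whereas the sketch below only generates seeded random strings. The class `RandomFuzzSketch` and the convention that `IllegalArgumentException` counts as a handled rejection are assumptions made for illustration.

```java
import java.util.Random;
import java.util.function.Consumer;

// Sketch only: a dependency-free random-input harness, NOT the JQF API.
class RandomFuzzSketch {
    // Returns how many inputs made the target throw something other than
    // IllegalArgumentException (assumed here to mean "input rejected cleanly").
    static int countUnexpectedFailures(Consumer<String> target, long seed, int iterations) {
        Random random = new Random(seed);   // fixed seed keeps failures reproducible
        int unexpected = 0;
        for (int i = 0; i < iterations; i++) {
            String input = randomString(random);
            try {
                target.accept(input);
            } catch (IllegalArgumentException handled) {
                // Malformed input was rejected in a controlled way; not a crash.
            } catch (RuntimeException crash) {
                unexpected++;
            }
        }
        return unexpected;
    }

    // Produces a string of 0 to 39 printable ASCII characters.
    static String randomString(Random random) {
        int length = random.nextInt(40);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (32 + random.nextInt(95)));
        }
        return sb.toString();
    }
}
```

A target such as a tokenizer would be passed in as the `Consumer<String>`. A non-zero return value means some input triggered an unhandled runtime exception.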
Below is the summary of the PITest coverage report generated after running the suite.
| Metric | Score |
|---|---|
| Mutation Score | 82% |
| Line Coverage | 94% |
| Mutants Killed | 348/426 |
| Mutations with no coverage | 28 (Test strength 87%) |
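The percentages in the table can be cross-checked from the raw counts. The sketch below assumes that the mutation score is killed mutants over all mutants, and that test strength is killed mutants over mutants that were covered by at least one test. Those definitions are consistent with the reported figures, but they are stated here as assumptions.

```java
// Sketch only: recomputes the reported percentages from the counts in the table.
class MutationScoreCheck {
    static long percent(int numerator, int denominator) {
        return Math.round(100.0 * numerator / denominator);
    }

    public static void main(String[] args) {
        int killed = 348;
        int total = 426;
        int noCoverage = 28;

        // 348 / 426 = 0.8169...
        System.out.println("Mutation score: " + percent(killed, total) + "%");              // 82%
        // 348 / (426 - 28) = 348 / 398 = 0.8743...
        System.out.println("Test strength: " + percent(killed, total - noCoverage) + "%");  // 87%
    }
}
```

The gap between the two numbers comes entirely from the 28 mutants that no test executes.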
We ran the fuzzer for 30 seconds per method.
- Tokenizer: No crashes detected. Malformed inputs raised handled exceptions instead of crashing the tokenizer.
- Aggregator: Robust against `ClassCastException` due to strict type checking implemented in the `Aggregator` class.
- Java 21
- Maven
Compile the code and the tests, and build the executable JAR.
mvn clean package
Run the test suite.
mvn test
Run PITest. This generates the mutation report in target/pit-reports/.
mvn org.pitest:pitest-maven:mutationCoverage
Run specific fuzz drivers for 30 seconds each.
# Fuzz the Tokenizer
mvn jqf:fuzz -Dclass=com.tinysql.fuzz.TokenizerFuzz -Dmethod=fuzzTokenizer -Dtime=30s
# Fuzz the Aggregator
mvn jqf:fuzz -Dclass=com.tinysql.fuzz.AggregatorFuzz -Dmethod=fuzzAggregator -Dtime=30s
# Fuzz the Main CLI
mvn jqf:fuzz -Dclass=com.tinysql.fuzz.MainFuzz -Dmethod=fuzzCLI -Dtime=30s
The work was divided equally between the team members:
Ananthakrishna K (IMT2022086)
- Unit Testing: Wrote and improved JUnit tests based on mutations for the `Engine`, `Model`, and `Storage` modules.
- Integration Mutation: Designed the IPEX (Parameter Exchange) and IMCD (Method Call Deletion) mutation scenarios and corresponding tests.
- Documentation: Authored the corresponding section of the report.
- Fuzz Testing: Explored fuzz testing.
Aditya Priyadarshi (IMT2022075)
- Unit Testing: Wrote and improved JUnit tests based on mutations for the `Tokenizer`, `Transaction`, and `Main` modules.
- Integration Mutation: Designed the IREM (Return Expression Modification) mutation scenario.
- Build System: Configured `pom.xml` for PITest and Shade plugins.
- Documentation: Authored the corresponding section of the report.
- Fuzz Testing: Explored fuzz testing.
We declare that we used LLMs for the following purposes in this project:
- Source Code Generation: Assisting in generating boilerplate code for the `Tokenizer` and standard getters/setters.
- JUnit Syntax: Helping generate the initial syntax for parameterized JUnit tests.
- Report Assistance: Assisting in structuring this README file and the final project report.
Note: The core logic for test case generation, the selection of mutation operators, and the design of the integration mutants were performed entirely by us.
