A Spring Boot application that processes employee payroll data using Spring Batch framework. The application reads employee information from CSV files, calculates tax and bonuses, stores processed payroll data in a MySQL database, and provides REST APIs to trigger batch jobs and export results.
This application demonstrates enterprise-grade batch processing capabilities using Spring Batch to:
- Read employee data from CSV files (10000+ records efficiently)
- Calculate tax (10%) and performance bonuses (5% for salaries > ₹60,000)
- Store processed payroll in MySQL database with Flyway migrations
- Export processed payroll data back to CSV format
- Schedule automatic daily batch processing
- Provide REST endpoints for manual job triggering
```
CSV File (employee.csv)
        ↓
FlatFileItemReader (reads 500 records/chunk)
        ↓
EmpProcessor (calculates tax, bonus, net salary)
        ↓
JdbcBatchItemWriter (writes to processed_payroll table)
        ↓
MySQL Database
```
- **Spring Boot** - Application framework
- **Spring Batch** - Batch processing framework
- **Spring Data JPA** - Database operations
- **Flyway** - Database migration management
- **MySQL** - Production database
- **Swagger/OpenAPI** - API documentation
- **Spring Scheduler** - Job scheduling
The EmpProcessor applies the following calculations:
- Tax Rate: 10% of gross salary
- Bonus Threshold: ₹60,000
- Bonus Rate: 5% for salaries exceeding the threshold
- Net Salary Formula: Gross Salary - Tax + Bonus

Example (Aaditya Kirdat from the sample CSV):
- Gross Salary: ₹94,238
- Tax (10%): ₹9,423.80
- Bonus (5%): ₹4,711.90
- Net Salary: ₹89,526.10
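The rules above can be sketched in plain Java. This is a minimal illustration of the documented math only; the real EmpProcessor is a Spring Batch `ItemProcessor`, and the class and constant names here are assumptions:

```java
// Minimal sketch of the documented payroll rules (10% tax, 5% bonus above ₹60,000).
// Class and constant names are illustrative, not copied from the project.
public class PayrollCalc {
    static final double TAX_RATE = 0.10;
    static final double BONUS_RATE = 0.05;
    static final double BONUS_THRESHOLD = 60_000;

    static double tax(double gross) {
        return gross * TAX_RATE;
    }

    static double bonus(double gross) {
        // Bonus applies only above the threshold
        return gross > BONUS_THRESHOLD ? gross * BONUS_RATE : 0.0;
    }

    static double netSalary(double gross) {
        return gross - tax(gross) + bonus(gross);
    }

    public static void main(String[] args) {
        double gross = 94_238;
        System.out.printf("tax=%.2f bonus=%.2f net=%.2f%n",
                tax(gross), bonus(gross), netSalary(gross));
        // tax=9423.80 bonus=4711.90 net=89526.10
    }
}
```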
Critical Settings to Verify:

```properties
# MySQL Connection - UPDATE THESE
spring.datasource.url=jdbc:mysql://localhost:3306/processed_payroll?createDatabaseIfNotExist=true
spring.datasource.username=root
spring.datasource.password=root
```

Location: `src/main/resources/db/migration/`
Required Migration Files:
- `V1__create_spring_batch_tables.sql` → Spring Batch metadata tables
Location: `src/main/resources/employee.csv`

Expected Format:

```csv
id,name,salary
1,Aaditya Kirdat,94238
2,Purva Jadhav,125475
```

File: BatchConfig.java
Key Parameters to Review:

```java
.chunk(500, transactionManager) // Process 500 records per transaction
.faultTolerant()                // Enable error handling
.retry(Exception.class)        // Retry on exceptions
.retryLimit(3)                 // Maximum 3 retry attempts
```

Tuning Recommendations:
- Chunk Size: 500 is optimal for 1,000 records; adjust for larger datasets
- Retry Limit: Increase for unreliable data sources
- Skip Policy: Add `.skip()` to skip malformed records instead of failing the job
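A hedged sketch of what the step definition might look like with a skip policy added (step, bean, and variable names are assumptions based on the snippets above, not the project's actual code):

```java
// Sketch only: skip up to 10 malformed CSV lines instead of failing the job.
// Assumes the Spring Batch 5 StepBuilder style used elsewhere in this document.
return new StepBuilder("payrollStep", jobRepository)
        .<Employee, ProcessedPayroll>chunk(500, transactionManager)
        .reader(reader)
        .processor(processor)
        .writer(writer)
        .faultTolerant()
        .retry(Exception.class)
        .retryLimit(3)
        .skip(FlatFileParseException.class) // skip unparseable CSV lines
        .skipLimit(10)                      // abort if more than 10 bad records
        .build();
```

Skipped records are recorded in the Spring Batch metadata tables, so they can be reviewed after the run.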
Important Note: The SQL statement in BatchConfig.writer() contains `ON DUPLICATE KEY UPDATE`, so re-running the job updates existing rows instead of failing on duplicate IDs.

Step 1: Read
- `FlatFileItemReader` reads `employee.csv`
- Skips the header row (`setLinesToSkip(1)`)
- Maps CSV columns to an `Employee` object

Step 2: Process
- `EmpProcessor` calculates tax and bonus
- Transforms `Employee` → `ProcessedPayroll`
- Applies business rules (10% tax, 5% bonus logic)

Step 3: Write
- `JdbcBatchItemWriter` inserts into the `processed_payroll` table
- Uses batch inserts for performance
- Updates existing records if a duplicate ID is found

Step 4: Listen
- `PayrollJobListener` tracks job start/completion
- Logs the record count after successful completion

Export Step:
- `JdbcCursorItemReader` queries all records from the database
- `FlatFileItemWriter` writes to `processed_payroll_export.csv`
- Adds the CSV header automatically
- Processes 10 records per chunk

Output Location: Project root directory
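The export writer described above could be wired roughly like this (a sketch using Spring Batch's `FlatFileItemWriterBuilder`; the bean name and field list are assumptions, the actual code lives in ExportToCSV):

```java
// Sketch only: writes ProcessedPayroll rows to CSV with a header line.
// Field names passed to .names() are assumed to match the entity's getters.
@Bean
public FlatFileItemWriter<ProcessedPayroll> csvWriter() {
    return new FlatFileItemWriterBuilder<ProcessedPayroll>()
            .name("payrollCsvWriter")
            .resource(new FileSystemResource("processed_payroll_export.csv"))
            .delimited()
            .names("id", "name", "salary", "tax", "bonus", "netSalary")
            .headerCallback(w -> w.write("id,name,salary,tax,bonus,net_salary"))
            .build();
}
```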
Access Swagger UI: http://localhost:8080/swagger-ui/index.html#/
GET /run-batch
Purpose: Manually trigger payroll processing job
Response: "Batch job has been initiated!"
What Happens: Reads CSV → Processes → Writes to DB
GET /export-to-csv
Purpose: Export processed payroll to CSV file
Response: "Export to CSV initiated! Check processed_payroll_export.csv file."
Output: Creates processed_payroll_export.csv in project root
Using cURL:

```shell
# Trigger batch processing
curl http://localhost:8080/run-batch

# Export results
curl http://localhost:8080/export-to-csv
```

Using Browser: Simply visit the URLs above or use Swagger UI for interactive testing.
The application includes automatic scheduling:

```java
@Scheduled(cron = "0 0 0 * * ?") // Runs daily at midnight
```

Cron Expression Breakdown:
- `0 0 0` - 00:00:00 (midnight)
- `* * ?` - every day of the month, every month, any day of the week

To Modify Schedule:
- Every hour: `0 0 * * * ?`
- Every 30 minutes: `0 */30 * * * ?`
- Weekdays only: `0 0 9 ? * MON-FRI` (9 AM, Monday-Friday)

To Disable Scheduling:
Remove `@EnableScheduling` from `PayrollBatchProcessorApplication.java`
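A hedged sketch of how the scheduled trigger might call the job (the method, field, and parameter names are assumptions; the project's actual scheduler class is not shown here):

```java
// Sketch only: assumes an injected JobLauncher and the payroll Job bean.
@Scheduled(cron = "0 0 0 * * ?") // second minute hour day-of-month month day-of-week
public void runDailyPayrollJob() throws Exception {
    JobParameters params = new JobParametersBuilder()
            // Unique parameter so Spring Batch starts a fresh JobInstance each day
            .addLong("runAt", System.currentTimeMillis())
            .toJobParameters();
    jobLauncher.run(payrollJob, params);
}
```

Without a unique parameter such as the timestamp, Spring Batch would refuse to re-run a completed job instance with identical parameters.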
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Employee ID (Primary Key) |
| name | VARCHAR(255) | Employee name |
| salary | DOUBLE | Gross salary |
| tax | DOUBLE | Calculated tax (10%) |
| bonus | DOUBLE | Performance bonus (5%) |
| net_salary | DOUBLE | Final take-home pay |
| created_at | TIMESTAMP | Record creation timestamp |
Flyway creates the following Spring Batch tables automatically:
- `BATCH_JOB_INSTANCE` - Job definitions
- `BATCH_JOB_EXECUTION` - Job execution history
- `BATCH_STEP_EXECUTION` - Step execution details
- `BATCH_JOB_EXECUTION_PARAMS` - Job parameters
- Sequence tables for ID generation
- MySQL server is running
- Database credentials in `application.properties` are correct
- Database user has sufficient permissions
- Flyway migrations are in the correct location
- `employee.csv` exists in `src/main/resources/`
- CSV format matches the expected structure
- You can generate raw `employee.csv` data using EmployeeCSV.java
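The EmployeeCSV.java generator itself is not shown in this document; a minimal stand-alone sketch that produces a file in the expected `id,name,salary` format could look like this (the class name and sample rows are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Minimal sketch of a generator for the expected employee.csv format.
// The project's real helper is EmployeeCSV.java; rows here are illustrative.
public class EmployeeCsvSketch {
    public static void writeSample(Path target) throws IOException {
        List<String> lines = List.of(
                "id,name,salary",          // header row, skipped by the reader
                "1,Aaditya Kirdat,94238",
                "2,Purva Jadhav,125475");
        Files.write(target, lines);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("employee", ".csv");
        writeSample(out);
        System.out.println(Files.readAllLines(out).size() + " lines written");
        // prints "3 lines written"
    }
}
```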
- `spring.batch.job.enabled=false` (prevents auto-run on startup)
- JPA DDL mode: `spring.jpa.hibernate.ddl-auto=update`
- Flyway is enabled: `spring.flyway.enabled=true`
- Port 8080 is available
- Java 21 or higher is installed
- Gradle dependencies are downloaded
1. Application Startup:
   - Flyway runs migrations
   - Spring Batch metadata tables created
   - Application tables created
   - Server starts on port 8080
2. Batch Job Trigger:
   - User calls the `/run-batch` endpoint
   - Reader loads employee data from the CSV
   - Processor calculates payroll for each employee
   - Writer inserts/updates the database
   - Job completes with status: COMPLETED
3. Export Trigger:
   - User calls the `/export-to-csv` endpoint
   - Reader queries the database
   - Writer creates a CSV file in the project root
   - File contains all processed records
1. Check Database:

```sql
SELECT COUNT(*) FROM processed_payroll;               -- Should return 7 (or your CSV row count)
SELECT * FROM processed_payroll WHERE salary > 60000; -- Should show bonus > 0
SELECT name, salary, tax, bonus, net_salary
FROM processed_payroll
ORDER BY net_salary DESC;
```

2. Check Export File:
- Look for `processed_payroll_export.csv` in the project root
- Open it and verify the calculations are correct
- Header should be: `id,name,salary,tax,bonus,net_salary`

3. Check Batch History:

```sql
SELECT job_execution_id, status, start_time, end_time
FROM BATCH_JOB_EXECUTION
ORDER BY job_execution_id DESC;
```

Edit EmpProcessor.java:
```java
private static final double TAX_RATE = 0.15;         // Change to 15%
private static final double BONUS_RATE = 0.08;       // Change to 8%
private static final double BONUS_THRESHOLD = 50000; // Change threshold
```

Edit BatchConfig.java:

```java
.chunk(1000, transactionManager) // Process 1000 records per transaction
```

Add to BatchConfig.reader():

```java
reader.setStrict(true);   // Fail on malformed records
reader.setLinesToSkip(1);
```

Edit ExportToCSV.csvWriter():

```java
writer.setResource(new FileSystemResource("payroll_" + LocalDate.now() + ".csv"));
```

- Chunk Size: 500 is balanced for 1,000 records; use 1000-5000 for larger datasets
- Database Connection Pool: HikariCP default settings are optimized
- Transaction Management: Each chunk is a separate transaction for rollback safety
- Retry Mechanism: 3 retries prevent transient failures from failing entire job