Vin-it-9/Spring_Batch_Processor

Payroll Batch Processor

A Spring Boot application that processes employee payroll data using Spring Batch framework. The application reads employee information from CSV files, calculates tax and bonuses, stores processed payroll data in a MySQL database, and provides REST APIs to trigger batch jobs and export results.

Project Overview

This application demonstrates enterprise-grade batch processing capabilities using Spring Batch to:

  • Read employee data from CSV files (10,000+ records) efficiently
  • Calculate tax (10%) and performance bonuses (5% for salaries > ₹60,000)
  • Store processed payroll in MySQL database with Flyway migrations
  • Export processed payroll data back to CSV format
  • Schedule automatic daily batch processing
  • Provide REST endpoints for manual job triggering

Architecture

Batch Processing Flow

CSV File (employee.csv) 
    ↓
FlatFileItemReader (reads 500 records/chunk)
    ↓
EmpProcessor (calculates tax, bonus, net salary)
    ↓
JdbcBatchItemWriter (writes to processed_payroll table)
    ↓
MySQL Database
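
The flow above maps onto one chunk-oriented step wrapped in a job. A minimal sketch of the job wiring, assuming Spring Batch 5 builder APIs and the bean names used in this README (the actual BatchConfig.java may differ):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.context.annotation.Bean;

// Sketch only: ties the step (reader → processor → writer) into a named job.
@Bean
public Job payrollJob(JobRepository jobRepository, Step payrollStep) {
    return new JobBuilder("payrollJob", jobRepository)
            .start(payrollStep)  // the single chunk-oriented step shown in the diagram
            .build();
}
```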

Technology Stack

  • Spring Boot - Application framework
  • Spring Batch - Batch processing framework
  • Spring Data JPA - Database operations
  • Flyway - Database migration management
  • MySQL - Production database
  • Swagger/OpenAPI - API documentation
  • Spring Scheduler - Job scheduling

Business Logic

Payroll Calculation Rules

The EmpProcessor applies the following calculations:

  • Tax Rate: 10% of gross salary
  • Bonus Threshold: ₹60,000
  • Bonus Rate: 5% for salaries exceeding threshold
  • Net Salary Formula: Gross Salary - Tax + Bonus

Example:

Employee: Vinit Shinde
Gross Salary: ₹94,238
Tax (10%): ₹9,423.80
Bonus (5%): ₹4,711.90
Net Salary: ₹89,526.10
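
The calculation above can be sketched in plain Java; the constant and method names here are illustrative, not necessarily those in EmpProcessor.java:

```java
// Minimal sketch of the payroll rules: 10% tax, 5% bonus above ₹60,000.
public class PayrollRules {
    static final double TAX_RATE = 0.10;
    static final double BONUS_RATE = 0.05;
    static final double BONUS_THRESHOLD = 60_000;

    public static double tax(double gross) {
        return gross * TAX_RATE;
    }

    public static double bonus(double gross) {
        // Bonus applies only when gross salary exceeds the threshold.
        return gross > BONUS_THRESHOLD ? gross * BONUS_RATE : 0.0;
    }

    public static double netSalary(double gross) {
        return gross - tax(gross) + bonus(gross);
    }

    public static void main(String[] args) {
        System.out.printf("Net salary for 94238: %.2f%n", netSalary(94_238));
    }
}
```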

Configuration Checklist

1. Database Configuration (application.properties)

Critical Settings to Verify:

# MySQL Connection - UPDATE THESE
spring.datasource.url=jdbc:mysql://localhost:3306/processed_payroll?createDatabaseIfNotExist=true
spring.datasource.username=root
spring.datasource.password=root

2. Flyway Migration Files

Location: src/main/resources/db/migration/

Required Migration Files:

V1__create_spring_batch_tables.sql  → Spring Batch metadata tables

3. Input CSV File

Location: src/main/resources/employee.csv

Expected Format:

id,name,salary
1,Aaditya Kirdat,94238
2,Purva Jadhav,125475

4. Batch Configuration

File: BatchConfig.java

Key Parameters to Review:

.chunk(500, transactionManager)  // Process 500 records per transaction
.faultTolerant()                 // Enable error handling
.retry(Exception.class)          // Retry on exceptions
.retryLimit(3)                   // Maximum 3 retry attempts

Tuning Recommendations:

  • Chunk Size: 500 is a sensible default for datasets in the low thousands; increase it for larger files
  • Retry Limit: Increase for unreliable data sources
  • Skip Policy: Add .skip() to skip malformed records instead of failing
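
Putting those parameters together, the step definition might look like the following sketch (Spring Batch 5 builders assumed; the .skip() lines are the suggested addition, not necessarily current behaviour):

```java
@Bean
public Step payrollStep(JobRepository jobRepository,
                        PlatformTransactionManager transactionManager,
                        FlatFileItemReader<Employee> reader,
                        EmpProcessor processor,
                        JdbcBatchItemWriter<ProcessedPayroll> writer) {
    return new StepBuilder("payrollStep", jobRepository)
            .<Employee, ProcessedPayroll>chunk(500, transactionManager)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant()
            .retry(Exception.class)
            .retryLimit(3)
            // Suggested: skip malformed CSV lines instead of failing the whole job.
            .skip(FlatFileParseException.class)
            .skipLimit(10)
            .build();
}
```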

5. Writer Configuration

Important Note: The SQL statement in BatchConfig.writer() uses MySQL's upsert syntax:

ON DUPLICATE KEY UPDATE

This means re-running the job updates existing rows rather than failing on duplicate primary keys.

How It Works

Job 1: Payroll Processing (payrollJob)

Step 1: Read

  • FlatFileItemReader reads employee.csv
  • Skips header row (setLinesToSkip(1))
  • Maps CSV columns to Employee object
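
The column mapping amounts to splitting each line on commas and binding the fields. A plain-Java sketch of what happens per record (the Employee shape is inferred from the CSV format above; a FlatFileItemReader typically does this via a tokenizer and field-set mapper):

```java
// Plain-Java sketch of the reader's per-line mapping.
// Assumes the id,name,salary layout shown above and no quoted fields.
public class EmployeeLineMapper {

    public record Employee(long id, String name, double salary) {}

    public static Employee map(String csvLine) {
        String[] cols = csvLine.split(",", 3);
        return new Employee(
                Long.parseLong(cols[0].trim()),
                cols[1].trim(),
                Double.parseDouble(cols[2].trim()));
    }

    public static void main(String[] args) {
        System.out.println(map("1,Aaditya Kirdat,94238"));
    }
}
```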

Step 2: Process

  • EmpProcessor calculates tax and bonus
  • Transforms Employee → ProcessedPayroll
  • Applies business rules (10% tax, 5% bonus logic)

Step 3: Write

  • JdbcBatchItemWriter inserts into processed_payroll table
  • Uses batch insert for performance
  • Updates existing records if duplicate ID found
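
A sketch of how such an upsert writer is typically built (SQL column names assume the processed_payroll schema shown later in this README; the actual bean in BatchConfig.java may differ):

```java
@Bean
public JdbcBatchItemWriter<ProcessedPayroll> writer(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<ProcessedPayroll>()
            .dataSource(dataSource)
            // MySQL upsert: insert new rows, update rows whose primary key already exists.
            .sql("""
                 INSERT INTO processed_payroll (id, name, salary, tax, bonus, net_salary)
                 VALUES (:id, :name, :salary, :tax, :bonus, :netSalary)
                 ON DUPLICATE KEY UPDATE
                   name = VALUES(name), salary = VALUES(salary), tax = VALUES(tax),
                   bonus = VALUES(bonus), net_salary = VALUES(net_salary)
                 """)
            .beanMapped()  // bind :name placeholders to ProcessedPayroll getters
            .build();
}
```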

Step 4: Listen

  • PayrollJobListener tracks job start/completion
  • Logs record count after successful completion

Job 2: Export to CSV (exportJob)

Export Step:

  • JdbcCursorItemReader queries all records from database
  • FlatFileItemWriter writes to processed_payroll_export.csv
  • Adds CSV header automatically
  • Processes 10 records per chunk

Output Location: Project root directory
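
A sketch of the CSV-writing side of the export (field names are assumptions based on the schema; compare with the actual ExportToCSV.csvWriter()):

```java
@Bean
public FlatFileItemWriter<ProcessedPayroll> csvWriter() {
    return new FlatFileItemWriterBuilder<ProcessedPayroll>()
            .name("payrollCsvWriter")
            .resource(new FileSystemResource("processed_payroll_export.csv"))
            // Header written once at the top of the file.
            .headerCallback(w -> w.write("id,name,salary,tax,bonus,net_salary"))
            .delimited()
            .names("id", "name", "salary", "tax", "bonus", "netSalary")
            .build();
}
```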

REST API Endpoints

Available Endpoints

Access Swagger UI: http://localhost:8080/swagger-ui/index.html#/

1. Run Batch Processing

GET /run-batch

Purpose: Manually trigger payroll processing job
Response: "Batch job has been initiated!"
What Happens: Reads CSV → Processes → Writes to DB

2. Export to CSV

GET /export-to-csv

Purpose: Export processed payroll to CSV file
Response: "Export to CSV initiated! Check processed_payroll_export.csv file."
Output: Creates processed_payroll_export.csv in project root
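
Both endpoints boil down to handing a Job to the JobLauncher with fresh parameters. A sketch of the trigger endpoint (class and parameter names are assumptions; the repo's controller may differ):

```java
@RestController
public class BatchController {

    private final JobLauncher jobLauncher;
    private final Job payrollJob;

    public BatchController(JobLauncher jobLauncher, Job payrollJob) {
        this.jobLauncher = jobLauncher;
        this.payrollJob = payrollJob;
    }

    @GetMapping("/run-batch")
    public String runBatch() throws Exception {
        // A unique timestamp parameter lets the same job instance be re-run.
        jobLauncher.run(payrollJob, new JobParametersBuilder()
                .addLong("startedAt", System.currentTimeMillis())
                .toJobParameters());
        return "Batch job has been initiated!";
    }
}
```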

Testing the APIs

Using cURL:

# Trigger batch processing
curl http://localhost:8080/run-batch

# Export results
curl http://localhost:8080/export-to-csv

Using Browser: Simply visit the URLs above or use Swagger UI for interactive testing.

Scheduled Execution

The application includes automatic scheduling:

@Scheduled(cron = "0 0 0 * * ?")  // Runs daily at midnight

Cron Expression Breakdown:

  • 0 0 0 - second 0, minute 0, hour 0 → 00:00:00 (midnight)
  • * * ? - every day of the month, every month; ? leaves the day of week unspecified

To Modify Schedule:

  • Every hour: 0 0 * * * ?
  • Every 30 minutes: 0 */30 * * * ?
  • Weekdays only: 0 0 9 ? * MON-FRI (9 AM, Monday-Friday)

To Disable Scheduling: Remove @EnableScheduling from PayrollBatchProcessorApplication.java
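
The scheduled trigger is conceptually the same launch call, driven by the cron expression (a sketch; the method name is an assumption):

```java
// Sketch: the same JobLauncher call, fired by the cron trigger.
@Scheduled(cron = "0 0 0 * * ?")  // daily at midnight
public void runDailyPayroll() throws Exception {
    jobLauncher.run(payrollJob, new JobParametersBuilder()
            .addLong("startedAt", System.currentTimeMillis())
            .toJobParameters());
}
```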

Database Schema

Table: processed_payroll

Column       Type          Description
id           BIGINT        Employee ID (primary key)
name         VARCHAR(255)  Employee name
salary       DOUBLE        Gross salary
tax          DOUBLE        Calculated tax (10%)
bonus        DOUBLE        Performance bonus (5%)
net_salary   DOUBLE        Final take-home pay
created_at   TIMESTAMP     Record creation timestamp
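
A Flyway migration creating this table might look like the following sketch (the actual migration file in the repo may differ):

```sql
CREATE TABLE IF NOT EXISTS processed_payroll (
    id         BIGINT       NOT NULL PRIMARY KEY,
    name       VARCHAR(255) NOT NULL,
    salary     DOUBLE       NOT NULL,
    tax        DOUBLE       NOT NULL,
    bonus      DOUBLE       NOT NULL,
    net_salary DOUBLE       NOT NULL,
    created_at TIMESTAMP    DEFAULT CURRENT_TIMESTAMP
);
```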

Spring Batch Metadata Tables

Flyway creates the following Spring Batch tables automatically:

  • BATCH_JOB_INSTANCE - Job definitions
  • BATCH_JOB_EXECUTION - Job execution history
  • BATCH_STEP_EXECUTION - Step execution details
  • BATCH_JOB_EXECUTION_PARAMS - Job parameters
  • Sequence tables for ID generation

What to Check Before Running

Pre-Deployment Checklist

Database

  • MySQL server is running
  • Database credentials in application.properties are correct
  • Database user has sufficient permissions
  • Flyway migrations are in correct location

Files

  • employee.csv exists in src/main/resources/
  • CSV format matches expected structure
  • A raw-data employee.csv can be generated using EmployeeCSV.java

Configuration

  • spring.batch.job.enabled=false (prevents auto-run on startup)
  • JPA DDL mode: spring.jpa.hibernate.ddl-auto=update
  • Flyway is enabled: spring.flyway.enabled=true

Application

  • Port 8080 is available
  • Java 21 or higher is installed
  • Gradle dependencies are downloaded

Expected Behavior

Successful Execution

  1. Application Startup:

    • Flyway runs migrations
    • Spring Batch metadata tables created
    • Application tables created
    • Server starts on port 8080
  2. Batch Job Trigger:

    • User calls /run-batch endpoint
    • Processor calculates for each employee
    • Writer inserts/updates database
    • Job completes with status: COMPLETED
  3. Export Trigger:

    • User calls /export-to-csv endpoint
    • Reader queries database
    • Writer creates CSV file in project root
    • File contains all processed records

Verification Steps

1. Check Database:

SELECT COUNT(*) FROM processed_payroll;  -- Should match your CSV row count

SELECT * FROM processed_payroll WHERE salary > 60000;  -- Should show bonus > 0

SELECT name, salary, tax, bonus, net_salary 
FROM processed_payroll 
ORDER BY net_salary DESC;

2. Check Export File:

  • Look for processed_payroll_export.csv in project root
  • Open and verify calculations are correct
  • Header should be: id,name,salary,tax,bonus,net_salary

3. Check Batch History:

SELECT job_execution_id, status, start_time, end_time 
FROM BATCH_JOB_EXECUTION 
ORDER BY job_execution_id DESC;

Customization Options

Modify Tax and Bonus Rates

Edit EmpProcessor.java:

private static final double TAX_RATE = 0.15;      // Change to 15%
private static final double BONUS_RATE = 0.08;     // Change to 8%
private static final double BONUS_THRESHOLD = 50000;  // Change threshold

Change Chunk Size

Edit BatchConfig.java:

.chunk(1000, transactionManager)  // Process 1000 records per transaction

Add CSV Validation

Add to BatchConfig.reader():

reader.setStrict(true);  // Fail on malformed records
reader.setLinesToSkip(1);

Custom Export Filename

Edit ExportToCSV.csvWriter():

writer.setResource(new FileSystemResource("payroll_" + LocalDate.now() + ".csv"));

Performance Considerations

  • Chunk Size: 500 is a balanced default; use 1000-5000 for larger datasets
  • Database Connection Pool: HikariCP default settings are optimized
  • Transaction Management: Each chunk is a separate transaction for rollback safety
  • Retry Mechanism: 3 retries prevent transient failures from failing entire job
