SourceGroove/spring-batch-file-layout

Spring Batch File Layouts

  • Note - don't use any version prior to v1.3.0

Overview

Spring Batch File Layouts aim to simplify the construction of File ItemReader/Writers for fixed width, delimited and Excel files.

This library sits on top of existing SpringBatch components and allows the developer to focus on defining a single file layout that can be used to both read and write flat files in a simplified manner.

If you are familiar with SpringBatch FlatFileItemReader/Writer, then this should be super obvious. It's really just a builder/facade abstraction on top of Spring's out-of-the-box components (standing on the shoulders of giants).

Rationale

The need for this library arose from building systems that rely heavily on importing and exporting large, structured data sets represented as flat files. The systems needed to be able to import and export data according to a ton of different file specifications that:

  • Often consisted of more than 100 fields
  • Usually contained multiple record types (that themselves had varying layouts, i.e. header, detail01, detail02, supplemental01, supplemental02, footer, trailer01, trailer02, etc.)
  • Changed periodically
  • Needed to be versioned over an effective date (e.g. files generated on or after 1/1/2020 use file spec 1.2, everything before uses 1.1)

Using this library, we were able to create two simple SpringBatch jobs - one for import and one for export - we just supply the layout and a physical file at runtime.
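The effective-date versioning described above can be sketched with a sorted map. This is only an illustration using the JDK: the LayoutVersions class and its methods are hypothetical and not part of this library, and in a real job the map value would presumably be a FileLayout rather than a version label.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of the library): maps the date on which a
// file spec takes effect to a version label.
final class LayoutVersions {
    private final TreeMap<LocalDate, String> versions = new TreeMap<>();

    LayoutVersions register(LocalDate effectiveFrom, String version) {
        versions.put(effectiveFrom, version);
        return this;
    }

    // Returns the version with the latest effective date on or before fileDate.
    String versionFor(LocalDate fileDate) {
        Map.Entry<LocalDate, String> entry = versions.floorEntry(fileDate);
        if (entry == null) {
            throw new IllegalArgumentException("No layout version effective on " + fileDate);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        LayoutVersions specs = new LayoutVersions()
                .register(LocalDate.MIN, "1.1")             // everything before 2020
                .register(LocalDate.of(2020, 1, 1), "1.2"); // on or after 1/1/2020
        System.out.println(specs.versionFor(LocalDate.of(2019, 12, 31))); // 1.1
        System.out.println(specs.versionFor(LocalDate.of(2020, 1, 1)));   // 1.2
    }
}
```

The job then only needs the file's generation date to pick the right layout at runtime.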

Key Classes

  • FixedWidthFileLayout - used to construct a fixed width file layout that can return configured FileLayoutItemReader/Writers

  • DelimitedFileLayout - used to construct a delimited file layout that can return configured FileLayoutItemReader/Writers

  • ExcelFileLayout - used to construct an Excel file layout that can return configured FileLayoutItemReader/Writers

  • FileLayoutItemReader - an interface that extends ResourceAwareItemReaderItemStream and InitializingBean - used to enforce FileLayout implementations to return appropriate reader implementations.

  • FileLayoutItemWriter - an interface that extends ResourceAwareItemWriterItemStream and InitializingBean - used to enforce FileLayout implementations to return appropriate writer implementations.

Notes

  • For fixed width files, prefer start/end (ranges) over width where possible (widths become unreliable once you start incorporating filler and custom formats)
  • The Excel item writer isn't implemented yet
  • Formats - Formatting is applied top down, and the most specific definition wins. For example, in the following scenario the dateOfBirth field ends up with the YYYYMM format, because the column level format overrides both the record level editor and the global editor.
FileLayout layout = new FixedWidthFileLayout()
    .editor(LocalDate.class, new LocalDateEditor("yyyyMMdd"))
    .record(MockUserRecord.class)
        .editor(LocalDate.class, new LocalDateEditor("yyyy-MM-dd"))
        .column("recordType", 1, 4)
        .column("username", 5, 10)
        .column("firstName", 11, 20)
        .column("lastName", 21, 30)
        .column("dateOfBirth", 31, 38, Format.YYYYMM)
    .build();

Because of this, the library will allow you to add editors in different ways:

  • Globally to all record types in the layout
  • Globally to all readers for all record types in the layout
  • Globally to all writers for all record types in the layout
  • At the record level for all readers and writers
  • At the record level for readers only
  • At the record level for writers only

An editor added at the record level overrides any editor defined at the global level for the same object type, but only for that record. For example:

FileLayout layout = new FixedWidthFileLayout()
    .readEditor(LocalDate.class, new LocalDateEditor("yyyyMMdd"))
    .record(MockUserRecord.class, "USER")
        .readEditor(LocalDate.class, new LocalDateEditor("MM/dd/yyyy"))
        .column("dateOfBirth", 1, 7)
    .record(MockRoleRecord.class, "Role")
        .column("effectiveDate", 1, 7)
    .build();

In this case, MockUserRecord#dateOfBirth will use 'MM/dd/yyyy' and MockRoleRecord#effectiveDate will use 'yyyyMMdd', for reading only.

Additionally, column level formats take precedence over all others.
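The precedence rules can be summarized as "column, then record, then global". The sketch below is only an illustration of that ordering in plain Java; FormatPrecedence and resolve are hypothetical names and do not reflect how the library resolves editors internally.

```java
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical illustration of the precedence order; not the library's code.
final class FormatPrecedence {

    // Returns the most specific pattern that is set: column, then record, then global.
    // A null argument means "nothing configured at that level".
    static String resolve(String columnPattern, String recordPattern, String globalPattern) {
        return Stream.of(columnPattern, recordPattern, globalPattern)
                .filter(Objects::nonNull)
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("No date pattern configured"));
    }

    public static void main(String[] args) {
        System.out.println(resolve("yyyyMM", "yyyy-MM-dd", "yyyyMMdd")); // yyyyMM
        System.out.println(resolve(null, "yyyy-MM-dd", "yyyyMMdd"));     // yyyy-MM-dd
        System.out.println(resolve(null, null, "yyyyMMdd"));             // yyyyMMdd
    }
}
```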

Consider the following example where MockRecord has three LocalDate properties - favoriteYear, birthMonth and effectiveDate:

FileLayout layout = new FixedWidthFileLayout()
    .readEditor(LocalDate.class, new LocalDateEditor("yyyy-MM-dd"))
    .record(MockRecord.class)
        .column("favoriteYear", 1, 4, Format.YYYY)    // 4 characters, e.g. 2019
        .column("birthMonth", 5, 10, Format.YYYYMM)   // 6 characters, e.g. 201909
        .column("effectiveDate", 11, 20)              // 10 characters, e.g. 2019-09-30
    .build();

The ItemReader will convert the values in the file to LocalDate objects, using defaults for the missing values. So, if the first 4 columns of the file are '2019', the favoriteYear property on the resulting bean will be January 1, 2019. The ItemWriter will write that bean's favoriteYear property to the first 4 columns of the file as '2019'.
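For readers curious how a partial value such as '2019' can become a full LocalDate, here is a minimal sketch using only java.time. It assumes the missing month and day default to 1, as in the example above; PartialDates and its method are hypothetical, and the library's own editors may be implemented differently.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical illustration; not the library's LocalDateEditor.
final class PartialDates {

    // Builds a parser that fills in month and day with 1 when the pattern omits them.
    static DateTimeFormatter defaultingParser(String pattern) {
        return new DateTimeFormatterBuilder()
                .appendPattern(pattern)
                .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
                .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
                .toFormatter();
    }

    public static void main(String[] args) {
        // Reading: partial text becomes a complete date.
        LocalDate favoriteYear = LocalDate.parse("2019", defaultingParser("yyyy"));
        LocalDate birthMonth = LocalDate.parse("201909", defaultingParser("yyyyMM"));
        System.out.println(favoriteYear); // 2019-01-01
        System.out.println(birthMonth);   // 2019-09-01

        // Writing: only the part named by the pattern is emitted.
        System.out.println(favoriteYear.format(DateTimeFormatter.ofPattern("yyyy"))); // 2019
    }
}
```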

Usage

Maven Dependency

<dependency>
  <groupId>com.github.sourcegroove</groupId>
  <artifactId>spring-batch-file-layout</artifactId>
  <version>1.3.10</version>
</dependency>

Gradle Dependency

implementation 'com.github.sourcegroove:spring-batch-file-layout:1.3.2'

Declarative file layouts are used to create ItemReaders and ItemWriters

All you need to do is define your delimited, fixed width or Excel file layout and ask it for an item reader or writer. By default, the implementations use Spring's BeanWrapperFieldSetMapper & BeanWrapperFieldExtractor to map your POJO's properties to either column 'Ranges' in a fixed width file or column order in a delimited file.

You can, of course, override this by changing the FieldSetMapper/FieldExtractor on the ItemReader/Writers if you need to.

FileLayout layout = new FixedWidthFileLayout()
    .record(MockUserRecord.class)
        .column("recordType", 1, 4)
        .column("username", 5, 10)
        .column("firstName", 11, 20)
        .column("lastName", 21, 30)
        .column("dateOfBirth", 31, 38)
    .build();

LayoutItemReader<MockUserRecord> reader = layout.getItemReader();
LayoutItemWriter<MockUserRecord> writer = layout.getItemWriter();

File Layouts

There are three implementations of the FileLayout interface - FixedWidthFileLayout, DelimitedFileLayout & ExcelFileLayout.
Each of these contains a collection of 'record (or sheet) layouts' that define the records in the file.

Defining layouts in Java is simple, but they can also be defined dynamically at runtime using persisted data.

Fixed Width Layouts

Fixed width file layouts define columns by the position they appear in the file. If the positions defined in the layout leave a 'gap' in between defined columns, filler (empty spaces) will be added to the line to fill the gap.
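To make the gap-filling behavior concrete, the sketch below places values at 1-based, inclusive start/end positions in a line that is pre-filled with spaces, so any position not covered by a column stays as filler. FixedWidthLine is a hypothetical class written for this illustration and is not how the library builds lines internally.

```java
import java.util.Arrays;

// Hypothetical illustration of positional columns and filler; not library code.
final class FixedWidthLine {
    private final char[] line;

    FixedWidthLine(int length) {
        line = new char[length];
        Arrays.fill(line, ' '); // positions without a column remain filler spaces
    }

    // Writes value into the 1-based, inclusive range [start, end],
    // truncating the value if it is longer than the range.
    FixedWidthLine put(int start, int end, String value) {
        int width = end - start + 1;
        int count = Math.min(width, value.length());
        value.getChars(0, count, line, start - 1);
        return this;
    }

    @Override
    public String toString() {
        return new String(line);
    }

    public static void main(String[] args) {
        // Columns 5-10 are not defined, so they are left as six spaces.
        String text = new FixedWidthLine(20)
                .put(1, 4, "USER")
                .put(11, 20, "jsmith")
                .toString();
        System.out.println("[" + text + "]"); // [USER      jsmith    ]
    }
}
```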

You can customize the way columns are serialized to text when being written by specifying a FixedWidthFormatBuilder.Format value in your column definition.

For example:

    ...
    .column("dateOfBirth", 31, 38, FixedWidthFormatBuilder.Format.YYYYMM) // will format the date to YYYYMM
    ...

Current format options:

| Enum Value | Example       | Description                                                         |
|------------|---------------|---------------------------------------------------------------------|
| STRING     | "TEXT______"  | left aligned text                                                   |
| INTEGER    | "0000000123"  | left padded with 0's                                                |
| ZD         | "123_______"  | left aligned number                                                 |
| DECIMAL    | "000000123.5" | right aligned decimal with 2 digit decimal                          |
| YYYYMMDD   | "20190930"    | date formatted to YYYYMMDD                                          |
| YYYYMM     | "201909"      | date formatted to YYYYMM                                            |
| YYYY       | "2019"        | date formatted to YYYY                                              |
| CONSTANT   | "__________"  | filler space (mostly used internally to fill gaps between columns)  |
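As a rough guide to what the examples in the table mean (underscores stand for spaces), the same output can be produced with standard JDK formatting. This is only a sketch of the padding behavior; ColumnPadding and its methods are hypothetical and the library's Format enum may be implemented differently.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration of the padding shown in the table; not library code.
final class ColumnPadding {

    // STRING / ZD style: left aligned, padded on the right with spaces.
    static String leftAligned(String value, int width) {
        return String.format("%-" + width + "s", value);
    }

    // INTEGER style: right aligned, padded on the left with zeros.
    static String zeroPadded(long value, int width) {
        return String.format("%0" + width + "d", value);
    }

    public static void main(String[] args) {
        System.out.println("[" + leftAligned("TEXT", 10) + "]"); // [TEXT      ]
        System.out.println("[" + zeroPadded(123, 10) + "]");     // [0000000123]

        LocalDate date = LocalDate.of(2019, 9, 30);
        System.out.println(date.format(DateTimeFormatter.ofPattern("yyyyMMdd"))); // 20190930
        System.out.println(date.format(DateTimeFormatter.ofPattern("yyyyMM")));   // 201909
    }
}
```

Note that neither helper truncates a value that is wider than the column, which a real fixed width writer would have to handle.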

Fixed width file layout simple

FileLayout layout = new FixedWidthFileLayout()
    .record(MockUserRecord.class)
        .column("recordType", 1, 4)
        .column("username", 5, 10)
        .column("firstName", 11, 20)
        .column("lastName", 21, 30)
        .column("dateOfBirth", 31, 38)
    .build();

LayoutItemReader<MockUserRecord> reader = layout.getItemReader();
LayoutItemWriter<MockUserRecord> writer = layout.getItemWriter();

Fixed width file layout with custom property editors, multiple record types and custom column formats

FileLayout layout = new FixedWidthFileLayout()
    .linesToSkip(1)
    .record(MockUserRecord.class)
        .editor(LocalDate.class, new LocalDateEditor("yyyyMMdd"))
        .prefix("USER*")
        .column("recordType", 1, 4)
        .column("username", 5, 10)
        .column("firstName", 11, 20)
        .column("lastName", 21, 30)
        .column("dateOfBirth", 31, 38, FixedWidthFormatBuilder.Format.YYYYMM)
    .record(MockRoleRecord.class)
        .prefix("ROLE*")
        .column("recordType", 1, 4)
        .column("roleKey", 5, 8)
        .column("role", 9, 20)
    .build();

LayoutItemReader reader = layout.getItemReader();
LayoutItemWriter writer = layout.getItemWriter();

Fixed width file layout with header and footer callbacks

Header and footer write callbacks are implemented by defining a header and/or footer record and then 'enabling' the callback at runtime (so you can provide it with data).

FileLayout layout = new FixedWidthFileLayout()
    .header(MockHeaderRecord.class, "HEAD")
        .column("recordType",  1, 4)
        .column("recordCount", 5, 100, Format.INTEGER)
    .record(MockUserRecord.class, "USER")
        .editor(LocalDate.class, new LocalDateEditor("yyyyMMdd"))
        .column("recordType", 1, 4)
        .column("username", 5, 10)
        .column("firstName", 11, 20)
        .column("lastName", 21, 30)
        .column("dateOfBirth", 31, 38, FixedWidthFormatBuilder.Format.YYYYMM)
    .build();

... in your job ...

@Bean
@StepScope
public LayoutItemWriter<MockUserRecord> itemWriter() {
    // resource and recordCount are supplied at runtime (e.g. from job parameters)
    LayoutItemWriter<MockUserRecord> writer = getLayout().getItemWriter();
    writer.setResource(resource);
    writer.enableHeaderCallback(recordCount);
    return writer;
}

The footer callback works the same way.

Delimited Layouts

Delimited layouts need to define the columns in the order they appear in the file. The delimiter and qualifier can be defined.
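For anyone unfamiliar with the terms: the delimiter separates fields, and the qualifier wraps a field whose text may itself contain the delimiter. The sketch below shows the idea in plain Java. QualifiedSplitter is a hypothetical class for illustration only (the library delegates to Spring Batch's tokenizers), and it deliberately ignores escaped qualifiers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of delimiter/qualifier handling; not library code.
final class QualifiedSplitter {

    // Splits a line on the delimiter, treating text between qualifier
    // characters as literal. Escaped qualifiers are not supported.
    static List<String> split(String line, char delimiter, char qualifier) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean qualified = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == qualifier) {
                qualified = !qualified;        // toggle and drop the qualifier itself
            } else if (c == delimiter && !qualified) {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());         // last field
        return fields;
    }

    public static void main(String[] args) {
        // The '|' inside ~...~ is part of the value, not a field separator.
        System.out.println(split("jsmith|~Smith|Jones~|19800115", '|', '~'));
        // [jsmith, Smith|Jones, 19800115]
    }
}
```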

Delimited file layout

FileLayout layout = new DelimitedFileLayout()
        .linesToSkip(1)
        .record(MockUserRecord.class)
            .editor(LocalDate.class, new LocalDateEditor("yyyyMMdd"))
            .column("username")
            .column("firstName")
            .column("lastName")
            .column("dateOfBirth")
        .build();

LayoutItemReader<MockUserRecord> reader = layout.getItemReader();
LayoutItemWriter<MockUserRecord> writer = layout.getItemWriter();

Delimited file layout with custom qualifier and delimiter

FileLayout layout = new DelimitedFileLayout()
        .linesToSkip(1)
        .qualifier('~')
        .delimiter("|")
        .record(MockUserRecord.class)
            .editor(LocalDate.class, new LocalDateEditor("yyyyMMdd"))
            .column("username")
            .column("firstName")
            .column("lastName")
            .column("dateOfBirth")
        .build();

LayoutItemReader<MockUserRecord> reader = layout.getItemReader();
LayoutItemWriter<MockUserRecord> writer = layout.getItemWriter();

Excel Layouts

By default, the Excel reader reads all sheets in the workbook and uses the StreamingExcelItemReader implementation. There is also a SimpleExcelItemReader, which loads the entire workbook into memory, but I can't think of a valid reason to use it over the streaming implementation at this point.

Excel file layout simple

FileLayout layout = new ExcelFileLayout()
    .linesToSkip(1)
    .sheet(MockUserRecord.class)
        .column("username")
        .column("firstName")
        .column("lastName")
        .column("dateOfBirth")
        .editor(LocalDate.class, new LocalDateEditor())
    .layout();

LayoutItemReader<MockUserRecord> reader = layout.getItemReader();
LayoutItemWriter<MockUserRecord> writer = layout.getItemWriter();

Excel file layout one sheet

FileLayout layout = new ExcelFileLayout()
    .linesToSkip(1)
    .sheet(MockUserRecord.class)
        .sheetIndex(1) // reads the second sheet in the workbook
        .column("username")
        .column("firstName")
        .column("lastName")
        .column("dateOfBirth")
        .editor(LocalDate.class, new LocalDateEditor())
    .layout();

LayoutItemReader<MockUserRecord> reader = layout.getItemReader();
LayoutItemWriter<MockUserRecord> writer = layout.getItemWriter();
