Use generic APIs for more type safety #388

Closed
fmbenhassine opened this issue Aug 16, 2020 · 1 comment

Comments

@fmbenhassine
Member

The Record<P> API has been generic from its inception, since a record's payload can be of any type. However, as of v6.1, most (if not all) of the APIs that use Record<P>, such as RecordReader and RecordWriter, are not generic:

public interface RecordReader {
   Record readRecord() throws Exception; // using raw type
   // open/close methods omitted
}

Using raw types in those key APIs has a few drawbacks:

  • It is not type safe
  • It requires additional casts in custom implementations of processors, mappers, writers, etc.
  • It generates "raw types usage" warnings
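To make these drawbacks concrete, here is a minimal, self-contained sketch. `SimpleRecord`, `RawReader` and `TypedReader` are hypothetical stand-ins written for this illustration; they are not the library's actual classes. The sketch only shows how a raw return type lets a mismatched payload through the compiler, while a generic one does not.

```java
public class RawTypeDemo {

    // Simplified stand-in for a generic record (hypothetical, not the library's Record<P>).
    static final class SimpleRecord<P> {
        private final P payload;
        SimpleRecord(P payload) { this.payload = payload; }
        P getPayload() { return payload; }
    }

    // Reader with a raw return type, mirroring the style of the v6 signature above.
    @SuppressWarnings("rawtypes")
    interface RawReader {
        SimpleRecord readRecord();
    }

    // Generic counterpart: the payload type is part of the contract.
    interface TypedReader<P> {
        SimpleRecord<P> readRecord();
    }

    // Returns true if reading through the raw reader ends in a ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static boolean rawReaderFailsAtRuntime() {
        RawReader reader = () -> new SimpleRecord<String>("foo");
        // Unchecked conversion: compiles (with a warning) although the payload is a String.
        SimpleRecord<Integer> record = reader.readRecord();
        try {
            Integer payload = record.getPayload(); // implicit cast to Integer fails here
            return payload == null;                // not reached for a String payload
        } catch (ClassCastException e) {
            return true;
        }
    }

    // With the generic reader, the payload type is checked at compile time.
    public static String typedReaderPayload() {
        TypedReader<String> reader = () -> new SimpleRecord<>("foo");
        // SimpleRecord<Integer> wrong = reader.readRecord(); // would not compile
        return reader.readRecord().getPayload();
    }

    public static void main(String[] args) {
        System.out.println("raw reader fails at runtime: " + rawReaderFailsAtRuntime());
        System.out.println("typed reader payload: " + typedReaderPayload());
    }
}
```

The only difference between the two readers is the type parameter on the return type, yet it is enough to turn a runtime failure into a compile-time error.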

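The following snippet is a valid job definition in v6: it compiles, but it fails at runtime with a ClassCastException because the reader produces String payloads while the processor expects Integer payloads: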

Job job = new JobBuilder()
        .reader(new IterableRecordReader(Arrays.asList("foo", "bar")))
        .processor(new RecordProcessor<Record<Integer>, Record<Integer>>() {
            @Override
            public Record<Integer> processRecord(Record<Integer> record) throws Exception {
                return new GenericRecord<>(record.getHeader(), record.getPayload() + 1); // ClassCastException: the payload is a String at runtime
            }
        })
        .writer(new StandardOutputRecordWriter())
        .build();

JobExecutor jobExecutor = new JobExecutor();
JobReport report = jobExecutor.execute(job);
jobExecutor.shutdown();

All APIs should be updated to use generics for more type safety. This applies especially to JobBuilder, which should enforce that record types are as expected and coherent between the reader, the processor and the writer. With that change, the previous job definition should not compile until the input/output types are specified explicitly. A type-safe version of such a job definition would look something like this:

Job job = new JobBuilder<Integer, Integer>() // Explicit specification of the input/output types
        .reader(new IterableRecordReader<>(Arrays.asList(1, 2)))
        .processor(new RecordProcessor<Integer, Integer>() {
            @Override
            public Record<Integer> processRecord(Record<Integer> record) {
                return new GenericRecord<>(record.getHeader(), record.getPayload() + 1);
            }
        })
        .writer(new StandardOutputRecordWriter<>())
        .build();
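
The way a generic builder can enforce coherence between the three stages can be sketched in isolation. The `PipelineBuilder` class below is a hypothetical, simplified stand-in invented for this illustration (it uses `Iterable`, `Function` and `Consumer` from the JDK instead of the library's reader, processor and writer types), so it should be read as a sketch of the idea rather than as the actual JobBuilder design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;
import java.util.function.Function;

public class TypedPipelineDemo {

    // Hypothetical builder: I is the payload type read, O is the payload type written.
    static final class PipelineBuilder<I, O> {
        private Iterable<I> source;
        private Function<I, O> processor;
        private Consumer<O> sink;

        PipelineBuilder<I, O> reader(Iterable<I> source) {
            this.source = source;
            return this;
        }

        PipelineBuilder<I, O> processor(Function<I, O> processor) {
            this.processor = processor;
            return this;
        }

        PipelineBuilder<I, O> writer(Consumer<O> sink) {
            this.sink = sink;
            return this;
        }

        // Wires the three stages together; their types already agree at this point.
        Runnable build() {
            Objects.requireNonNull(source, "reader is required");
            Objects.requireNonNull(processor, "processor is required");
            Objects.requireNonNull(sink, "writer is required");
            return () -> {
                for (I item : source) {
                    sink.accept(processor.apply(item));
                }
            };
        }
    }

    // Builds and runs an Integer-to-Integer job, collecting what the writer receives.
    public static List<Integer> runIncrementJob() {
        List<Integer> written = new ArrayList<>();
        Runnable job = new PipelineBuilder<Integer, Integer>()
                .reader(Arrays.asList(1, 2))
                // .reader(Arrays.asList("foo", "bar")) // would not compile: not an Iterable<Integer>
                .processor(n -> n + 1)
                .writer(written::add)
                .build();
        job.run();
        return written;
    }

    public static void main(String[] args) {
        System.out.println(runIncrementJob());
    }
}
```

Because every stage is declared in terms of the same `I` and `O`, a reader of strings can no longer be combined with a processor of integers; the mismatch is reported by the compiler instead of surfacing as a ClassCastException.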
@fmbenhassine
Member Author

The following APIs have been updated to implement this enhancement:

  • RecordReader (and its listener): 7ed62a5
  • Batch (and its listener): ba60977
  • RecordWriter (and its listener): d437aa5
  • RecordProcessor (and its listener / sub interfaces): 279ed29
  • Predicate: 063fcc5
  • JobBuilder: 6929936
