CSV Realtime Decoder #8658
Codecov Report
@@              Coverage Diff               @@
##             master    #8658       +/-    ##
=============================================
+ Coverage     29.36%   66.13%   +36.76%
- Complexity        0     4385     +4385
=============================================
  Files          1691     1277      -414
  Lines         89308    64445    -24863
  Branches      13530    10018     -3512
=============================================
+ Hits          26225    42620    +16395
+ Misses        60671    18806    -41865
- Partials       2412     3019      +607
public class CSVMessageDecoder implements StreamMessageDecoder<byte[]> {

  private static final String CONFIG_FILE_FORMAT = "csv.fileFmt";
Since the properties are already scoped to the decoder, we may remove the csv. prefix for simplicity and consistency with other decoders. Also, suggest using the full name for each key, e.g. fileFormat, header, delimiter, etc.
Updated. My thought process was that ...decoder.props.csv.delimiter is more idiomatic than ...decoder.props.delimiter, as the former tells users that this property configures the delimiter of the CSV record. But I think the prefix is not really needed: the user already configures the decoder class in the config, so they would know what these props are about even without it. Thanks!
    format = CSVFormat.TDF;
    break;
  default:
    format = CSVFormat.DEFAULT;
Log a warning here stating that we cannot recognize the format and are falling back to the default.
Done.
String incomingRecord = "Alice;18;F";
GenericRow destination = new GenericRow();
messageDecoder.decode(incomingRecord.getBytes(StandardCharsets.UTF_8), destination);
Assert.assertNotNull(destination.getValue("name"));
(minor) We may statically import these Assert methods in tests.
Done.
@Jackie-Jiang Requesting re-review, thanks :)
@Jackie-Jiang @npawar Just a reminder, thanks :)
LGTM with some minor comments. Thanks for contributing the feature!
    format = CSVFormat.TDF;
    break;
  default:
    LOGGER.info("Could not recognise the configured CSV file format: {}, falling back to DEFAULT format",
Log a warning (not info) for an unrecognized format.
Oh, missed it. Done.
if (csvFormat == null) {
  format = CSVFormat.DEFAULT;
} else {
  switch (csvFormat.toUpperCase()) {
Add an explicit case for "DEFAULT".
Done
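The two review points on the format switch (an explicit "DEFAULT" case, and a warning rather than an info log when the configured value is not recognized) could be sketched roughly as below. This is only an illustration, not the code merged in the PR: the class name CsvFormatResolver and the Format enum are hypothetical stand-ins for the Commons CSV constants so the snippet runs without extra dependencies, java.util.logging is used in place of whatever logger the project uses, and only the cases visible in the diff are listed.

```java
import java.util.Locale;
import java.util.logging.Logger;

// Hypothetical helper, not part of the PR: resolves a configured format name.
final class CsvFormatResolver {
  private static final Logger LOGGER = Logger.getLogger(CsvFormatResolver.class.getName());

  // Stand-in for the predefined CSVFormat constants (assumption for illustration).
  enum Format { DEFAULT, TDF }

  static Format resolve(String configured) {
    if (configured == null) {
      // Nothing configured: silently use the default format.
      return Format.DEFAULT;
    }
    switch (configured.toUpperCase(Locale.ROOT)) {
      case "DEFAULT":
        // Explicit case, so a valid "default" setting does not trigger the warning.
        return Format.DEFAULT;
      case "TDF":
        return Format.TDF;
      default:
        // Unrecognized value: warn, then fall back to the default format.
        LOGGER.warning("Could not recognise the configured CSV file format: " + configured
            + ", falling back to DEFAULT format");
        return Format.DEFAULT;
    }
  }
}
```

With this shape, resolve("default") and resolve(null) both return DEFAULT without logging, while an unknown value such as resolve("bogus") returns DEFAULT and emits a warning.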
} else {
  format = format.withHeader(StringUtils.split(csvHeader, csvDelimiter));
}
if (props.containsKey(CONFIG_COMMENT_MARKER)) {
(minor) Follow the same approach as csvDelimiter to save one map lookup (get the value and check whether it is null); same for the other configs.
  format = format.withHeader(StringUtils.split(csvHeader, csvDelimiter));
}
if (props.containsKey(CONFIG_COMMENT_MARKER)) {
  Character commentMarker = props.get(CONFIG_COMMENT_MARKER).charAt(0);
(minor) No need for boxing here; same for the other configs.
Inlined all of the configs.
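For readers following the last two comments, here is a minimal sketch of what "one lookup, no boxing" might look like. It is an assumption-laden illustration rather than the PR's actual change: CsvPropsReader and charProp are hypothetical names, the key "commentMarker" simply follows the unprefixed naming suggested earlier in the thread, and a fallback parameter is used so the snippet stays self-contained.

```java
import java.util.Map;

// Hypothetical helper, not part of the PR: reads single-character decoder props.
final class CsvPropsReader {
  // Assumed key name, following the unprefixed style suggested in review.
  static final String CONFIG_COMMENT_MARKER = "commentMarker";

  // Returns the first character of the configured value, or the fallback when
  // the key is absent or empty. Uses a single get() instead of containsKey()
  // followed by get(), and returns a primitive char rather than a Character.
  static char charProp(Map<String, String> props, String key, char fallback) {
    String value = props.get(key);
    if (value == null || value.isEmpty()) {
      return fallback;
    }
    return value.charAt(0);
  }
}
```

A caller could then write something like char marker = CsvPropsReader.charProp(props, CsvPropsReader.CONFIG_COMMENT_MARKER, '#'); the '#' fallback is purely illustrative.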
#8617