New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NIFI-6680: Nifi PutKudu processor - Convert record field names to low… #3748
Conversation
@@ -103,6 +103,7 @@ private void setUpTestRunner(TestRunner testRunner) throws InitializationExcepti | |||
testRunner.setProperty(PutKudu.KUDU_MASTERS, DEFAULT_MASTERS); | |||
testRunner.setProperty(PutKudu.SKIP_HEAD_LINE, SKIP_HEAD_LINE); | |||
testRunner.setProperty(PutKudu.IGNORE_NULL, "true"); | |||
testRunner.setProperty(PutKudu.LOWERCASE_FIELD_NAMES, "true"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the current tests are not capable of verifying the correct behaviour regarding this change.
Wether these settings are omitted or or changed to "false"
makes no difference.
This setProperty
could be removed (a bit misleading as it doesn't really do anything in the current state) and a couple targeted tests could be added, something like these:
@Test
public void testBuildPartialRowWithLowercaseKuduColumn() {
String kuduColumn = "name";
final String recordField = "NAME";
boolean lowercaseFields = true;
testBuildPartialRow(kuduColumn, recordField, lowercaseFields);
}
@Test
public void testBuildPartialRowWithUppercaseKuduColumn() {
String kuduColumn = "NAME";
final String recordField = "NAME";
boolean lowercaseFields = false;
testBuildPartialRow(kuduColumn, recordField, lowercaseFields);
}
private void testBuildPartialRow(String kuduColumn, String recordField, boolean lowercaseFields) {
// GIVEN
final Schema kuduSchema = new Schema(Arrays.asList(
new ColumnSchema.ColumnSchemaBuilder(kuduColumn, Type.STRING).nullable(true).build()
));
final RecordSchema schema = new SimpleRecordSchema(Arrays.asList(
new RecordField(recordField, RecordFieldType.STRING.getDataType())
));
Map<String, Object> values = new HashMap<>() {{
put(recordField, "foo");
}};
PartialRow row = kuduSchema.newPartialRow();
// WHEN
processor.buildPartialRow(
kuduSchema,
row,
new MapRecord(schema, values),
schema.getFieldNames(),
true,
lowercaseFields
);
// THEN
assertEquals("foo", row.getString(kuduColumn));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tpalfy, I will add a test to verify the correct behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpalfy, I've pushed additional unit tests to ensure behavior is as expected.
@@ -412,6 +413,7 @@ private void buildPartialRow(Long id, String name, Short age) { | |||
kuduSchema.newPartialRow(), | |||
new MapRecord(schema, values), | |||
schema.getFieldNames(), | |||
true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think setting lowercaseFields
to true
is a bit misleading. At the moment this only works because
- All the
fieldNames
in the test are already in lower case. - There are no assertions that the
buildPartialRow
actually did set the values correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a unit test with uppercase schema.
f0fcd47
to
0e07e11
Compare
for (String colName : fieldNames) { | ||
String recordFieldName = colName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's kinda minor, but I think this way it would be really less confusing and more consistent:
...
for (String recordFieldName : fieldNames) {
String colName = recordFieldName;
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
@@ -366,54 +368,219 @@ public void testUpdateFlowFiles() throws Exception { | |||
|
|||
@Test | |||
public void testBuildRow() { | |||
buildPartialRow((long) 1, "foo", (short) 10); | |||
final Schema kuduSchema = new Schema(Arrays.asList( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's quite a few code duplication here. Not sure why even the existing abstraction was replaced with code duplication.
Also it might be better for tests to be as thin and focused as possible so that it's easier to understand what and how is being tested and to know what went wrong when the test fails.
To test different types we indeed need a more complex schema.
But to test only the lowercase functionality, a simply schema with a single field would be enough (unless the the logic depends on the type of the field).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will revert back to the abstracted method and reduce the test size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since logic doesn't depend on type, I added a parameter in the buildPartialRow method to allow you to modify the name of the id kudu column and id record field and modified this argument for unit tests.
...u-bundle/nifi-kudu-processors/src/test/java/org/apache/nifi/processors/kudu/TestPutKudu.java
Show resolved
Hide resolved
...-kudu-bundle/nifi-kudu-processors/src/main/java/org/apache/nifi/processors/kudu/PutKudu.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
(I'm still reluctant of
public class TestPutKudu {
...
private void setUpTestRunner(TestRunner testRunner) throws InitializationException {
...
testRunner.setProperty(PutKudu.LOWERCASE_FIELD_NAMES, "true");
as it sets the property to a non-default value while not really having any effect, but I guess it's not that big of a deal.)
9de788e
to
84559d4
Compare
@SandishKumarHN, Would you be able to merge these changes, or should I tag someone else? |
@kjmccarthy after build pass. I think @tpalfy should be able to do it. |
@tpalfy would you be able to merge this? |
…record field names to lowercase Signed-off-by: Joe Witt <joewitt@apache.org>
…record field names to lowercase Signed-off-by: Joe Witt <joewitt@apache.org>
Thank you for submitting a contribution to Apache NiFi.
Please provide a short description of the PR here:
Description of PR
Adds additional attribute that specifies whether or not to convert columns in the Kudu schema to lowercase when retrieving the column's index
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically
master
)?Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not
squash
or use--force
when pushing to allow for clean monitoring of changes.For code changes:
mvn -Pcontrib-check clean install
at the rootnifi
folder?LICENSE
file, including the mainLICENSE
file undernifi-assembly
?NOTICE
file, including the mainNOTICE
file found undernifi-assembly
?.displayName
in addition to .name (programmatic access) for each of the new properties?For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.