Skip to content

Conversation

@Srilatha-ramreddy
Copy link
Contributor

@Srilatha-ramreddy Srilatha-ramreddy commented Mar 7, 2025

…s on Attributes

Summary

NIFI-14337

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for proposing this improvement @Srilatha-ramreddy.

Although having an optional property works, it is not immediately clear that this would alter the behavior. To make the implementation clearer, it would be helpful to introduce an additional strategy property. The property could be named JSON Source and could have values of Attribute or FlowFile using an enum that implements DescribedValue to bound the supported options. The property would be required, and would default to FlowFile, maintaining the current behavior. The new FlowFile Attribute property would depend on this strategy property.

Although that approach means introducing an additional property, it would make the configured behavior of the Processor much clearer. Feel free to raise any questions about this strategy.

Comment on lines 68 to 69
.name("Json Attribute")
.displayName("Json Attribute")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The displayName() method is not needed when the name() is the same. I also recommend renaming the property to FlowFile Attribute.

Suggested change
.name("Json Attribute")
.displayName("Json Attribute")
.name("FlowFile Attribute")

Comment on lines 484 to 485
final Map<String, String> attributes = Collections.singletonMap("jsonAttr",
"{\"rating\":{\"primary\":{\"value\":3},\"series\":{\"value\":[5,4]},\"quality\":{\"value\":}}}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simply use Map.of like you do on line 497. Similar comment for line 511.

Suggested change
final Map<String, String> attributes = Collections.singletonMap("jsonAttr",
"{\"rating\":{\"primary\":{\"value\":3},\"series\":{\"value\":[5,4]},\"quality\":{\"value\":}}}");
final Map<String, String> attributes = Map.of("jsonAttr",
"{\"rating\":{\"primary\":{\"value\":3},\"series\":{\"value\":[5,4]},\"quality\":{\"value\":}}}");

Comment on lines 61 to 62
+ "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to 'success' relationship. "
+ "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to 'success' relationship. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
+ "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to 'success' relationship. "
+ "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to 'success' relationship. "
+ "When 'Json Source' is set to FLOW_FILE, the FlowFile content is transformed and the modified FlowFile is routed to the 'success' relationship. "
+ "When 'Json Source' is set to ATTRIBUTE, the specified attribute's value is transformed and updated in place, with the FlowFile routed to the 'success' relationship. "

jsonSourceAttributeName = context.getProperty(JSON_SOURCE_ATTRIBUTE).evaluateAttributeExpressions(original).getValue();
final String jsonSourceAttributeValue = original.getAttribute(jsonSourceAttributeName);
if (StringUtils.isBlank(jsonSourceAttributeValue)) {
logger.error("FlowFile attribute value evaluated to null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringUtils.isBlank is not only when a string is null. It can even be when the string is empty or only has white space.

Suggested change
logger.error("FlowFile attribute value evaluated to null");
logger.error("FlowFile attribute value was blank);

final boolean isSourceFlowFileContent = SourceStrategy.FLOW_FILE == context.getProperty(JSON_SOURCE).asAllowableValue(SourceStrategy.class);
String jsonSourceAttributeName = null;

if (isSourceFlowFileContent ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (isSourceFlowFileContent ) {
if (isSourceFlowFileContent) {

@Test
void testJsonAttributeNotInitialised() throws IOException {
runner.setProperty(JoltTransformJSON.JSON_SOURCE, SourceStrategy.ATTRIBUTE);
runner.setProperty(JoltTransformJSON.JOLT_SPEC, "./src/test/resources/specs/shiftrSpec.json");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this value "./src/test/resources/specs/shiftrSpec.json" being used a total of three times by you on this line, line 481, and line 494. Another test uses the same value on line 225. In addition there is another form of this same file used "src/test/resources/specs/shiftrSpec.json" without the leading ./ on lines 214 and 255. Please make a private static final String variable with one of these values and use it in all six places.

runner.setProperty(JoltTransformJSON.JSON_SOURCE, SourceStrategy.ATTRIBUTE);
runner.setProperty(JoltTransformJSON.JOLT_SPEC, "./src/test/resources/specs/shiftrSpec.json");
runner.setProperty(JoltTransformJSON.JOLT_TRANSFORM, JoltTransformStrategy.SHIFTR);
runner.setProperty(JoltTransformJSON.JSON_SOURCE_ATTRIBUTE, "jsonAttr");
Copy link
Contributor

@dan-s1 dan-s1 Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this value "jsonAttr" being used 8 times for defining the name of the attribute (this line and on lines 483, 484, 498, 510, 511, 523 and 525). Please make a private static final String variable and use it in each of those places.

runner.assertNotValid();
}

@Test
Copy link
Contributor

@dan-s1 dan-s1 Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@exceptionfactory I see 4 new unit tests which are configured very similarly. Do you think these four tests should be folded into a JUnit5 ParamaterizedTest and use the MethodSource annotation to define a method which would return the necessary arguments for each of these tests?

@Srilatha-ramreddy
Copy link
Contributor Author

@exceptionfactory @dan-s1 Thanks for the feedback. Review comments are now addressed and is Ready for review. Thanks

@exceptionfactory
Copy link
Contributor

Thanks for the updates @Srilatha-ramreddy, please review the Checkstyle warnings. I will take a closer look at the latest version soon.

Warning:  src/test/java/org/apache/nifi/processors/jolt/TestJoltTransformJSON.java:[478] (sizes) LineLength: Line is longer than 200 characters (found 213).

Comment on lines 475 to 484
return Stream.of(
Arguments.of(JSON_SOURCE_ATTR_NAME, null, SHIFTR_SPEC_PATH,
JoltTransformStrategy.SHIFTR, false, null),
Arguments.of(JSON_SOURCE_ATTR_NAME, Map.of(JSON_SOURCE_ATTR_NAME, INVALID_INPUT_JSON), SHIFTR_SPEC_PATH,
JoltTransformStrategy.SHIFTR, false, null),
Arguments.of("${dynamicJsonAttr}", Map.of("dynamicJsonAttr", JSON_SOURCE_ATTR_NAME, JSON_SOURCE_ATTR_NAME, EXPECTED_JSON), SHIFTR_SPEC_PATH,
JoltTransformStrategy.SHIFTR, true, SHIFTR_JSON_OUTPUT),
Arguments.of(JSON_SOURCE_ATTR_NAME, Map.of(JSON_SOURCE_ATTR_NAME, EXPECTED_JSON), CHAINR_SPEC_PATH,
JoltTransformStrategy.CHAINR, true, CHAINR_JSON_OUTPUT)
);
Copy link
Contributor

@dan-s1 dan-s1 Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Srilatha-ramreddy Thanks for adding the ParamaterizedTest. That is what I had in mind. I am requesting one minor change, instead of using Arguments.of please use Arguments.argumentSet so you can give a meaningful name to each of the tests. I was going to start from the original names of the unit tests you had started with although I no longer see that commit as it seems you squashed it. Please note in general once the PR has been submitted no squashes should be done. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dan-s1 Thanks for the feedback. Please review the latest commit and will also not squash the commits anymore.

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @Srilatha-ramreddy. I noted a few minor adjustments, and one larger question regarding Expression Language evaluation. As mentioned, support for Expression Language opens up some additional possibilities that should be considered.

try {
inputJson = jsonUtil.jsonToObject(jsonSourceAttributeValue);
} catch (final Exception e) {
logger.error("JSON parsing failed on FlowFile attribute for {}", original, e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a good idea to include the name of the attribute like you did above.

Suggested change
logger.error("JSON parsing failed on FlowFile attribute for {}", original, e);
logger.error("JSON parsing failed on attribute '{}' of FlowFile {}", jsonSourceAttributeName, original, e);

Comment on lines 206 to 207
logger.info("Transform completed on FlowFile attribute for {}", original);
}
Copy link
Contributor

@dan-s1 dan-s1 Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, it may be beneficial to name the attribute which was transformed so it is clear in the logs

Suggested change
logger.info("Transform completed on FlowFile attribute for {}", original);
}
logger.info("Transform completed on attribute {} of FlowFile {}", sonSourceAttributeName, original);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dan-s1 Updated as per feedback.

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through the feedback @Srilatha-ramreddy, and thanks for the reviews as well @dan-s1, the latest version looks good! +1 merging

TomaszK-stack pushed a commit to TomaszK-stack/nifi that referenced this pull request May 5, 2025
This closes apache#9785

Signed-off-by: David Handermann <exceptionfactory@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants