NIFI-8613: Improve FlattenJson Processor #5083

Closed

@naddym naddym commented May 18, 2021

Thank you for submitting a contribution to Apache NiFi.

Description of PR

Please provide a short description of the PR here:

This improvement to the FlattenJson processor includes the following changes:

  • Unflattening a flattened JSON document
  • Preserving arrays of primitives (strings, numbers, booleans, and null) in nested JSON
  • Logging errors on failure
  • Pretty-printing the resulting JSON
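
For readers unfamiliar with the operation, here is a minimal sketch of what flattening and unflattening mean. It is not the processor's code and not the json-flattener library's implementation: the class FlattenSketch and its methods are hypothetical, it works on plain maps rather than JSON text, and it treats every list as a leaf value, which is a simplification of the "preserve primitive arrays" behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class FlattenSketch {

    // Collapse nested maps into one level, joining keys with the separator.
    // Lists are kept as leaf values (assumption made for this sketch).
    static Map<String, Object> flatten(Map<String, Object> nested, char separator) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto(flat, "", nested, separator);
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static void flattenInto(Map<String, Object> flat, String prefix,
                                    Map<String, Object> node, char separator) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String key = prefix.isEmpty() ? entry.getKey() : prefix + separator + entry.getKey();
            if (entry.getValue() instanceof Map) {
                flattenInto(flat, key, (Map<String, Object>) entry.getValue(), separator);
            } else {
                flat.put(key, entry.getValue());
            }
        }
    }

    // Rebuild the nested structure by splitting each key on the separator.
    // Keys that themselves contain the separator are not handled here.
    @SuppressWarnings("unchecked")
    static Map<String, Object> unflatten(Map<String, Object> flat, char separator) {
        Map<String, Object> root = new LinkedHashMap<>();
        String splitPattern = Pattern.quote(String.valueOf(separator));
        for (Map.Entry<String, Object> entry : flat.entrySet()) {
            String[] parts = entry.getKey().split(splitPattern);
            Map<String, Object> current = root;
            for (int i = 0; i < parts.length - 1; i++) {
                current = (Map<String, Object>) current.computeIfAbsent(
                        parts[i], k -> new LinkedHashMap<String, Object>());
            }
            current.put(parts[parts.length - 1], entry.getValue());
        }
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> meta = new LinkedHashMap<>();
        meta.put("tags", Arrays.asList("a", "b"));
        meta.put("version", 1);
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("name", "nifi");
        nested.put("meta", meta);

        Map<String, Object> flat = flatten(nested, '.');
        System.out.println(flat);                 // keys such as "meta.version"
        System.out.println(unflatten(flat, '.')); // nested structure restored
    }
}
```

In this sketch the round trip only holds when no key contains the separator character and no nested map is empty.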

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not squash or use --force when pushing to allow for clean monitoring of changes.

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
  • Have you written or updated unit tests to verify your changes?
  • Have you verified that the full build is successful on JDK 8?
  • Have you verified that the full build is successful on JDK 11?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
  • If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.

@exceptionfactory exceptionfactory left a comment

Thanks for the contribution, @naddym! The addition of the unflatten option looks useful. The changes look good in general. I noted a couple of questions about the character set expectations and included a recommendation to adjust error logging.


flowFile = session.write(flowFile, os -> os.write(flattened.getBytes()));
final StringBuilder contents = new StringBuilder();
session.read(flowFile, in -> contents.append(IOUtils.toString(in, Charset.defaultCharset())));
Contributor

Although the previous approach relied on the system default character set when converting from byte array to string, what do you think about either making UTF-8 the standard character set, or adding a new processor property to configure the character set?

Contributor Author

Yeah, let me add a new character set property to stay consistent with other processors. Thanks.
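
As a rough illustration of the suggestion in this thread, the sketch below reads a stream and decodes it with an explicitly chosen character set instead of the system default. The class CharsetReadSketch, the helper readAll, and the configuredCharset value are hypothetical and stand in for whatever property the processor ends up exposing; this is not the code that was merged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetReadSketch {

    // Read the whole stream and decode it with the given charset,
    // so the result does not depend on the JVM's default charset.
    static String readAll(InputStream in, Charset charset) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical property value; a real processor would read it from its configuration.
        String configuredCharset = "UTF-8";
        Charset charset = Charset.forName(configuredCharset);

        // \u00fc is a non-ASCII character that takes two bytes in UTF-8.
        byte[] content = "{\"city\":\"Z\u00fcrich\"}".getBytes(StandardCharsets.UTF_8);

        String decoded = readAll(new ByteArrayInputStream(content), charset);
        String misdecoded = readAll(new ByteArrayInputStream(content), StandardCharsets.ISO_8859_1);

        System.out.println(decoded.equals("{\"city\":\"Z\u00fcrich\"}"));    // true
        System.out.println(misdecoded.equals("{\"city\":\"Z\u00fcrich\"}")); // false
    }
}
```

Decoding with a character set other than the one used to write the bytes silently corrupts non-ASCII characters, which is the risk of relying on a platform default.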

.unflatten();
}

flowFile = session.write(flowFile, out -> out.write(resultedJson.getBytes()));
Contributor

Related to the comment on handling input, specifying a character set on String.getBytes() would clarify the expected output as opposed to relying on the system defaults.

Contributor Author

Sure, will do.
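
To show why the explicit argument matters on output, here is a small sketch in which the same JSON string encodes to a different number of bytes depending on the character set passed to String.getBytes(Charset). The class CharsetWriteSketch and the sample string are made up for illustration and are not taken from the PR.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetWriteSketch {

    // Encode with an explicit charset rather than the no-argument getBytes(),
    // which falls back to the platform default.
    static byte[] encode(String json, Charset charset) {
        return json.getBytes(charset);
    }

    public static void main(String[] args) {
        // 17 characters, one of which (\u00fc) is outside ASCII.
        String json = "{\"city\":\"Z\u00fcrich\"}";

        System.out.println(encode(json, StandardCharsets.UTF_8).length);      // 18
        System.out.println(encode(json, StandardCharsets.ISO_8859_1).length); // 17
    }
}
```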

if (returnType.equals(RETURN_TYPE_FLATTEN)) {
resultedJson = new JsonFlattener(contents.toString())
.withFlattenMode(flattenMode)
.withSeparator(separator.charAt(0))
Contributor

The separator character could be declared once outside of the conditional and reused as opposed to calling separator.charAt(0) in both conditional blocks.


session.transfer(flowFile, REL_SUCCESS);
} catch (Exception ex) {
} catch (Exception e) {
getLogger().error("Failed to {} json due to {}", new Object[]{returnType, e});
Contributor

Recent updates to the logger interface now allow passing placeholder values as variable arguments. If the goal is to include the stack trace of the exception, which would be helpful, then the second placeholder should be removed from the log message string.

Suggested change
- getLogger().error("Failed to {} json due to {}", new Object[]{returnType, e});
+ getLogger().error("Failed to {} JSON", returnType, e);

naddym commented May 19, 2021

Thank you, @exceptionfactory, for the detailed review. All of the suggestions look good; I will work on them.

@@ -157,25 +203,36 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
final String mode = context.getProperty(FLATTEN_MODE).getValue();
final FlattenMode flattenMode = getFlattenMode(mode);

String separator = context.getProperty(SEPARATOR).evaluateAttributeExpressions(flowFile).getValue();

final Character separator = context.getProperty(SEPARATOR).evaluateAttributeExpressions(flowFile).getValue().charAt(0;
Contributor

Thanks for making the updates @naddym, looks like the automated builds failed due to missing a trailing ) on this line.

Suggested change
- final Character separator = context.getProperty(SEPARATOR).evaluateAttributeExpressions(flowFile).getValue().charAt(0;
+ final Character separator = context.getProperty(SEPARATOR).evaluateAttributeExpressions(flowFile).getValue().charAt(0);

Contributor Author

Oh, I somehow missed it while copying after testing. Thanks again for the comment.

@exceptionfactory exceptionfactory left a comment

Tested flatten and unflatten operations, changes look good, thanks @naddym! +1 Merging.

@asfgit asfgit closed this in c113960 May 20, 2021
krisztina-zsihovszki pushed a commit to krisztina-zsihovszki/nifi that referenced this pull request Jun 28, 2022
- Unflattening a flattened json
- Preserving primitive arrays such as strings, numbers, booleans and null in a nested json
- Logging errors when failure
- Pretty printing resulted json

This closes apache#5083

Signed-off-by: David Handermann <exceptionfactory@apache.org>