Skip to content

Conversation

@apiri
Copy link
Member

@apiri apiri commented Dec 1, 2015

@joemeszaros Had a user open an issue concerning the processor's handling of events with only one record. I agree that the functionality seems a bit odd and was wondering if there was a specific case that would be missed should the containerOption be the only item to dictate format.

apiri added a commit to apiri/incubator-nifi that referenced this pull request Dec 1, 2015
@apiri
Copy link
Member Author

apiri commented Dec 2, 2015

@trkurc good catches on the comments. Overall would you feel like this is a bug? That's the impression I get and trying to think if this is adjusted is it breaking? I imagine it could be but have a hard time envisioning that behavior being desirable. Thoughts?

@trkurc
Copy link
Contributor

trkurc commented Dec 2, 2015

both the code and processor description strongly indicated this was intentional.

"If an incoming FlowFile contains a stream of multiple Avro records, the resultant FlowFile will contain a JSON Array containing all of the Avro records."

@olegz
Copy link
Contributor

olegz commented Dec 2, 2015

I am on the side of @apiri. I can't imagine that was the desirable behavior either, since it was simply inconsistent. If one explicitly requests an array, than it should return an array, no matter how many elements in it.

apiri added a commit to apiri/incubator-nifi that referenced this pull request Dec 2, 2015
@apiri
Copy link
Member Author

apiri commented Dec 2, 2015

It covers the case of records but not record. I read the documentation to mean that as long as the input flowfile was of Avro formatted record(s), it would perform the associated conversion but admittedly the comments in the code provide much more context.

Regardless, have updated the comments and pushed the branch.

@trkurc
Copy link
Contributor

trkurc commented Dec 2, 2015

So, some evidence that this was intentional.

  1. it is harder to handle the single record case without arrays. @markap14 appears to have contributed this as part of NIFI-855 (even though @joemeszaros added the comments and was the last to touch these lines of code)
  2. the processor description clearly states it has different behavior with multiple records
  3. It makes sense to me that if you were administering a flow that only ever had a single avro record per flow file, you would NOT want it inside an array

I believe 3 is a valid use case, and this change may break flows. I highly recommend adding a flag to preserve this behavior. I explained more in jira (https://issues.apache.org/jira/browse/NIFI-1234)

@apiri
Copy link
Member Author

apiri commented Dec 2, 2015

@trkurc
I am failing at the Github and JIRA comment game, but as commented on the issue (and I am additionally editing myself for context here):

Sorry, missed some of your earlier comments (these were from last night when you expanded the cases) in flipping between here and Github. Also hate the property bloat, but think it is necessary to avoid the breaking change. Had a hard time grokking the use cases that steered it away from bug status, so am onboard with your viewpoint.

… records and adjusting formatting in ConvertAvroToJSON.

This closes apache#136.
@apiri
Copy link
Member Author

apiri commented Dec 2, 2015

Made the changes to include a property which defaults to the old behavior but enables control over whether a single record is placed in a container

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For next time, comment is good, but parens would make the logic more clear for the next coder to come along:

if ((recordCount > 1 && useContainer) || wrapSingleRecord) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, isn't that the wrong logic? shouldn't it be

if (useContainer && (recordCount > 1 || wrapSingleRecord))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is correct.

useContainer only applies to the case where there is more than one record. Those cases where there are zero or one are driven by wrapSingleRecord which incorporates whether or not the container should be used. Happy to rework this (and the grouping) if I can get some better clarity. Could certainly alleviate the line where it is defined earlier.

wrapSingleRecord is defined as:
final boolean wrapSingleRecord = context.getProperty(WRAP_SINGLE_RECORD).asBoolean() && useContainer;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now from line 127 that this is what is happening

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apiri how logic flows through code is a personal preference. I was confused for a spell, but there were unit tests checking the behavior.

@asfgit asfgit closed this in ecc240b Dec 3, 2015
@trkurc
Copy link
Contributor

trkurc commented Dec 3, 2015

@olegz - I am also guilty of missing comments between github and jira. the choice of array or none for container wasn't in 0.3.0, so it wasn't an explicit option then. I was basing a lot of my assumptions for "correct" behavior on the original patch submitted for NIFI-855

@apiri apiri deleted the NIFI-1234 branch December 9, 2015 15:45
mattyb149 pushed a commit to mattyb149/nifi that referenced this pull request Dec 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants