Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-2731 MergeContent default max number of flow files and max number of bins should be smaller #1071

Closed
wants to merge 1 commit into from

Conversation

pvillard31
Copy link
Contributor

No description provided.

@mosermw
Copy link
Member

mosermw commented Sep 28, 2016

For backwards compatibility purposes, I think changing the default 100 Maximum Number of Bins to 5 will have a large impact on users. Right now most people just leave this at the default. If they upgrade NiFi and they were regularly using 10 bins, for instance, then after the upgrade their MergeContent will behave a LOT differently.

What memory footprint are we targetting for this change? Even with a 512 MB heap, 100 bins does not cause problems.

I have really big concerns about this one. Can we discuss more?

@joewitt
Copy link
Contributor

joewitt commented Sep 29, 2016

@mosermw Keep in mind changing the default value for number of bins from 100 to 5 will not change peoples existing flows. I just verified this in both the flow.xml and a created template that indeed the default value is serialized and isn't just some UI trick. So, it just means that for NEW uses of MergeContent the initial default value will be 5. So I believe the main case you're laying out as a point of concern does not apply as it would still behave the way you'd prefer for existing uses. Does that eliminate your concern for this part or is there an aspect I am overlooking?

For the maximum number of entries value the previous setting of nothing meant there was no defined ceiling on the number of flow files which could be merged in a single pass. This unbounded sort of behavior can be problematic from a memory usage perspective so setting a small default value just means it will now be defined. Going from undefined to defined behavior could alter the behavior of the flow but of course since it was undefined there was no guarantee. It also seems like something that could be easily managed in the migration guide and something which would have behavior which is now easy to reason over. Now, if we were going from defined to undefined then I'd certainly think this would be inappropriate to do outside of probably a major release.

As for the target memory size the idea wasn't really about a target memory size but rather having a conservative default behavior which the user could adjust if they wanted to. The value of 100 for max bins was also not targeting a specific memory size when it was chosen and the decision to have the number of entries allowed in a bin by default unbounded was simply a mistake. What I observed was that on a vanilla install on a laptop I could easily create memory exhaustion using these defaults and the reason why became pretty clear as I had lots of open bins holding lots of flow file objects. Once I changed to the values mentioned here the behavior was far more intuitive.

All that said, MergeContent is a complicated animal and desperately needs a custom UI. That would also serve us quite well but I believe this proposed JIRA and PR is a good step unless I am overlooking some important detail.

@mosermw
Copy link
Member

mosermw commented Sep 29, 2016

Yes @joewitt you've alleviated my concerns about the default for Max Bins. I forgot that default values get serialized to the flow.xml when a new processor is dropped on the graph. Thanks.

I was on board with setting a default for Max Entries and not leaving that unbounded. It certainly can cause trouble otherwise, and new users won't know why.

@markap14
Copy link
Contributor

@pvillard31 looks good. +1 merged to master. Unfortunately I forgot to include the magical "This closes #1071." message, though. Can you close this PR?

@pvillard31 pvillard31 closed this Oct 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants