Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-6931 Add status-history graph to track backpressure estimates #3918

Closed
wants to merge 1 commit into from

Conversation

jmark99
Copy link
Contributor

@jmark99 jmark99 commented Dec 5, 2019

NIFI-6931 Add a new status history graph that tracks the time-to-backpressure
estimates that are calculated when utilizing the prediction analytics
engine. When using the analytics engine, the graph will allow users to
view how the backpressure estimates trend over time. A user defined
threshold provides a value to determine the maximum value displayed
onthe graph. The default is 6 hrs.

For all changes:

  • [ x] Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • [ x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • [ x] Has your PR been rebased against the latest commit within the target branch (typically master)?

  • [ x] Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not squash or use --force when pushing to allow for clean monitoring of changes.

For code changes:

  • [x ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
  • [ x] Have you written or updated unit tests to verify your changes?
  • [ x] Have you verified that the full build is successful on both JDK 8 and JDK 11?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
  • If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

Add a new status history graph that tracks the time-to-backpressure
estimates that are calculated when utilizing the prediction analytics
engine. When using the analytics engine, the graph will allow users to
view how the backpressure estimates trend over time. A user defined
threshold provides a value to determine the maximum value displayed
onthe graph. The default is 6 hrs.
@@ -321,10 +322,13 @@
// analytics defaults
public static final String DEFAULT_ANALYTICS_PREDICTION_ENABLED = "false";
public static final String DEFAULT_ANALYTICS_PREDICTION_INTERVAL = "3 mins";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change in query interval impact other components which use the analytics engine? Can you explain the increase from 3 -> 5 min?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was an issue at the time of development where if the value for the DEFAULT_ANALYTICS_QUERY_INTERVAL and DEFAULT_ANALYTICS_PREDICTION_INTERVAL were identical then we were seeing invalid computations due to a difference in the way some threads were accessing the data. A fix was to increase the query interval to be larger than the prediction interval. This should not affect the computations since they are based upon the prediction interval. It would perhaps delay the first computation from being calculated by two minutes, after that there would be no effect. This problem may be corrected with NIFI-6801. If so, we can revert the value back to 3 at a later time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Thanks!

public static void setNifiProperties(NiFiProperties nifiProperties) {
ConnectionStatusDescriptor.nifiProperties = nifiProperties;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be my ignorance, but are labels intended to be immutable? Is modifying them a bad thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was labeled as 'final' originally, but I'm not sure if that is because it is thought that labels are always immutable or rather the case for these particular labels changing was not foreseen. I wanted a means for the user to know when viewing the graph that the reason it was at time 0 was not due to a queue overflow but rather that the analytic engine was not being utilized. I'm open to other means of getting that info across if you have any suggestions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to get @markap14 's feedback here since he's the original author.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @markap14, would you be able to take a look at this and offer your thoughts? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-hogue I haven't received any feedback from @markap14 at this point. Should we continue to wait or consider merging?


TIME_TO_BACKPRESSURE(
"backpressureEstimate",
"Backpressure Estimate",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might clarify that the time is displayed in millis. Stating that the default is 6 hours might give the user the impression that the displayed time is also hours.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the graph axis the time is displayed as HH:MM:SS. This link shows an image of what the graph looks like: https://github.com/jmark99/nifi-images/blob/master/count-graph.png
The image is slightly older and the "(count)" specifier is no longer there. If you feel there would still be confusion I can updated the names of the variables to add something like 'millis' or adjust comments, etc. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think updating the names of variables would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After going back and re-running a nifi instance with status analytics enabled I was reminded that the backpressureEstimate variable above is only used as a reference to the graph label. Since the graph does display the time estimate in hours I don't think there should be any confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - i wasn't aware of that. No worries from me then.

public final static String DEFAULT_ANALYTICS_CONNECTION_MODEL_IMPLEMENTATION = "org.apache.nifi.controller.status.analytics.models.OrdinaryLeastSquares";
public static final String DEFAULT_ANALYTICS_CONNECTION_SCORE_NAME = "rSquared";
public static final double DEFAULT_ANALYTICS_CONNECTION_SCORE_THRESHOLD = .90;
// This is the time, below which the status history graph will actively display
// backpressure estimates if prediction analytics are enabled.
public static final String DEFAULT_ANALYTICS_STATUS_GRAPH_THRESHOLD = "6 hrs";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a rationale for using 6 hours?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a value chosen at the suggestion of someone familiar with the project. Beyond that, no. It can be set to whatever the user desires, but if it gets too large then the graph can get difficult to read.

@github-actions
Copy link

We're marking this PR as stale due to lack of updates in the past few months. If after another couple of weeks the stale label has not been removed this PR will be closed. This stale marker and eventual auto close does not indicate a judgement of the PR just lack of reviewer bandwidth and helps us keep the PR queue more manageable. If you would like this PR re-opened you can do so and a committer can remove the stale tag. Or you can open a new PR. Try to help review other PRs to increase PR review bandwidth which in turn helps yours.

@github-actions github-actions bot added the Stale label Apr 25, 2021
@github-actions github-actions bot closed this May 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants