NIFI-8113 Adding persistent status history repository backed by embed… #4821

simonbence · 2021-02-11T09:14:01Z

This is a proposal for persistent status history for both component-level and node-level metrics. During the implementation I tried to balance introducing some flexibility in usage against supporting the existing behaviour. As a result, I split the component status repository (now StatusRepository) into two: one responsible for component metrics and another responsible for node-level metrics. They can be configured independently to some extent (for example, they can store data in different storage or keep it for a different time window), but to support the previous configuration there is a facade over them which provides the same composite service as before (component and node).

As for code organisation, I worked with three concepts. The repository is the top-level entity and provides the service for clients (FlowController, etc.). In general this is the same as before, only split into two parts: node and component. The storage classes are part of the repositories and are responsible for managing the details of a given type, such as processor status, node status, etc. Finally, the WriterTemplates and ReaderTemplates are merely helpers that exist to deal with QuestDB API calls.

Some remarks on design decisions:

I kept the name VolatileComponentStatusRepository, but the actual repositories use the prefix InMemory to contrast more clearly with QuestDB.
The PR contains a small fix to ProcessGroupStatusDescriptor#calculateTaskMillis(): previously every level of the recursion made a nano->milli conversion, which accumulated and reduced the task time of child groups to 0. This should now be fixed (the QuestDB tests implicitly prove this as well).
The configuration contains the new parameters, but they are commented out. At this point I think VolatileComponentStatusRepository should be kept as the default.
The implementation depends on the latest 4.X version of QuestDB. QuestDB is currently at 5.0.5, but the 5.X line depends on Java 11. There are some non-vital features in 5.X (all_tables request, dropping partitions using a WHERE clause), but we can do without them.
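
The split into a node repository and a component repository behind a facade could look roughly like the sketch below. It is only an illustration of the structure described above: the interface methods, the String snapshots, and the class names InMemoryRepository and StatusRepositoryFacade are simplified assumptions and do not reproduce the actual NiFi interfaces in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the two repository roles.
// The method names and the String payloads are assumptions for illustration.
interface NodeStatusRepository {
    void capture(String snapshot);
    List<String> history();
}

interface ComponentStatusRepository {
    void capture(String snapshot);
    List<String> history();
}

// One in-memory implementation reused for both roles to keep the sketch short.
final class InMemoryRepository implements NodeStatusRepository, ComponentStatusRepository {
    private final List<String> snapshots = new ArrayList<>();

    @Override
    public void capture(String snapshot) {
        snapshots.add(snapshot);
    }

    @Override
    public List<String> history() {
        return new ArrayList<>(snapshots); // defensive copy
    }
}

// Facade that offers the combined (node and component) service to clients
// while delegating each call to the repository responsible for it.
final class StatusRepositoryFacade {
    private final NodeStatusRepository nodeRepository;
    private final ComponentStatusRepository componentRepository;

    StatusRepositoryFacade(NodeStatusRepository nodeRepository,
                           ComponentStatusRepository componentRepository) {
        this.nodeRepository = nodeRepository;
        this.componentRepository = componentRepository;
    }

    void capture(String nodeSnapshot, String componentSnapshot) {
        nodeRepository.capture(nodeSnapshot);
        componentRepository.capture(componentSnapshot);
    }

    List<String> nodeHistory() {
        return nodeRepository.history();
    }

    List<String> componentHistory() {
        return componentRepository.history();
    }
}
```

Because the facade depends only on the two interfaces, either side could in principle be backed by a different storage without the client noticing.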

Thank you for investing your time to my PR!
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not squash or use --force when pushing to allow for clean monitoring of changes.

For code changes:

Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
Have you written or updated unit tests to verify your changes?
Have you verified that the full build is successful on JDK 8?
Have you verified that the full build is successful on JDK 11?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.

turcsanyip

@simonbence I ran through the code and it looks good overall.
Added some comments so far. Will continue with testing/checking it in more detail.

nifi-nar-bundles/nifi-framework-bundle/nifi-framework-nar/src/main/resources/META-INF/NOTICE

.../src/main/java/org/apache/nifi/controller/status/history/questdb/QuestDbWritingTemplate.java

...src/main/java/org/apache/nifi/controller/status/history/ComponentStatusRepositoryFacade.java

turcsanyip · 2021-02-16T21:11:55Z

...c/main/java/org/apache/nifi/controller/status/history/VolatileComponentStatusRepository.java

-    private final int numDataPoints;
-    private volatile long lastCaptureTime = 0L;
+    private final NodeStatusRepository nodeStatusRepository;
+    private final ComponentStatusRepository componentStatusRepository;


As VolatileComponentStatusRepository provides backward compatibility for the in-memory storage and in the background just delegates requests to the new InMemory*StatusRepositories, using those classes instead of the interface types would fit the role of this class better.

You are right about that; however, instance creation is managed by InMemoryStatusRepositoryBuilder, which is an implementation of StatusRepositoryBuilder. As a consequence, it does not expose the implementation class it returns, and I would prefer not to expose it. (Also, adding another way to create the instances would bring in code duplication and unnecessary complexity, in my opinion.)

turcsanyip · 2021-02-16T21:28:31Z

.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties

+# nifi.status.repository.builder.inmemory=org.apache.nifi.controller.status.history.InMemoryStatusRepositoryBuilder
+# nifi.status.repository.builder.persistent=org.apache.nifi.controller.status.history.EmbeddedQuestDbStatusRepositoryBuilder
+# nifi.status.repository.roles.component=persistent
+# nifi.status.repository.roles.node=inmemory


I believe the new InMemory implementation should be the default instead of adding these lines commented out.
As far as I understand, the backward compatibility point here is to support old nifi.properties files where only nifi.components.status.repository.implementation property exists. And it is provided / works even if the default is not the old property.
New installations could (should) go with the new configuration / implementation classes.

...ore/src/main/java/org/apache/nifi/controller/status/history/EmbeddedQuestDbStatusWriter.java

...ain/java/org/apache/nifi/controller/status/history/questdb/QuestDbEntityWritingTemplate.java

...c/main/java/org/apache/nifi/controller/status/history/storage/BufferedNodeStatusStorage.java

markap14

Thanks @simonbence! This is a VERY powerful addition! Code looks good to me. Did some minimal testing, but I will want to do some more testing before fully giving it a +1. The only real concern that I had with the PR is the update to the nifi.properties. There's a LOT going on there. I think there are 2 reasons for that. Firstly, there are some properties that I think really can be removed - they add flexibility but IMO aren't really necessary and add to the complexity of configuring. Secondly, the properties tend to have a lot of comments associated with them. We need to instead keep comments fairly minimal and update the Admin Guide to fully document each of these properties. What effect will they have, etc.

Otherwise, all looks good, but I'll continue doing some testing!

markap14 · 2021-02-18T15:37:13Z

.../src/main/java/org/apache/nifi/controller/status/history/EmbeddedQuestDbRolloverHandler.java

+        ) {
+            while (cursor.hasNext()) {
+                final Record record = cursor.getRecord();
+                result.add(new StringBuilder(record.getStr(0)).toString());


Why are we creating a StringBuilder here?

Record#getStr returns a CharSequence of some kind. I found StringBuilder the most robust way to ensure the correct content is extracted, as most other ways would only call #toString.

Is there a reason not to just use record.getStr(0).toString() - to call the toString method of CharSequence directly? With that, if the object that is returned happens to be a String object (which implements CharSequence) then the toString() method simply returns this.

The record returns an undetermined implementation of CharSequence (actually it's CharSequenceView, which is an internal implementation in QuestDB), for which it is not guaranteed that toString is implemented, or implemented properly. I was striving to stay on the safe side.
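
The concern in this thread can be illustrated with a small, self-contained sketch. RawView is a hypothetical CharSequence written for this example that does not override toString(); it makes no claim about how QuestDB's own CharSequence implementations behave. Copying through StringBuilder relies only on length() and charAt(), so it yields the characters even when toString() does not.

```java
// Hypothetical CharSequence that does not override toString(), so it
// inherits Object.toString(). It is an illustration only and says nothing
// about any real QuestDB class.
final class RawView implements CharSequence {
    private final char[] chars;

    RawView(String content) {
        this.chars = content.toCharArray();
    }

    @Override
    public int length() {
        return chars.length;
    }

    @Override
    public char charAt(int index) {
        return chars[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new RawView(new String(chars, start, end - start));
    }
}

final class CharSequenceCopyDemo {
    // Copies the characters one by one; does not depend on toString()
    // of the argument for a non-String, non-builder CharSequence.
    static String copy(CharSequence sequence) {
        return new StringBuilder(sequence).toString();
    }

    public static void main(String[] args) {
        CharSequence view = new RawView("processor_status");
        System.out.println(copy(view));                                   // the character content
        System.out.println("processor_status".equals(view.toString()));   // false for RawView
    }
}
```

If the returned type does honour the CharSequence contract for toString(), calling toString() directly is equivalent and avoids the extra copy, which is the point raised in the review question.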

markap14 · 2021-02-18T15:43:55Z

...re/src/main/java/org/apache/nifi/controller/status/history/ProcessGroupStatusDescriptor.java

@@ -113,16 +113,16 @@ public String getField() {


    private static long calculateTaskMillis(final ProcessGroupStatus status) {
-        long nanos = 0L;


Why are these nanos being changed to millis? This leads to a lot of rounding errors, resulting in the data being both less precise and less accurate. By holding onto nanos and converting once at the end, it's also more efficient.

I think I mentioned this in the PR description, but the point is to avoid loss of information: with the original code, at every level of the recursion we did a nano > millis conversion, but the caller (one level up in the recursion) treated the result as nanos. Thus, the deeper we are in the group structure, the more times we make the conversion, which looks incorrect.

If you still think that this comes with rounding errors, what I would suggest is to introduce a calculateTaskNanos, which would handle the recursion and work without converting, and calculateTaskMillis would call this and convert the end result only once.

Something like this would solve both issues:

    private static long calculateTaskMillis(final ProcessGroupStatus status) {
        return TimeUnit.MILLISECONDS.convert(calculateTaskNanos(status), TimeUnit.NANOSECONDS);
    }

    private static long calculateTaskNanos(final ProcessGroupStatus status) {
        long nanos = 0L;

        for (final ProcessorStatus procStatus : status.getProcessorStatus()) {
            nanos += procStatus.getProcessingNanos();
        }

        for (final ProcessGroupStatus childStatus : status.getProcessGroupStatus()) {
            nanos += calculateTaskNanos(childStatus);
        }

        return nanos;
    }

Ah, I see. Yes, I think this is a good approach, to calculate recursively using nanos and then converting to millis only after the recursive call.
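
A small worked example makes the difference between the two approaches concrete. GroupStatus below is a hypothetical, simplified stand-in for ProcessGroupStatus (a single processingNanos value per group plus child groups), and convertAtEveryLevel is a reconstruction of the behaviour described in this thread, not the original NiFi code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for ProcessGroupStatus.
final class GroupStatus {
    final long processingNanos; // total for this group's own processors
    final List<GroupStatus> children = new ArrayList<>();

    GroupStatus(long processingNanos) {
        this.processingNanos = processingNanos;
    }
}

final class TaskTimeDemo {
    // Converts at every level: the child's result is already in millis,
    // but the caller adds it to a nanos total and converts again.
    static long convertAtEveryLevel(GroupStatus status) {
        long nanos = status.processingNanos;
        for (GroupStatus child : status.children) {
            nanos += convertAtEveryLevel(child); // unit mismatch
        }
        return TimeUnit.MILLISECONDS.convert(nanos, TimeUnit.NANOSECONDS);
    }

    // Recursion stays in nanos throughout.
    static long taskNanos(GroupStatus status) {
        long nanos = status.processingNanos;
        for (GroupStatus child : status.children) {
            nanos += taskNanos(child);
        }
        return nanos;
    }

    // Single conversion after the recursion has finished.
    static long taskMillis(GroupStatus status) {
        return TimeUnit.MILLISECONDS.convert(taskNanos(status), TimeUnit.NANOSECONDS);
    }

    public static void main(String[] args) {
        GroupStatus root = new GroupStatus(0L);
        root.children.add(new GroupStatus(5_000_000L)); // 5 ms of work in a child group

        System.out.println(convertAtEveryLevel(root)); // prints 0
        System.out.println(taskMillis(root));          // prints 5
    }
}
```

In the first variant the child's 5 ms is added to the parent's total as if it were 5 ns, and the second conversion truncates it to 0. Converting once at the end also avoids truncating each child's contribution separately.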

.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties

markap14 · 2021-02-18T16:22:39Z

.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties

 nifi.components.status.repository.implementation=${nifi.components.status.repository.implementation}
+
+# Builder based specification. Gives the possibility to store Node and Component Status History information in different storage solutions.


All properties that get added to this file need to be fully documented in the administration-guide.adoc in nifi-docs

The guide update is underway, I will add it soonish

markap14 · 2021-02-18T16:39:35Z

.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties

+# nifi.status.repository.questdb.component.id.distinctvalues=${nifi.status.repository.questdb.component.id.distinctvalues}
+# If true, it turns on Java heap based caching for quicker lookup. This increases selection speed but consumes heap memory.
+# nifi.status.repository.questdb.component.id.cached=${nifi.status.repository.questdb.component.id.cached}
+# Turns on indexing of the component id field. For further details please see https://questdb.io/docs/concept/indexes/


While referencing the QuestDB docs may provide some additional insights, we should not expect users to understand how QuestDB works. That is simply an implementation detail. We need to ensure that we fully document exactly how this property will affect the user, given the context of NiFi. We should do this in the administration guide, though, rather than add too much to the nifi.properties.

markap14 · 2021-02-18T16:42:32Z

.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties

+# nifi.status.repository.builder.inmemory=org.apache.nifi.controller.status.history.InMemoryStatusRepositoryBuilder
+# nifi.status.repository.builder.persistent=org.apache.nifi.controller.status.history.EmbeddedQuestDbStatusRepositoryBuilder
+# nifi.status.repository.roles.component=persistent
+# nifi.status.repository.roles.node=inmemory


I'm not sure that I see the benefit to adding these properties at all. If the user wants to persist the data, it should be persisted. If they want to keep it in memory, it should be kept in memory. These properties become confusing and add dubious value. We should lean more toward simple configuration vs. more raw power when we're able to.

Recommend removing all 4 of these properties. Instead, just allow the QuestDB Status Repository to be configured via the nifi.components.status.repository.implementation property, in which case all stats are persistent. If Volatile repo is used, store everything in memory.

markap14 · 2021-02-18T16:45:48Z

.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties

+# nifi.status.repository.questdb.persist.frequency=${nifi.status.repository.questdb.persist.frequency}
+# nifi.status.repository.questdb.persist.roll.frequency=${nifi.status.repository.questdb.persist.roll.frequency}
+# nifi.status.repository.questdb.persist.batch.size=${nifi.status.repository.questdb.persist.batch.size}


These properties also seem too complex to me. Admins shouldn't need to guess what an appropriate "batch size" is for storing metrics. We should try to keep this as simple as possible and just configure how frequently we capture a snapshot. Can always add in additional properties later, if necessary, for tuning. Just don't want to overwhelm users with 15 additional properties when all the user really cares about is "I want this persisted for longer and across restarts."

markap14 · 2021-02-18T16:57:39Z

...i-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FlowController.java

+        // Creating status repository based on implementation class takes precedence over creation based on builder
+        final String implementationClassName = nifiProperties.getProperty(NiFiProperties.COMPONENT_STATUS_REPOSITORY_IMPLEMENTATION);
+
+        if (implementationClassName != null) {


Any time that we fetch a property value from nifi properties, we need to treat null the same as empty strings or strings with only white space. If the property name exists but with no value, you'll get back an empty string here instead of null.
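
A minimal sketch of the check being asked for is shown below. It uses java.util.Properties as a stand-in for NiFiProperties, and the helper name isBlank and the class name PropertyValues are assumptions made for this example.

```java
import java.util.Properties;

final class PropertyValues {
    // Treats null, empty, and whitespace-only values the same way.
    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    public static void main(String[] args) {
        // java.util.Properties stands in for NiFiProperties here.
        Properties properties = new Properties();
        // Property name present, but with no value.
        properties.setProperty("nifi.components.status.repository.implementation", "");

        String implementationClassName =
                properties.getProperty("nifi.components.status.repository.implementation");

        System.out.println(implementationClassName != null);  // true: a null check alone would pass
        System.out.println(isBlank(implementationClassName)); // true: the value is still unusable
    }
}
```

With a check like this, the branch that creates the repository from the implementation class would be taken only when the property carries a real class name.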


simonbence · 2021-02-23T15:40:07Z

I am abandoning this review. Based on @markap14's great comments, I simplified the configuration, which resulted in a cleaner codebase as well. This involved reverting some changes, and to keep the review readable I decided to create a new one. All comments in this review are answered or addressed there.

simonbence · 2021-02-23T16:03:15Z

Please find the successor PR here: PR 4839

turcsanyip reviewed Feb 16, 2021


simonbence force-pushed the NIFI-8113 branch 2 times, most recently from 0af5bd7 to a31493b Compare February 17, 2021 09:50

turcsanyip reviewed Feb 17, 2021


markap14 requested changes Feb 18, 2021


NIFI-8113 Adding persistent status history repository backed by embedded QuestDB

326bfd5

simonbence force-pushed the NIFI-8113 branch from 413383b to 326bfd5 Compare February 23, 2021 14:58

simonbence closed this Feb 23, 2021

simonbence mentioned this pull request Feb 23, 2021

NIFI-8113 Adding persistent status history repository backed by embedded QuestDB #4839

Closed
