New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NIFI-8113 Adding persistent status history repository backed by embed… #4821
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@simonbence I ran through the code and it looks good overall.
Added some comments so far. Will continue with testing/checking it in more detail.
nifi-nar-bundles/nifi-framework-bundle/nifi-framework-nar/src/main/resources/META-INF/NOTICE
Outdated
Show resolved
Hide resolved
.../src/main/java/org/apache/nifi/controller/status/history/questdb/QuestDbWritingTemplate.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/nifi/controller/status/history/ComponentStatusRepositoryFacade.java
Outdated
Show resolved
Hide resolved
private final int numDataPoints; | ||
private volatile long lastCaptureTime = 0L; | ||
private final NodeStatusRepository nodeStatusRepository; | ||
private final ComponentStatusRepository componentStatusRepository; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As VolatileComponentStatusRepository
provides backward compatibility for the in memory storage and in the background it just delegates requests to the new InMemory*StatusRepositories
, using those classes instead of the interface types would fit to the role of this class better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right with that, however the instance creation is managed by the InMemoryStatusRepositoryBuilder
, which is an implementation for StatusRepositoryBuilder
. This comes with that it will not expose the implementation class it returns with and I wish not to expose that. (Also: adding an other way to create the instances would bring in code duplication and unnecessary complexity in my opinion)
# nifi.status.repository.builder.inmemory=org.apache.nifi.controller.status.history.InMemoryStatusRepositoryBuilder | ||
# nifi.status.repository.builder.persistent=org.apache.nifi.controller.status.history.EmbeddedQuestDbStatusRepositoryBuilder | ||
# nifi.status.repository.roles.component=persistent | ||
# nifi.status.repository.roles.node=inmemory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the new InMemory implementation should be the default instead of adding these lines commented out.
As far as I understand, the backward compatibility point here is to support old nifi.properties
files where only nifi.components.status.repository.implementation
property exists. And it is provided / works even if the default is not the old property.
New installations could (should) go with the new configuration / implementation classes.
0af5bd7
to
a31493b
Compare
...ore/src/main/java/org/apache/nifi/controller/status/history/EmbeddedQuestDbStatusWriter.java
Outdated
Show resolved
Hide resolved
...ain/java/org/apache/nifi/controller/status/history/questdb/QuestDbEntityWritingTemplate.java
Show resolved
Hide resolved
...c/main/java/org/apache/nifi/controller/status/history/storage/BufferedNodeStatusStorage.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @simonbence! This is a VERY powerful addition! Code looks good to me. Did some minimal testing, but I will want to do some more testing before fully giving it a +1. The only real concern that I had with the PR is the update to the nifi.properties. There's a LOT going on there. I think there are 2 reasons for that. Firstly, there are some properties that I think really can be removed - they add flexibility but IMO aren't really necessary and add to the complexity of configuring. Secondly, the properties tend to have a lot of comments associated with them. We need to instead keep comments fairly minimal and update the Admin Guide to fully document each of these properties. What effect will they have, etc.
Otherwise, all looks good, but I'll continue doing some testing!
) { | ||
while (cursor.hasNext()) { | ||
final Record record = cursor.getRecord(); | ||
result.add(new StringBuilder(record.getStr(0)).toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we creating a StringBuilder here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Record#getStr
returns with a CharSequence of some kind. I found StringBuilder the most elaborate way to ensure the correct content will be extracted as with most other ways only a #toString
would be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to just use record.getStr(0).toString()
- to call the toString
method of CharSequence
directly? With that, if the object that is returned happens to be a String
object (which implements CharSequence
) then the toString()
method simply returns this
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The record return with an undetermined implementation of CharSequence
(actually it's CharSequenceView
, which is an internal implementation of the QuestDB) where it is not guaranteed that the toString
will be implemented, or implemented properly. I was striving to keep in the safe side
@@ -113,16 +113,16 @@ public String getField() { | |||
|
|||
|
|||
private static long calculateTaskMillis(final ProcessGroupStatus status) { | |||
long nanos = 0L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these nanos being changed to millis? This leads to a lot of rounding errors, resulting in the data being both less precise and less accurate. By holding onto nanos and converting once at the end, it's also more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I mentioned this at the PR description, but the point would be to avoid loss of information: with the original code, at every level of the recursion we did a nano > millis conversion, but the caller (one level up in the recursion) would consider the result as nano. Thus, the deeper we are in the group structure, the more times we make a conversion, which looks to be incorrect.
If you still think that this comes with rounding errors, what I would suggest is to introduce a calculateTaskNanos
, which would handle the recursion and work without converting, ant the calculateTaskMillis
would call this and converting the end result only once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this would solve both issues:
private static long calculateTaskMillis(final ProcessGroupStatus status) {
return TimeUnit.MILLISECONDS.convert(calculateTaskNanos(status), TimeUnit.NANOSECONDS);
}
private static long calculateTaskNanos(final ProcessGroupStatus status) {
long nanos = 0L;
for (final ProcessorStatus procStatus : status.getProcessorStatus()) {
nanos += procStatus.getProcessingNanos();
}
for (final ProcessGroupStatus childStatus : status.getProcessGroupStatus()) {
nanos += calculateTaskNanos(childStatus);
}
return nanos;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Yes, I think this is a good approach, to calculate recursively using nanos and then converting to millis only after the recursive call.
.../nifi-framework-bundle/nifi-framework/nifi-resources/src/main/resources/conf/nifi.properties
Outdated
Show resolved
Hide resolved
nifi.components.status.repository.implementation=${nifi.components.status.repository.implementation} | ||
|
||
# Builder based specification. Gives the possibility to store Node and Component Status History information in different storage solutions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All properties that get added to this file need to be fully documented in the administration-guide.adoc
in nifi-docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The guide update is underway, I will add it soonish
# nifi.status.repository.questdb.component.id.distinctvalues=${nifi.status.repository.questdb.component.id.distinctvalues} | ||
# If true, it turns on Java heap based caching for quicker lookup. This increases selection speed but consumes heap memory. | ||
# nifi.status.repository.questdb.component.id.cached=${nifi.status.repository.questdb.component.id.cached} | ||
# Turns on indexing of the component id field. For further details please see https://questdb.io/docs/concept/indexes/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While referencing the QuestDB docs may provide some additional insights, we should not expect users to understand how QuestDB works. That is simply an implementation detail. We need to ensure that we fully document exactly how this property will affect the user, given the context of NiFi. We should do this in the administration guide, though, rather than add too much to the nifi.properties.
# nifi.status.repository.builder.inmemory=org.apache.nifi.controller.status.history.InMemoryStatusRepositoryBuilder | ||
# nifi.status.repository.builder.persistent=org.apache.nifi.controller.status.history.EmbeddedQuestDbStatusRepositoryBuilder | ||
# nifi.status.repository.roles.component=persistent | ||
# nifi.status.repository.roles.node=inmemory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I see the benefit to adding these properties at all. If the user wants to persist the data, it should be persisted. If they want to keep it in memory, it should be kept in memory. These properties become confusing and add dubious value. We should lean more toward simple configuration vs. more raw power when we're able to.
Recommend removing all 4 of these properties. Instead, just allow the QuestDB Status Repository to be configured via the nifi.components.status.repository.implementation
property, in which case all stats are persistent. If Volatile repo is used, store everything in memory.
# nifi.status.repository.questdb.persist.frequency=${nifi.status.repository.questdb.persist.frequency} | ||
# nifi.status.repository.questdb.persist.roll.frequency=${nifi.status.repository.questdb.persist.roll.frequency} | ||
# nifi.status.repository.questdb.persist.batch.size=${nifi.status.repository.questdb.persist.batch.size} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These properties also seem too complex to me. Admins shouldn't need to guess what an appropriate "batch size" is for storing metrics. We should try to keep this as simple as possible and just configure how frequently we capture a snapshot. Can always add in additional properties later, if necessary, for tuning. Just don't want to overwhelm users with 15 additional properties when all the user really cares about is "I want this persisted for longer and across restarts."
// Creating status repository based on implementation class takes precedence over creation based on builder | ||
final String implementationClassName = nifiProperties.getProperty(NiFiProperties.COMPONENT_STATUS_REPOSITORY_IMPLEMENTATION); | ||
|
||
if (implementationClassName != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any time that we fetch a property value from nifi properties, we need to treat null
the same as empty strings or strings with only white space. If the property name exists but with no value, you'll get back an empty string here instead of null.
413383b
to
326bfd5
Compare
I do abandon this review. Based on @markap14 's great comments, I simplified the configuration which resulted a more clean codebase as well. This comes with some changes I reverted and in order to keep the review readable I decided to create a new. All comments in this review are answered or aimed here. |
Please find the predecessor PR here: PR 4839 |
This would be a proposal for having persistent status history for both component and node level metrics. During the implementation I tried to balance between introducing some flexibility in usage and supporting the existing behaviour. As an end result, I did split the component status repository (now it is StatusRepository) into one which is responsible for component metrics and an other responsible for node level metrics. They might be configured independently on some level (like it is possible to store data into a different storage or for a different amount of time window), but in order to support the previous configuration, there is a facade for them which provides the same composite service as it did before (component and node).
As for code organisational part I worked with three concepts: repository is the top level entity, provides the service for the clients (FlowController, etc.). In general, this is the same as before, only split into two parts: node and component. The storage classes are part of the repositories and are responsible for manage the details of a given type, like processor status, node status, etc. Finally the WriterTemplates and ReaderTemplates are merely helpers exist to deal with QuestDB API calls.
Some remarks on design decisions:
Thank you for investing your time to my PR!
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically
main
)?Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not
squash
or use--force
when pushing to allow for clean monitoring of changes.For code changes:
mvn -Pcontrib-check clean install
at the rootnifi
folder?LICENSE
file, including the mainLICENSE
file undernifi-assembly
?NOTICE
file, including the mainNOTICE
file found undernifi-assembly
?.displayName
in addition to .name (programmatic access) for each of the new properties?For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.