Skip to content

NIFI-14109 Refactored remaining processors and control services to be uniform when creating properties and relationships.#9600

Merged
exceptionfactory merged 4 commits intoapache:mainfrom
dan-s1:NIFI-14109
Jan 23, 2025
Merged

Conversation

@dan-s1
Copy link
Copy Markdown
Contributor

@dan-s1 dan-s1 commented Dec 27, 2024

Summary

NIFI-14109
This PR aims to have a lists of PropertyDescriptor objects and sets of Relationship objects all be defined with one PropertyDescriptor and Relationship per line for increased readability. The list variables are named PROPERTY_DESCRIPTORS and the set variables are named RELATIONSHIPS. Also for common PropertyDescriptor objects defined in a parent a method, a method getCommonPropertyDescriptors is defined to allow children to use them. This was done for the following parent and children classes (parents on left and children which use the method are on the right).

  1. AbstractAMQPProcessor - ConsumeAMQP, PublishAMQP
  2. AbstractAwsMachineLearningJobStarter - StartAwsTextractJob
  3. AbstractAwsMachineLearningJobStatusProcessor - GetAwsTextractJobStatus
  4. AbstractAzureCosmosDBProcessor - PutAzureCosmosDBRecord
  5. AbstractEmailProcessor - ConsumeIMAP, ConsumePOP3
  6. AbstractGridFSProcessor (applies this also to relationships) - FetchGridFS, PutGridFS
  7. AbstractHadoopProcessor - AbstractFetchHDFSRecord, AbstractPutHDFSRecord (PutParquet but doesn't use common method), CreateHadoopSequenceFile, DeleteHDFS, FetchHDFS, GetHDFS, GetHDFSEvents, GetHDFSFileInfo, ListHDFS, MoveHDFS, PutHDFS
  8. AbstractJoltTransform - JoltTransformJSON, JoltTransformRecord
  9. AbstractMongoProcessor - DeleteMongo, GetMongo, PutMongo, PutMongoBulkOperations, PutMongoRecord, RunMongoAggregation
  10. SplunkAPICall - PutSplunkHTTP, QuerySplunkIndexingStatus

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

relationships.addAll(getAdditionalRelationships());
this.relationships = Collections.unmodifiableSet(relationships);
this.descriptors = Stream.concat(
PROPERTIES.stream(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no simpler/cleaner option than Stream.concat ... list.stream().... .toList()? Overall this is cleanear in some ways but does seem like there is a cleaner/simpler way of saying 'List from these lists'

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used this as I saw that being used in other parts of the code e.g. ElasticSearchLookupService.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would surprise me if this is the best way to express these declarations.

Copy link
Copy Markdown
Contributor Author

@dan-s1 dan-s1 Dec 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was the cleanest way to make an unmodifiable list. A quick Google search indicates that is the Java 9 way to combine lists and make them unmodifiable.

AMQP_VERSION,
SSL_CONTEXT_SERVICE,
USE_CERT_AUTHENTICATION
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the parent class 'getCommonPropertyDescriptors' method is removed and instead a subclass is supposed to know to use the static list of descriptors as they build their own lists. I think this weakens what the author intended in building that class.

Also there is at least one example much later on in this PR where you kept the parent method. I think keeping the parent method better conveys intent to any subclassers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert this back.

Copy link
Copy Markdown
Contributor Author

@dan-s1 dan-s1 Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up adding 'getCommonPropertyDescriptors' for all the parent children relationships I saw. Below are the ones I made changes for (Parents on the left and children who use the method on the right).

  1. AbstractAMQPProcessor - ConsumeAMQP, PublishAMQP
  2. AbstractAwsMachineLearningJobStarter - StartAwsTextractJob
  3. AbstractAwsMachineLearningJobStatusProcessor - GetAwsTextractJobStatus
  4. AbstractAzureCosmosDBProcessor - PutAzureCosmosDBRecord
  5. AbstractEmailProcessor - ConsumeIMAP, ConsumePOP3
  6. AbstractGridFSProcessor (applies this also to relationships) - FetchGridFS, PutGridFS
  7. AbstractHadoopProcessor - AbstractFetchHDFSRecord, AbstractPutHDFSRecord (PutParquet but doesn't use common method), CreateHadoopSequenceFile, DeleteHDFS, FetchHDFS, GetHDFS, GetHDFSEvents, GetHDFSFileInfo, ListHDFS, MoveHDFS, PutHDFS
  8. AbstractJoltTransform - JoltTransformJSON, JoltTransformRecord
  9. AbstractMongoProcessor - DeleteMongo, GetMongo, PutMongo, PutMongoBulkOperations, PutMongoRecord, RunMongoAggregation
  10. SplunkAPICall - PutSplunkHTTP, QuerySplunkIndexingStatus

@dan-s1 dan-s1 force-pushed the NIFI-14109 branch 2 times, most recently from c05a2c0 to d5e7a3b Compare January 9, 2025 14:02
Copy link
Copy Markdown
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on improving the consistency of approach @dan-s1.

I'm in favor of going with PROPERTY_DESCRIPTORS and RELATIONSHIPS as currently implemented. If you can rebase the pull request, this looks close to completion.

… uniform when creating properties and relationships.
…ROPERTY_DESCRIPTORS. For parent processor classes which defined common properties, made getCommonPropertyDescriptors() method for children to use to load them. Ensured lists of PropertyDescriptor objects has each PropertyDescriptor on its own line and sets of RelationShip objects has each Relationship on its own line to improve readability.
Copy link
Copy Markdown
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in the favor of the current set of changes, as it includes a notable clean up of property descriptor and relationship collection construction.

Do you have any additional feedback @joewitt?

@joewitt
Copy link
Copy Markdown
Contributor

joewitt commented Jan 23, 2025

It is a massive change set so definitely don't feel super comfortable :)

I also find the 'concat/streamof' to be not very nice but that might just be me.

If we are good then yolo! i'm fine

@exceptionfactory
Copy link
Copy Markdown
Contributor

It is a massive change set so definitely don't feel super comfortable :)

I also find the 'concat/streamof' to be not very nice but that might just be me.

If we are good then yolo! i'm fine

Yea, I agree the Stream.concat() is less optimal than the pure List.of(), but it reflects some of the property descriptor inheritance currently in play. That in itself is not always optimal, but I think that is outside of this PR, and the Stream.concat() is marginally better than repetitive add() calls on an intermediate list.

Thanks for the feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants