NIFI-13934 Remove Hive 3 Catalog from Iceberg Services #9452
exceptionfactory wants to merge 1 commit into apache:main
Conversation
- Removed Iceberg Hive Metastore Catalog implementation due to the Hive 3 end-of-life declaration
- Removed associated dependency check suppressions that are no longer needed
That PR is a bit stunning in just how much nonsense gets deleted that was purely there to work around long-standing reported vulns, pom wrangling, and so on. Thanks for doing this. Will give it a look and verify results on the march to clean builds and vulnerability reports.
Sorry, but at first look I'm -1 to this PR: First of all I would like to note that I completely agree with removing Hive 3 dependencies and removing these vulnerabilities, thank you David for bringing this topic up and initiating the discussion/changes! My concerns lie with the timing and the complexity introduced by these changes. By timing I mean:
+1 on @arpadboda's opinion. Hive 3 support should be removed, but doing this right before the release and removing the Hive support from Iceberg could lead to confusion.
Thanks for the feedback @joewitt, @arpadboda, and @mark-bathori. Regarding removing a component versus removing a NAR, it is important to note that the Regarding the timing, I realize that Hive 4 is fairly new and that the decision to declare Hive 3 EOL is also fairly recent. As we have an opportunity to provide a cleaner baseline for NiFi 2.0, that is the reason for raising this right now. We could revisit this after the NiFi 2.0 release, and consider this an acceptable change given that Hive 3 is no longer receiving updates. Anyone who needs Hive 3 Catalog support could continue to use the 2.0.0-M4 release version, with the understanding that it is not supported from either NiFi or Hive. I think there is actually more confusion from releasing the current version, because it implies support from the NiFi project, which we are not able to provide, given that Hive 3 is EOL. @mark-bathori Do you have a rough idea of what would be involved in implementing support for Hive 4? @arpadboda For clarification, if the current Hive 3 version of
@arpadboda I should also note that the NiFi release is the source code, for which the project is responsible. Even though Iceberg support for If I understand correctly, it sounds like you would be in favor of removal if this were in its own NAR? That sounds like a way forward, but it would be helpful if you can clarify.
@exceptionfactory Didn't dig too deep yet but the Originally the Service implementations were separate in the service NAR but they had to be moved due to the Kerberos support (NIFI-11334). The problem is that for the processor instance isolation the
Thanks for the additional background on the current structure @mark-bathori, I appreciate the challenges that the Kerberos support presents in terms of class loading. In light of those issues, it sounds like more significant restructuring is necessary to provide the kind of decoupling required. With general momentum around Polaris and REST Catalog capabilities, I expect more community interest in that direction, but of course there will still be different implementations in various places. Given the complexities that Kerberos support introduces, it seems like it may be necessary to decouple that, perhaps through a different
@exceptionfactory I agree that the decoupling you mention would be beneficial, but that's far from the scope of this PR and I don't feel like anyone in the community has the capacity to do that in the very short term. So in a nutshell, here are the options I see: A is ideal but I see it as very complex to make happen in the near future, B seems feasible to me, and in the case of C I feel like it would introduce more pain than gain.
@arpadboda and @mark-bathori In light of the current situation, I propose removing the current Iceberg bundle entirely for the initial release. We are not in a position to maintain EOL components, so given the current complexities, it seems much better to clear the current baseline and start with a new approach. Those who need the existing Iceberg support can continue to use the 2.0.0-M4 version. That will provide the opportunity to address the issues raised in a holistic manner. I can put together a separate PR for that removal. What do you think of that option?
@exceptionfactory Thanks, sadly, but I have to say yes to that, please feel free to proceed!
Thanks for the reply @arpadboda, and thanks again for pointing out the background for the current implementation @mark-bathori. I am closing this PR and I have opened #9460 for NIFI-13938 to remove the existing Iceberg components for now.
Summary
NIFI-13934 Removes the Hive 3 Catalog implementation from Iceberg services due to the Apache Hive project declaring Hive 3 end of life on 8 October 2024. Removing Hive 3 dependencies removes several flagged vulnerabilities associated with libthrift 0.9.3 and also avoids potential issues due to lack of future updates for Hive 3.
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
Pull Request Formatting
main branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
Licensing
LICENSE and NOTICE files
Documentation