SOLR-14660 - move HDFS to a module #324
This commit
- moves the code implementing HDFS support to a contrib module which has its own build and tests.
- moves the block caching implementation to the HDFS contrib module
- removes direct references from Solr Core to HDFS
- moves test classes which are required by both Core and the HDFS contrib to the test framework
- updates the documentation link to refer to HdfsDirectoryFactory in its new location
- adds an AwaitsFix annotation to HDFSRecoveryZKTest, which was already failing before these changes and needs to be fixed separately
- removes the deprecated flags from HDFS classes
- removes the deprecated notice from the reference guide
After ensuring that the contrib module is on the classpath, HDFS support can be used as before; no changes are needed in the configuration files.
The contrib module is planned to be implemented as a Solr package in a next phase.
- refactored HdfsTestutil and created HadoopTestUtil
…epositoryFactory.java
…s (1.: HdfsChaosMonkeyNothingIsSafeTest)
- HdfsChaosMonkeyNothingIsSafeTest was extending ChaosMonkeyNothingIsSafeTest.java; however, since no dependency is possible between test projects, it had to be split up
- Created AbstractChaosMonkeyNothingIsSafeTestBase as a common ancestor. The class is placed in the test framework because it has JUnit dependencies and needs to be shared
- Moved the logic in the test() method to doTest(), which is now invoked from the child classes
- Moved FullThrottleStoppableIndexingThread to the test framework as it needs to be shared
- HdfsChaosMonkeyNothingIsSafeTest.java is moved to the hdfs plugin.
…s (2.: HdfsSyncSliceTest) - HdfsSyncSliceTest was extending SyncSliceTest
…s (3.: TestHdfsBackupRestoreCore) - TestHdfsBackupRestoreCore uses BackupRestoreUtils and BackupStatusChecker, these were moved to test framework (these also use log4j classes) - Moved the fetchRestoreStatus method out of TestRestoreCore to a util class as it is shared by hdfs tests and core tests
…s (4.: tests based on BasicDistributedZkTest)
- HdfsNNFailoverTest extends BasicDistributedZkTest. Separated it to AbstractBasicDistributedZkTestBase and BasicDistributedZkTest.
- Also changed the following classes to extend AbstractBasicDistributedZkTestBase:
  - HdfsBasicDistributedZkTest
  - HdfsWriteToMultipleCollectionsTest
  - StressHdfsTest
…s (5.: HdfsChaosMonkeySafeLeaderTest) - HdfsChaosMonkeySafeLeaderTest extends ChaosMonkeySafeLeaderTest - separated by introducing AbstractChaosMonkeySafeLeaderTestBase
…s (6.: HdfsBasicDistributedZk2Test) - HdfsBasicDistributedZk2Test extends BasicDistributedZk2Test - separated by introducing AbstractBasicDistributedZk2TestBase
…s (7.: HdfsRecoveryZkTest) - HdfsRecoveryZkTest extends RecoveryZkTest - introduced AbstractRecoveryZkTestBase
…s (8.: MoveReplicaTest) - MoveReplicaHDFSTest extends MoveReplicaTest - introduced AbstractMoveReplicaTestBase
…s (9.: RestartWhileUpdatingTest) - HDFSRestartWhileUpdatingTest extends RestartWhileUpdatingTest - introduced AbstractRestartWhileUpdatingTestBase
…s (10.: HdfsTlogReplayBufferedWhileIndexingTest) - HdfsTlogReplayBufferedWhileIndexingTest extends TlogReplayBufferedWhileIndexingTest - introduced AbstractTlogReplayBufferedWhileIndexingTestBase
…s (11.: HdfsUnloadDistributedZkTest) - HdfsUnloadDistributedZkTest extends UnloadDistributedZkTest - introduced AbstractUnloadDistributedZkTestBase
…s (12.: HdfsCollectionsAPIDistributedZkTest) - HdfsCollectionsAPIDistributedZkTest extends CollectionsAPIDistributedZkTest - introduced AbstractCollectionsAPIDistributedZkTestBase
…can be accessed by hdfs module tests
…fo and adjusted gradle validations to skip Hadoop code validations
Unfortunately, it seems that some of the plugins used by the gradle check are not compatible with test fixtures. This commit moves the shared test classes to the test framework instead.
- fixed a documentation link to point to hdfs contrib directory - added a short README.md on how to build and use - updated the solr on hdfs ref guide to remove deprecated flag
I went through these two and I'm not sure there is a good way to address them right now. I don't think it should hold up the merge of this PR.
I ran Hadoop-auth, -annotations, and -common are all still in WEB-INF/lib, do we need all of those? I guess auth depends on common and we haven't moved that out yet?
I've been looking through and I think that we can exclude Hadoop-annotations and leave it in only as a test dependency? I'm not sure.
…tion instead of relative paths
@madrob I'm with you I expected more jars to be there in FWIW removing
Thanks @madrob and @dsmiley. I think your comments have been addressed by @warperwolf and me. Any other questions/comments/concerns/thoughts?
This commit
- moves the code implementing HDFS support to a new HDFS module which has its own build and tests.
- moves the block caching implementation to the HDFS module
- removes direct references from Solr Core to the HDFS module
- moves test classes which are required by both Core and the HDFS module to the test framework
- updates the documentation link to refer to HdfsDirectoryFactory in its new location
- removes the deprecated flags from HDFS classes
- removes the deprecated notice from the reference guide
After ensuring that the module is on the classpath, HDFS support can be used as before; no changes are needed in the configuration files.
Some specific changes
- Remove DirectoryFactory.LOCK_TYPE_HDFS
- updateHandler will use the update log instance returned by the directory factory if no class is specified in solrconfig.xml.
- test resource files are now copied from core at build time to avoid maintaining two sets of the same files in git.
- HdfsCollectionsAPIDistributedZkTest extends CollectionsAPIDistributedZkTest
  - introduced AbstractCollectionsAPIDistributedZkTestBase
- HdfsUnloadDistributedZkTest extends UnloadDistributedZkTest
  - introduced AbstractUnloadDistributedZkTestBase
- HdfsTlogReplayBufferedWhileIndexingTest extends TlogReplayBufferedWhileIndexingTest
  - introduced AbstractTlogReplayBufferedWhileIndexingTestBase
- HDFSRestartWhileUpdatingTest extends RestartWhileUpdatingTest
  - introduced AbstractRestartWhileUpdatingTestBase
- MoveReplicaHDFSTest extends MoveReplicaTest
  - introduced AbstractMoveReplicaTestBase
- HdfsRecoveryZkTest extends RecoveryZkTest
  - introduced AbstractRecoveryZkTestBase
- HdfsBasicDistributedZk2Test extends BasicDistributedZk2Test
  - separated by introducing AbstractBasicDistributedZk2TestBase
- HdfsChaosMonkeySafeLeaderTest extends ChaosMonkeySafeLeaderTest
  - separated by introducing AbstractChaosMonkeySafeLeaderTestBase
- HdfsNNFailoverTest extends BasicDistributedZkTest. Separated it to AbstractBasicDistributedZkTestBase and BasicDistributedZkTest.
  - Also changed the following classes to extend AbstractBasicDistributedZkTestBase:
    - HdfsBasicDistributedZkTest
    - HdfsWriteToMultipleCollectionsTest
    - StressHdfsTest
- TestHdfsBackupRestoreCore uses BackupRestoreUtils and BackupStatusChecker; these were moved to the test framework (these also use log4j classes)
  - Moved the fetchRestoreStatus method out of TestRestoreCore to a util class as it is shared by hdfs tests and core tests
- HdfsSyncSliceTest was extending SyncSliceTest
- HdfsChaosMonkeyNothingIsSafeTest was extending ChaosMonkeyNothingIsSafeTest.java; however, since no dependency is possible between test projects, it had to be split up
  - Created AbstractChaosMonkeyNothingIsSafeTestBase as a common ancestor. The class is placed in the test framework because it has JUnit dependencies and needs to be shared
  - Moved the logic in the test() method to doTest(), which is now invoked from the child classes
  - Moved FullThrottleStoppableIndexingThread to the test framework as it needs to be shared
Closes PR #324
Co-authored-by: Istvan Farkas <ent128k@gmail.com>
Co-authored-by: Kevin Risden <krisden@apache.org>
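The update log fallback mentioned in the commit message ("updateHandler will use the update log instance returned by the directory factory if no class is specified in solrconfig.xml") could look roughly like the sketch below. Every type and method name here is a hypothetical stand-in chosen for illustration, not Solr's actual API, and the lookup map merely substitutes for loading a configured class by name.

```java
import java.util.Map;
import java.util.function.Supplier;

// All types below are hypothetical stand-ins, not Solr's real classes.
interface SketchUpdateLog {
  String kind();
}

class SketchLocalUpdateLog implements SketchUpdateLog {
  public String kind() { return "local"; }
}

class SketchHdfsUpdateLog implements SketchUpdateLog {
  public String kind() { return "hdfs"; }
}

abstract class SketchDirectoryFactory {
  // Default update log for this storage backend; subclasses may override it.
  SketchUpdateLog newDefaultUpdateLog() { return new SketchLocalUpdateLog(); }
}

class SketchLocalDirectoryFactory extends SketchDirectoryFactory {}

class SketchHdfsDirectoryFactory extends SketchDirectoryFactory {
  @Override
  SketchUpdateLog newDefaultUpdateLog() { return new SketchHdfsUpdateLog(); }
}

class SketchUpdateHandler {
  // Assumption: this map stands in for loading a configured class by name.
  private static final Map<String, Supplier<SketchUpdateLog>> KNOWN = Map.of(
      "SketchLocalUpdateLog", SketchLocalUpdateLog::new,
      "SketchHdfsUpdateLog", SketchHdfsUpdateLog::new);

  static SketchUpdateLog resolve(String configuredClass, SketchDirectoryFactory factory) {
    // No class configured: defer to whatever the directory factory provides.
    if (configuredClass == null || configuredClass.isEmpty()) {
      return factory.newDefaultUpdateLog();
    }
    // An explicitly configured class takes precedence over the factory default.
    Supplier<SketchUpdateLog> supplier = KNOWN.get(configuredClass);
    if (supplier == null) {
      throw new IllegalArgumentException("Unknown update log class: " + configuredClass);
    }
    return supplier.get();
  }
}
```

With this shape, the core code never needs to name an HDFS class: the HDFS-specific update log is reachable only through the factory that the HDFS module supplies.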
Merged in cb2e58f
Big shout out to @warperwolf who did 99% of the heavy lifting here.
https://issues.apache.org/jira/browse/SOLR-14660
Description
This PR moves HDFS from Core to a Contrib module.
Note: I kept separate commits to make it easier to review and to change / revert if necessary. They can be squashed once everything is polished and we have a green light.
Solution
Tests
Gradle does not allow dependencies between the test projects of two different modules, so in such cases I introduced an abstract base class (here AbstractChaosMonkeyNothingIsSafeTest) which is extended by both. A good place for these classes would be Gradle's test fixtures feature, which is designed specifically for this purpose. However, some of the Gradle plugins Solr uses fail because they are not compatible with fixtures (see https://issues.apache.org/jira/browse/SOLR-14660?focusedCommentId=17409690&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17409690 ), so I moved these classes to the Solr Test Framework project.
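A minimal sketch of the split described above, assuming a simplified scenario: the shared logic sits in an abstract base class (which would live in the test framework), and each test project contributes a thin subclass that calls doTest(). The class names and the recorded "steps" are hypothetical and only illustrate the shape; the real tests use JUnit and the Solr test framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared base class; in the PR such classes live in the Solr test framework.
abstract class AbstractStorageScenarioTestBase {
  protected final List<String> steps = new ArrayList<>();

  // Each concrete test names the storage backend it runs against.
  protected abstract String backend();

  // Shared scenario logic that used to sit in the concrete test's test() method.
  protected void doTest() {
    steps.add("start cluster on " + backend());
    steps.add("index documents");
    steps.add("verify replicas");
  }
}

// Would belong to the core test project.
class LocalStorageScenarioTest extends AbstractStorageScenarioTestBase {
  @Override
  protected String backend() { return "local"; }

  public List<String> test() { doTest(); return steps; }
}

// Would belong to the HDFS module's test project; it does not depend on the core tests.
class HdfsStorageScenarioTest extends AbstractStorageScenarioTestBase {
  @Override
  protected String backend() { return "hdfs"; }

  public List<String> test() { doTest(); return steps; }
}

class SharedBaseDemo {
  public static void main(String[] args) {
    System.out.println(new LocalStorageScenarioTest().test());
    System.out.println(new HdfsStorageScenarioTest().test());
  }
}
```

Neither subclass references the other, so the two test projects only need a dependency on the project that holds the base class.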