HDDS-8690. Ozone Support deletion related parameter dynamic configuration #4798
# Conflicts: # hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeService.java # hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java # hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/reconfig/TestOmReconfigure.java # hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/reconfig/TestScmReconfigure.java # hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
@adoroszlai Should we maintain ozone/hadoop-hdds/docs/content/feature/Reconfigurability.md Lines 41 to 48 in 9c6cd4b
Yes, let's remove the specific properties. Documenting the way to explore the list of properties is enough. An outdated doc may be worse than no doc.
…rieve the reconfigurable properties list
# Conflicts: # hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/ScmConfig.java # hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeService.java # hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java # hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/reconfig/TestScmReconfiguration.java # hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
Thanks @xichen01 for continuing work on this. Mostly looks good.
private String reconfHddsScmBlockDeletionPerIntervalMax(String newVal) {
  getConfiguration().set(HDDS_SCM_BLOCK_DELETION_PER_INTERVAL_MAX, newVal);

  getScmBlockManager().getSCMBlockDeletingService()
      .setBlockDeleteTXNum(Integer.parseInt(newVal));
  return newVal;
}
We can simplify reconfiguration of this property by using `reconfigurable = true` and making sure the same instance of `scmConfig` is used in `SCMBlockDeletingService` as the one being reconfigured in SCM.
Full change (also fixes the `TestScmReconfiguration` failure): adoroszlai@1ce698f
@Test
public void blockDeleteThreadMax() throws ReconfigurationException {
  // Start the service and get the original pool size
  ThreadPoolExecutor executor = ((DeleteBlocksCommandHandler)
      getFirstDatanode().getDatanodeStateMachine().getCommandDispatcher()
          .getDeleteBlocksCommandHandler()).getExecutor();
  int originPoolSize = executor.getMaximumPoolSize();

  // Attempt to increase the pool size by 1 and verify if it's successful
  getFirstDatanode().getReconfigurationHandler().reconfigurePropertyImpl(
      HDDS_DATANODE_BLOCK_DELETE_THREAD_MAX,
      String.valueOf(originPoolSize + 1));
  assertEquals(originPoolSize + 1, executor.getMaximumPoolSize());
  assertEquals(originPoolSize + 1, executor.getCorePoolSize());

  // Attempt to decrease the pool size by 1 and verify if it's successful
  getFirstDatanode().getReconfigurationHandler().reconfigurePropertyImpl(
      HDDS_DATANODE_BLOCK_DELETE_THREAD_MAX,
      String.valueOf(originPoolSize - 1));
  assertEquals(originPoolSize - 1, executor.getMaximumPoolSize());
  assertEquals(originPoolSize - 1, executor.getCorePoolSize());
}

@Test
public void blockDeletingServiceWorkers() throws ReconfigurationException {
  ScheduledThreadPoolExecutor executor = (ScheduledThreadPoolExecutor)
      getFirstDatanode().getDatanodeStateMachine().getContainer()
          .getBlockDeletingService().getExecutorService();
  int originPoolSize = executor.getCorePoolSize();

  // Attempt to increase the pool size by 1 and verify if it's successful
  getFirstDatanode().getReconfigurationHandler().reconfigurePropertyImpl(
      OZONE_BLOCK_DELETING_SERVICE_WORKERS,
      String.valueOf(originPoolSize + 1));
  assertEquals(originPoolSize + 1, executor.getCorePoolSize());

  // Attempt to decrease the pool size by 1 and verify if it's successful
  getFirstDatanode().getReconfigurationHandler().reconfigurePropertyImpl(
      OZONE_BLOCK_DELETING_SERVICE_WORKERS,
      String.valueOf(originPoolSize - 1));
  assertEquals(originPoolSize - 1, executor.getCorePoolSize());
}
We can avoid duplication by converting to `@ParameterizedTest`: adoroszlai@5aeb768
Also fixes findbugs error:
M D RV: Return value of org.apache.hadoop.ozone.container.keyvalue.statemachine.background.BlockDeletingService.getBlockLimitPerInterval() ignored, but method has no side effect At TestDatanodeReconfiguration.java:[line 58]
https://github.com/xichen01/ozone/actions/runs/5577408467/jobs/10190274215#step:6:2367
@@ -26,6 +26,7 @@
 import static org.apache.commons.lang3.RandomStringUtils.randomAlphabetic;
 import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_ADMINISTRATORS;
 import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_READONLY_ADMINISTRATORS;
+import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_KEY_DELETING_LIMIT_PER_TASK;
The `reconfigurableProperties()` test case needs to be updated: adoroszlai@9d9226e
Fixes https://github.com/xichen01/ozone/actions/runs/5577408467/jobs/10190272872#step:5:4011
public DatanodeConfiguration getDfsConf() {
  return dfsConf;
Nit: I think `dnConf` / `getDnConf` would be better.
# Conflicts: # hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/reconfig/TestScmReconfiguration.java
Thanks @xichen01 for updating the patch.
@Override
public synchronized <T> T getSingletonObject(Class<T> configurationClass) {
  if (singletons.containsKey(configurationClass)) {
    return (T) singletons.get(configurationClass);
  }

  T singletonObject = getObject(configurationClass);
  singletons.put(configurationClass, singletonObject);
  return singletonObject;
}
This can be implemented more simply:
  @Override
- public synchronized <T> T getSingletonObject(Class<T> configurationClass) {
-   if (singletons.containsKey(configurationClass)) {
-     return (T) singletons.get(configurationClass);
-   }
-   T singletonObject = getObject(configurationClass);
-   singletons.put(configurationClass, singletonObject);
-   return singletonObject;
- }
+ public <T> T getSingletonObject(Class<T> configurationClass) {
+   return (T) singletons.computeIfAbsent(configurationClass,
+       c -> getObject(configurationClass));
+ }
Done.
It doesn't need to be `synchronized`, since it uses a `ConcurrentHashMap`'s atomic operation.
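For readers following this thread, here is a minimal, self-contained sketch of the `computeIfAbsent` pattern being discussed. `SingletonObjectCache` and its `factory` parameter are hypothetical stand-ins for illustration, not Ozone classes; the factory plays the role of `getObject`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache that keeps one object per class (not an Ozone class). */
class SingletonObjectCache {
  private final Map<Class<?>, Object> singletons = new ConcurrentHashMap<>();
  // Stand-in for getObject(): builds a fresh object for the given class.
  private final Function<Class<?>, Object> factory;

  SingletonObjectCache(Function<Class<?>, Object> factory) {
    this.factory = factory;
  }

  /** No synchronized keyword needed: computeIfAbsent is atomic per key. */
  <T> T getSingletonObject(Class<T> type) {
    return type.cast(singletons.computeIfAbsent(type, factory));
  }
}

class SingletonObjectCacheDemo {
  public static void main(String[] args) {
    AtomicInteger created = new AtomicInteger();
    SingletonObjectCache cache = new SingletonObjectCache(c -> {
      created.incrementAndGet();
      return new StringBuilder();
    });

    StringBuilder first = cache.getSingletonObject(StringBuilder.class);
    StringBuilder second = cache.getSingletonObject(StringBuilder.class);

    System.out.println(first == second); // prints "true": same instance
    System.out.println(created.get());   // prints "1": factory ran once
  }
}
```

Using `type.cast` instead of an unchecked `(T)` cast avoids a compiler warning. One caveat of this approach: the mapping function passed to `ConcurrentHashMap.computeIfAbsent` must not itself modify the same map.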
/**
 * Returns a singleton instance of the given configuration class.
 * If an instance of the class has already been created,
 * it will be returned; otherwise, a new instance will be created,
 * stored in a map for future retrieval.
 *
 * @param configurationClass The class for which a singleton
 *                           instance is required
 * @return a singleton instance of the given class
 */
default <T> T getSingletonObject(Class<T> configurationClass) {
  return getObject(configurationClass);
}
I don't think this should be added to the `ConfigurationSource` interface, since it does not offer the "singleton" functionality.
`getSingletonObject` has been removed from `ConfigurationSource.java`.
Thanks @xichen01 for making the change. On second thought, instead of adding a new method `getSingletonObject`, we should override `getObject` from `ConfigurationSource` with the singleton-based implementation. This way we can keep passing `ConfigurationSource`.
In the long run, we should move towards passing the type-safe config objects (`DatanodeConfiguration`, `ScmConfig`, etc.)
If we change the logic of `ConfigurationSource#getObject` directly, will this cause some of the tests to fail? At the moment we can assume that each object returned by `ConfigurationSource#getObject` is independent and unaffected by the others.
If `ConfigurationSource#getObject` returns a singleton, tests may start to affect each other.
Should we keep this implementation (`getSingletonObject`) temporarily, and implement a singleton `ConfigurationSource#getObject` in another MR?
@xichen01, could you rebase this patch once you have time?
@@ -335,4 +334,5 @@ private static void addDeprecatedKeys() {
         OZONE_CONTAINER_COPY_WORKDIR)
     });
   }
+
Can you revert the changes in this file, since they don't actually change any content?
Done.
@@ -137,7 +146,7 @@ public class DatanodeConfiguration {
    * missed. With max threads 5, optimistically DN can handle 1500 individual
    * container delete tx in 60s with RocksDB cache miss.
    */
-  @Config(key = "block.delete.threads.max",
+  @Config(key = BLOCK_DELETE_THREAD_MAX,
@xichen01, I see that this property is registered with ReconfigurationHandler but doesn't have the "reconfigurable = true" field, while the property "block.deleting.limit.per.interval" below has "reconfigurable = true" but is not registered with ReconfigurationHandler.
Could you explain a bit more what steps are required to make a property reconfigurable, and whether "reconfigurable = true" is a mandatory attribute for a reconfigurable property?
If changing a configuration only requires updating the value itself, with no additional operations, we can just add `reconfigurable = true` to the `@Config` annotation. The reconfigurable configuration is then registered automatically:
reconfigurationHandler =
    new ReconfigurationHandler("DN", conf, this::checkAdminPrivilege)
        .register(HDDS_DATANODE_BLOCK_DELETE_THREAD_MAX,
            this::reconfigBlockDeleteThreadMax)
        .register(OZONE_BLOCK_DELETING_SERVICE_WORKERS,
            this::reconfigDeletingServiceWorkers)
        .register(conf.getObject(DatanodeConfiguration.class)); // register the reconfigurable configuration
The configuration's value can then be modified at runtime with the `ozone admin reconfig` command.
But if something extra needs to happen when the value changes, such as resizing the thread pool (`block.delete.threads.max`), we need to register the operation manually.
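To illustrate the "register the operation manually" case, here is a rough sketch of a handler that resizes a thread pool when a property changes. `MiniReconfigRegistry` and `PoolResizer` are hypothetical simplifications written for this example, not Ozone's `ReconfigurationHandler`, and the property name is used only as a map key. The setter order matters: `setMaximumPoolSize` rejects a value below the current core size, and on newer JDKs `setCorePoolSize` rejects a value above the current maximum.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.UnaryOperator;

/** Hypothetical, simplified registry; not Ozone's ReconfigurationHandler. */
class MiniReconfigRegistry {
  private final Map<String, UnaryOperator<String>> handlers = new HashMap<>();

  MiniReconfigRegistry register(String property, UnaryOperator<String> handler) {
    handlers.put(property, handler);
    return this;
  }

  String reconfigure(String property, String newValue) {
    UnaryOperator<String> handler = handlers.get(property);
    if (handler == null) {
      throw new IllegalArgumentException(property + " is not reconfigurable");
    }
    return handler.apply(newValue);
  }
}

/** Hypothetical helper that keeps core size <= max size at every step. */
class PoolResizer {
  static void resize(ThreadPoolExecutor executor, int newSize) {
    if (newSize > executor.getMaximumPoolSize()) {
      // Growing: raise the ceiling first, then the core size.
      executor.setMaximumPoolSize(newSize);
      executor.setCorePoolSize(newSize);
    } else {
      // Shrinking (or unchanged): lower the core size first, then the ceiling.
      executor.setCorePoolSize(newSize);
      executor.setMaximumPoolSize(newSize);
    }
  }
}

class MiniReconfigDemo {
  public static void main(String[] args) {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        5, 5, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

    MiniReconfigRegistry registry = new MiniReconfigRegistry()
        .register("block.delete.threads.max", value -> {
          PoolResizer.resize(executor, Integer.parseInt(value));
          return value;
        });

    registry.reconfigure("block.delete.threads.max", "6");
    System.out.println(executor.getCorePoolSize() + "/"
        + executor.getMaximumPoolSize()); // prints "6/6"

    registry.reconfigure("block.delete.threads.max", "4");
    System.out.println(executor.getCorePoolSize() + "/"
        + executor.getMaximumPoolSize()); // prints "4/4"

    executor.shutdown();
  }
}
```

This mirrors the increase-then-decrease checks in the `blockDeleteThreadMax` test quoted earlier in the review, under the assumption that the pool is kept at a fixed size (core equals max).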
So either register the property explicitly with a consumer handler, or set the "reconfigurable = true" attribute on the property. In the latter case the property will be auto-registered with a default consumer handler, right? @xichen01
Yes, if you set the "reconfigurable = true" attribute, the configuration will be handled automatically on reconfiguration.
One point to note, though: the object registered with `ReconfigurationHandler` must be the same instance as the one that is later used to read the value after reconfiguring. For example:
DatanodeConfiguration dnConf1 = conf.getObject(DatanodeConfiguration.class);
DatanodeConfiguration dnConf2 = conf.getObject(DatanodeConfiguration.class);
reconfigurationHandler =
    new ReconfigurationHandler("DN", conf, this::checkAdminPrivilege)
        .register(dnConf1);
After reconfiguring, only `dnConf1` sees the changed value, because only `dnConf1` was registered with the `ReconfigurationHandler`.
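The instance-identity point above can be reproduced in a small standalone sketch. `DemoConfig` and `InstanceTrackingHandler` are hypothetical classes invented for this illustration (the field name and the default of 5000 are arbitrary); they only model the behaviour described in the comment, not Ozone's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical config object, standing in for a class like DatanodeConfiguration. */
class DemoConfig {
  private volatile int blockLimit = 5000; // arbitrary default for the demo

  int getBlockLimit() {
    return blockLimit;
  }

  void setBlockLimit(int blockLimit) {
    this.blockLimit = blockLimit;
  }
}

/** Hypothetical handler that updates only the instances registered with it. */
class InstanceTrackingHandler {
  private final List<DemoConfig> registered = new ArrayList<>();

  InstanceTrackingHandler register(DemoConfig config) {
    registered.add(config);
    return this;
  }

  void reconfigureBlockLimit(int newValue) {
    for (DemoConfig config : registered) {
      config.setBlockLimit(newValue);
    }
  }
}

class InstanceIdentityDemo {
  public static void main(String[] args) {
    // Two independent objects, like two separate getObject() calls.
    DemoConfig dnConf1 = new DemoConfig();
    DemoConfig dnConf2 = new DemoConfig();

    InstanceTrackingHandler handler =
        new InstanceTrackingHandler().register(dnConf1);
    handler.reconfigureBlockLimit(10000);

    System.out.println(dnConf1.getBlockLimit()); // prints "10000"
    System.out.println(dnConf2.getBlockLimit()); // prints "5000": never registered
  }
}
```

This is why a service that reads the value after reconfiguration needs to hold the registered instance rather than fetch a fresh object of its own.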
@xichen01, thanks for the explanation.
@@ -260,7 +261,7 @@ public DatanodeStateMachine(DatanodeDetails datanodeDetails,

   @VisibleForTesting
   public DatanodeStateMachine(DatanodeDetails datanodeDetails,
-                              ConfigurationSource conf) throws IOException {
+                              OzoneConfiguration conf) throws IOException {
Why did we change the conf from ConfigurationSource to OzoneConfiguration? It's also changed in OzoneContainer.java below.
@xichen01 , is this one missed?
Done
## OM Reconfigurability
## Retrieve the reconfigurable properties list
To retrieve all the reconfigurable properties list for a specific component in Ozone,
you can use the command: `ozone admin reconfig properties --address=<ip:port>`.
`ozone admin reconfig properties --address=<ip:port>` doesn't work. The accepted format is what you used in the examples below.
Already corrected
@@ -4704,6 +4707,14 @@ private String reconfOzoneReadOnlyAdmins(String newVal) {
     return String.valueOf(newVal);
   }

+  private String reconfOzoneKeyDeletingLimitPerTask(String newVal) {
+    getConfiguration().set(OZONE_KEY_DELETING_LIMIT_PER_TASK, newVal);
Please add a check that the new value is positive.
Done.
I think maybe we can provide a `conf.getUnsignedInt()` for configurations that must be larger than 0. We can do the check inside `conf.getUnsignedInt()`.
You mean conf.setUnsignedInt? It's a good idea. But the configuration class is in the Hadoop common module.
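Until something like that exists in the configuration class, the positive-value check requested above could live in a small local helper. The sketch below is only an illustration: `PositiveIntParser`, `parsePositive`, and the property name in the demo are hypothetical and are not part of Ozone or Hadoop.

```java
/** Hypothetical helper; not part of Ozone or Hadoop. */
final class PositiveIntParser {
  private PositiveIntParser() {
  }

  /**
   * Parses newVal as an int and rejects null, non-numeric,
   * zero, and negative input with an IllegalArgumentException.
   */
  static int parsePositive(String property, String newVal) {
    if (newVal == null) {
      throw new IllegalArgumentException(property + " must not be null");
    }
    final int parsed;
    try {
      parsed = Integer.parseInt(newVal.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          property + " must be an integer, got: " + newVal, e);
    }
    if (parsed <= 0) {
      throw new IllegalArgumentException(
          property + " must be positive, got: " + parsed);
    }
    return parsed;
  }
}

class PositiveIntParserDemo {
  public static void main(String[] args) {
    // "example.limit.per.task" is a made-up property name for the demo.
    System.out.println(
        PositiveIntParser.parsePositive("example.limit.per.task", "20000"));
    try {
      PositiveIntParser.parsePositive("example.limit.per.task", "-1");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

A reconfiguration callback could validate with such a helper before writing the new value into the configuration, so that an invalid value never reaches the running service.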
# Conflicts: # hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/statemachine/background/BlockDeletingService.java
# Conflicts: # hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/BlockDeletingService.java
# Conflicts: # hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/DeleteBlocksCommandHandler.java
@xichen01, the last patch looks good, could you rebase the patch to solve the conflicts?
private String reconfigBlockDeleteThreadMax(String value) {
  getConf().set(HDDS_DATANODE_BLOCK_DELETE_THREAD_MAX, value);
  conf.getObject(DatanodeConfiguration.class)
      .setBlockDeleteThreads(Integer.parseInt(value));
Looks like this line is covered by L684 already. It can be removed.
# Conflicts: # hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
@xichen01, can you check the failed test "TestOzoneManagerHAKeyDeletion.testKeyDeletion"? It might be related to the code change.
The master branch has this error too: https://github.com/apache/ozone/actions/runs/6439086990/job/17486681800
Yes, it's reproduced with the master branch. Created HDDS-9414 to track the failure.
@adoroszlai has created a ticket.
What changes were proposed in this pull request?
Support deletion related parameter dynamic configuration
OM:
SCM:
Datanode:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8690
How was this patch tested?
unit test.