[AMBARI-25675] On giving Restart command, on Server side the desired state gets changed to STARTED but on Agent Side desired state remains INSTALLED. #3314

Open. shubhamod wants to merge 1 commit into base: branch-2.7

What changes were proposed in this pull request?

Currently, if a component is in the INSTALLED state and we issue a Restart command, the desired state on the Server side changes to STARTED, but the desired state on the Agent side remains INSTALLED.

The RESTART command received and created on Ambari Agent is:
{'requiredConfigTimestamp': 1621237797633,
 u'commandParams': {u'hooks_folder': u'stack-hooks', u'custom_command': u'RESTART', u'script': u'scripts/oozie_server.py', u'version': u'4.1.4.0', u'command_timeout': u'1800', u'HAS_RESOURCE_FILTERS': u'true', u'script_type': u'PYTHON'},
 u'roleCommand': u'CUSTOM_COMMAND',
 u'repositoryFile': {u'resolved': True, u'repoVersion': u'4.1.4.0',
   u'repositories': [{u'mirrorsList': None, u'tags': [], u'ambariManaged': True, u'baseUrl': u'https://hdi31distrorelease.blob.core.windows.net/repos/HDInsight/ubuntu16/4.x/4.1.4.0/MDP.list ', u'repoName': u'HDInsight', u'components': None, u'distribution': None, u'repoId': u'HDInsight-4.1-repo-1', u'applicableServices': []}],
   u'feature': {u'preInstalled': True, u'scoped': False},
   u'stackName': u'HDInsight', u'repoVersionId': 1, u'repoFileName': u'ambari-hdinsight-1'},
 u'clusterId': u'2',
 u'commandType': u'EXECUTION_COMMAND',
 u'clusterName': u'gshubhamhadoop40test',
 u'serviceName': u'OOZIE',
 u'role': u'OOZIE_SERVER',
 u'requestId': 75454,
 u'taskId': 13304,
 u'roleParams': {u'component_category': u'MASTER'},
 u'componentVersionMap': {
   u'HDFS': {u'DATANODE': u'4.1.4.0', u'ZKFC': u'4.1.4.0', u'JOURNALNODE': u'4.1.4.0', u'HDFS_CLIENT': u'4.1.4.0', u'NAMENODE': u'4.1.4.0'},
   u'ZOOKEEPER': {u'ZOOKEEPER_SERVER': u'4.1.4.0', u'ZOOKEEPER_CLIENT': u'4.1.4.0'},
   u'SQOOP': {u'SQOOP': u'4.1.4.0'},
   u'HIVE': {u'HIVE_METASTORE': u'4.1.4.0', u'HIVE_SERVER': u'4.1.4.0', u'HIVE_CLIENT': u'4.1.4.0'},
   u'PIG': {u'PIG': u'4.1.4.0'},
   u'TEZ': {u'TEZ_CLIENT': u'4.1.4.0'},
   u'MAPREDUCE2': {u'MAPREDUCE2_CLIENT': u'4.1.4.0', u'HISTORYSERVER': u'4.1.4.0'},
   u'YARN': {u'NODEMANAGER': u'4.1.4.0', u'APP_TIMELINE_SERVER': u'4.1.4.0', u'RESOURCEMANAGER': u'4.1.4.0', u'YARN_CLIENT': u'4.1.4.0'},
   u'OOZIE': {u'OOZIE_CLIENT': u'4.1.4.0', u'OOZIE_SERVER': u'4.1.4.0'}},
 u'commandId': u'75454-0'}

When updating the desired state, the agent currently checks: command['custom_command'] == CustomCommand.restart

This check is incorrect because the custom_command key is nested inside commandParams, not at the top level of the command. The correct check is: command['commandParams']['custom_command'] == CustomCommand.restart
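To illustrate the difference between the two lookups, here is a minimal sketch run against a trimmed copy of the command above. The constant `RESTART` and both helper functions are hypothetical stand-ins, not the agent's actual code, and `.get` is used here only so the sketch never raises on a missing key.

```python
# Hypothetical stand-in for CustomCommand.restart; only the value
# 'RESTART' is taken from the command shown in this PR.
RESTART = "RESTART"


def is_restart_top_level(command):
    """Old-style check: looks for 'custom_command' at the top level."""
    return command.get("custom_command") == RESTART


def is_restart_nested(command):
    """Proposed check: looks for 'custom_command' inside 'commandParams'."""
    return command.get("commandParams", {}).get("custom_command") == RESTART


# Trimmed copy of the RESTART command from this PR (most keys omitted).
command = {
    "roleCommand": "CUSTOM_COMMAND",
    "role": "OOZIE_SERVER",
    "commandParams": {
        "custom_command": "RESTART",
        "script": "scripts/oozie_server.py",
    },
}

print(is_restart_top_level(command))
print(is_restart_nested(command))
```

With this input, only the nested lookup recognizes the command as a restart, which is why the desired state is not updated under the current check.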

Problems caused by this:
If we stop a component and then issue RESTART, the desired state on the agent side remains INSTALLED while the current state moves to STARTED. If the process is then killed manually or shuts down, it is not recovered, because current state == desired state == INSTALLED.

With this bug fix, we should be able to recover in those cases as well.
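The scenario can be sketched as a small state model. This is a simplified illustration under stated assumptions, not Ambari's recovery logic: the functions `desired_after_restart` and `needs_recovery` are hypothetical, and recovery is assumed to trigger only when the current and desired states differ, as described in this PR.

```python
INSTALLED, STARTED = "INSTALLED", "STARTED"


def desired_after_restart(previous_desired, restart_detected):
    """Hypothetical: desired state becomes STARTED only if the agent
    recognizes the command as a restart."""
    return STARTED if restart_detected else previous_desired


def needs_recovery(current, desired):
    """Assumed rule: recover only when current and desired states differ."""
    return current != desired


# Component was stopped, so the agent-side desired state is INSTALLED.
# A RESTART is issued, and later the process dies (current -> INSTALLED).
current_after_crash = INSTALLED

# Without the fix the restart is not detected; with the fix it is.
desired_without_fix = desired_after_restart(INSTALLED, restart_detected=False)
desired_with_fix = desired_after_restart(INSTALLED, restart_detected=True)

print(needs_recovery(current_after_crash, desired_without_fix))
print(needs_recovery(current_after_crash, desired_with_fix))
```

Under these assumptions, the crashed component is left alone without the fix and is flagged for recovery with it.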


