
[UI] Fix the UI bugs when there is environment under the Terminating status #2954

Merged
merged 4 commits into development/2.6 from improvement/add-error-handling-for-salt-api on Dec 10, 2020

Conversation

Contributor

@ChengYanJin ChengYanJin commented Dec 1, 2020

Component: UI, salt, solution

Context:
The environment list does not display correctly when there is a namespace/environment in Terminating status.

Summary:
Fix the bug described in #2949.

Acceptance criteria:

  • The environment list should display correctly even when an environment is in Terminating status, and we should not be able to delete this environment again (the cursor is set to not-allowed; see the sketch below).
  • Add the missing translations and fix the typo in a translation key.
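
For the not-allowed cursor, a minimal sketch of the intended styling, assuming a styled-components setup (the component name is hypothetical, not the actual code from this PR):

import styled from 'styled-components';

// Hypothetical delete button: once disabled (environment Terminating),
// the browser shows the not-allowed cursor on hover.
const DeleteEnvironmentButton = styled.button`
  &:disabled {
    cursor: not-allowed;
  }
`;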

Closes: #2949

@ChengYanJin ChengYanJin requested a review from a team as a code owner December 1, 2020 11:37
Contributor

bert-e commented Dec 1, 2020

Hello chengyanjin,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Status report is not available.

@ChengYanJin ChengYanJin force-pushed the improvement/add-error-handling-for-salt-api branch from cfacd0f to 15dc174 on December 1, 2020 11:54
Contributor

@gdemonet gdemonet left a comment


Some things to change, but most importantly, I don't think we should implement any "locking" from the UI. If Salt cannot handle running two jobs in parallel, it should be responsible for saying so (or use some mechanism to queue one of them for later execution). Unfortunately, the UI cannot know everything (what if an incompatible job was triggered from the CLI?).

Comment on lines 185 to 189
// Check whether the job JID from localStorage exists. If not, we should remove it from localStorage
const jobNotFound = result?.return?.find(
(job) => Object.values(job)[0]['Error'] === ERROR_NO_JOB_JID_FOUND,
);
yield call(removeJobFromLocalStorage, Object.keys(jobNotFound)[0]);
Contributor


Oh, I think that's not going to work well.

Try this:

  • create a dummy Node
  • click Deploy
  • refresh the page, and look for the printJob call in your browser's network capture tool

Personally, until the job is finished, I get:

{
  "return": [
    {
      "20201201120203793335": {
        "Result": {},
        "StartTime": "2020, Dec 01 12:02:03.793335",
        "Error": "Cannot contact returner or no job with this jid"
      }
    }
  ]
}

Which means the job would be cleared out with this code.
However, if I query for a non-existing JID:

{
  "return": [
    {
      "12345": {
        "Result": {},
        "StartTime": "",
        "Error": "Cannot contact returner or no job with this jid"
      }
    }
  ]
}

Maybe we should also check the StartTime...?
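
A minimal sketch of that combined check, reusing the names from the snippet under review (the empty-StartTime heuristic is only an assumption based on the two captures above):

// Treat a job as truly missing only when Salt reports the "no job with this
// jid" error AND returns an empty StartTime: per the captures above, a job
// that is merely still running keeps its real StartTime.
const isJobTrulyMissing = (jobResult) => {
  const info = Object.values(jobResult)[0];
  return info['Error'] === ERROR_NO_JOB_JID_FOUND && info['StartTime'] === '';
};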

Contributor Author


Indeed, good catch! That would delete valid jobs.

I don't know how we can check the StartTime: say I executed a Salt job from one instance and then switched to another instance, I would still get a StartTime.

Maybe we should not remove any JID from localStorage?

Contributor


Say I executed a Salt job from one instance and then switched to another instance, I would still get a StartTime.

Are you sure? If the job ID doesn't exist, then I don't see how Salt can give you a StartTime...

Contributor

@slaperche-scality slaperche-scality Dec 1, 2020


@gdemonet That's weird that you get an Error key with Cannot contact returner or no job with this jid; it seems like something is wrong (either the call failed or the ID really doesn't exist).
The presence of the Error key is also what we use in the operator:

	// Check for "Error" field.
	if reason, found := info["Error"]; found {
		// Unknown Job ID: maybe the Salt server restarted or something like that.
		logger.Info("Salt job not found")
		return nil, fmt.Errorf("cannot get status for job %s: %v", jobID, reason)
	}

If your job is really in progress, then you'll get some info about its status (and a Result key with an empty dict, since it's not finished yet):

	// Extract results.
	result, ok := info["Result"].(map[string]interface{})
	if !ok {
		return nil, fmt.Errorf("missing 'Result' key in %v", info)
	}
	// No result yet, the job is still running.
	if len(result) == 0 {
		logger.Info("Salt job is still running")
		return nil, nil
	}

At least for the states used by the storage-operator, I don't know if this holds for every Salt call.

Contributor


OK, indeed there is something wrong somewhere (my bet is on Salt).

Tested the following:

  • for a runner calling our deploy_node orchestrate, as the UI does
# salt-run --async state.orch metalk8s.orchestrate.deploy_node saltenv=metalk8s-2.6.0-dev pillar='{orchestrate: {node_name: worker-1}}'
[WARNING ] Running in asynchronous mode. Results of this execution may be collected by attaching to the master event bus or by examing the master job cache, if configured. This execution is running under tag salt/run/20201201163643992851
# salt-run jobs.print_job 20201201163643992851
20201201163643992851:
    ----------
    Error:
        Cannot contact returner or no job with this jid
    Result:
        ----------
    StartTime:
        2020, Dec 01 16:36:43.992851
[INFO    ] Runner completed: 20201201163704623108
  • for a simple salt command, as the storage-operator does:
# salt --async 'worker-1' state.sls metalk8s.volumes saltenv=metalk8s-2.6.0-dev
Executed command with job ID: 20201201164240351912
# salt-run jobs.print_job 20201201164240351912
20201201164240351912:
    ----------
    Arguments:
        - metalk8s.volumes
        |_
          ----------
          __kwarg__:
              True
          saltenv:
              metalk8s-2.6.0-dev
    Function:
        state.sls
    Minions:
        - worker-1
    Result:
        ----------
    StartTime:
        2020, Dec 01 16:42:40.351912
    Target:
        worker-1
    Target-type:
        glob
    User:
        root
[INFO    ] Runner completed: 20201201164247350096

Of course, if I wait long enough, I get a full Result for both jobs, meaning there is no actual error in the first case apart from the missing returner.

Contributor Author

@ChengYanJin ChengYanJin Dec 2, 2020


Say I executed a Salt job from one instance and then switched to another instance, I would still get a StartTime.

Are you sure? If the job ID doesn't exist, then I don't see how Salt can give you a StartTime...

@gdemonet I think if you query a real job, whose JID encodes a real timestamp (not a random number), you can get a StartTime.

So basically we should not remove any job with the error Cannot contact returner or no job with this jid, right? (It's not actually an error...) It might just be in a transitional state.

Contributor Author

@ChengYanJin ChengYanJin Dec 2, 2020


OK. As discussed during the meeting, I will drop this commit.

The issue (Salt jobs sharing a JID, because we store it in localStorage) will not happen in production.

Comment on lines 88 to 95
-status.completed = true;
-const returner = Object.values(job['Result'])[0].return;
-status = { ...status, ...parseJobError(returner) };
+if (job['Error'] === ERROR_NO_JOB_JID_FOUND) {
+  status.completed = true;
+} else {
+  status.completed = true;
+  const returner = Object.values(job['Result'])[0].return;
+  status = { ...status, ...parseJobError(returner) };
+}
Contributor


Could be simpler to just check if there is something in job['Result'].
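
A sketch of that simpler variant, assuming the same job shape as in the diff above:

// An empty Result means the job is still running, whatever the Error field
// says; only a non-empty Result marks the job as completed.
if (Object.keys(job['Result'] || {}).length > 0) {
  status.completed = true;
  const returner = Object.values(job['Result'])[0].return;
  status = { ...status, ...parseJobError(returner) };
}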

@@ -193,6 +203,9 @@ const SolutionsList = (props) => {
dispatch(deleteEnvironmentAction(environment.name));
}}
inverted={true}
disabled={
environment.namespaces[0].status.phase === STATUS_TERMINATING
Contributor


What happens if we have more than one namespace? Could be:

environment.namespaces.some(ns => ns.status.phase === STATUS_TERMINATING)
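
In context, the disabled prop from the diff above would then read (sketch):

disabled={environment.namespaces.some(
  (ns) => ns.status.phase === STATUS_TERMINATING,
)}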

@@ -141,7 +141,8 @@
"env_preparation": "Préparation de l’environnement",
"env_preparation_success": "L'environnement {envName} est en cours de préparation.",
"env_preparation_failed": "La préparation de l'environnement {envName} a échoué. {step} - {reason}",
"preparing_environemnt": "Préparer l'environnement",
"preparing_environment": "Préparer l'environnement",
Contributor


If we have both "prepare" and "preparing", the right translation for "preparing" would be Environnement en cours de préparation.
What bugs me is that env_preparation_success already means "in progress", for some reason. Should we make sure all those translations are used appropriately?

Contributor Author


Thanks for the French translation of "preparing"!

Well, these two keys do hold the same meaning, but they are used by different components:

"env_preparation_success": "Environment {envName} is being prepared." is used by the notification.
"preparing_environment": "Preparing the environment" is used by the loader in the environment list when adding a solution, which should be a shorter version.

So it may not be clear which one to use, but we should make sure to have both.
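
For illustration, the two call sites might look like this (hypothetical surrounding code; only intl.translate and the translation keys appear in this PR):

// Notification shown after triggering the preparation (notifySuccess is a
// hypothetical helper):
notifySuccess(intl.translate('env_preparation_success', { envName }));

// Shorter label next to the loader in the environment list:
<Loader>{intl.translate('preparing_environment', { envName })}</Loader>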

envName,
solName,
solVersion,
// Need to check if Salt is under `deploy-node` before adding solution to environment
Contributor


I don't think that's the issue with #2941 - "solution import" means running ./solutions.sh import --archive /path/to/solution.iso from the Bootstrap node.
Adding a Solution to an environment only means performing K8s API calls (either from the UI directly, or from Salt master), but it should not trigger any Salt master restart.

So I don't think this check is needed at all.

Contributor Author


Absolutely right, I was confusing "solution import" with "adding a solution to an environment".
I will remove this check...

-{intl.translate('preparing_environemnt', {
+{intl.translate('preparing_environment', {
Contributor


Should be in the same commit as the one changing the key in src/translations/.

ui/src/ducks/app/solutions.test.js (comment marked outdated and resolved)
@ChengYanJin ChengYanJin force-pushed the improvement/add-error-handling-for-salt-api branch 4 times, most recently from cecc7c4 to 5bc2ebe on December 2, 2020 17:36
@ChengYanJin ChengYanJin changed the title from "[UI] Fix the UI bugs related to Salt" to "[UI] Fix the UI bugs when there is environment under the Terminating status" on Dec 2, 2020
@ChengYanJin ChengYanJin force-pushed the improvement/add-error-handling-for-salt-api branch from 5bc2ebe to 6839eb8 on December 3, 2020 11:00
The environment list is broken due to environments that are in Terminating status.

Since the ConfigMap has already been deleted, the readNamespacedConfigMap call returns an error, but we didn't catch that error.

Refs: #2949

Add the missing translation for preparing the environment and fix the typo in the translation key.

Refs: #2941
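
A hedged sketch of the error handling the first commit describes, in the saga style used elsewhere in this PR (coreV1 and the saga name are assumptions, not the actual code):

import { call } from 'redux-saga/effects';

function* fetchEnvironmentConfigMap(name, namespace) {
  try {
    // Fails once the namespace is Terminating and the ConfigMap is gone.
    return yield call([coreV1, coreV1.readNamespacedConfigMap], name, namespace);
  } catch (error) {
    // The ConfigMap has already been deleted: catch the error so the
    // environment list can still render.
    return null;
  }
}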
Contributor

gdemonet commented Dec 4, 2020

Will need to wait for #2959

@ChengYanJin
Contributor Author

/approve

Contributor

bert-e commented Dec 9, 2020

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: approve

Contributor

bert-e commented Dec 9, 2020

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/2.6

  • ✔️ development/2.7

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

Contributor

bert-e commented Dec 10, 2020

I have successfully merged the changeset of this pull request
into the targeted development branches:

  • ✔️ development/2.6

  • ✔️ development/2.7

The following branches have NOT changed:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5

Please check the status of the associated issue None.

Goodbye chengyanjin.

@bert-e bert-e merged commit 751283c into development/2.6 Dec 10, 2020
@bert-e bert-e deleted the improvement/add-error-handling-for-salt-api branch December 10, 2020 17:09