
[ws-daemon] do not fail workspace if git status failed during dispose #11331

Merged: 2 commits into main from pavel/git-status-fix on Jul 14, 2022

Conversation

sagor999
Contributor

Description

(This PR is stacked on top of #11330, please review commit: 8d2d1c6)

Some workspaces fail with cannot get git status, which is not a critical error.
In some cases this was observed to happen due to a potentially corrupt or misconfigured .git folder (possibly caused by the user's actions).

This PR makes it so that we don't fail the workspace when this happens and instead continue disposing of it.
For example, we already do that if the workspace folder does not contain a .git folder:

if !git.IsWorkingCopy(loc) {
log.WithField("loc", loc).WithField("checkout location", s.CheckoutLocation).WithFields(s.OWI()).Debug("did not find a Git working copy - not updating Git status")
return nil, nil
}

The main reasoning is that even if we cannot get the git status, all that means is that we won't show the user's uncommitted changes in the dashboard, which should not be a reason for failing the whole workspace.
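
To illustrate the pattern (a minimal, self-contained sketch only; updateGitStatus and its logging below are simplified stand-ins, not the actual ws-daemon code), a failing git status is logged and treated as "no status" rather than returned as a workspace-failing error:

package main

import (
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

// updateGitStatus returns the porcelain status of the working copy at loc.
// If the status cannot be obtained (e.g. because of a corrupt .git folder),
// it logs a warning and returns an empty status instead of an error, so the
// dispose can continue; only the "uncommitted changes" view in the dashboard is lost.
func updateGitStatus(loc string) (string, error) {
	if _, err := os.Stat(filepath.Join(loc, ".git")); err != nil {
		// Not a Git working copy: nothing to report (mirrors the existing early return).
		return "", nil
	}
	out, err := exec.Command("git", "-C", loc, "status", "--porcelain").CombinedOutput()
	if err != nil {
		// Previously this error would fail the whole workspace; now we degrade gracefully.
		log.Printf("warning: cannot get git status for %s: %v (%s)", loc, err, out)
		return "", nil
	}
	return string(out), nil
}

func main() {
	status, _ := updateGitStatus(".")
	log.Printf("uncommitted changes:\n%s", status)
}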

Related Issue(s)

Fixes #

How to test

Open a workspace and corrupt its .git folder (one way to do that is sketched below).
Observe that the workspace still stops and is disposed of correctly, albeit without showing any uncommitted changes in the dashboard.
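
As an illustration only (not part of the PR), one way to corrupt the working copy so that git status fails is to overwrite .git/HEAD with an invalid ref; the small Go program below does that for the current directory:

package main

import (
	"log"
	"os"
)

func main() {
	// Overwrite .git/HEAD with an invalid ref so that `git status` fails.
	// Run this from the workspace checkout root; afterwards the dispose path
	// hits the "cannot get git status" case this PR now tolerates.
	if err := os.WriteFile(".git/HEAD", []byte("this is not a valid ref\n"), 0o644); err != nil {
		log.Fatal(err)
	}
}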

Release Notes

none

Documentation

Werft options:

  • /werft with-preview

@sagor999 sagor999 requested review from csweichel and a team July 12, 2022 23:52
@github-actions github-actions bot added the team: workspace label (Issue belongs to the Workspace team) and removed the size/S label on Jul 12, 2022
@sagor999
Contributor Author

/hold
would like to get input from @csweichel on this one, so holding to make sure he has a chance to review

@aledbf
Member

aledbf commented Jul 13, 2022

@sagor999 I think we need to update the configuration and configure /mnt/workingarea as safe, using git config --global --add safe.directory * instead of individual directories (see https://github.blog/2022-04-18-highlights-from-git-2-36/).
(file /home/gitpod/.gitconfig in ws-daemon)
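
For reference, if that suggestion were adopted, the /home/gitpod/.gitconfig shipped with ws-daemon would contain something along these lines (a sketch of the setting only, not the actual file):

[safe]
    directory = *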

@sagor999
Contributor Author

@aledbf I was referring to this particular case:
https://cloudlogging.app.goo.gl/8GkTztnkJ5MdpwAz8
(check ws-daemon error log there)

I think what you are referring to is a frequent, similar error that seems to be coming from prebuilds (the unsafe directory error).
I will try to take a look at that as well, but it is out of scope for this particular PR (even though it is related).

@jenting
Contributor

@jenting left a comment

I like this approach 👍
I encountered a scenario where the workspace failed with git status failed, but the volume snapshot was ready to use.

In that scenario, we enable the PVC feature flag and manually put the node into a NotReady state (by disabling the k3s-agent on the host).
The workspace pod then enters a Terminating state; finally, the VolumeSnapshot object is created by ws-manager, which takes a snapshot of the PVC.
After relaunching the workspace, the pod is scheduled on another Ready node, and the user's content is restored from the VolumeSnapshot.

Without this PR's change, the dashboard shows the error cannot get git status.

The pro is that we keep the user's content in the PVC without data loss; the con is that we can't see the git diff in the dashboard.

@csweichel
Contributor

@csweichel left a comment

LGTM

(minor nit, hence the hold)

/hold

Review comment on components/ws-daemon/pkg/content/service.go (outdated, resolved)
@csweichel
Contributor

/hold

@sagor999
Contributor Author

/unhold

@aledbf
Member

aledbf commented Jul 13, 2022

/hold

(due to the build error)

@sagor999
Contributor Author

sagor999 commented Jul 13, 2022

/werft run with-clean-slate-deployment=true

👍 started the job as gitpod-build-pavel-git-status-fix.3
(with .werft/ from main)

@sagor999
Contributor Author

sagor999 commented Jul 13, 2022

/werft run with-clean-slate-deployment=true

👍 started the job as gitpod-build-pavel-git-status-fix.4
(with .werft/ from main)

@sagor999
Contributor Author

Something funky is going on with the unit tests in this build: each time a different test fails, but I opened the workspace and did a Leeway build and everything built fine...
Going to try to rebase.

@mads-hartmann
Contributor

mads-hartmann commented Jul 14, 2022

/werft run

👍 started the job as gitpod-build-pavel-git-status-fix.6
(with .werft/ from main)

@mads-hartmann
Contributor

mads-hartmann commented Jul 14, 2022

The test succeeded this time. The set of components that Leeway decided it needed to build is very different between gitpod-build-pavel-git-status-fix.5 and gitpod-build-pavel-git-status-fix.6 - I'm not quite sure why.

But given the failing tests were sporadic and unrelated to this PR, I'll remove the hold so it can be merged and you can at least wake up to a merged PR ☺️ I'll follow up to understand why Leeway would decide to build different packages when executed from the same branch about 10 hours apart.

/unhold

Update: Here is the internal Slack thread. I think we're using a shared Leeway cache for all non-main branches, which might hide test flakiness, as you only need one successful build across all non-main branches. So you experienced the flakiness because you happened to be the first to build these components off main (if my theory is correct).

@roboquat roboquat merged commit af62571 into main Jul 14, 2022
@roboquat roboquat deleted the pavel/git-status-fix branch July 14, 2022 07:24
@roboquat roboquat added the deployed: workspace (Workspace team change is running in production) and deployed (Change is completely running in production) labels on Jul 20, 2022