Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

status-indicator: Display external service validation and syncing errors #4804

Closed
wants to merge 12 commits into from

Conversation

mrnugget
Copy link
Contributor

@mrnugget mrnugget commented Jul 4, 2019

This extends the global status indicator with a new status message: SYNCERROR. The message is produced whenever the initial construction of an external service "source" fails (due to invalid tokens, invalid config, etc.) or when syncing the repositories fails.

This is one of the "extension" ideas outlined here: https://github.com/sourcegraph/sourcegraph/issues/4120
And it's also better UX now that we've moved the syncing into the background, where errors could possibly disappear: https://github.com/sourcegraph/sourcegraph/issues/4357#issuecomment-499060771

Screenshot of current state:

Screen Shot 2019-07-05 at 16 53 09

It works as intended, but I'd love to get more feedback on the following:

  • general code design regarding the problem of how to keep multiple errors around in the syncer and associate them with external services/sources (see the currentMultiSourceError/SourceError solution). I feel like this could maybe be cleaner.
  • UI: the error messages can be become quite unwieldy in the status dropdown items. Suggestions highly welcome.

Test plan: manual testing, go test, cd web && yarn test --runInBand StatusMessagesNavItem.test.tsx

cmd/repo-updater/repos/sources.go Show resolved Hide resolved
len(es), strings.Join(points, "\n\t"))
}

type MultiSourceError struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes this type necessary? Can't we work with multierror.Error of SourceErrors?

Copy link
Contributor Author

@mrnugget mrnugget Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I did first, but it lead to the code looking like this:

if multiErr, ok := err.(*multierror.Error); ok {
	for _, sub := range multiErr.Errors {
		if sourceErr, ok := sub.(*SourceError); ok {
			m := newSyncErrMessage(sourceErr)
			resp.Messages = append(resp.Messages, m)
		} else {
			// drop error? add generic message?
		}
	}
} else {
	// drop error? add generic message?
}

That seems a bit unwieldy and MultiSourceError was me trying my hand at putting some more information back into the type system. But I'm not set on it.

I think the general problem that needs to be solved is the association of errors to external services and making those accessible to consumers. multierror.Error and MultiSourceError are equally capable of doing this, but I feel like there might be an even better approach.

I'm also thinking that maybe we need a map that gets updated with each sync run, kinda like map[*ExternalService]error. That would then also allow us to show the status per external service when everything went fine. But in order to implement that, we still need to find out what happened with each external service inside the syncer, so at least *SourceError seems necessary?

And another problem is that, of course, the Err in *SourceError is also a multierror.Error and doesn't lead to the most UI-friendly messages (2 errors occured:\n\t* blub\n\t* blah...). I thought about using a custom format func for this, but that would only lead to a custom format in repo-updater that then needs to be untangled in the TypeScript world?

cmd/repo-updater/repos/syncer.go Show resolved Hide resolved
cmd/repo-updater/repos/syncer.go Outdated Show resolved Hide resolved
cmd/repo-updater/repoupdater/server.go Show resolved Hide resolved
@mrnugget
Copy link
Contributor Author

mrnugget commented Jul 5, 2019

Update after talking this through with @keegancsmith and @tsenart:

  1. From a technical standpoint this works (of course the proper code review might reveal things) and we said that we can merge this and then refine it (especially since I don't have a really nice solution to the code design problems described above yet)
  2. Do you think this already solves a customer issue in its current form, @christinaforney? Or do you have any concerns regarding UI/UX?

I'll be out of the office for the next three days. From my side, it'd totally fine to merge this (again, as long as build passes and your code reviews didn't yield any criticial bugs I've overlooked) and refine it later. But if there are any customer-side concerns, let's refine it in this PR before merging.

@mrnugget mrnugget marked this pull request as ready for review July 5, 2019 14:35
@codecov
Copy link

codecov bot commented Jul 5, 2019

Codecov Report

Merging #4804 into master will increase coverage by 0.04%.
The diff coverage is 67.5%.

@@            Coverage Diff             @@
##           master    #4804      +/-   ##
==========================================
+ Coverage   48.07%   48.11%   +0.04%     
==========================================
  Files         726      726              
  Lines       44263    44361      +98     
  Branches     1761     1761              
==========================================
+ Hits        21278    21343      +65     
- Misses      21004    21032      +28     
- Partials     1981     1986       +5
Impacted Files Coverage Δ
cmd/repo-updater/repos/syncer.go 79.32% <100%> (+1.49%) ⬆️
cmd/repo-updater/repoupdater/server.go 59.72% <100%> (+1.5%) ⬆️
cmd/frontend/graphqlbackend/status_messages.go 43.24% <20.83%> (-43.43%) ⬇️
cmd/repo-updater/repos/sources.go 76.78% <59.45%> (-8.59%) ⬇️
cmd/repo-updater/repos/testing.go 83.08% <75%> (-0.38%) ⬇️
web/src/nav/StatusMessagesNavItem.tsx 86.44% <85.71%> (-0.52%) ⬇️
...d/frontend/internal/authz/bitbucketserver/store.go 78.69% <0%> (-1.19%) ⬇️
cmd/frontend/graphqlbackend/graphqlbackend.go 42.65% <0%> (+4.19%) ⬆️

# The name of this metadata pair.
name: String!
# The data of this metadata pair.
value: String!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is this type just a way to get an untyped key/value bag? Why can't this metadata be typed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the metadata differs depending on which type the StatusMessage has. AFAIK GraphQL doesn't support union types, which is what I'd want here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GraphQL does support union types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting! 🤔

@felixfbecker
Copy link
Contributor

Regarding the errors:

  • It would be nice if errors used something of our error alert styles
  • These errors are very raw. Is it impossible to distill something actionable for the user? E.g. if the error is a 401, the token must be wrong?

@mrnugget
Copy link
Contributor Author

mrnugget commented Jul 5, 2019

* These errors are very _raw_. Is it impossible to distill something _actionable_ for the user? E.g. if the error is a 401, the token must be wrong?

Agree! And that's actually a lot of work, because we'd basically need to map arbitrary errors into actionable items for each code host. And at which layer do we do that? In each code host client? GraphQL layer? Web? I don't know how feasible that is (which is not to say that it isn't).

@beyang beyang removed their request for review July 8, 2019 17:25
@christinaforney
Copy link
Contributor

Do you think this already solves a customer issue in its current form? Or do you have any concerns regarding UI/UX?

Since this is something that is only shown to admin users, I think it is worth experimenting with and getting feedback on. I agree with @felixfbecker that an improved experience would be something more actionable, but I understand that would require specific mapping.

What if we just mapped the most common errors (e.g. 401 (bad credentials), 404 (probably bad repo path?), 504 (host unavailable)) so that at least the common cases are a bit more usable, then the rest of them can stay in their raw state, essentially solving for the 80% scenario. Thoughts?

@mrnugget
Copy link
Contributor Author

mrnugget commented Jul 9, 2019

What if we just mapped the most common errors (e.g. 401 (bad credentials), 404 (probably bad repo path?), 504 (host unavailable)) so that at least the common cases are a bit more usable, then the rest of them can stay in their raw state, essentially solving for the 80% scenario. Thoughts?

The more I thought I about it, the more I think this is a requirement before we can ship.


I'm closing this PR (since it was a prototype thing anyway) and will re-open later once I have a second iteration done that includes the suggestions made by @tsenart @felixfbecker and @christinaforney.

@mrnugget mrnugget closed this Jul 9, 2019
@nicksnyder
Copy link
Contributor

I did not look at the code but I do think it would be ok to be incremental about improving our error handling here.

My understanding from a conversation with @mrnugget is that it is currently possible for the user to not see a warning when saving an external service (if the sync takes too long), or worse, see a warning caused by an external service other than the one that is currently being saved. This seems like a valuable problem to fix.

The more I thought I about it, the more I think this is a requirement before we can ship.

All else equal, of course pretty errors are better, but showing not pretty errors is a big improvement over not showing errors at all, so I wouldn't block a change like this on the errors not looking pretty.

Supporting more types of errors is also something that can happen as necessary in the future, so I think it is fine to just implement one error to begin with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants