Internal redispatch #39

Merged
merged 16 commits into from
Aug 31, 2021

Conversation

jakedt
Member

@jakedt jakedt commented Aug 27, 2021

No description provided.

@github-actions github-actions bot added the area/dependencies (Affects dependencies) and area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools) labels Aug 27, 2021
@@ -71,9 +71,11 @@ jobs:
- name: "Generate & Diff Servok Protos"
run: |
./buf.gen.yaml
git diff
Member

Why these additions? Just to make the errors nicer?

Member Author

Yeah it was impossible for me to see what the difference was.

@@ -77,8 +82,11 @@ func newRootCmd() *cobra.Command {
// Flags for parsing and validating schemas.
rootCmd.Flags().Bool("schema-prefixes-required", false, "require prefixes on all object definitions in schemas")

// Flags for internal dispatch API
rootCmd.Flags().String("internal-grpc-addr", ":50052", "address to listen for internal requests")
Member

Since we use 50052 for corrino currently, can we default this to something like 50053?

Member

everything uses 50051 now (inherited from cobrautil), but we might want to make this port totally different so that there's no mistaking it

}

totalCounter := prometheus.NewCounter(prometheus.CounterOpts{
Namespace: "spicedb",
Member

Move the "spicedb" into a constant


// DispatchExpand implements dispatch.Expand interface
func (cd *cachingDispatcher) DispatchExpand(ctx context.Context, req *v1.DispatchExpandRequest) (*v1.DispatchExpandResponse, error) {
return cd.d.DispatchExpand(ctx, req)
Member

Add comments here and in Lookup that we're not doing caching yet

return nil, fmt.Errorf(errCachingInitialization, err)
}

totalCounter := prometheus.NewCounter(prometheus.CounterOpts{
Member

Probably should name this differently, since I'd expect it to be for all api calls, not just checks

cd.c.Set(requestKey, toCache, checkResultEntryCost)
}

return computed, err
Member

Maybe be explicit here in returning nil for the result?

Member Author

It's not always nil here.

Member

We have a non-nil result when there is an error?

Member Author

the error condition doesn't short circuit

Member

@josephschorr josephschorr Aug 30, 2021

Oh... that's highly unexpected to me. Maybe just return the error if it occurs and move the cache update outside of the branch?

Member

yeah i think adding the short circuit would make this less prone to getting broken in a future refactor

})

require.NoError(err)
require.Equal(expected.isMember, checkResult.Membership == v1.DispatchCheckResponse_MEMBER)
Member

Perhaps add a dispatch again and verify again, to ensure caching is working as expected?

Member Author

which caching? This only has namespace manager caching, not dispatch caching.

@jzelinskie jzelinskie added the priority/2 medium (This needs to be done) label Aug 27, 2021
Member

@jzelinskie jzelinskie left a comment

Sorry, a bunch of my comments were on early commits from before the refactors, and then I just decided to jump ahead to the final state.

sync.Mutex{},
consistent.NewHashring(xxhash.Sum64, hashringReplicationFactor),
cancel,
proto.MarshalOptions{Deterministic: true},
Member

Is there a time we don't want this to be deterministic? Why bother parameterizing it?

}

for ctx.Err() == nil {
endpointResponse, err := stream.Recv()
Member

isn't the first response always an empty one? (i'm remembering this from an earlier iteration of this code)

Member Author

I fixed that. It was a disaster when servok restarted.

}

// Stop will cancel the client watch and clean up the pool
func (sc *SmartClient) Stop() {
Member

It might be more idiomatic to support context cancellation for this rather than an explicit Stop function.
I can see arguments both ways, but I think we should strive to use one cohesive strategy across the codebase.


var errNoBackends = errors.New("no backends available for request")

// SmartClient is a client which utilizes a dynamic source of backends and a consistent
Member

I know it's a nit, but I really dislike "smart client" -- it doesn't really mean anything and gives absolutely no information about the behavior of the client other than it has "some kind of logic".

LiveHashringDispatchClient or UpdatingHashringDispatchClient maybe would be better?

"github.com/authzed/spicedb/pkg/x509util"
"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"
Member

import mixing

var tracer = otel.Tracer("spicedb/internal/dispatch/local")

// NewLocalOnlyDispatcher creates a dispatcher that consults with the graph to formulate a response.
func NewLocalOnlyDispatcher(
Member

is there a reason to call this LocalOnly rather than just Local?

Member Author

Yes, because a "local" dispatcher can be configured to redispatch to the network.

Member

that's not at all obvious from the names
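
One way to picture the distinction being discussed is the sketch below. All names here (`dispatcher`, `localDispatcher`, `remoteDispatcher`, and both constructors) are hypothetical, and wiring the local-only variant to redispatch to itself is an assumption for illustration, not a description of the PR's implementation.

```go
package main

import "fmt"

// dispatcher is a hypothetical, simplified dispatch interface.
type dispatcher interface {
	check(depth int) string
}

// localDispatcher resolves the leaf itself and hands subproblems to redispatch.
type localDispatcher struct {
	redispatch dispatcher
}

func (l *localDispatcher) check(depth int) string {
	if depth == 0 {
		return "resolved locally"
	}
	return l.redispatch.check(depth - 1)
}

// remoteDispatcher stands in for a client that forwards to a peer node.
type remoteDispatcher struct{}

func (remoteDispatcher) check(depth int) string { return "resolved by peer" }

// newLocalDispatcher sends subproblems to whatever dispatcher it is given,
// which may be a network client.
func newLocalDispatcher(redispatch dispatcher) dispatcher {
	return &localDispatcher{redispatch: redispatch}
}

// newLocalOnlyDispatcher redispatches to itself, so nothing leaves the process.
func newLocalOnlyDispatcher() dispatcher {
	l := &localDispatcher{}
	l.redispatch = l
	return l
}

func main() {
	fmt.Println(newLocalOnlyDispatcher().check(3))
	fmt.Println(newLocalDispatcher(remoteDispatcher{}).check(3))
}
```

Under this reading, "local" describes where the top-level request is evaluated, while "local only" additionally promises that subproblems never go over the network.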

v1 "github.com/authzed/spicedb/internal/proto/dispatch/v1"
)

type clusterClient interface {
Member

dispatchClient


// NewClusterDispatcher creates a dispatcher implementation that uses the provided client
// to dispatch requests to peer nodes in the cluster.
func NewClusterDispatcher(client clusterClient) dispatch.Dispatcher {
Member

I get it -- but this API is kinda weird since you're creating a dispatcher from a dispatcher.
Maybe we can make this more specific and name this one the DepthCheckingDispatcher?

Member Author

Whoa that's a gross misrepresentation of what this does!

Member

I'm pretty sure I'm just crazy overloading the term "dispatcher"

cd.c.Set(requestKey, toCache, checkResultEntryCost)
}

// Return both the computed and err in ALL cases, computed contains resolved metadata even
Member

, -> :

Member

@jzelinskie jzelinskie left a comment

looks good to me after a rebase

Signed-off-by: Jake Moshenko <jacob.moshenko@gmail.com>
@jakedt jakedt merged commit b7e2031 into main Aug 31, 2021
@jakedt jakedt deleted the internal-redispatch branch August 31, 2021 20:02
Labels
area/dependencies (Affects dependencies); area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools); priority/2 medium (This needs to be done)
3 participants