Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add etcd backed storage #1108

Merged
merged 6 commits into from
Nov 6, 2017
Merged

Add etcd backed storage #1108

merged 6 commits into from
Nov 6, 2017

Conversation

dqminh
Copy link
Contributor

@dqminh dqminh commented Oct 27, 2017

This adds a storage implementation backed by etcd. This should be useful mostly in cases where we dont want to depend on a separate SQL cluster ( and etcd is easier to operate ), or if we dont want to incur some overhead of using kubernetes CRD.

There are no documentation yet ! Once this looks good i can try to add them.

The previous test doesnt actually testing ListConnectors code. For
example the following pseudocode will pass the test:

```
ListConnectors() { return nil, nil }
```

Instead change to actually fetch and compare list of connectors,
ordering by name
Copy link
Contributor

@ericchiang ericchiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran this locally and it passed all the conformance tests. Got a few lint errors but other than that it seems to run fine.

)

type EtcdSSL struct {
ServerName string `json:"server_name" yaml:"server_name"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we use camel case for struct tags. json:"serverName"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we just use transport.TLSInfo https://github.com/coreos/etcd/blob/master/pkg/transport/listener.go#L59?

tlsInfo := transport.TLSInfo{
    CertFile:      "/tmp/test-certs/test-name-1.pem",
    KeyFile:       "/tmp/test-certs/test-name-1-key.pem",
    TrustedCAFile: "/tmp/test-certs/trusted-ca.pem",
}
tlsConfig, err := tlsInfo.ClientConfig()
...
clientv3.New(clientv3.Config{
    ...
    TLS:         tlsConfig,
})

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nvm. It's for json, yaml tagging.

CertFile string `json:"cert_file" yaml:"cert_file"`
}

type Etcd struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allow a prefix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what do you mean ? Like setting a etcd namespace ? I think its usually done now with the help of the grpc-proxy, isnt it ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like setting a etcd namespace ?

Yeah an equivalent of etcd grpc-proxy start --namespace=<prefix>

I think its usually done now with the help of the grpc-proxy, isnt it ?

I think it depends on your setup. But yeah we can document the grpc-proxy solution then add native namespace functionality later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need struct tags

Endpoints []string `yaml:"endpoints"`


type Etcd struct {
Endpoints []string `json:"endpoints" yaml:"endpoints"`
DialTimeout int `json:"dial_timeout" yaml:"dial_timeout"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't make everything configurable just because the client library does. Can we set DialTimeout and AutoSyncInterval to reasonable defaults and if there are compelling reasons to expose them later we can?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that works for me to. I can set them to the same default as etcdctl.

Username string `json:"username" yaml:"username"`
Password string `json:"password" yaml:"password"`
SSL EtcdSSL `json:"ssl" yaml:"ssl"`
RejectOldCluster bool `json:"reject-old-cluster" yaml:"reject-old-cluster"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// RejectOldCluster when set will refuse to create a client against an outdated cluster.
RejectOldCluster bool `json:"reject-old-cluster"`

as before, will also remove this knob for now.

return c.txnUpdate(ctx, key, func(currentValue []byte) ([]byte, error) {
var current RefreshToken
if len(currentValue) > 0 {
if err := json.Unmarshal([]byte(currentValue), &current); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on using a faster serialization format like protobuf? We use JSON for CRDs because CRDs only support JSON

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont have any preference. Go's json stdlib is indeed slow-ish compared to other serialization format, but usually its not significant in practice i think.

@@ -0,0 +1,109 @@
#!/bin/bash

if [ "$EUID" -ne 0 ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly we've used these scripts less and less. I think instructions for downloading the binary releases would suffice https://github.com/coreos/etcd/releases

return nil
}

func (c *conn) txnUpdate(ctx context.Context, key string, update func(current []byte) ([]byte, error)) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @xiang90 mind taking a look at this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks right to me.

we could do auto retry inside the txnUpdate by calling update func multiple times when there are conflicts (maybe it is already wrapped?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think auto-retry should be applied here. At least thats how the current test is structured ( also how sql transaction works ).
Although IIUC there are differences on the order of transaction between sql and non-sql storages now ( k8s or etcd ). In sql the first started transaction win vs the first finished transaction win in others.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dqminh understood. thanks.

}
newStorage := func() storage.Storage {
s := &Etcd{
Endpoints: []string{endpoints},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird that we call the var ENDPOINTS but only allow one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, in tests we only setup 1 endpoint so it still works 😛 I was lazy to do the transformation from env vars to list of endpoints.

@ericchiang
Copy link
Contributor

Since adding a new storage is a big overhead to all future developers, if this is committed we need to be running these tests in travis.

Also just to address your commit message

  • kubernetes is not available yet, or if kubernetes depends on dex
    to perform authentication and the operator would like to remove any
    circular dependency if possible.

We intentionally build the OpenID Connect authenticator in Kubernetes to operate fine if dex isn't running. While the dependency looks circular, our experience of running dex on Tectonic would imply that that's a perfectly fine setup.

@xiang90
Copy link

xiang90 commented Oct 27, 2017

@dqminh

Can we handle timeout better? Or there is a future plan? Right now, we simply pass in context.TODO().

@ericchiang
Copy link
Contributor

@xiang90 I think exposing a context is something we have to do on the dex end, unfortunately.

return err
}
if !updateResp.Succeeded {
return fmt.Errorf("failed to update key=%q: updated not by us", key)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

failed to update key = %q: concurrent conflicting update happened.

@xiang90
Copy link

xiang90 commented Oct 28, 2017

I only took a look at the txnUpdate func. Let me know if there is anything else I could help on.

@dqminh
Copy link
Contributor Author

dqminh commented Oct 30, 2017

Can we handle timeout better? Or there is a future plan? Right now, we simply pass in context.TODO().

yah i think the storage interface needs to be changed so we can pass context downwards. Right now not sure what else can we do here:

  1. using context.TODO()
  2. enforce a hard-code timeout for all operations.

@xiang90
Copy link

xiang90 commented Oct 30, 2017

enforce a hard-code timeout for all operations.

this is what i usually do if the api layer does not expose context. it is safer. but i am not a dex expert, so i cannot speak for this exact case.

@ericchiang
Copy link
Contributor

A hard-coded timeout sounds fine.

This patch adds etcd storage implementation. This should be useful in
environments where
- we dont want to depends on a separate, hard to maintain SQL cluster
- we dont want to incur the overhead of talking to kubernetes apiservers
- kubernetes is not available yet, or if kubernetes depends on dex
to perform authentication and the operator would like to remove any
circular dependency if possible.
This change vendors github.com/coreos/etcd related packages to support
etcd storage implementation.
@dqminh
Copy link
Contributor Author

dqminh commented Oct 31, 2017

added default timeout = 5s, and kv namespace support via Etcd.Namespace field.

Copy link
Contributor

@ericchiang ericchiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit. Code lgtm

Needs docs and tests running in travis. I'd be happy with a follow up.

CertFile string `json:"cert_file" yaml:"cert_file"`
}

type Etcd struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need struct tags

Endpoints []string `yaml:"endpoints"`

@dqminh
Copy link
Contributor Author

dqminh commented Nov 6, 2017

@ericchiang Thanks ! I will look into docs and travis change.

This patch uses docker to run an etcd container in travis CI so we can
run storage/etcd conformance tests.
This adds references to etcd storage, including:
- only supports etcd v3
- list of options and their meanings when connecting to etcd cluster
This explicitly adds struct tags for etcd storage instead of implicitly
depends on yaml/json config serialization.
Copy link
Contributor

@ericchiang ericchiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Thanks for getting the conformance tests running against CI!

@ericchiang ericchiang merged commit ccf85a7 into dexidp:master Nov 6, 2017
@dqminh dqminh deleted the etcd-storage branch November 8, 2017 16:17
mmrath pushed a commit to mmrath/dex that referenced this pull request Sep 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants