Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): implement separate storage layer #262

Merged
merged 7 commits into from
Mar 29, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
42 changes: 36 additions & 6 deletions boot.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,13 @@ import (
"github.com/deis/builder/pkg/gitreceive"
"github.com/deis/builder/pkg/healthsrv"
"github.com/deis/builder/pkg/sshd"
"github.com/deis/builder/pkg/storage"
"github.com/deis/builder/pkg/sys"
pkglog "github.com/deis/pkg/log"
storagedriver "github.com/docker/distribution/registry/storage/driver"
_ "github.com/docker/distribution/registry/storage/driver/azure"
"github.com/docker/distribution/registry/storage/driver/factory"
_ "github.com/docker/distribution/registry/storage/driver/gcs"
_ "github.com/docker/distribution/registry/storage/driver/s3-aws"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're going for here... most of what you need is already implemented in these packages... but it feels very "weird" to rely on these packages that have a great deal to do with registry when the builder code really has nothing to do with registry. Can the useful bits of Docker's code be split into a separate project that can be a standalone go-based object storage library?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @krancour's statement. We should at least use our fork, but I'd prefer a new repository that is 100% dedicated to this storage abstraction. Note that @mboersma has submitted distribution/distribution#1537, which is highly related.

Finally, I'd prefer that, eventually, we refactor whatever storage library that we work with to:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using our fork only. I had aliased docker distribution to grab our repo in the glide.yaml file the reason being i don't want to change all the imports in our repo.
The plan is to initially use it this way and depending on the PR in the docker distribution we want to change ourself making into a separate repo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmala responses below:

I am using our fork only. I had aliased docker distribution to grab our repo in the glide.yaml file the reason being i don't want to change all the imports in our repo.

Ah, got it. Sorry I didn't realize what's going on there.

The plan is to initially use it this way and depending on the PR in the docker distribution we want to change ourself making into a separate repo.

That plan is fine with me for now. FWIW, my guess is that distribution/distribution#1537 is going to be a "won't fix".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That plan is fine with me for now. FWIW, my guess is that distribution/distribution#1537 is going to be a "won't fix".

Don't count us out. We'd like to avoid extra maintenance overhead of a separate project. However, if people find the model to be effective and some extra work is done to increase fitness for external usage, we will likely reconsider. There really is nothing registry specific about these packages.

Not use anonymous imports unless for good, documented reason

These are to support back registration. If you find the documentation to be lacking, we are willing to accept a PR. Up until this point, the usage of this package has been mostly internal. If we split out this project, the back registration would likely remain in the registry project, as it is very specific to the configuration environment. There is no reason to continue to maintain it in this manner, other than convenience, so we're open to suggestions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe I certainly haven't counted you out (I was simply speculating), and definitely understand the maintenance overhead of the separate project.

Reading up, I realize that I wasn't clear that we can make progress with, and will submit PRs upstream to either deis/distribution or the new repo if you do decide to split it out. This storage abstraction is useful to us either way, and a new repo would just be a convenience (precisely because the storage layer is not registry specific).

These are to support back registration. If you find the documentation to be lacking, we are willing to accept a PR.

Great - I'll need to read through the registry codebase more thoroughly to know exactly what you mean by back registration. Mind pointing me to a good starting place?

There is no reason to continue to maintain it in this manner, other than convenience, so we're open to suggestions.

Just so I'm clear, do you mean there would be no reason to continue to maintain it in this manner if you do split out the project, or is there something I'm missing?

Anyway, I'm happy to help out where possible (beyond the docs mentioned above).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A solid place to start with back registration is here. The factory is registered such that the driver can be instantiated by name and opaque configuration values. The anonymous imports (ie _) force the registration via the init method and make it available. It allows one to have different mains to compile in different sets of driver plugins. We used to use this, along with build tags, to control which drivers were compiled into the binary. This is a pattern that is leveraged in several places, such as crypto hash functions and database/sql.

What I meant by my word soup was that we can separate the back registration from the driver implementations. Registry specific configuration is present in a lot of the driver packages that we might want to have a closer look at. The exact split is unclear, but we would go with what would make it broadly useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A solid place to start with back registration is here. The factory is registered such that the driver can be instantiated by name and opaque configuration values. The anonymous imports (ie _) force the registration via the init method and make it available

Thanks for that this documentation PR should be a decent (if small) start to documenting this registration mechanism.

What I meant by my word soup was that we can separate the back registration from the driver implementations.

That's great!

Registry specific configuration is present in a lot of the driver packages that we might want to have a closer look at. The exact split is unclear, but we would go with what would make it broadly useful.

I'll have a look to get myself more familiar on the registry bits. My general thought was to move just StorageDriver and implementations out, leaving the factory in. FWIW

kcl "k8s.io/kubernetes/pkg/client/unversioned"
)

Expand Down Expand Up @@ -49,16 +53,27 @@ func main() {
pkglog.Err("getting config for %s [%s]", serverConfAppName, err)
os.Exit(1)
}
env := sys.RealEnv()
fs := sys.RealFS()
env := sys.RealEnv()
pushLock := sshd.NewInMemoryRepositoryLock()
circ := sshd.NewCircuit()

s3Client, err := storage.GetClient(cnf.HealthSrvTestStorageRegion, fs, env)
storageParams, err := conf.GetStorageParams(env)
if err != nil {
log.Printf("Error getting s3 client (%s)", err)
log.Printf("Error getting storage parameters (%s)", err)
os.Exit(1)
}
var storageDriver storagedriver.StorageDriver
if cnf.StorageType == "minio" {
storageDriver, err = factory.Create("s3", storageParams)
} else {
storageDriver, err = factory.Create(cnf.StorageType, storageParams)
}
if err != nil {
log.Printf("Error creating storage driver (%s)", err)
os.Exit(1)
}

kubeClient, err := kcl.NewInCluster()
if err != nil {
log.Printf("Error getting kubernetes client [%s]", err)
Expand All @@ -67,7 +82,7 @@ func main() {
log.Printf("Starting health check server on port %d", cnf.HealthSrvPort)
healthSrvCh := make(chan error)
go func() {
if err := healthsrv.Start(cnf.HealthSrvPort, kubeClient.Namespaces(), s3Client, circ); err != nil {
if err := healthsrv.Start(cnf.HealthSrvPort, kubeClient.Namespaces(), storageDriver, circ); err != nil {
healthSrvCh <- err
}
}()
Expand Down Expand Up @@ -111,8 +126,23 @@ func main() {
cnf.CheckDurations()
fs := sys.RealFS()
env := sys.RealEnv()
storageParams, err := conf.GetStorageParams(env)
if err != nil {
log.Printf("Error getting storage parameters (%s)", err)
os.Exit(1)
}
var storageDriver storagedriver.StorageDriver
if cnf.StorageType == "minio" {
storageDriver, err = factory.Create("s3", storageParams)
} else {
storageDriver, err = factory.Create(cnf.StorageType, storageParams)
}
if err != nil {
log.Printf("Error creating storage driver (%s)", err)
os.Exit(1)
}

if err := gitreceive.Run(cnf, fs, env); err != nil {
if err := gitreceive.Run(cnf, fs, env, storageDriver); err != nil {
log.Printf("Error running git receive hook [%s]", err)
os.Exit(1)
}
Expand Down
116 changes: 105 additions & 11 deletions glide.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 3 additions & 1 deletion glide.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -32,4 +32,6 @@ import:
version: ~1.1
- package: github.com/arschles/assert
version: 6882f85ccdc7c1822b146d1a6b0c2c48f91b5140
- package: github.com/minio/minio-go
- package: github.com/docker/distribution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since distribution/distribution#1512 has been merged, can we now use the docker repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docker distribution doesn't create a bucket if it isn't present and they have a regex which prevents us from using the ':' in the path https://github.com/docker/distribution/blob/master/registry/storage/driver/storagedriver.go#L109.
All our folders are :

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fine to leave as-is for now then.

Also, do you know the reasoning behind that regex? I'm particularly wondering if there's an object storage system that disallows : in object keys (I don't know of one)? Not blocking for this PR.

repo: https://github.com/deis/distribution
vcs: git
42 changes: 41 additions & 1 deletion pkg/conf/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,20 @@ import (
"fmt"
"io/ioutil"

"github.com/deis/builder/pkg/sys"
"github.com/kelseyhightower/envconfig"
)

const (
builderKeyLocation = "/var/run/secrets/api/auth/builder-key"
builderKeyLocation = "/var/run/secrets/api/auth/builder-key"
storageCredLocation = "/var/run/secrets/deis/objectstore/creds/"
minioHostEnvVar = "DEIS_MINIO_SERVICE_HOST"
minioPortEnvVar = "DEIS_MINIO_SERVICE_PORT"
gcsKey = "key.json"
)

type Parameters map[string]interface{}

// EnvConfig is a convenience function to process the envconfig (
// https://github.com/kelseyhightower/envconfig) based configuration environment variables into
// conf. Additional notes:
Expand All @@ -35,3 +42,36 @@ func GetBuilderKey() (string, error) {
builderKey := string(builderKeyBytes)
return builderKey, nil
}

func GetStorageParams(env sys.Env) (Parameters, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like this func to go into a top level storage package. I just merged #248, which creates the github.com/deis/builder/pkg/storage package, so that's a perfect place to put it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept in conf because i am getting the storage params from the environment configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see what you mean. I think the argument could be made either way, so I won't contest your decision.

params := make(map[string]interface{})
files, err := ioutil.ReadDir(storageCredLocation)
if err != nil {
return nil, err
}

for _, file := range files {
data, err := ioutil.ReadFile(storageCredLocation + file.Name())
if err != nil {
return nil, err
}
//GCS expect the to have the location of the service account credential json file
if file.Name() == gcsKey {
params["keyfile"] = storageCredLocation + file.Name()
} else {
params[file.Name()] = string(data)
}
}
params["bucket"] = params["builder-bucket"]
params["container"] = params["builder-container"]
if env.Get("BUILDER_STORAGE") == "minio" {
mHost := env.Get(minioHostEnvVar)
mPort := env.Get(minioPortEnvVar)
params["regionendpoint"] = fmt.Sprintf("http://%s:%s", mHost, mPort)
params["secure"] = false
params["region"] = "us-east-1"
params["bucket"] = "git"
}

return params, nil
}