WIP: RFC: pkg/assets: Merkle DAG asset library #556
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: wking The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
went through this as first pass:
will keep looking. i'm a little unsure if we need this complexity... |
I agree that there will be lots of files. To mitigate that, we could use But to balance the "this is a lot of files":
This allows me to colocate the rebuilder/defaulter definitions with their registrations. If you'd rather have me centralize the registrations in one big, non-
It's all bytes on disk for both this and our current approach. If you want helpers to unpack a particular asset into a
That's a bonus, right? Why would you want custom read logic? ;)
We ignore future changes to its ancestors, but we continue to propagate down to its descendants. If we wanted, we could record its parent references when we freeze it and log warnings if those parents changed (e.g. "you've added a custom |
:/
so this requires users to keep the whole dir around for that; i don't see how that is different than a state file.
I personally have been trying my best to keep us from adding new interfaces too, but targets are a better UX
The current impl is able to work without any init
Current impl is slowly moving away from bytes... https://github.com/openshift/installer/blob/master/pkg/asset/machines/worker.go#L69 is accessing the installconfig type directly, not marshal/unmarshalling, while this forces us to go back
validations like reading expired certs from disk etc. can be useful to the user.
The current impl can handle that case easily too: disregard parents when an asset from disk is read and modified... Also, how do we model a case where the user wants to bring new files to an asset, e.g. manifests from the current impl? |
I've pushed dd9b8cc -> 20912ea, rebasing onto master and adding enough assets to generate a reasonable graph.
Yeah, #547 will give us similar behaviour here. The difference is that with #547 you still need to find the right target to access the resource(s) you want to manipulate, while with this approach you (re)generate all the assets at once.
Because...? Is it just that there's less to think about at each stage? One extension I was thinking about was:
to allow you to create subsets of the graph. For example, you could: openshift-install --dir example create assets bootstrap.ign to just create assets through to the
Because the current implementation requires consumers to explicitly import their parents (e.g. here). With this PR, you just feed your parents' names into
It's still unmarshalling when we pull it up off the disk. The savings are just efficiency for assets that are consumed in multiple places. I don't think we have so many of those that unmarshalling efficiency is going to be a big deal, but I can run some benchmarks if you want once I get the asset tree filled in a bit more.
Validation sounds useful. Adding a new
Right, currently it's disregarding the from-disk asset when a parent changes, but we could change that. Looking through master's The current asset approach requires a lot of Go overhead for each asset, while the approach I have here requires very little. For example, compare this 29-line file for the kube CA with the current 44 liner. And I have the bootstrap units as separate assets injected with one line each, while the current master pushes them into the bootstrap asset, presumably because the overhead of supporting
Yeah, there's not an elegant way to do that. We could hack something in for manifests in particular (do we need this for other asset types beyond manifests?). As this branch stands, the easiest approach is probably stuffing your manifest into |
I've pushed 20912ea -> 0222815 expanding the graph a bit more and changing One possibility to balance my interest in (re)rendering as a single step with @abhinavdahiya's concerns about overwhelming users would be to write these files into subdirectories (e.g. |
I've pushed 0222815 -> 49efd2d (new graph) implementing asset subdirectories. The generated asset state (still very incomplete) now looks like:
$ openshift-install --dir=wking create assets --prune
$ tree wking
wking
├── auth
│   ├── kubeconfig-admin
│   └── kubeconfig-kubelet
├── base-domain
├── bin
│   ├── bootkube.sh
│   └── report-progress.sh
├── cluster
├── cluster-name
├── ignition
│   └── bootstrap.ign
├── ignition-configs
├── manifests
│   ├── cluster-apiserver-certs.yaml
│   └── kube-apiserver-secret.yaml
├── overrides
│   ├── kube-apiserver-config-overrides.yaml
│   └── kube-controller-manager-config-overrides.yaml
├── ssh.pub
├── tls
│   ├── admin-client.crt
│   ├── admin-client.key
│   ├── aggregator-ca.crt
│   ├── aggregator-ca.key
│   ├── cluster-apiserver-ca.crt
│   ├── cluster-apiserver-ca.key
│   ├── kube-ca.crt
│   ├── kube-ca.key
│   ├── kubelet-client.crt
│   ├── kubelet-client.key
│   ├── root-ca.crt
│   └── root-ca.key
└── unit
    ├── bootkube.service
    ├── kubelet.service
    ├── progress.service
    └── tectonic.service

7 directories, 30 files

Folks interested in install-config level settings can just look at:
$ ls -F --group-directories-first wking | cat
auth/
bin/
ignition/
manifests/
overrides/
tls/
unit/
base-domain
cluster
cluster-name
ignition-configs
ssh.pub
		return nil, errors.Wrapf(err, "hash %q parent %q", asset.Name, name)
	}

	asset.Parents = append(asset.Parents, Reference{
this is open to duplications?
It is. I'm comfortable avoiding that by not requesting the same parent twice from a single rebuilder. But if you'd prefer a guard to catch folks who do so by mistake, or if you'd like me to allow repeat lookups without duplicate additions, I can put either of those in.
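For reference, the second option mentioned above (allowing repeat lookups without duplicate additions) could look roughly like the sketch below. The `Reference` and `Asset` types here are trimmed-down, hypothetical stand-ins, and `AddParent` is an illustrative helper name, not something from this branch.

```go
package main

import "fmt"

// Reference is a hypothetical stand-in for the PR's parent reference type.
type Reference struct {
	Name string
	Hash string
}

// Asset is a hypothetical, trimmed-down asset with only the fields this
// sketch needs.
type Asset struct {
	Name    string
	Parents []Reference
}

// AddParent appends ref unless a parent with the same name is already
// recorded, so repeat lookups do not create duplicate entries.
// It reports whether the reference was added.
func (a *Asset) AddParent(ref Reference) bool {
	for _, existing := range a.Parents {
		if existing.Name == ref.Name {
			return false
		}
	}
	a.Parents = append(a.Parents, ref)
	return true
}

func main() {
	asset := &Asset{Name: "tls/root-ca.crt"}
	first := asset.AddParent(Reference{Name: "tls/root-ca.key", Hash: "abc"})
	second := asset.AddParent(Reference{Name: "tls/root-ca.key", Hash: "abc"})
	fmt.Println(first, second, len(asset.Parents))
}
```

A linear scan is fine for the handful of parents a single rebuilder requests; a guard that returns an error instead of silently skipping would follow the same shape.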
58b31c2 I don't think this is the best way to allow any kind of review. i can't compare or verify if your copied asset impl is correct. please make changes in place. also, i see the dev docs on asset generation were dropped; i think they were useful as they allow anybody new to look and add a new asset. Also there is no mention of |
I could squash things down, but to get an in-place pivot, I'd need to squash all of:
together (and possibly also the doc updates currently in 813ea6e). I can do that, you can see pretty much the same diff via GitHub's file view, and I think that in general I'm mutating files too much for Git to be able to see things like the Or are you saying I should rename my files to preserve the old filenames where possible? I could do that, but I'd like to keep the generic
The user-level portion of this doc moved into
Because users don't need to care about them, and devs can read the (currently sparse) godocs. I'll take a stab at extending the godocs, and we'll see how that feels to you. |
040ec7c to 5994f8b
I've rebased onto master with 49efd2d -> 5994f8b (new graph). The graph nodes are now colored by directory to make it easier to get a quick overview. I've also added more docs to help orient devs:
$ godoc ./pkg/installerassets
...
VARIABLES

var Defaults = make(map[string]Defaulter)
    Defaults registers installer asset default functions by name. Use this
    to set up assets that do not have parents. For example, constants:

        Defaults["your/asset"] = ConstantDefault([]byte("your value"))

    or values populated from outside the asset graph:

        Defaults["your/asset"] = func() (data []byte, err error) {
            value := os.Getenv("YOUR_ENVIRONMENT_VARIABLE")
            return []byte(value), nil
        }

var Rebuilders = make(map[string]assets.Rebuild)
    Rebuilders registers installer asset rebuilders by name. Use this to set
    up assets that have parents. For example:

        func yourRebuilder(getByName assets.GetByString) (asset *assets.Asset, err error) {
            asset = &assets.Asset{
                Name:          "tls/root-ca.crt",
                RebuildHelper: rootCARebuilder,
            }

            parents, err := asset.GetParents(getByName, "tls/root-ca.key")
            if err != nil {
                return nil, err
            }

            // Assemble your data based on the parent content using your custom logic.
            for _, parent := range parents {
                asset.Data = append(asset.Data, parent.Data...)
            }

            return asset, nil
        }

    and then somewhere (e.g. an init() function), add it to the registry:

        Rebuilders["your/asset"] = yourRebuilder
...
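To make the two registries easier to picture, here is a minimal, self-contained sketch of how a name could be resolved through them. It assumes simplified function types (`Defaulter`, `Getter`, `Rebuilder`) that are not the PR's real signatures, and `example/fqdn` is a made-up asset name used only for illustration.

```go
package main

import "fmt"

// Defaulter, Getter, and Rebuilder are simplified, hypothetical versions of
// the registry function types; the PR's real signatures differ.
type Defaulter func() ([]byte, error)
type Getter func(name string) ([]byte, error)
type Rebuilder func(get Getter) ([]byte, error)

var (
	defaults   = map[string]Defaulter{}
	rebuilders = map[string]Rebuilder{}
)

// resolve returns the data for name, preferring a rebuilder (an asset with
// parents) and falling back to a default (an asset without parents).
// There is no cycle detection in this sketch.
func resolve(name string) ([]byte, error) {
	if rebuild, ok := rebuilders[name]; ok {
		return rebuild(resolve)
	}
	if def, ok := defaults[name]; ok {
		return def()
	}
	return nil, fmt.Errorf("no rebuilder or default registered for %q", name)
}

func init() {
	defaults["cluster-name"] = func() ([]byte, error) { return []byte("demo"), nil }
	defaults["base-domain"] = func() ([]byte, error) { return []byte("example.com"), nil }

	// "example/fqdn" is a hypothetical asset built from two parents.
	rebuilders["example/fqdn"] = func(get Getter) ([]byte, error) {
		name, err := get("cluster-name")
		if err != nil {
			return nil, err
		}
		domain, err := get("base-domain")
		if err != nil {
			return nil, err
		}
		return []byte(string(name) + "." + string(domain)), nil
	}
}

func main() {
	data, err := resolve("example/fqdn")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(data))
}
```

The point of the sketch is that a rebuilder only names its parents as strings, so adding an asset means adding one map entry rather than importing the parents' Go packages.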
5c45d37 to c9f2dcb
To separate them more cleanly from the assets that will end up in our asset graph.
Our current pkg/asset approach requires multiple stages to run, and selecting those stages requires some dev planning and user education. This new package allows us to dump the whole asset graph at once, and then edit/rebuild multiple times while maintaining a consistent asset state. The new library also hashes the parent data, which would allow expensive rebuilds to check their intended parents against the parents from the previous run and only rebuild when a parent had changed. When rebuilding the asset from scratch is cheap, a parent check is probably not worth the trouble.
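The parent check described above might be sketched as follows. This is an illustration under assumptions: the commit message does not say which hash is used (SHA-256 is assumed here), and `hashData` and `needsRebuild` are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashData returns the hex-encoded SHA-256 digest of data. SHA-256 is an
// assumption for this sketch; the PR does not name its hash function here.
func hashData(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// needsRebuild reports whether the current parents differ from the hashes
// recorded on the previous run (keyed by parent name). A changed set of
// parent names also counts as a difference.
func needsRebuild(recorded map[string]string, current map[string][]byte) bool {
	if len(recorded) != len(current) {
		return true
	}
	for name, data := range current {
		old, ok := recorded[name]
		if !ok || old != hashData(data) {
			return true
		}
	}
	return false
}

func main() {
	key := []byte("private key bytes")
	recorded := map[string]string{"tls/root-ca.key": hashData(key)}

	unchanged := map[string][]byte{"tls/root-ca.key": key}
	edited := map[string][]byte{"tls/root-ca.key": []byte("edited by the user")}

	fmt.Println(needsRebuild(recorded, unchanged))
	fmt.Println(needsRebuild(recorded, edited))
}
```

An expensive rebuilder would call something like `needsRebuild` first and reuse its previous output when it returns false; cheap rebuilders can skip the check and always regenerate.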
So we can remove the rest of pkg/asset.
... to create installer assets. This asset graph is currently pretty sparse while we work out the proof-of-concept. The 'graph' command is a bit different now that parents are dynamic. Instead of printing the full possible graph, it prints the actual graph for a particular asset state. If you render an asset graph with 'openshift-install --dir=whatever create assets', edit a file, and run 'openshift-install --dir=whatever graph', you'll get a graph for your particular asset store. I'm also auto-coloring nodes based on their directory name, using hue-saturation-value colors to get a rainbow of pale backgrounds. I'm reaching in and adjusting the saturation for tls/kubelet-client.crt because it is special with its thirty-minute validity. While the other assets, especially the other TLS assets, should be safe to use for multiple cluster invocations, users will almost certainly want to remove and regenerate the kubelet client cert for each new cluster (or drop in their own client cert with a longer validity).
Generated with:
$ dep ensure
using:
$ dep version
dep:
 version     : v0.5.0
 build date  :
 git hash    : 22125cf
 go version  : go1.10.3
 go compiler : gc
 platform    : linux/amd64
 features    : ImportDuringSolve=false

This interface library allows us to use *testing.T for logging during unit tests (which has nice display properties, being hidden by the test suite except on failures or -v). And it also lets us swap that over to logrus during command runs. This commit also reduces our Ignition dependencies because I've dropped a BoolToPtr call from pkg/asset/ignition/bootstrap/bootstrap.go that had been consuming github.com/coreos/ignition/config/util.
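The logging swap described above can be illustrated with a small interface. This is only a sketch: the `Logger` interface, `stdLogger` adapter, and `memoryLogger` fake below are hypothetical and stdlib-only, not the vendored library's API. The one fact relied on is that `*testing.T` has a `Logf(format string, args ...interface{})` method, so it satisfies an interface of this shape without an adapter.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Logger is a hypothetical minimal logging interface. *testing.T already
// satisfies it through its Logf method.
type Logger interface {
	Logf(format string, args ...interface{})
}

// stdLogger adapts the standard library's *log.Logger to Logger. A logrus
// adapter would have the same shape; it is omitted to keep this stdlib-only.
type stdLogger struct{ l *log.Logger }

func (s stdLogger) Logf(format string, args ...interface{}) {
	s.l.Printf(format, args...)
}

// memoryLogger collects formatted lines in memory, playing the role that
// *testing.T would play inside a unit test.
type memoryLogger struct{ lines []string }

func (m *memoryLogger) Logf(format string, args ...interface{}) {
	m.lines = append(m.lines, fmt.Sprintf(format, args...))
}

// rebuild stands in for library code that only knows about Logger.
func rebuild(logger Logger, name string) {
	logger.Logf("rebuilding %q", name)
}

func main() {
	// Command runs: log to stderr.
	rebuild(stdLogger{log.New(os.Stderr, "", 0)}, "tls/root-ca.crt")

	// Tests: capture the output instead.
	mem := &memoryLogger{}
	rebuild(mem, "tls/root-ca.crt")
	fmt.Println(len(mem.lines))
}
```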
The node names are wide, and the graph itself is wide and shallow. Ordering the graph left to right gets the wide node names in one direction, and the wide graph in the other direction, bringing the total rendering closer to a square.
So you can hover over a long edge to see the parent and child names. Docs in [1].
[1]: http://www.graphviz.org/doc/info/attrs.html#d:tooltip
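The three graph tweaks in the commit messages above (per-directory pale colors, left-to-right ordering, and edge tooltips) can be pictured with a small DOT generator. This is a sketch, not the PR's implementation: `Edge`, `dirColors`, and `renderDOT` are hypothetical names, and the fixed 0.2 saturation is an arbitrary choice for "pale". It relies on Graphviz accepting colors as "H S V" triples with each component between 0 and 1.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"strings"
)

// Edge is a hypothetical parent -> child pair.
type Edge struct{ Parent, Child string }

// dirColors gives each directory an evenly spaced hue. Low saturation and
// full value produce pale backgrounds, written as Graphviz "H S V" triples.
func dirColors(nodes []string) map[string]string {
	seen := map[string]bool{}
	var dirs []string
	for _, n := range nodes {
		d := path.Dir(n)
		if !seen[d] {
			seen[d] = true
			dirs = append(dirs, d)
		}
	}
	sort.Strings(dirs)
	colors := make(map[string]string, len(dirs))
	for i, d := range dirs {
		hue := float64(i) / float64(len(dirs))
		colors[d] = fmt.Sprintf("%.3f 0.200 1.000", hue)
	}
	return colors
}

// renderDOT writes a left-to-right digraph with filled nodes colored by
// directory and a tooltip on every edge naming its parent and child.
func renderDOT(nodes []string, edges []Edge) string {
	var b strings.Builder
	b.WriteString("digraph assets {\n  rankdir=LR;\n  node [style=filled];\n")
	colors := dirColors(nodes)
	for _, n := range nodes {
		fmt.Fprintf(&b, "  %q [fillcolor=%q];\n", n, colors[path.Dir(n)])
	}
	for _, e := range edges {
		fmt.Fprintf(&b, "  %q -> %q [tooltip=%q];\n", e.Parent, e.Child, e.Parent+" -> "+e.Child)
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	nodes := []string{"tls/root-ca.key", "tls/root-ca.crt", "unit/kubelet.service"}
	edges := []Edge{{Parent: "tls/root-ca.key", Child: "tls/root-ca.crt"}}
	fmt.Print(renderDOT(nodes, edges))
}
```

The output can be piped through `dot -Tsvg` the same way the `graph` subcommand's output is; tooltips only show up in output formats that support them, such as SVG.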
Generated with:
$ export OPENSHIFT_INSTALL_PLATFORM=aws
$ openshift-install --dir=does-not-exist graph | dot -Tsvg >docs/user/assets.svg
using:
$ dot -V
dot - graphviz version 2.30.1 (20170916.1124)
@wking: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@wking: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
I still like this approach, but I'm giving up on rebasing it ;). /close |
@wking: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Our current pkg/asset approach requires multiple stages to run, and selecting those stages requires some dev planning and user education. This new package allows us to dump the whole asset graph at once, and then edit/rebuild multiple times while maintaining a consistent asset state. I think this will be easier for folks to wrap their heads around, but I'm not always the best judge of approachability ;). Check out the updated user docs (included in the PR) for an overview of the new approach.
If folks think this approach is worthwhile, I can keep working through the process of porting our old assets over to the new framework. Currently there are only a handful as a proof-of-concept.