
Consider using Jsonnet (and maybe Kubecfg) as a foundation for the rules. #25

hsyed opened this issue Sep 19, 2017 · 21 comments

@hsyed

hsyed commented Sep 19, 2017

As primitives, k8s_defaults and k8s_object are quite fine grained. I envisage the current k8s_defaults and k8s_object evolving to provide some templating parameterised by make variables, JSON files, and statically provided dicts -- this will invariably be limiting.

  1. I don't think they will scale to express multiple build / CI environments without a lot of Skylark macro code.
  2. If a production-faithful environment can be sufficiently modelled using Skylark macros, all of that work will be specific to the build / CI environment. To produce manifests for production, a separate manifest management tool would still need to be maintained.

Here are some suggestions:

  • Build upon Jsonnet. It is super lightweight as a templating language and easy to include as an external dependency (it has no external dependencies and already has a Bazel build).
  • Expand the scope of k8s_defaults beyond defaults for a single type of resource or a holder for some configuration parameters. A k8s_defaults that holds jsonnet file(s) (libsonnet) can model anything from a simple dict of params -- as it does now -- all the way to a particular deployment environment.
  • Expand the scope of k8s_object to represent an arbitrary set of k8s resources instead of just a single resource. A single jsonnet file renders to multiple JSON or YAML files. A real-world microservice is going to have a Service, a controller, an Ingress, Secrets, and a ConfigMap; these should likely be deployed in one step.
  • Consider the use of kubecfg -- the kubecfg tool takes care of some of the problems that would need to be solved to use jsonnet in the manner I am describing above. kubecfg has image resolution, simple multi-YAML-file rendering, and CI steps that would likely mean a dependency on kubectl is not necessary.

I am working on wiring kubecfg into our codebase. I have run rules for my k8s_object equivalent that map to kubecfg's commands (apply, delete, validate, show). I can provide more detail if interested.
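For a rough idea of the shape this takes, here is a minimal Starlark sketch (not the actual rules; the `//tools:kubecfg_runner.sh` wrapper and `//tools:kubecfg` binary are hypothetical) of run targets that map onto kubecfg subcommands:

```python
# kubecfg_object.bzl -- hypothetical sketch of run rules wrapping kubecfg.
def kubecfg_object(name, src, visibility = None):
    """Creates <name>.apply, <name>.delete, <name>.validate and <name>.show."""
    for cmd in ["apply", "delete", "validate", "show"]:
        native.sh_binary(
            name = "%s.%s" % (name, cmd),
            # Tiny wrapper script, assumed to exec `kubecfg <cmd> <file>`
            # (mapping "apply" onto `kubecfg update`).
            srcs = ["//tools:kubecfg_runner.sh"],
            args = [cmd, "$(location %s)" % src],
            data = [src, "//tools:kubecfg"],
            visibility = visibility,
        )
```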

@mattmoor
Contributor

@dlorenc FYI

FWIW, I intentionally decoupled the functionality this currently performs from any sort of templating; you can see some of my rationale here. I definitely appreciate its usefulness, which is why my original prototype had a form of templating :)

That said, I certainly anticipate folks feeding the output of bazelbuild/rules_jsonnet into the template attribute; there is already an issue tracking it here. I would expect these rules to work with the output of a variety of Kubernetes templating tools, e.g. kexpand, ksonnet.
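For example, that wiring could look roughly like this (a sketch; the file and image names are placeholders):

```python
load("@io_bazel_rules_jsonnet//jsonnet:jsonnet.bzl", "jsonnet_to_json")
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")

# Render the jsonnet template to plain JSON first...
jsonnet_to_json(
    name = "deployment_json",
    src = "deployment.jsonnet",
    outs = ["deployment.json"],
)

# ...then feed the rendered output into the template attribute.
k8s_object(
    name = "deploy",
    kind = "deployment",
    template = ":deployment.json",
    images = {
        "gcr.io/example/app:latest": "//app:image",  # placeholder image target
    },
)
```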

The resolver already supports multi-document yaml files, see here. I believe the only place this should be problematic today is :foo.describe, which IIRC is the only place we consume the kind attribute.

At one point I was validating kind, and debated allowing multiple kinds for this class of thing, but ultimately dumped the validation because it doesn't work well with Custom Resource Definitions (formerly Third Party Resources).


Beyond composing templating with these rules, I also wanted to enable folks to build stuff that potentially wraps these rules to expose higher-level functionality.

  • This parallels the way DSLs like jsonnet and friends wrap the raw API objects, and
  • This parallels the way py_image and friends wrap the raw docker_build.

Bottom line, I want to enable all of the things you are talking about and more, and I hope this is a reasonable foundation for doing that, but it'll take some iteration.
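As a strawman, such a wrapper could be as simple as a macro fanning out to several k8s_object targets (the macro and its attribute names below are hypothetical, not part of these rules):

```python
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")

# Hypothetical higher-level macro: one call produces the Deployment and the
# Service for a microservice, much as py_image layers on the raw image rules.
def k8s_microservice(name, deployment, service, images = {}):
    k8s_object(
        name = name + "-deployment",
        kind = "deployment",
        template = deployment,
        images = images,
    )
    k8s_object(
        name = name + "-service",
        kind = "service",
        template = service,
    )
```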

> To produce manifests for production, a separate manifest management tool would still need to be maintained.

You mean to instantiate things or to store a resolved template?

> ... for my k8s_object equivalent ...

Is it in a public repo you can link me to?

@mattmoor
Contributor

@hsyed Does what I say make sense?

I would love to add some samples showing this stuff working with jsonnet, but having never touched it and having no good jsonnet + K8s samples (issue) in rules_jsonnet, my (brief) exploration last night wasn't fruitful.

If you have some good examples we could turn into samples for K8s, I'd certainly appreciate the pointer.

@mmikulicic also has bazelbuild/rules_jsonnet#28, which seems like it could be fed into these rules, and it uses kubecfg under the hood.

@hsyed
Author

hsyed commented Sep 22, 2017

@mattmoor We could set up a Slack call next week and I can show you what I have done so far.

What you say makes sense. I can see the power of lower-level primitives that keep doors open. Granularly modelled rules are very good for setting up smoke and unit tests. I hadn't seen that you already had a jsonnet issue open.

What is great about jsonnet is that it can just be treated like simple JSON; the templating features need not be used. At the same time it has modelling power that rivals systems like SaltStack. It is certainly more flexible than Helm at creating reusable manifests, as things stand currently. So when I say let's use Jsonnet from the ground up, I mean it from the low-level perspective: it doesn't close any doors.

This is a good tutorial showing how jsonnet mixins work.

The kubecfg guys are working on conventions on top of jsonnet for modelling arbitrary deployment environments; their issue list is a good place to pick up on what they are working towards. The environments are just conventions for laying out jsonnet files and attaching environment configuration, such as environment x using kubecfg context x.

@hsyed
Author

hsyed commented Sep 22, 2017

A bit of a brain dump I wrote last night on the scope I want bazel to cover for k8s deployments.

Multiple Environments

When I created this issue I was hoping for the rules_k8s model of deployments to be rich enough that multiple CI environments could be generated (minikube -> staging -> dev -> qa), where each gets progressively more production-faithful, and that the same declarations could also be used to generate production manifests.

Model Complex Deployments

In terms of scope, it would be great if we could model a production-faithful HA Postgres or Kafka in Bazel. These deployments tend to be very complex to model. If the rules aren't flexible enough to enable a lot of reuse, Helm or another tool would still have to be used to set up parts of an environment.

Mass manifest installation mechanism

Another aspect is the ability to create many different k8s states/fixtures for k8s-aware microservices. Our codebase is evolving toward an architecture that will use Kubernetes as a hub for deploying schemas and plugins (JS) in a multi-tenant enterprise system. In this case it isn't just about installing manifests into Kubernetes to get services up, but also about setting up the business logic in the applications.

@hsyed
Author

hsyed commented Sep 22, 2017

Due to time constraints I am considering writing helm rules for Bazel for the time being, as we already have a lot of charts.

Helm as a CI tool

We currently have a few helm charts which are being used in a CI capacity. Helm works quite well as a CI tool if you use it in a specific way. What is good about helm is that it models the upgrade path quite well and will restart components if the charts are modelled correctly.

helm vs kubecfg

Kubecfg (at the moment) only installs manifests and has no logic for rolling upgrades.

helm vs kubectl

kubectl requires a rolling upgrade command to be issued and I don't think it discriminates -- I suspect the rolling upgrades are applied to every upgradeable component in a set of manifests -- so if we had a blob of manifests for an entire environment everything would be restarted.

low level modelling in rules_k8s

On a side note, if kubectl were used in rules_k8s, the granular modelling approach would help, as each component would be hermetically tied to the manifests and Docker image targets it depends on. An "environment" would be a collection of k8s_object targets, so a k8s_environment run target has all the knowledge it needs.

rules_k8s as a CI workflow tool.

Consider "Model Complex Deployments" in my previous entry... helm charts already exist for complex deployments. In rules_docker I ask for docker_build and docker_push to be reusable in other rules. Perhaps rules_k8s should be written as a k8s specific CI workflow engine, or be pluggable and helm rules could be one set of rules it supports.

@mattmoor
Contributor

mattmoor commented Oct 3, 2017

@hsyed sorry it took so long to get to this. I don't suppose you'll be coming to the Bazel conference in Sunnyvale in early November? I'd love to buy you coffee/beer and chat f2f :)

An "environment" would be a collection of k8s_object's. So a k8s_environment run target has all the knowledge it needs.

One of the things I've been playing with recently is getting this up and running for Prow from the kubernetes/test-infra repo, which prompted me to add support for k8s_objects. You can see it in action here. I think this captures two missing pieces:

  1. The ability to act on N objects at once (e.g. to stand up a whole thing), and
  2. The ability to act on a few of those objects at once (e.g. to iterate on a couple of things); see the sketch below.
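A minimal sketch of what that grouping looks like in a BUILD file (the member targets here are placeholders):

```python
load("@io_bazel_rules_k8s//k8s:objects.bzl", "k8s_objects")

# Acting on N objects at once: `bazel run :everything.apply` stands up the
# whole set, while each member keeps its own .apply/.delete for iteration.
k8s_objects(
    name = "everything",
    objects = [
        ":server",    # individual k8s_object targets (placeholders)
        ":service",
        ":ingress",
    ],
)
```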

I also resolved the jsonnet issue by adding a simple rules_jsonnet example.

> When I created this issue I was hoping for the rules_k8s model of deployments to be rich enough that multiple CI environments could be generated (minikube -> staging -> dev -> qa), where each gets progressively more production-faithful, and that the same declarations could also be used to generate production manifests.

So my hope is that rules_k8s with rules_jsonnet is sufficient here today for generating the configuration for multiple environments. For environments delineated by cluster or namespace, I hope the stamping support in those attributes is adequate. If you mutate labels to express environment, then one potential issue I see is that we might want to support stamp variables in rules_jsonnet.

You might be able to get away with a single rule definition for multiple environments using Bazel's select support, but I'm definitely not an expert on that.
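A sketch of the select approach, with hypothetical --define values and template files:

```python
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")

config_setting(
    name = "prod",
    define_values = {"env": "prod"},
)

# `bazel run :deploy.apply --define env=prod` picks the prod template;
# anything else falls back to the dev one.
k8s_object(
    name = "deploy",
    kind = "deployment",
    template = select({
        ":prod": "deployment-prod.yaml",
        "//conditions:default": "deployment-dev.yaml",
    }),
)
```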

@calder
Contributor

calder commented May 8, 2018

Did you two get to talk back in November? It would be great to be able to arbitrarily specialize resources at deploy time without needing the entire build system (i.e. --workspace_status_command).

@anguslees

Minor clarification to above:

> Kubecfg (at the moment) only installs manifests and has no logic for rolling upgrades.

kubecfg has full support for rolling upgrades (the same as what you see from other k8s tools), since this is done by k8s itself server-side. Eg: you can just update from one Deployment version to the next with kubecfg, and k8s will manage a smooth and safe rolling transition between the two. The explicit kubectl rolling-update subcommand is a relic from ancient Kubernetes (pre-Deployment) and should not be used.

A notable difference in this space is that helm adds the possibility for an additional "update" job that gets run between chart versions - which can be used for things like database schema upgrades. This pattern is generally frowned upon (and thus not supported by k8s out of the box) since it is a risky atomic imperative step and can complicate downgrades, but it is something people are used to using in pre-k8s architectures.

@borg286

borg286 commented Jan 7, 2019

I've been working on a repo that integrates jsonnet and rules_k8s
https://github.com/borg286/better_minig
The architecture is that I assume you have a Kubernetes cluster and point a .bzl file to it.
You can then do
`bazel run //go/examples/routeguide/client:myns-deep.apply`
This is a k8s_objects target that pulls in the JSON for a local routeguide client, and pulls in the routeguide server. Both the client and the server k8s_object are deployment types that use the image keyword, which depends on the Docker image target. A side note: these images are of gRPC-enabled Go binaries.
To produce the json I used rules_jsonnet and https://github.com/bitnami-labs/kube-libsonnet to simplify creating k8s json.
I use a .bzl file to make an ENVS list consisting of the environment names "prod", "staging", "dev", and "myns". I use Python-style composition to generate Bazel targets for each environment. A jsonnet library is used to dereference each environment name to its namespace value; notably, the "myns" environment pulls in the USER environment variable so the developer can work in his own namespace.
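Roughly reconstructed (the names below are mine, not from the linked repo), the pattern looks like this:

```python
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")

ENVS = ["prod", "staging", "dev", "myns"]

def routeguide_client(name, template):
    for env in ENVS:
        k8s_object(
            name = "%s-%s" % (env, name),
            kind = "deployment",
            template = template,
            # "myns" maps to the developer's own namespace; this assumes
            # rules_k8s stamp-variable support for {BUILD_USER}.
            namespace = "{BUILD_USER}" if env == "myns" else env,
        )
```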

In https://github.com/borg286/better_minig/tree/master/java/com/examples/grpc_redis I have an example where the server depends on a Redis setup as its backend for the location database. When the :myns-deep.create target is run, it also runs the server and Redis. The :myns-shallow.create target only includes the k8s objects that this service needs (ConfigMap, Prom rules, Service...), not external dependencies like a Prometheus server. The challenge I have now is that I'd like to have the build dependency reflected in the order in which the services are turned on. I believe Helm offers this kind of dependency; I don't know if kubecfg does. If we had some way to express our dependencies in Bazel targets, that would enable me to do deep dependency turnups consistently, as though a human were rolling out services in a sane manner.
I am currently considering forking rules_k8s and swapping out the underlying binary to be helm instead of kubectl. The more granular direction I see in the comments above would be to take the dependencies described in Bazel targets and provide them to another system that would manage the turnup.

@mattmoor
Contributor

mattmoor commented Jan 7, 2019

FWIW, I briefly looked at helm early on and (at the time) it was its inability to operate without Tiller that shut down my investigation (template instantiation couldn't be hermetic).

Tillerless Helm is now a (very popular) thing, so it's worth revisiting. Perhaps rather than forking the repo you could upstream the support via helm rules here?

@borg286

borg286 commented Jan 7, 2019

I think that sounds great. This is a side question, but it will inform this dev work: is it considered wrong to reference the helm binary by URL and have Bazel extract it, compared with having Bazel compile the whole thing? Helm seems to be mostly just a templating tool. In mkmik's repo he seems to have built kubecfg from scratch, which seems overboard to me.

Regarding Tillerless Helm, I thought that Tiller would be responsible for watching a deployment go out. If we pursue a tillerless solution, then the only benefit we get is letting users specify overrides in YAML files and merging those with their charts to produce composite k8s YAML files, with no dependency assurance.
A poor-man's hack would be to use the order of the dependencies in the k8s_objects target to order the kubectl commands, instead of throwing them all into one folder and telling kubectl to just run the whole pile (sketched below).
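Something along these lines, assuming a hypothetical `//tools:apply_in_order.sh` wrapper that runs `kubectl apply -f` on each argument in turn and stops at the first failure:

```python
# Hypothetical macro: apply each resolved manifest in the order listed,
# rather than pointing kubectl at a single folder of manifests.
def ordered_apply(name, manifests):
    native.sh_binary(
        name = name + ".apply",
        srcs = ["//tools:apply_in_order.sh"],
        args = ["$(location %s)" % m for m in manifests],
        data = manifests,
    )
```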

@chrislovecnm
Contributor

Someone at the Bazel conference had helm integration. But yeah, rules_helm seems reasonable. That, or make this so that you can make the kubectl deploy step a toolchain ... not sure if I am describing it correctly.

I know some folks want helm and some do not.

@borg286

borg286 commented Jan 7, 2019

I saw that this repo was trying to make kubectl an optional tool to have on the host; without it, Bazel would build it from scratch. Are you proposing a flag to the dependency function in one's WORKSPACE file where the user could opt for the helm binary, and the k8s_object rule would produce .create and .apply targets that run helm install and helm upgrade under the hood?

@mattmoor
Contributor

mattmoor commented Jan 7, 2019

I meant something like a helm subdirectory that potentially swaps out aspects of the underlying implementation, but may have a common core for resolving the yamls and such.

One other thing about Helm that turned me off was that the Go templating (when used arbitrarily) made it so that you couldn't read/modify/write it in a structured way to do things like resolution. Again, if that can now be done hermetically, a major obstacle is gone.

@borg286

borg286 commented Jan 7, 2019

A significant benefit of incorporating helm would be access to its plethora of charts: one could theoretically define a helm_object target, point it at some values.yaml file, then depend on that in a subsequent k8s_objects target, and this repo would do some magic under the hood to pipe the composite YAML files into nested chart directories that helm understands. This is probably getting off on a tangent, but I wanted to point out a possible benefit of this inheritance.

@borg286

borg286 commented Jan 7, 2019

@mattmoor, can you help me understand the problems with the hermetic aspect of the Go templating?

@borg286

borg286 commented Jan 7, 2019

To address the title of the bug, I feel that jsonnet is a superior language for inheriting and modifying Kubernetes objects that are fed into rules_k8s. Unlike gcl and piccolo it is fairly well structured, and it has pretty good support for creating libraries and piping values in from BUILD files and targets. Most of the examples I could find elsewhere did imports in a relative way (../other_folder/some.jsonnet), while I found
`local mylib = import "external/kube_jsonnet/kube.libsonnet";`
good for defining jsonnet libraries and using them elsewhere.
I personally like having rules_k8s expect either JSON or YAML as the input and leaving it to the user to figure out how to generate it; this aligns with kubectl. rules_jsonnet already has some decent flags for tuning how values get into your jsonnet space and for having multiple output files. I did encounter some pain w.r.t. needing to define a separate k8s_object rule for each output file. It makes running bazel query ... spit out a horde of "targets", and wanting to add .diff to the existing rule will make that blow up even further.
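The shape of that pain, sketched (file and resource names are placeholders; this assumes the jsonnet program emits one document per entry in outs):

```python
load("@io_bazel_rules_jsonnet//jsonnet:jsonnet.bzl", "jsonnet_to_json")
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")

RESOURCES = ["service", "deployment", "configmap"]

# One jsonnet evaluation, multiple rendered files...
jsonnet_to_json(
    name = "stack_json",
    src = "stack.jsonnet",
    outs = ["%s.json" % r for r in RESOURCES],
)

# ...but a separate k8s_object target per rendered file.
[
    k8s_object(
        name = r,
        kind = r,
        template = "%s.json" % r,
    )
    for r in RESOURCES
]
```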

@mattmoor
Contributor

mattmoor commented Jan 7, 2019

(this is from almost two years ago, so foggy, but I'll try to summarize what I recall)

Go templating wasn't the problem; it was that you needed Tiller to instantiate it (at the time), so build actions couldn't hermetically produce the yamls (to feed to another rule as an input).

@anguslees

anguslees commented Jan 8, 2019

Responding to two out of the many points above:

> The challenge I have now is that I'd like to have the build dependency reflected in the order in which the services are turned on. I believe Helm offers this kind of dependency; I don't know if kubecfg does.

kubecfg update flattens and sorts the objects before sending them to the k8s server. In brief, it does:

  • CRD/TPR definitions first
  • non-namespaced (global) objects
  • everything not covered elsewhere
  • all resources that include a PodSpec (by walking the server schema)

This isn't perfect, but it is good enough for many cases (and does not suffer from some of the issues helm encounters with its hard-coded list of kinds). For example, it will create the certmanager Certificate CRD declaration before using it, and the mysql Namespace, Service and ConfigMaps before creating the wordpress Deployment.

If you want something else, then you need to force the order externally by invoking kubecfg multiple times. If you think you have a generally-applicable example where the above heuristic fails, please file a kubecfg issue so we can at least know about it.

Fwiw, kubecfg show does not do the same sorting on output. (Odd, because I thought I had implemented that at some point...) I would be happy to add that to kubecfg if it is useful for what you're trying to build here.

> Go templating wasn't the problem; it was that you needed Tiller to instantiate it (at the time)

helm template now exists and works entirely client-side (don't even need tiller installed in target cluster). It just does the expected template expansion from an existing chart and values.yaml and spits out regular k8s resources YAML. It's pretty trivial(*) to slurp that YAML into kubecfg if you want to combine the two worlds.

As pointed out in other comments above, you lose the installation tracking and ordering without tiller.

(*) `kubecfg.parseYaml(importstr "foo.yaml")` gives you a jsonnet array of parsed YAML docs.
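From the Bazel side, one way to wire that in (a sketch; the `//tools:helm` binary and chart layout are assumptions, and the rendered output could then feed a template attribute) is a genrule that runs `helm template`:

```python
# Render a chart client-side and capture the resulting YAML as a build output.
genrule(
    name = "wordpress_rendered",
    srcs = glob(["charts/wordpress/**"]) + ["values.yaml"],
    outs = ["wordpress.yaml"],
    cmd = "$(location //tools:helm) template " +
          "$$(dirname $(location charts/wordpress/Chart.yaml)) " +
          "-f $(location values.yaml) > $@",
    tools = ["//tools:helm"],
)
```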

@borg286

borg286 commented Jan 8, 2019

I've read over all the comments and I'll try to summarize everything.

Request 1: Make k8s_object capable of pulling in jsonnet so it can do the templating/rendering.
Response: Use @rules_jsonnet and pull the output from a jsonnet_to_json to do the rendering yourself.
Justification: Having the API be JSON reinforces the granular abstraction of individual k8s_object targets.

Request 2: Make a k8s_stack/k8s_environment like rule that would intelligently handle individual k8s_object targets. Currently :bla.describe can't handle a composite yaml file.
Response: None so far.

Request 3: Add a deps field to either k8s_objects or k8s_stack where the user-requested command is first executed on dependencies before being executed on this target. I.e., :my-stack with a dependency on //prod/redis:prod would end up with :my-stack.upgrade first calling //prod/redis:prod.upgrade, and then, seeing that its own objects are k8s_object targets, calling .apply on each of them in the order listed in the rule.
Response: None so far
Benefits: This would allow one to explicitly define the order of their resources, and which whole stacks would need to be turned up and exit successfully before continuing on to this stack (see the sketch below).
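A purely hypothetical sketch of that API, assuming a `//tools:run_in_order.sh` wrapper that executes each of its arguments in turn:

```python
def k8s_stack(name, objects = [], deps = []):
    # Dependencies are upgraded first, then this stack's own objects are
    # applied, in the order they are listed.
    ordered = ["%s.upgrade" % d for d in deps] + ["%s.apply" % o for o in objects]
    native.sh_binary(
        name = name + ".upgrade",
        srcs = ["//tools:run_in_order.sh"],
        args = ["$(location %s)" % t for t in ordered],
        data = ordered,
    )
```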

I don't like the way kubectl pushes, what about X

Proposal 1: Make helm an optional toolchain target that you opt into somehow.
Complications: Tillerless helm (v3) isn't out yet, forcing us to run helm and tiller locally. Do you know where the experimental v3 binary is located now?
Benefits: Helm can manage upgrading a stack, touching only the parts it needs to touch. It allows for patching jobs to be run between upgrades of a resource. Removal of Tiller removes the need for a "sudo" account in k8s, as well as Tiller's handling of the final "rendering".
Questions: Do we build helm from source or reference a static binary? Tiller accepts traffic on a port; will this be a problem for our bazel run sandbox? Do we map the .create, .apply, .delete, .describe targets to relevant commands for both kubectl and helm? Composing helm-supervised objects into a single k8s_objects is complicated.

Proposal 2: Make a helm_rules repo that has a leakier abstraction layer (values.yaml, subcharts...)
Benefits: We can punt this off to some helm folks to maintain.

Proposal 3: Swap out kubectl with kubecfg
Benefits: We can now consume jsonnet directly in addition to json and yaml. kubecfg gives a best effort on "figuring out" dependencies and pushing them in order.
Questions: Importing libraries, stamping, and external variables are not as rich as in rules_jsonnet, which would make for a conflicting jsonnet compilation step. Does kubecfg handle a pile of rendered JSON and do proper ordering?

My feelings so far:
Helm's Go-based templating language, and its overlap with the pushing tool, make me feel that a dedicated repo is the right direction for it.

rules_jsonnet has advanced Bazel rules for handling jsonnet, while kubectl doesn't. Simply wrapping it doesn't feel much different from piping the jsonnet_to_json output into a set of k8s_object targets. However, it does remove the need to define a k8s_object for every new resource. The drawback of doing that is that you can't act on individual resources (i.e. updating a PromRules object without the overhead of parsing/checking the entire stack). Allowing one to route some resources into explicit build targets that get their own .create, .apply, etc., as well as allowing composite sets of files to be updated, feels like the right API.

Why can't we do a .apply to a composite file?

We should define an API for a general k8s_stack. Its dependencies would have their appropriate tool take over and perform the action appropriate for that tool: create, upgrade, delete, describe, diff.
The native k8s_objects would be a k8s_stack that blindly applies kubecfg to all the files inside. When a k8s_stack tells a dependency to upgrade, a helm_stack would map that to upgrade, while a k8s_objects would try to call apply on the pile of files. The cluster/context would need to be passed down. (BTW, I've confirmed that a deployment with no changes isn't restarted; I've wanted the .diff command to figure out why ones with image overrides are.)
This would allow you to make a handful of helm_stack targets dependencies of a main server stack, which is then a dependency of some other server stack, and so forth up to some frontend stack. Running .create on the frontend would start with .create of the lowest stack and, after those commands finish, call .create on the next level up. Either stamping or jsonnet would allow an entire stack to be spun up for a developer in their own namespace or cluster.

In the end, the action items are to:

  1. Invite some helm leads to imitate this repo
  2. Swap out kubectl for kubecfg
  3. Make a k8s_stack rule that has dependencies and calls one of the 5 commands on each dependency sequentially.

@borg286

borg286 commented Jan 10, 2019

After some further thought, I shouldn't rely on the ordering of the dependency list to dictate the order of pushes. Instead I should either explicitly have the user provide some ordering or do it in some other way. In the end I was shooting for something like a poor-man's workflow engine. The best thing to do now is simply to have a deps list where each dependency is asked to create/update, and then the objects in that stack are asked to create/update.

I've been thinking about pulling in ksonnet and doing the jsonnet_to_json inside of rules_k8s in some new rule and I'm actually liking it more now.
