Skip to content
This repository has been archived by the owner on Dec 21, 2019. It is now read-only.

Support external resources via HTTP as includes #53

Closed
jordanjennings opened this issue Jun 1, 2017 · 11 comments
Closed

Support external resources via HTTP as includes #53

jordanjennings opened this issue Jun 1, 2017 · 11 comments
Milestone

Comments

@jordanjennings
Copy link

jordanjennings commented Jun 1, 2017

It would be really nice to have the option to include external resources via HTTP so that I could have templates stored somewhere different than the kontemplate file.

Use case 1

I create a microservice deployment template that I want multiple projects to be able to use. Each project then has one or more kontemplate files that refer to the common deployment template.

Each project only has kontemplate files like:

  • dev.yml:
---
context: myservice-dev
include:
- name: https://raw.githubusercontent.com/standard/template/master/service.yaml
  values:
    serviceName: myservice
    servicePort: 8080
    ingressDomain: myservice.dev.mydomain.com
  • prod.yml:
---
context: myservice
include:
- name: https://raw.githubusercontent.com/standard/template/master/service.yaml
  values:
    serviceName: myservice
    servicePort: 8080
    ingressDomain: myservice.mydomain.com

Use case 2

Using kontemplate for cluster standup I have a collection of addons that I want configured per cluster. Some addon definitions will be locally managed, some will be shared (and may not even need customization for installation).

dev-cluster.yml:

---
context: k8s.dev.mydomain.com
global:
  clusterDomain: k8s.dev.mydomain.com
include:
- name: addons/ingress
  values:
    name: nginx-ingress
    replicas: 2
    ingressClass: nginx
    ingressDomain: 'k8s.dev.mydomain.com'
- name: addons/ingress
  values:
    name: nginx-ingress-internal
    replicas: 2
    ingressClass: nginx-internal
    ingressDomain: 'internal.k8s.dev.mydomain.com'
- name: addons/logging
  values:
    kibanaIngressClass: nginx-internal
    kibanaDomain: 'kibana.internal.k8s.dev.mydomain.com'
- name: https://raw.githubusercontent.com/kubernetes/kops/master/addons/kubernetes-dashboard/v1.6.0.yaml

prod-cluster.yml:

---
context: k8s.mydomain.com
global:
  clusterDomain: k8s.mydomain.com
include:
- name: addons/ingress
  values:
    name: nginx-ingress
    replicas: 5
    ingressClass: nginx
    ingressDomain: 'k8s.mydomain.com'
- name: addons/ingress
  values:
    name: nginx-ingress-internal
    replicas: 3
    ingressClass: nginx-internal
    ingressDomain: 'internal.k8s.mydomain.com'
- name: addons/logging
  values:
    kibanaIngressClass: nginx-internal
    kibanaDomain: 'kibana.internal.k8s.mydomain.com'
- name: https://raw.githubusercontent.com/kubernetes/kops/master/addons/kubernetes-dashboard/v1.6.0.yaml

Yes for the second use case I could just directly run kubectl apply -f with the URL, but keeping it all in one place would be very convenient and make me happy :)

@tazjin
Copy link
Owner

tazjin commented Jun 1, 2017

Thanks for filing this! We chatted briefly about this on Slack and I like the idea, but here are my current "worries":

  • Remote resources can disappear or be altered. Maybe the user intends for that to happen, maybe not. What could kontemplate do in this case? Provide some sort of hash field (with a potential ignore value?). Thoughts by the hashing king, @heavenlyhash?

  • kontemplate attempts to deal with a concept of resource sets, which in practical terms is a folder on disk with resource files (JSON or YAML) and an optional default.json file. Should including resources from somewhere else support these too, and if yes, how?

  • How should this be presented to the user? Just overloading the name field feels wrong.

I don't want to add features that increase complexity and user friction (afterall this isn't supposed to be Helm ...), so it's important to me to solve these questions before proceeding with any implementation (writing this is trivial, the design is hard!).

@jordanjennings
Copy link
Author

Great concerns, here are my thoughts at the moment:

  • If a resource is unavailable then I would expect kontemplate to stop immediately and print a message saying the resource couldn't be found, ideally it would stop before doing any work if it was an apply/create/delete. You could add a "required: true | false" flag to be really flexible, but I would assume if someone is referring to a resource they expect it to be there, and I personally would not need such a flag.

  • For simplicity, a remote resource would not be a collection of things, it would have to be a single resource. I don't care much about the default.json support here, but maybe kontemplate could just try to pull down a default.json from the same folder as the single remote resource. So if the remote resource was at http://domain.com/resources/template.yml then kontemplate would look for http://domain.com/resources/default.json?

  • I actually think the name field feels a little "wrong" already, because the name isn't really a name it's actually the location of the resources relative to the kontemplate file. Since it's a location already, it doesn't feel much more "wrong" to put a remote resource location in there as well. However, maybe a new key would make sense to call out single resource vs resource sets. In my opinion re-using name would be the least friction option from an end-user perspective, but I understand the concern.

@tazjin
Copy link
Owner

tazjin commented Jun 1, 2017

Here are some preliminary UI considerations, consider this me thinking loud:

# Remote resources UI mockup
---
context: some-cluster
include:
  # My API service is a local resource set:
  - name: some-api-service
    values:
      version: '1.2-7ab326c'
  # but its database is a remote resource set:
  - name: postgres
    # The location field will always opaquely exist for normal resource sets,
    # too. This means that it is by default derived from the 'name' field but
    # can be overridden and set to a different local or remote path.
    # Remote paths support HTTP(S) for now.
    location: https://raw.githubusercontent.com/tazjin/stuff/postgres.yaml

    # I'd prefer if remote resource sets did not change unexpectedly, so the
    # optional 'hash' field can be used to provide the sha256 hash of the file
    # specified above.
    # If this field is not set a warning will be emitted during the run that
    # informs the user about the hash of the file that was retrieved and
    # suggests pinning the hash.
    # If the hashes don't match, kontemplate errors on runs.
    hash: '969a969d4e57f352a4a1b4e8e5788033fbecf84c7d2c9c5652b0462e369d5372'

    # Values for the remote resource set are set as normal
    values:
      version: 9.6
      storageSize: 100Gi

    # That's basic functionality but the big remaining question is how to deal
    # with defaults. @jordanjennings suggested simply attempting to fetch the
    # default.[json|yaml] from the same location as the resource itself, but
    # there is no guarantee that the resource location is a URL containing a
    # filename.
    # One other idea is to make 'location' an array (potentially with promotion
    # of single values for syntactic convenience):
    location:
      - https://raw.githubusercontent.com/tazjin/stuff/postgres.yaml
      - https://raw.githubusercontent.com/tazjin/stuff/default.json
    # But what happens to hashes then? This can too easily end up with a complex
    # type in this array.

    # Another feature that falls out of this potentially is using absolute local
    # paths:
    location: /etc/kontemplate/resource-sets/postgres

I want to avoid making "package formats" with metadata-manifests if at all possible, but I also don't want remote resources to have diminished capabilities.

Probably going to need 2-3 days to form a decision on this ...

@tazjin
Copy link
Owner

tazjin commented Jun 1, 2017

Some of these things can be solved with convention, by saying remote resource sets must have a main.yaml and may have a default.[json|yaml] and that the location specifies a path at which those must be available.

User friction can be reduced by having helpful error messages:

1: Remote resource set is incomplete

If the main.yaml file is not found at $location (+ /) + main.yaml, then the
interaction is as follows:

$ grep -A1 postgres my-cluster.yaml
  - name: postgres
    location: https://www.google.com

$ kontemplate apply my-cluster.yaml
Loading resources for some-api-service
Loading resources for postgres
ERROR: The resource set 'postgres' has been included from a remote location at
'https://www.google.com', however the expected file 'https://www.google.com/main.yaml`
could not be found.
Remote resource sets are expected to contain a `main.yaml` file and optionally a
`default.json`.

Something something YAML & JSON support. *furious handwaving*

2: Remote resource set has no hash

Hashes are optional, but preferred. If they are not present a warning will be
shown to users (maybe with colour!).

Hashes are calculated like such:

// Furious pseudocoding in typed language!
Bytes mainYaml = getFileOptional(location, "main.yaml").getOrElseError()
Option<Bytes> defaultJson = getFileOptional(location, "default.json")
String hash = sha256sum(mainYaml + defaultJson.getOrElse(Bytes.EMPTY))

And checked after fetching the remote files.

If no hash is present this happens (text could be better...):

$ grep -A1 postgres my-cluster.yaml
  - name: postgres
    location: https://raw.githubusercontent.com/tazjin/resource-sets/postgres

$ kontemplate apply my-cluster.yaml
Loading resources for some-api-service
Loading resources for postgres
WARNING: Resource set 'postgres' is remote, but no hash is currently set. The
owner of the remote location could potentially change the resource files without
your knowledge.
To pin the current version of the resource set add 'hash: 969a969d4e57f352a4a1b4e8e5788033fbecf84c7d2c9c5652b0462e369d5372' below the 'location' configuration for 'postgres'.
You may suppress this warning by adding 'hash: ignore' instead.

Step 3: Hash mismatch

If the user is a good user and has put nice hashes in there, but the remote
changes them we have a situation that causes an error:

$ grep -A1 postgres my-cluster.yaml
  - name: postgres
    location: https://raw.githubusercontent.com/tazjin/resource-sets/postgres
    hash: 969a969d4e57f352a4a1b4e8e5788033fbecf84c7d2c9c5652b0462e369d5372

$ kontemplate apply my-cluster.yaml
Loading resources for some-api-service
Loading resources for postgres
ERROR: Expected hash for remote resource set 'postgres' did not match.
Got unexpected hash '33495e96e55c2fe59e1ba6d75a3cfa5ca36719ac1e38155d245908e0b720ae75'.
Please check whether the updated resource is the one you want and update
'my-cluster.yaml' accordingly.

"Emergent" features

  • Assuming a lot of people link to Github: Pressing yy while looking at files on Github creates a full-hash URL to the file, this can be used to provide git-commit pinning.

  • HTTP basic auth probably works with the native Go client's URL parser? (As soon as this feature goes in people will ask for authentication support)

@jordanjennings
Copy link
Author

jordanjennings commented Jun 1, 2017

I was typing a long reply and you basically mentioned everything I was typing already in your latest update :)

Yes, HTTP basic auth would be something people would definitely request.

And yes, most git hosts (GitHub, Bitbucket, and Gitlab at least) support linking to a specific commit which would be guaranteed to be stable, since the commit ID is already a hash of the contents.

I would very much prefer NOT to see an enforced naming convention for remote resources. If the convention just uses a single file anyway, I think it would be much nicer to just allow the user to specify the location of both the resource and optionally the defaults.

context: some-cluster
include:
  # My API service is a local resource set:
  - name: some-api-service
    values:
      version: '1.2-7ab326c'
  # but its database is a remote resource set:
  - name: postgres
    location: https://raw.githubusercontent.com/tazjin/stuff/postgres.yaml
    defaultValues: https://raw.githubusercontent.com/tazjin/default/defaults.yaml

Perhaps by convention if the URL does not end with .yaml, .yml, or .json then you could check for a main.yaml/json and default.yaml/json relative to the location?

@tazjin
Copy link
Owner

tazjin commented Jun 1, 2017

I was typing a long reply and you basically mentioned everything I was typing already in your latest update :)

Great minds etc.! ;-)

I would very much prefer NOT to see an enforced naming convention for remote resources

Do you think it's feasible to enforce that the remote locations have to end in sensible filenames? In that case going for the location :: List<URL> solution would treat the files as if they were in a single folder and apply the normal logic (with respect to defaults and such).


I spoke briefly to @heavenlyhash and he brought up that kontemplate "suddenly" doing network I/O should require explicit opt-in, a la having a --enable-external flag. I'm not sure I agree with that because I think that specifying remote locations is already an opt-in, but for completeness sake:

00:18:41 <M-hash> and 2) ok, if you do do this... i'd say ideally it should be an opt-in flag, or at least, it has to have an opt-out flag or env var.  Running a templating tool on local text files should not make arbitrary noise on the internet, unless I know what I'm doing and I asked for that behavior.
00:18:42 <M-hash> (IMO, you could argue that putting a url in the files at all counts as opting in, but I'd say there's a practical hair to spit here: someone writing the script that runs this templater in CI should know whether or not its going to go nuts, and the someone(s) committing to this repo are different people and should maybe not be able to make the CI server dial various internet addresses...)
00:19:04 <tazjin> well yes you'd be asking for that behaviour by saying `location: http://something`
00:19:59 <M-hash> IMO an appropriate metaphor here is guns having both triggers and a safety latch
00:20:36 <M-hash> making a simple text templating tool into a request-shit-from-the-interweb tool should require flipping a safety latch to "off"
00:20:47 <M-hash> in addition to actually pulling the trigger

@jordanjennings
Copy link
Author

The concern about networking calls is certainly valid, but I would agree that specifying remote locations is essentially an opt-in.

And yes, enforcing that files end with sensible extensions like .yaml, .yml, or .json seems very reasonable to me.

@tazjin
Copy link
Owner

tazjin commented Jun 2, 2017

I'd rather have a more aggressive --simple (or some such) flag that also disables template functions that do IO (e.g. passLookup or #45, #46)

@tazjin
Copy link
Owner

tazjin commented Jun 11, 2017

I don't think this will make it into 1.1.0 which I'm planning to get out soon for the fix of #51

@tazjin tazjin modified the milestone: Version 1.2.0 Jun 12, 2017
@rlees85
Copy link

rlees85 commented Jan 8, 2018

This would be an awesome feature, making codebases much more modular (similar to Terraform & git source modules).

@tazjin
Copy link
Owner

tazjin commented Dec 20, 2019

I'm closing this issue as I now think that using tools like Nix is a much better way of achieving this sort of modularity.

Please note that the source of kontemplate is moving to git.tazj.in and issue tracking / patches will be available on depot@tazj.in.

@tazjin tazjin closed this as completed Dec 20, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants