RFC: Resources v2 #1

vito · 2018-04-10T20:46:55Z

Status: ~80% figured out. The remaining 20% concerns performance/architectural implications that we should be sure to understand before going forward.

This is a very large proposal, heavily influenced by concourse/concourse#1707 and with a lot of prior planning in concourse/concourse#534. The discussion in 534 was growing stale, which was the catalyst for this new RFC format. Please leave feedback on individual lines instead of commenting directly so that the number of open discussion threads dwindles down as the proposal stabilizes and improves.

TODOs:

RFC: Resources v2 #1 (comment)

01-resources-v2/proposal.md

cwlbraa

I'm looking forward to building against this. I haven't totally wrapped my head around spaces and I really liked the original proposal as it was, so I hope we can get this back to that same level of confidence and clarity.

One thing that I'd like to put out there explicitly is that even the tiniest interface ergonomics improvements can make a huge difference to first-time resource implementers. Many small things like not being able to log to stdout, having different command line parameters for each binary, in==get, out==put, understanding multiline yaml scalar => json conversions, multiple sources of configuration, etc. can make the current interface feel like death by a thousand cuts when you're first wrapping your head around it. The original proposal addressed those sorts of issues more directly, and though this one still does, I hope we don't lose sight of the small things.

01-resources-v2/proposal.md

+
+It may be the case that most resources cannot easily support `destroy`. One
+example is the `git` resource. It doesn't really make sense to `destroy` a
+commit. Even if it did (`push -f`?), it's a kind of weird workflow to support


01-resources-v2/commit-status.yml

+  plan:
+  - get: atc-pr
+    trigger: true
+    spaces: all


01-resources-v2/proposal.md

* removed the `destroy` action. instead, `put` returns a set of versions that were deleted, alongside the set of versions that were created.
* removed the `spaces` action. instead, `check` now runs in batch across all spaces.
* `check` now returns *all versions*, both on the initial call and if the given versions are no longer present (push -f). this will be a problem for resources with lots of history - maybe either streaming or pagination would help in the future?
* space identifiers are back to normal strings, rather than JSON objects. i'm not 100% convinced of this but it definitely makes it easier to use them as keys in maps, and should be cleaner in the UI and in YAML.

itsdalmo

Looks really good in my opinion! My Ruby is not so strong, but most of what I saw was as expected from the proposal - except for the check request which I also left a comment about. I also left a "renaming" comment in there for your consideration. Might be too wild an idea, but I left it there anyhow ;)

Other than that I just have two general things:

- Webhook artifact is on my wishlist. Not sure how hard it would be to forward the webhook payload to a resource, but I really think it is something that could help us build next level resources.
- I'm not sure why partial implementations (understood as not implementing all interfaces) for resources is considered a bad thing? Added an issue around this question here.

vito · 2018-06-12T20:47:39Z

I'm going to cut the scope of this RFC down a bit by removing all the stuff concerned with notifications. I think there's a lot more to explore there and it warrants its own RFC. I'd rather wrap up the new artifacts interface sooner so that we can get support for spaces out the door.

I also just realized we haven't at all discussed the idea of schema validation. I'm going to cut that out, too - we should be able to add that in later, again as a separate RFC.

vito · 2019-03-08T19:47:43Z

001-resources-v2/proposal.md

+
+If the requested version is unavailable, the command should exit nonzero.
+
+No response is expected.


After discussing with @clarafu we think it'd actually be better if get still returned a response, and that response included metadata.

This makes get, check, and put all have a response, and lets the resource author decide where metadata should be discovered. Now the difference between v1 and v2 would be that with v2 check can also return metadata - an additive change, rather than requiring all metadata come from check.

This way if there are situations where it's too expensive to collect all the metadata at once, it can be deferred until later get and put calls. Any metadata returned by get and put would just update the version's metadata.

This also makes the implementation in the ATC a bit cleaner because we don't have to worry about a scenario where e.g. a get is running with a version that has never had a check run and so there's no metadata to show to the user in the UI. This normally won't happen but is theoretically possible via the builds API, where you can submit a build that fetches an arbitrary version. This also makes v1 and v2 more similar to each other, minimizing conditional code paths in the implementation.
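To make the "update the version's metadata" idea concrete, here is a minimal Python sketch. It assumes metadata is a list of name/value fields (the shape used by the v1 interface) and that a field returned later replaces an earlier field with the same name. That merge rule and the helper name `update_metadata` are assumptions for illustration, not something the proposal specifies.

```python
from typing import Dict, List

Field = Dict[str, str]


def update_metadata(stored: List[Field], returned: List[Field]) -> List[Field]:
    """Merge metadata returned by a later get/put into what check stored.

    Assumption: fields are keyed by "name", and a later value wins.
    """
    merged = {field["name"]: field["value"] for field in stored}
    for field in returned:
        merged[field["name"]] = field["value"]
    return [{"name": name, "value": value} for name, value in merged.items()]


# check recorded cheap metadata; a later get fills in the expensive part.
from_check = [{"name": "author", "value": "vito"}]
from_get = [{"name": "message", "value": "fix typos"}]
version_metadata = update_metadata(from_check, from_get)
```

With this shape, a version that has never been checked simply starts from an empty list, so a get submitted through the builds API still ends up with metadata to show.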

marco-m · 2019-03-08T20:38:25Z

001-resources-v2/proposal.md

+  "resources" concept, but with a more specific name.
+
+* There are no more hardcoded paths (`/opt/resource/X`) - instead there's the
+  single `info` entrypoint, which is run in the container's working directory.


info semantically implies read-only. If it is also going to replace out/put, maybe a more adequate name could be command or similar.

vito · 2019-03-09

Confusing wording - it's read-only; I meant entrypoint from a discovery standpoint (invoke info to learn the later commands to run), not an execution standpoint. Will amend.

marco-m · 2019-03-08T20:42:01Z

001-resources-v2/proposal.md

+  rather than taking the path as an argument. This was something people would
+  trip up on when implementing a resource.
+
+* Change `check` and `put` to write its JSON response to a specified file,


its -> their
response -> responses

marco-m · 2019-03-08T20:42:55Z

001-resources-v2/proposal.md

+  trip up on when implementing a resource.
+
+* Change `check` and `put` to write its JSON response to a specified file,
+  rather than `stdout`, so that we don't have to be attached to process its


so that we don't have to be attached

this is unclear. Who is attached to what?

jchesterpivotal · 2019-03-08T21:17:24Z

A question I've thought of is whether resources could add a provision capability as well. There are some resources where the remote service is ephemeral or at least unimportant to humans. Examples would be registering a webhook with GitHub, creating randomly-named blobstore directories, or creating and binding databases. I think OSBAPI might be able to serve in this role.

Arguing against: any provision could instead be modeled as a put to a provisioning resource. Existing examples include the bosh-deployment resource, which essentially provisions systems intended for further usage.

The inspiration for this thought bubble is watching over the shoulder of Knative Eventing. In Concourse terms, I think Knative Eventing somewhat mixes together the concerns of provisioning an outside service with configuration of source/params.

vito · 2019-03-28T20:00:16Z

I've been doing a lot of thinking about this lately, and now I'm worried that baking a high-level concept like 'spaces' into the resource interface is the wrong direction. I think I've got a better idea that simplifies things down to a single resource interface, a lot like what we have today - and a plan to use this single interface to implement spaces, triggers (e.g. the time resource), notifications (e.g. Slack, GitHub status checks), and versioned artifacts (today's primary use of the interface).

There's a lot to go over here - I'm going to try and provide as much context as possible behind my thought process. I expect at least some of this to be confusing or unclear. Sorry in advance for the word vomit. 😅 Feel free to ask questions to clarify - I plan to open a new RFC as it's a significant enough departure from this proposal, so we might as well just open the discussion up here until then.

Triggers and the thundering herd problem

With concourse/concourse#2386 we formalized how resource configs and their versions are stored abstractly, but this introduced a problem: now the time resource would cause a thundering herd, as everyone who said `interval: 10m` would get the same history, so all their jobs would fire at once across the cluster. We fixed this by introducing 'resource scopes', which allow resources to have their own version history exclusive to their pipeline resource definition.

Resource scopes proved challenging to implement and introduce a level of overhead that is only exacerbated by the addition of resource spaces to the model, as they both try to achieve similar things. This led us to try to rethink the time resource as an example of inbound notifications, or "triggers". The hope was by switching time to a different concept and interface we would get rid of the problem of 'shared version history' as we would just stop treating it like an artifact. Context: concourse/concourse#3585 and concourse/concourse#3595

But then I thought maybe we're over-complicating things by introducing a whole new interface. If you look at the time resource's implementation, what it's doing pretty much makes sense in isolation. It's just that we don't want to centralize the history - in fact we don't even really want to store the history. Maybe we just need a different way to use resources in a pipeline?

Coincidentally, @chenbh opened concourse/concourse#3572 around the same time we were thinking about this. What's interesting is here they still want to use a resource, but only as a trigger, and they don't want to fetch the bits. Sounds familiar! 🤔

That led me to think maybe the interface is fine, and we just need different ways to use resources in a pipeline. Maybe everything is just "resources" in the sense that they're all the inputs to your pipeline and still represent all external state, but maybe we just need to allow your pipeline to have different relationships with this state? Maybe not everything is a versioned artifact?

It's worth noting briefly that the thundering herd problem still exists within a pipeline today, not just with globally shared history. Having all the jobs that point to a time trigger all fire at once doesn't seem great.

So what if you could associate a resource to a job, and have its checks always be relative to the last-used trigger "version" that the job used? That way each job naturally has its own interval, and we're still just using the time resource as it is. The thundering herd problem is then solved at all levels, without the need for a new interface.
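The per-job idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `check_for_job` is a hypothetical helper, the `{"time": ...}` version shape is made up for the example, and the real time resource is not implemented this way.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def check_for_job(interval: timedelta,
                  last_used: Optional[datetime],
                  now: datetime) -> Optional[dict]:
    """Emit a new trigger "version" only if `interval` has elapsed since
    the version this particular job last used (hypothetical helper)."""
    if last_used is None or now - last_used >= interval:
        return {"time": now.isoformat()}
    return None


interval = timedelta(minutes=10)
now = datetime(2019, 3, 28, 20, 0, tzinfo=timezone.utc)

# Two jobs share one time resource but last fired at different moments,
# so only the job whose own interval has elapsed gets a new version.
job_a = check_for_job(interval, now - timedelta(minutes=12), now)
job_b = check_for_job(interval, now - timedelta(minutes=3), now)
```

Because the cursor is the job's own last-used version rather than a shared history, jobs drift apart naturally instead of all firing on the same tick.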

Spatial resources

If we go with the v2 RFC as-is, all resources implicitly have to care about spaces. What does this 'trigger-only' workflow even mean for spaces? ...Should spaces just be optional?

What if spaces weren't part of the interface at all? Just like we were able to implement triggers in terms of the resource interface, can we implement spaces in terms of it, too?

The meanings of check, get, and put map pretty well to the spaces themselves:

- check: returns a 'version' for each space. We would just never give it a cursor version, and do a set diff against the result. Users can configure whatever filters they want in config to control which spaces are returned.
- put: supports creation/deletion of spaces (i.e. git branches).
- get: returns whatever data is useful for a given space.

What's nice is this makes detecting spaces a lot clearer, and supports explicit creation/deletion of the spaces themselves. The current proposal puts all of this into check, which checks across all spaces and has to emit a reset event when one disappears. With this interface, the space is just no longer returned.
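The "never pass a cursor, then set-diff the result" step might look like the following Python sketch. The function name `diff_spaces` and the way fragments are keyed are assumptions made for illustration; the comment above does not prescribe how the diff is computed.

```python
from typing import Dict, List, Tuple

Fragment = Dict[str, str]


def diff_spaces(known: List[Fragment],
                checked: List[Fragment]) -> Tuple[List[Fragment], List[Fragment]]:
    """Compare previously known spaces with a fresh, cursor-less check.

    Returns (created, deleted). A space that is simply no longer returned
    shows up as deleted, with no explicit reset event needed.
    """
    def key(fragment: Fragment) -> tuple:
        # Dicts are not hashable, so use their sorted items as identity.
        return tuple(sorted(fragment.items()))

    known_by_key = {key(f): f for f in known}
    checked_by_key = {key(f): f for f in checked}
    created = [f for k, f in checked_by_key.items() if k not in known_by_key]
    deleted = [f for k, f in known_by_key.items() if k not in checked_by_key]
    return created, deleted


known = [{"branch": "foo"}, {"branch": "bar"}]
checked = [{"branch": "bar"}, {"branch": "baz"}]
created, deleted = diff_spaces(known, checked)
```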

So, how would we use this to support spaces over an artifact resource like git?

We take the "versions" returned by the space check and toss them in the config of the artifact resource, like so:

space /check:
    {"uri":"..."}
 -> {"branch":"foo"}
    {"branch":"bar"}

artifact /check:
    {"uri":"..."} + {"branch":"foo"} = {"uri":"...","branch":"foo"}
 -> {"ref":"abcdef"}
    {"ref":"deadbeef"}

In this case, the git resource already understands branch: as part of its source: config, so it doesn't even have to change! I think this is common to lots of existing resources, actually - the things they would have been changed to "space" over are actually things they already support being pointed to in their source: config.
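The composition shown above amounts to merging each fragment returned by the space check into the artifact resource's config. A minimal Python sketch follows; `compose_config` is a hypothetical name, and letting the fragment win on a key collision is an assumption, since the comment does not say how collisions would be handled.

```python
from typing import Dict, List

Config = Dict[str, str]


def compose_config(base: Config, fragment: Config) -> Config:
    """Return a new config: the static config plus one fragment.

    Assumption: on a key collision the fragment overrides the static value.
    """
    merged = dict(base)      # copy so the static config is never mutated
    merged.update(fragment)
    return merged


static_config = {"uri": "..."}
space_versions: List[Config] = [{"branch": "foo"}, {"branch": "bar"}]

# One artifact resource config per space returned by the space check.
artifact_configs = [compose_config(static_config, v) for v in space_versions]
```

Each entry in `artifact_configs` can then be handed to the artifact resource's check exactly as if the user had written `branch:` by hand.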

The interesting thing here is it uses the 'version' as a way to compose resources together. It actually feels pretty nice! Versions are already a public contract and are even shown to users directly in the UI.

What's really interesting about this is that by decoupling the space resource from the artifact resource, you could space over arbitrary things. At the end of the day you're just dynamically providing fields to source:. If you wanted to have a space for testing different kinds of arbitrary configuration, not just what the resource author predicted, you can do it by writing your own space resource.

This also removes the whole idea of a "default space". Now that resources don't have to implement spaces, it's not necessary to even think about it as a resource type author. Either your source: is statically configured or dynamically modified via a space resource - as an author you don't know or care.

Another thing that's great about this is that, internally, everything is just back to resource configs and their versions. No more scopes, spaces, etc. - just resource configs and their associated versions. In this relationship, one resource config just results in another resource config being created.

Notifications

An early iteration of this RFC actually introduced notifications as a new part of the v2 interface, and they could be tied to artifact resources. This was so that things like GitHub status checks could be sent back with an associated version, so the notification resource knows what commit to apply the status check to.

So, what if we don't add a new interface for notifications, and we just use the 'version' as a communication mechanism between resources, just like we did for spaces? What if you could point a notification resource at an artifact resource, and have Concourse automatically run the notification resource when the artifact resource is used in a build?

This would work by taking the version of the artifact resource and tossing it in the config when invoking the put action of the notification resource, like so:

{
  "config": {
    # arbitrary user-specified config
    "repo": "concourse/concourse",

    # taken from the monitored resource
    "ref": "abcdef"
  },

  # provided by concourse
  "build": {"id": 123, "status": "failed", "job_name": "some-job", "url": "https://..."}
}
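As a rough sketch of how such a request could be assembled, the Python below merges the user's config with the monitored resource's version and attaches build information. The function name `notification_put_request` is hypothetical, and the field values are taken from the example above; nothing here is a confirmed part of the interface.

```python
import json


def notification_put_request(user_config: dict,
                             artifact_version: dict,
                             build: dict) -> dict:
    """Build the put payload for a notification resource (illustrative only)."""
    config = dict(user_config)        # arbitrary user-specified config
    config.update(artifact_version)   # taken from the monitored resource
    return {"config": config, "build": build}  # build is provided by Concourse


request = notification_put_request(
    {"repo": "concourse/concourse"},
    {"ref": "abcdef"},
    {"id": 123, "status": "failed", "job_name": "some-job", "url": "https://..."},
)
print(json.dumps(request, indent=2))
```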

Overall proposal

I think we should go back to resources being a super simple and general check/get/put interface, and cut spaces out of the proposed interface. I think this RFC has run its course and I can open a new RFC with the simpler proposal. This new RFC would be a more incremental change on the existing interface, and it would still include things like version deletion and collecting all versions back in time. With spaces out of the picture we can finally make these changes without putting all our eggs in one basket.

The new proposal would introduce the general interface, and then describe how it would be interpreted to support 'versioned artifacts', as the interface is used for today, and also touch on the ideas around composability I covered above, ultimately so that the common interface isn't too tied to versioned artifacts. (For example, we may want to use a name other than 'version'.)

The challenging part of this approach is how we bubble these concepts up to users. The pipeline has to feel intuitive and not too repetitive. It shouldn't take a degree in YAML architecture to configure GitHub status checks. But the more I think about this, the more apparent the benefits become in terms of the mental model and internal structure. I think having a consistent model is paramount for Concourse's success (it's one of the founding principles).

Here are some key take-aways summarizing the general direction I think we should go:

- Resources are general interfaces for interacting with external state.
- Concourse pipelines define the relationship your pipeline has with this state.
  - versioning artifacts
  - emitting notifications
  - triggering jobs
  - fanning out across branches dynamically
  - configuring build matrixes
- Resources receive a single config and can emit config fragments (née versions).
- Resources can compose with each other by passing config fragments to each other.
  - Config fragments can feed into /check.
    - Space {uri} config -> {branch}...
    - Artifact {uri} config + space {branch} fragment -> {ref}...
    - Artifact {uri,branch} config + artifact {ref} fragment (of last seen version) -> {ref}...
    - Or: artifact {uri} config + space {branch} fragment + artifact {ref} fragment (of last seen version) -> {ref}...
  - Config fragments can feed into /get.
    - Artifact {uri,branch} config + artifact {ref} fragment = git clone {uri}; git checkout {ref}
  - Config fragments can feed into /put.
    - Notification {repo} config + artifact {ref} fragment = update GitHub status
- Add pipeline functionality for trigger-only resources as another RFC.
  - These are resources whose checks run relative to their associated job's last use of the notification.
- Add pipeline functionality for notifications as another RFC.
  - These are resources that are sidecar-ed onto another resource and follow its path through the pipeline.
- Add pipeline functionality for spaces as another RFC.
  - These are resources that are referenced in tandem with an artifact resource to run against each of the config fragments returned by the space. Something like {get: foo, spaces: bar}.
  - Spaces are a pipeline-only add-on feature that users opt in to by composing resources explicitly.
  - Now we don't have to worry about backwards-compatibility, because now the jump from v1 to v2 is easy, and spaces don't fundamentally change the resource model.

Critically, those last 3 RFCs can be done with both the v1 and v2 interfaces. It's not so coupled anymore. If you subtract spaces from resources v2, a lot of it is boring protocol changes, plus things like "put can create/delete many versions". That all is pretty independent of the other RFCs. It's a bit clearer with v2, of course, since v2 has just "config", but for v1 we can say that 'versions' (config fragments) are shoved into 'source' (config).

vito · 2019-05-10T17:38:40Z

Proposing we close this in favor of #24, which is still a work in progress but at least the resource interface part is pretty close. 😄 I'll open separate RFCs for spaces/triggering/notifications after the resource interface portion of #24 is done. (Right now they're all in there together.)

jchesterpivotal · 2019-05-10T18:47:06Z

Will the lengthy comment above somehow come across to the new proposal? It carries a lot of important context and food for thought.

vito · 2019-05-10T19:16:30Z

@jchesterpivotal The new proposals will encompass everything covered there, yes, though at the moment in a bit of a dry sense - the RFCs just outline the 'what' whereas my comment earlier was kind of a brain dump/train of thought that led to it. Is there any particular part you feel I should make sure to incorporate? Maybe they could just reference my comment for context-building? 🤔

jchesterpivotal · 2019-05-10T19:31:34Z

Perhaps as a discussion document?

I'm concerned that lots of folks will overlook a detailed discussion on a closed PR.

vito · 2019-05-13T19:00:53Z

@jchesterpivotal I've added a 'Previous Discussions' section to the new RFC: 0a805ec

vito · 2019-05-29T14:49:47Z

Closing in favor of #24. I was gonna wait on the post announcing it per the RFC resolution process but then realized folks can just close their own proposals without having to go through the resolution process. 😅 I'm writing a post for #24 anyway though and will mention the closing of this one.

vito added 3 commits April 3, 2018 17:31

add wip resources v2 proposal

320c4b3

more wip; currently stuck pondering notification api

b4b63ec

clean up proposal, add summary of changes

c0b3683

This was referenced Apr 10, 2018

New resource interface (+ add versioning) concourse/concourse#534

Closed

Do a full clone when tag filter is set and branch is not concourse/git-resource#176

Closed

Update to allow dynamic branch choices during 'out' concourse/git-resource#172

Closed

cwlbraa reviewed Apr 17, 2018

cwlbraa reviewed Apr 17, 2018

ringods mentioned this pull request May 16, 2018

Trigger job with custom parameters concourse/concourse#783

Closed

dprotaso reviewed May 17, 2018

itsdalmo reviewed May 24, 2018

itsdalmo reviewed May 24, 2018

itsdalmo reviewed May 24, 2018

itsdalmo reviewed May 24, 2018

vito mentioned this pull request May 29, 2018

Add mention to version being cache in the resource concourse/semver-resource#59

Closed

fix typos

8e853a5

itsdalmo reviewed May 29, 2018

vito mentioned this pull request Jun 4, 2018

Large amount of active connections on Wings GCP sql DB that degrade overall performance concourse/concourse#2254

Closed

vito added 2 commits June 4, 2018 19:32

fix inconsistency with 'get' action in example

5551f03

space is a string, not a JSON object

itsdalmo reviewed Jun 4, 2018

vito mentioned this pull request Jun 5, 2018

Create migration for new BTREE index on check_order column of version… vmware-archive/atc#232

Closed

vito added 4 commits June 12, 2018 16:49

remove notification stuff from proposal

251c645

also update examples to use simple string space identifiers, and clarify some of the change summaries

remove mention of schema validation

a5895f9

didn't get time to map this out - let's do it as a separate RFC

add has_latest, flesh out json api via Go structs

60b015c

also switch git example to rugged, for much more efficient checking across branches (no need to checkout tree, no need to shell out to git)

add semver example resource type + pipelines

3bc0009

vito commented Jun 22, 2018

vito mentioned this pull request Jun 22, 2018

Add Metadata to Check concourse/git-resource#193

Open

vito commented Jun 28, 2018

vito mentioned this pull request Feb 15, 2019

Add an option to skip implied get of put concourse/concourse#3299

Closed

vito mentioned this pull request Feb 23, 2019

Trigger on new docker tag concourse/docker-image-resource#217

Closed

clarafu mentioned this pull request Mar 1, 2019

Spaces and Resources V2 concourse/concourse#3413

Closed

vito commented Mar 8, 2019

marco-m reviewed Mar 8, 2019

clarafu mentioned this pull request Mar 25, 2019

Spaces and Resources V2 concourse/concourse#3584

Closed

This was referenced Mar 25, 2019

can set an icon on resources, and display it on pipeline and resource views concourse/concourse#3581

Merged

Add an option to skip fetching the resource in get steps concourse/concourse#3572

Closed

ddadlani mentioned this pull request Apr 25, 2019

Random "unexpected end of JSON input" on build tasks concourse/concourse#3791

Closed

itsdalmo mentioned this pull request May 6, 2019

Out-of-order commits on different PRs result in skipped builds telia-oss/github-pr-resource#26

Open

vito mentioned this pull request May 9, 2019

Generic skip get for the put step concourse/concourse#974

Closed

vito added the resolution/close label May 10, 2019

vito closed this May 29, 2019
