
Resource Groups #210

Closed
cppforlife opened this issue Nov 23, 2015 · 15 comments

Comments

@cppforlife

Problem

The BOSH core team runs a pipeline to produce a bosh release and bosh cli gem combination. It also tests them against a specific version of the bats git repo.

The BOSH cpi team runs a pipeline that expects to consume that bosh release, cli gem, and bats repo combination so that it can test it all against a specific version of a cpi release. Ultimately this pipeline publishes a new cpi release referencing the other asset versions.

We would like to capture a group of resource versions, instead of just individual resource versions, so that the cpi team's pipeline can run against a specific set of resources that has passed the core team's pipeline.

Other examples:

  • concourse release + garden release combination
  • cf-release + diego release + cats src combination
  • pieces of cf-release

Proposal

I think concourse should introduce a new primitive for specifying resource groups. Resource groups would be versioned just like any other resource. Builds would be able to produce a resource group version and consume a specific resource group version.

In the UI this could look like a box drawn around multiple resources to indicate that they come as a whole.

  +--------------+
  | +----------+ |   +-------+
  | | bats-src |-|---|       |
  | +----------+ |   |       |
  | +----------+ |   |       |
  | | cli-ver  |-|---| tests |
  | +----------+ |   |       |
  | +----------+ |   |       |
  | | bosh-rel |-|---|       |
  | +----------+ |   +-------+
  +--------------+

Resources mentioned in a resource group are no different from any other resources. They will be configured alongside all the other resources and have exactly the same semantics.

The backing store for resource groups could work similarly to any resource's backing store. For example, a concourse release can include an s3_resource_group that stores and retrieves files from a bucket just like s3_resource.

The contents of each resource group will include only the versions of its member resources, so that storage/retrieval is delegated to each resource. For example, the contents of v1 of the core-assets resource group would be:

candidate-core-assets-bucket/core-asset-1.json

(JSON content, shown here as YAML)

bats-src:
  ref: 329ryfvbqeg83490tk549h39q40
cli-ver:
  version: 1.34.0
bosh-rel:
  version: 223
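
Since the backing file is JSON (core-asset-1.json above), the literal stored contents would look something like:

```json
{
  "bats-src": { "ref": "329ryfvbqeg83490tk549h39q40" },
  "cli-ver": { "version": "1.34.0" },
  "bosh-rel": { "version": "223" }
}
```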

Possible configuration in the pipeline:

resources:
- name: bosh-init
  type: s3
  source:
    regexp: bosh-init-([0-9.]+)-linux-amd64
    bucket: {{init_bucket}}
- name: bats
  type: git
  source:
    uri: https://github.com/cloudfoundry/bosh-acceptance-tests.git
    branch: master
- name: bosh-release
  type: bosh-io-release
  source:
    repository: cloudfoundry/bosh

resource_groups:
- name: core-assets
  type: s3
  resources: [bosh-init, bats, bosh-release]
  source:
    regexp: core-assets-([0-9.]+)
    bucket: {{combos_buckets}}

I'll skip details of how concourse will be able to fetch specific versions of group's resources and its api format.

@cppforlife
Author

@vito @xoebus let's talk more about this some time today/tomorrow.

/cc @Amit-PivotalLabs @calebamiles @mariash


@xoebus
Contributor

xoebus commented Nov 24, 2015

@dliebreich What do you think of this instead of Airfreight?

@dliebreich

I'm not sure how the example yml and json describe the solution. Otherwise, conceptually, it seems like it would meet our needs.

One problem I can foresee is that you need to have the component resource configs match in the producer of the group and in the consumer of the group. If different teams maintained the producer and the consumer, that coordination might be somewhat expensive.

@amitkgupta
Contributor

For the purposes of integrating releases for Cloud Foundry (and this generalizes to any BOSH-deployed service composed of many releases), we require:

  • some builds can produce an artifact recording the tuple of input releases that passed the build, and are therefore "blessed"
  • some builds can consume said artifact and generate a BOSH deployment manifest consisting of all or some of the blessed releases
  • end users (human operators) can also consume said artifact and produce a manifest

We accomplish this like so:

https://github.com/cloudfoundry/cf-deployment/blob/master/blessed_versions.json

We maintain our own code for writing and reading this artifact because it needs to be used (at least read) outside the context of Concourse. We also specifically do not want to fetch the resources referenced in that blessed_versions.json file, because we use the data to construct a manifest where we can then delegate to the Director to fetch only the resources that it needs, and do it all Director-side to avoid unnecessary download-then-uploads on the client side.

@dliebreich

+1 on not fetching all the resources in a group. We would group the ops manager images and stemcells for each of the 4 IaaSs with the .pivotal file, but we would only want the one stemcell and one ops manager for the IaaS we are testing in that pipeline.

@vito
Member

vito commented Nov 26, 2015

Here's a more fleshed out example:

resources:
- name: aws-stemcell
  type: bosh-io-stemcell
  source: {...}

- name: vsphere-stemcell
  type: bosh-io-stemcell
  source: {...}

- name: cf-release
  type: bosh-io-release
  source: {...}

- name: diego-release
  type: bosh-io-release
  source: {...}

resource_groups:
- name: diego-cf-compat
  type: s3
  source: {...}
  resources: [cf-release, diego-release]

- name: cf-stemcell-compat
  type: s3
  source: {...}
  resources: [cf-release, vsphere-stemcell, aws-stemcell]

jobs:
- name: something-with-cf-and-diego-on-aws
  plan:
  - aggregate:
    - get: cf-release
      groups: [diego-cf-compat, cf-stemcell-compat]
    - get: diego-release
      groups: [diego-cf-compat]
    - get: aws-stemcell
      groups: [cf-stemcell-compat]

So, you could pick and choose what parts of the tuple you actually want, and you're still dealing with regular old Concourse resources, rather than passing around giant "union tarballs" or something. If you wanted to just produce URLs for the director to upload that would be done on a resource-by-resource basis, i.e. a param telling the resource to just give you the URL and not download it (which a few already support).
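
The URL-only behavior mentioned above would be a per-resource param. As a sketch (the skip_download param name is an assumption here; resources differ in what they support and what they call it):

```yaml
- get: aws-stemcell
  groups: [cf-stemcell-compat]
  params:
    # Hypothetical param: write only url/version metadata, skip the download.
    skip_download: true
```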

Notes:

  • resource_groups configure where to find the "tuples" and what resources they map to. (May need a map here instead of a list, not sure what the actual contents of the resource would look like.)
  • You'll notice strong parallels between groups and passed on the get steps.
  • The resources themselves are configured and used as normal, making this a drop-in feature so you can reuse your tasks and much of your pipeline config.

@Amit-PivotalLabs
Contributor

For our use case, I'm not sure how much this buys us. We need to extract the URLs or whatever metadata is contained in the S3 object backing the resource_group, and use them to construct a manifest, outside the context of Concourse. And we would want to exercise this logic in our builds as well, so we wouldn't want one way of extracting the data in Concourse and a different way of extracting it for end users.


@xoebus xoebus changed the title resource groups [proposal] Resource Groups Dec 2, 2015
@xoebus
Contributor

xoebus commented Dec 2, 2015

Let's just store the information for this in Git. We're familiar with it now, and it's much easier for a human to look at diffs and the GitHub web interface than S3. I'm liking Alex's syntax proposal above.

@Amit-PivotalLabs I don't understand. Can you give an example of your workflow?

@xoebus
Contributor

xoebus commented Dec 2, 2015

@vito How would you put: to a resource group with your syntax?

@Amit-PivotalLabs
Contributor

@xoebus

  • Build consumes cf-release, etcd-release and aws-stemcell resource
  • Tests pass, then a "blessed-versions" file is checked into a git repo recording this group of resources passed together. The data included for each resource includes a version, URL, and SHA.
  • Tool consumes "blessed-versions" data to produce a BOSH manifest stub saying which versions of releases, and which stemcell, to use. It heavily leverages the new feature in BOSH where you can directly specify URLs for releases and stemcells in the manifest, and delegate to the CLI and director to make sure the right releases and stemcells end up on the director (no explicit bosh upload release/stemcell).

The key point here is that we don't need to get the resources; we sorta just need the data in the file backing the resource_group you're proposing. The feature you're describing does the opposite: it makes it possible to get a resource from a group (something we don't need), but not to get the group metadata itself (all we need).
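
A minimal sketch of the manifest stub such a tool could emit, using BOSH's support for specifying release URLs directly in the manifest (names, versions, URLs, and SHAs below are illustrative placeholders):

```yaml
releases:
- name: cf
  version: "226"
  url: https://bosh.io/d/github.com/cloudfoundry/cf-release?v=226
  sha1: 0123456789abcdef0123456789abcdef01234567
- name: etcd
  version: "36"
  url: https://bosh.io/d/github.com/cloudfoundry-incubator/etcd-release?v=36
  sha1: 76543210fedcba9876543210fedcba9876543210
```

The CLI and director then fetch the releases themselves, with no explicit bosh upload release on the client side.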

@vito
Member

vito commented Dec 9, 2015

@Amit-PivotalLabs

For the artifacts your team publishes at the end of the day, this won't be a silver bullet. For determining that set of outputs in the first place though, it could make your lives a bit easier. For example, if each team consistently reported the tuple of versions that worked together, mega's CI could just operate on the intersections for whatever deployment configurations you're concerned with. Disclaimer: no idea how you're determining these tuples today. If it's based on diego-cf-compatibility I'd expect that to be replaced with this resource groups feature instead. Not sure if other compatibility matrices currently exist.

That being said, it'd be pretty easy to write the function from the individual resources to your own format. Which seems like a useful thing generically. The point behind not having the tuple's format be the focal point is to have the pipeline usage just mirror the semantics of passed constraints. If we were passing around resource groups as their own objects you may end up with some tasks that deal with groups vs. some that deal with individual inputs; I'd rather have everything be the latter.

If all you need is metadata, it's up to the resources to support that, e.g. the S3 resource will give you /url and /version. I'd rather not optimize for avoiding downloading since caching should help out, and some resources support that via params anyway (or could have issues opened to support it, as in S3's case).
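
For instance, assuming the fetched resource directory contains url and version metadata files (as the S3 resource writes alongside the object), a task could consume just the metadata without reading the object itself. A sketch of such a task config:

```yaml
platform: linux
inputs:
- name: cf-release
run:
  path: sh
  args:
  - -ec
  - |
    # Read only the metadata files the resource wrote; ignore the object.
    echo "url:     $(cat cf-release/url)"
    echo "version: $(cat cf-release/version)"
```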

@jchesterpivotal
Contributor

We discussed this problem on an unsuccessful hackday effort, calling it "correlated resources".

If you think of a Concourse pipeline as a function, right now it doesn't allow you to control all of the parameters at once. You can vary each individually, once, but backing up to try permutations is difficult (does every or version pinning do this? I'm not sure).

We're interested in this problem because we want to develop an "approvals" resource -- a resource representing human approval of versions of other resources. So if I have a git SHA 1abc234, I might also want an approval for that SHA. The approval is distinct from the SHA. In fact we'd like it to be independent of what the other resource's type is -- approvals should be assignable to any other resource version: SHAs, Tracker stories, S3 documents, BOSH images. Anything.

To achieve this, there needs to be a way to tell Concourse to treat correlated resources as a single unit.
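
Under the resource groups proposal above, that correlation might be expressed by grouping an approval resource with the resource it blesses (the approval resource type here is hypothetical):

```yaml
resources:
- name: app-src
  type: git
  source: {...}
- name: app-approval
  type: approval  # hypothetical resource type representing human sign-off
  source: {...}

resource_groups:
- name: approved-src
  type: git
  source: {...}
  resources: [app-src, app-approval]
```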

This also feeds into a larger discussion, which is inverting the UI emphasis from job to resource. The UI shows the state of the world as the integral of the stream of all resources coming into the system. However, what is interesting is the differential -- what the pipeline looks like for this particular set of resources. This data is available but requires sleuthing to reconstruct.

@vito
Member

vito commented May 11, 2016

/cc @kimeberz for the differential UI - I've thought the same, and want to redo the resources page to be a lot more useful for data ingestion (including potentially having a page for viewing a particular version)

cdutra pushed a commit that referenced this issue Aug 4, 2016
@vito
Member

vito commented Oct 6, 2016

I think we're headed towards #684 as a general solution that would allow for this kind of workflow. Closing this to consolidate.

@vito vito closed this as completed Oct 6, 2016