
New resource interface (+ add versioning) #534

Closed
vito opened this Issue Jul 13, 2016 · 54 comments

Comments

@vito
Member

vito commented Jul 13, 2016

Proposed changes

  1. Rename in and out scripts to get and put. These are confusingly named something different because a long time ago jobs just had _in_puts and _out_puts.
  2. Run ./check, ./get, ./put, and ./info (see next point) rather than /opt/resource/X. Having them in /opt is a bit Linux-specific and assumes containers have their own chroot, which is not the case on Darwin or Windows.
  3. Add an info script which prints the resource's API version, e.g. {"version":"2.0"}. This will start at 2.0. If ./info does not exist, we'll fall back to today's resource interface behavior. (A rough sketch of the proposed script I/O follows this list.)
  4. Change get to not emit the version or metadata. This was from when in didn't know what it was fetching until it was done. It should now be an error if it can't fetch the requested version. We'll also move all metadata collection to check (see point 8).
  5. Add a delete action, which looks like put (can be given files and params to determine what to delete) but returns the set of versions that it deleted. This is to support bulk deletes, e.g. to garbage collect intermediate artifacts after a final build is shipped.
  6. Remove the source and destination arguments passed to get and put - just make it the working directory.
  7. Give check an official scratch space, which is the current working directory. No real semantics change here, just consistency with get and put, and potential for future behavior like reusing the scratch space but not reusing check containers for extended periods. Note: we've introduced /scratch recently so this change may just mean making that the work-dir.
  8. Move all metadata emitting to check, so that it's always present. The original thought was that metadata collection may be expensive, but so far we haven't seen that to be the case.
  9. Change put to emit an array of versions, rather than just one, and without metadata. Technically the git resource may push many commits, so this is necessary to track them all as outputs of a build. This could also support batch creation. Notably, the versions emitted by put are ordered chronologically, so that the dependent get knows to fetch the latest version. We would save them internally, along with an association with the build, and rely on check to determine the final ordering and metadata, being careful to not leave gaps in the version history (i.e. other commits pushed slightly before the put registered the build's).
  10. Change put to write to a specified file, rather than stdout, so that we don't have to be attached to process its response. This is one of the few ways a build can error after the ATC reattaches (unexpected end of JSON). With it written to a file, we can just try to read the file when we re-attach after seeing that the process exited. This also frees up stdout/stderr for normal logging, which has been an occasional pitfall during resource development/debugging.
  11. Remove the distinction between source and params; resources will receive a single config. The distinction will remain in the pipeline. This makes it easier to implement a resource without planning ahead for interesting dynamic vs. static usage patterns, and will get more powerful with #684.
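
A rough, non-authoritative sketch of the proposed script I/O from points 3, 9, and 10 (all values here are illustrative, nothing is settled):

    # ./info prints the resource's interface version to stdout:
    {"version": "2.0"}

    # ./put writes its response to a file named by Concourse rather than stdout,
    # as an ordered array of versions (oldest first) with no metadata:
    [
      {"ref": "commit-sha-a"},
      {"ref": "commit-sha-b"}
    ]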

@vito vito changed the title from "Resource scripts should be called `get` and `put`, not `in` and `out`" to "New resource interface (+ add versioning)" Jul 14, 2016

@vito

Member Author

vito commented Jul 14, 2016

Tagging a bunch of resource authors who would probably be interested in where the new resource interface goes. Pardon the spam if you're not interested.

@jtarchie @jchesterpivotal @robdimsdale @xoebus @cppforlife @drnic @thewoolleyman @xtreme-gavin-enns @zachgersh @ljfranklin @drnic

@cppforlife


cppforlife commented Jul 14, 2016

info instead of version; version as a key in resp?


@vito

Member Author

vito commented Jul 14, 2016

Yeah that's probably better long-term.

./info:

{ "api_version": "1.0" }

Not sure if we'd bother with actual versioning or just have it contain a number.

@jchesterpivotal

Contributor

jchesterpivotal commented Jul 14, 2016

Does api_version refer to the API of the resource or of Concourse?

@vito

Member Author

vito commented Jul 14, 2016

@jchesterpivotal the version of the resource interface that the resource conforms to. it'd be its own thing and probably not change much (this is our first big resource interface change since its inception)

@thewoolleyman


thewoolleyman commented Jul 15, 2016

One thought on returning arrays of versions from different containers - I assume the concern about running on different containers is really about the array of versions being different or non-deterministic across multiple invocations that generate the array, whether or not they run in the same container?

If so, the git-branches resource is a great example of something that is hard or maybe even impossible to make deterministic. It is just fetching a list of git refs at a given point in time, but git doesn't version refs. So, you just get whatever refs exist at the time you generated the version array, and one may have been deleted then re-added since the last time it was generated, resulting in a version which is identical to a previous one, even though the ref may contain completely different code. That's why my implementation of the git-branches resource has to include an independent increment value as part of the version, to ensure a new version is emitted even in these cases.

This may be what you are intending to solve with the git-branch-heads resource?

But nevertheless, it's still a problem that could exist with other types of resources people may come up with.

Also, when a resource version does not have an inherent monotonically increasing component, it causes problems with sorting in the resources UI. For example, the current s3 resource with regexp versioning has very bad sorting behavior in the UI. First of all, it forces the use of a semantic version, which we don't care about, because we are tagging our "versions" with specifically-generated checksum SHAs. (A separate issue, but relevant: there's a good reason for doing this rather than relying on Concourse-generated versions, because the s3 resources are build artifacts which are looked up by the locally-generated checksum SHA outside of our Concourse system.) So, we had to force it into a 0.0.0.abcdef format, where the 0.0.0. part never changes. Now that we have many hundreds of these, the latest resource emitted can be pages down in the resource view, e.g. for 0.0.0.aa... vs 0.0.0.ff....
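
For context, the workaround described above looks roughly like this in the pipeline (bucket and file names are made up):

    resources:
    - name: build-artifacts
      type: s3
      source:
        bucket: our-artifacts                   # hypothetical bucket
        regexp: artifacts-(0\.0\.0\..*)\.tgz    # the captured group is the "version"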

So, given these concerns, would it be of value to require that every resource include a monotonically increasing increment component in its version, to solve these types of issues? Just the existence of it would solve the first scenario of "identical looking but different" versions, e.g. deleted then re-added git branches. And if it were required to be an integer (which could either be a simple increment or a timestamp), then the Concourse resource UI could default to using it as the sort field, avoiding the kinds of sorting issues we are seeing with only-alphanumeric-prerelease-component semantic versions in the s3 resource.
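
For illustration, a version object under that rule might look like (field names invented):

    {"sha": "ff00aa...", "increment": "1042"}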

Thoughts?

Thanks,
-- Chad

@vito

Member Author

vito commented Jul 15, 2016

@thewoolleyman The only concern is that some resources (e.g. docker-image) need to spin up daemons and may not be coded today to support multiple scripts running in the same container. For example, both scripts may try to spin up the daemon, and one will fail. We don't document the expected isolation of each script relative to the others.

The non-determinism is another valid concern: even in the git case, we'd need to at least know the final SHA that put pushed (n), so that if we run check and find n-3, n-2, n-1, n, and n+1 (because someone else pushed immediately after), we know only n-3..n should be considered outputs of the build. This pushes me towards put returning the array of versions, as that's something it can know, e.g. from the output of the git push it just ran.
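
Sketching that scenario (refs are placeholders): put reports the ordered set of versions it knows it pushed, and a later check may discover more:

    # put response (ordered, oldest first):
    [{"ref": "n-3"}, {"ref": "n-2"}, {"ref": "n-1"}, {"ref": "n"}]

    # a subsequent check response:
    [{"ref": "n-3"}, {"ref": "n-2"}, {"ref": "n-1"}, {"ref": "n"}, {"ref": "n+1"}]

Only n-3 through n would be recorded as outputs of the build; n+1 came from someone else's push.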

The git-branch-heads resource is not trying to solve anything about the git-branches resource, it's just a different use case, affected by the same non-linearity issue. That doesn't sound like something we can necessarily solve with the resource interface, though. Arguably if a ref is pushed and then un-pushed, it shouldn't be a new version anyway (in the case the same version was collected before the push that was later reverted).

The S3 example sounds like a misuse of the S3 resource. You should probably just use versioned_file or something instead, or at least put something increasing in there (unix timestamp?). There's literally no way to sort those in a reasonable way without coupling it to git. They're just strings.

I don't think any of that justifies requiring a monotonically increasing value. Idempotent updates are common and should not yield new versions. Resources are representing the content with the version, not just change over time; if something has gone forward and then back to the previous version, and everything is the same as it was, then it should be the same version.

@thewoolleyman


thewoolleyman commented Jul 15, 2016

@vito I think that all makes sense, except for the claim of our misuse of the S3 resource. I probably didn't explain the use case well enough.

We cannot use versioned_file because the SHA component of the version is a _calculated checksum_ which uniquely identifies the gzipped artifact on S3. This checksum must be able to be calculated and used to download the artifact OUTSIDE of the Concourse system.

An example is our Rails fixture files generated via the fixture_builder gem. These take several minutes to build, but only ever need to be built once for every git SHA, because they are completely derived from the current schema.rb and test object mothers. So, the checksum (and the name of the published S3 file) is based off these files from which the fixtures are derived.

So we logically want to push this delay onto CI, and avoid incurring it locally every time a developer pulls a new SHA which needs a new version of fixtures.

Therefore, we have concourse build and publish them in a zip file named after the checksum, via the S3 resource (because that's what it's for, publishing files to S3). Then, on developer boxes, we only need to calculate the checksum based on the currently-checked-out files (which takes milliseconds), then download from s3 the file which matches that checksum (which takes only a second or two).

And we can't put a timestamp or other increasing component into the version, because there's no way for the non-concourse clients to know what that was in order to download the file matching their locally-generated checksum.

So, this isn't a misuse of the S3 artifact, because we are using it for the intended purpose: uploading concourse-generated files to S3.

The problem is (specific to the s3 resource case):

  1. It is unnecessarily forcing us to use a semver format for the filename, when we really just want to use a calculated SHA. This in itself isn't a big problem, we worked around by using the 0.0.0.<calculated SHA pretending to be a prerelease version> approach. But...
  2. This makes the sorting in the resource view unusable, because even though 0.0.0.aa... might be the most-recently-generated version, it may be buried several pages deep, while all the older 0.0.0.ff... versions are at the top of the list.
  3. (update) and the lack of a URL to directly navigate to a specific resource version (and avoid going through pagination) only compounds the pain.

So, that brings us back to the more general problem related to this issue, which I think does justify requiring a monotonically increasing value to force sorting for resources which do not otherwise have an inherently sortable version component, e.g. our use of the S3 resource.

@vito

Member Author

vito commented Jul 15, 2016

@thewoolleyman If you're using the S3 resource without versioning (semver or versioned_file) you're misusing the S3 resource and should probably use or write a different resource. It does not make sense to add a monotonically increasing number to the resource, as it already has an established versioning scheme when used correctly.

@thewoolleyman


thewoolleyman commented Jul 16, 2016

@vito OK, if you say wanting the S3 resource to support naming its published file based on an externally-derivable unique checksum is a misuse of it, and we shouldn't expect it to be chronologically sorted, then we clearly have to roll our own if we want sorting. Good to know.

But how do the standard resources' versions know how to sort themselves properly? E.g. the git resource, or an S3 resource used with versioned_file, both of which are just random SHAs, just like our usage of the S3 resource with 0.0.0.SHA semvers. I assume this is somehow determined by some value in the internal Concourse database?

If so, is there some way we could ensure that this internal version is always used to provide chronological sorting, or a way to allow it to be used if the default sorting isn't useful (as is the case with our S3 usage)?

And it also seems like exposing this would be the natural unique ID for a URL pointing directly to resource versions (which would be useful in debugging issues in a pipeline, e.g. clicking a resource version to see what upstream job builds generated it, without having to manually search for it in the resource view, possibly through many paginations if there is no way to chronologically sort them).

-- Chad

@vito

Member Author

vito commented Jul 16, 2016

@thewoolleyman The git resource knows their order because in git, SHAs have an order to them (i.e. git log). This is how every resource works; the s3 resource knows their ordering because either it's via versioned_file (inherently ordered) or semver (order determined by filenames).
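
For reference, this is how today's check contract expresses that ordering: check is handed the last known version on stdin and must print every version from that one onward, oldest first (the values below are invented):

    # stdin:
    {"source": {"uri": "https://example.com/some-repo.git"}, "version": {"ref": "61cbef"}}

    # stdout, oldest first:
    [{"ref": "61cbef"}, {"ref": "d74e01"}, {"ref": "7154fe"}]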

@thewoolleyman


thewoolleyman commented Jul 16, 2016

@vito But the resources web page (/pipelines/PIPELINE_NAME/resources/RESOURCE_NAME) knows this "order" from a value stored in the Concourse database, and it isn't always "inherent" by my definition of "inherent".

I believe the ordering is based on "check_order" in the DB, as shown here:

https://github.com/concourse/atc/blob/master/db/pipeline_db_test.go#L590
https://github.com/concourse/atc/blob/master/db/pipeline_db.go#L480
https://github.com/concourse/atc/blob/master/db/pipeline_db.go#L921

Looking at the last link, incrementCheckOrderWhenNewerVersion, I'm guessing the meaning of "Newer" or "CheckOrder" is implemented differently for different resources, or different usages of different resources? It can't just always be the chronological order in which they were emitted, otherwise our S3 semver resources wouldn't sort the way they do (which is by semantic version order).

In the case of an S3 resource with Semver ordering, there IS obviously an "inherent" ordering (Semver), which is observed by the Resources UI.

But in the case of a Git resource SHA or versioned_file JSON version object, there IS NOT an inherent ordering, at least not to anything displayed in the Resources UI (it's just a random SHA). Yet, the latest one is always at the top in chronological order. Presumably this is because the original API request ordered it based off the check_order in the database, which was chronological.

So, relevant to this issue, there are other resource "versions" which also do not have an "inherent" ordering in their actual JSON version object, e.g. the git-branches resource and the proposed git-branch-heads resource.

But specific to the S3 resource, looking at it that way, our requirement is not a "misuse" of the S3 resource. It's just a feature request for a combination of different existing behaviors. Specifically:

  1. We want a random SHA as the name of the S3 file (obviously possible since that's exactly what the name of a versioned_file S3 resource file is),
  2. ...but where we control the filename (obviously possible because that's what we're doing with the regexp S3 resource option, except we are unnecessarily required to make it conform to a semver format, even though we don't care about semantic versioning of the artifact)
  3. ...and we want it to be sorted in chronological order even though it has a random SHA as the name of the S3 file (obviously possible since that's exactly what the versioned_file S3 resource option does.)

So, you can state that you don't want the official S3 resource to support that particular combination of obviously-possible existing behaviors, and I'm glad to fork and implement it if our team cares enough. But it's definitely not a "misuse", it's a legitimate use case for what we want to do.

Anyway, I'm glad to take this to a separate issue on the s3 resource, if you don't think this discussion is relevant to this issue about changes to the resource/version interface.

@vito

Member Author

vito commented Jul 16, 2016

@thewoolleyman The order in the database comes from check. It's not arbitrary. Git resource shas and versioned_files absolutely do have an inherent ordering.

Your current usage is a misuse of the S3 resource as-is. What you're requesting is a very different resource. What you define as 'chronological order' is impossible to determine after-the-fact; looking at the bucket and ordering the filenames is impossible for the S3 resource, as it depends on knowledge from the git repo, which the S3 resource knows nothing about. Being able to discover older versions of something is supported by resources that properly implement the full interface, which would be impossible in this case.

Resources that do not have an inherent ordering should decide, at their own discretion, whether they need an increment value. It is certainly not a property of all resources.

At this point the topic is getting derailed and it feels like this should be taken elsewhere. There are many more parties that should be able to contribute to this discussion.

@thewoolleyman


thewoolleyman commented Jul 18, 2016

@vito The specifics of the S3 resource may be unrelated to this issue, as I admitted.

However, the concerns I have raised about how to properly order resource versions in the UI, and how to address them directly via URI, even if they do not conform to the current implementation of resources or versioning in Concourse, are in my opinion very relevant to an issue that is intended to discuss proposed changes to how resources and versioning are handled.

And our use case (leveraging Concourse to publish a resource externally while using externally determined versions) is completely valid and necessary, as I described in detail, and it's reasonable to expect a decent UX. As I said, our problems are relevant to other use cases people may come up with, as proven by our discovery of the need to manually add an "increment" value to the git-branches resource, which also has no inherent ordering.

Therefore, we may be "misusing" the existing S3 resource; that's a valid statement.

But dismissing and ignoring our concerns over an unusable UX, which could potentially be addressed by changes in how versioning is handled, is not valid. Isn't an issue discussing breaking changes to how versioning is handled exactly the place to discuss that? These concerns don't have to be addressed in this issue's proposed changes if you deem them out of scope, but they definitely merit consideration as a potential future feature request, rather than being dismissed outright as a "misuse".

Anyway, having said that, I'll drop the point.

@vito

Member Author

vito commented Jul 18, 2016

@thewoolleyman I've not dismissed or ignored your concerns. From the first comment I've been addressing them, while trying to push aside unrelated things that cloud the discussion. You've been leaving very long comments that make a lot of assumptions, which makes the discussion more difficult than it would be if we both got to where we want in smaller steps. It makes it frustrating for both of us.

One of those assumptions was the behavior of the S3 resource, and then later the git resource and others. As I've said previously, the ordering of versions is determined entirely by the resource, and if a resource is correctly implemented, it returns versions in the correct order, and the UI and everything else respects that. If there is no inherent order, then the resource may want to rely on an increment value, but even that is unrelated to ordering - a resource would only do that to force the presence of a new version in order to trigger new builds (as is the case for the git-branches resource). In some cases you won't actually want that, e.g. if I do a rollback deploy with BOSH to something I've deployed previously, I'd expect to just get the previous version again. So, it doesn't feel like something common to all resources, and resources that need it already have a way to do it.

@vito

Member Author

vito commented Jul 27, 2016

Another thing to think about is whether the distinction between source and params is worth keeping. There are many times where a resource has its config hardcoded in source, but then people want to do things dynamically, e.g.:

I've pushed back on some of these because it feels wrong to have every resource support both static and dynamic configuration, and have every resource implement it themselves, differently each time.

I think that the current source and params terminology is meaningful for the pipeline narrative. Each resource should have a source so the pipeline knows where it's coming from. And a put or a get should configure params if they want to configure how the operation is performed. One subtle distinction is that put params can be dynamic (i.e. read from a file) but get params cannot.

My end goal is that each resource simply receives a bag of static JSON config, with no indirection to deal with (no credential fetching, no resolving values to file contents, etc), and does its thing.

So, here's one idea:

  • For the resource interface, merge source and params into just config.

  • In the pipeline, support explicitly loading config from files in previous steps, e.g.:

    - get: some-file
    - put: my-image
      params: {tag: {load: some-file/version}}

    We could even support loading JSON values like so:

    - get: branches
    - task: generate-pipelines
      file: # ...
    - put: my-pipelines
      params: {pipelines: {load: generated-pipelines/pipelines.json, format: json}}
  • Interestingly, this allows get to also be dynamically configured, as the file loading occurs in Concourse, bringing the cache identifier to the forefront.

@robdimsdale @cppforlife @xoebus thoughts?

@srbry

Contributor

srbry commented Jul 27, 2016

I like the sound of merging source and params. One query I have as a user of custom resources:

How would backwards compatibility be managed? It wouldn't be the cleanest user experience to have to declare one resource_type version for new Concourse releases and a different one for older releases. We don't have the luxury of being packaged with the Concourse releases.

@vito

Member Author

vito commented Jul 27, 2016

@srbry Sounds like you're asking the general question of how resources migrate to this new interface we're planning?

I think it's just a matter of using semver for your custom resource types via tags. If you're supporting both for an interim period, that can just be a minor bump. Once you cut over to the new resource interface (which I would recommend doing aggressively), just do a major bump. Folks using custom resource types should probably be pinning the version down in their pipelines already. (Though it would be nice to have versioning semantics, i.e. tag: 1.x, which we could maybe add to the docker image resource.)
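
e.g., pinning a custom resource type in the pipeline today (the image name is hypothetical):

    resource_types:
    - name: my-resource
      type: docker-image
      source:
        repository: example/my-resource
        tag: "1.2.0"   # pinned; a cutover to the new interface would be a major bump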

@vito

Member Author

vito commented Jul 27, 2016

@srbry We could also potentially support having a generic shim that translates the old interface usage to the new interface.

@srbry

Contributor

srbry commented Jul 27, 2016

@vito thanks, sounds simple enough

@eedwardsav


eedwardsav commented Oct 2, 2017

Is this still going to be targeted in 3.x?

@evanfarrar


evanfarrar commented Nov 30, 2017

👍 for delete (can't add reactions on mobile GitHub)

@madmod


madmod commented Feb 3, 2018

Is this still on the roadmap?

@vito

Member Author

vito commented Feb 3, 2018

@madmod There's a chance we'll tackle it soon (maybe in part), as we're working on spaces, which also implies a change to the resource interface.

@headcr4sh


headcr4sh commented Mar 2, 2018

I've been thinking about a new resource interface for quite some time now, and I wanted to throw in a bunch of ideas that came to mind:

Docker labels

Since all resource_types are (right now?) basically Docker images, it might make sense to use Docker labels to attach metadata to the images that serve as Concourse resources. This would open up a few possibilities, e.g. having a "Concourse Resource Registry" that can be used to automatically list resources.

Possible labels:

org.concourse-ci.resource.version=2
org.concourse-ci.resource.actions=check,in,out (or check, get, put, delete, whatever... in the future)
org.concourse-ci.resource.description="A dummy resource that does nothing"
org.concourse-ci.resource.documentation=[link to documentation]
org.concourse-ci.resource.author="John Doe <j.doe@example.com>"

JSON schema

An embedded schema, part of every resource, that can be used to validate the source and params configuration would be really neat. I think JSON Schema might do that job quite nicely (http://json-schema.org/); an illustrative example follows below.

Advantages:

  • JSON schema can be used to document the resource configuration parameters
  • JSON schema can be used by Editor/IDE plugins to perform validation (mandatory vs. optional fields, regex and field type constraints, etc., etc...) and show useful hints while editing (the current VSCode Concourse plugin only shows hints and tips for well-known resources)
  • JSON schema is a pretty solid tech and implementations for all major programming languages (including Go) seem to exist.
  • JSON schema can be used to extract the documentation for a resource

Since Docker label values can contain arbitrary text of any length, it might be possible to embed the schema(s) into the resource Docker images by associating them with well-known label keys (but I think that might be a bit complicated).
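
For example, a minimal schema for a hypothetical resource's configuration might look like:

    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "dummy resource configuration",
      "type": "object",
      "properties": {
        "uri": {"type": "string", "description": "location to fetch from"},
        "branch": {"type": "string", "default": "master"}
      },
      "required": ["uri"]
    }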

Just my 2 cents.

@jchesterpivotal

Contributor

jchesterpivotal commented Mar 22, 2018

I'm very strongly in favour of schema support as a way to improve validations and tooling.

@ships

Contributor

ships commented Mar 29, 2018

@vito @robdimsdale you could use YAML's native << merge syntax to get multiple top-level objects loaded, right? (But you would be subject to the native merging rules, which you probably know more intimately than I do.)

- put: my-pipelines
  params:
    <<: ((load: somefile))
    <<: ((load: someotherfile))

and if needed you can define your own merging logic by adding a shim task:

- task: merge-my-params
  file: merge/the/env.sh
  env:
    FILE_1: "((load: somefile))"
    ...

The limitation would now be on a variable number of files to load.

@ships

Contributor

ships commented Mar 29, 2018

@vito @robdimsdale regarding the use of source vs params, I like the idea of decoupling where/how you specify them from the API, so that you can put a param in the resource definition and always inherit it. However, that issue seems separate from what a resource author can do with an opaque config-style API.

to that effect, as far as caching goes, two things continue to bother me:

  • specific keys on some resource type configs would still be useful but would not correspond to whether bits are cache-worthy. In particular, there are all kinds of reasons why a get with different params could have identical bits (such as auth info), or why its contents could be inferable from another cache (such as network optimization params).
  • Similarly, put params have often been used to relate the put to previous steps in the build, which we will still need to do, and which would change the config but conceivably not its cache identifier.

as such, the concept of static vs dynamic configuration seems very important to the resource type author, because some of the keys may substantially change what versions appear in the history.

concretely: if you have a resource for a git repo, it would be intuitive under this new design to push paths and ignore_paths into the get params rather than defining a new resource for every directory that only relates to a few of your builds. But now I have a problem: the resource is producing hella versions, but only some of them relate to the different builds; I therefore have to provide some suspect filtering power for ignore_paths to be respected as far as triggering is concerned. How do we save the resource author from having to distinguish between static and dynamic in this case? @headcr4sh's schema plan seems like a viable option, but I haven't got a solution in mind that I like.
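
For reference, today those filters live in source, which is why check can honor them when collecting versions (the URI is invented):

    resources:
    - name: repo-docs
      type: git
      source:
        uri: https://example.com/some-repo.git
        paths: [docs/]   # only commits touching docs/ yield versions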

@vito

Member Author

vito commented Apr 2, 2018

@jraqula The caching question is extremely tricky. Changing credentials may not affect the contents of the cache, but having the credentials be verified should be a precursor to anyone's ability to use the cache. If they aren't part of the key, and we don't perform any additional verification (there's no resource interface for this), we've got a very serious information leak on our hands (credentials effectively become optional after the first person fetches it)!

Re: paths/ignore_paths, that whole approach is pretty flawed at the moment. It would probably be better to have some way to determine trigger: true dynamically based on some sort of metadata attached to each version. That's quite a complicated topic on its own. In any case, the current properties only have meaning to check, since they influence what versions it collects in the first place. Specifying them on get steps would be too late. Remember: the only thing we check is what they've configured under resources:, so the only config available will be whatever's under source.

@ships

Contributor

ships commented Apr 4, 2018

@vito a secondary concern then, while I stew over that: params: {force: true} on a put: my-git-resource. That seems like an important thing to be able to do idempotently, but I feel like it would genuinely constitute a delete as well -- there may now be spurious SHAs in the version history.

Easy to imagine this sort of idempotent "change" that is neither append-only nor redact-only applying to other resources generally.

As such, how would we feel about, instead of adding delete (or having delete be a synonym in some way), allowing put to specify a delta which includes removed versions in the history?

A very basic strategy would be to use a context-sensitive diff/patch format - like diff, but without requiring line numbers or their equivalent (as put currently does not have to be aware of the entire history of versions to report a change) - which would be possible as long as version strings are required to be unique.

@jchesterpivotal

Contributor

jchesterpivotal commented Apr 4, 2018

A put with a delta might be better called patch, or am I missing something obvious?

@vito

Member Author

vito commented Apr 4, 2018

@jraqula I can see the appeal of having put and delete be one interface since they both have just about the same signature:

put    : Config -> Bits -> Set Version
delete : Config -> Bits -> Set Version

We could change it to explicitly return a response object saying "here are the versions I created", "here are the versions I deleted".
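
e.g., a response shape like (purely illustrative):

    {
      "created": [{"ref": "abc123"}],
      "deleted": [{"ref": "def456"}, {"ref": "0bada1"}]
    }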

Devil's advocate, though - version removal could be discovered by check, maybe even more easily. If check is given a version it doesn't recognize (in the event of a push -f), we could have it return the new latest version and the version preceding it. Concourse would then consider any versions in-between as deleted. We already converge on the return value of check for the version ordering, so it kind of follows that we would have to deal with this type of response value anyway. I suspect this would be easier than teaching the git resource's put action to also know how to detect the versions that it destroyed.

Side note: I'm going to close this issue and open a PR against https://github.com/concourse/rfcs (new repo) instead. This will allow more structure for the discussion. It'll be much easier to follow than one long, gigantic linear discussion with no history of how the initial proposal has changed. I'm stealing this pattern from Rust and others.

@vito vito added this to In Flight in Core Apr 9, 2018

@vito vito referenced this issue Apr 10, 2018

Open

RFC: Resources v2 #1

@vito

Member Author

vito commented Apr 10, 2018

Closing and moving the discussion to concourse/rfcs#1. Thanks everyone who has helped out so far!

@vito vito closed this Apr 10, 2018

Core automation moved this from In Flight to Done Apr 10, 2018

@vito vito added the wontfix label Apr 12, 2018
