Provide docs/standards about client/worker/proxy interactions in redeployable taskcluster#128
Conversation
…ons in redeployable taskcluster
|
Is this going to include schemas for reference files and so on (like https://bugzilla.mozilla.org/show_bug.cgi?id=1476602)? If so, I'll stop working on that until this RFC is complete. |
rfcs/0128-redeployable-clients.md
Outdated
|
|
||
| ``` | ||
| { | ||
| "version" : 1, |
There was a problem hiding this comment.
We have found it easier to just include a $schema property to indicate the version of a document.
rfcs/0128-redeployable-clients.md
Outdated
| "<serviceName>": { | ||
| "api": true|false, | ||
| "exchanges": true|false, | ||
| }, |
There was a problem hiding this comment.
This seems to omit the API version.
Why switch from relative URLs to true/false? Both are probably fine, but the URLs are a little more discoverable for people trying to navigate these files by hand.
| example: | ||
|
|
||
| ``` | ||
| queue := queue.NewFromEnvVars() |
There was a problem hiding this comment.
Let's just leave env var stuff to the person using the library? I don't think we need to build in any alternative ways of configuring other than passing in creds probably.
There was a problem hiding this comment.
I guess for node.js and python, that wouldn't be the end of the world. The problem comes when you have static language clients where the client gets compiled into some other binary that get used in other places (for example, the taskcluster-client is used internally by generic-worker, but also by taskcluster-proxy, for example). In the case of generic worker, for example, it will fetch the root URL from the generic-worker.config file, rather than the TASKCLUSTER_ROOT_URL environment variable. If it were for some reason set in the environment, and then it took it from there, it might not necessarily be the correct TASKCLUSTER_ROOT_URL to use.
Also consider the situation where e.g. somebody writes a tool that syncs role definitions between two taskcluster deployments. If internally the client uses the environment variable TASKCLUSTER_ROOT_URL to discover which cluster it needs to talk to, only one can be specified in that process. It kind of assumes it is only talking to one deployment.
When writing executable code, such as a command line tool or web service, where you know what your tool is and what it is being used for, and have a feeling about the context it might be called from, it seems reasonable to take config from environment variables (e.g. for writing a web service). You also know only one value is required for each parameter (unlike the sync example above where internally in the code multiple clients can be created). But when it isn't a service or command line application you are writing, but just a library which is internally used by other command line tools and web services, (for example), it feels cleaner and more appropriate to take configuration as parameters passed to methods and functions. By providing a utility method to fetch config from env vars, you expose the possibility to the calling code to configure based on env vars, but you don't force it. Then it is up to the author of the command line tool / service / etc that calls your library to decide explicitly whether they want that or not. Then another advantage of being explicit is it is a little less magical and easier to reason about when you are degugging that code and trying to work out where it got the configuration from. I think. :-)
There was a problem hiding this comment.
I think we all agree here -- libraries shouldn't be reading env variables without explicit instruction from the caller.
We just need to roll out that change gently since it will break every user of the client in possibly-subtle ways.
There was a problem hiding this comment.
lol, I guess I misread @imbstack's comment, and thought he meant we should support only accepting env vars, but now I see he meant the opposite, so I didn't need to write all that! 😆
There was a problem hiding this comment.
I thought we all agreed that all env vars will no longer be automatically read by the clients, but the text of the RFC still refers only to TASKCLUSTER_ROOT_URL. I think clients should be consistent in which env vars they read, and they currently read TASKCLUSTER_CLIENT_ID etc.
This will be a big breaking change for our users (and I've held @tomprince back from using newer clients in anticipation), so I think we should be explicit about it and make sure they get a chance to comment here. Alternately, we could remove this bit from this RFC and move it to a new one.
djmitche
left a comment
There was a problem hiding this comment.
Regarding the renaming (api -> http, events/exchanges -> amqp), I don't feel like the new names make more sense than the old, but the old aren't perfect. It'd be nice to be consistent, whatever we choose, but the fewer things we rename, the easier this work will be. Renaming always seems easy, then you end up breaking things in production because you missed somewhere!
rfcs/0128-redeployable-clients.md
Outdated
|
|
||
| # 3. Details | ||
|
|
||
| ## 3.1 The development lifecycle of APIs |
There was a problem hiding this comment.
In an earlier draft this was labeled as "current" -- I kind of liked that
There was a problem hiding this comment.
Whoops that got lost in a rewrite, I'm putting it back in. Nice spot!
rfcs/0128-redeployable-clients.md
Outdated
| currently existing reference schemas are these: | ||
|
|
||
| * [HTTP reference schema](https://schemas.taskcluster.net/base/v1/api-reference.json) | ||
| * [AMQP 0.9.1 reference schema](https://schemas.taskcluster.net/base/v1/exchanges-reference.json) |
There was a problem hiding this comment.
We have always called these "api" and "events", so let's continue with those names. Actually, we've sometimes called the latter "exchanges", so fixing that to "events" would be good.
rfcs/0128-redeployable-clients.md
Outdated
| versions, and released | ||
| 6. Released software is tested and deployed | ||
|
|
||
| ## 3.2 Changes to publishing API manifest |
There was a problem hiding this comment.
Maybe call this section 4, proposed changes?
rfcs/0128-redeployable-clients.md
Outdated
| references/schemas that the client reads dynamically should also be a | ||
| dependency of the project, which may be either frozen references/schemas, | ||
| fetched from a language package, or fetched dynamically during build/CI from a | ||
| taskcluster deployment. |
There was a problem hiding this comment.
I'm not sure what this means -- how can the references be a dependency?
rfcs/0128-redeployable-clients.md
Outdated
|
|
||
| ```go | ||
| func HTTPReference(rootURL string, version string) string | ||
| func AMQPReference(rootURL string, version string) string |
There was a problem hiding this comment.
These are equivalent to the existing ApiReference and ExchangesReference methods, but omit the service. I'm not sure how that would work? Also, given that the manifest format allows arbitrary URLs, I think these methods are unnecessary?
rfcs/0128-redeployable-clients.md
Outdated
|
|
||
| These will return absolute urls to the `*-reference.json` documents. | ||
|
|
||
| ## Changes to building docs site |
That is a fair point. And I agree the new names aren't considerably better, so I will go back to the old names. The main naming issue I was trying to solve was that we have no collective term for API meaning HTTP API or AMQP API. We use API for the collective term, but then also sometimes just meaning the HTTP API (e.g. in the name Maybe I can just update the docs to make it clear in each place I reference these terms, what I mean... |
jhford
left a comment
There was a problem hiding this comment.
Looking good, @petemoore !
rfcs/0128-redeployable-clients.md
Outdated
| ## 4.1 Changes to publishing API manifest | ||
|
|
||
| 1. If a service of a taskcluster deployment provides an API interface, the | ||
| cluster may host the API reference document under |
There was a problem hiding this comment.
is this intentionally may and not must?
There was a problem hiding this comment.
It was just to cover the case that you have an API but you don't want to publish it and people/clients to start using it...
| ``` | ||
|
|
||
| How this list is generated by taskcluster is compiled and served by | ||
| taskcluster-references is not the concern of this RFC, but it is recommended |
There was a problem hiding this comment.
A well formed manifest to drive client generation seems like a foundational element of the design. I don't want to speak for @petemoore , but I suspect what he's trying to say is that creating the manifest is out of scope for this document, but that the format which this document expects is in scope.
rfcs/0128-redeployable-clients.md
Outdated
| generate the URL paths, rather than requiring the consumer of this library to | ||
| use taskcluster-lib-urls. This keeps the involvement of taskcluster-lib-urls as | ||
| high up in the stack as possible, which makes the lower parts of the stack more | ||
| generic/flexible with fewer concerns. |
There was a problem hiding this comment.
I really like this design consideration!
| manifest simply says "these are the APIs I declare, here is where you can fetch | ||
| their references, and they are self-describing, so go ask them". It doesn't | ||
| burn in any concerns about URL path building, or the types of reference we | ||
| support. |
There was a problem hiding this comment.
I really like this approach, I think it's the right way to handle the manifest.
rfcs/0128-redeployable-clients.md
Outdated
| ## Changes to publication of API references and schemas | ||
|
|
||
| The implementation must serve the described resources under the given URLs set | ||
| out in this document. The author is not concerned with how a service declares |
There was a problem hiding this comment.
I think it's useful information to clearly mark what is and isn't in scope.
| `TASKCLUSTER_PROXY_URL` and `TASKCLUSTER_ROOT_URL`. This gives them the | ||
| freedom to refer to either the proxy or the target service, as required. | ||
| Since they must explicitly configure the root url when using a taskcluster | ||
| client, both endpoints are at their disposal, based on what they wish to do. |
|
Note, I'll squash commits when I land, of course, and give a reasonable commit message! This is ready for review, I am not intending to land any more changes. |
djmitche
left a comment
There was a problem hiding this comment.
I like this. I'm a little worried that some JSON-schema tooling will struggle with the metadata field, but as long as that's not the case it sounds like a really elegant solution, not least because it means we can make "trivial" changes to the reference schema (such as modifying a description) without bumping the version number.
We had earlier planned to have the version number in the reference schemas. That would allow us to have services that are publishing references adhering to different schema versions to co-exist in the same deployment (and would couple nicely with the metadata property, so nobody needs to parse "-v1" out of a filename).
Would the metadata-based versioning apply to the manifest schema (api-manifest.json) as well?
Could we rename reference.json to indicate that it is a metaschema, not a reference or schema? Perhaps metadata-metaschema.json or something like that?
| example: | ||
|
|
||
| ``` | ||
| queue := queue.NewFromEnvVars() |
There was a problem hiding this comment.
I thought we all agreed that all env vars will no longer be automatically read by the clients, but the text of the RFC still refers only to TASKCLUSTER_ROOT_URL. I think clients should be consistent in which env vars they read, and they currently read TASKCLUSTER_CLIENT_ID etc.
This will be a big breaking change for our users (and I've held @tomprince back from using newer clients in anticipation), so I think we should be explicit about it and make sure they get a chance to comment here. Alternately, we could remove this bit from this RFC and move it to a new one.
| * teams that deploy their own taskcluster environments, are able to include | ||
| additional references for programming interfaces not covered by the core | ||
| platform (for example, maybe it is desired to provide APIs for talking with | ||
| databases, other messaging buses, monitoring tools, ...) |
There was a problem hiding this comment.
Indeed, I see that you've been careful to indicate what this RFC requires, and what it merely discusses. I'm strongly opposed to a few of the things it discusses, but I'm happy to agree to the RFC only on the basis of what it requires.
imbstack
left a comment
There was a problem hiding this comment.
Seems generally reasonable to me!
|
|
||
| * building taskcluster clients, that provide language-level programming | ||
| interfaces to taskcluster APIs | ||
| * refreshing `taskcluster-raw-docs` AWS S3 bucket, which is used by taskcluster |
There was a problem hiding this comment.
I don't believe we actually use these for docs. Every service just publishes to that bucket directly.
There was a problem hiding this comment.
Ah OK.
Do you know where the docs site fetches the references and schemas from?
|
So, the meaning of the env vars section remains ambiguous, and there appears to have been no final-comment period here in which to catch that issue or check that release engineering and other client consumers are OK with the change. I don't want to delay this by standing on ceremony, and I think we have consensus on most of this, but I'm worried that we've missed discussion of a consumer-facing breaking change. Maybe we should consider pulling out the env-var bit into a separate RFC? |
Apologies, I forgot about applying the tags, but I think we have a general consensus based on the review approvals. It is always plausible for there to be slight changes during the implementation phase, but I think as long as we are vocal about changes coming, and communicate these things well, we should be ok. When users upgrade a major version, they are generally aware there can be breaking changes - the mistake we have made in the past is not being very vocal about it and explicit in release notes. As long as the release notes are very explicit about the breaking changes, and we communicate them well, users can decide if they wish to upgrade, and how long they need for adjusting to the changes. I think this is where we have fallen down in the past. |
Changes are required for clients of taskcluster services when taskcluster becomes redeployable.
This RFC serves to define the standards by which taskcluster deployments will publish a manifest of their service definitions, how client generators will query and interpret this manifest and associated reference and schema documents in order to generate clients, what the architectural changes to generated clients will be, how users of those clients (workers, worker authentication proxies, command line tools, software libraries) will be adapted in order to use the new clients, and how generated clients and client generators will be built, released and deployed.
This entails changes to the way workers will share deployment / authentication proxy configuration information with tasks, versioning of the API / Event reference schemas, versioning of the services definitions based on those API and Event reference schemas, and the possibility for clients to connect to different taskcluster deployments.