Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xrfc: xdstp:// xRFC (TP1) #6

Merged
merged 23 commits into from
Aug 5, 2021
Merged

xrfc: xdstp:// xRFC (TP1) #6

merged 23 commits into from
Aug 5, 2021

Conversation

htuch
Copy link
Contributor

@htuch htuch commented Apr 4, 2021

This ports
https://docs.google.com/document/d/1zZav-IYxMO0mP2A7y5XHBa9v0eXyirw1CxrLLSUqWR8/edit#
to xRFC format and updates to match changes that have occurred during
Envoy implementation.

This is xRFC number TP1.

Signed-off-by: Harvey Tuch htuch@google.com

This ports
https://docs.google.com/document/d/1zZav-IYxMO0mP2A7y5XHBa9v0eXyirw1CxrLLSUqWR8/edit#
to xRFC format and updates to match changes that have occurred during
Envoy implementation.

Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch
Copy link
Contributor Author

htuch commented Apr 4, 2021

CC @cncf/xds-wg

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
resource_names_subscribe:
- xdstp://some-cloud-authority/envoy.config.endpoint.v3.ClusterLoadAssignment/foo
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand on this. What does this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which bit is unclear? Happy to expand on anything here, just need to figure out what is missing :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the fallback to 'some-onprem-authority' configured?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The presence of the #alt directive above describes the fallback relationship. It's also mentioned that there is a mapping from some-onprem-authority to ConfigSource in the bootstrap.

proposals/TP1-xds-transport-next.md Show resolved Hide resolved
Two `xdstp://` URNs are considered equivalent if they match component-wise
modulo context parameter ordering.

## Context parameters

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we going to handle duplicate context params from different process? Say one from URL has same key with the one from dynamic context param added by client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should not be any conflicts of this nature, because the names are designed to prevent this kind of conflict. Node-derived context params start with xds.node.; client feature parameters start with xds.client_feature.; per-resource-type parameters start with xds.resource.; and parameters from the URL should not start with xds. at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markdroth did we end up with any rules around how URL parameters (i.e. free form) are supposed to work? I.e. "they can never starts with xds." or "they can optionally start with xds. if deliberately overriding other context parameters"?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be clear to have some rules for context parameters from URL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. @markdroth can you comment on #6 (comment)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think where we wound up is that the ad-hoc context params can't start with xds., so there will basically never be any conflicts.

Copy link
Contributor

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for porting this over. A few comments from a fresh read through.

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!

Please let me know if you have any questions about any of this.

proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
Two `xdstp://` URNs are considered equivalent if they match component-wise
modulo context parameter ordering.

## Context parameters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should not be any conflicts of this nature, because the names are designed to prevent this kind of conflict. Node-derived context params start with xds.node.; client feature parameters start with xds.client_feature.; per-resource-type parameters start with xds.resource.; and parameters from the URL should not start with xds. at all.

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved

### Discovery request and responses

State-of-the-world (SotW) `Discovery{Request,Response}` are not supported with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we're using strings instead of structured protos for URIs on the wire, I don't think there's actually any reason to not support xdstp names in SotW. I think the only thing that doesn't work in SotW is glob collections, because there's no way for the server to delete a resource from the collection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I rather think about this additively; land the delta spec and add SotW as needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with that, but I do think that we'll need another xRFC for the new naming scheme in SotW in the near future, because gRPC is planning to implement that. And that xRFC will be very short; it will basically say "everything from the original xRFC except for glob collections works for StoW too".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd like to renew my request to not explicitly exclude SotW here. gRPC is definitely going to be supporting xdstp names in SotW, and it seems silly to have to publish a follow-up xRFC just to say "everything in xRFC TP1 works for SotW too".

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
htuch added 10 commits April 12, 2021 00:06
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
resource_names_subscribe:
- xdstp://some-cloud-authority/envoy.config.endpoint.v3.ClusterLoadAssignment/foo
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the fallback to 'some-onprem-authority' configured?

Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Copy link

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks very good!
Left a couple of questions and minor nits.

proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
```

Under normal conditions, the requested resource is returned as above. If the
config source for `some-cloud-authority` is unreachable, then the client issues

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the on-prem server be reached if the fail-over is temporary?
Should the client try accessing the original server periodically, and if it's alive change the configuration to the one it receives from the original server?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question; I'm tempted to punt as we're not implementing this right now and it gets complicated. What about leaving it as a client specific how failover and restoration of original resource fetch is decided?

@markdroth also interested in your take on this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a note stating this is client specific.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the client should always be trying to contact the original server, with appropriate backoff delay. Whenever the original server recovers enough to return a result, then the client should switch back to data from that server.

That having been said, I think I would be in favor of removing this from this xRFC, because we don't plan to implement this part in either gRPC or in Envoy in the near future. I think we can put together a separate xRFC for this once we actually have a need for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Unresolving this conversation so that my last comment gets addressed.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a deferral sentence.

proposals/TP1-xds-transport-next.md Show resolved Hide resolved
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch
Copy link
Contributor Author

htuch commented May 16, 2021

I've updated based on feedback. @fuqianggao @norafang52 @markdroth @adisuissa PTAL.

Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took me so long to re-review this! Please let me know if you have any questions about any of this.

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Show resolved Hide resolved

### Discovery request and responses

State-of-the-world (SotW) `Discovery{Request,Response}` are not supported with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with that, but I do think that we'll need another xRFC for the new naming scheme in SotW in the near future, because gRPC is planning to implement that. And that xRFC will be very short; it will basically say "everything from the original xRFC except for glob collections works for StoW too".

```

Under normal conditions, the requested resource is returned as above. If the
config source for `some-cloud-authority` is unreachable, then the client issues
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the client should always be trying to contact the original server, with appropriate backoff delay. Whenever the original server recovers enough to return a result, then the client should switch back to data from that server.

That having been said, I think I would be in favor of removing this from this xRFC, because we don't plan to implement this part in either gRPC or in Envoy in the near future. I think we can put together a separate xRFC for this once we actually have a need for it.

@htuch
Copy link
Contributor Author

htuch commented Jun 21, 2021

Sorry, this has idled a bit, I will update today.

@htuch htuch changed the title xrfc: xdstp:// xRFC. xrfc: xdstp:// xRFC (TP1) Jun 21, 2021
Signed-off-by: Harvey Tuch <htuch@google.com>
of `major.minor.patch` values, e.g. `"1.2.3"`.

* Context parameters from the URL, in the above example
`xds.bar=baz`. These must not overlap with any other context parameter name,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markdroth should we put this in some namespace, e.g. xds.params. to future proof other parts of the xds. hierarchy in case we want them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should say that the context params from the URI must not start with xds.. The params in the URI are for control planes to set however they want; the params in the xds. namespace should not be used for that, since they are reserved for future use by the API.

We are currently documenting xds.node. and xds.resource., which means that we are effectively reserving the entire remainder of the xds. namespace for future use.

An example computed URN following the above example is
`xdstp://some-authority/some.type/foo?xds.node.metadata.foo=bar&xds.shard_id=1234&xds.bar=baz&xds.client_feature.lb.least_loaded=false&xds.resource.vip=96.54.3.1`.

This proposal reserves the `@` prefix for context parameter values for future
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markdroth should we still keep this one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that we won't wind up using this anytime soon, but it's probably better to be safe than sorry.

Can we do something a bit more general, like maybe reserving all non-alphanumeric characters, just for the first character of the value?

Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch
Copy link
Contributor Author

htuch commented Jul 25, 2021

@markdroth @fuqianggao @mattklein123 @norafang52 any further feedback or can we merge? I'd like to get the baseline RFC landed for us to iterate later.

@fuqianggao
Copy link

/lgtm

@mattklein123
Copy link
Contributor

LGTM to ship and iterate.

Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good overall! I have just a few comments.

Please let me know if you have any questions. Thanks!

of `major.minor.patch` values, e.g. `"1.2.3"`.

* Context parameters from the URL, in the above example
`xds.bar=baz`. These must not overlap with any other context parameter name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should say that the context params from the URI must not start with xds.. The params in the URI are for control planes to set however they want; the params in the xds. namespace should not be used for that, since they are reserved for future use by the API.

We are currently documenting xds.node. and xds.resource., which means that we are effectively reserving the entire remainder of the xds. namespace for future use.

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
An example computed URN following the above example is
`xdstp://some-authority/some.type/foo?xds.node.metadata.foo=bar&xds.shard_id=1234&xds.bar=baz&xds.client_feature.lb.least_loaded=false&xds.resource.vip=96.54.3.1`.

This proposal reserves the `@` prefix for context parameter values for future
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that we won't wind up using this anytime soon, but it's probably better to be safe than sorry.

Can we do something a bit more general, like maybe reserving all non-alphanumeric characters, just for the first character of the value?

proposals/TP1-xds-transport-next.md Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
```

Under normal conditions, the requested resource is returned as above. If the
config source for `some-cloud-authority` is unreachable, then the client issues
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Unresolving this conversation so that my last comment gets addressed.)

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
markdroth
markdroth previously approved these changes Aug 3, 2021
Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small typo to fix, otherwise this looks fantastic!

proposals/TP1-xds-transport-next.md Outdated Show resolved Hide resolved
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I think it's ready to merge.

@htuch htuch merged commit aa0b789 into cncf:main Aug 5, 2021
@htuch htuch deleted the xdstp-xrfc branch August 5, 2021 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants