Initial MDS implementation #504
Conversation
A few comments. It's OK to just add TODOs and comments in the code; no need to resolve them in this PR. Better to merge and iterate.
```go
fatal(err)

si, err := istiometadata.FetchFromClientConnection(conn)
fatal(err)
```
In a future PR we should add some code to test that MDS doesn't return anything after close is called.
I would change this to a plain TCP echo (also in a future PR) and add more proper examples for HTTP and gRPC, i.e. using the library and usable in real-life apps.
But this is good enough for now.
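For illustration, a minimal sketch of what such a plain TCP echo example might look like, using the `FetchFromClientConnection` call from this PR's go-metadata library. The import path and the `Identity` field on the returned value are assumptions, not taken from the PR:

```go
package main

import (
	"io"
	"log"
	"net"

	istiometadata "example.com/go-metadata" // hypothetical import path
)

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func(conn net.Conn) {
			defer conn.Close()
			// Ask the MDS who is on the other end of this connection.
			si, err := istiometadata.FetchFromClientConnection(conn)
			if err != nil {
				log.Printf("metadata lookup failed: %v", err)
				return
			}
			log.Printf("peer identity: %s", si.Identity) // field name assumed
			// Echo everything back to the client.
			if _, err := io.Copy(conn, conn); err != nil {
				log.Printf("echo failed: %v", err)
			}
		}(conn)
	}
}
```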
```go
}

// metadataContextKey is the key the metadata handler will store connection metadata in
var metadataContextKey = &contextKey{}
```
As an API, I'm not a big fan of using the context as a hash map and forcing users into this pattern.
Better to have a clean interface and let users wrap or save the info however they want. Most likely this will be integrated into another library (otel, authz).
I generally agree, but I think this is the only way to do it as an HTTP handler? Unless you do the magic to look up the TCP connection for the request and then add caching yourself, which would just be duplicating this, I think.
Not sure - see my other comment below. In a handler the user has access to the peer and local IPs, and can make the MDS call directly when needed.
I see what you mean - it's about avoiding making this call on each request.
How about we leave it out for now? Users can probably use the same pattern if they want; better to focus this library on TCP.
I agree that I don't love implicit context storage, but this is what go-grpc uses all over the place, so in this instance it's probably better to follow the same pattern.
There is a difference between a library and a framework. Yes, go-grpc and other frameworks follow this pattern - and it is not a bad pattern for a framework, where the user can get the peer identity in a consistent way from the context.
My point is that a library that interacts with ztunnel should not define those abstractions and implied requirements (chained for ALL requests, etc.). We will have other ways to get peer info - JWT, XFCC, etc. - so this library would be used by frameworks together with other auth mechanisms, and in the end expose the context data specific to the framework in use.
```go
// ExtractFromRequest attempts to extract ConnectionMetadata from a request.
// This requires Handler to be used.
// If a connection is re-used between requests, only a single call will be made.
func ExtractFromRequest(r *http.Request) *ConnectionMetadata {
```
That's what I don't like about this pattern - the user has to call ExtractFromRequest (so they have a dependency on this library anyway), but the Handler is called on ALL requests. If the user only needs peer info for some requests, it's more complicated to set up (wrap only some requests with the handler chain), or not possible if the lookup is based on some other logic (like checking XFF or other standard headers).
Instead, this method could just do the lookup on demand, with no wrapper needed.
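By contrast, a minimal sketch of the on-demand shape being suggested here; `lookupConnectionMetadata` is hypothetical and stands in for an MDS call made with the addresses the handler already has:

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
)

type ConnectionMetadata struct {
	Identity string `json:"identity"`
}

// lookupConnectionMetadata is hypothetical: it would call the MDS with this
// request's peer and local addresses, only when the handler actually needs it.
func lookupConnectionMetadata(remoteAddr, localAddr string) (*ConnectionMetadata, error) {
	return &ConnectionMetadata{}, nil // stand-in for the real MDS HTTP call
}

func whoAmI(w http.ResponseWriter, r *http.Request) {
	// http.Server stores the local address of the connection in the request context.
	local, _ := r.Context().Value(http.LocalAddrContextKey).(net.Addr)
	md, err := lookupConnectionMetadata(r.RemoteAddr, local.String())
	if err != nil {
		http.Error(w, "metadata lookup failed", http.StatusInternalServerError)
		return
	}
	// Only this route pays for the lookup; no middleware wraps other routes.
	fmt.Fprintf(w, "peer identity: %s\n", md.Identity)
}

func main() {
	http.HandleFunc("/whoami", whoAmI)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```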
src/proxy/metadata.rs
```rust
    .parse()?,
);
if remote.ip() != dst.ip() && remote.ip() != src.ip() {
    anyhow::bail!("metadata server request must come from the src or dst address")
```
Not sure this is guaranteed - for example, if the connection is IPv6, I'm pretty sure it won't be the case (dual stack, while we use an MDS v4 address).
Can we keep a pointer to the dest and source pods and check their IPs?
Or just add a TODO and file a bug - we can do this in a later phase.
Ran into this myself with localhost testing, since we don't use 127.0.0.1 to connect to MDS. I put a TODO. I think we can handle it once the workload API supports dual stack (it currently doesn't).
```rust
    orig_dst_addr: SocketAddr,
    block_passthrough: bool,
) -> Result<(), Error> {
    let remote_ip = remote_addr.ip();
    if orig_dst_addr.ip() == METADATA_SERVER_IP {
```
That's OK - but it may also be good to have MDS listen on a separate port (150xx) and let CNI handle the redirection (or, in whitebox mode, use the Service). Cleaner.
But it's good to keep this code too.
> have MDS listen on a separate port (150xx) and let CNI handle the redirection

+1 on this. I don't love the reliance on a hardcoded IP here, and would prefer a metadata port (like prometheus, xds, etc.) so we don't need to rely on known-IP keying in ztunnel proper.
For the record - changed my mind after further thinking. See my other comments on why it's best for ztunnel to handle all redirection based on destination IP.
This is also something we can potentially revisit later if the ztunnel/CNI integration gets tighter/more involved, but I'm OK punting on it now and going with this impl.
Most eBPF CNIs will need to do something like this (track connections in a shared map), and it will work almost exactly the same way (a map of connections, just maintained in kernelspace as an eBPF hash map). I understand that ztunnel will need to work with things that are not just eBPF, but I wonder if this wouldn't be better implemented (eventually) as a BPF conntrack map that ztunnel can manage/query from userspace.
Sorry, I'm missing how this would benefit users, since we said we want to keep ztunnel small and lean. MDS - is this the Google Cloud metadata service or something more generic?
It has nothing to do with Google - this allows user applications to find the peer identity. Today they can do this with the XFCC header, but that is HTTP-based, which we cannot do in ztunnel. This provides a generalized way. Note this is needed for sandwiching as well.
Thanks. BTW, what does MDS stand for? I thought we aren't using the sandwich approach at all?
The pattern may be fine in some cases. That's not my concern - this library should not be a framework; it should just expose the raw command. It will likely be combined with checks for XFCC, ProxyProtocol, etc. - there are plenty of auth schemes an app may support, and plenty of frameworks that set the identity in some way.
MDS is 'metadata server', AFAIK - all cloud vendors have one. Not sure who was first.
This has little to do with sandwiching - any user application may need to know the identity of the peer. For example, we may have authz that allows Alice to talk to Bob - but Bob still wants to know who's talking to it.
```rust
    orig_dst_addr: SocketAddr,
    block_passthrough: bool,
) -> Result<(), Error> {
    let remote_ip = remote_addr.ip();
    if orig_dst_addr.ip() == METADATA_SERVER_IP {
```
Would prefer this not be intercepted to ztunnel.
I am not sure how we can opt out, since ztunnel will capture all traffic.
Interesting - will ztunnel break the platform MDS? I think the CNI probably needs to exclude the range (most platforms seem to be using the same /24).
But ztunnel MDS itself needs to be intercepted - how else can the pod communicate with it?
Hmm, we did fix platform MDS somehow, and have a test for it. Let me see how we did it.
```go
type ConnectionMetadata struct {
    // Identity provides the identity of the peer.
    // Example: spiffe://cluster.local/ns/a/sa/b.
    Identity string `json:"identity"`
```
Can we open a tracking issue to capture the other information about the identity that we would like to expose here?
Over time I would expect to see us add either a field or a new API that provides details from the public key of the peer.
I think the interesting question is what we are trying to prove to the caller about the veracity of the identity we just provided. Options include:
1. How can the caller know that the MDS is reliably acting on its behalf? Clearly there's a certain amount of infrastructural trust, seeing as this is coming from the 'network' as perceived by the app. One option here is for the MDS to offer a "who am I" API. This is a bit like SDS, but now it's an implicit part of the infrastructure. The response to the call should be either the current public cert or some shorthand of it. Variant: prove that the infra has the private key (which is shielded from the app) by signing some random bytes passed to the API, which can then be verified with the public key. We likely want to deliver a shorthand in addition to the x509 to shield users from parsing its gory details.
2. How can the caller know that the infrastructure actually performed a secure handshake with the identified peer? This is harder and of debatable value. If we provide proof that the infrastructure already has the public and private keys as described in 1 above, then the infra is already capable of impersonating the application for whatever purposes it sees fit. We could do something like a 'signed ping' (a concrete sketch follows below):
   - The client generates random bytes and encrypts them with the public key of the peer (opaque to the local ztunnel).
   - The client app calls 'secure ping', passing the encrypted bytes.
   - The local ztunnel simply forwards the encrypted bytes to the identified peer.
   - The peer ztunnel decrypts the random bytes with the private key and returns their signature.
   - The local ztunnel forwards the message signature to the client.
   - The client compares the message signature to a locally computed hash.

This proves that there is something between the client and the peer that has access to the private key associated with the identity. The client doesn't necessarily know who or what that is.
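To make the 'signed ping' exchange above concrete, here is a minimal, self-contained sketch using plain crypto/rsa. The key handling is a stand-in (in the real proposal the private key lives only in the peer's ztunnel), and no such API exists in this PR:

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
	"log"
)

func main() {
	// Stand-in for the peer's key pair; in the proposed flow the private key
	// is held by the peer's ztunnel and never reaches either application.
	peerKey, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		log.Fatal(err)
	}

	// Client: generate random bytes and encrypt them with the peer's public
	// key, so the challenge is opaque to the local ztunnel.
	challenge := make([]byte, 32)
	if _, err := rand.Read(challenge); err != nil {
		log.Fatal(err)
	}
	sealed, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, &peerKey.PublicKey, challenge, nil)
	if err != nil {
		log.Fatal(err)
	}

	// Peer ztunnel: decrypt the random bytes with the private key and return
	// a signature over their hash.
	opened, err := rsa.DecryptOAEP(sha256.New(), rand.Reader, peerKey, sealed, nil)
	if err != nil {
		log.Fatal(err)
	}
	digest := sha256.Sum256(opened)
	sig, err := rsa.SignPKCS1v15(rand.Reader, peerKey, crypto.SHA256, digest[:])
	if err != nil {
		log.Fatal(err)
	}

	// Client: verify the signature against the locally computed hash of the
	// original challenge.
	want := sha256.Sum256(challenge)
	err = rsa.VerifyPKCS1v15(&peerKey.PublicKey, crypto.SHA256, want[:], sig)
	fmt.Println("peer proved possession of the private key:", err == nil)
}
```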
> Over time I would expect to see us add either a field or a new API that provides details from the public key of the peer

+1, ideally one that follows XFCC semantics and chains the certs of all workloads in the pathway, including the ztunnel instance.

> One option here is for the MDS to offer a "who am I" API. This is a bit like SDS, but now it's an implicit part of the infrastructure. The response to the call should be either the current public cert or some shorthand of it.

This is more or less what SPIRE already does, it's worth pointing out.
XFCC is a pretty horrible UX - you need a parser library, you have to parse certs, etc. It may be useful for some super-advanced users, but if they are willing to take all those dependencies they may be better off just using proxyless and having end-to-end mTLS.
All platforms I know of (not only Google) have a basic, simple MDS server that returns metadata and tokens in a very simple way.
I'm not saying fancy attestation is an invalid use case - and it is clear that cloud vendors don't provide this right now, so it's not a bad idea if someone implemented it - but we should not mix it with the very simple use case for this PR.
Thanks - thinking out loud here - could this be something opted into, so it doesn't impact users who don't use MDS?
This doesn't impact anyone if they don't use it; they have to explicitly call the metadata server for it to do anything.
```rust
/// METADATA_SERVER_IP provides the well-known metadata server IP.
/// This is captured by the redirection.
pub const METADATA_SERVER_IP: IpAddr = IpAddr::V4(Ipv4Addr::new(169, 254, 169, 111));
```
Can we just hardcode a specific metadata server IP in OSS ztunnel like this?
If it's "well-known", to whom is it well-known?
I mean, we can - the IP is irrelevant, other than that the client and server need to agree on it and it should not conflict with other IPs users want to use for other purposes.

> and would prefer a metadata port (like prometheus, xds, etc.) so we don't need to rely on known-IP keying in ztunnel proper

What is better about a special port instead of a special IP? Or did you want a special IP + a special port (which seems good)?
I guess I'm not clear on why ztunnel needs a special IP at all?
If we can redirect any pod outbound destined for <whatever IP you want> to <whatever ztunnel you want> and <whatever ztunnel port we want> outside of ztunnel itself, why does ztunnel need to match a well-known IP? That could be a pretty simple CNI rule.
Is it any simpler to have it in the CNI layer instead of in ztunnel, though?
In general there is a tension over what goes at which layer. For example, we could skip ztunnel entirely in cases where we know it won't do anything but pass through.
So far we have leaned towards keeping the CNI simpler, so there is a better chance we can replace it with better approaches (and Rust is easier than iptables/eBPF).
> Is it any simpler to have it in the CNI layer instead of in ztunnel, though?

Long term I think so, since the CNI (ours or someone else's) will be handling/redirecting the traffic to listeners within the ztunnel netns based on intended dest anyway, and we already have the concept of directing packets to different well-known ztunnel listener ports and handling them differently as a result (for purposes of early exit / handling bypass in eBPF/iptables redirect logic, for instance).

> So far we have leaned towards keeping the CNI simpler, so there is a better chance we can replace it with better approaches (and Rust is easier than iptables/eBPF).

Absolutely a fan of keeping the CNI simpler, but sending captured traffic to a given ztunnel listener based on intended dest is what CNI is for - that's what it does best. It's how we do DNS redir, for instance. CNI redir already does special-casing around certain dest ports (or whether src is the host IP, etc.) for purposes of data-path optimization - it's already handling this stuff, and adding this would probably be a single iptables rule.
It feels logically consistent to maintain that pattern here, and it has the added benefit of making it very easy for the IP to be changed or redefined however the network owner wants, without ztunnel caring (or us needing to add a flag/envvar to ztunnel to let people define what they want).
I strongly disagree - routing traffic based on dst is what ztunnel does.
The CNI is responsible for redirecting pod egress to ztunnel - it does not know where to send traffic (unless ztunnel and WDS/PTR-DS are integrated into the CNI). Knowing where to redirect an IP, and how, is based on Istiod configs.
And if we want at some point to have 'redirect all egress from pod to daemonset' shared by all mesh implementations and all CNI providers, we need to keep the requirements on the CNI layer as small as possible.
We may add extra requirements - like excluding ports for proxyless - but getting into 'route IP address X in a different way' is too much.
> I strongly disagree - routing traffic based on dst is what ztunnel does. The CNI is responsible for redirecting pod egress to ztunnel - it does not know where to send traffic (unless ztunnel and WDS/PTR-DS are integrated into the CNI). Knowing where to redirect an IP, and how, is based on Istiod configs.

CNI knows which ztunnel listener to redirect traffic to, and ztunnel knows which dest to send traffic to - correct. Using CNI to redirect specific MDS traffic to a specific ztunnel listener would be the former, which is definitely what CNI is responsible for doing today.

> And if we want at some point to have 'redirect all egress from pod to daemonset' shared by all mesh implementations and all CNI providers, we need to keep the requirements on the CNI layer as small as possible.

Valid. I'm not sure that's a realistic or practical goal, given that no two CNIs are really the same and we want to keep ztunnel from being a kitchen sink while also tightly integrating at the L4 layer, but we can live in hope.

> We may add extra requirements - like excluding ports for proxyless - but getting into 'route IP address X in a different way' is too much.

It doesn't strike me as significantly different from 'route IP port X in a different way', which CNI already does and will always do for e.g. DNS, but I'm good with this approach for now.
The reason for the hardcoded IP is pretty simple: in an app, how do you make the HTTP request? You need a DNS name or an IP. Many MDS servers are also DNS servers - there is a plan to allow running a DNS proxy in ztunnel. Also, what hostname would you use - and how would the CNI know that some VIP is actually the MDS address?
All MDS servers I know of rely on a hardcoded 'well known' address in that range. We can also use a special 150xx port, but if we want ztunnel to be a resolver it'll need to listen on 53 (yes, we can do some complicated redirection to use a different port for DNS - but why not use the simplest solution?). If we are using this pattern, I would like to see a good reason why we should follow a different pattern, and what that would be - I can't think of a simpler or cleaner solution.
Also, this whole thing doesn't need fancy iptables or eBPF or k8s - the CNI can simply add a route and a second address with the well-known IP on the ztunnel pod. It will also work on VMs and non-K8s environments - even in cases where no iptables are possible (ztunnel can work without any CNI or iptables, by setting the SOCKS_PROXY env variable with most applications).
> Can we open a tracking issue to capture the other information about the identity that we would like to expose here? Over time I would expect to see us add either a field or a new API that provides details from the public key of the peer.

And probably some of the telemetry or labels and other info we got from the PTR-DS. I agree that in time we'll likely add more, but the first step is to get identity in and validate it.
I would also expose a lookup using only the peer IP - the 4-tuple is only really needed if ztunnel can't impersonate the original source IP.

> How can the caller know that the MDS is reliably acting on its behalf? [...]

I think this may be overkill and defeat the purpose. If the app is willing to do all this - link libraries and deal with certs - it is probably better off just using a proxyless library. With sidecars they don't get any guarantee either, nor with most CNIs.
It's also not the caller that needs to be assured - what is it going to do? The network admin, who installs the CNI and ztunnel, needs to be sure all packets go to ztunnel, but that can be exposed using telemetry.

> How can the caller know that the infrastructure actually performed a secure handshake with the identified peer? [...] We could do something like a 'signed ping'.

That seems to be for super-advanced users - it may be something for a specialized ztunnel to support, for specialized apps with libraries that can take advantage of it, but maybe not something to have in OSS and the simple/clean implementation we have? It is a cool idea.
I'm suggesting that if you have a need to map an arbitrary environmental IP of your choice to always get redirected to a listener in ztunnel, it's pretty simple to do that with a single iptables rule outside of ztunnel, by adding another rule to the ones we already have that are functionally similar, without ztunnel needing to care what that IP is, or needing to hardcode the IP in ztunnel. We could pick a fixed IP in OSS; anyone who wanted to do something different could change it by changing the rules outside ztunnel, and ztunnel wouldn't care either way.
ztunnel already listens on 15053 for this purpose, I believe. We have iptables rules that rewrite UDP on port 53 to 15053 within ztunnel's netns on pod redirect. ztunnel doesn't care that this translation happened, or (for now) what the original dest IP was; it just treats traffic on port 15053 as DNS proxy traffic. I'm suggesting we do the same thing for MDS here.
I'm proposing we use the pattern already in use for all ambient ztunnel redirection - ztunnel exposes/hardcodes well-known ports, and CNI redirects traffic, based on (variously) dest IP, dest port, and packet type, to those ports - for this as well. A fixed IP within ztunnel isn't a requirement for this, and it feels weird to add it here when nothing else currently in ztunnel works like this, and there's nothing really preventing it from being implemented exactly like all the existing pod->ztunnel redirections that rely on the CNI to map traffic to fixed ztunnel listening ports.
> I'm suggesting if you have a need to map an arbitrary environmental IP of your choice to always get redirected to a listener in ztunnel, it's pretty simple to do that with a single iptables rule without ztunnel needing to care what that IP is, or needing to hardcode it in ztunnel. We could pick a fixed IP; anyone who wanted to do something different could change it, and ztunnel wouldn't care either way.

I see - you are right, that would work too. However, since ztunnel is intercepting all egress anyway, and istio-cni is extremely complex and will hopefully become part of the native CNI, I would rather keep it in ztunnel and simplify the requirement on the CNI to 'just redirect all egress from the app to ztunnel'.

> MDS clients do; I'm not sure the MDS server being added to ztunnel does.
> ztunnel already listens on 15053 for this purpose, I believe. We have iptables rules that rewrite UDP on port 53 to 15053 within ztunnel's netns.

Unless someone added DNS, I think the agent does that. There is an option to also listen on 53 and not require iptables rules. Just because it is possible to use iptables doesn't mean we should (the rules for 53 require raw sockets and some very ugly stuff - UDP is far more complicated).

> I'm proposing we use the pattern already in use - ztunnel exposes well-known ports, and CNI redirects traffic, based on (variously) dest IP, dest port, and packet type, to those ports - for this as well. A fixed IP within ztunnel isn't a requirement for this, and it feels weird to add it here when nothing else currently in ztunnel works like that.

Another pattern already in use is that the CNI redirects all traffic, and ztunnel routes it wherever it needs to go. I would rather reduce complexity in the CNI. I also mentioned we have the SOCKS mode, which is independent of iptables (works without any) - with the current method it would work just fine. And if we ever add an HTTP/1.1 CONNECT proxy (for the few apps that support HTTPS_PROXY and not SOCKS), it would work too.
```go
u := metadataServerURL()
params := url.Values{}
params.Add("src", src)
params.Add("dst", dst)
```
Getting connection metadata is very useful - as Louis mentioned, we may extend this to return all the details about the connection, including labels and more handshake info. And since we return it as JSON, we may as well consider an MDS endpoint that also signs it.
But I think the use case for passing both src and dst is primarily for dealing with CNI implementations where the source IP can't be preserved. Calico, for example, seems to support that, and hopefully we can do it for more CNIs.
In the case of using ztunnel with a 'best practice' CNI, we may want to support metadata lookup using only the source IP (which would return data from the PTR-DS/WDS).
If ztunnel doesn't preserve the source IP, I think the lookup result should include the original source IP in addition to the peer identity. And maybe all the metadata lookup info too.
(All this in additional PRs, of course - not intending to hold this PR until everything is added.)
* Store each connection in a shared map
* Capture the special IP address and redirect it to the MDS handler
* MDS handler takes a 4-tuple as input. TODO: should probably validate that the caller matches at least one of the IPs?
* Response is JSON containing the identity

Also has an example Golang library implementation to add HTTP middleware that extracts the identity.
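For a concrete picture of the client side, a minimal sketch of the lookup flow described above. Only the well-known IP (169.254.169.111), the src/dst query parameters, and the JSON identity field come from this PR; the "/connection" path is an assumption (the PR resolves the URL via metadataServerURL()):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"time"
)

// ConnectionMetadata mirrors the struct in the PR's go-metadata library.
type ConnectionMetadata struct {
	Identity string `json:"identity"` // e.g. spiffe://cluster.local/ns/a/sa/b
}

// fetchMetadata asks the MDS who is on the other side of the connection
// identified by the src/dst halves of the 4-tuple.
func fetchMetadata(src, dst string) (*ConnectionMetadata, error) {
	params := url.Values{}
	params.Add("src", src) // local address:port of the connection
	params.Add("dst", dst) // remote address:port of the connection
	client := http.Client{Timeout: time.Second}
	// The "/connection" path is hypothetical; see metadataServerURL() in the PR.
	resp, err := client.Get("http://169.254.169.111/connection?" + params.Encode())
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	md := &ConnectionMetadata{}
	if err := json.NewDecoder(resp.Body).Decode(md); err != nil {
		return nil, err
	}
	return md, nil
}

func main() {
	md, err := fetchMetadata("10.0.0.1:43210", "10.0.0.2:8080")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("peer identity:", md.Identity)
}
```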
PR needs rebase.
Looking back at this: e.g., if I was using SPIRE with Istio as the workload CA, as documented by us, and as a workload/pod sidecar I asked SPIRE for my identity and also asked ztunnel for my (connection) identity, I would/should get the same x509 cert in both cases.
Let's drop this for now to keep things minimal.
Istio tests: istio/istio#44536 (blocked by this PR, of course)