GCE metadata name doesn't resolve in container #8512
In v0.15.0 on GKE, containers had access to the GCE metadata service at http://metadata. On a v0.17.0 cluster, the metadata name does not resolve inside containers. The link-local address (169.254.169.254) works fine in the container, and qualifying the name as metadata.google.internal also works. A number of tools rely on the metadata name resolving to access service accounts, etc. The name resolves fine on the GCE host.
Comments
/cc @thockin
FWIW http://metadata/ is hardcoded into the Google API client for Java (haven't checked other languages), so any app using that and relying on service accounts is impacted.
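For reference, what those clients do is roughly the following (a hand-written sketch, not the client source; the header and token path are the standard v1 metadata conventions):

$ curl -H "Metadata-Flavor: Google" \
    "http://metadata/computeMetadata/v1/instance/service-accounts/default/token"
# When the short name doesn't resolve, the fully qualified form still works:
$ curl -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"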
I can get to metadata from my current cluster (which is about 5 days old).
@thockin this is a GKE cluster I created today. Just created another ~2 minutes ago, same result.

Creating the GKE cluster:

CLUSTER_NAME=mdtester
NUM_NODES=1
MACHINE_TYPE=g1-small
API_VERSION=0.17.0
gcloud alpha container clusters create ${CLUSTER_NAME} \
  --num-nodes ${NUM_NODES} \
  --machine-type ${MACHINE_TYPE} \
  --cluster-api-version ${API_VERSION} \
  --scopes "https://www.googleapis.com/auth/devstorage.full_control" \
    "https://www.googleapis.com/auth/projecthosting"

Tools on my workstation:

$ gcloud components update preview
All components are up to date.
$ kubectl version
Client Version: version.Info{Major:"0", Minor:"16", GitVersion:"v0.16.1", GitCommit:"b933dda5369043161fa7c7330dfcfbc4624d40e6", GitTreeState:"clean"}
Server Version: version.Info{Major:"0", Minor:"17", GitVersion:"v0.17.0", GitCommit:"82f8bdac06ddfacf493a9ed0fedc85f5ea62ebd5", GitTreeState:"clean"}

On node:

evanbrown@k8s-mdtester-node-1:~$ sudo bash
root@k8s-mdtester-node-1:/home/evanbrown# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d1de71de8b9a gcr.io/cloud-solutions-images/nginx-ssl-proxy:latest "/bin/sh -c ./start. 50 seconds ago Up 47 seconds k8s_nginx-ssl-proxy.22441003_nginx-ssl-proxy-7u7uh_default_9c82b564-fe8f-11e4-845f-42010af01d9a_b174af7a
f6a34bf6e2ee gcr.io/google_containers/pause:0.8.0 "/pause" About a minute ago Up About a minute k8s_POD.e4cc795_jenkins-agent-czbes_default_9daa9258-fe8f-11e4-845f-42010af01d9a_a96641ed
8324df574b26 gcr.io/google_containers/pause:0.8.0 "/pause" About a minute ago Up About a minute k8s_POD.b72e170c_jenkins-leader-gnebq_default_9d16d691-fe8f-11e4-845f-42010af01d9a_123c0387
10d8e2e6e6a8 gcr.io/google_containers/pause:0.8.0 "/pause" About a minute ago Up About a minute k8s_POD.605b1ae5_nginx-ssl-proxy-7u7uh_default_9c82b564-fe8f-11e4-845f-42010af01d9a_3d84c97c
b5e81c0e3a3a gcr.io/google_containers/skydns:2015-03-11-001 "/skydns -machines=h About a minute ago Up About a minute k8s_skydns.cd47cc92_kube-dns-lmn6f_default_776bdb99-fe8f-11e4-82e2-42010af01d9a_52f85030
c7124b4dac50 gcr.io/google_containers/kube2sky:1.4 "/kube2sky -domain=k About a minute ago Up About a minute k8s_kube2sky.87b54c53_kube-dns-lmn6f_default_776bdb99-fe8f-11e4-82e2-42010af01d9a_71c23a7f
bbf319e174d0 gcr.io/google_containers/etcd:2.0.9 "/usr/local/bin/etcd About a minute ago Up About a minute k8s_etcd.13594ee7_kube-dns-lmn6f_default_776bdb99-fe8f-11e4-82e2-42010af01d9a_ba9cfac8
22ce1c0b0c8c gcr.io/google_containers/pause:0.8.0 "/pause" About a minute ago Up About a minute k8s_POD.8fdb0e41_kube-dns-lmn6f_default_776bdb99-fe8f-11e4-82e2-42010af01d9a_11a56c63
26949dfb38f7 gcr.io/google_containers/pause:0.8.0 "/pause" 2 minutes ago Up 2 minutes k8s_POD.e4cc795_fluentd-cloud-logging-k8s-mdtester-node-1_default_e7a73ce4931dc175ddc463501188a765_00ec948d
root@k8s-mdtester-node-1:/home/evanbrown# docker exec -ti d1de71de8b9a sh
# ping metadata
ping: unknown host
# ping metadata.google.internal
PING metadata.google.internal (169.254.169.254): 48 data bytes
56 bytes from 169.254.169.254: icmp_seq=0 ttl=254 time=0.335 ms
56 bytes from 169.254.169.254: icmp_seq=1 ttl=254 time=1.369 ms
I tried this on a GKE cluster that I created today. Have you tried this from within an image other than gcr.io/cloud-solutions-images/nginx-ssl-proxy? Can you cat /etc/resolv.conf from within the container?
Ahh, ubuntu and busybox work. Two images I've been using on GKE 0.15.0 are breaking. Frak. I'll spin up a 0.15.0 cluster locally and confirm they work there (they were working on 0.15.0 on GKE early last week). /etc/resolv.conf is:
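Reconstructed from the HostConfig dump below (the actual file on the broken cluster may have differed slightly):

# /etc/resolv.conf as Docker writes it from the Dns/DnsSearch settings below
nameserver 10.195.240.10
nameserver 169.254.169.254
nameserver 10.240.0.1
search default.kubernetes.local default.svc.kubernetes.local svc.kubernetes.local kubernetes.local c.your-new-project.internal. 297725298471.google.internal. google.internal.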
Is one of those your actual DNS service? The search path looks plausible.
I haven't touched the DNS config. The relevant bits of the container's HostConfig from docker inspect:

"HostConfig": {
...
"Dns": [
"10.195.240.10",
"169.254.169.254",
"10.240.0.1"
],
"DnsSearch": [
"default.kubernetes.local",
"default.svc.kubernetes.local",
"svc.kubernetes.local",
"kubernetes.local",
"c.your-new-project.internal.",
"297725298471.google.internal.",
"google.internal."
],
...
}

This is what I used to create the controller (the image is public):

{
"kind": "ReplicationController",
"apiVersion": "v1beta3",
"metadata": {
"name": "nginx-ssl-proxy",
"labels": {
"name": "nginx", "role": "ssl-proxy"
}
},
"spec": {
"replicas": 1,
"selector": {
"name": "nginx", "role": "ssl-proxy"
},
"template": {
"metadata": {
"name": "nginx-ssl-proxy",
"labels": {
"name": "nginx", "role": "ssl-proxy"
}
},
"spec": {
"containers": [
{
"name": "nginx-ssl-proxy",
"image": "gcr.io/cloud-solutions-images/nginx-ssl-proxy:latest",
"env": [
{
"name": "SERVICE_HOST_ENV_NAME",
"value": "JENKINS_SERVICE_HOST"
},
{
"name": "SERVICE_PORT_ENV_NAME",
"value": "JENKINS_SERVICE_PORT_UI"
},
{
"name": "ENABLE_SSL",
"value": "false"
},
{
"name": "ENABLE_BASIC_AUTH",
"value": "true"
}
],
"ports": [
{
"name": "nginx-ssl-proxy-http",
"containerPort": 80
},
{
"name": "nginx-ssl-proxy-https",
"containerPort": 443
}
],
"volumeMounts": [{
"name": "secrets",
"mountPath": "/etc/secrets",
"readOnly": true
}]
}
],
"volumes": [{
"name": "secrets",
"secret": {
"secretName": "ssl-proxy-secret"
}
}]
}
}
}
}
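Presumably submitted with something like the following (the filename here is a placeholder):

$ kubectl create -f nginx-ssl-proxy-rc.json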
I'm not sure what to say. Something about that image is not using DNS the same way as the others.
Cool, I'll keep diggin'. Thanks for the help thus far.
On a GKE cluster at API version 0.17.0 this is easy to repro:
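The steps look much like the session earlier in this thread:

$ sudo docker exec -ti <container-id> sh    # a container from an affected image
# ping metadata
ping: unknown host
# ping metadata.google.internal
PING metadata.google.internal (169.254.169.254): 48 data bytes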
@evandbrown This was indeed introduced in v0.17.0. resolv.conf only allows a fixed number of search paths, generally six. In v0.17.0 we introduced two new search domains, "default.svc.kubernetes.local" and "svc.kubernetes.local". This pushed the google.internal. search path out to seventh place, causing it to be silently ignored.
In terms of fixes, we would probably have to find a way to reduce the number of search domains.
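Concretely, glibc's resolver reads at most six search domains (MAXDNSRCH in resolv.h), so in the resolv.conf above the last entry never takes effect:

# Entries 1-6 are used; the 7th (google.internal.) is silently dropped,
# so the short name "metadata" is never expanded to metadata.google.internal.
search default.kubernetes.local default.svc.kubernetes.local svc.kubernetes.local kubernetes.local c.your-new-project.internal. 297725298471.google.internal. google.internal.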
This seems bad. Can we get this fixed asap?
I'll discuss options with @vishh today. This will make the DNS conversion harder.
and can we add a simple e2e so we don't break this again in the future? (and @roberthbailey this is also prob. worth cutting a 0.17.1 for)
Yeah, we can add 'metadata' to the list of URLs we resolve in e2e.
Agreed that this is worth 0.17.1. Bummer.
Q OK'ed a rollback. I'm on it.
#8568 - can I get a quick review?
And I beg for a couple hours' grace to see if we can fix it before cutting 0.17.1.
cc/ @dchen1107
I am working on extending e2e to cover this failure.
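A minimal version of that check might look like the following (a sketch, not the actual e2e code; dns-test is a hypothetical pod running a glibc-based image, since busybox masked the bug above):

# Both lookups should succeed; a non-zero exit code fails the test.
$ kubectl exec dns-test -- getent hosts metadata
$ kubectl exec dns-test -- getent hosts metadata.google.internal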
we can get the e2e fix in without the rest, good idea
closing this issue.
@evandbrown Which Google API client? We don't want containers talking to the host, in general.
@bgrant0607 The Java API client in my case, but it generally affected any client that tried to resolve the metadata name. All fixed and working great in 0.17.1. What about container->host communication, though?
Filed #8990. Container->host communication should require special permission. Is this the client? |
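To illustrate what "special permission" could mean, a host-level rule along these lines would cut containers off from the metadata service by default (a hypothetical sketch, not something Kubernetes does today):

# Drop traffic from the Docker bridge to the GCE metadata address.
$ sudo iptables -I FORWARD -i docker0 -d 169.254.169.254 -j DROP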
It has to be possible to call Google APIs from outside of GCE. Presumably, you just need to provide the credentials.
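For what it's worth, the usual pattern off-GCE is a downloaded service-account key plus the GOOGLE_APPLICATION_CREDENTIALS environment variable (a sketch; the key path and app name are placeholders):

# Client libraries that support Application Default Credentials check this
# variable before falling back to the metadata server.
$ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
$ ./your-app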
It's definitely possible, but creating a service account (or AWS IAM access/secret key) and distributing/revoking/rotating that credential is a huge pain. The metadata service is the de facto standard for distributing short-lived creds to apps running on EC2 (IAM roles) or GCE (scoped compute service accounts), and SDKs from both support this very well. The GCE and EC2 metadata services would need to add something similar to namespaces to help here.
It's a real balancing act. On the one hand we don't want people to couple themselves too tightly to the underlying cloud, and yet we can't really shut down access to everything outside your k8s cluster. We probably need a way to offer GCE's metadata at reduced scope to containers.