Pod stuck in "ContainerCreating" status in K8 v.14.3 #96855

heartTorres · 2020-11-25T09:04:51Z

Kubernetes v1.14.3
Amazon Web Services

What happened:
I have a StorageClass defined. The pod is stuck in "ContainerCreating" because the volume cannot be attached, and the logs show an "instance not found" error.

What you expected to happen:
The volume should attach successfully and the pod status should be "Running".

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version): kubernetes v1.14.3
Cloud provider or hardware configuration: AWS
OS (e.g: cat /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Network plugin and version (if this is a network-related bug):
Others:

k8s-ci-robot · 2020-11-25T09:05:00Z

@heartTorres: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pacoxu · 2020-11-25T10:24:08Z

/sig cloud-provider

Similar to #77920; @slconley ran into it recently (#77920 (comment)).

Some comments in the code:

kubernetes/staging/src/k8s.io/legacy-cloud-providers/aws/aws.go

Lines 2316 to 2327 in 48a0ef6

    
	// We want to fetch the hostname via the EC2 metadata service
	// (`GetMetadata("local-hostname")`): But see #11543 - we need to use
	// the EC2 API to get the privateDnsName in case of a private DNS zone
	// e.g. mydomain.io, because the metadata service returns the wrong
	// hostname.  Once we're doing that, we might as well get all our
	// information from the instance returned by the EC2 API - it is a
	// single API call to get all the information, and it means we don't
	// have two code paths.
	instance, err := c.getInstanceByID(instanceID)
	if err != nil {
		return nil, fmt.Errorf("error finding instance %s: %q", instanceID, err)
	}
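
To make the failure path in that snippet concrete, here is a minimal, self-contained sketch of how a lookup by instance ID can surface an "instance not found" error wrapped in the same message format. Everything except the `fmt.Errorf` format string is an assumption for illustration: `fakeEC2`, `instance`, `nodeNameFor`, `errInstanceNotFound`, and the instance ID and DNS name are made up and are not part of the real provider code.

```go
package main

import (
	"errors"
	"fmt"
)

// errInstanceNotFound is a hypothetical sentinel error used only in this sketch.
var errInstanceNotFound = errors.New("instance not found")

// instance is a simplified stand-in for the instance description
// returned by the EC2 API (hypothetical type).
type instance struct {
	ID             string
	PrivateDNSName string
}

// fakeEC2 is a hypothetical in-memory replacement for the EC2 API,
// keyed by instance ID.
type fakeEC2 map[string]instance

// getInstanceByID returns the instance with the given ID, or
// errInstanceNotFound if the fake API does not know about it.
func (f fakeEC2) getInstanceByID(id string) (*instance, error) {
	inst, ok := f[id]
	if !ok {
		return nil, errInstanceNotFound
	}
	return &inst, nil
}

// nodeNameFor resolves the private DNS name through the API lookup
// and formats a lookup failure the same way as the quoted snippet.
func nodeNameFor(api fakeEC2, instanceID string) (string, error) {
	inst, err := api.getInstanceByID(instanceID)
	if err != nil {
		return "", fmt.Errorf("error finding instance %s: %q", instanceID, err)
	}
	return inst.PrivateDNSName, nil
}

func main() {
	api := fakeEC2{
		"i-0123456789abcdef0": {
			ID:             "i-0123456789abcdef0",
			PrivateDNSName: "ip-10-0-0-12.ec2.internal",
		},
	}

	name, err := nodeNameFor(api, "i-0123456789abcdef0")
	fmt.Println(name, err)

	// An ID the API does not know about produces the wrapped message.
	_, err = nodeNameFor(api, "i-missing")
	fmt.Println(err) // error finding instance i-missing: "instance not found"
}
```

Note that the `%q` verb embeds the inner error as a quoted string, so a caller of this sketch sees the text "instance not found" in the message but cannot match the inner error with `errors.Is`.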

There's some analysis in https://dzone.com/articles/fixing-kubernetes-failedattachvolume-and-failed-mo#:~:text=The%20Warning%20FailedAttachVolume%20error%20occurs,cannot%20be%20attached%20to%20another.&text=You%20can%20see%20in%20the,attached%20to%20an%20existing%20node. According to it, AWS will do something like force-detaching the volume and scheduling it onto another EC2 instance later? Are you using EBS volumes?
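
As a rough illustration of the situation the linked article describes (a volume still attached to one node cannot be attached to another), here is a toy model in Go. It is only a sketch under the assumption of a single-attach volume; the `volume` type, its methods, and the IDs are hypothetical, and this is not how the AWS provider or Kubernetes implements attachment. It also may not be the cause of the "instance not found" error reported in this issue.

```go
package main

import "fmt"

// volume is a hypothetical, simplified model of a block volume that
// can be attached to at most one instance at a time.
type volume struct {
	ID         string
	AttachedTo string // empty when the volume is detached
}

// attach attaches the volume to instanceID. It fails if the volume is
// still attached to a different instance.
func (v *volume) attach(instanceID string) error {
	if v.AttachedTo != "" && v.AttachedTo != instanceID {
		return fmt.Errorf("volume %s is already attached to %s", v.ID, v.AttachedTo)
	}
	v.AttachedTo = instanceID
	return nil
}

// detach releases the volume so that another instance can attach it.
func (v *volume) detach() {
	v.AttachedTo = ""
}

func main() {
	vol := &volume{ID: "vol-example"}

	fmt.Println(vol.attach("node-a")) // first attach succeeds
	fmt.Println(vol.attach("node-b")) // fails while node-a holds the volume

	// Once the volume is detached, the second node can attach it.
	vol.detach()
	fmt.Println(vol.attach("node-b"))
}
```

In this model the pod on the second node would stay pending until the detach happens, which matches the symptom in the article.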

k8s-ci-robot · 2020-11-25T10:24:11Z

@pacoxu: The label(s) sig/aws cannot be applied, because the repository doesn't have them.

In response to this:

/sig aws
/sig cloud-provider

heartTorres · 2020-11-25T15:24:10Z

Hi, yes, I am using EBS volumes.

heartTorres added the kind/bug Categorizes issue or PR as related to a bug. label Nov 25, 2020

k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 25, 2020

heartTorres changed the title ~~Pod stuck in "ContainerCreating" status in K8 v.1.4.3~~ Pod stuck in "ContainerCreating" status in K8 v.14.3 Nov 25, 2020

k8s-ci-robot added the sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. label Nov 25, 2020

k8s-ci-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 25, 2020

heartTorres closed this as completed Dec 1, 2020
