Pod stuck in "ContainerCreating" status in K8 v.14.3 #96855

Closed
heartTorres opened this issue Nov 25, 2020 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider.

Comments

@heartTorres

heartTorres commented Nov 25, 2020

Kubernetes v1.14.3
Amazon Web Services

What happened:
I have a StorageClass defined. The pod is stuck in "ContainerCreating" because the volume cannot be attached, and the logs show an "instance not found" error.

[Screenshot: Screen Shot 2020-11-25 at 4 39 32 PM]

What you expected to happen:
Volume should be attached successfully and status of pod should be "Running"

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): kubernetes v1.14.3
  • Cloud provider or hardware configuration: AWS
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@heartTorres heartTorres added the kind/bug Categorizes issue or PR as related to a bug. label Nov 25, 2020
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 25, 2020
@k8s-ci-robot
Contributor

@heartTorres: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@heartTorres heartTorres changed the title Pod stuck in "ContainerCreating" status in K8 v.1.4.3 Pod stuck in "ContainerCreating" status in K8 v.14.3 Nov 25, 2020
@pacoxu
Member

pacoxu commented Nov 25, 2020

/sig cloud-provider

Similar to #77920; @slconley ran into it recently (#77920 (comment)).

Some comments in the code:

// We want to fetch the hostname via the EC2 metadata service
// (`GetMetadata("local-hostname")`): But see #11543 - we need to use
// the EC2 API to get the privateDnsName in case of a private DNS zone
// e.g. mydomain.io, because the metadata service returns the wrong
// hostname. Once we're doing that, we might as well get all our
// information from the instance returned by the EC2 API - it is a
// single API call to get all the information, and it means we don't
// have two code paths.
instance, err := c.getInstanceByID(instanceID)
if err != nil {
	return nil, fmt.Errorf("error finding instance %s: %q", instanceID, err)
}

There's some analysis in https://dzone.com/articles/fixing-kubernetes-failedattachvolume-and-failed-mo#:~:text=The%20Warning%20FailedAttachVolume%20error%20occurs,cannot%20be%20attached%20to%20another.&text=You%20can%20see%20in%20the,attached%20to%20an%20existing%20node. According to it, AWS will do something like force-detaching the volume and scheduling it to another EC2 instance later? Are you using EBS volumes?

@k8s-ci-robot k8s-ci-robot added the sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. label Nov 25, 2020
@k8s-ci-robot
Contributor

@pacoxu: The label(s) sig/aws cannot be applied, because the repository doesn't have them

In response to this:

/sig aws
/sig cloud-provider


@k8s-ci-robot k8s-ci-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 25, 2020
@heartTorres
Author

heartTorres commented Nov 25, 2020

Hi, yes, I am using EBS volumes.
