[CRI] Forcibly remove container #44326

Merged: 1 commit, May 16, 2017

xlgao-zju
Contributor

Forcibly remove running containers in RemoveContainer, since RemovePodSandbox is expected to forcibly remove any running containers. See here.
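
For readers less familiar with the semantics, the behavior described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code changed in this PR: `fakeRuntime`, `remove`, `errRunning`, and the state constants are hypothetical names, and the sketch assumes that a non-forced removal rejects a running container while a forced removal stops it first. A missing container is treated as already removed, so repeated calls succeed.

```go
package main

import (
	"errors"
	"fmt"
)

// containerState is a hypothetical, simplified container lifecycle state.
type containerState int

const (
	stateCreated containerState = iota
	stateRunning
	stateExited
)

// errRunning is returned when a running container is removed without force.
var errRunning = errors.New("container is running; stop it or use force")

// fakeRuntime is a hypothetical in-memory stand-in for a container runtime.
type fakeRuntime struct {
	containers map[string]containerState
}

func newFakeRuntime() *fakeRuntime {
	return &fakeRuntime{containers: make(map[string]containerState)}
}

// remove deletes the container with the given ID.
// Without force, a running container is rejected with errRunning.
// With force, a running container is stopped and then deleted.
// An unknown ID is treated as already removed (assumption: idempotent call).
func (r *fakeRuntime) remove(id string, force bool) error {
	state, ok := r.containers[id]
	if !ok {
		return nil
	}
	if state == stateRunning {
		if !force {
			return errRunning
		}
		// Forced path: stop the container before deleting it.
		r.containers[id] = stateExited
	}
	delete(r.containers, id)
	return nil
}

func main() {
	rt := newFakeRuntime()
	rt.containers["c1"] = stateRunning

	fmt.Println("without force:", rt.remove("c1", false))
	fmt.Println("with force:", rt.remove("c1", true))
	fmt.Println("already removed:", rt.remove("c1", true))
}
```

With the forced path, a caller that tears down a whole sandbox does not need to stop each container separately before removing it.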

cc @feiskyer @Random-Liu

Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 11, 2017
@k8s-reviewable

This change is Reviewable

@k8s-ci-robot
Contributor

Hi @xlgao-zju. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-label-needed labels Apr 11, 2017
@feiskyer
Member

/assign

@feiskyer
Member

@k8s-bot ok to test

@feiskyer
Member

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Apr 11, 2017
@xlgao-zju
Contributor Author

Ping @Random-Liu @feiskyer

@feiskyer
Member

LGTM

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 13, 2017
@xlgao-zju
Contributor Author

@resouer PTAL

@xlgao-zju
Contributor Author

Ping @Random-Liu

@feiskyer
Member

@Random-Liu Could you take a look at this PR? It needs your approval.

/assign @Random-Liu

@yujuhong
Contributor

yujuhong commented May 3, 2017

@k8s-bot kops aws e2e test this

@yujuhong yujuhong assigned yujuhong and unassigned resouer, Random-Liu and dchen1107 May 3, 2017
@yujuhong
Contributor

yujuhong commented May 3, 2017

/lgtm

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 3, 2017
@yujuhong
Contributor

yujuhong commented May 3, 2017

@k8s-bot gce etcd3 e2e test this

@xlgao-zju
Contributor Author

@feiskyer gce etcd3 e2e test failed again. :(

@yujuhong
Contributor

yujuhong commented May 4, 2017

At least two failures were caused by a node rebooting after a kernel panic with the same call trace. Let's hold this PR until we rule out the possibility that the panic was triggered by this change.
/do-not-merge

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/44326/pull-kubernetes-e2e-gce-etcd3/28421/
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/44326/pull-kubernetes-e2e-gce-etcd3/28421/artifacts/e2e-gce-agent-pr-105-0-minion-group-59hp/serial-1.log

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/44326/pull-kubernetes-e2e-gce-etcd3/28386/
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/44326/pull-kubernetes-e2e-gce-etcd3/28386/artifacts/e2e-gce-agent-pr-105-0-minion-group-3l63/serial-1.log

[  819.040151] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[  819.048644] IP: [<ffffffff810a1100>] check_preempt_wakeup+0xd0/0x1d0
[  819.055414] PGD 1aa36c067 PUD 1aab2d067 PMD 0 
[  819.060379] Oops: 0000 [#1] SMP 
[  819.063965] Modules linked in: nf_conntrack_netlink nfnetlink sg xt_statistic sch_htb ebt_ip ebtable_filter ebtables veth xt_nat xt_recent ipt_REJECT xt_mark xt_comment xt_tcpudp ipt_MASQUERADE iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype ip_tables xt_conntrack x_tables nf_nat nf_conntrack bridge stp llc aufs(C) nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper pvpanic evdev parport_pc parport psmouse ablk_helper i2c_piix4 i2c_core processor cryptd serio_raw pcspkr thermal_sys virtio_net button ext4 crc16 mbcache jbd2 sd_mod crc_t10dif crct10dif_common virtio_scsi scsi_mod virtio_pci virtio virtio_ring
[  819.139608] CPU: 1 PID: 24559 Comm: exe Tainted: G        WC    3.16.0-4-amd64 #1 Debian 3.16.39-1
[  819.148685] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  819.158023] task: ffff8801e80260d0 ti: ffff880212d08000 task.ti: ffff880212d08000
[  819.165617] RIP: 0010:[<ffffffff810a1100>]  [<ffffffff810a1100>] check_preempt_wakeup+0xd0/0x1d0
[  819.174647] RSP: 0018:ffff880212d0be60  EFLAGS: 00010006
[  819.180082] RAX: 0000000000000001 RBX: ffff880073379340 RCX: 0000000000000008
[  819.187339] RDX: 0000000000000001 RSI: ffff88008fa01430 RDI: ffff88021fd12fb8
[  819.194696] RBP: 0000000000000000 R08: ffffffff81610640 R09: 0000000000000002
[  819.201959] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801e80260d0
[  819.209230] R13: ffff88021fd12f40 R14: 0000000000000000 R15: 0000000000000000
[  819.216497] FS:  0000000002672880(0063) GS:ffff88021fd00000(0000) knlGS:0000000000000000
[  819.224700] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  819.230562] CR2: 0000000000000078 CR3: 00000001ebd55000 CR4: 00000000001406e0
[  819.237826] Stack:
[  819.239961]  0000000000012f40 ffff88021fd12f40 0000000000012f40 ffff88021fd12f40
[  819.248004]  ffff88008fa01ab4 0000000000000246 ffff88020b977cc0 ffffffff81095bb5
[  819.256049]  ffff88008fa01430 ffffffff8109869a 00007fffffffeffd 0000000000000000
[  819.264087] Call Trace:
[  819.266646]  [<ffffffff81095bb5>] ? check_preempt_curr+0x85/0xa0
[  819.272773]  [<ffffffff8109869a>] ? wake_up_new_task+0xda/0x190
[  819.278832]  [<ffffffff81067a39>] ? do_fork+0x139/0x3d0
[  819.284196]  [<ffffffff8151b139>] ? stub_clone+0x69/0x90
[  819.289643]  [<ffffffff8151adcd>] ? system_call_fast_compare_end+0x10/0x15
[  819.296638] Code: 39 c2 7d 27 0f 1f 80 00 00 00 00 83 e8 01 48 8b 5b 70 39 d0 75 f5 48 8b 7d 78 48 3b 7b 78 74 15 0f 1f 00 48 8b 6d 70 48 8b 5b 70 <48> 8b 7d 78 48 3b 7b 78 75 ee 48 85 ff 74 e9 e8 8c cb ff ff 48 
[  819.324012] RIP  [<ffffffff810a1100>] check_preempt_wakeup+0xd0/0x1d0
[  819.330835]  RSP <ffff880212d0be60>
[  819.334440] CR2: 0000000000000078
[  819.338552] ---[ end trace fe89f0e30531fa36 ]---
[  819.343289] Kernel panic - not syncing: Fatal exception
[  820.402012] Shutting down cpus with NMI
[  820.407136] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[  820.417427] Rebooting in 10 seconds..
[  830.395738] ACPI MEMORY or I/O RESET_REG.
SeaBIOS (version 1.8.2-20161003_105447-google)
Total RAM Size = 0x00000001e0000000 = 7680 MiB
CPUs found: 2     Max CPUs supported: 2
found virtio-scsi at 0:3
virtio-scsi vendor='Google' product='PersistentDisk' rev='1' type=0 removable=0
virtio-scsi blksize=512 sectors=209715200 = 102400 MiB
virtio-scsi vendor='Google' product='PersistentDisk' rev='1' type=0 removable=0
virtio-scsi blksize=512 sectors=2097152 = 1024 MiB
drive 0x000f3180: PCHS=0/0/0 translation=lba LCHS=1024/255/63 s=209715200
drive 0x000f3140: PCHS=0/0/0 translation=lba LCHS=1024/32/63 s=2097152

This is from the Debian-based CVM running Docker 1.11.2.

/cc @kubernetes/sig-node-bugs @dchen1107

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label May 4, 2017
@yujuhong yujuhong added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label May 4, 2017
@yujuhong
Contributor

yujuhong commented May 4, 2017

This is not unique to this PR. Opened #45368.

@yujuhong yujuhong removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label May 4, 2017
@xlgao-zju
Contributor Author

@yujuhong I will wait until this is fixed.

@yujuhong
Contributor

yujuhong commented May 8, 2017

It's a kernel issue, so I don't think we can fix it unless we update the CVM image.

On the other hand, it'd be good to know whether this PR exacerbates the problem. I am going to trigger a few more runs before letting it merge.

@k8s-bot gce etcd3 e2e test this

@yujuhong yujuhong added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label May 8, 2017
@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels May 11, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 11, 2017
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
@xlgao-zju
Contributor Author

@feiskyer @yujuhong rebased

@feiskyer
Member

LGTM

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: feiskyer, xlgao-zju, yujuhong

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@xlgao-zju
Contributor Author

@yujuhong when can we remove the do-not-merge?

@yujuhong yujuhong removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label May 16, 2017
@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 44326, 45768)

@k8s-github-robot k8s-github-robot merged commit f82bdca into kubernetes:master May 16, 2017
@xlgao-zju xlgao-zju deleted the forcibly-remove branch May 17, 2017 04:49