Unable to start contiv-vpp due to SIGTERM #1555

Open
deepakgunjal opened this issue May 17, 2019 · 3 comments

@deepakgunjal

I am trying to run contiv-vpp on a single-node Kubernetes cluster (v1.14.1) on an Ubuntu 16.04 server. The server has a Mellanox NIC: "37:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]".

The file /etc/vpp/contiv-vswitch.conf:
unix {
  nodaemon
  cli-listen /run/vpp/cli.sock
  cli-no-pager
  coredump-size unlimited
  full-coredump
  poll-sleep-usec 100
}
nat {
  endpoint-dependent
  translation hash buckets 1048576
  translation hash memory 268435456
  user hash buckets 1024
  max translations per user 10000
}
acl-plugin {
  use tuple merge 0
}
dpdk {
  dev 0000:37:00.0
  uio-driver vfio-pci
}
api-trace {
  on
  nitems 5000
}
socksvr {
  default
}
statseg {
  default
}

cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 1016
HugePages_Rsvd: 63
HugePages_Surp: 0
Hugepagesize: 2048 kB
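
(For reference, a minimal sketch of how a hugepage pool like this is typically allocated; the 1024-page value matches HugePages_Total above, and the commands must run as root:)

    # allocate 1024 x 2 MB hugepages at runtime
    sysctl -w vm.nr_hugepages=1024

    # persist the setting across reboots
    echo "vm.nr_hugepages=1024" >> /etc/sysctl.conf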

The contiv-vpp pod is not starting up, and I am getting the following error:

time="2019-05-17 08:18:19.74759" level=debug msg="/usr/bin/vpp[28474]: load_one_vat_plugin:67: Loaded plugin: vmxnet3_test_plugin.so\n" loc="contiv-init/vpplogger.go(23)" logger=vpp
time="2019-05-17 08:18:19.75611" level=debug msg="/usr/bin/vpp[28474]: load_one_vat_plugin:67: Loaded plugin: acl_test_plugin.so\n" loc="contiv-init/vpplogger.go(23)" logger=vpp
time="2019-05-17 08:18:19.76391" level=debug msg="/usr/bin/vpp[28474]: load_one_vat_plugin:67: Loaded plugin: mactime_test_plugin.so\n" loc="contiv-init/vpplogger.go(23)" logger=vpp
time="2019-05-17 08:18:19.77169" level=debug msg="/usr/bin/vpp[28474]: load_one_vat_plugin:67: Loaded plugin: lacp_test_plugin.so\n" loc="contiv-init/vpplogger.go(23)" logger=vpp
time="2019-05-17 08:18:19.87490" level=debug msg="/usr/bin/vpp[28474]: dpdk: EAL init args: -c 2 -n 4 --in-memory --file-prefix vpp -w 0000:37:00.0 --master-lcore 1 \n" loc="contiv-init/vpplogger.go(23)" logger=vpp
time="2019-05-17 08:18:27.41260" level=error msg="VPP terminated, stopping contiv-agent" loc="contiv-init/main.go(440)" logger=contiv-init
time="2019-05-17 08:18:27.41296" level=info msg="Signal terminated received during agent start, stopping" loc="agent/agent.go(152)" logger=agent
time="2019-05-17 08:18:27.42598" level=debug msg=exiting loc="contiv-init/main.go(453)" logger=contiv-init

Please help me understand what could be the cause of this error.
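
(One way to narrow such a SIGTERM down, as a debugging sketch rather than something from the original report: start the vswitch VPP binary by hand against the same config, which usually prints the DPDK/EAL error that precedes the termination, and check the kernel log for hugepage or vfio failures.)

    # run VPP directly with the config shown above
    /usr/bin/vpp -c /etc/vpp/contiv-vswitch.conf

    # look for hugepage / vfio allocation failures
    dmesg | tail -n 50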

@rastislavs
Collaborator

Hi,
please read the hugepages section in https://github.com/contiv/vpp/blob/master/docs/setup/MANUAL_INSTALL.md#hugepages-kubernetes-110-and-above

In k8s 1.14, disabling the HugePages feature gate is not supported anymore, so you may need to define memory limits for the vSwitch container:

    resources:
      limits:
        hugepages-2Mi: 1024Mi
        memory: 1024Mi
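
(Note that in Kubernetes, hugepages resource requests must equal limits, so the block above effectively requests the same amounts. A sketch of applying it in place, assuming the vswitch is deployed as a contiv-vswitch DaemonSet in kube-system, as in the stock contiv-vpp.yaml:)

    # edit the vswitch DaemonSet and add the resources block
    # above to the vswitch container spec
    kubectl -n kube-system edit daemonset contiv-vswitch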

@deepakgunjal
Author

Thanks for the suggestion. I did that, and I also found contiv-stn running, which I stopped. Now VPP is running.

One observation, though: since this is a single-node cluster and the physical NIC is not connected by any cable, the vppctl "show int" command does not show the main VPP interface described in the installation guide. The current output is:

vpp# show int
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
local0                            0     down          0/0/0/0
loop0                             1      up          9000/0/0/0
loop1                             2      up          9000/0/0/0
loop2                             4      up          9000/0/0/0
tap0                              3      up          1450/0/0/0     rx packets                 157
                                                                    rx bytes                 55900
                                                                    tx packets                 160
                                                                    tx bytes                 16214
                                                                    drops                        9
                                                                    ip4                        148
                                                                    ip6                          8
tap1                              5      up          1450/0/0/0     rx packets                 138
                                                                    rx bytes                 12365
                                                                    tx packets                  73
                                                                    tx bytes                 27537
                                                                    drops                       62
                                                                    ip4                        130
                                                                    ip6                          8
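
(A quick sanity check, as a sketch using the PCI address from the config above: confirm whether VPP's DPDK plugin actually took over the NIC. Note that Mellanox ConnectX NICs use the bifurcated mlx5 kernel driver rather than vfio-pci, so the kernel driver shown may legitimately stay mlx5_core.)

    # inside the vswitch container: DPDK-owned interfaces show up here
    vppctl show hardware-interfaces

    # on the host: which kernel driver currently owns the NIC
    lspci -ks 37:00.0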

If I join a new worker node to this cluster, would pods on that server be able to connect with pods on the master, provided I also get a physical cable connected to the master node's VPP NIC?

@rastislavs
Collaborator

Hi,

Is your DPDK setup correct according to this manual?
https://github.com/contiv/vpp/blob/master/docs/setup/VPP_CONFIG.md

Also, make sure that the required kernel modules are loaded:
https://github.com/contiv/vpp/blob/master/docs/setup/MANUAL_INSTALL.md#setting-up-dpdk

In case the node interconnect interface still does not appear in VPP, please provide the logs from the contiv-vswitch pod.
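
(A sketch of the kernel-module check mentioned above, matching the uio-driver vfio-pci setting in the config from this issue:)

    # load the vfio-pci module and verify it is present
    modprobe vfio-pci
    lsmod | grep vfio_pci

    # persist across reboots
    echo "vfio-pci" >> /etc/modules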
