
Error msg="VPP terminated, stopping contiv-agent" #1620

Open
vipmahaj opened this issue Jul 25, 2019 · 11 comments

@vipmahaj

vipmahaj commented Jul 25, 2019

While setting up a k8s cluster on Ubuntu 18.04, with a single NIC and without sbt, the contiv-vswitch pod is showing CrashLoopBackOff status.

sbyk8s:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 2048 kB
ShmemHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 1024
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

The vSwitch resource in contiv_vpp.yaml is configured in sync with the above settings:

resources:
  limits:
    hugepages-2Mi: 1024Mi
    memory: 1024Mi
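As a sanity check, the free hugepage pool on the node can be compared against the vSwitch request. This is a minimal sketch using the values from the /proc/meminfo output above; on a live node you would read them with `grep Huge /proc/meminfo` instead of hard-coding them:

```shell
# Values taken from the /proc/meminfo output in this issue
free_pages=1024        # HugePages_Free
page_kb=2048           # Hugepagesize (kB)
request_mi=1024        # hugepages-2Mi limit from contiv_vpp.yaml

# Free hugepage memory in MiB: pages * page size (kB) / 1024
avail_mi=$(( free_pages * page_kb / 1024 ))
echo "available: ${avail_mi}Mi, requested: ${request_mi}Mi"
[ "$avail_mi" -ge "$request_mi" ] && echo "hugepages OK" || echo "hugepages short"
```

With the numbers above this prints `available: 2048Mi, requested: 1024Mi` and `hugepages OK`, so the hugepage configuration itself is not the problem here.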

Getting the following log message:

sbyk8s:~$ kubectl logs contiv-vswitch-2vq97 -p -n kube-system
time="2019-07-25 13:28:31.31893" level=debug msg="Starting contiv-init process" loc="contiv-init/main.go(294)" logger=contiv-init
time="2019-07-25 13:28:31.31964" level=debug msg="Connecting to Etcd.." endpoints="[9.45.69.221:32379]" loc="etcd/bytes_broker_impl.go(60)" logger=contiv-init
time="2019-07-25 13:28:31.32113" level=info msg="Connected to Etcd (took 1.533288ms)" endpoints="[9.45.69.221:32379]" loc="etcd/bytes_broker_impl.go(60)" logger=contiv-init
time="2019-07-25 13:28:31.33132" level=info msg="bolt path: /var/bolt/bolt.db" loc="bolt/bolt.go(66)" logger=bolt
time="2019-07-25 13:28:31.33981" level=info msg="Contiv configuration: {InterfaceConfig:{MTUSize:1450 UseTAPInterfaces:true TAPInterfaceVersion:2 TAPv2RxRingSize:256 TAPv2TxRingSize:256 Vmxnet3RxRingSize:0 Vmxnet3TxRingSize:0 InterfaceRxMode: TCPChecksumOffloadDisabled:true EnableGSO:true} RoutingConfig:{MainVRFID:0 PodVRFID:1 NodeToNodeTransport:vxlan UseSRv6ForServices:false RouteServiceCIDRToVPP:false} IPNeighborScanConfig:{ScanIPNeighbors:true IPNeighborScanInterval:1 IPNeighborStaleThreshold:4} StealFirstNIC:false StealInterface: STNSocketFile:/var/run/contiv/stn.sock STNVersion:2 NatExternalTraffic:true EnablePacketTrace:false CRDNodeConfigurationDisabled:true IPAMConfig:{UseExternalIPAM:false ContivCIDR: ServiceCIDR:10.96.0.0/12 NodeInterconnectDHCP:false PodSubnetCIDR:10.1.0.0/16 PodSubnetOneNodePrefixLen:24 VPPHostSubnetCIDR:172.30.0.0/16 VPPHostSubnetOneNodePrefixLen:24 NodeInterconnectCIDR:192.168.16.0/24 VxlanCIDR:192.168.30.0/24 DefaultGateway: SRv6:{ServicePolicyBSIDSubnetCIDR:8fff::/16 ServicePodLocalSIDSubnetCIDR:9300::/16 ServiceHostLocalSIDSubnetCIDR:9300::/16 ServiceNodeLocalSIDSubnetCIDR:9000::/16 NodeToNodePodLocalSIDSubnetCIDR:9501::/16 NodeToNodeHostLocalSIDSubnetCIDR:9500::/16 NodeToNodePodPolicySIDSubnetCIDR:8501::/16 NodeToNodeHostPolicySIDSubnetCIDR:8500::/16}} NodeConfig:[]}" loc="contivconf/contivconf.go(338)" logger=contivconf
time="2019-07-25 13:28:31.34000" level=info msg="ContivConf state after re-load: useDHCP=false, mainInterface=, mainInterfaceIPs=[], otherInterfaces=[], defaultGw=, dpdkIfaces=[], stnInterface=, stnIPAddresses=[], stnGW=, stnRoutes=[]" loc="contivconf/contivconf.go(798)" logger=contivconf
time="2019-07-25 13:28:31.34006" level=debug msg="STN not requested" loc="contiv-init/main.go(371)" logger=contiv-init
time="2019-07-25 13:28:31.34023" level=debug msg="Starting VPP" loc="contiv-init/main.go(399)" logger=contiv-init
time="2019-07-25 13:28:31.34892" level=debug msg="Starting contiv-agent" loc="contiv-init/main.go(411)" logger=contiv-init
time="2019-07-25 13:28:31.34981" level=debug msg="ERROR: This binary requires CPU with SSE4.2 extensions.\n" loc="contiv-init/vpplogger.go(23)" logger=vpp
time="2019-07-25 13:28:31.45349" level=info msg="Starting agent version: v3.2.1-1-g727f123ef" BuildDate="2019-07-25T08:45+00:00" CommitHash=727f123efee4c65b91f1e48ca772321c9f7c71f7 loc="agent/agent.go(134)" logger=agent
time="2019-07-25 13:28:31.45450" level=debug msg="setting logger level: statscollector -> info" loc="logrus/registry.go(171)" logger=defaultLogger
time="2019-07-25 13:28:31.45469" level=debug msg="-> Init(): status-check" loc="agent/agent.go(220)" logger=agent
time="2019-07-25 13:28:31.45490" level=debug msg="-> Init(): probe" loc="agent/agent.go(220)" logger=agent
time="2019-07-25 13:28:31.45507" level=debug msg="-> Init(): prometheus" loc="agent/agent.go(220)" logger=agent
time="2019-07-25 13:28:31.45523" level=debug msg="-> Init(): kvscheduler" loc="agent/agent.go(220)" logger=agent
time="2019-07-25 13:28:31.45560" level=debug msg="kvscheduler config found: &{RecordTransactionHistory:true TransactionHistoryAgeLimit:60 PermanentlyRecordedInitPeriod:10 EnableTxnSimulation:false PrintTxnSummary:true}" loc="kvscheduler/plugin_scheduler.go(214)" logger=kvscheduler
time="2019-07-25 13:28:31.45577" level=debug msg="KVScheduler configuration: {RecordTransactionHistory:true TransactionHistoryAgeLimit:60 PermanentlyRecordedInitPeriod:10 EnableTxnSimulation:false PrintTxnSummary:true}" loc="kvscheduler/plugin_scheduler.go(166)" logger=kvscheduler
time="2019-07-25 13:28:31.45734" level=debug msg="Registering handler: /scheduler/txn-history" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45765" level=debug msg="Registering handler: /scheduler/key-timeline" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45795" level=debug msg="Registering handler: /scheduler/graph-snapshot" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45823" level=debug msg="Registering handler: /scheduler/flag-stats" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45850" level=debug msg="Registering handler: /scheduler/downstream-resync" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45876" level=debug msg="Registering handler: /scheduler/dump" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45898" level=debug msg="Registering handler: /scheduler/status" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.45925" level=debug msg="Registering handler: /scheduler/graph" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.46333" level=debug msg="Registering handler: /scheduler/stats" loc="rest/plugin_impl_rest.go(116)" logger=http
time="2019-07-25 13:28:31.46355" level=debug msg="-> Init(): resync" loc="agent/agent.go(220)" logger=agent
time="2019-07-25 13:28:31.46372" level=debug msg="-> Init(): govpp" loc="agent/agent.go(220)" logger=agent
time="2019-07-25 13:28:31.46425" level=debug msg="config loaded from file "/etc/vpp-agent/govpp.conf"" loc="govppmux/plugin_impl_govppmux.go(129)" logger=govpp
time="2019-07-25 13:28:31.46450" level=debug msg="config: &{TraceEnabled:false ReconnectResync:false HealthCheckProbeInterval:3s HealthCheckReplyTimeout:500ms HealthCheckThreshold:3 ReplyTimeout:3s ConnectViaShm:false ShmPrefix: BinAPISocketPath: StatsSocketPath: RetryRequestCount:0 RetryRequestTimeout:500ms RetryConnectCount:0 RetryConnectTimeout:1s}" loc="govppmux/plugin_impl_govppmux.go(145)" logger=govpp
time="2019-07-25 13:28:31.46467" level=debug msg="connecting to VPP.." loc="govppmux/plugin_impl_govppmux.go(170)" logger=govpp
time="2019-07-25 13:28:33.35026" level=error msg="VPP terminated, stopping contiv-agent" loc="contiv-init/main.go(445)" logger=contiv-init
time="2019-07-25 13:28:33.35069" level=info msg="Signal terminated received during agent start, stopping" loc="agent/agent.go(153)" logger=agent
time="2019-07-25 13:28:33.36471" level=debug msg=exiting loc="contiv-init/main.go(463)" logger=contiv-init
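The fatal line is easy to miss among the debug output. A quick way to surface it is to grep the pod log for error-level messages; the sketch below runs the grep against a sample of the log from this issue (paths and pod name as shown above), but on a live cluster you would pipe `kubectl logs contiv-vswitch-<id> -p -n kube-system` into the same filter:

```shell
# Filter error-level lines out of a vswitch log sample from this issue
grep -E 'ERROR|level=error' <<'EOF'
time="..." level=debug msg="ERROR: This binary requires CPU with SSE4.2 extensions.\n"
time="..." level=error msg="VPP terminated, stopping contiv-agent"
EOF
```

This surfaces both the root cause (the SSE4.2 message, logged at debug level by the vpp logger) and its consequence (VPP terminating).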

Can someone help figure out what went wrong?

@rastislavs
Copy link
Collaborator

ERROR: This binary requires CPU with SSE4.2 extensions

What CPU are you running this on?

@gilesheron
Collaborator

Or, if this is a VM, you need to make sure you pass the CPU capabilities through to the VM.

so on KVM I run a VM with a command something like:

sudo qemu-system-x86_64 -daemonize -display none \
  -enable-kvm -machine accel=kvm -smp cores=4 -m 16384 -cpu host \
  -hda /home/testuser/vm-images/ubuntu1.img \
  -net nic,model=virtio,vlan=0,macaddr=00:00:de:ad:be:ef -net tap,vlan=0,script=/etc/qemu-ifup

(note the -cpu host option, which passes all CPU capabilities through to the VM...)

@gilesheron
Collaborator

nice - that ":de:" became 🇩🇪

@gilesheron
Collaborator

ugh - can't even turn it off...

@vipmahaj
Author

It's a VM provisioned using vCenter (VMware). It's a dual-core processor.
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz
stepping : 1
microcode : 0x37
cpu MHz : 2394.045
cache size : 30720 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon nopl tsc_reliable nonstop_tsc cpuid aperfmperf pni ssse3 cx16 hypervisor lahf_lm epb pti dtherm ida arat
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds
bogomips : 4788.09
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

@vipmahaj
Author

Also, the logs state the following error:

time="2019-07-25 13:28:33.35026" level=error msg="VPP terminated, stopping contiv-agent" loc="contiv-init/main.go(445)" logger=contiv-init
time="2019-07-25 13:28:33.35069" level=info msg="Signal terminated received during agent start, stopping" loc="agent/agent.go(153)" logger=agent

@gilesheron
Collaborator

yeah - I think the agent will terminate if VPP has died.

I don't see "see4_2" in the list of flags.

also I think you'll need at least 2 CPU cores for your VM (one each for VPP and for everything else)
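The core-count requirement is easy to verify from inside the guest. A minimal sketch (the "2 cores" threshold is the rule of thumb from the comment above, not a documented Contiv limit):

```shell
# Check the VM has enough cores for VPP plus everything else
cores=$(nproc)
if [ "$cores" -ge 2 ]; then
  echo "OK: $cores cores available"
else
  echo "WARNING: only $cores core(s); VPP wants a core to itself"
fi
```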

@gilesheron
Collaborator

sse4_2 even

@gilesheron
Collaborator

I think you need VMware hardware version 8 or later to enable SSE4.2.

@vipmahaj
Author

vipmahaj commented Jul 26, 2019

I am using VMware hardware version 8 with 2 cores.

[Screenshot: 2019-07-26 at 10:20:41 AM]

@gilesheron
Collaborator

strange. Not sure what it is then. I guess you could try a newer hardware version?
