RuntimeError on add_ip in IPDB #599

Closed
dulek opened this issue Apr 15, 2019 · 8 comments

@dulek

dulek commented Apr 15, 2019

2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service [-] Error when processing addNetwork request. CNI Params: {'CNI_IFNAME': u'eth0', 'CNI_NETNS': u'/proc/21795/ns/net', 'CNI_PATH': u'/opt/cni/bin', 'CNI_ARGS': u'IgnoreUnknown=1;K8S_POD_NAMESPACE=bigdatastack;K8S_POD_NAME=no3-demo-1-deploy;K8S_POD_INFRA_CONTAINER_ID=c9b18980c4759913b2af304ad5a4123d9258f0dc00a56f37e982b7986e13c44e', 'CNI_DAEMON': u'True', 'CNI_COMMAND': u'ADD', 'CNI_CONTAINERID': u'c9b18980c4759913b2af304ad5a4123d9258f0dc00a56f37e982b7986e13c44e'}: RuntimeError
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service Traceback (most recent call last):
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/daemon/service.py", line 81, in add
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     vif = self.plugin.add(params)
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/plugins/k8s_cni_registry.py", line 51, in add
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     vifs = self._do_work(params, b_base.connect)
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/plugins/k8s_cni_registry.py", line 135, in _do_work
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     container_id=params.CNI_CONTAINERID)
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/binding/base.py", line 101, in connect
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     _configure_l3(vif, ifname, netns, is_default_gateway)
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/binding/base.py", line 82, in _configure_l3
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     subnet.cidr.prefixlen))
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/pyroute2/ipdb/transactional.py", line 209, in __exit__
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     self.commit()
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service   File "/usr/lib/python2.7/site-packages/pyroute2/ipdb/interfaces.py", line 1071, in commit
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service     raise error
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service RuntimeError
2019-04-15 08:44:33.843 134 ERROR kuryr_kubernetes.cni.daemon.service 

We started seeing this in 0.5.4. The code triggering it is below; please note that subnet.cidr.version != 6, so _enable_ipv6 is never executed.

    with get_ipdb(netns) as ipdb:
        with ipdb.interfaces[ifname] as iface:
            for subnet in vif.network.subnets.objects:
                if subnet.cidr.version == 6:
                    _enable_ipv6(netns)
                for fip in subnet.ips.objects:
                    iface.add_ip('%s/%s' % (fip.address,
                                            subnet.cidr.prefixlen))
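For readers puzzled why the traceback points at `transactional.py`'s `__exit__` rather than at `add_ip` itself: IPDB interfaces are transactional, so changes queued inside the `with` block are only committed when the block exits, and any kernel-side failure surfaces there. A toy sketch of that pattern (illustrative only, not pyroute2's actual implementation):

```python
# Minimal sketch of a transactional interface object: add_ip() only queues
# a change; the commit happens in __exit__, which is why errors from the
# kernel (or a race during commit) appear at the end of the `with` block.

class TransactionalInterface:
    def __init__(self):
        self._pending = []
        self._applied = []

    def add_ip(self, addr):
        # Queued only; nothing is sent to the kernel yet.
        self._pending.append(addr)

    def commit(self):
        # In real pyroute2 this is where netlink errors are raised,
        # surfacing in the traceback as `raise error` inside commit().
        self._applied.extend(self._pending)
        self._pending = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.commit()

iface = TransactionalInterface()
with iface:
    iface.add_ip('10.0.0.5/24')  # queued, not yet applied
print(iface._applied)  # ['10.0.0.5/24']
```

This is why the snippet above can look correct line by line and still fail: the exception belongs to the deferred commit, not to the `add_ip` call.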
@svinota
Owner

svinota commented Apr 16, 2019

thanks, investigating

@dulek
Author

dulek commented Apr 16, 2019

@svinota: I only realized this after sending the report, but it might be useful to you: we see this only in our "nested" mode, which means the modified interface is of type vlan. We do not experience it with veth pairs.

Running the aforementioned code after this method works fine, but after this method it explodes.

@svinota
Owner

svinota commented Apr 16, 2019

Some additional info needed:

  • Do you get this error every time or sometimes?
  • Do you get this error on any address/prefixlen values or in particular cases?
  • The code uses netns, right?
  • What is the kernel version?

@svinota
Owner

svinota commented Apr 16, 2019

@dulek Aha, that's interesting, thanks!

@luis5tb

luis5tb commented Apr 16, 2019

Some additional info needed:

  • Do you get this error every time or sometimes?

Sometimes, but it happens frequently; I would say almost 50% of the time. It is avoided completely when moving back to pyroute2 0.5.3.

  • Do you get this error on any address/prefixlen values or in particular cases?

It was setting the same route for all of them, and only happening intermittently

  • Does the code uses netns, right?

Yes! It is inside a netns.

  • What is the kernel version?

CentOS Linux release 7.6.1810 (Core)
3.10.0-957.1.3.el7.x86_64

@svinota
Owner

svinota commented Apr 18, 2019

Sounds like a race.

What do you think, is it possible to deploy a local test system that I could use to test regressions from the kuryr/kubernetes perspective? What's needed? Or maybe there is a prepared VM image to download?
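Since the failure is intermittent, an interim caller-side mitigation is to retry the whole transaction a few times before giving up. A minimal sketch of that idea (the `retry_transaction` helper and the attempt/delay values are hypothetical, not part of pyroute2 or kuryr):

```python
import time

def retry_transaction(operation, attempts=3, delay=0.1):
    """Retry a callable that intermittently raises RuntimeError.

    Useful when the failure is a race: a fresh attempt usually succeeds.
    Re-raises if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RuntimeError:
            if attempt == attempts:
                raise
            time.sleep(delay)

# Simulated flaky commit: fails on the first call, succeeds on the second.
calls = {"n": 0}
def flaky_commit():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError
    return "committed"

print(retry_transaction(flaky_commit))  # committed
```

Note this only masks the symptom; the real fix landed upstream in a later pyroute2 release.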

@dulek
Author

dulek commented Apr 19, 2019

@svinota: Uh, it might be a bit hard; it's a nested setup, as described in the Kuryr docs. I understand this is really complex to deploy. Maybe @luis5tb will be able to give you access to his setup, or we'll try to extract the issue into a smaller example after the weekend.

openstack-gerrit pushed a commit to openstack/requirements that referenced this issue Apr 25, 2019
It seems the latest releases, 0.5.4 and 0.5.5, create issues for
Kuryr-Kubernetes in nested mode (note that we don't run that mode in the
gate due to technical reasons). This is being tracked as issue 599 [1]
in the pyroute2 bug tracker.

This commit constrains pyroute2 to 0.5.3.

[1] svinota/pyroute2#599

Story: 2005460
Task: 30530
Related-Bug: 1824846
Change-Id: I37e7b9de70a008edfbe5e745718b2dae3968722a
openstack-gerrit pushed a commit to openstack/openstack that referenced this issue Apr 25, 2019
@dulek
Author

dulek commented Sep 3, 2019

This does seem to be fixed in 0.5.6; i.e., we no longer see it.

@dulek closed this as completed Sep 3, 2019
tanaypf9 pushed a commit to tanaypf9/pf9-requirements that referenced this issue May 20, 2024
tanaypf9 pushed a commit to tanaypf9/pf9-requirements that referenced this issue May 20, 2024
Patch Set 1:

Until svinota/pyroute2#599 is fixed, we need to stay on 0.5.3.

I'll revisit this patch once pyroute2 solves this error.

Patch-set: 1