During init container on 1.7.3 sysctl: cannot stat /proc/sys/net/ipv4/conf/eth0 #1250

Closed
s4mur4i opened this issue Oct 6, 2020 · 3 comments

s4mur4i commented Oct 6, 2020

What happened:

We upgraded from 1.7.1 to 1.7.3, and on a single node the init container logged the following:

Copying CNI plugin binaries ...
+ PLUGIN_BINS='loopback portmap bandwidth aws-cni-support.sh'
+ for b in '$PLUGIN_BINS'
+ '[' '!' -f loopback ']'
+ for b in '$PLUGIN_BINS'
+ '[' '!' -f portmap ']'
+ for b in '$PLUGIN_BINS'
+ '[' '!' -f bandwidth ']'
+ for b in '$PLUGIN_BINS'
+ '[' '!' -f aws-cni-support.sh ']'
+ HOST_CNI_BIN_PATH=/host/opt/cni/bin
+ echo 'Copying CNI plugin binaries ... '
+ for b in '$PLUGIN_BINS'
+ install loopback /host/opt/cni/bin
+ for b in '$PLUGIN_BINS'
+ install portmap /host/opt/cni/bin
+ for b in '$PLUGIN_BINS'
+ install bandwidth /host/opt/cni/bin
+ for b in '$PLUGIN_BINS'
+ install aws-cni-support.sh /host/opt/cni/bin
+ echo 'Configure rp_filter loose... '
Configure rp_filter loose...
++ curl -X PUT http://169.254.169.254/latest/api/token -H 'X-aws-ec2-metadata-token-ttl-seconds: 60'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    56  100    56    0     0  56000      0 --:--:-- --:--:-- --:--:-- 56000
+ TOKEN=xxxxx
++ curl -H 'X-aws-ec2-metadata-token: xxxx' http://169.254.169.254/latest/meta-data/local-ipv4
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    10  100    10    0     0  10000      0 --:--:-- --:--:-- --:--:-- 10000
+ HOST_IP=10.63.7.25
++ ip -4 -o a
++ grep 10.63.7.25
++ awk '{print $2}'
+ PRIMARY_IF='eth0
eth1
eth2'
+ sysctl -w 'net.ipv4.conf.eth0
eth1
eth2.rp_filter=2'
sysctl: cannot stat /proc/sys/net/ipv4/conf/eth0
eth1
eth2/rp_filter: No such file or directory

Out of roughly 50 nodes, only a single node had this issue.
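The trace above shows `PRIMARY_IF` ending up with three interface names, which then produces an invalid multi-line `sysctl` key. The log does not show why `grep` matched three lines on this node; one way it can happen is that `grep 10.63.7.25` is an unanchored pattern, so it also matches addresses such as `10.63.7.250`. The following is a minimal sketch of a stricter lookup, not necessarily what #1247 does. The helper name `primary_if_for_ip` and the sample addresses for `eth1` and `eth2` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the interface whose IPv4 address
# is exactly "$1". Expects `ip -4 -o addr` style lines on stdin, where
# field 2 is the interface, field 3 is "inet" and field 4 is ADDRESS/PREFIX.
primary_if_for_ip() {
    awk -v ip="$1" '
        $3 == "inet" {
            split($4, parts, "/")          # strip the /prefix length
            if (parts[1] == ip) {          # exact string comparison
                print $2
                exit                       # stop at the first match
            }
        }'
}

# Sample input. The eth1/eth2 addresses are assumptions chosen so that
# they contain the primary IP as a prefix; they are not from the report.
sample='2: eth0    inet 10.63.7.25/19 brd 10.63.31.255 scope global dynamic eth0
3: eth1    inet 10.63.7.250/19 brd 10.63.31.255 scope global eth1
4: eth2    inet 10.63.7.253/19 brd 10.63.31.255 scope global eth2'

# Loose match, as in the trace: prints all three interface names.
printf '%s\n' "$sample" | grep 10.63.7.25 | awk '{print $2}'

# Exact match: prints a single interface name.
printf '%s\n' "$sample" | primary_if_for_ip 10.63.7.25
```

On a real node the input would come from `ip -4 -o addr` instead of the sample variable, and the single result could then be passed to `sysctl -w "net.ipv4.conf.${PRIMARY_IF}.rp_filter=2"` as in the original script.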

System Info:
  Machine ID:                 ec2808bea1300938b8f094dc685471a3
  System UUID:                EC2808BE-A130-0938-B8F0-94DC685471A3
  Boot ID:                    5fef8bf0-fedb-49a0-be74-d88fa9de552a
  Kernel Version:             4.14.193-149.317.amzn2.x86_64
  OS Image:                   Amazon Linux 2
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.6
  Kubelet Version:            v1.17.9-eks-4c6976
  Kube-Proxy Version:         v1.17.9-eks-4c6976
ProviderID:                   aws:///us-east-1f/i-0115cbe4679fb31d2

Instance type: m5.xlarge

What you expected to happen:

I would expect all nodes to behave the same, so I think some case was not handled. When I checked the ENIs, this node had 3 attached, the same as the other nodes.

How to reproduce it (as minimally and precisely as possible):
Not sure how to reproduce it, since it happened on only 1 of around 50 nodes.

@s4mur4i s4mur4i added the bug label Oct 6, 2020
@SaranBalaji90
Contributor

Thank you for reporting the issue. #1247 should address this.

@jayanthvn
Contributor

Hi @s4mur4i

#1247 is targeted for the 1.7.5 release, which should be done this week.

Thanks.

@jayanthvn jayanthvn self-assigned this Oct 7, 2020
@jayanthvn
Contributor

@s4mur4i - The 1.7.5 release (https://github.com/aws/amazon-vpc-cni-k8s/releases/tag/v1.7.5) went out yesterday and includes the fix for this issue. Please try it out. Closing this issue for now.
