[BUG] agent does not add default route #1439

Open
akshaymoghe opened this issue Jan 9, 2019 · 1 comment

@akshaymoghe
commented Jan 9, 2019

Describe the bug

Adding walinuxagent to our Linux VM image does not give the VM a working network: it still comes up without a default route.

Additional info/background
I'm trying to import a Linux VM into Azure. So far I can import the disk and create a VM from it that boots successfully. However, the VM is not reachable over SSH. On further inspection, it looks like this is due to a missing default route. When the VM comes up, the only routes are:

10.0.0.0/24 dev eth0  proto kernel  scope link  src 10.0.0.17 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 

Please ignore the docker0 route, which is added by the Docker daemon running in the instance. There is no default route. If I add the default route manually, things work fine.

On further reading, it appears that walinuxagent is responsible for some of the network setup on the VMs. So I installed the Debian package (downloaded from azure.archive.ubuntu.com) and made sure its dependencies were met.

I disabled all the knobs in its config (/etc/waagent.conf) so that I would not end up chasing unrelated issues. After rebooting, the agent logs show:

-- Logs begin at Wed 2019-01-09 03:25:26 UTC, end at Wed 2019-01-09 03:27:34 UTC. --
Jan 09 03:25:29 nsappliance systemd[1]: Started Azure Linux Agent.
Jan 09 03:25:29 nsappliance systemd[1104]: walinuxagent.service: Executing: /usr/bin/python3 -u /usr/sbin/waagent -daemon
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.021720 INFO Daemon Azure Linux Agent Version:2.2.32.2
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.030640 INFO Daemon OS: ubuntu 16.04
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.037715 INFO Daemon Python: 3.5.2
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.045196 INFO Daemon Creating cgroup directory /sys/fs/cgroup/cpu/system.slice/walinuxagent
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.054973 INFO Daemon Creating cgroup directory /sys/fs/cgroup/memory/system.slice/walinuxagent
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.065903 INFO Daemon Add daemon process pid 1104 to walinuxagent systemd cgroup
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.075529 INFO Daemon CGroups: ok
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.083154 INFO Daemon Run daemon
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.096632 INFO Daemon Clean protocol
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.105304 INFO Daemon Provisioning is disabled, skipping.
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.114230 INFO Daemon Detect protocol endpoints
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.121332 INFO Daemon Clean protocol
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.127522 INFO Daemon WireServer endpoint is not found. Rerun dhcp handler
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.135785 INFO Daemon Test for route to 168.63.129.16
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.143453 WARNING Daemon No route exists to 168.63.129.16
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.152797 INFO Daemon Checking for dhcp lease cache
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.161211 INFO Daemon looking for leases in path [/var/lib/dhcp/dhclient.*.leases]
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.173750 INFO Daemon dhcp entry:168.63.129.16, 245:False, expired:False
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.182963 INFO Daemon found endpoint [168.63.129.16]
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.190511 INFO Daemon cached endpoint found [168.63.129.16]
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.197709 INFO Daemon Cache exists [True]
Jan 09 03:25:31 nsappliance python3[1104]: 2019/01/09 03:25:31.204867 INFO Daemon Wire server endpoint:168.63.129.16
Jan 09 03:26:02 nsappliance python3[1104]: 2019/01/09 03:26:02.243364 INFO Daemon WireServer is not responding. Reset endpoint
Jan 09 03:26:02 nsappliance python3[1104]: 2019/01/09 03:26:02.243782 INFO Daemon Protocol endpoint not found: WireProtocol, [ProtocolError] [Wireserver Exception] [HttpError] [HTTP Failed] GET http://168
Jan 09 03:26:33 nsappliance python3[1104]: 2019/01/09 03:26:33.276685 INFO Daemon Protocol endpoint not found: MetadataProtocol, [ProtocolError] [HttpError] [HTTP Failed] GET http://169.254.169.254/Micros
Jan 09 03:26:37 nsappliance python3[1104]: 2019/01/09 03:26:37.288913 INFO Daemon Retry detect protocols: retry=0
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.299566 INFO Daemon WireServer endpoint is not found. Rerun dhcp handler
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.299963 INFO Daemon Test for route to 168.63.129.16
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.301780 WARNING Daemon No route exists to 168.63.129.16
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.305556 INFO Daemon Send dhcp request
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.306495 INFO Daemon Examine /proc/net/route for primary interface
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.307482 WARNING Daemon Could not determine primary interface, please ensure /proc/net/route is correct
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.309021 WARNING Daemon Contents of /proc/net/route:
Jan 09 03:26:47 nsappliance python3[1104]: Iface        Destination        Gateway         Flags        RefCnt        Use        Metric        Mask                MTU        Window        IRTT
Jan 09 03:26:47 nsappliance python3[1104]: eth0        0000000A        00000000        0001        0        0        0        00FFFFFF        0        0        0
Jan 09 03:26:47 nsappliance python3[1104]: docker0        000011AC        00000000        0001        0        0        0        0000FFFF        0        0        0
Jan 09 03:26:47 nsappliance python3[1104]: 2019/01/09 03:26:47.310744 WARNING Daemon Primary interface examination will retry silently
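
For readers unfamiliar with the hex format of /proc/net/route, here is a minimal Python sketch that decodes the table printed in the log above and checks for a default route. It assumes a little-endian host (the usual case on x86 Azure VMs), and the helper names are hypothetical; this is not the agent's own code.

```python
import socket
import struct

# Route table copied from the agent log above (whitespace normalised).
PROC_NET_ROUTE = """\
Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
eth0 0000000A 00000000 0001 0 0 0 00FFFFFF 0 0 0
docker0 000011AC 00000000 0001 0 0 0 0000FFFF 0 0 0
"""


def hex_to_ip(field):
    """Convert a hex field from /proc/net/route to dotted-quad notation.

    Assumption: the kernel printed the value in little-endian host order.
    """
    return socket.inet_ntoa(struct.pack("<L", int(field, 16)))


def parse_routes(text):
    """Return (iface, destination, gateway, mask) tuples, skipping the header."""
    routes = []
    for line in text.splitlines()[1:]:
        cols = line.split()
        if len(cols) < 8:
            continue  # ignore blank or truncated lines
        routes.append(
            (cols[0], hex_to_ip(cols[1]), hex_to_ip(cols[2]), hex_to_ip(cols[7]))
        )
    return routes


def has_default_route(routes):
    """A default route has destination 0.0.0.0 and mask 0.0.0.0."""
    return any(dst == "0.0.0.0" and mask == "0.0.0.0" for _, dst, _, mask in routes)


routes = parse_routes(PROC_NET_ROUTE)
for route in routes:
    print(route)
print("default route present:", has_default_route(routes))
```

Decoded, the two entries are 10.0.0.0/255.255.255.0 on eth0 and 172.17.0.0/255.255.0.0 on docker0, which matches the route listing earlier in this report: neither is a default route.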

The lease file contains:

lease {
  interface "eth0";
  fixed-address 10.0.0.17;
  server-name "BL200106030836";
  option subnet-mask 255.255.255.0;
  option dhcp-lease-time 4294967295;
  option routers 10.0.0.1;
  option dhcp-message-type 5;
  option dhcp-server-identifier 168.63.129.16;
  option domain-name-servers 168.63.129.16;
  option dhcp-renewal-time 4294967295;
  option rfc3442-classless-static-routes 0,10,0,0,1,32,168,63,129,16,10,0,0,1,32,169,254,169,254,10,0,0,1;
  option unknown-245 a8:3f:81:10;
  option dhcp-rebinding-time 4294967295;
  option domain-name "rasxdctc1x0erekicyoeqrqrue.bx.internal.cloudapp.net";
  renew 6 2155/02/15 09:53:43;
  rebind 6 2155/02/15 09:53:43;
  expire 6 2155/02/15 09:53:43;
}
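
As an aid to reading the lease, here is a short Python sketch that decodes the `rfc3442-classless-static-routes` value and the `unknown-245` value shown above. The function names are hypothetical, and the decoding follows the RFC 3442 layout (prefix length, significant destination octets, then a four-octet router). It only interprets the lease; it says nothing about whether the DHCP client in this image actually installs these routes.

```python
def decode_classless_routes(option):
    """Decode an RFC 3442 option given as comma-separated decimal octets.

    Returns a list of (destination_cidr, router) string pairs.
    """
    octets = [int(x) for x in option.split(",")]
    routes = []
    i = 0
    while i < len(octets):
        prefix_len = octets[i]
        i += 1
        significant = (prefix_len + 7) // 8  # destination octets actually sent
        dest = octets[i:i + significant] + [0] * (4 - significant)
        i += significant
        router = octets[i:i + 4]
        i += 4
        if len(router) != 4:
            raise ValueError("truncated classless static routes option")
        routes.append(
            (
                "{}/{}".format(".".join(map(str, dest)), prefix_len),
                ".".join(map(str, router)),
            )
        )
    return routes


def decode_hex_ip(value):
    """Decode a colon-separated hex string such as 'a8:3f:81:10' to an IPv4 address."""
    return ".".join(str(int(part, 16)) for part in value.split(":"))


LEASE_ROUTES = "0,10,0,0,1,32,168,63,129,16,10,0,0,1,32,169,254,169,254,10,0,0,1"

for destination, router in decode_classless_routes(LEASE_ROUTES):
    print(destination, "via", router)
print("option 245:", decode_hex_ip("a8:3f:81:10"))
```

The option decodes to three routes, all via 10.0.0.1: 0.0.0.0/0 (a default route), 168.63.129.16/32, and 169.254.169.254/32. Option 245 decodes to 168.63.129.16. So the lease itself carries a default route. Note that RFC 3442 tells a client to ignore the `routers` option when the classless static routes option is present, so the default route depends on the DHCP client applying that option.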

Now, it appears that walinuxagent itself relies on connectivity to the server named in the lease file (168.63.129.16). So I'm not sure what is causing this cyclic dependency: walinuxagent needs the network, but is walinuxagent also responsible for setting up the default route so that the network works correctly?

Help understanding this would be much appreciated.

@akshaymoghe

Author

commented Jan 9, 2019

FTR: all items in the config file are set to =n, so as to disable them.
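
One way to double-check a claim like this is a small Python sketch that parses `Key=Value` lines and lists any switch still set to `y`. The sample text below is a hypothetical excerpt, not the reporter's actual /etc/waagent.conf, and the helper names are made up for illustration.

```python
# Hypothetical excerpt for illustration only, not the reporter's file.
SAMPLE_CONF = """\
# Comments and blank lines are ignored.
Provisioning.Enabled=n
ResourceDisk.Format=n
ResourceDisk.EnableSwap=n
Logs.Verbose=y
"""


def parse_conf(text):
    """Parse Key=Value lines into a dict, dropping '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def enabled_switches(settings):
    """Return the keys whose value is 'y' (case-insensitive), sorted."""
    return sorted(key for key, value in settings.items() if value.lower() == "y")


settings = parse_conf(SAMPLE_CONF)
print("still enabled:", enabled_switches(settings))
```

To run it against a real file, replace `SAMPLE_CONF` with the contents of the file read from disk. Settings whose values are not y/n (paths, sizes, and so on) are simply not reported.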

@narrieta narrieta added the consider label Jun 5, 2019
