bootcode.bin randomly doesn't PXE boot correctly. #764

Closed
ali1234 opened this Issue Mar 15, 2017 · 31 comments

ali1234 commented Mar 15, 2017

I am using the latest version of bootcode.bin: https://github.com/raspberrypi/firmware/blob/f85646a8831d9579c2a745478149598da1ecfde5/boot/bootcode.bin

It is the only file on my SD card. I am using a Raspberry Pi 3.

PXE boot works only occasionally. I usually have to power cycle the Pi several times to make it boot.

In the tcpdump log of a failed boot you can see that dnsmasq replies to each boot request, but the Pi ignores the reply and sends another request, for a total of 5 requests. It then tries to request TFTP files from 0.0.0.0.

dnsmasq log of failed session:

dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...

tcpdump port tftp or port bootpc from failed session:

07:46:27.892426 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
07:46:27.892915 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..V....@............C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
07:46:32.893294 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
07:46:32.893647 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..V....@..v.........C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
07:46:38.640376 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
07:46:38.640795 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..V.
..@.. .........C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
07:46:44.639980 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
07:46:44.640339 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..V.a..@............C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
07:46:50.639971 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
07:46:50.640411 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..V.b..@............C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
07:46:56.657068 IP 0.0.0.0.49153 > 0.0.0.0.tftp:  29 RRQ "autoboot.txt" octet tsize 0
E..9......:............E.%....autoboot.txt.octet.tsize.0.
07:46:57.657080 IP 0.0.0.0.49154 > 0.0.0.0.tftp:  27 RRQ "config.txt" octet tsize 0
E..7......:............E.#....config.txt.octet.tsize.0.
07:46:58.657351 IP 0.0.0.0.49155 > 0.0.0.0.tftp:  29 RRQ "recovery.elf" octet tsize 0
E..9......:............E.%....recovery.elf.octet.tsize.0.
07:46:59.657412 IP 0.0.0.0.49156 > 0.0.0.0.tftp:  26 RRQ "start.elf" octet tsize 0
E..6......:............E."....start.elf.octet.tsize.0.
07:47:00.773516 IP 0.0.0.0.49157 > 0.0.0.0.tftp:  26 RRQ "fixup.dat" octet tsize 0
E..6......:............E."....fixup.dat.octet.tsize.0.
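For anyone comparing their own captures against the failed session above, here is a minimal Python sketch that counts the DHCP requests, DHCP replies, and TFTP reads addressed to 0.0.0.0. It assumes the one-line summary format tcpdump printed in this capture; the function name `summarize` and the regular expression are illustrative, not part of any tool used in this issue.

```python
import re

# Matches tcpdump summary lines like the ones above; payload dump lines are skipped.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d+) IP (\S+) > (\S+): (.*)$")


def summarize(capture: str) -> dict:
    """Count DHCP requests/replies and TFTP read requests sent to 0.0.0.0."""
    counts = {"dhcp_requests": 0, "dhcp_replies": 0, "tftp_to_zero": 0}
    for line in capture.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # hex/ASCII payload line, not a packet summary
        _timestamp, _src, dst, rest = match.groups()
        if "BOOTP/DHCP, Request" in rest:
            counts["dhcp_requests"] += 1
        elif "BOOTP/DHCP, Reply" in rest:
            counts["dhcp_replies"] += 1
        elif dst == "0.0.0.0.tftp" and "RRQ" in rest:
            counts["tftp_to_zero"] += 1
    return counts


# Three lines copied from the failed capture above.
sample = """\
07:46:50.639971 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
07:46:50.640411 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
07:46:56.657068 IP 0.0.0.0.49153 > 0.0.0.0.tftp:  29 RRQ "autoboot.txt" octet tsize 0
"""
print(summarize(sample))
```

A non-zero `tftp_to_zero` count is the symptom described here: the Pi gave up on DHCP and started TFTP without a server address.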

ghollingworth commented Mar 15, 2017

You're using a proxy server for the DHCP proxy and TFTP boot. Do you also have a standard DHCP server replying with an IP address?

From the output it looks like you've got everything set up correctly (option 43 looks fine). The bootcode will only continue once it has both an option 43, which tells it that the DHCP server is also going to serve the files, and an IP address. From the information above, no IP address has been offered.

Gordon
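The condition described above can be sketched in Python as follows. This only illustrates the rule as stated in this comment (option 43 plus an offered address are both required); it is not the actual bootcode logic, and the `Reply` class, `ready_to_tftp` function, and the 192.0.2.114 address are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Reply:
    """Hypothetical, simplified view of one DHCP reply seen by the client."""
    server: str
    yiaddr: str = "0.0.0.0"      # offered client address; 0.0.0.0 means none
    has_option_43: bool = False  # vendor-encapsulated PXE options present


def ready_to_tftp(replies: Iterable[Reply]) -> Optional[Tuple[str, str]]:
    """Return (client_ip, tftp_server) once both are known, else None."""
    client_ip = None
    tftp_server = None
    for reply in replies:
        if reply.yiaddr != "0.0.0.0":
            client_ip = reply.yiaddr
        if reply.has_option_43:
            tftp_server = reply.server
    if client_ip and tftp_server:
        return client_ip, tftp_server
    return None


# The proxy reply carries option 43 but no address; the router offers an address.
proxy = Reply(server="pxeserver.lan", has_option_43=True)
router = Reply(server="router.lan", yiaddr="192.0.2.114")  # placeholder address

print(ready_to_tftp([proxy]))          # only the proxy reply was received
print(ready_to_tftp([proxy, router]))  # both replies were received
```

With only the proxy reply, the sketch returns `None`, which matches the failed capture where no address offer reached the Pi.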

ali1234 commented Mar 15, 2017

My ADSL router is a DHCP server, yes.

ghollingworth commented Mar 15, 2017

So can you dump that DHCP response as well?

ali1234 commented Mar 15, 2017

I ran "sudo tcpdump -A -i eth0 port tftp or port bootpc or port bootps or port 546 or port 547"

I did not capture any DHCP requests from the pxeserver to the main router. Output was the same as before.

ghollingworth commented Mar 15, 2017

To debug this, I would use the managed switch I have on my desk to mirror all traffic to a port with a Raspberry Pi on it... I'm wondering whether it's an STP problem with the switch in the router?

One thing we're going to do soon is to add the ability to provide serial debug from the bootcode, which should help with this...

Gordon

ali1234 commented Mar 15, 2017

I have a dump from the router itself now:

Non-working:

08:42:03.321057 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
08:42:08.321401 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
08:42:08.322173 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
08:42:11.069915 IP router.lan.bootps > xxx.xxx.xxx.113.bootpc: BOOTP/DHCP, Reply, length 343
08:42:14.070079 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
08:42:14.070707 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
08:42:14.071108 IP router.lan.bootps > xxx.xxx.xxx.113.bootpc: BOOTP/DHCP, Reply, length 343
08:42:20.069788 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
08:42:20.070493 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
08:42:20.070818 IP router.lan.bootps > xxx.xxx.xxx.113.bootpc: BOOTP/DHCP, Reply, length 343
08:42:25.069862 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
08:42:25.070624 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
08:42:25.070893 IP router.lan.bootps > xxx.xxx.xxx.113.bootpc: BOOTP/DHCP, Reply, length 343
08:42:31.086955 IP 0.0.0.0.49153 > 0.0.0.0.tftp:  29 RRQ "autoboot.txt" octet tsize 0
08:42:32.086983 IP 0.0.0.0.49154 > 0.0.0.0.tftp:  27 RRQ "config.txt" octet tsize 0
08:42:33.087249 IP 0.0.0.0.49155 > 0.0.0.0.tftp:  29 RRQ "recovery.elf" octet tsize 0
08:42:34.087316 IP 0.0.0.0.49156 > 0.0.0.0.tftp:  26 RRQ "start.elf" octet tsize 0
08:42:35.087460 IP 0.0.0.0.49157 > 0.0.0.0.tftp:  26 RRQ "fixup.dat" octet tsize 0

ali1234 commented Mar 15, 2017

Someone just turned on a Windows laptop on the network and now it started working correctly:

dnsmasq:

dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 available DHCP subnet: xxx.xxx.xxx.255/255.255.255.0
dnsmasq-dhcp: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001
dnsmasq-dhcp: 653460281 PXE(enp0s31f6) b8:27:eb:xx:xx:xx proxy
dnsmasq-dhcp: 653460281 tags: enp0s31f6
dnsmasq-dhcp: 653460281 broadcast response
dnsmasq-dhcp: 653460281 sent size:  1 option: 53 message-type  2
dnsmasq-dhcp: 653460281 sent size:  4 option: 54 server-identifier  xxx.xxx.xxx.5
dnsmasq-dhcp: 653460281 sent size:  9 option: 60 vendor-class  50:58:xx:xx:xx:xx:xx:xx:xx
dnsmasq-dhcp: 653460281 sent size: 17 option: 97 client-machine-id  00:f5:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-dhcp: 653460281 sent size: 32 option: 43 vendor-encap  06:01:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx...
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/autoboot.txt not found
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/config.txt to xxx.xxx.xxx.114
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery.elf not found
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/start.elf to xxx.xxx.xxx.114
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/fixup.dat to xxx.xxx.xxx.114
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery.elf not found
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/config.txt to xxx.xxx.xxx.114
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/dt-blob.bin not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery.elf not found
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/config.txt to xxx.xxx.xxx.114
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/bootcfg.txt not found
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/cmdline.txt to xxx.xxx.xxx.114
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery8.img not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery8-32.img not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery7.img not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/recovery.img not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/kernel8.img not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/kernel8-32.img not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/armstub8.bin not found
dnsmasq-tftp: error 0 Early terminate received from xxx.xxx.xxx.114
dnsmasq-tftp: failed sending /home/al/pi-tftp/tftpboot/kernel7.img to xxx.xxx.xxx.114
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/armstub8-32.bin not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/armstub7.bin not found
dnsmasq-tftp: file /home/al/pi-tftp/tftpboot/armstub.bin not found
dnsmasq-tftp: sent /home/al/pi-tftp/tftpboot/kernel7.img to xxx.xxx.xxx.114

router, tcpdump -i br0 port tftp or port bootpc or port bootps or port 546 or port 547:

09:00:40.285719 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
09:00:45.725710 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
09:00:45.726339 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
09:00:45.727086 IP router.lan.bootps > rootfs.lan.bootpc: BOOTP/DHCP, Reply, length 343

pxeserver, tcpdump -A -i enp0s31f6 port tftp or port bootpc or port bootps or port 546 or port 547:

09:00:38.856729 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
09:00:38.857410 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..VB...@.uG.........C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
09:00:44.297417 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from b8:27:eb:xx:xx:xx (oui Unknown), length 322
E..^......9..........D.C.J......&..9.....................'..............................................................................................................................................................................................................c.Sc5..7.+<C........B..]...^....a..................< PXEClient:Arch:00000:UNDI:002001.
09:00:44.297964 IP pxeserver.lan.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 314
E..VD5..@.s..........C.D.B......&..9.....................'..............................................................................................................................................................................................................c.Sc5..6.....<	PXEClienta..................+ ...
..PXE	....Raspberry Pi Boot..
09:00:44.316425 IP rootfs.lan.49153 > pxeserver.lan.tftp:  29 RRQ "autoboot.txt" octet tsize 0
E..9...........r.......E.%....autoboot.txt.octet.tsize.0.

ali1234 commented Mar 15, 2017

Also I can run "brctl showstp br0" on my router. It shows a lot of information but I don't really know what I am looking for.

ghollingworth commented Mar 15, 2017

It looks like a problem we had before where we needed a broadcast packet to trigger the receiving of one of the packets (which is why turning on the Windows machine suddenly starts it...)

It would be interesting to see whether the receipt of the second DHCP reply was triggered by a broadcast packet...

ali1234 commented Mar 15, 2017

That is probably it. The Windows machine is spamming a lot of autoconfig junk continuously on both IPv4 and IPv6. The Pi and the PXE server are both connected directly to the router's switch, but the Windows machine is behind a second (unmanaged) switch. None of the machines are on Wi-Fi, but the router is bridging Ethernet and two Wi-Fi radios.

ghollingworth commented Mar 15, 2017

I might have to see if it's possible the USB->ETH bridge is holding onto a packet for some reason, or that I've dropped a packet... But can't understand why this would be...

In the past we've found doing an occasional broadcast ping will fix the problem...

ali1234 commented Mar 19, 2017

I found it not working again this morning. Broadcast ping brought it to life:

ping -b 192.168.0.255
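
A minimal sketch of how the broadcast address for that workaround could be derived instead of typed by hand. The /24 prefix and the function name `broadcast_ping_command` are assumptions for illustration; the thread only shows the final address. The function only builds the argument list for the Linux iputils `ping` (`-b` allows pinging a broadcast address, `-c` limits the count) and does not send anything.

```python
import ipaddress
from typing import List


def broadcast_ping_command(cidr: str, count: int = 3) -> List[str]:
    """Build a `ping -b` argument list for the broadcast address of `cidr`.

    `cidr` may be a network ("192.168.0.0/24") or any host address with a
    prefix length ("192.168.0.17/24"); strict=False accepts both.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    return ["ping", "-b", "-c", str(count), str(network.broadcast_address)]


# Assumed home network; adjust the prefix to match the real LAN.
print(" ".join(broadcast_ping_command("192.168.0.0/24")))
# prints: ping -b -c 3 192.168.0.255
```

The resulting list could be passed to `subprocess.run` from a cron job or timer on the boot server if an occasional broadcast ping turns out to help on a given network.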

puck commented Apr 28, 2017

I'm seeing this roughly one time in three when I network boot my RPi 3. tcpdump from the DHCP server:

00:09:54.601333 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320
00:09:54.602403 IP 10.1.0.251.67 > 10.1.0.203.68: BOOTP/DHCP, Reply, length 366
00:09:55.861047 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320
00:09:55.862176 IP 10.1.0.251.67 > 10.1.0.203.68: BOOTP/DHCP, Reply, length 366
00:09:56.894372 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320
00:09:56.895385 IP 10.1.0.251.67 > 10.1.0.203.68: BOOTP/DHCP, Reply, length 366
00:09:57.915143 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320
00:09:57.916157 IP 10.1.0.251.67 > 10.1.0.203.68: BOOTP/DHCP, Reply, length 366
00:09:59.606050 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320
00:09:59.607080 IP 10.1.0.251.67 > 10.1.0.203.68: BOOTP/DHCP, Reply, length 366

I find it can take several power-off reboots of the RPi before it works correctly. When it does, it makes only one DHCP request.

I don't have any network switches on my home network which have mirror ports.
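
The failure pattern in the capture above (the same MAC repeating its request even though every request gets a reply) can be spotted automatically. The sketch below counts DHCP requests per client MAC in tcpdump text output; the function names and the threshold of two requests are assumptions for illustration, and the regular expression only matches the one-line format shown in this comment.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Matches the "Request from <mac>" part of tcpdump's one-line BOOTP/DHCP output.
REQUEST_RE = re.compile(r"BOOTP/DHCP, Request from ([0-9a-fA-F:]{17})")


def count_requests(lines: Iterable[str]) -> Counter:
    """Count DHCP requests per client MAC address."""
    counts: Counter = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


def retrying_clients(lines: Iterable[str], threshold: int = 2) -> Dict[str, int]:
    """Return clients that sent at least `threshold` requests in the capture."""
    return {mac: n for mac, n in count_requests(lines).items() if n >= threshold}


capture = [
    "00:09:54.601333 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320",
    "00:09:54.602403 IP 10.1.0.251.67 > 10.1.0.203.68: BOOTP/DHCP, Reply, length 366",
    "00:09:55.861047 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:52:63:05, length 320",
]
print(retrying_clients(capture))
# prints: {'b8:27:eb:52:63:05': 2}
```

A client that boots correctly should not appear in the result, since a working boot was reported to make only one request.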

ghollingworth commented May 1, 2017

@puck Is your test with the bootcode.bin as a single file on the SD card?

puck commented May 1, 2017

Hey @ghollingworth, no, I'm network booting and loading it from a TFTP server. I have no SD card in my RPi 3.

puck commented May 1, 2017

I've realised I can make a poor man's wire tap using a laptop and two USB ethernet dongles. If you'd like a traffic capture, I should be able to do that tonight.

ghollingworth commented May 1, 2017

puck commented May 2, 2017

@ghollingworth Putting bootcode.bin (only) on an SD card makes the boot process reliable: I rebooted about 10 times with no failures. However, it no longer looks for any files inside the serial number directory. The first boot hung because I had them all in the subdirectory, so I had to symlink all the other files into my TFTP root for the boot process to succeed.

Contributor

pelwell commented May 2, 2017

@puck That is what I would expect to see from a bootcode.bin not built in the last week. Try a more recent one: https://github.com/Hexxeh/rpi-firmware/blob/170150d2210a3bb1801ae165d54794101f28fc54/bootcode.bin

puck commented May 2, 2017

Heh, yes, I just found bug #754 and tested again with the newest bootcode.bin; it now works with the serial directory. I was hoping to update this issue before someone responded. ;)

Contributor

pelwell commented May 2, 2017

Sorry - you caught me on a good day. ;)

tvk7 commented May 18, 2017

Is there a workaround apart from putting bootcode.bin on an SD card? For me it also looks like the Pi tries several times to get a DHCP offer, but it always discards the reply and never reaches out to the TFTP server. Could this be a DHCP implementation issue? There is a lot of broadcast traffic on my network.

ghollingworth commented May 18, 2017

If you're using bootcode.bin (and only that) on an SD card then it is using the fixed version of the code...

If it is ignoring the reply then that will be because the offer doesn't contain the TFTP server address in a suitably understandable manner. Can you tcpdump (or wireshark) the reply?

Thanks

tvk7 commented May 19, 2017

Here is a tcpdump; hopefully there is just an option missing.
The addresses are statically assigned. From what I saw in other discussions, when you assign addresses from an address pool the server performs some checks, which delays the DHCP offer, and the Pi then accepts it?

BOOTP/DHCP, Request from b8:27:eb:eb:cc:6e, length: 320, hops:1, xid:0x26f30339, flags: [none]
Gateway IP: 10.11.108.4
Client Ethernet Address: b8:27:eb:eb:cc:6e
Vendor-rfc1048:
DHCP:DISCOVER
PR:VO+VC+BF+T128+T129+T130+T131+T132+T133+T134+T135+TFTP
ARCH:0
NDI:1.2.1
GUID:0.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68
VC:"PXEClient:Arch:00000:UNDI:002001"
16:02:36.394082 00:50:56:b7:34:a8 > 00:00:5e:00:01:0b, ethertype IPv4 (0x0800), length 342: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 328) 10.11.4.11.bootps > 10.11.108.4.bootps: BOOTP/DHCP, Reply, length: 300, hops:1, xid:0x26f30339, flags: [none]
Your IP: 10.11.108.226
Server IP: 10.11.5.141
Gateway IP: 10.11.108.4
Client Ethernet Address: b8:27:eb:eb:cc:6e
Vendor-rfc1048:
DHCP:OFFER
SID:10.11.4.11
LT:900
VO:82.97.115.112.98.101.114.114.121.32.80.105.32.66.111.111.116.32.32.32
TFTP:"10.11.5.141"
SM:255.255.252.0
16:02:37.430627 00:04:96:8b:bd:ad > 00:50:56:b7:34:a8, ethertype IPv4 (0x0800), length 362: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 348) 10.11.108.4.bootps > 10.11.4.11.bootps: BOOTP/DHCP, Request from b8:27:eb:eb:cc:6e, length: 320, hops:1, xid:0x26f30339, flags: [none]
Gateway IP: 10.11.108.4
Client Ethernet Address: b8:27:eb:eb:cc:6e
Vendor-rfc1048:
DHCP:DISCOVER
PR:VO+VC+BF+T128+T129+T130+T131+T132+T133+T134+T135+TFTP
ARCH:0
NDI:1.2.1
GUID:0.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68.68
VC:"PXEClient:Arch:00000:UNDI:002001"

16:02:37.431011 00:50:56:b7:34:a8 > 00:00:5e:00:01:0b, ethertype IPv4 (0x0800), length 342: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 328) 10.11.4.11.bootps > 10.11.108.4.bootps: BOOTP/DHCP, Reply, length: 300, hops:1, xid:0x26f30339, flags: [none]
Your IP: 10.11.108.226
Server IP: 10.11.5.141
Gateway IP: 10.11.108.4
Client Ethernet Address: b8:27:eb:eb:cc:6e
Vendor-rfc1048:
DHCP:OFFER
SID:10.11.4.11
LT:900
VO:82.97.115.112.98.101.114.114.121.32.80.105.32.66.111.111.116.32.32.32
TFTP:"10.11.5.141"
SM:255.255.252.0
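
The VO field in the offers above is printed as dot-separated decimal bytes. A minimal sketch for decoding it as ASCII follows; the helper name `decode_dotted_bytes` is hypothetical, and it assumes the payload is plain text, which holds for this capture but not for vendor options in general.

```python
def decode_dotted_bytes(value: str) -> str:
    """Turn tcpdump's dot-separated decimal bytes into an ASCII string."""
    return bytes(int(part) for part in value.split(".")).decode("ascii")


vo = "82.97.115.112.98.101.114.114.121.32.80.105.32.66.111.111.116.32.32.32"
print(repr(decode_dotted_bytes(vo)))
# prints: 'Raspberry Pi Boot   '
```

So the offer does carry the "Raspberry Pi Boot" string (padded with three spaces) along with the TFTP server address.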

ghollingworth commented May 23, 2017

Looks to me like you've got the server and client on different subnets; the Raspberry Pi bootrom doesn't support this. The bootcode.bin option does, though.
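
The subnet mismatch can be checked from the values in the capture (Your IP, Server IP and the SM netmask). This is a small sketch using Python's standard `ipaddress` module; the function name `same_subnet` is an assumption for illustration.

```python
import ipaddress


def same_subnet(client_ip: str, other_ip: str, netmask: str) -> bool:
    """Return True if `other_ip` lies in the client's subnet."""
    # strict=False lets us pass a host address together with its netmask.
    network = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


# Values taken from the DHCP offer in the capture.
print(same_subnet("10.11.108.226", "10.11.5.141", "255.255.252.0"))  # TFTP server
# prints: False
print(same_subnet("10.11.108.226", "10.11.108.4", "255.255.252.0"))  # gateway/relay
# prints: True
```

With a 255.255.252.0 mask the client sits in 10.11.108.0/22, so the TFTP server at 10.11.5.141 is only reachable through the gateway.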

tvk7 commented May 24, 2017

Yes, it works with a single bootcode.bin.

ali1234 commented Jul 30, 2017

This still does not work for me. With the latest bootcode.bin the results are exactly the same: dnsmasq receives the request and sends the response, and the Pi ignores it, five times, then it stops.

andig commented Aug 27, 2017

> This still does not work for me. With latest bootcode.bin the results are exactly the same: dnsmasq receives the request and sends the response, and the Pi ignores it, five times, then it stops.

Checking in here after experiencing exactly the same problem with a brand new Pi 3 with the latest firmware update; see https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=191778&p=1203002#p1203002

Reading through this issue it appears that it's not solved unless an SD card with bootcode.bin is used. Is this the expected behaviour?

ali1234 commented Feb 15, 2018

@andig In my case it does not work reliably even with an updated bootcode.bin on an SD card. However, setting dhcp-reply-delay=1 in dnsmasq.conf does help a bit. Sometimes it still takes several tries before it works, though.

ghollingworth commented Feb 19, 2018

dhcp-reply-delay is required in some cases because there is a bug in the silicon: if it receives both the DHCP reply and the tftpboot server address in less than 2 seconds, the device will lock up forever.

bootcode.bin does not suffer from this problem.

But it's possible something else is going wrong, or the packets are being dropped by the switch.

bunyevacz commented Jul 5, 2018

I have built a setup of 30 Raspberry Pis, with one 24-port PoE-enabled managed switch (ES-24-250W) and two unmanaged PoE switches (TP-Link TL-SG1008P, TP-Link TL-SF1008P).

I have a Raspberry Pi acting as master:

  • Wi-Fi -> Ethernet bridge for internet connectivity
  • DHCP server
  • TFTP server
  • NFS server

And I have 29 Raspberry Pi 3s without any SD cards, so they boot entirely from the network and get their power from PoE. All the Raspberry Pis have the official touchscreen (even the master).

I can switch each port of the managed PoE switch on and off, which is essentially the same as plugging in and unplugging the cable.

Here are my experiences (22 Raspberry Pi clients, 1 Raspberry Pi server, 1 laptop):

  1. Switching on all the Raspberry Pis at the same time from the PoE switch, i.e. within half a second or less:
    3 Raspberry Pis boot up fine (usually between 1 and 5)
    14 Raspberry Pis get stuck at the rainbow screen (usually about half the Pis)
    5 Raspberry Pis stay black (and consume 1.5 W or less)

If the screen is on, a Raspberry Pi consumes between 5.5 and 7 W.

  2. Adding dhcp-reply-delay=1 to dnsmasq.conf:
    Almost all the Raspberry Pis get at least to the rainbow screen. Usually:
    10 Raspberry Pis finish booting
    10 Raspberry Pis get stuck at the rainbow screen
    2 Raspberry Pis stay at a black screen

  3. Adding dhcp-reply-delay=1 and powering on each Raspberry Pi with a 10-second delay:
    The managed switch applies power (via a bash script run from my laptop) to each PoE interface 10 seconds apart.

Almost all the Raspberry Pis boot up fine. The worst case was 20/22.
Once a Raspberry Pi boots up, it sends a heartbeat message to the master. The master figures out which Raspberry Pis failed to boot and power-cycles them via the managed switch (the switch can report which MAC address belongs to which physical interface).
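
The staggered power-on and heartbeat check described above could be sketched roughly as follows. This is not the author's script: the function names are hypothetical, the switch is reduced to plain port numbers (the MAC-to-port lookup is left out), and the 90-second boot deadline is an assumption based on the roughly 75-second full boot reported in this comment.

```python
from typing import Dict, Iterable, List, Set, Tuple


def power_on_schedule(ports: Iterable[int], gap_s: float = 10.0) -> List[Tuple[int, float]]:
    """Return (port, seconds after start) pairs, one port every `gap_s` seconds."""
    return [(port, index * gap_s) for index, port in enumerate(ports)]


def ports_to_power_cycle(
    powered_at: Dict[int, float],
    heartbeat_ports: Set[int],
    now: float,
    boot_deadline_s: float = 90.0,
) -> List[int]:
    """Ports that were powered more than `boot_deadline_s` ago without a heartbeat."""
    return sorted(
        port
        for port, started in powered_at.items()
        if port not in heartbeat_ports and now - started > boot_deadline_s
    )


schedule = power_on_schedule([1, 2, 3])
print(schedule)
# prints: [(1, 0.0), (2, 10.0), (3, 20.0)]

# At t=105 s only port 1 has reported in; port 3 is still within its deadline.
print(ports_to_power_cycle(dict(schedule), heartbeat_ports={1}, now=105.0))
# prints: [2]
```

Keeping the two steps as pure functions leaves the actual switch control (SSH, SNMP, or a vendor CLI) as a separate piece that can be swapped in for whatever the managed switch supports.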

The timing is as follows:
0-16s: black screen
7s: Ethernet port starts to blink
16-18s: rainbow screen
50s: finished booting into text autologin; starts the X server
1m3s: X server started; the screen is now completely grey
1m15s: Chromium started in kiosk mode. Boot finished.

I've been working on this problem (unreliable starting) for a week now. I was generating ICMPv6 packets because I believed it helped: if I start a computer or a Raspberry Pi with an SD card, the other Raspberry Pis are more likely to start up.
Judging from tcpdump, I believed the ICMPv6 packets made a difference.
In reality, it turns out that all they do is keep dnsmasq busy, which increases its response time, which in turn makes a Raspberry Pi more likely to start.

I'm now optimizing boot time from NFS by disabling all services that are not needed. But the hard part was figuring out the unreliability. I think a pointer to this on the official site or in the netboot tutorial would be more appropriate than the rather vague "packets on the network help booting up".
Also, this

If it doesn't boot on the first attempt, keep trying. It can take a minute or so for 
the Raspberry Pi to boot, so be patient.

phrase is totally misleading and wrong. If a Raspberry Pi has not started (rainbow screen) after 20 seconds, it is 99% sure that it will never start at all, no matter how patient you are.
I booted the Raspberry Pis some 400 times or more, and only one Raspberry Pi ever managed to start after the 1-minute mark, and only once.

When I started this adventure, I was totally unaware of how untested this code path is. :(
