Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TW#14022] Wifi -- SYSTEM_EVENT_STA_GOT_IP is taking over 4 seconds #799

Closed
hchaudhary1 opened this issue Jul 12, 2017 · 41 comments

Comments

Projects
None yet
8 participants
@hchaudhary1
Copy link

commented Jul 12, 2017

I am running the "https://github.com/espressif/esp-idf/tree/master/examples/protocols/https_request" on ESP32-DevKitC.

It seems that the ESP32 is taking a really long time to retrieve the DHCP IP.
For applications where only 1 packet needs to be sent every 30min, this has a massive overhead, just to start the wifi. Other chips such as Atmel WINC1500 are able to connect and get IP in under 1 second.

I have tried with these routers with the same result: Apple, Netgear Orbi

Logs:

I (0) cpu_start: App cpu up.
I (1537) heap_alloc_caps: Initializing. RAM available for dynamic allocation:
I (1559) heap_alloc_caps: At 3FFAE2A0 len 00001D60 (7 KiB): DRAM
I (1580) heap_alloc_caps: At 3FFB7F18 len 000280E8 (160 KiB): DRAM
I (1601) heap_alloc_caps: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
I (1622) heap_alloc_caps: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1644) heap_alloc_caps: At 4009323C len 0000CDC4 (51 KiB): IRAM
I (1665) cpu_start: Pro cpu start user code
I (1723) cpu_start: Starting scheduler on PRO CPU.
I (202) cpu_start: Starting scheduler on APP CPU.
I (232) wifi: wifi firmware version: 72ddf26
I (232) wifi: config NVS flash: enabled
I (232) wifi: config nano formating: disabled
I (232) system_api: Base MAC address is not set, read default base MAC address f
rom BLK0 of EFUSE
I (242) system_api: Base MAC address is not set, read default base MAC address f
rom BLK0 of EFUSE
I (272) wifi: Init dynamic tx buffer num: 32
I (272) wifi: Init dynamic rx buffer num: 32
I (272) wifi: wifi driver task: 3ffbd87c, prio:23, stack:4096
I (272) wifi: Init static rx buffer num: 10
I (272) wifi: Init dynamic rx buffer num: 32
I (282) wifi: Init rx ampdu len mblock:7
I (282) wifi: Init lldesc rx ampdu entry mblock:4
I (282) wifi: wifi power manager task: 0x3ffc2c44 prio: 21 stack: 2560
I (292) >>hassan: Setting WiFi configuration SSID hotspot...
I (302) wifi: wifi timer task: 3ffc3cc4, prio:22, stack:3584
I (332) phy: phy_version: 355.1, 59464c5, Jun 14 2017, 20:25:06, 0, 0
I (332) wifi: mode : sta (24:0a:c4:05:d6:f8)
I (332) >>hassan: Seeding the random number generator
I (342) >>hassan: Loading the CA root certificate...
I (342) >>hassan: Setting hostname for TLS session...
I (352) >>hassan: Setting up the SSL/TLS structure...

I (452) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (1442) wifi: state: init -> auth (b0)
I (1442) wifi: state: auth -> assoc (0)
I (1452) wifi: state: assoc -> run (10)
I (1462) wifi: connected with hotspot, channel 6
I (4662) event: ip: 192.168.245.95, mask: 255.255.255.0, gw: 192.168.245.1

I (4662) >>hassan: Connected to AP

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 13, 2017

Did you try to set channel, scan type, and scan time? Also maybe disable LWIP_DHCP_DOES_ARP_CHECK. Did you turn on lwip debug to see where delay is?

@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 13, 2017

I can confirm that getting the IP address is the longest part, but my timing is under 1 s, including some parallel load.

init -> auth
[10 ms]
auth -> assoc
[10 ms]
assoc -> run
[20 ms]
connected
[800 ms]
IP

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 13, 2017

@HubbyGitter
If you take a look at my logs above, I get connect to accesspoint in 990ms (layer 2). However, that is not the same a receiving a DHCP IP (layer 3) from the router, correct?

@negativekelvin
I will try your recommendations and post the results today

@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 13, 2017

@hchaudhary1 Not sure what your point is. If you take a look at my timings, you will find that I get the IP address from my network's DHCP server in under one second, starting with init -> auth.

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 13, 2017

@HubbyGitter I see! are you able to immediately communicate following the connect? Can you share an example program? Thanks

@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 13, 2017

My implementation creates a socket immediately after the IP event. Connecting to a peer through that socket has never failed so far. But I do have unjustified "disconnect repeats" where instead of connecting, I get a disconnect event, which will go away on retry. Obviously, that won't influence DHCP times. See #724 although the OP describes a slightly different scenario.

My code is doing nothing special. It follows the recommended WiFi startup function call sequence. It waist for all events (including handling unexpected ones) and calls the functions from another context than the event handler.

  • After connecting, what WiFi signal strength does your access point / router show for your ESP32 board?
  • Have you used a WiFi analyser (app) to check the 2.4 GHz channel situation?
  • Channel sharing with third parties does not work well if their signals are only seen by one of your devices, not by both.
  • Do you create heavy CPU load on both cores?
@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 13, 2017

My code doesn't scan. Should've mentioned it.

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 13, 2017

Sorry forgot you can't pass scan_type and scan_time to STA config only channel

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 13, 2017

@negativekelvin
I have corrected the channel in the STA config only.
Here are the results I get running with different settings

  1. Unmodified Example from IDF 3.0
    4652ms since "App cpu up"

  2. seting: set channel + disable arp check
    4000ms since "App cpu up"

  3. seting: set channel + disable arp check + disable ampdu
    3782ms since "App cpu up"

  4. seting: set channel + disable arp check + disable ampdu + 240mhz + Os
    3695ms since "App cpu up"

Logs for #4 is as follows:

I (0) cpu_start: App cpu up.
I (2810) heap_alloc_caps: Initializing. RAM available for dynamic allocation:
I (2881) heap_alloc_caps: At 3FFAE2A0 len 00001D60 (7 KiB): DRAM
I (2942) heap_alloc_caps: At 3FFB7E68 len 00028198 (160 KiB): DRAM
I (3006) heap_alloc_caps: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
I (3070) heap_alloc_caps: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (3135) heap_alloc_caps: At 400926A0 len 0000D960 (54 KiB): IRAM
I (3197) cpu_start: Pro cpu start user code
I (3356) cpu_start: Starting scheduler on PRO CPU.
I (585) cpu_start: Starting scheduler on APP CPU.
I (605) wifi: wifi firmware version: 72ddf26
I (605) wifi: config NVS flash: enabled
I (605) wifi: config nano formating: disabled
I (605) system_api: Base MAC address is not set, read default base MAC address f
rom BLK0 of EFUSE
I (615) system_api: Base MAC address is not set, read default base MAC address f
rom BLK0 of EFUSE
I (635) wifi: Init dynamic tx buffer num: 32
I (635) wifi: Init dynamic rx buffer num: 32
I (635) wifi: wifi driver task: 3ffbd7e4, prio:23, stack:4096
I (635) wifi: Init static rx buffer num: 10
I (635) wifi: Init dynamic rx buffer num: 32
I (645) wifi: Init rx ampdu len mblock:7
I (645) wifi: Init lldesc rx ampdu entry mblock:4
I (655) wifi: wifi power manager task: 0x3ffc2bac prio: 21 stack: 2560
I (655) >>hassan: Setting WiFi configuration SSID hotspot...
I (665) wifi: wifi timer task: 3ffc3c2c, prio:22, stack:3584
I (695) phy: phy_version: 355.1, 59464c5, Jun 14 2017, 20:25:06, 0, 0
I (705) wifi: mode : sta (24:0a:c4:05:d6:f8)
I (705) >>hassan: Seeding the random number generator
I (705) >>hassan: Loading the CA root certificate...
I (715) >>hassan: Setting hostname for TLS session...
I (715) >>hassan: Setting up the SSL/TLS structure...

I (825) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (1385) wifi: state: init -> auth (b0)
I (1395) wifi: state: auth -> assoc (0)
I (1405) wifi: state: assoc -> run (10)
I (1435) wifi: connected with hotspot, channel 6
I (3695) event: ip: 192.168.245.95, mask: 255.255.255.0, gw: 192.168.245.1
I (3695) >>hassan: Connected to AP

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 13, 2017

@HubbyGitter

Do you create heavy CPU load on both cores?

I am running exactly the example code only.

Can you post your logs from this exact example:
https://github.com/espressif/esp-idf/tree/master/examples/protocols/https_request

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 13, 2017

I am seeing 3s to get DHCP IP if I delete lease in my router, otherwise 30ms

I (1708) wifi: connected with ssid, channel 1
dhcp_start(netif=0x3ffc52ec) st0
dhcp_start(): starting new DHCP client
dhcp_start(): allocated dhcpdhcp_start(): starting DHCP configuration
dhcp_discover()
transaction id xid(5851f42d)
dhcp_discover: making request
dhcp_discover: realloc()ing
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, DHCP_SERVER_PORT)
dhcp_discover: deleting()ing
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 500 msecs
dhcp_fine_tmr(): request timeout
dhcp_timeout()
dhcp_timeout(): restarting discovery
dhcp_discover()
transaction id xid(5851f42d)
dhcp_discover: making request
dhcp_discover: realloc()ing
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, DHCP_SERVER_PORT)
dhcp_discover: deleting()ing
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 1000 msecs
dhcp_coarse_tmr()
dhcp_fine_tmr(): request timeout
dhcp_timeout()
dhcp_timeout(): restarting discovery
dhcp_discover()
transaction id xid(5851f42d)
dhcp_discover: making request
dhcp_discover: realloc()ing
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, DHCP_SERVER_PORT)
dhcp_discover: deleting()ing
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 2000 msecs
dhcp_coarse_tmr()
dhcp_recv(pbuf = 0x3ffb80f8) from DHCP server 192.168.11.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
skipping option 28 in options
searching DHCP_OPTION_MESSAGE_TYPE
DHCP_OFFER received in DHCP_STATE_SELECTING state
dhcp_handle_offer(netif=0x3ffc52ec) st0
dhcp_handle_offer(): server 0x010ba8c0
dhcp_handle_offer(): offer for 0x4a0ba8c0
dhcp_select(netif=0x3ffc52ec) st0
transaction id xid(5851f42d)
dhcp_select: REQUESTING
dhcp_select(): set request timeout 2000 msecs
dhcp_recv(pbuf = 0x3ffb8198) from DHCP server 192.168.11.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
skipping option 28 in options
searching DHCP_OPTION_MESSAGE_TYPE
dhcp_recv(pbuf = 0x3ffb8f0c) from DHCP server 192.168.11.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
skipping option 28 in options
searching DHCP_OPTION_MESSAGE_TYPE
dhcp_recv(pbuf = 0x3ffb80f8) from DHCP server 192.168.11.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
skipping option 28 in options
searching DHCP_OPTION_MESSAGE_TYPE
DHCP_ACK received
dhcp_bind(netif=0x3ffc52ec) st0
dhcp_bind(): t0 renewal timer 86400 secs
dhcp_bind(): set request timeout 86400000 msecs
dhcp_bind(): t1 renewal timer 43200 secs
dhcp_bind(): set request timeout 43200000 msecs
dhcp_bind(): t2 rebind timer 75600 secs
dhcp_bind(): set request timeout 75600000 msecs
dhcp_bind(): IP: 0x4a0ba8c0 SN: 0x00ffffff GW: 0x010ba8c0
I (4398) event: ip: 192.168.11.74, mask: 255.255.255.0, gw: 192.168.11.1
I (1748) wifi: connected with ssid, channel 1
dhcp_start(netif=0x3ffc52ec) st0
dhcp_start(): starting new DHCP client
dhcp_start(): allocated dhcpdhcp_start(): starting DHCP configuration
dhcp_discover()
transaction id xid(5851f42d)
dhcp_discover: making request
dhcp_discover: realloc()ing
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, DHCP_SERVER_PORT)
dhcp_discover: deleting()ing
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 500 msecs
dhcp_recv(pbuf = 0x3ffb80f8) from DHCP server 192.168.11.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
skipping option 28 in options
searching DHCP_OPTION_MESSAGE_TYPE
DHCP_OFFER received in DHCP_STATE_SELECTING state
dhcp_handle_offer(netif=0x3ffc52ec) st0
dhcp_handle_offer(): server 0x010ba8c0
dhcp_handle_offer(): offer for 0x4a0ba8c0
dhcp_select(netif=0x3ffc52ec) st0
transaction id xid(5851f42d)
dhcp_select: REQUESTING
dhcp_select(): set request timeout 2000 msecs
dhcp_recv(pbuf = 0x3ffb80f8) from DHCP server 192.168.11.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
skipping option 28 in options
searching DHCP_OPTION_MESSAGE_TYPE
DHCP_ACK received
dhcp_bind(netif=0x3ffc52ec) st0
dhcp_bind(): t0 renewal timer 86400 secs
dhcp_bind(): set request timeout 86400000 msecs
dhcp_bind(): t1 renewal timer 43200 secs
dhcp_bind(): set request timeout 43200000 msecs
dhcp_bind(): t2 rebind timer 75600 secs
dhcp_bind(): set request timeout 75600000 msecs
dhcp_bind(): IP: 0x4a0ba8c0 SN: 0x00ffffff GW: 0x010ba8c0
I (1868) event: ip: 192.168.11.74, mask: 255.255.255.0, gw: 192.168.11.1
@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 13, 2017

@negativekelvin
any tips to generate those DHCP logs... i didn't find any settings for lwip in the menuconfig

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 13, 2017

I enabled in esp-idf\components\lwip\include\lwip\port\lwipopts.h

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 13, 2017

Even if I set the initial dhcp discover timeout to 4s, it takes the same amount of time to receive the response from the router. So looks like dhcp server might be responsible for this delay.

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 13, 2017

must be my router/dhcp then... this is good to hear
ok, I will play around with lwip later tonight, and post back

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 14, 2017

I observed my router does 3 ARP requests to the IP address and waits for them to time out before handing it out so that accounts for the 3s

@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 14, 2017

@hchaudhary1 So were you referring to another DHCP server when you stated the following?

Other chips such as Atmel WINC1500 are able to connect and get IP in under 1 second.

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 14, 2017

@HubbyGitter

So were you referring to another DHCP server when you stated the following?

yes. connecting to the same router in the same location, I am finding the the DHCP client in the WINC is extremely fast/optimized. Total time connection+IP from router's DHCP is under 1 sec.

@negativekelvin

Even if I set the initial dhcp discover timeout to 4s, it takes the same amount of time to receive the response from the router. So looks like dhcp server might be responsible for this delay.

Testing the same code at my home, I am seeing ~700ms to connect and 20ms to get IP from DHCP. This is ideal. However, when I test in my lab, I get 740ms to connect, but an extra 2510ms to receive an ip.
Home: 720ms total -- ch9 Asus ac68u router
Lab: 3250ms total -- ch6 Apple a1408 router

All of this is using the following settings:
set channel + disable arp check + disable ampdu + 240mhz + Os

Will be messing with lwip DHCP settings today, and post back if I find any good results

@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 14, 2017

@hchaudhary1 MS Server e.g. has a setting for the number of "address conflict detection attempts" which can be set from 0 to 6. The IDF doesn't play a major role here although some people say that it's the client's responsibility to check and potentially reject the IP address suggested by the server.

Either way, if someone puts their DHCP server into paranoia mode, or a specific device just works this way, then there's not much one can do (except for static settings) because that DHCP server will take its time.

Maybe this one router wants to double check because your device is re-requesting a lease while it's still valid? Are you disconnecting or simply going to deep sleep?

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jul 14, 2017

@HubbyGitter
I see what you mean; I will try with a new router next.

Here are the DHCP logs for the slow Apple A1408 router:

----I (665) wifi: wifi timer task: 3ffc3c00, prio:22, stack:3584
I (695) phy: phy_version: 355.1, 59464c5, Jun 14 2017, 20:25:06, 0, 0
I (695) wifi: mode : sta (24:0a:c4:05:d6:f8)
I (695) >>hassan: Seeding the random number generator
I (695) >>hassan: Loading the CA root certificate...
I (705) >>hassan: Setting hostname for TLS session...
I (705) >>hassan: Setting up the SSL/TLS structure...
I (815) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (1385) wifi: state: init -> auth (b0)
I (1385) wifi: state: auth -> assoc (0)
I (1395) wifi: state: assoc -> run (10)

----I (1415) wifi: connected with apple-hotspot, channel 6
dhcp_start(netif=0x3ffc4314) st0
dhcp_start(): starting new DHCP client
dhcp_start(): allocated dhcpdhcp_start(): starting DHCP configuration
dhcp_discover()
transaction id xid(5851f42d)
----dhcp_discover: making request
dhcp_discover: realloc()ing
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, DHCP_SERVER_PORT)
dhcp_discover: deleting()ing
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 500 msecs
dhcp_coarse_tmr()
dhcp_fine_tmr(): request timeout
dhcp_timeout()
dhcp_timeout(): restarting discovery
dhcp_discover()
transaction id xid(5851f42d)
----dhcp_discover: making request
dhcp_discover: realloc()ing
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, DHCP_SERVER_PORT)
dhcp_discover: deleting()ing
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 1000 msecs
dhcp_recv(pbuf = 0x3ffc44bc) from DHCP server 192.168.245.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
searching DHCP_OPTION_MESSAGE_TYPE
----DHCP_OFFER received in DHCP_STATE_SELECTING state
dhcp_handle_offer(netif=0x3ffc4314) st0
dhcp_handle_offer(): server 0x01f5a8c0
dhcp_handle_offer(): offer for 0x5ff5a8c0
dhcp_select(netif=0x3ffc4314) st0
transaction id xid(5851f42d)
----dhcp_select: REQUESTING
dhcp_select(): set request timeout 2000 msecs
dhcp_recv(pbuf = 0x3ffc44bc) from DHCP server 192.168.245.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
searching DHCP_OPTION_MESSAGE_TYPE
dhcp_coarse_tmr()
dhcp_recv(pbuf = 0x3ffc44bc) from DHCP server 192.168.245.1 port 67
pbuf->len = 300
pbuf->tot_len = 300
searching DHCP_OPTION_MESSAGE_TYPE
--------DHCP_ACK received
dhcp_bind(netif=0x3ffc4314) st0
dhcp_bind(): t0 renewal timer 86400 secs
dhcp_bind(): set request timeout 86400000 msecs
dhcp_bind(): t1 renewal timer 43200 secs
dhcp_bind(): set request timeout 43200000 msecs
dhcp_bind(): t2 rebind timer 75600 secs
dhcp_bind(): set request timeout 75600000 msecs
dhcp_bind(): IP: 0x5ff5a8c0 SN: 0x00ffffff GW: 0x01f5a8c0

----I (3635) event: ip: 192.168.245.95, mask: 255.255.255.0, gw: 192.168.245.1

@HubbyGitter

This comment has been minimized.

Copy link

commented Jul 14, 2017

Certain brands, erm, aren't really concerned if their own products interoperate better than a mix with others.

Maybe you want to play with this. And I just wondered whether disconnecting WiFi will really automatically lead to a DHCP release.Guess what: According to my DHCP server's log, it doesn't.

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Jul 14, 2017

The log is similar to mine, delay is on dhcp server side. You get this even with existing lease?

Maybe winc1500 stores lease information in nvram and tries to renew on boot?

I don't know that there are any flags in dhcp packet that client can tell server not to check for IP conflict.

@FayeY FayeY changed the title Wifi -- SYSTEM_EVENT_STA_GOT_IP is taking over 4 seconds [TW#14022] Wifi -- SYSTEM_EVENT_STA_GOT_IP is taking over 4 seconds Jul 17, 2017

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Nov 22, 2017

Looks like Apple's MAC laptops also had this problem, and they solved using this method! Here is a quick explanation:

Last known DHCP server's MAC: aa:bb:cc:dd:ee:ff
Last known DHCP server's IP: 192.168.0.1
Last known ESP32 (aka Client) IP: 192.168.0.200

Method 1 (used by apple macbook):

  1. ARP unicast to aa:bb:cc:dd:ee:ff who-has 192.168.0.1 tell 192.168.0.200
  2. If response is successfully received, skip DHCP discover (and save yourself 3+ seconds), and go to dhcp_select/request 192.168.0.200.
  3. wait for DHCP ACK

Method 2 (used by android):

  1. Skip DHCP discover, and blindly do a dhcp_select/request 192.168.0.200
  2. if DHCP ACK, then you are done.
  3. if there is timeout, you must do a full DHCP request from scratch, starting from DHCP discover.

sources:
https://cafbit.com/post/rapid_dhcp_or_how_do/
https://news.ycombinator.com/item?id=2755461
https://tools.ietf.org/html/rfc1796

EDIT:
looks like apple even applied for a patent that was never granted:
http://www.google.com/patents?id=AMqyAAAAEBAJ

@hchaudhary1 hchaudhary1 reopened this Nov 22, 2017

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Nov 22, 2017

So the WINC is definitely saving its own ip as well as the ip of the DHCP server in flash.
Sniffing the traffic from the WINC1500 wifi module, it is confirmed that it uses "method 2" a DHCP request + "Gratuitous" ARP (not to be confused with a "unicast" ARP that Apple does). see wireshark packet sniff attached.

winc1500

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Nov 22, 2017

Are you going to develop save/restore lease for dhcp client?

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Nov 23, 2017

yes I will implement it as a proof of concept.
looks like all I have to do is save the struct netif and call the dhcp_reboot().

The problem is that these LwIP functions are not meant to be called by user/application code, as it might mess up the state or do a non-thread safe call etc etc. Could use some guidance from ESP team or LwIP experts here.

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Nov 23, 2017

Maybe better to call dhcp_network_changed

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Nov 23, 2017

yea, that's another option, since it is an exposed API call... I would also have to call dhcp_set_state(dhcp, DHCP_STATE_REBOOTING) beforehand though. Otherwise it will just do a full DHCP discover. Bit of a catch-22 situation.

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Nov 23, 2017

If you restore netif->dhcp then it's state should be DHCP_STATE_BOUND which leads to reboot call, because when you save the dhcp it should only be saved if it is valid aka bound

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Nov 23, 2017

true! good point. dhcp_network_changed() it is

@urbanze

This comment has been minimized.

Copy link

commented Jan 2, 2018

Here it takes 3-4 seconds for the feedback that was connected, have you solved the problem after all this time? What do I need to do to lessen this gigantic time?

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jan 2, 2018

Here is what you can do for now:
A. fast way to solve this is configuring your router to enable "DHCP Reservation" for your ESP's IP. This is a 50/50 hit or miss
B. change to a faster/newer/different router that responds faster to DHCP requests
C. hack the LwiP library in the ESP and make it smarter, as mentioned in my above posts.
D. use a static IP

At some point in the future (mid 2018), I might implement item C and post back, but I don't have my final hardware yet, so I am not in any hurry.

@liuzfesp

This comment has been minimized.

Copy link
Collaborator

commented Mar 5, 2018

Hi @hchaudhary1 the min time 300ms in scan config is too large if you want to make the connecting process fast, e.g. if the target AP is in channel 13, then from channel 1~11 is active scan, it takes 300 * 11=3300ms, from channel 12 to channel 13 is passive scan, it takes 1000*2=2000ms, in other words, it takes 3300+2000=5300ms to find the target AP, it's too long. My suggestion is use the default min/max scan time, it's 130ms/360ms.

As for DHCP process, if there is no re-transmit at all, the whole DHCP discover processs (discover/reply/request/offer) should take <500ms, it costs too much time most of time is because the re-transmit of DHCP process. In official the LWIP, if there DHCP discover failed, LWIP restart the process with increased delay: 2s, 4s, 8s, 16s, 32s, 60s, 60s..., it means if the first time DHCP discover failed, LWIP restarts the whole process 2s later, if the 2nd one failed again, LWIP restarts the whole process 4s later ... in ESP LWIP, in order to make the DHCP process fast, we already add some optimizations, such as reduced the delay time to 500ms, 1s, 2s, 4s, 8s, 15s, 15s.

However, since DHCP process is based on UDP protocol, it has potential to fail, especially when the network is not so good, e.g. the device is far away from the AP etc.

Another factor that causes DHCP costs too much time is about DHCP server, I see you already some some method to optimize it.

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Mar 5, 2018

hi @liuzfesp
yes, the main problem is slow DHCP server. So in case of slow server ESP should Skip DHCP discover, by using method 1 or 2 as stated above. I will make some code and a pull-request later...

@liuzfesp

This comment has been minimized.

Copy link
Collaborator

commented Mar 5, 2018

Hi @zhangyanjiaoesp, please help to check whether @hchaudhary1 's optimization that skipping the DHCP discovery, can be taken into ESP IDF.

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Jun 5, 2018

Here is how I think it works:

Original esp-idf flow:

  1. wifi_connect() --> system_event_sta_connected_handle_default() -- > tcpip_adapter_dhcpc_start() --> SYSTEM_EVENT_STA_GOT_IP

Proposed flow (untested):

  1. tcpip_adapter_init()
  2. esp_wifi_start()
  3. callback: tcpip_adapter_start()/SYSTEM_EVENT_STA_START (esp_netif[TCPIP_ADAPTER_IF_STA] is now calloc'd)
  4. restore_netif_from_nvs() -- new function
  5. malloc & restore DHCP struct. dhcp->state = DHCP_STATE_BOUND (reflects that DHCP has existing lease)
  6. esp_wifi_connect() --> system_event_sta_connected_handle_default() --> tcpip_adapter_dhcpc_start() --> dhcp_start() modified to NOT run dhcp_discover() first time
  7. dhcp_network_changed() --> dhcp_bind() --> tcpip_adapter_dhcpc_cb() --> SYSTEM_EVENT_STA_GOT_IP
@Stasikiii

This comment has been minimized.

Copy link

commented Aug 27, 2018

@hchaudhary1
CONFIG_LWIP_DHCP_DOES_ARP_CHECK
menuconfig -> Component config -> LWIP
just uncheck -> DHCP: Perform ARP check on any offered address
ahhah

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Aug 27, 2018

@Stasikiii arp check happens after DHCP lease is already offered.

@negativekelvin

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2018

@hchaudhary1

This comment has been minimized.

Copy link
Author

commented Sep 10, 2018

looks like the hooks are getting placed!

@zhangyanjiaoesp

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

@hchaudhary1 The changes in esp-lwip has been mergerd, and the corresponding changes in IDF will be merged soon. After that, you can use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.