Return error in class deepSleep() when max sleeptime is exceeded #4936

Open
wants to merge 26 commits into
base: master

Marc-Vreeburg

The proposed code change returns 0 when the maximum sleep time is exceeded.

#endif
return 0; // error: max sleeptime exceeded
}
else
Collaborator

Please remove else and braces.

Collaborator

@devyte devyte left a comment

Apologies, I missed this detail in my previous review.

#ifdef DEBUG_SERIAL
DEBUG_SERIAL.println("Error: max sleeptime exceeded");
#endif
return 0; // error: max sleeptime exceeded
Collaborator

return type is bool, but returned value is int => return false here

Author

You're right.
Strange though that this is not detected by git code check.

system_deep_sleep_set_option(static_cast<int>(mode));
system_deep_sleep(time_us);
esp_yield();
return 1; // never gets called
Collaborator

return type is bool, but returned value is int => return true here

@liebman
Contributor

liebman commented Jul 20, 2018

I have a question. I've noticed that the deepSleepMax() value can vary. For instance: deep sleep for an hour, take an action, and then go back to sleep. I print the deepSleepMax() value and it changes. Does this value only change during a sleep, or could it vary between calling deepSleepMax() and then calling deepSleep()? Asking because that could create a race condition even with the check being added here.

@devyte
Collaborator

devyte commented Jul 20, 2018

I think it can only vary within the sys context, i.e.: between loop()s, or between a yield() and return. That means that it shouldn't vary between the call to deepSleepMax() and the actual sleep.
The above is an educated guess though. It's based on my experience with how the SDK works in general, and not on any deep knowledge of what goes on under the hood wrt sleep, so I could be wrong.

Collaborator

@earlephilhower earlephilhower left a comment

I'm not a user of the ESP in powerdown modes, and this patch is technically correct, so I'm fine approving this patch.

However, since you are a user, do please think about what you and others would expect it to do if passed in a value that's greater than the (potentially variable) deepSleepMax() time.

{
if (time_us > deepSleepMax())
Collaborator

Would it not be more useful (and help fix people's code that is already broken) if, instead of possibly printing an error and doing return false;, you just printed a warning and adjusted the time down to deepSleepMax()?

Existing code will not expect to ever get a return value and potentially go off into lala-land, since this was a void.

"Be conservative in what you do, be liberal in what you accept from others."
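
A minimal sketch of the clamping behavior suggested here, assuming the limit is passed in as a plain value instead of being read from the SDK. clampSleepTime is a hypothetical name, not part of the PR or of the ESP8266 core:

```cpp
#include <cstdint>

// Hypothetical helper (not part of the PR): cap a requested sleep time at the
// current maximum instead of rejecting the request.
constexpr uint64_t clampSleepTime(uint64_t requested_us, uint64_t max_us) {
    return requested_us > max_us ? max_us : requested_us;
}
```

Inside deepSleep() this might be used as time_us = clampSleepTime(time_us, deepSleepMax());, so existing callers that never look at a return value still get a sleep that ends.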

Author

@earlephilhower Point taken. I'll make some changes to the code.

Collaborator

@earlephilhower earlephilhower left a comment

Looks good with just the floating point math issue noted below.

@@ -109,6 +109,13 @@ extern "C" void esp_yield();

void EspClass::deepSleep(uint64_t time_us, WakeMode mode)
{
if (time_us > deepSleepMax()) // we need to prevent the esp8266 from not waking up from deepsleep
{
time_us = (deepSleepMax() - (round(deepSleepMax() * 5 / 100))); // 5% correction because of inaccurate timekeeping by the esp8266
Collaborator

No need for round() here. You're doing integer division already, so this takes the integer division result, promotes it to a double, then promotes deepSleepMax() to a double, subtracts them as doubles, then truncates it all back to an integer. A shorter way which won't cause a program to include the whole floating point libs (if not being included already):
time_us = (deepSleepMax() * 95L) / 100L;
or if time_us can be > 2^24 (so the *95 bit could overflow)
time_us = deepSleepMax() - (deepSleepMax() / 20);

I bring this up because there was much howling of pain when we included floating point support in printf with the latest release, as some folks' binaries ended up being just a hair too big for OTA...
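
As a sketch of the second, overflow-safe form, assuming the maximum is a 64-bit microsecond count like the time_us parameter in the diff. ninetyFivePercent is a hypothetical name used only for illustration:

```cpp
#include <cstdint>

// Hypothetical helper: roughly 95 % of max_us using integer arithmetic only.
// Subtracting max_us / 20 avoids the intermediate product max_us * 95, so the
// result cannot overflow for any uint64_t input and no floating point is used.
constexpr uint64_t ninetyFivePercent(uint64_t max_us) {
    return max_us - max_us / 20;
}
```

Because the division truncates, this form rounds up where (max_us * 95) / 100 rounds down, so the two can differ by one microsecond.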

@devyte
Collaborator

devyte commented Jul 26, 2018

@marcvtew did you actually encounter a case where the deepsleep would get stuck with a value close to the max?
I'm not an expert in this, but my understanding is that the max value is related to the max number that can be put in some timer without overflowing it. If it does overflow, then behavior is undefined. I say related because there is another internal factor that gets updated every so often.
What I mean is that, in theory, sleeping for max time should always be possible. If it's not, then that should be documented and reported to Espressif.
Or is this 5% meant to be more accurate wrt how long the ESP actually sleeps? If so, then most likely it should be shaved off in all cases, not just when the specified value is > max.

@Marc-Vreeburg
Author

Marc-Vreeburg commented Jul 26, 2018

@devyte Made some code to read and display maxSleeptime(). The esp8266 on my Lolin board has a max sleeptime of 13840678905 microseconds. I'll set my sleeptime to 13840678904 microseconds and see if it wakes up.

Actually there's a difference of more than 35 seconds between two different calculations of maxSleeptime(). So I'll settle for a sleeptime of 13800000000 microseconds (230 minutes).

@Marc-Vreeburg
Author

@devyte I made some code to measure the value of maxDeepsleep() every two seconds, remembering the absolute max value and the absolute min value measured. I let the code run for about 15 minutes to get a stable result:
max deep sleep MAX: 14386462713 microseconds
max deep sleep MIN: 13704364025 microseconds
max deep sleep DIF: 682098688 microseconds

So, in my case the shortest calculated max sleep time differs by more than 11 minutes from the longest one (the max ranges from about 3.8 to about 4 hours of deep sleep). It looks to me like it's not sufficient to test (and set) the deep sleep time only once (i.e. before going to sleep), because during deep sleep the calculated max sleep time may become shorter (worst case: by more than 11 minutes), resulting in a timer overflow (the esp8266 never restarts although we passed if (time_us > (deepSleepMax())).

Safe deep sleep time for me seems to be 13704364025 microseconds. With this sleep time the esp8266 should always wake up. I'll test this tomorrow.
Esp.cpp needs an adjustment like this, I think:
if (time_us > (deepSleepMax() - 682098688 or % of deepSleepMax())) // choose one of the two methods
{
    time_us = deepSleepMax() - 682098688 or % of deepSleepMax(); // choose one of the two methods
}
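
The min/max measurement described in this comment could be tracked along these lines. This is only a sketch: Range, update and spread are hypothetical names, and reading the value from the device every two seconds is left out.

```cpp
#include <cstdint>

// Hypothetical tracker for the smallest and largest reading seen so far.
struct Range {
    uint64_t min;
    uint64_t max;
};

// Start with an "empty" range so the first sample sets both ends.
constexpr Range emptyRange() {
    return Range{UINT64_MAX, 0};
}

// Fold one new reading into the range.
constexpr Range update(Range r, uint64_t sample) {
    return Range{sample < r.min ? sample : r.min,
                 sample > r.max ? sample : r.max};
}

// Difference between the largest and smallest reading.
constexpr uint64_t spread(Range r) {
    return r.max - r.min;
}
```

Feeding in the two extremes quoted here, 14386462713 and 13704364025, gives a spread of 682098688 microseconds, a little over 11 minutes.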

@d-a-v
Collaborator

d-a-v commented Aug 3, 2018

True. The datasheet says 2^31-1.
So we would have these equivalent inequalities:

  • (time_in_us / cali) << 12 < 2^31 - 1 (datasheet)
  • time_in_us < (cali << (31-12))
  • time_in_us < (cali<<19)
  • time_in_s * 1000000 < cali * 524288
  • time_in_s < cali * 0.524288

which is implied (and so assured) by

  • time_in_s < cali/2

But (datasheet):

the cali is the RTC clock period (in us); bit11 ~ bit0 are decimal.

So we need to either interpret this decimal part (the datasheet gives the way to interpret it; it gives a maximum of 1000s / 16mn40s, not 100% sure about this), or remove it:

  • time_in_s < ((cali & ~((1<<12)-1)) / 2), same as
  • time_in_s < ((cali >> 12) << 11)

My board gives 10240s with this formula (2h50mn). I remember having seen it fluctuating a bit.
edit: now 12288s (3h24mn)
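
The last formula can be sketched as a small function. It assumes cali_raw stands in for the raw value returned by system_rtc_clock_cali_proc(), with bits 11..0 as the fractional part. conservativeMaxSleepSeconds is a hypothetical name, and the inputs in the usage note are illustrative raw values, not measurements from this thread.

```cpp
#include <cstdint>

// Hypothetical helper: drop the 12 fractional bits of the raw cali value and
// halve the result, i.e. (cali_raw & ~0xFFF) / 2, written as a pair of shifts.
constexpr uint32_t conservativeMaxSleepSeconds(uint32_t cali_raw) {
    return (cali_raw >> 12) << 11;
}
```

Any raw value from 20480 to 24575 gives 10240 s and any value from 24576 to 28671 gives 12288 s, so the result moves in steps of 2048 s as cali drifts across a multiple of 4096.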

@5chufti
Contributor

5chufti commented Aug 3, 2018

For me this is purely academic; I cannot see any use in ESP.deepSleep(ESP.deepSleepMax());
For a start I would have expected the API call to return false if the max_deep_sleep_time is exceeded, but I forgot we are dealing with Espressif ...

So the decent thing to do would be to either
a) replace the passed value with tmax if exceeded
(I think this is what API does instead of returning false)
b) return with "false" if tmax exceeded

just for the unlikely case that someone really exceeds tmax for ESP.deepSleep();
They should check beforehand on the use and probable range in the documentation and/or various forum posts. I don't even see the need for deepSleepMax().

regarding 2^31 vs. 2^32 see here #4969 (comment) (or 3.3.9 vs 3.3.50)

@artua

artua commented Aug 3, 2018 via email

@d-a-v
Collaborator

d-a-v commented Aug 3, 2018

check the sleep cycles left

How are you doing that?

@5chufti
Contributor

5chufti commented Aug 3, 2018

@artua
still I wouldn't use deepSleepMax() but a value that I can deal with, like 9000s (2.5h).
without any true reference like ntp or rtc you'll never get repeatable cycles anyhow.

@marcvtew
your formula rewrite is not mathematically correct

@d-a-v
probably he has his 1024bit counter in rtc-ram, decrementing by current deepSleepMax() just before deepSleep() in every cycle... (but don't forget to correct by micros(), since wake time also counts!)
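
One way the counter idea might look, as a sketch under assumptions: the remaining time is kept as a 64-bit microsecond count, the time spent awake is subtracted on each wake-up, and each sleep is capped at a chunk the caller considers safe. SleepPlan and planNextSleep are hypothetical names; storing the counter in RTC memory and the actual sleep call are left out.

```cpp
#include <cstdint>

// Hypothetical result of planning one sleep cycle.
struct SleepPlan {
    uint64_t sleep_now_us;  // duration to request for the next deep sleep
    uint64_t remaining_us;  // value to store for the following wake-up
};

// remaining_us: time still to sleep, as stored before the previous sleep
// awake_us:     time spent awake since wake-up (for example from micros())
// chunk_us:     longest single sleep the caller considers safe
constexpr SleepPlan planNextSleep(uint64_t remaining_us, uint64_t awake_us,
                                  uint64_t chunk_us) {
    // Wake time counts against the total, but never below zero.
    const uint64_t left = remaining_us > awake_us ? remaining_us - awake_us : 0;
    // Sleep for one chunk at most.
    const uint64_t now = left < chunk_us ? left : chunk_us;
    return SleepPlan{now, left - now};
}
```

With a 10 hour total and 9000 s chunks, the first call returns a 9000 s sleep and 27000 s remaining; once sleep_now_us comes back as 0 the long sleep is complete.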

@5chufti
Contributor

5chufti commented Aug 3, 2018

do we really want/need two deepSleepMax versions?

@Marc-Vreeburg
Author

I'm using my esp8266's for IoT purposes, mostly battery powered. For me not knowing the safe max sleeptime makes the esp8266 totally useless.

@5chufti
Contributor

5chufti commented Aug 3, 2018

That's why I argue for only one function, returning a safe value.
Who needs a function returning a value "without any value" (not being reliable)?

@Marc-Vreeburg
Author

@5chufti I agree that it's best to have only one meaningful function. However, this might mean that we will be deviating from the API specification of deepSleepMax().
I've tested ESP.deepSleep(ESP.deepSleepMax(), WAKE_RF_DEFAULT) twice. Both times the esp8266 eventually did NOT wake up from deepsleep anymore. Is it possible for you to check my results?

@d-a-v
Collaborator

d-a-v commented Aug 3, 2018

Since we are still speaking in uS, and bits 0..11 are BCD, and(?) we don't wish to deal with them, we could use:
time_in_us < ( ((uint64_t)(system_rtc_clock_cali_proc() >> 12)) << (12+19))

@artua

artua commented Aug 3, 2018 via email

@5chufti
Contributor

5chufti commented Aug 3, 2018

@d-a-v
BCD? any confidential info? not even the rtc "example" in A.2 does suggest that?

@marcvtew
as deepSleepMax() is already in the latest official release, keep it, BUT have it return a safe value (the reason can be documented in the function; it more likely serves the purpose than the dubious calculation example in the SDK doc)

edit: former test result was not correct, have to repeat

@5chufti
Contributor

5chufti commented Aug 3, 2018

@marcvtew
unfortunately you're right, no wake-up when tmax is exceeded.
I call this a severe API flaw: system_deep_sleep() is of type bool and should return false. When, if not when called with too high a deep sleep time?
but I really wonder: how did they do it, so it never wakes up again?
If you load a timer register with a value exceeding the register width, one would usually assume the timeout would occur after the "modulo" ...

@devyte
Collaborator

devyte commented Aug 3, 2018

@5chufti some timers don't continue to count past overflow, or have undefined behavior if they overflow. Another explanation is that some interrupt is triggered on overflow, but no ISR is assigned, or a garbage ISR is assigned.
Whatever else, I see two distinct usage cases here:

  1. sleep for a specific time, which should always be "safe"
  2. sleep for the max time possible, whatever that time is, even if that max time changes over time

The key point here is that neither of the above cases is really addressed in the current Espressif SDK API or docs.

  • The formula for the max theoretical sleep time is useless, because cali could change between the calculation and sleep start. Per @5chufti it could change even between calls within the same loop(). I didn't observe this in my own tests, for me it was always constant within a loop, but could change between loops, or during a yield(), but then again I didn't exactly spend a lot of time on that testing.
    Given that race condition, there is no way that 1. or 2. above can be done "safely".
  • If the max time is exceeded, in some cases you get an error message on Serial, in other cases the ESP doesn't wake up. I suspect that the code that checks the sleep time on the SDK side suffers from the same race condition due to cali changing.

@5chufti
Contributor

5chufti commented Aug 4, 2018

breaking news:

system_deep_sleep_instant(time_us);

behaves much better as it at least restarts if tmax is exceeded.
(and also only 2^31-1 is feasible -> another "glitch" in the datasheet)

@devyte
we're talking about hw timer / compare registers, so no isr etc.

@5chufti
Contributor

5chufti commented Aug 4, 2018

I now personally have settled for this approach

void EspClass::deepSleep(uint64_t time_us, WakeMode mode)
{
    if (mode == WAKE_NO_RFCAL)
        system_phy_set_powerup_option(2);
    else
        system_deep_sleep_set_option(static_cast<int>(mode));
    if (time_us & 0x0800000000ull)
        time_us = (uint64_t)system_rtc_clock_cali_proc() * 0x7A120ull;
    if (time_us & 0x1000000000ull)
        system_deep_sleep(time_us);
    else
        system_deep_sleep_instant(time_us);
    esp_yield();
}

now I can easily select the functionality I want with

ESP.deepSleep(sleepTime[, wakeupmode])                         // backward compatible   
ESP.deepSleep(sleepTime | 0x0800000000ull[, wakeupmode]);      // do max safe deepsleep
ESP.deepSleep(sleepTime | 0x1000000000ull[, wakeupmode]);      // do instant deepsleep
ESP.deepSleep(sleepTime | 0x1800000000ull[, wakeupmode]);      // do max safe instant deepsleep

Edit: inserted probable workaround for #3408
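
A sketch of how the two flag bits could be separated from the duration before anything is passed to the SDK, assuming the usage comments above describe the intended behavior. In the snippet as posted, the bit 0x1000000000 selects system_deep_sleep() and its absence selects system_deep_sleep_instant(), which reads as the reverse of those comments, and the flag bits are not masked off before the call. The names below (kFlagMaxSafe, kFlagInstant, SleepRequest, decodeSleepRequest) are hypothetical and the sketch makes no SDK calls.

```cpp
#include <cstdint>

// Flag values taken from the usage comments in this thread.
constexpr uint64_t kFlagMaxSafe = 0x0800000000ULL;  // bit 35: use max safe time
constexpr uint64_t kFlagInstant = 0x1000000000ULL;  // bit 36: use instant sleep
constexpr uint64_t kFlagMask = kFlagMaxSafe | kFlagInstant;

// Hypothetical decoded form of the combined argument.
struct SleepRequest {
    uint64_t duration_us;  // requested time with both flag bits cleared
    bool use_max_safe;     // caller asked for the maximum safe sleep
    bool instant;          // caller asked for the instant variant
};

constexpr SleepRequest decodeSleepRequest(uint64_t time_us) {
    return SleepRequest{time_us & ~kFlagMask,
                        (time_us & kFlagMaxSafe) != 0,
                        (time_us & kFlagInstant) != 0};
}
```

Both flag bits lie above the largest sleep times reported in this thread (about 1.4e10 microseconds, below 2^35), which is what leaves them free to carry options.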

@devyte
Collaborator

devyte commented Aug 6, 2018

The current implementation uses system_deep_sleep(), which internally shuts down the wifi core "safely" (whatever that means), and then goes to sleep.
I think that to get this to work as expected, @5chufti is correct in that we should use system_deep_sleep_instant() to go to sleep immediately, which should remove the race condition.
My only worry is the meaning of shutting down the wifi core "safely", but there are several possibilities here:

  1. going to sleep immediately has no impact on sleep power consumption, so we don't care about safe or non-safe
  2. the user is responsible for shutting down wifi before going to sleep
  3. we shut down wifi before calling these api calls, and hide that from the user

@Marc-Vreeburg
Author

Over the past 30 hours I've tested whether the esp8266 wakes up with deepSleepMax() = ((uint64_t)system_rtc_clock_cali_proc()/2*(0xF4240ULL)-1). My esp8266 restarted 11 times, so with this formula the esp8266 seems to always wake up.

In one of the API docs I remember reading that the system_deep_sleep_instant() function is void. Is this function still being supported by Espressif?

I did some testing shutting down wifi before going into system_deep_sleep(). Made no difference: esp8266 did not wake up with time_us = deepSleepMax(). Maybe someone else is willing to test this as well?
