Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix usb sleep for NRF #2868

Merged
merged 5 commits into from
May 21, 2020
Merged

Fix usb sleep for NRF #2868

merged 5 commits into from
May 21, 2020

Conversation

xobs
Copy link
Collaborator

@xobs xobs commented May 8, 2020

This fixes connecting and disconnecting USB while executing time.sleep() on NRF. This issue may be present on other ports, so similar fixes should be implemented on other platforms as well.

@xobs xobs changed the title Fix usb sleep nrf DO NOT MERGE: Fix usb sleep nrf May 8, 2020
@xobs
Copy link
Collaborator Author

xobs commented May 8, 2020

Note that this patch doesn't actually work, because you cannot run tud_task() with interrupts disabled, as it will possibly deadlock:

Program received signal SIGINT, Interrupt.
edpt_dma_start (reg_startep=0x40027010) at ../../lib/tinyusb/src/portable/nordic/nrf5x/dcd_nrf5x.c:104
104           while ( _dcd.dma_running )  { }
(gdb) bt
#0  edpt_dma_start (reg_startep=0x40027010) at ../../lib/tinyusb/src/portable/nordic/nrf5x/dcd_nrf5x.c:104
#1  0x0002f1f6 in xact_in_prepare (epnum=<optimized out>) at ../../lib/tinyusb/src/portable/nordic/nrf5x/dcd_nrf5x.c:180
#2  0x0002f290 in dcd_edpt_xfer (ep_addr=<optimized out>,
    buffer=buffer@entry=0x2000b1d0 <_mscd_itf+32> "USBS\340\331", <incomplete sequence \360>, total_bytes=<optimized out>,
    rhport=0 '\000') at ../../lib/tinyusb/src/portable/nordic/nrf5x/dcd_nrf5x.c:314
#3  0x0002f2c8 in usbd_edpt_xfer (rhport=rhport@entry=0 '\000', ep_addr=<optimized out>,
    buffer=buffer@entry=0x2000b1d0 <_mscd_itf+32> "USBS\340\331", <incomplete sequence \360>, total_bytes=total_bytes@entry=13)
    at ../../lib/tinyusb/src/device/usbd.c:1002
#4  0x00037da8 in mscd_xfer_cb (rhport=<optimized out>, ep_addr=<optimized out>, event=<optimized out>,
    xferred_bytes=<optimized out>) at ../../lib/tinyusb/src/class/msc/msc_device.c:629
#5  0x00030936 in tud_task () at ../../lib/tinyusb/src/device/usbd.c:427
#6  tud_task () at ../../lib/tinyusb/src/device/usbd.c:360
#7  usb_background () at ../../supervisor/shared/usb/usb.c:78
#8  0x00038b02 in run_background_tasks () at background.c:62
#9  run_background_tasks () at background.c:55
#10 0x00038f42 in supervisor_run_background_tasks_if_tick () at ../../supervisor/shared/tick.c:86
#11 port_sleep_until_interrupt () at supervisor/port.c:256
#12 port_sleep_until_interrupt () at supervisor/port.c:236
#13 mp_hal_delay_ms (delay=<optimized out>) at ../../supervisor/shared/tick.c:110
#14 0x00038f8a in common_hal_time_delay_ms (delay=<optimized out>) at ../../shared-module/time/__init__.c:44
#15 time_sleep (seconds_o=<optimized out>) at ../../shared-bindings/time/__init__.c:82
#16 0x00052484 in mp_call_method_n_kw (args=0x2001fe94, n_kw=<optimized out>, n_args=<optimized out>) at ../../py/runtime.c:636
#17 mp_execute_bytecode (code_state=code_state@entry=0x2001fe80, inject_exc=<optimized out>, inject_exc@entry=0x0)
    at ../../py/vm.c:1017
#18 0x00053690 in fun_bc_call (self_in=0x200198a0, n_args=<optimized out>, n_kw=0, args=0x0) at ../../py/objfun.c:286
#19 0x00050016 in parse_compile_execute (source=source@entry=0x5434d, input_kind=input_kind@entry=MP_PARSE_FILE_INPUT,
    exec_flags=exec_flags@entry=32, result=result@entry=0x2001ffb8) at ../../lib/utils/pyexec.c:114
#20 0x00050118 in pyexec_file (result=0x2001ffb8, filename=0x5434d "code.py") at ../../lib/utils/pyexec.c:529
#21 maybe_run_list (filenames=filenames@entry=0x20008100 <supported_filenames>, exec_result=exec_result@entry=0x2001ffb8)
    at ../../main.c:185
#22 0x000508de in run_code_py (safe_mode=<optimized out>) at ../../main.c:243
#23 main () at ../../main.c:467
(gdb) list
99            return;
100         }
101         else
102         {
103           // Otherwise simply block wait
104           while ( _dcd.dma_running )  { }
105         }
106       }
107
108       _dcd.dma_running = true;
(gdb)

@xobs xobs changed the title DO NOT MERGE: Fix usb sleep nrf Fix usb sleep for NRF May 8, 2020
@xobs
Copy link
Collaborator Author

xobs commented May 8, 2020

This patchset requires hathach/tinyusb#396

There may be other areas where RUN_BACKGROUND_TASKS cannot be run with interrupts disabled, however this patchset covers all cases I've encountered so far.

@xobs
Copy link
Collaborator Author

xobs commented May 8, 2020

Update: With this patchset, usage is down to 10 uA when running on battery. It quickly and reliably wakes from sleep, and it appears to be capable of logging to internal flash.

@tannewt
Copy link
Member

tannewt commented May 8, 2020

Update: With this patchset, usage is down to 10 uA when running on battery. It quickly and reliably wakes from sleep, and it appears to be capable of logging to internal flash.

Why do you think this improves power usage so dramatically? Were we never actually sleeping?

@xobs
Copy link
Collaborator Author

xobs commented May 9, 2020

Update: With this patchset, usage is down to 10 uA when running on battery. It quickly and reliably wakes from sleep, and it appears to be capable of logging to internal flash.

Why do you think this improves power usage so dramatically? Were we never actually sleeping?

That was poor wording on my part. Without these patches, I couldn't reliably test, since whenever it went on battery it would effectively crash. With this patchset in place, I was able to concentrate on other parts of the chip in order to bring current consumption down there.

The most recent victory was setting the I2S_SCK and I2S_LRCK lines to output-low, which prevents those lines from floating to the midpoint and toggling the microphone. This was causing the microphone to wake up, which was burning 330 uA.

Without this patchset, it would grind to a halt prematurely, causing the host to reset it (if connected via USB), or delaying flash writes until the system awoke from sleep.

@tannewt
Copy link
Member

tannewt commented May 11, 2020

That was poor wording on my part. Without these patches, I couldn't reliably test, since whenever it went on battery it would effectively crash. With this patchset in place, I was able to concentrate on other parts of the chip in order to bring current consumption down there.

kk, awesome! I'm excited to see the subsequent PRs. Are you blocked on TinyUSB changes now?

@xobs
Copy link
Collaborator Author

xobs commented May 12, 2020

This does depend on the tinyusb patch, because tud_task() can hang the system if interrupts are disabled. In fact, the hang only occurs if it would have encountered the disconnect/connect issue that this patch aims to fix.

My concern is that other background tasks require interrupts to be enabled, and I am unable to exercise those. However, I do think that others will begin to report issues with missed events with the lower power WFI patch, and some approach like this will be required to fix them. So it's probably better to figure out what crashes and fix it.

@tannewt
Copy link
Member

tannewt commented May 12, 2020

I'm assuming we'll have bugs to track down with the lower_power merge. So, merging this is fine once TinyUSB has the needed update. We can test the other background stuff in the beta releases.

// available, even though the actual handler won't fire until
// we re-enable interrupts.
common_hal_mcu_disable_interrupts();
RUN_BACKGROUND_TASKS; // Drain any buffers remaining
Copy link
Member

@hathach hathach May 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

background task are run in the caller mp_hal_delay_ms(), we don't have to run it here, other platforms don't seem to run this before WFI() as well. I am not so sure though, will do more testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, but you must run background tests after you disable interrupts, to ensure that e.g. the USB queues are empty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I know that intention of yours, I am thinking a way to make tinyusb maybe cooperate better with wfi() 🤔 🤔

Copy link
Member

@hathach hathach May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannewt @xobs we shouldn't run background task with interrupt disabled. This particular issue is mentioned by TockOS team (note 3) a few months ago hathach/tinyusb#279 . but I overlooked it. The better implementation should be an function to check if there is anything to run with the event queue of tud_task(), if yes, don't enter WFI(), re-enable interrupt, and loop over again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two approaches that we can take, and I think @tannewt should decide which approach they think is best:

  1. Allow background tasks to be run with interrupts disabled. This is the approach that's started by this patch, but there may be many other minefields in other code that requires interrupts to be enabled in order to function properly
  2. Add a new SHOULD_RUN_BACKGROUND_TASKS macro that polls each subsystem, similar to how RUN_BACKGROUND_TASKS runs each subsystem. This SHOULD_RUN_BACKGROUND_TASKS will include a call to tud_task_is_queue_empty(), among other calls.

Am I correct in stating you're advocating for approach 2? This patch takes approach 1, because I thought it would be easier to implement and require less code overall. Also because tud_task_is_queue_empty() didn't exist. Approach 2 does benefit from potentially allowing non-critical-section-safe code from going into the RUN_BACKGROUND_TASKS macro, which seems cleaner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah you are spot-on and I am adding the usb event queue count API() just now. Your PR has already implemented 99% of the work. We only need to change the run background to if still has more background then skip before executing wfi().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR for tinyusb hathach/tinyusb#414 is on the way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xobs I was thinking 2 would be our approach to background tasks long term but @jepler prototyped a callback queue in #2879

@hathach
Copy link
Member

hathach commented May 20, 2020

Note that this patch doesn't actually work, because you cannot run tud_task() with interrupts disabled, as it will possibly deadlock:

How often do you encounter the deadblock, I switch to your PR branch to try reproducing it. I know it is totally possible, but it is bit hard to encounter. I may just simulate it with other modified example, the goal is to check the corresponding registers to find a simpler alternative fix to hathach/tinyusb#396 than invoking the dcd_int_handler() if possible.

@xobs
Copy link
Collaborator Author

xobs commented May 21, 2020

I encountered it maybe 20% of the time.

The problem always seemed to occur on the last packet as part of an MSC SCSI transaction (the same packet that causes adafruit/Adafruit_nRF52_Bootloader#120 interestingly enough) that gets stuck in the queue between calling tud_task() and WFI. If tud_task() is called with data in the buffer, then it will deadlock.

@hathach
Copy link
Member

hathach commented May 21, 2020

I encountered it maybe 20% of the time.

The problem always seemed to occur on the last packet as part of an MSC SCSI transaction (the same packet that causes adafruit/Adafruit_nRF52_Bootloader#120 interestingly enough) that gets stuck in the queue between calling tud_task() and WFI. If tud_task() is called with data in the buffer, then it will deadlock.

Thanks for the reply, I hardly got into it, I guess it depends on the host OS as well since it is racing. I will try to simulate it then with my simple example, it shouldn't be much different. I may need your help to test with it again if you have some time :).

@xobs
Copy link
Collaborator Author

xobs commented May 21, 2020

Sure.

I still need to rework my USB Beagle -- an inductor has decided it doesn't like its house on the PCB, and has started trying to separate from the PCB -- but I can test it.

This seems like a discussion for hathach/tinyusb#396 though.

For this issue, I'm confused about what @tannewt means -- with the background_callback work, does that mean I should create a SHOULD_RUN_BACKGROUND_TASKS that includes a call to tud_task_is_queue_empty()? Or does the addition of background_callback mean we should make RUN_BACKGROUND_TASKS work with interrupts disabled?

xobs added 5 commits May 21, 2020 21:34
This update gives us access to a function we can run with interrupts
disabled to determine if the queue is empty.

Signed-off-by: Sean Cross <sean@xobs.io>
Allow for passing `-DCFG_TUSB_DEBUG=1` or `-DCFG_TUSB_DEBUG=2` on the
command line to enable debugging tinyusb within circuitpython.

Signed-off-by: Sean Cross <sean@xobs.io>
In order to ensure we don't have any outstanding requests, disable
interrupts prior to issuing `WFI`.

As part of this process, check to see if there are any pending USB
requests, and only execute the `WFI` if there is no pending data.

This fixes micropython#2855 on NRF.

Signed-off-by: Sean Cross <sean@xobs.io>
ARM recommends issuing a DSB instruction propr to issuing WFI, as it is
required on many parts suchas Cortex-M7. This is effectively a no-op on
the Cortex-M4 used in most NRF parts, however it ensures that we won't
be surprised when new parts come out.

See
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/BIHICBGB.html
for more information.

Signed-off-by: Sean Cross <sean@xobs.io>
If interrupts are disabled, then calling sd_* functions will hardfault.
Instead, assume that it's safe to write if interrupts are disabled.

Signed-off-by: Sean Cross <sean@xobs.io>
@xobs
Copy link
Collaborator Author

xobs commented May 21, 2020

I've updated the PR to use tud_task_event_ready() instead of running all background tasks. It does this because we don't have a SHOULD_RUN_BACKGROUND_TASKS routine yet.

Note that it keeps the bulk of the previous patches, because it uses common_hal_mcu_disable_interrupts() and common_hal_mcu_enable_interrupts() which don't handle recursion properly when the bluetooth radio is enabled. However, this version of the patch simply checks the tinyusb queue rather than running all background tasks.

Copy link
Member

@hathach hathach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR address all issues around entering low power mode with WFI as implemented by other platforms (freeRTOS tickless, zephyrs cpu idle). Should be ready to merge, great work @xobs

Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me too! Thanks @xobs and @hathach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants