New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
machine_bitstream: Implement for rp2 and unify all variants. #7805
Conversation
e2f1fa5
to
7fb087f
Compare
It would be nice to combine all these bitstream implementations to reduce code duplication. But maybe do that as a refactoring exercise once they are all implemented? |
Thanks @robert-hh ! On rp2 should we could consider freezing the neopixel.py module?
Not sure I follow -- there's examples/rp2/pio_ws2812.py that does this. Or is the issue that the timing needs to be hardcoded, so a PIO implementation cannot handle arbitrary bit widths? |
@dpgeorge: These implementations are short and very hardware dependent. For instance, esp32 and mimxrt use a 32bit increasing counter, rp2 a 24 bit decreasing one; and all mechanisms are different. Combining these would not reduce the variation. So either you would move the hardware-related code into other files (like mphalport.*), or you would have #ifdef sections with all variants in the same file, which does not improve readability. |
4cd2be3
to
2c36d89
Compare
The most recent attempt generalizes the hardware related calls, but one problem exists, of how to tell the compiler/linker to move that code into RAM. When executed from flash, the timing is not good. The first bit(s) timing is wrong. |
Try looking at the |
So that works fine for the ARM based devices. Where is it defined for the ESP32? |
I'm not sure. But, I'm happy for this PR to be merged as-is, ie without generalising it. We can refactor later on if it makes sense/is possible. |
I have a version now which is identical for mimxrt and rp2. That includes changes to the mimxrt files and the rp2 files, both c and .ld files. I can push that, or keep it aside. |
IMO a good addition, especially when looking at the quickref doc: I believe the user expects to have the neopixel module present when flashing the device... |
The not-yet-pushed esp32 variant is now also identical to rp2 and mimxrt, besides the instruction for the linker to move the code to RAM. I found information and examples how to do that, but I expect a little bit of exercise until it works. Then the function could be moved to a common place and the code in the respective mphalport.h files can be cleaned if required. What name should it have, if moved to extmod? machine_bitstream_high_low.c? |
The linking challenge for ESP32 is solved. Now machine_bitstream is identical for ESP32, rp2 and mimxrt. |
f0b57a7
to
933cc85
Compare
With the most recent commit, all of stm32, esp32, esp8266, rp2 and mimxrt use the same implementation of machine_bitstream.c, located in drivers/bitstream. Except for the esp8266, the timing is pretty good. The ESP8266 is somewhat asymmetric, possibly requiring separate compensation for the high and low pulse. |
22a6ccd
to
2efe69d
Compare
Thanks for the effort to unify the variants. Please note that @jimmo spent a lot of time tuning the stm32 code, in particular the Cortex-M0 version with hand-tuned assembly. Both versions for stm32 should remain unchanged after any code factoring (ie the generated machine code should be the same before and after refactoring). Also, I think the common source code should be |
I have seen this STM32 code, a very pretty one with precise timing. It is removed for now, but I can put that back for STM32. STM32 would then use it's specific code. That would allow me to roll back a change that was made for STM32 only. |
I looked again into @jimmo's code. The non-Cortex M0 version is identical to the one I have as common version. Only instead of direct register access the common code has function calls, which in turn are defined as inline ode #defines in the port specific mphalport.h, turning into direct register access again. Wouldn't the cortex M0 also also apply to the RP2? I have to check. For RP2, I use at the moment the systick counter. |
ee3688e
to
a7a94b6
Compare
So this includes now the assembler version of @jimmo for STM32 Cortex-M0. Basically the common code is still the STM32 driver with stubs for Macros from mphalport.h instead of direct calls. Obviously the difference between the versions did not disappear. It just moved from the driver to mphalport.h. The donwside of unification is, that options for port-specific timing optimization are reduced. |
All HW related calls to mphalport are placed in mphalport.h Doing this the respective mp_hal_tick_cpu_xxx() functions or defines cover the existing ports. These are: mp_uint_t mp_hal_get_cpu_freq(void) Return the CPU frequency; also used by machine.freq() mp_uint_t mp_hal_ticks_cpu(void) Return fast ticks as increasing with time values. void mp_hal_ticks_cpu_init(void) Initialize the ticks_cpu counter. Required so far by the mimxrt and rp2 port. May be empty. mp_uint_t mp_hal_ticks_cpu_reset(void) Resets the ticks_cpu counter and returns a start value. If the counter cannot be reset, it must return the value of mp_hal_ticks_cpu() (be identical to ..) void mp_hal_pin_output(mp_hal_pin_obj_t pin) Sets pin to output mode. void mp_hal_pin_low(mp_hal_pin_obj_t pin) Set the pin value to 0. void mp_hal_pin_high(mp_hal_pin_obj_t pin) Set the pin value to 1. MP_HAL_BITSTREAM_NS_OVERHEAD A define for the timing correction. For esp32 and mimxrt this value was 6, for rp2 it is 9. Some of these may already exist in various ports. They can obviously either be implemented as inline fuctions of #defines, where possible.
These three are now identical and use linker methods for moving machine_bitstream to IRAM. All hardware depedent code is in the respective mphalport.h and the linker scripts.
- Adapt the Make files for esp32, rp2 and mimxrt and remove the local machine_bitstream.c copies. - Rename mp_hal_ticks_cpu_init to mp_hal_ticks_cpu_enable to match the existing code of esp8266 and stm32. - Rename mp_hal_ticks_cpu_reset to mp_hal_ticks_cpu_start. It sounds like a more appropriate name.
The timing error is higher than for the other ports, especially at 80 MHz. The timing is also worse than with the specific driver, but good enough for neopixel and the like.
The STM32 mp_hal_ticks_cpu has the unfavourable habit that it checks on each call whether cpu_ticks is enabled. That make the timing somewhat instable. Therefore the calls were renamed from mp_hal_ticks_cpu into mp_hal_ticks_bitstream and mp_hal_ticks_cpu_start into mp_hal_ticks_bitstream_start. In all ports, the respective defines are added to mphalport.h. With that change, the STM32 implementation works pretty precise. For the STM32 M0 MCU, the assembler version is used. Timing varies quite a bit with changing CPU clock frequencies. Testing done with: PYBV11 at 168MHz PBYD_SF6 at 196MHz ESP8266 at both 80 and 160MHz ESP32 at 240MHz RP2 Pico at 125, 180 and 250MHz MIMXRT1050 at 600MHz MIMXRT1020 at 500MHz Testing was done with symmetrical 1000ns cycles. The worst figure is for ESP8266 at 80Mhz, being off at about +62ns for a low phase, -38ns for the high phase, and a total error of 20ns for the full 2000ns cycle.
a7a94b6
to
bfb2f52
Compare
I close that one for now. Reasons:
I will roll it back to the rp2-only implementation and make a PR for that later. |
Ok, sounds good. |
…-rate mimxrt: Fix output frequency for samples that don't divide 192kHz
Like the variant for esp32 and mimxrt it is implemented by active polling the systick counter. The rp2 systick counter seemed to be unused before. Since it is a 24 bit counter, the longest bit pulse pair must not exceed ~67ms at 125MHz or ~34ms at 250MHz. For speed, the code uses direct register access. The timing is precise within +/- 20ns at 125 Mhz clock.
I did not find a way to implement it efficiently using the PIO. The PIO has not sufficient scratch memory registers to store the four bit times.