Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sometimes on .show() the lights are all set wrong? #12

Open
mpeddicord opened this issue Apr 17, 2023 · 31 comments
Open

Sometimes on .show() the lights are all set wrong? #12

mpeddicord opened this issue Apr 17, 2023 · 31 comments
Assignees
Labels
bug Something isn't working

Comments

@mpeddicord
Copy link

This is a great project. I'm using it as the light controller for LEDs around my bed.
One thing I've seen though that I haven't been able to solve is sometime when I update the lights. They seem to get the wrong data. Usually it's like the data is offset. Often it manifests as the entire strand goes to full white for just the one frame.

When I use the examples I don't see the problem. It only seems to be when I'm doing a lot of manual setting of pixels using slices. One reason why it hasn't been such a big deal is because I'm using Home Assistant to send MQTT commands to the Pico, and that's usually just one update at a time. But it's become clear that the problem happens like 1 in 200 updates, roughly. So, when I'm updating manually I almost never see it. But when I'm trying to use a custom animation with a fair amount of complexity, I see lots of wrong updates.

I suppose it's possible it may not be a software thing. But I'd like to know if you've seen something like this before. (Also, everything is sharing a ground correctly)

Sorry if this isn't the right place for this.

@blaz-r blaz-r self-assigned this Apr 17, 2023
@blaz-r blaz-r added the bug Something isn't working label Apr 17, 2023
@blaz-r
Copy link
Owner

blaz-r commented Apr 17, 2023

Hi,
I haven't encountered this issue before, but it is possible it is due to software if updates are done too rapidly, though I'm not sure.
Could you please send the animation implementation so I can test it to see if I encounter similar thing.

@Davis8483
Copy link

Davis8483 commented Jun 19, 2023

I'm having a similar issue. In my setup I have a pi pico w connected to my computer over serial. The part of the pico's code that handles the commands is in a separate thread from the one that updates the leds.

LED Bug Video

There are two led strips side by side. The first one is set to display one pixel of blue and the rest red, although it has random flashes of blue. This bug only happens when its connected over serial.

@mpeddicord
Copy link
Author

That's actually how mine also usually manifests with a shorter LED count. Longer stands got worse for me. I kept thinking it was a power supply or grounding issue. Because you definitely do see weird flickering like that if you're not sharing a ground between the chip and the LED strip. (I found this out from firsthand experience and subsequently burnt out one of the data pins of my first PICO W before realizing all the things I was doing wrong)

Unfortunately, I didn't ever resolve this though, I put the project on hold instead.

I just wanted to chime in with additional similarities to you. I too was running my main code and the light updates on a separate thread. Though, I think I remember ruling that out. I almost had a small sample project to show the weirdness, but I had a baby and now there's no time.

Good luck! I'd still love to get to the bottom of this one. Whether it's software, the hardware itself, or some weird interaction between the two.

Last little data point, I'm using these LEDs exactly:
https://a.co/d/7uFL16u

@blaz-r
Copy link
Owner

blaz-r commented Jun 19, 2023

Thank you both for this. As I said in previous comment, it'd be great if you could give a minimal code that can reproduce this issue. Another question I have is if you are both using Pico W or Pico?

I think this can be software, hardware or even mix of both issue. It seems like the timing might be a bit off every once in a while. This might mean that the PIO wait intervals are incorrect, or PIO frequency or even something else.

Neopixles do have some tolerance for timing, about +- 25%, that's why I don't think that the problem comes from timing, because then it'd be more frequent. Could be something with PIO buffer.

It could be problem with "latching" at the end, where strip needs some time for colors to stick. But if you kept default delay of 0.0001 seconds, which equals 100 microseconds, it should be enough as strip should require 50 microseconds. It could as well be something about how this is achieved with time.sleep and threading that you are using.

With this many possibilities, I can only help if you can send me the code where this problem appears.

@Davis8483
Copy link

I was finnaly able to reproduce it with simplified code. Btw, I'm using a pi pico w.

Upload this to the pico as main.py with the neopixel library in a folder titled pi_pico_neopixel

from pi_pico_neopixel.neopixel import Neopixel
import _thread
import sys
import time

strip_length = 8
strip_pin = 14
strip_mode = "GRB"

led_strip = Neopixel(strip_length, 0, pin=strip_pin, mode=strip_mode)

def serial_communication_thread():
    while True:
        serial = sys.stdin.readline()

        sys.stdout.write(serial)

# start the serial communication handling thread
_thread.start_new_thread(serial_communication_thread, ())

while True:
    time.sleep(0.05)

    # set all pixels red
    led_strip.fill((255, 0, 0))

    # set last pixel blue
    led_strip.set_pixel((strip_length - 1), (0, 0, 255))

    led_strip.show()

Then run this on your computer.

import subprocess
import time

try:
    import serial

except:
    subprocess.run(["pip", "install", "pyserial"])
    import serial

port = "COM4"
baudrate = 9600

ser = serial.Serial(port=port, baudrate=baudrate, timeout=1)
time.sleep(1)

print("starting data tansfer")

while True:
    # send alot of data
    for index in range(50):
        ser.write("meaningless data".encode())
    
    # delay
    time.sleep(1)

@blaz-r
Copy link
Owner

blaz-r commented Jun 21, 2023

Thank you. I'll look into it when I get some time. I suspect it has something to do with threading and timing.

@Davis8483
Copy link

Any update?

@blaz-r
Copy link
Owner

blaz-r commented Jun 30, 2023

I'm really busy right now and don't have access to my pico. I'll try to test this as soon as possible, but might still take a few days.

@Davis8483
Copy link

Ok, thanks!

@blaz-r
Copy link
Owner

blaz-r commented Jul 2, 2023

Hello.
I managed to reproduce your problem on my pico.
I'm really not sure how this can be solved, but it seems that PIO doesn't work that well when serial is actually working is thread.

I rewrote the entire PIO assembler code for leds, to really get the timing down to 10ns precision, but it still happens.
I even changed the "latching delay" from 100 micro seconds (required should be 50) to 500 and it still occurs.

Tried to add flush to serial, or call gc every here and then and still no success.

At this point I don't know what else to try or what could cause the problems, but it seems that serial reading and writing somehow messes with PIO, or the code that loads data to PIO buffer.

@blaz-r
Copy link
Owner

blaz-r commented Jul 2, 2023

I think I might have solved it, but I'm unsure.
I checked the docs and I see that state machine put can accept entire array. So I replaced the loop with just one call inside show(), and the flickering is gone for me on ws2812b.

Can you try replacing show with the following and tell me if it solves the issue:

    def show(self):
        """
        Send data to led-strip, making all changes on leds have an effect.
        This method should be used after every method that changes the state of leds or after a chain of changes.
        :return: None
        """
        # If mode is RGB, we cut 8 bits of, otherwise we keep all 32
        cut = 8
        if self.W_in_mode:
            cut = 0

        self.sm.put(self.pixels, cut)
        
        time.sleep(self.delay)```

@Davis8483
Copy link

Davis8483 commented Jul 3, 2023

It still seems to have the random glitching. I might just rewrite my program to exchange less data.
Edit: I'm also using ws2812b leds

@blaz-r
Copy link
Owner

blaz-r commented Jul 3, 2023

I'm sorry to hear that. I'm really not sure what to do at this point. I will update the code to remove the for loop nonetheless as this seems a better approach.

As for this issue, I'm really not sure how to solve. Even when debugging I had a lot of problems with serial. I've also seen a lot of people having issues with threading online, so it might be a bigger issue.

If you by any chance find a solution, please share it here.

@blaz-r
Copy link
Owner

blaz-r commented Jul 3, 2023

One thing that does come to mind is that data is for some reason incorrectly sent to PIO or PIO doesn't set the right output for the first few pixels.

This could be due to threading implementation, if it uses something that allocates an equal share of time. If the instruction to put pixels is interrupted and then after the interruption the rest of data is sent. This would explain why the last different color pixel is sometimes sent to the index smaller than needed.
It would also explain why changing for loop to single instruction solved the issue on my side.

Maybe it would be worth running the serial entirely on the other core, so the main thread running the leds is never interrupted. Or somehow figure out how to prevent this interruption in that exact time.

@blaz-r
Copy link
Owner

blaz-r commented Jul 3, 2023

Since I can no longer reproduce this bug with your code, yet you say that it still happens, make sure that you did actually replace the show function with the one above and that it is loaded to pico. If you did, and it still doesn't work, you can maybe also try restarting the PIO before sending:

    def show(self):
        """
        Send data to led-strip, making all changes on leds have an effect.
        This method should be used after every method that changes the state of leds or after a chain of changes.
        :return: None
        """
        # If mode is RGB, we cut 8 bits of, otherwise we keep all 32
        cut = 8
        if self.W_in_mode:
            cut = 0

        self.sm.restart()
        self.sm.put(self.pixels, cut)
        
        time.sleep(self.delay)

I'm not sure it will make any difference because if the transfer gets interrupted when threading does context switch, then the only way to solve it would be to restart the entire transfer when thread gets the control back. (And I've read from some old posts that pico doesn't even have preemptive context switching so I'm really not sure at this point.)

Again, maybe you can avoid that by having serial on the other core.

@blaz-r
Copy link
Owner

blaz-r commented Jul 3, 2023

Another possibility would be to use DMA, which could potentially avoid the entire problem of interrupted data transfer. There was some work on that by @benevpi in his fork if you want to check it out.

@benevpi
Copy link

benevpi commented Jul 3, 2023

If this is a timing issue, DMA should solve it, but it's a bit hack-y to get it running in MicroPython. TBH, I'm a bit surprised that this is happening when the two bits of code are running on different cores like they should be here (at least, I think that's what _thread does on rp2040, it's been a while since I've used it). I don't have a Pico to hand at the moment, but I wonder if just switching the cores around would solve it? i.e. running the serial communication in the main thread and starting a new thread to run the NeoPixels.

@blaz-r
Copy link
Owner

blaz-r commented Jul 3, 2023

I don't think it's a timing issue, I believe something goes wrong when bits are sent to the PIO. At least replacing the for-loop with a single sm.put instruction where the array is passed solved it for me, but not for Davis, so I really think that something goes wrong when data is transferred to TX FIFO.

@mcarlson
Copy link

mcarlson commented Feb 2, 2024

First off, thank you for the awesome library! It's by far the best I've found for the pico!!!

I've also been struggling with intermittent, random glitchiness. In my application, I use Timers to update the strip. I really noticed the 'tearing' happening just drawing. At first, I figured it was due to me including negative numbers when using slices (very cool feature BTW) so I rewrote my code to draw a pixel at a time, mostly to understand what was happening.

I'm still seeing glitching and tearing - but it seems to be related to using a Timer(), which uses an interrupt. So, I eliminated that and all print()/serial calls and it's better, but still not fixed. When I use a Timer() to animate one pixel at a time, it's definitely worse. One sequence has black and draws a dark red pixel one at a time. When there are more black pixels than red, I sometimes see previous red pixels glitch out to black.

Adding 'self.sm.restart()' per above doesn't seem to help, but adding a single sm.put instruction where the array is passed did.

Also, Adafruit says says we need 300 microseconds delay so I changed the default to reflect that, which does seem to help a bit. I tried using a second array just for writing out values, that didn't help.

Definitely seems timing/interrupt related! I know neopixels are super timing-dependent for what it's worth! I'm also using this for lighting to any glitchiness is super annoying.

@blaz-r
Copy link
Owner

blaz-r commented Feb 2, 2024

Hi,
nice to hear that the library is coming in useful 😄.
Could you please provided the code for reproducing this so I can try fixing it?

These glitches are a problem, but I hardly find any time to check those and when I was testing the issue went away while using a single put instruction. This "single put" upgrade is for now inside a separate branch https://github.com/blaz-r/pi_pico_neopixel/tree/improve_show. It'll probably be merged into main.

Additionally, I wrote the code here: #17 which is a new state machine program, with much more precise timing, so you can try that as well, and if you provide the code I'll try this too to hopefully get this solved.

@mcarlson
Copy link

mcarlson commented Feb 2, 2024

Thanks for the quick reply!

I definitely applied the same fix in https://github.com/blaz-r/pi_pico_neopixel/tree/improve_show, that helped a lot and I nominate it for a merge to main. I also applied #18 per Adafruit's advice.

Here's a testcase I whittled down that shows the issue. It turns out it didn't matter if I use a Timer() or not.

Note that it requires ulab/numpy, you can grab a micropython build with it included from https://github.com/v923z/micropython-builder. For what it's worth, I'm using this one, but it shouldn't matter: https://github.com/v923z/micropython-builder/releases/tag/latest

I added a few array-like methods to neopixel and applied the changes in https://github.com/blaz-r/pi_pico_neopixel/tree/improve_show as well as the PRs I've submitted: #18 and #19.

Here's the program that shows the issue:

from machine import Timer
from neopixel import Neopixel

strip = Neopixel(144, 0, 28, "GRBW")

import ulab
import random
np = ulab.numpy
leds_np = np.array(strip, dtype=np.int16)  # numpy working copy of LED data
fade_by = np.array( (-3,-3,-3,-3), dtype=np.int16 )  # amount to fade by
def fireAnim():
    global leds_np
    # pick a new random set of LEDs to light up with fire
    c = 0xff660011
    c = (c>>24 & 0xff, c>>16 & 0xff, c>>8 & 0xff, c & 0xff)  # turn into tuple
    for i in range(8):
        leds_np[ random.randint(0,len(strip)-1) ] = c

    # fade down all LEDs, using numpy,  takes 4 msec for 256 LEDs on RP2040
    leds_np += fade_by         # fade down the working numpy array
    leds_np = np.clip(leds_np, 0, 255)  # constrain everyting to 0-255
    strip[:] = leds_np.tolist()  # copy working array to leds

    # update the strip
    strip.show()


# animation_timer = Timer(period=1, callback=fireAnim)

while True:
    fireAnim()

Every so often, it seems the pixels all shift by several positions, then snap back.

I'm using one of these 144 pixel RGBW strips: https://thepihut.com/products/flexible-rgbw-led-strip-neopixel-ws2812-sk6812-compatible-144-led-metre?variant=41013613887683 on a pico w.

I tried the changes you suggested at #17 but it didn't improve the issue. It also had a noticable green cast, and the first pixel was stuck on green, I'm not sure why. The jumping behavior i described still happens.

I hope this testcase helps, it makes it super obvious to me that things are being transposed - which wasn't clear before.

@mcarlson
Copy link

mcarlson commented Feb 4, 2024

Hello! I did some poking around and came across https://github.com/awawa-dev/HyperSerialPico/blob/master/pio/neopixel.pio which had some alternate timing. I tried it, doesn't seem to help a lot.

https://learn.adafruit.com/intro-to-rp2040-pio-with-circuitpython/advanced-using-pio-to-drive-neopixels-in-the-background suggests a 'trailer' is needed after each send, which is perhaps missing, e.g. trailer = b"\0" * padding_count + struct.pack(">L", 3840)

Just some more thoughts!

@blaz-r
Copy link
Owner

blaz-r commented Feb 4, 2024

Thanks. Yeah it's hard to say really what is the problem.
If I remember correctly my main suspicion was sending of data to PIO, which would explain why having a single put instruction did help, but not completely. Maybe a DMA transfer would be needed or the transfer should be done in some other way.

About the second link, it looks like it is using some "background_write", which is a CircuitPython thing, and it seems like it is using DMA.
I'm not sure about the trailer though, it seems to be sending some zeros ended by 111100000000. From what I know, the only requirement is that there is a certain period at the end where line stays 0.

Looking at the code I also noticed that they have this delay coded into PIO, where in this library it is done with time.sleep:

cut = 8
if self.W_in_mode:
cut = 0
sm_put = self.sm.put
for pixval in self.pixels:
sm_put(pixval, cut)
time.sleep(self.delay)

This sleep might actually be problematic in parallel environment (maybe it's even problematic already), additionally we don't really poll the PIO if transfer is done, we rely on the delay. This leads me to believe that maybe we should also handle the delay inside PIO program, and then possibly poll the SM if we can transfer again.

@blaz-r
Copy link
Owner

blaz-r commented Feb 4, 2024

Oh I think I understand the thing now. Header and trailer are used to first transmit number of bits (total bits in a strip) and inside trailer there is a delay value, and it seems like it's padded so it's read as correct 32bit value inside PIO program.

@mcarlson
Copy link

mcarlson commented Feb 5, 2024

Yeah, I mostly have the PIO program I linked to running (I think I understand it also, learning a lot!) and I reached a similar conclusion. Check out #20 if you want to give it a go.

I still doesn't fix my issue :/

After reflecting on it and noodling around to (mostly) get the new PIO program working, I don't see how replacing a call to sleep() with this on a single-threaded system like mine would really matter... Unless I missed something in my port? Maybe it's a hardware issue for me. Are you able to reproduce the issue with my test case? I'll try another strip and pico just in case...

DMA sounds like a very cool option, I see some pointers at https://github.com/orgs/micropython/discussions/11081 and https://github.com/benevpi/RP2040_micropython_dma but that's a whole new rabbit hole and it's not going to solve my glitch issue either.

Perhaps polling to prevent interrupting a previous write is the next thing to explore. But wouldn't the new data just accumulate at the end of the FIFO anyway and not interrupt the write?

Thanks for playing along, this is fun! I love how the Internet brings very specific kinds of interests together :)

@benevpi
Copy link

benevpi commented Feb 5, 2024

Hi,

On the timer -- I actually wouldn't expect this to work with a timer. I'm not 100% sure how micropython handles sleep(), but at a silicon level, RP2040 won't accept a sleep inside an interrupt handler. It's possible the MP interpreter has something that makes this work though.

Are you 100% sure this is a software problem? often problems like this are hardware-related. If you're running the strip directly from the GPIO, and powering it with 5V, then you're running it out of spec. 99% of the time you'll get away with running it out of spec, but occasionally, you get issues like this. have a look here for details (scroll to the bottom) : https://learn.adafruit.com/adafruit-neopixel-uberguide/powering-neopixels. Adding a capacitor to smooth the power line can also help in some cases.

@blaz-r
Copy link
Owner

blaz-r commented Feb 5, 2024

Yeah, I mostly have the PIO program I linked to running (I think I understand it also, learning a lot!) and I reached a similar conclusion. Check out #20 if you want to give it a go.

I still doesn't fix my issue :/

That is unfortunate and I'm out of ideas here (besides the DMA which at this point probably won't solve this).

After reflecting on it and noodling around to (mostly) get the new PIO program working, I don't see how replacing a call to sleep() with this on a single-threaded system like mine would really matter... Unless I missed something in my port? Maybe it's a hardware issue for me. Are you able to reproduce the issue with my test case? I'll try another strip and pico just in case...

I didn't have the time (and sadly won't for some time as I don't have access to my pico) but I'll try to do that asap on both of my boards.

DMA sounds like a very cool option, I see some pointers at https://github.com/orgs/micropython/discussions/11081 and https://github.com/benevpi/RP2040_micropython_dma but that's a whole new rabbit hole and it's not going to solve my glitch issue either.

Yeah, might be worth pursuing, but it sure is a rabbit hole and no guarantees for the solution.

Perhaps polling to prevent interrupting a previous write is the next thing to explore. But wouldn't the new data just accumulate at the end of the FIFO anyway and not interrupt the write?

I am not sure about this one really.

Thanks for playing along, this is fun! I love how the Internet brings very specific kinds of interests together :)

Yeah, it is indeed a great thing, and with more contributors to an issue we are more likely to find a solution 😄

@blaz-r
Copy link
Owner

blaz-r commented Feb 5, 2024

Hi,

On the timer -- I actually wouldn't expect this to work with a timer. I'm not 100% sure how micropython handles sleep(), but at a silicon level, RP2040 won't accept a sleep inside an interrupt handler. It's possible the MP interpreter has something that makes this work though.

I'm not familiar with micropython enough to know this, but from my experience with other things, sleep can be problematic in many cases so that's why I suspected it. Since @mcarlson rewrote this with PIO delay in #20 and the problem doesn't go away, it's probably not that, but nevertheless this sleep should probably be avoided?

Are you 100% sure this is a software problem? often problems like this are hardware-related. If you're running the strip directly from the GPIO, and powering it with 5V, then you're running it out of spec. 99% of the time you'll get away with running it out of spec, but occasionally, you get issues like this. have a look here for details (scroll to the bottom) : https://learn.adafruit.com/adafruit-neopixel-uberguide/powering-neopixels. Adding a capacitor to smooth the power line can also help in some cases.

It might be hardware as well. I'll test this asap to see how it behaves on my end, but currently I don't have access to my pico.

Thanks for providing these valuable insights 😃

@mcarlson
Copy link

mcarlson commented Feb 5, 2024

Hi,

On the timer -- I actually wouldn't expect this to work with a timer. I'm not 100% sure how micropython handles sleep(), but at a silicon level, RP2040 won't accept a sleep inside an interrupt handler. It's possible the MP interpreter has something that makes this work though.

Iteresting! The testcase I posted avoids the use of a Timer and I still see the issue. I'm starting to think it's something hardware-related.

Are you 100% sure this is a software problem? often problems like this are hardware-related. If you're running the strip directly from the GPIO, and powering it with 5V, then you're running it out of spec. 99% of the time you'll get away with running it out of spec, but occasionally, you get issues like this. have a look here for details (scroll to the bottom) : https://learn.adafruit.com/adafruit-neopixel-uberguide/powering-neopixels. Adding a capacitor to smooth the power line can also help in some cases.

Thank you for the pointer! I'm not running directly from the GPIO and have the pico wired to a common ground/5v with a high-quality power supply per that guide. The issue also shows up when I'm powering the Picos from my computer with the power supply's 5v line disconnected. I haven't tried adding a smoothing capacitor yet.

The good news is, the issue shows up on an identical Pico W and lightstrip. I'll post any updates!

@mcarlson
Copy link

mcarlson commented Feb 6, 2024

An update:I'm still seeing the glitches even with a smoothing capacitor. I've tried across two different Pico Ws and two different light strips. Let me know if you have any other ideas!

@benevpi
Copy link

benevpi commented Feb 6, 2024

I'm wondering if this is a garbage collection issue. This is a bit of a guess, but can you try adding

import gc

to the top of the file, and:

gc.collect()

immediately before strip.show()

if it's not that, I'm out of ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants