Electrons flashing blue. #687

Closed
andyw-lala opened this issue Oct 12, 2015 · 34 comments

Comments

@andyw-lala
Contributor

commented Oct 12, 2015

Flashing blue is supposed to mean no SIM card, but when running firmware 0.0.2 and Tinker with the Taoglas patch antenna, both 2G and 3G units appear to enter this mode. Note that devices running identical code but connected to external antennas do not exhibit this problem.

One or more resets (no power cycle, no reseat of SIM card) brings the device back to life.

Will try and capture serial logs to enable analysis.

@brycekahle

Contributor

commented Oct 12, 2015

My guess is that the device experienced enough connection issues that it fell back to listening/setup mode.

@andyw-lala

Contributor Author

commented Oct 12, 2015

Ah, could be. What is that algorithm?

@andyw-lala

Contributor Author

commented Oct 12, 2015

On reflection, I struggle to think of a use case where this behaviour is desirable for any product in the Particle portfolio (e.g. was happily connected, became disconnected, stopped trying to reconnect after < insert conditions here > until manual intervention).
Unless I'm misunderstanding something (quite often the case).

@brycekahle

Contributor

commented Oct 12, 2015

I'm not sure of the algorithm, @m-mcgowan would know.

@m-mcgowan

Contributor

commented Oct 12, 2015

To the best of my knowledge, the device only enters listening mode when it tries to connect and there are no credentials. For the Electron, no credentials means no SIM card, which could also mean a communications failure with the u-blox module?

@andyw-lala

Contributor Author

commented Oct 12, 2015

OK - so it's not expected behaviour after < n > attempts or anything. That is good, I think.

I'll concentrate on capturing serial monitor output and see if I can correlate with the failure and then post here for analysis.

@andyw-lala

Contributor Author

commented Oct 13, 2015

As I asked on the forums - do I need to take an extra step to enable serial monitor output with firmware 0.0.2 ? I see no output when connecting to the USB serial device with minicom.

@technobly

Member

commented Oct 13, 2015

The firmware needs to be compiled with DEBUG_BUILD=y and should have the following in the app like Tinker for Electron does:

// ALL_LEVEL, TRACE_LEVEL, DEBUG_LEVEL, WARN_LEVEL, ERROR_LEVEL, PANIC_LEVEL, NO_LOG_LEVEL
SerialDebugOutput debugOutput(9600, ALL_LEVEL);

If 0.0.2 was not compiled this way, debugging will not appear to work. To get around that, you can check out the tag and use the following to compile and program:

List of tags for Electron:
electron-v0.0.1-rc.3
electron-v0.0.1-rc.4
electron-v0.0.1-rc.5
electron-v0.0.2
electron-v0.0.2.1

git checkout xxxx (replace xxxx with a tag from the list above)
firmware/modules $ make clean all -s PLATFORM_ID=10 APP=tinker_electron DEBUG_BUILD=y program-dfu

@kennethlimcp

Contributor

commented Oct 13, 2015

@technobly, you guys might want to be clear and distribute a pre-compiled Tinker binary that people should load for testing. That gives a controlled environment for the beta, rather than us having to compile and tweak things.

@technobly

Member

commented Oct 13, 2015

That was the idea :) It appears that with multiple people creating firmware releases we may have missed the DEBUG_BUILD=y on one of them.

@andyw-lala

Contributor Author

commented Oct 13, 2015

For this beta, I request a 0.0.2b binary be built and supplied that has the requisite debug flags and tinker as an app.

I can build my own stuff too, but that introduces a whole different raft of variables into the equation.

Andy

@technobly

Member

commented Oct 13, 2015

@andyw-lala please check the updated v0.0.2-rc.2 Electron Beta thread here: http://community.particle.io/t/in-test-user-loop-firmware-fix/16404

@m-mcgowan

Contributor

commented Jan 18, 2016

We are still seeing the blinking blue issue I believe, so just bumping this for continued investigation.

@technobly

Member

commented Mar 17, 2016

There were some issues related to this that were resolved... I think what might be outstanding is if the Electron tries to connect to the tower and cannot for 5 minutes, it will end up in Listening Mode instead of immediately retrying. This is a behavior requirement, but I will look into how often it is acceptable to retry. Listening mode helps to signal to the user that it has tried a good long time (5 mins), and maybe there is something wrong... like:
  • Antenna not connected
  • Battery not plugged in
  • No signal
  • Bad SIM card connections
  • Wrong SIM card entirely
  • Wrong APN for 3rd Party SIM
  • SIM deactivated

However if the unit is remote it's not possible to signal anything to the user except that the Electron won't respond to API calls. What we can do is drop out of Listening Mode automatically after a period of time to try again, in case something like one of these got us to Listening Mode in the first place:
  • Antenna not connected temporarily
  • Battery died, and recharged
  • No signal temporarily
  • Temporarily bad SIM card connections (vibration, oxidation, moisture, etc.)
  • SIM deactivated / reactivated
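
A rough host-side sketch of the behaviour described above: try to connect for 5 minutes, fall into Listening Mode, then leave it automatically after a fixed window and try again. This is a simulation of the idea under stated assumptions, not the system firmware's implementation. The names `RetryPolicy`, `step` and `listenWindowMs` are hypothetical, and the length of the listening window is left to the caller because the thread does not fix a value.

```cpp
#include <cstdint>

// Modes the sketch distinguishes. "Listening" stands for the blinking-blue state.
enum class Mode { Connecting, Connected, Listening };

// 5 minutes: the connect window mentioned in the comment above.
constexpr std::uint32_t kConnectWindowMs = 5UL * 60UL * 1000UL;

// Hypothetical policy object; call step() periodically with the current time.
class RetryPolicy {
public:
    // listenWindowMs: how long to stay in Listening Mode before retrying
    // (an assumed, caller-chosen value).
    explicit RetryPolicy(std::uint32_t listenWindowMs)
        : listenWindowMs_(listenWindowMs), mode_(Mode::Connecting), enteredAtMs_(0) {}

    Mode step(std::uint32_t nowMs, bool linkUp) {
        if (linkUp) {
            mode_ = Mode::Connected;
            return mode_;
        }
        if (mode_ == Mode::Connected) {
            // The link just dropped: start a fresh connect window.
            mode_ = Mode::Connecting;
            enteredAtMs_ = nowMs;
        }
        // Unsigned subtraction stays correct across a millisecond-counter rollover.
        const std::uint32_t elapsed = nowMs - enteredAtMs_;
        if (mode_ == Mode::Connecting && elapsed >= kConnectWindowMs) {
            mode_ = Mode::Listening;   // gave up for now; signal the user
            enteredAtMs_ = nowMs;
        } else if (mode_ == Mode::Listening && elapsed >= listenWindowMs_) {
            mode_ = Mode::Connecting;  // leave Listening Mode and try again
            enteredAtMs_ = nowMs;
        }
        return mode_;
    }

private:
    std::uint32_t listenWindowMs_;
    Mode mode_;
    std::uint32_t enteredAtMs_;
};
```

With a 60-second listening window, a device that never gets a link would alternate between 5 minutes of connecting and 1 minute of listening, so a temporary cause (antenna, battery, signal) no longer needs a manual reset to recover.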

@m-mcgowan m-mcgowan added this to the 0.6.x milestone Apr 8, 2016

@zoltan-fedor

commented Apr 23, 2016

I have this exact issue now.
I am testing my code on an Electron that will be a remotely installed device, running code every 15 minutes and sleeping in between. Today, after some spotty runs (probably network unavailability), it entered listening mode (blinking blue), and now I am waiting and hoping that it will recover by itself. This device will be out in the field, so if it ever enters listening mode it will get "stuck" there, with no one to reset it.

As such I would very much support a functionality which would limit the amount of time it spends in listening mode and after that time it would try connecting again.

I hope this will get added in 0.6.x - looking forward to it.
Unfortunately I will have to deploy in 2 weeks, so I can only hope that this new functionality arrives by then.

@m-mcgowan

Contributor

commented Apr 23, 2016

While waiting for a fix, there is a workaround you can apply in your own code. You can subscribe to system events, in particular the setup_update event, which notifies listeners how long the device has been in setup mode. Your handler can call Cellular.listen(false) to exit listening mode after a timeout. See https://docs.particle.io/reference/firmware/photon/#system-events
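
A minimal sketch of that workaround, under a few assumptions: the 3-minute timeout and the helper name `listenTimedOut` are illustrative, not part of the firmware; the event parameter is taken to be the milliseconds spent in setup mode; and `PLATFORM_ID` is assumed to be defined only when building for a Particle device, so the timing check can also be compiled and verified on a desktop. The handler is registered with STARTUP so that it is in place before the device first tries to connect, a point raised later in this thread.

```cpp
#include <cstdint>

// Illustrative timeout (an assumption): leave listening mode after 3 minutes.
constexpr std::uint32_t kListenTimeoutMs = 3UL * 60UL * 1000UL;

// Pure helper (hypothetical name) so the timing decision can be checked off-device.
// msInSetupMode is the value delivered with the setup_update event.
constexpr bool listenTimedOut(int msInSetupMode,
                              std::uint32_t timeoutMs = kListenTimeoutMs) {
    return msInSetupMode >= 0 &&
           static_cast<std::uint32_t>(msInSetupMode) > timeoutMs;
}

#if defined(PLATFORM_ID)  // assumption: defined only in Particle firmware builds
#include "application.h"

// Called by the system while the device is in listening (setup) mode.
void setup_mode_handler(system_event_t event, int ms_passed) {
    if (listenTimedOut(ms_passed)) {
        Cellular.listen(false);  // exit listening mode so the device can retry
    }
}

// Register before setup() runs, so the handler exists even if the
// device falls into listening mode during its first connection attempt.
STARTUP(System.on(setup_update, setup_mode_handler));

void setup() {}
void loop() {}
#endif
```

Keeping the threshold check in a separate function means the timeout arithmetic can be tested without hardware, while the device-specific part stays a thin wrapper around the two calls named above.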

@zoltan-fedor

commented Apr 23, 2016

Thanks for the tip.
I will have to read up on how to create listeners, but that sounds like a good temporary fix.

@zoltan-fedor

commented Apr 23, 2016

So I read about the system events, and it seems the following code registers a handler for the listening mode event. If the device has been in listening mode for more than 3 minutes, it resets (reboots) the device, which should prevent it from being "stuck" in listening mode for all eternity.

I have tested it by manually placing the device in listening mode and hoping the device would reset (reboot) after 3 mins in that mode, but I can't place the device into listening mode anymore - at least not longer than about 2-4 seconds.

Please let me know if you see any issues with this snippet.

void setup_mode_handler(system_event_t event, int ms_passed) {
    // to prevent being stuck in setup mode, after 3 mins in setup mode, let's restart
    if (ms_passed > 3*60000UL) {
        System.reset();
    }
}

void setup() {
    // register the setup mode handler - to handle when we get stuck in setup mode
    System.on(setup_update, setup_mode_handler);
}
@zoltan-fedor

commented May 3, 2016

Unfortunately even after having the above added to my code, I still get stuck in listening mode (blue flashing) after a period of bad network connectivity.

Any idea what else to do?

void setup_mode_handler(system_event_t event, int ms_passed) {
    // to prevent being stuck in setup mode, after 3 mins in setup mode, let's restart
    if (ms_passed > 3*60000UL) {
        System.reset();
    }
}

void setup() {
    Serial.begin(38400);
    pinMode(battery_relay_pin, OUTPUT);

    // register the setup mode handler - to handle when we get stuck in setup mode
    System.on(setup_update, setup_mode_handler);
}

void loop() {
    if (msg_id <= msg_num_restart) {
        Serial.println("Going to normal sleep (does not turn off the network).");
        System.sleep(D1, RISING, sleepInterval*60, SLEEP_NETWORK_STANDBY); // Does not turn off network
    }
    else {
        Serial.println("Going to deep sleep (turns off the network, restarting the process from 'setup()').");
        System.sleep(SLEEP_MODE_DEEP, sleepInterval*60);
    }
}

@towynlin

Member

commented May 5, 2016

I agree with @andyw-lala that this can be a significant problem in a production device. Here's my suggested behavior:

  • Keep the current "try for 5 minutes and drop into listening mode" behavior.
  • Once in listening mode try to connect again after a minute
  • On failure, drop back into listening mode
  • Use an exponential backoff — next time stay in listening mode for 2 minutes then try again
  • Next time 4 minutes listening and try again
  • Max out at 32 minutes in listening mode before retry, and if feasible make this value configurable by the user.
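
The schedule proposed in the list above can be sketched as a small helper. It only illustrates the doubling-with-cap arithmetic and is not firmware code; the function name `listenMinutes` and its parameters are hypothetical, and the cap is passed as an argument to reflect the suggestion that it be user-configurable.

```cpp
#include <cstdint>

// Default cap taken from the proposal above: 32 minutes.
constexpr std::uint32_t kDefaultMaxListenMinutes = 32;

// Returns how many minutes to stay in listening mode before the next
// connection attempt. retryIndex is 0 the first time the device drops
// into listening mode, 1 the second time, and so on.
// maxMinutes is assumed to be at least 1.
inline std::uint32_t listenMinutes(std::uint32_t retryIndex,
                                   std::uint32_t maxMinutes = kDefaultMaxListenMinutes) {
    std::uint32_t minutes = 1;
    for (std::uint32_t i = 0; i < retryIndex; ++i) {
        if (minutes > maxMinutes / 2) {
            return maxMinutes;  // doubling again would pass the cap
        }
        minutes *= 2;
    }
    return minutes;
}
```

With the default cap this gives 1, 2, 4, 8, 16 and 32 minutes for the first six visits to listening mode, and 32 minutes for every visit after that.
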
@zoltan-fedor

commented May 5, 2016

Hi @towynlin ,

Thanks for the suggestion.
I was trying something similar, but only got to point #2 ("Once in listening mode try to connect again after a minute") and it got stuck there. At least with my above code I couldn't get the device to try to connect again after being stuck in listening mode for 3 minutes.
Any idea why the above code (see my previous comment) would get the device stuck in listening mode forever? I was observing it being stuck in listening mode for a few hours before I gave up and manually pressed the reset button.

As far as I believe my above code should take the device out of listening mode and restart it (System.reset()), but that wasn't happening.

So really the question is how to achieve what you described at point #2 ("Once in listening mode try to connect again after a minute")?

@towynlin

Member

commented May 5, 2016

The firmware team is better equipped to answer your question @zoltan-fedor. I'll have to leave this to them. @technobly I see you're assigned here — when you next work on this issue, please try Zoltan's code and either suggest a fix or file a separate issue if necessary.

@KeighDub

commented May 5, 2016

Do we need to have the system thread enabled for a handler to trigger on setup_update? I've never actually checked to see if my code was still running while one of our units was in the blue blink of death state. I've fooled with the system thread a bit, but hadn't intended to activate it for our production code.

@m-mcgowan

Contributor

commented May 5, 2016

Hi @KeighDub - you don't need system threading to get the setup_update event - it's also published in single-threaded mode. (The same is true for all system events.)

@m-mcgowan

Contributor

commented May 5, 2016

If you're using automatic mode, then it's best to register the system event handler on startup, rather than in setup() since the device will first try to get online before running setup() and adding your handler.

You can register it during startup like this:

// register the setup mode handler - to handle when we get stuck in setup mode
STARTUP(System.on(setup_update, setup_mode_handler));

@zoltan-fedor

commented May 5, 2016

Thanks @m-mcgowan !!!

I will try to register the system event handler on startup as suggested.

Unfortunately I can't promise a quick turnaround on the results of testing, because (luckily) getting stuck in listening mode doesn't happen often and I have no way of triggering it manually to test the handling of it.

@KeighDub

commented May 5, 2016

I've got a build with the fix in now. I'm going to push it to one of my lab devices and pop off the antenna. Hopefully that will let me observe it failing back to update and trying to recover.

@KeighDub

commented May 5, 2016

I haven't yet been able to reproduce a device spontaneously dropping into update mode with no antenna, but it does recover fine when I manually push it to update. One note, just calling Cellular.listen(false) in the handler function worked once, but the second time I tried to kick the device to update mode, it just locked up. I added a System.reset() to the call and I can kick it into update as often as I want and it bounces back like a champ.

@zoltan-fedor

commented May 5, 2016

@KeighDub thanks for the testing.
In my handler function (setup_mode_handler) I was using System.reset(), so I will stick with that, since your testing suggests it is a reliable way to recover from listening mode:

void setup_mode_handler(system_event_t event, int ms_passed) {
    // to prevent being stuck in setup mode, after 3 mins in setup mode, let's restart
    if (ms_passed > 3*60000UL) {
        System.reset();
    }
}

// register the setup mode handler - to handle when we get stuck in setup mode
STARTUP(System.on(setup_update, setup_mode_handler));

void setup() {
   ...
}

void loop() {
   ...
}
@ScruffR

Contributor

commented Jun 10, 2016

Just to add to the problem: while AndyW could revive his Electron via reset, this member and I could not:
https://community.particle.io/t/electron-entering-listening-mode-flashing-blue-by-error/20711/19

BTW, if this is one of the reasons to enter this mode (as said by @technobly further up)

Battery not plugged in
...
Battery died, and recharged

Wouldn't going to sleep (maybe conditionally, only for that reason) be a better choice than draining the battery even further?

@vidiot1969

commented Jul 6, 2016

Hate to revive this thread (again), but this issue does affect me as well (Electron 2G 0.5.1, Particle SIM).

I've noticed it when the battery has drained, then the device is connected to USB and recharged. No other actions are taken. It flashes blue forever, although not in 100% of the cases. Hitting Reset has solved the issue. That won't always be feasible.

Is there something I could add in setup() to prevent this? It seemed like someone got close above, but I'm not sure whether this issue is resolved.

Thanks.

@KeighDub

commented Jul 6, 2016

I've added the above fix into all of my builds and haven't seen any device locked-up since. Have you tried the above solution and had it not work? If so, details?

@Osmosis311

commented Jul 28, 2016

I'm also experiencing this issue from time to time on a few remote devices that can't be easily reset.

What I'm very confused about regarding the code solutions listed above is that from my understanding, unless you enter multi-threading mode (SYSTEM_THREAD) no code will be executed while the device is in listening mode.

Is that not the case? If not, I'm comfortable with code to keep trying to re-connect and reset the device after certain intervals.

Thanks!

@ScruffR

Contributor

commented Jul 29, 2016

Since System.on() hooks up a callback function to a system event, that function will be called by the system as "extended code path" for system tasks.

"no code will be executed while the device is in listening mode."

This is obviously not applicable to "system code" (including callbacks), otherwise Listening Mode would be a dead mode too ;-)

@technobly technobly modified the milestones: 0.7.x, 0.6.x Sep 22, 2016

@technobly technobly modified the milestones: 0.7.x, 0.6.1 Nov 29, 2016

@technobly technobly removed their assignment Nov 29, 2016

@technobly technobly closed this Nov 29, 2016
