1.2.0 OTA bricked RATGDO #151

Closed
infamous-pattern opened this issue Apr 4, 2024 · 40 comments

Comments

@infamous-pattern

Guys:

I'm not sure what happened, but this morning I installed the 1.2.0 update via the web interface and my RATGDO bricked. It went through the update process without issue and then rebooted but never came back up. I waited 15 minutes, but no luck. So I removed it, attached it to my PC via USB, and attempted to re-flash it. The first attempt failed, so I rebooted my PC and gave it another go. The second time worked and now it's functioning normally again. I just wanted to make y'all aware in case this happens to anyone else.

Thanks.

@Damien514

Same for me, bricked.
I am going to plug my laptop into it and restore...

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

Wow, sorry about that. I upgraded both of mine to the release build as soon as it was ready to make sure there were no issues. 2 people having the same issue doesn't seem like a fluke. Were you both upgrading from 1.1.0?

@Damien514

Yep, I was on 1.1.0 - Upgrade went fine and the popup displayed "Rebooting" and then... nothing!

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

If you haven't recovered it yet, when you go to connect your laptop to the device to restore it, maybe see if you can get some serial logs first.

@Damien514

I haven't recovered it yet, but will do. Any clue on how to get the logs for you?

@donavanbecker
Contributor

Same thing happened to me, but I re-flashed once (didn't erase the device) and everything was as it was before the update.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

If you use the web flash utility to recover it, that also has the ability to connect to the serial console. From there you can hit the reboot button, wait for it to boot and then download the logs. If you can't get it, no problem.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

Ugh.... 3 people now

I wonder if this is a v1.1.0 or a v1.2.0 problem.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

We must have broken OTA in v1.1.0 somehow.

@eric-ooi

eric-ooi commented Apr 4, 2024

Not bricked, but it wouldn't update from 1.1.0 to 1.2.0 when using the "Update from GitHub" option. Tried twice and it'd just reboot back to 1.1.0. Eventually, I downloaded the firmware manually and used the "Update from local file" option and it rebooted into 1.2.0 just fine.

@infamous-pattern
Author

No worries; stuff happens. Yes, I was on 1.1.0 when I ran the Update from GitHub update this morning. Mine wouldn't come back at all, so I had to do a complete erase and re-flash.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

I just flashed the official v1.1.0 and then upgraded to v1.2.0 and no issues. Again clearly there is an issue, but I gotta figure out how to reproduce.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

At least they are all coming back up after doing the USB upgrade.

@Damien514

Empty logs; I think that rebooting the device cleared them all. I tried to flash twice. Both succeeded, but it was not working. I suspected a network issue, and I think the real problem was that the WiFi password had been "forgotten". I re-entered it while connected over USB after my third flash, and now it's working as expected.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

I suspect the OTA just completely bombed and left the device unbootable. Thanks. Sorry for the trouble. Glad it's working now. I'll keep trying to reproduce the issue.

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

Mentioning @dkerr64. Maybe he has some ideas.

@dkerr64
Collaborator

dkerr64 commented Apr 4, 2024

Like @jgstroud, I have successfully updated my ratgdos without issue. I just tried again from GitHub, as previously I had used a local file.

Upgrading from GitHub or from a local file should make no difference... In both cases the upgrade goes through the browser, so where the bin file comes from is different, but the subsequent transfer from browser to ratgdo is the same for both.

One thing we don't do that perhaps we should is have a SHA-1 or CRC for the bin file, and in the browser check that the file downloaded without error. Nowadays I think transfer errors at the app layer would be extremely rare, since the underlying TCP stack should re-request any bad packets. But it is a gap.
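As a rough illustration of the kind of integrity check described above, here is a minimal Python sketch that compares a downloaded firmware file against an expected digest before it is uploaded. The function names, the default of SHA-256, and the idea that a digest is published alongside the release are all assumptions for illustration; this is not the project's actual implementation.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in chunks to keep memory use low."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def firmware_matches(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """True if the file's digest equals the expected one (hypothetical helper).

    The expected value is stripped and lower-cased so that a digest pasted
    from a release page with trailing whitespace or upper-case hex still matches.
    """
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `firmware_matches("firmware.bin", expected)` could be run on the local copy before using the "Update from local file" option, where `expected` is whatever digest the release provides (if any).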

I do not know how we could capture any logs of the update / reboot process. We are looking at how we can persistently save a log in the ratgdo, but right now the design point is to save only if a crash occurs, not continuously save to flash for fear of wearing it out (how durable is the ESP8266 flash supposed to be, is this a real concern?).
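To put a rough number on the wear question, here is a back-of-the-envelope estimate in Python. The figure of 100,000 erase cycles per sector is an assumption (a rating commonly quoted for SPI NOR flash parts), not a value taken from this project or from the datasheet of the chip on the board, and the model assumes writes are spread evenly across the sectors used for logging.

```python
def days_until_wear_out(erase_cycles: int, sectors: int, writes_per_day: float) -> float:
    """Days until every logging sector reaches its rated erase count.

    Assumes each log write costs one sector erase and that writes are
    distributed evenly over `sectors` sectors (simple wear leveling).
    """
    if writes_per_day <= 0:
        raise ValueError("writes_per_day must be positive")
    return erase_cycles * sectors / writes_per_day


# ASSUMPTION: 100,000 erase cycles per sector; check the real flash datasheet.
ASSUMED_CYCLES = 100_000

once_a_minute = days_until_wear_out(ASSUMED_CYCLES, sectors=1, writes_per_day=24 * 60)
once_a_day = days_until_wear_out(ASSUMED_CYCLES, sectors=1, writes_per_day=1)
```

Under those assumptions, rewriting a single sector once a minute would reach the rated limit in about 69 days, while writing only on a crash (say, once a day at worst) would take 100,000 days. That suggests continuous logging to one sector is a real concern, and crash-only logging is not.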

David

@jgstroud
Collaborator

jgstroud commented Apr 4, 2024

I believe this was most likely introduced in ee38a90

I reverted this commit and released a v1.2.1
Note, I don't know for sure this was the problem, but seems the most likely. Also, since the issue is in the running firmware, you might have the same problem updating to v1.2.1 :/ But hopefully not with future builds

@dkerr64
Collaborator

dkerr64 commented Apr 4, 2024

@jgstroud You may be onto something here... we should certainly stop all running tickers during the upgrade process. There is no need to continue Server-Sent-Event notifications to the browser during upgrade. I'll look into that.
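The idea of pausing periodic work for the duration of an update can be sketched as follows. This is an illustrative Python model only: the `Ticker` class and `tickers_suspended` helper are hypothetical stand-ins, not the firmware's code, which runs on the ESP8266 and is not written in Python.

```python
from contextlib import contextmanager


class Ticker:
    """Hypothetical stand-in for a periodic task; only tracks whether it is active."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.active = True

    def stop(self) -> None:
        self.active = False

    def start(self) -> None:
        self.active = True


@contextmanager
def tickers_suspended(tickers):
    """Stop every active ticker for the duration of the block, then restart
    only the ones that were running beforehand, even if the block raises."""
    was_active = [t for t in tickers if t.active]
    for t in was_active:
        t.stop()
    try:
        yield
    finally:
        for t in was_active:
            t.start()
```

The update would then run inside `with tickers_suspended(all_tickers): ...`. On a real device a successful update ends in a reboot, so the restart in `finally` mainly matters when the update fails and the old firmware has to keep running.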

@cefoster0

So do we think we can update 1.1 to 1.2.1 via file and it will be ok? or does it need to be USB?

@dkerr64
Collaborator

dkerr64 commented Apr 5, 2024

> So do we think we can update 1.1 to 1.2.1 via file and it will be ok? or does it need to be USB?

For most people, updating from either GitHub or a local file is working fine. USB is only necessary to recover if the OTA update fails. We suspect that OTA may be failing if you have a poor internet or WiFi connection, and we have changes queued to be more tolerant of that and to suspend any other network and GDO activity while the update is in progress.

If your internet connection is slow then you might want to download the bin file first then use it to upgrade rather than direct from GitHub... but we have no good reason as to why there would be any difference.

@infamous-pattern
Author

infamous-pattern commented Apr 5, 2024

> So do we think we can update 1.1 to 1.2.1 via file and it will be ok? or does it need to be USB?

Yesterday, mine bricked upgrading to 1.2 OTA, but I just upgraded to 1.2.1 OTA via GitHub without issues.

@TLee1964

TLee1964 commented Apr 10, 2024

OTA also fails from v0.11.0. Looking for my USB cable...

Updated via USB. After re-installing it worked with HomeKit but I couldn't connect via the browser. Discovered (via my router) the IP address was changed so my bookmark no longer worked. Now I'm wondering if that was the problem all along. I never actually tried HomeKit after the OTA update to v1.2.1 because I couldn't connect with my original bookmark.
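For anyone in the same situation, a quick way to tell "bricked" apart from "moved to a new IP address" is to test whether the device answers on a candidate address before reaching for the USB cable. The Python sketch below is a generic TCP reachability check; the default of port 80 for the web interface is an assumption, and the candidate addresses would come from your router's client list.

```python
import socket


def is_reachable(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    ASSUMPTION: the device's web interface listens on port 80.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable hosts.
        return False
```

A call such as `[ip for ip in candidates if is_reachable(ip)]` would list which of the addresses shown by the router actually respond.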

@jgstroud
Collaborator

Thanks for the feedback. Any release prior to v1.0.0 is a roll of the dice because we had so many problems. I have yet to be able to reproduce this issue, but we have implemented several safeguards to ensure reliability of the OTA process going forward.

@Cristov9000

I just updated from 1.1 to 1.2.1 and it bricked the RATGDO. I have 4 other RATGDO on 1.1, is there any way to avoid this when upgrading those?

@dkerr64
Collaborator

dkerr64 commented Apr 11, 2024

> I just updated from 1.1 to 1.2.1 and it bricked the RATGDO. I have 4 other RATGDO on 1.1, is there any way to avoid this when upgrading those?

Did you update from GitHub or from a local copy of the firmware.bin file?

@paint697

My 1.1.0 to 1.2.1 OTA upgrade bricked my unit. My AP is co-located with the unit in the garage and I have locked my ratgdo's MAC to only use that AP. How do I recover?

@dkerr64
Collaborator

dkerr64 commented Apr 12, 2024

> My 1.1.0 to 1.2.1 OTA upgrade bricked my unit. My AP is co-located with the unit in the garage and I have locked my ratgdo's MAC to only use that AP. How do I recover?

You will need to erase/flash using USB cable from computer/laptop like your initial setup.

Did you update from GitHub or local file? We are working on adding checksum to ensure integrity of uploaded files which will hopefully prevent this error.

@paint697

paint697 commented Apr 12, 2024 via email

@dangerusty

I've connected mine to USB, reset, erased, and installed the latest firmware, but the WiFi does not connect: "Unable to connect".

@paint697

I had a similar issue with mine. I had previously "locked" mine to one of my APs. I had to temporarily remove that lock and then I could get the ratgdo back on my WiFi and I could again lock it to my AP.

@dangerusty

> I had a similar issue with mine. I had previously "locked" mine to one of my APs. I had to temporarily remove that lock and then I could get the ratgdo back on my WiFi and I could again lock it to my AP.

Ah, thank you! I had to dig deep on the Unifi console to find the offline device from a while ago and disable the lock. It works now.

@dkerr64
Collaborator

dkerr64 commented Apr 13, 2024

Thanks for all the reports. I have firmware update integrity checksum code in test which will hopefully prevent this in future (after next release of course).

@Damien514

Damien514 commented Apr 13, 2024

Since this update, my RATGDO keeps disconnecting after 12-18h. It won't connect to WiFi, even after switching it off and on. I have to re-flash the 1.2.1 update to make it run again (meaning move the cars, take a ladder, plug the Mac into the RATGDO...). The thing is, I can't export the logs; they are empty when I re-flash.
It has been running solid for weeks on the previous version, so it's really weird.

UPDATE - Just to add more context, when I plug in the RATGDO USB cable, the blue LED briefly lights up and goes off immediately.

@Cristov9000

> > I just updated from 1.1 to 1.2.1 and it bricked the RATGDO. I have 4 other RATGDO on 1.1, is there any way to avoid this when upgrading those?
>
> Did you update from GitHub or from a local copy of the firmware.bin file?

I updated via GitHub. Is it possible to avoid plugging into the other 4 units to upgrade? Or do I just need to get it over with?

@jgstroud
Collaborator

There seems to be an issue with fetching the binary from GitHub. Just download it first and use the local file update method. As @dkerr64 mentioned, there is a fix coming in the next release to address this issue.

@cefoster0

Should I try the local upload with 1.2.1 or wait for some of the other commits in the next release?

@jgstroud
Collaborator

It's entirely up to you. The next release should be coming probably end of the week or early next week.

@jgstroud
Collaborator

Hopefully fixed in #145

@cefoster0

> Hopefully fixed in #145

Updated from 1.2.1 to 1.3.2 successfully via local file. Great job fixing this!
