BOUNTY: Find a workaround/fix for update issues - $1000 #54
Comments
|
Relevant files as mentioned above: Generic Node File Server with HTTPS: Attempt at releasing chunks of file more slowly: Firmwares to load after setting IP/Domain and Fingerprint, and source code of those firmware for reference: |
|
I think maybe file_chunk_write should wait for the write callback to respond, before starting the timer. That way you won't fill up a send buffer if the connection is slow. Which could cause a large burst of data to be sent on a slow connection, I would guess. And this is nitpicking; but getFileSizeInBytes is kind of redundant, since you just read the entire file. So file.length holds your answer to the file size. |
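A minimal sketch of that suggestion, written with Python's asyncio rather than the Node code under discussion (which is not reproduced in this thread). Every name here (`send_paced`, `chunk_size`, `delay`) is hypothetical. The pacing delay starts only after `drain()` returns, so a slow client cannot pile up a large backlog in the application's write buffer:

```python
import asyncio


async def send_paced(writer, data, chunk_size=1024, delay=0.05):
    """Send `data` in chunks, pausing only after each chunk has drained.

    `writer` is assumed to behave like asyncio.StreamWriter:
    a synchronous write() plus an awaitable drain().
    """
    sent = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        writer.write(chunk)
        # Wait until the transport's write buffer has fallen below its
        # high-water mark before the pacing timer starts.
        await writer.drain()
        await asyncio.sleep(delay)
        sent += len(chunk)
    return sent
```

The kernel's own socket buffer can still hold data that `drain()` does not see, so this probably narrows a burst rather than eliminating it.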
|
Sounds like the TCP stack in the firmware is broken, i.e. it can't request retransmission of missing packets. I would hook up Wireshark and look carefully at what is going on. If that can be verified (no re-sends are requested), maybe creating a local network with a local server (e.g. in VirtualBox or on a mobile device) could minimize packet drops. One way to capture with Wireshark could be to share "internet" over WiFi on a Mac and run Wireshark on the same machine, to make sure you capture as much WiFi traffic as possible. |
|
Is there some way to get the source and replicate the build environment for the factory image? I realize that the final solution must work with the image already flashed, but for debug it would be helpful to see the code on the Oak and make changes. |
|
The source is the same as in the zip file attached to the second comment. You can set up the environment to compile it by downloading the following: https://www.dropbox.com/s/dgb4qf1cooz3oba/oak_fallback.zip?dl=0 (Note still Then select Oak Fallback as the board, Single as the rom config, and hit
|
|
OK, working on getting my environment set up. Should I expect any response to the "set" commands from the factory firmware? |
|
Yes - just added, thanks for asking: The Oak should respond with {"r":0} after each set of three lines (set, length, content) is sent - if you get {"r":-1} back then something is wrong with the input. Also, the baud rate should be 115200. |
|
@jldeon - just added some more notes on how to confirm these changes and about timeouts for sending it all, see just under the "set" codeblocks in the top post |
|
I've got everything set up and running now, and I'm deep in debug land. I think I've got a solid lead on part of what's going wrong, but not everything yet. Will try to keep you posted, assuming someone else doesn't figure it out first :) |
|
Just started to get things set up to have a play with this, but I'm in the horrible (joke) situation where my Oaks update fine a good 80%+ of the time: 9/10 in a row this morning (against the Digistump server, that is). I've even tried enabling WPA2 and forcing WiFi channel 1, as was advised against. I've also tried hammering my connection's up/down stream while updating. My only other AP is an Alcatel Pixi 4.5 in AP mode, but the Oak just refuses to connect to that, full stop. I'm happy to provide remote testing of anyone's WIP solution from a location that seemingly is blessed. |
|
I've uncovered one major problem, and I believe the solution is fairly simple. It seems like having one failure (ie, one bad connection to the update server) causes cascading failures for the Oak. My fix should alleviate this: endless loop. It seems like the network stack is doing something silly, which is causing the server to become confused. Basically, TCP connections are established based on the client's IP and port, and the server's IP and port. When most web clients connect to a server, they pick a random port as their client port. The library on the Oak picks the same one every time (4097). If the transmission of the firmware file goes south, the socket is left open on the server. Most web servers have long timeouts on these sockets, because they expect you to make multiple requests (ie, HTML file, and then a dozen images or JS files or whatever). Now you reboot your Oak and try to connect again. The problem is, the Oak is sending a SYN using the same source IP and port (more than likely, if you're on a home router using NAT). The server looks at that packet, goes "I have a connection already" and drops it. Meanwhile, the server is still waiting for acknowledgement on the last chunk of the firmware file it sent. We can't fix the silliness on the Oak side, but we can more aggressively close the sockets on the server side. For Apache, try setting: In the VirtualHost section (or similar). This doesn't 100% fix the problem, but it does make it a lot less likely. KeepAlive Off is pretty sensible for the server, since we're not making bulk requests to it. The TimeOut parameter gives the client 3 seconds to acknowledge a packet if the send buffer is full (so by that point, you're already behind in terms of transmission). On Node.JS, the timeout parameter of the HTTP server object appears to serve a similar purpose. 
Ideally, we'd set something like Another option is setting Before these changes, if I interrupt the download from my local server (pull power while the 123123123 is scrolling), I'll get "NO CLIENT" errors over and over and over again. With these changes, I can pretty reliably flash my Oak on my local network. |
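To illustrate the collision described above, here is a toy model in Python of a server's table of open connections, keyed by the usual 4-tuple (client IP, client port, server IP, server port). It is only a sketch: real TCP stacks handle a duplicate SYN with more nuance than a plain drop, and the class name, method names, and addresses are made up for illustration. Only the fixed client port 4097 comes from the observation above.

```python
from typing import Dict, Tuple

# (client IP, client port, server IP, server port)
FourTuple = Tuple[str, int, str, int]


class ConnectionTable:
    """Toy model of the server's table of open TCP connections."""

    def __init__(self) -> None:
        self._open: Dict[FourTuple, str] = {}

    def syn(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        """Return True if a new connection is created for this SYN."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key in self._open:
            # Same 4-tuple as a connection the server still considers open:
            # in this simplified model the SYN is ignored.
            return False
        self._open[key] = "ESTABLISHED"
        return True

    def close(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> None:
        """Forget a connection, e.g. after the server times it out."""
        self._open.pop((src_ip, src_port, dst_ip, dst_port), None)


if __name__ == "__main__":
    table = ConnectionTable()
    oak = ("203.0.113.7", 4097)      # example address, fixed source port
    server = ("198.51.100.1", 443)   # example address
    print(table.syn(*oak, *server))  # first attempt is accepted
    print(table.syn(*oak, *server))  # reboot + retry reuses the 4-tuple
    table.close(*oak, *server)       # server gives up on the stale socket
    print(table.syn(*oak, *server))  # retry now succeeds
```

In this model the only server-side lever is how quickly `close` happens, which is what the shorter timeouts are meant to achieve.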
|
Another way we can get stuck in this "NO CLIENT" state is if the web server tries to close the socket, but can't get the Oak to reply to its attempts to do so. I'm testing with 2 on my machine, and getting much better results. |
|
@jldeon - some great discoveries, thanks! - just wanted to note that we have been setting /proc/sys/net/ipv4/tcp_retries2 to 4 which seems to be a good balance between closing when it shouldn't and allowing frequent retries - this is noted at the top of the server.js file, but this was a shot in the dark - you certainly figured out why this is necessary. In general I don't mind if any of the settings we need to change are OS level - this update server will run isolated on its own cloud/virtual server |
|
@digistump Ah, should have looked at the node.js code :) I figured you guys weren't going to do anything else on this box, which is why I suggested some of those OS-level changes. I'm not sure how the kernel does math with that tcp_retries2 value, so I don't know what 4 means in the context of how long the socket will persist. I'd suggest trying some of the other settings, as those helped immensely. If you want to see if a lot of people are stuck in the FIN-WAIT-1 state and would benefit from the For the record, I'm doing no throttling whatsoever on bandwidth and not having any issues flashing the Oak over and over again. The update server and the oak are on the same high-speed LAN. |
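For a rough feel of what a given tcp_retries2 value means in seconds: the Linux kernel documentation describes it as the number of retransmissions made with exponential backoff, starting from the minimum RTO, with the connection killed at the following timeout, and it quotes about 924.6 seconds for the default of 15. The sketch below reproduces that arithmetic, assuming a 200 ms minimum RTO and a 120 s cap. Real connections start from an RTO derived from the measured round-trip time, so treat the result as a lower-bound estimate. The function name is hypothetical.

```python
def retries2_timeout(retries: int, rto_initial: float = 0.2, rto_max: float = 120.0) -> float:
    """Estimate seconds until a stalled connection is dropped.

    Assumes the retransmission timeout starts at `rto_initial`, doubles
    after every attempt, and is capped at `rto_max`. The connection is
    given up when the timeout following the last retransmission expires,
    hence `retries + 1` waits in total.
    """
    total = 0.0
    rto = rto_initial
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


if __name__ == "__main__":
    for value in (2, 4, 15):
        print(value, round(retries2_timeout(value), 1))
```

Under these assumptions a value of 4 works out to roughly 6.2 seconds, against roughly 924.6 seconds for the default of 15.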
|
@jldeon before you implemented those changes were you getting SOCKET READ What server are you using to serve it? Apache (version, etc?) or Node or
|
|
@jldeon Do you have any issues receiving updates from the official server though? My local server works fine as well, but I have no issues with the official one, so I can't really debug. |
|
@digistump I get the occasional SOCKET READ TIMEOUT, (10% of the time, or so?) in looking at the tcpdump when it occurs, the server is retransmitting the packet but it's not getting to the Oak (at least from the tcpdump on the server, I see no ACK). I don't know that there's much that can be done about this, though, since we're dealing with wi-fi. I'm going to keep digging on that, though, now that I've done what I can with this issue. |
|
@DarkLotus Yes, I tried last night and this morning to update with the official server on 2 Oaks, and it failed on every attempt. |
|
Ah at least you can reproduce :) if you need a vps or anything to test from a remote server let me know. |
|
@DarkLotus Thanks for the offer! I think I've probably got it covered. I've got a couple of VPSes, credit to AWS, credit to Azure... probably some other junk if I dug around a bit :) |
|
@digistump Totally flaked on your questions. The box I'm using currently is Ubuntu 14.04.4 LTS, 32-bit, testing with Apache 2.4.7 (latest available in the Ubuntu repos) currently. |
|
@jldeon - when it failed on every attempt last night against the live
|
|
@digistump I was running the factory build, so there was no debug output. I posted what I had to the forums: http://digistump.com/board/index.php/topic,2034.msg9360.html#msg9360 |
|
I can reliably reproduce the "SOCKET READ TIMEOUT" error by pointing my Oak at my VPS, and I've been digging into it for the last hour or so. It doesn't look like an actual packet timeout, it looks more like some sort of weird conflict or maybe a race condition? I see a lot of retransmitted packets on both sides. I've got to sleep now, but I'll try and craft an experiment to test this tomorrow if I've got time. |
|
If anyone is around, I've spun up a CentOS 6 server with Apache 2.2, just in case reverting to Apache 2.2 is the fix. At least on my end I can confirm this works 100% of the time, 5/5 attempts thus far. I will set up an Ubuntu 14.04 / Apache 2.4 server next on the same host, and see if I can get some timeouts happening reliably. |
This one is Ubuntu 14.04 with Apache 2.4, all stock as well. I still can't reproduce socket timeouts reliably; I'm starting to wonder if it's router chipset related or something. |
|
When I see that there are more problems on high-speed networks, it kind of rings a bell for me: MTU! I don't have time to test it myself currently, but lowering the MTU of the server's network interface might help. On a different subject, I would be happy to test whether any server you guys put out there is an improvement: I am "lucky" enough to be unable to upgrade any of my 3 Oaks here at home (cable internet), and I have a 3.3V-capable USB/TTL adapter available. |
|
@DeuxVis oak2.jameskidd.net is running with MTU set to 576 if you want to give it a shot. |
|
I have 100% failure rate with the MTU lowered to 576, Socket read timeouts. So Packet size could definitely be a factor in this. Bed time for me, will look at it again tomorrow. |
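As a back-of-the-envelope aid for the MTU experiments: with plain IPv4 and TCP headers (20 bytes each, no options), the payload per segment is the MTU minus 40 bytes, so a lower MTU means many more packets for the same firmware file. The sketch below does that arithmetic; the function names and the 300,000-byte file size are purely illustrative and not taken from the actual firmware.

```python
import math

IPV4_HEADER = 20  # bytes, without options
TCP_HEADER = 20   # bytes, without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet at this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def segments_needed(file_size: int, mtu: int) -> int:
    """Number of full-size-or-smaller segments needed to carry the file."""
    return math.ceil(file_size / mss_for_mtu(mtu))


if __name__ == "__main__":
    hypothetical_size = 300_000  # bytes, made up for illustration
    for mtu in (1500, 576):
        print(mtu, mss_for_mtu(mtu), segments_needed(hypothetical_size, mtu))
```

More segments means more chances for an individual packet to be lost or retransmitted, which may be one reason a lower MTU made things worse rather than better here.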
|
@jldeon You're right, the update runs: 574 |
|
I don't know if this means anything, but during my test run I had one single failure that threw an exception: |
EDIT: There is now an official build/release of this tool from Digistump, visit this page instead of using the instructions in this comment.
Okay. This is "it." Still very beta, testers needed. I've built a standalone and/or server hosted solution that you can play with. Code is in this repo. It's a fork of OakSoftAP so that I can get the config.html file.
Setup
Windows: The "easy way" is to use this build: oakupsrv-win-983bbf7.zip. It should be self-contained. If you run into issues with the pyinstaller version, you can run it from source. Grab the source from the repo above, and install Python 2.7 from python.org. Install pyopenssl, twisted, and service_identity from PIP (ie,
Ubuntu (14.04 LTS tested):
Other platforms probably work, assuming you can install Python 2.7 and get the requisite packages from PIP or your package manager. These are just the two platforms I've tested on.
Running the Update Server
The update server does two things:
If you plan on serving the firmware yourself, I strongly suggest using a Linux machine for your update server. Windows works, but I can't tune the TCP parameters enough to get it fully reliable, and you're highly likely to get
For Windows users using the prebuilt exe, open a command prompt, navigate to the app directory, and run
*nix users, do
On Linux, you can also optionally run
Give the program a minute to run, you should get output like:
Configuring the Oak
On another machine that has WiFi (not the computer from the previous step - anything with a browser is fine - phone, tablet, laptop, etc), navigate to:
Where
Follow the configuration setup as per normal, until you arrive at the WiFi network setup step. Below the list of WiFi networks, you'll see options to enter an update server IP and certificate thumbprint. At this point, you can pick between the Digistump server, the AWS testing instance I have, or using the custom update server you set up in the previous step. If you select the custom server, the values will be populated based on the IP and thumbprint of the server, (hopefully) automatically. If this fails, you can fill them in based on the messages that printed on the console when you started running the server. Don't forget to click "Save" after entering the proper values! Then you can click the "next" button on the wifi config as per normal. If everything goes according to plan, you should see your Oak reboot and attempt to update. If you're pointed at your own custom update server, the server console should show lines like:
Other Notes
Windows update servers are more likely to cause issues. I can't tune the retransmission parameters hardly at all. Plus, Windows' timeout on sockets is something like 4 minutes, and the only way to change that is via the registry. Expect a lot of
If you want to generate your own standalone binary, make sure pyinstaller is installed (ie, on Windows, python -m pip install pyinstaller) and then run
If something goes wrong with the server's configuration, you can wipe all the created files by deleting the
|
|
@jldeon I am very impressed |
|
@epatel Thanks! For me, using a local Linux server is about as good as my AWS instance. I have good (200mbit) internet, though. Hosting your own update server will probably help more for folks with less reliable internet. I am, in fact, gainfully employed :) I'm a senior firmware engineer in the CTO group of a videoconferencing company, I do a lot of prototyping and research-y stuff. What kind of jobs have you got? I was sick for a couple of days this week and wanted to work on my Oak project, so I had some time to hack away at this. Once I got started, the problem grabbed me and I've spent most of my free time on it... |
|
@jldeon Ah figures, you being senior and having a good gig already. I am pretty senior too, Lead Dev/Architect here at Mag+. Wish I had had time to get dirty with this challenge, I like challenges, especially when one need to think outside the box. But, I very much enjoyed the show, and the collaboration everyone pitched in with. http://www.fastcompany.com/3031498/hit-the-ground-running/problem-solving-lessons-from-nasa |
|
@digistump My solution(s) work around a few bugs in the update code, which makes the initial update a bit more reliable. However, going forward, it would be best if those bugs in the update code could be fixed. That way, at least subsequent updates would be far more likely to succeed. Is that feasible? Is that part of the .bin that is downloaded from the update server? I'm still not 100% clear on all the software architecture in place, so I'm not certain how exactly to go about building and testing this sort of change. If it's possible, any documentation is appreciated. |
|
I'm trying to debug the "Unable to connect or save settings to your Oak" problem where you can't even get the Oak to connect to your WiFi network. Using Charles and Postman, I've narrowed it down to how the configure-ap JSON parser is handling my SSID name. I've grabbed the source for OakSystem, oak_fallback (from above), and oak_update (from above). I'm using Arduino IDE 1.6.5 and installed the 2.0.0-rc1 folder from oak_fallback into ~/Library/Arduino15/Hardware (Mac OS). My Arduino settings are: Oak by Digistump (Pin 1 Safe Mode - Default), Serial (Expert Use Only), 80 MHz, and the port is set to my USB -> Serial adaptor. I can now get OakSystem.ino to compile, but I'm unclear on how to get it installed onto my Oak. I've tried both uploading from the IDE and exporting the compiled binary and using esptool, but in both cases, when the upload is complete and I re-boot the Oak the LED flashes 3 times, pauses, and repeats. The esptool command I tried was: What is the correct way to get a new OakSystem installed for debugging? |
|
@AtomicCat I might be able to help somewhat, but let's take this over to the Digistump Oak forum so that other folks can find the info - it's only tangentially related to the problem this issue is trying to address. If you make a post over there, I'll give you what I know :) |
|
@fri-sch Doing some digging into the occasional stack traces I get like that. I used objdump on my shiny new oakupdate binary to get the full disassembly:
(I had to pull oakupdate.cpp.elf from the Arduino IDE build directory under %TEMP%) The exception 28 means LoadProhibited or trying to read from an invalid address. The part of the code that is executing is in lmacProcessAckTimeout, which is only referenced in libpp.a, a binary part of the ESP8266 SDK. I assume this means that it's in Espressif's network driver, and not something that's going to be easy for us to fix. |
|
@jldeon I've started a thread for building OakSystem at How to build OakSystem. |
|
@jldeon - Sorry for my absence in this thread this weekend, it's been a busy one around here. The bounty doesn't require fixing the issues in the firmware itself because that firmware is a one time use updater - after this first update all future updates occur through the Particle cloud using totally different firmware that seems to be pretty darn reliable for everyone. In addition, the next run of Oaks at the factory will be pre-loaded with whatever the latest Particle firmware is, so this flawed firmware will never be used anywhere again and its whole lifespan is the first use of each Oak. Also, I imagine this has been assumed but - the Bounty is awarded to jldeon, assuming he sticks around here to fix any issues/improve things further if possible. Awesome work, many thanks! I'll dive more into all of this tomorrow. @jldeon - is your AWS server running the same source as the package is using? |
|
@jldeon Nice work!! Thank you! Thank you! All hail jldeon... the Oak Update God!! xD I can say on my end, with Windows 10 Pro x64 as the server, and my wifi configured normally (WPA2 TKIP+PSK, N), that updating via the custom local server is the first time I have had a new Oak update first time out of the box. I did try your AWS server by OakRestoring the updated Oak, and picking your server via the config file dropdown (and I remembered to press save!!), but it failed with the usual socket timeout error. So for me the local server is a perfect fix - that's my 2 cents (and tries for Oak 5!!) anyway ;) |
|
@digistump No worries! :) I did dig into the firmware a bit to figure out as close as I could to the source of all the issues, I'll summarize in another post so that perhaps someone can learn from this or if the issues come up again we'll have whatever I learned from my investigation. Woo! Bounty! Awesome. Yes, I'll stick around and help, clearly. Code's all up in the repo I linked above, feel free to report issues. Looking forward to actually starting work on my Oak-based project at some point :P My AWS server is running an older version of my code. Most of what I changed between that version and the published code is stuff like the auto-generation of SSL keys, the firmware auto-download, Windows changes, etc. Usability changes, basically. The core of it is unchanged between AWS and the repo. Let me know if there's anything logistics-wise that would be useful. I don't know if my UI changes to the setup app are what you want, so I didn't submit a pull request, but I can. I can keep the AWS server running or you guys can set up your own based on my code, either way. Maybe next time you guys are prepping to ship something neat, you could get in touch, maybe hook me up with one to help test early? ;) @pfeerick woo! Glad it worked for you, and happy to help :) The local server solution is best if you've got kind of spotty internet compared to your wireless LAN. I still do get socket timeout errors occasionally from both my local server and the AWS instance; it hovers around 10% for both. |
Error Analysis
|
|
@jldeon nice error analysis... hope it helps in resolving the issues that people have encountered. Yeah, I don't know what the go is with my connection - I suspect it is more MTU / packet related, as my internet connection is pretty stable, and the Broadcom chip in the modem seems to be rock solid for the wifi. I have two other ESP modules running 24/7 posting temperature stats to thingspeak, and they rarely miss a beat, considering they don't have any retry code or anything - they just power up every 10 minutes, push out an update, and go to sleep until the next reboot cycle. I was able to update an Oak using the official server via a portable hotspot on the first attempt (or was it the second?), so make what you will of that. Regardless of all of that - I think you have given us the 2nd and 3rd options, making it very unlikely that anyone will have to resort to the 4th: 1) update from the main server, 2) update from an alternate server, 3) update from a local server, and 4) manual update via serial. |
|
I am not sure if this is related to the discussed issue, but I have seen the SoftAP SSID corrupted in my tests today. At home I have only a couple of wifi networks, and my Oaks connect and get their update perfectly. But I took one of them with me this morning (to impress my coworkers). When I tried the update process, my Linux laptop detected an ACORN-0bda41 SSID, started a connection, launched DHCP, and then the SSID disappeared! A new scan showed a weird SSID, ending with a strange character. I repeated the process half a dozen times, with the same result. My NetworkManager now has two different networks stored in /etc/NetworkManager/system-connections: "ACORN-0bda41 automatic" and "ACORN-0bda41? automatic". Checking this file with a binary editor, the weird character in the SSID is a '01': My office is in a shared building, with lots of little startups and loads of wireless networks (and many different security policies). Possibly that has something to do with this strange behaviour. As soon as I arrived home, I tried the update again, and it worked at the first attempt. |
|
Thanks, this worked for me where nothing worked before (including serial recovery). I used the digistump server. Much appreciated. |
|
@alfem I have experienced the same thing multiple times at home: an extra non-printable character at the end of the ACORN SSID. There are a lot of WiFi networks here too, as I live in a large residential building. I have not been able to make my USB/serial adapter talk to the Oak yet, so I cannot provide more details currently. |
|
Sorry for the delay. Here are some logs of unsuccessful attempts against your server, jldeon. I will do more tests later with an improved wifi environment. |
|
Attempts against the digistump server with the new SoftAP still fail for me at home, with some socket read timeouts and some failures to connect to the server. Note that the first (fast) update got much further than anything I have experienced on the official server until now. Official_server_new_SoftAP.logs.txt I'm going to try with the Oak near my router, but this won't allow me to get the debug output. Later: nope, no luck with proximity to the router. Anything else I can try to help debug this? |
|
@DeuxVis looks like you're getting The Have you tried running your own local instance of the update server? That would eliminate any internet-based packet loss. If that still doesn't work, you can try tweaking your wifi settings - for me, switching to B only on wireless made a big difference. |
|
Thanks for your reply @jldeon To clarify, I am not looking for help to get my oaks updated, I can do that by serial or by using another wifi router/internet connection. I will try a local instance of update server next, should help narrow down where the problem comes from. |
|
Hey @jldeon, can you update the version of the firmware your custom server is pushing please? It appears to still be hosting the 0.9.5 / 5 core, instead of the more recent 1.0.0 / 6 core. I can update from your server just fine while the official server is giving me the 'SOCKET TIMEOUT' errors, but I then end up having to update (again!) using the local server to get 1.0.0 / 6. Interestingly, on the official server, the './+' symbols appear in groups of four, whereas yours is in groups of one. |
|
@DeuxVis I don't know that there's much more diagnostics that would be helpful at this point. I think I know why the errors are occurring, and beyond fixing the factory firmware (which is impossible and/or useless) I don't think I can do a better job of working around them. @pfeerick Firmware file is updated & the server on my AWS instance has been restarted. while we're on the subject... @digistump Did you guys set up an instance of this server somewhere? Just wondering when I can shut the AWS instance down & point it at your version. :) |
|
@jldeon - yes at oakotafallback.digistump.com - and the config app now points to it automatically if the first server update fails (and then refers people to this tutorial if that fails too: http://digistump.com/wiki/oak/tutorials/local_update) For our fallback server I used your python code tweaked for running on a remote server, and ran your linux tcp settings file - anything else you did to set that up? Closing this issue as well, as we've now released all of this officially - many thanks @jldeon, please email me with how you'd like your bounty (support@digistump.com) |
|
@digistump Gotcha. I've edited my fork and the comments on this page so they point there instead of to my AWS instance. I'll take the AWS instance down here in a bit. I didn't do anything else to my AWS instance except for what's in the Python script and the TCP parameters shell script. You're quite welcome! Glad I could help. I'll send you guys an email here this weekend, got to run off to 'work' in a few. |
It's a good idea to randomize the port used, otherwise a rebooted Oak will try and reconnect on the same port, which may result in a long delay if the connection was not closed before the reboot, as mentioned by @jldeon here: digistump#54 (comment) I believe I have observed this issue in practice on a few occasions, where it would take exactly five seconds for the first data to come through the connection, which seems too constant to be a random delay. As the read timeout in blocking_receive() is 2 s, this is obviously problematic. As most users will be behind routers running NAT, it will be the router settings, rather than the Particle server settings, that affect how reconnections from the same source port are treated.
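The idea of randomizing the source port can be sketched as follows. This is an illustration in Python, not the Oak or Particle firmware code: the helper names are hypothetical, and the port range is the IANA dynamic/private range, which is an assumption about what a sensible range would be on the device. Each attempt binds to a fresh random local port before connecting, so a retry after a reboot does not reuse the 4-tuple of a connection the server or router may still remember.

```python
import random
import socket

# IANA dynamic/private port range (an assumed choice for this sketch).
EPHEMERAL_RANGE = (49152, 65535)


def pick_source_port(rng=None) -> int:
    """Pick a random local port from the ephemeral range."""
    rng = rng or random.SystemRandom()
    return rng.randint(*EPHEMERAL_RANGE)


def connect_from_random_port(host: str, port: int, attempts: int = 5, timeout: float = 2.0):
    """Connect to (host, port) from a randomly chosen local port.

    Retries with a different local port if binding or connecting fails.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.bind(("", pick_source_port()))
            sock.connect((host, port))
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("no connection attempts were made")
```

On a desktop OS, simply not binding at all lets the kernel choose an ephemeral port; the explicit choice is shown here only because a small embedded stack may not do that on its own.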
Skills Required: System Admin, Node, C++, Linux Sockets, ????
Difficulty: Unknown
Challenges/Thoughts:
The technical limitations: The firmware shipped on the Oak is there solely to let you configure your Wi-Fi and get the first update. Because the Oak does not have a USB interface, that firmware cannot be changed except by this update, so any changes to make this work have to happen on the server side of things.
The technical details: The Oaks performed very well at getting the update in our early testing, before we sent the firmware to the factory. While this was tested in a variety of ways, the main development setup had the update file hosted on an Apache 2.2 server on Amazon EC2, accessed from our machines over a 3mbps DSL connection. We also tested with it locally on our b/g/n wifi network. Our routers were run stock for testing, with b/g/n enabled and the channel set to auto. Between the point where we approved the firmware to be burned to the units and the point when people started to report issues, the following changes occurred (or may have occurred): the hardware went from prototype to production, with likely ever so slightly different parts that should not have affected performance, and the update file was moved to an Apache 2.4 server on Ubuntu 14.04 on Digital Ocean (and then to various server setups - see github for more on this). After this, updates were still working very well for us. Since we didn't think it was an area we needed to worry about, I can't say for sure whether one ever failed, but they never failed often enough to catch my attention as a possible issue before I started shipping the factory-produced units. The final point: the update still works for me more often than not over DSL, but not always over the local network, which is why I believe connection speed may be involved.
Issue specifics: Specifically, the Oak seems to either disconnect from the server prematurely or fail to get the next packet/chunk of data from the server. This is usually seen on the Oak as a Socket Timeout, or the Oak restarts because the watchdog timer kicks in after it sits in a loop doing nothing for a while. If the Oak makes it to the end of the update, it works. This may have to do with WiFi interference (we don't expect you to magically make it work even if other things are on the same channel as the Oak) - we are just trying to get it to work with a minimal set of rules for the user ("make sure your router is not on channel 1" is acceptable; "make sure your router is set to B only and 1mbps" is probably not). For some rather extreme router settings that seem to work, and that support our idea that this is speed related, please see this post by a fellow beta tester on the forums: http://digistump.com/board/index.php/topic,2046.0.html
Things we've tried: We've tried various server setups, including Apache and basic Node.js SSL servers. We've tried using Node.js to make a custom HTTPS server that slowly pushes chunks of the firmware to simulate a slower connection. This seemed 100% reliable at one point in our testing, and then it wasn't any more, no idea why. It is worth noting that the Linux sockets buffer this data anyway, so it was unlikely this was actually helping, but we really don't know for sure yet.
How to test:
where yourdomainorip is the domain or ip you are testing with, and the 00s are replaced by the SHA1 thumbprint of your SSL certificate. Your Oak will now expect this certificate, connect to this domain or IP, and expect the firmware file at /firmware/firmware_v1.bin (grab the latest firmware here to place on your server for testing: https://oakota.digistump.com/firmware/firmware_v1.bin)
NOTE You should have these strings ready to send as there is a 30 second timeout between sending the set and sending the json string.
The Oak should respond with {"r":0} after each three lines (set, length, content) are sent - if you get {"r":-1} back then something is wrong with the input.
To confirm you have changed the two parameters send over serial these two lines:
You should get back a JSON response that includes those two settings.
This will cause your Oak to endlessly loop trying to download the update, displaying a log to serial of its progress, and then rebooting and doing it again.
6. Implement your solution in any way you can. (more below)
7. Test with your solution being served both on fast and slow broadband connections (local network, phone hotspot, high speed connection, etc). You can repeat 3-5 to set it to connect to a different server to test locally/remotely - these test bin files don't check for a certificate domain match, so you can reuse the same certificate if desired.
8. If you feel you have a good solution repeat step 5 but use oakupdate_debug_silent.bin instead - this does not show status counters during the update loop and therefore is more true to the speed of that loop on the factory Oak.
9. Submit your fix.
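Two of the steps above lend themselves to small helpers, sketched here in Python under stated assumptions. The first computes the SHA1 thumbprint of a PEM certificate; the exact format the set command expects is not shown in this page, so the space-separated uppercase hex pairs are an assumption, and the separator is a parameter. The second checks a reply line for the {"r":0} success response described above. Both function names are hypothetical.

```python
import hashlib
import json
import ssl


def sha1_thumbprint(pem_cert: str, sep: str = " ") -> str:
    """Return the SHA1 thumbprint of a PEM certificate as hex byte pairs.

    The thumbprint is the SHA1 digest of the DER-encoded certificate.
    The separator between byte pairs is an assumption; adjust as needed.
    """
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    digest = hashlib.sha1(der).hexdigest().upper()
    return sep.join(digest[i:i + 2] for i in range(0, len(digest), 2))


def oak_reply_ok(line: str) -> bool:
    """Return True if a serial reply line is the {"r":0} success response."""
    try:
        reply = json.loads(line.strip())
    except ValueError:
        # Not valid JSON at all (noise, partial line, boot messages).
        return False
    return isinstance(reply, dict) and reply.get("r") == 0
```

A reply of {"r":-1}, or anything that fails to parse, is treated as a failure, which matches the "something is wrong with the input" case.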
Acceptable solutions: The sky is pretty much the limit here, other than that the firmware cannot be changed. The solution must run on a standard Linux server (hey, if you can get it to work with a Windows server that'd be fine too), but it can change the OS/server/etc in any way, as this will run on a standalone cloud server. It does not have to be particularly performant (we can run many servers if we need to), though it must be able to serve the firmware to more than one device at a time. You can mess with Linux sockets, you can write it in any language, you can use any existing software or packages - really we're open to anything. I have a fear it could be as easy as setting up Apache 2.2 again without any changes to the default Linux socket setup (we've probably messed with that too many times now without fully understanding it - see the comment at the top of the node scripts). I don't think that's true, as I'm sure I've tried that, but even if it were that simple we would award you the bounty.
Solution testing: Any submitted solution will be tested first by ourselves, and then by a selected group of users who have experienced issues updating and are able and willing to carefully test. If your solution works for most of these cases, we will award the bounty to you. Partial bounties for partial improvements may be granted as well. Solutions that require the user to run something locally on their network may be considered, but preference will be given to server-only solutions (not sure if that would offer any advantage, but thought I'd throw it out there).
Bounty
$1000 cash or $2000 credit or 200 Oaks
Won by @jldeon
Cash or credit is your choice. Cash to be paid via Paypal. Credit has no expiration but can only be applied to a single order and does not cover shipping (because that is how our shopping cart works, not because we want to be limiting). Oaks reward includes shipping. You can also pick a split between any of the options.
You may credit yourself in the files as well, leaving intact existing licenses and credits.
Legal Stuff: We will choose a winner at our sole discretion. The winner will be the first pull request/comment that submits fully working code meeting the above requirements and following good coding practices, based on the timestamp of the pull request. Bounty will be awarded (or in the case of Oaks, sent) within 48 hours of confirming winner. Cash awards will be made in USD. This is not an offer for hire. All work submitted becomes the property of Digistump LLC to be used at our discretion in compliance with any associated licenses. Void where prohibited by law.