Do not lose tweets on TLS issues #9
Comments
Error is:
At least it's from Vala code and we get a line number!
This may be handled correctly on sending tweets. But
Previous code assumed a best-case situation that never failed because of network errors. It also forgot to bail after an exception. RTs now say whether they succeeded or not.
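As a rough illustration of the "bail after an exception" fix, here is a minimal Python sketch (not the actual Vala code). `NetworkError`, `RetweetButton` and the `send_retweet` callback are hypothetical names used only for this example:

```python
class NetworkError(Exception):
    """Hypothetical stand-in for a transient failure such as the TLS error."""


class RetweetButton:
    """Hypothetical UI model: `retweeted` is the state shown to the user."""

    def __init__(self):
        self.retweeted = False
        self.last_error = None

    def toggle(self, send_retweet):
        """Try to RT/un-RT and only flip the visible state on success."""
        wanted = not self.retweeted
        try:
            send_retweet(wanted)
        except NetworkError as err:
            # Report the failure and bail, so the success path never runs
            # and the button stays in its previous state.
            self.last_error = str(err)
            return False
        self.retweeted = wanted
        self.last_error = None
        return True
```

The point of the early `return` is that a failed request leaves the button in its old state, so the user can simply try again.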
Added bonus: Favouriting changes the fav count! Also, added code reuse
Still need to send a DELETE message so that the tweet disappears everywhere. Probably also need to handle DELETE on ProfilePage.
This also has to:
* Load the tweet asynchronously
* Add fields to save the tweet ID
* Change when we save/clear
* Add extra UI for clearing saved state (e.g. if replying to a now deleted tweet)
From initial testing, the patches are working. I've had an RT that failed and then reset back to the "not RTed" status so that I could RT properly. Previously the UI would say RTed while the dialog said it failed, then you'd un-RT and it'd complain that the tweet didn't exist, and then you'd RT. The only problem is that I've had to add a "Save tweet?" dialog to give users a way out when they quote a tweet, it fails for transient reasons (like the TLS issue), and then, by the time they go to post it again, it fails to load the tweet and so they want to clear the draft tweet. One thing I need to think about is attached media. I'm not sure how well we can handle keeping draft attached images.
Today was a REALLY bad day for these errors. Happened repeatedly. Not a clue what's happening with Twitter/the network. I briefly tried Wireshark but it wasn't hugely helpful. However, I did find that I've got more work to do:
Also, I still need to look at attached media.
* Set status based on correct flag
* Don't close action panel until success (as per RT button)

* Alert is now centred
* Emoji button gets disabled
* Send button gets reactivated
* Title goes back to text, not spinner
* Image buttons become active again

* Errors are bubbled out to the caller (through async pattern used for `download_avatar` method)
* Alert is now centred
* Emoji button gets disabled
* Send button gets reactivated
* Title goes back to text, not spinner
* Image buttons become active again

* Swap to injecting DM message that Twitter gives in response
* Only inject DM when it succeeds
* Track whether we've seen DMs, not whether it's newer than the newest (in case the other party replied before our last message, but we don't poll until we added our message to the model)
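The "track whether we've seen DMs" point can be sketched as follows. This is an illustrative Python model, not Cawbird's code; `DMThread` and its fields are hypothetical. A "newer than the newest" check would drop message 11 in the usage below, because our own message 12 is already in the model when the poll arrives:

```python
class DMThread:
    """Hypothetical model of one DM conversation."""

    def __init__(self):
        self.seen_ids = set()
        self.messages = []  # (message ID, text) pairs, kept sorted by ID

    def add(self, msg_id, text):
        """Insert a message unless a message with this ID was already added."""
        if msg_id in self.seen_ids:
            return False
        self.seen_ids.add(msg_id)
        self.messages.append((msg_id, text))
        self.messages.sort()
        return True


thread = DMThread()
thread.add(12, "our message, injected from Twitter's response")
# The next poll returns the other party's earlier reply plus our own message.
thread.add(11, "their reply, sent just before ours")
thread.add(12, "our message, injected from Twitter's response")
```

After the poll, the thread holds messages 11 and 12 in order, with no duplicate of 12.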
This solves a crash from the null assertion in cb_compose_job_send_async
Okay, I think that tweet, quote, reply, DMs, RT/unRT, favourite/un-favourite and delete are all working now (as in they catch the error and report it to the user, then put the system in a sensible state).
We've also got the "Save?" dialog when you close the compose window with text. I'm not hugely keen on it, but it gives us a way out of the "compose a reply, sending fails, re-open compose later, try to load tweet that you were replying to, tweet doesn't exist, cancel button saves tweet, re-open compose, try to load tweet, tweet doesn't exist…" loop. But as I type this, I'm wondering whether we're better off asking the user whether to ditch the saved tweet when fetching the reply fails.
Now I need to look at saving attached images and work out whether it's better to save the local paths and handle the user deleting them, or save the remote paths (which would lose incomplete uploads) and work out how to download the thumbnail.
Technical detail
From running Wireshark and decrypting my comms (instructions) and then running with
I still don't know why this is a problem, but the gnutls state machine appears to be in an incorrect state (or Twitter are sending things out of order) and it's expecting the wrong bit.
Question answered on the "save local image path or uploaded image" - from the docs:
(Emphasis mine) Given that we don't know when we'll come back to it, we'll have to assume that users keep the images that they're attaching (and if they don't then it's their fault!). The example shows 24h, which will generally be sufficient, but we can't guarantee that'll be the case.
* Only prompt the user about saving when a tweet load fails
* If it fails and they say no, reset everything
* If it fails and they say yes, close so they can try again (maybe it was a transient error)
* Follow the "throw errors" pattern for `TweetUtils.get_tweet`
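The prompt logic in this commit can be summarised with a small Python sketch. The names `TweetLoadError`, `handle_draft_load`, `load_tweet` and `ask_keep_draft` are hypothetical and only stand in for the real Vala methods:

```python
class TweetLoadError(Exception):
    """Hypothetical error raised when the referenced tweet cannot be loaded."""


def handle_draft_load(load_tweet, ask_keep_draft):
    """Decide what the compose window does with a draft reply/quote.

    Returns "compose" if the referenced tweet loaded, "close" if loading
    failed and the user wants to keep the draft for a later attempt, or
    "reset" if loading failed and the user chose to discard the draft.
    """
    try:
        load_tweet()
    except TweetLoadError:
        # Only now is the user asked anything; success never prompts.
        return "close" if ask_keep_draft() else "reset"
    return "compose"
```

Because the prompt sits inside the error branch, a user whose referenced tweet loads normally never sees it.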
File/line numbers are already recorded, and we can look up the readable text for the domain quark
Okay, I've improved how we handle saving - most users won't see a prompt. You'll now only get prompted if you compose with a draft reply/quote and it fails to load the referenced tweet (e.g. TLS error). Only thing to do now is load/save draft images. I've patched my custom version of Cawbird with the branch as it stands, so I'll do some testing before I merge.
This is in-line with RT and Favourite buttons (Reply and Quote are different because the dialog can't fail, and the compose dialog is then responsible for handling TLS errors)
Baedert added "draft" texts, but now we support draft images as well. If a TLS error happens and the user doesn't close the compose window then we can use the existing IDs because they're live for ~24h and the user probably won't wait that long. If they close the window then it could be re-opened the next day so we can't rely on the image IDs on the Twitter server still being live, so we need to re-upload them.
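A minimal Python sketch of that rule, assuming a hypothetical `DraftImage` record and `upload` callback (neither is Cawbird's real API): uploaded IDs are reused while the window stays open, and everything is re-uploaded from the local path once the draft has been closed and re-opened.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftImage:
    """Hypothetical draft attachment: local file plus optional uploaded ID."""
    local_path: str
    media_id: Optional[int] = None  # set once an upload has completed


def media_ids_for_send(images, window_was_reopened, upload):
    """Return media IDs to attach, re-uploading where the old ID can't be trusted."""
    ids = []
    for image in images:
        if image.media_id is None or window_was_reopened:
            # Either the upload never finished, or the draft may be older
            # than the server-side lifetime of the ID, so upload again.
            image.media_id = upload(image.local_path)
        ids.append(image.media_id)
    return ids
```

This relies on the local file still existing when the draft is re-opened, which is the assumption discussed above.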
We hope the images are still there when we re-open the compose window. But maybe they aren't.
I've put in some load/save draft images stuff now. Possibly not so relevant since we now don't close the window and lose the tweet if it fails anyway, but a) useful for when people give up and try again later and b) it's let me spot another TLS error case to handle! Still to go:
Anything I missed? I'm vaguely suspicious that it's a GnuTLS bug underlying all of this, but I can't track it down and I don't know that level of the network stack well enough to just drop in to their community and go "what's going on here?". Also, it could be GnuTLS following the spec and Twitter not doing so.
This required a swap to a button (not a widget) and hooking up events. We also remove classes on upload completion to make sure we don't get conflicts.
If loading the timeline/mentions/favourites/DMs fails because of the specific SSL error code then retry. This appears to work, but it *might* cause problems (e.g. hammering out requests with no back-off approach when the certificate is broken)
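One possible way to address the hammering concern is to cap the number of attempts and wait longer between them. The Python sketch below shows that variant; it is not what the commit does (the commit retries without back-off), and `TlsError`, `load_with_retry` and its parameters are hypothetical:

```python
import time


class TlsError(Exception):
    """Hypothetical stand-in for the specific SSL error code being retried."""


def load_with_retry(load, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `load`, retrying only on TlsError, with exponential back-off.

    Any other exception propagates immediately. After `max_attempts`
    failures the last TlsError is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return load()
        except TlsError:
            if attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.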
Okay, timeline loading issues have a basic kludge. Image upload is sorted. I'm going to run a test build for a while to see if I hit any issues.
It looks like GnuTLS are acknowledging this as a bug, so all of these changes will be defensive and shouldn't be needed in future. It appears to be something that Twitter changed that GnuTLS doesn't handle cleanly.
Paging isn't forcibly reloaded because the user can just try again but these functions are "load at start" actions
Okay, I think this is about done now. I've still got to look into the GnuTLS side, but most things should now be handled correctly, fail and retry, or fail silently for some remaining less important things. I've merged to Master and will release soon, so give me a shout if there's anything else that people find.
It looks like there's a simple fix upstream in GnuTLS that appears to work for me. It's currently scheduled for v3.6.11 in December. In the meantime, I'll consider this as fixed as it's going to be, unless someone tells me otherwise.
@IBBoard Have you really checked compiling Cawbird with GnuTLS from master? I want to test building Cawbird against the latest GnuTLS master but currently fail at building the latter.
I've been running with a custom build of GnuTLS for a few days that basically just adds MR 1087 to the 3.6.10 code and I've not seen a single TLS error. I don't believe it's glib-networking that needs recompiling with the updated GnuTLS. I think it's libsoup. I've rebuilt GnuTLS but not glib-networking.
That is pretty interesting, as libsoup does not even take gnutls as a dependency in NixOS but just depends on glib-networking.
Huh, maybe you're right. Maybe I was reading old docs where libsoup used GnuTLS directly. I just rebuilt them both and it worked, but looking at build specs and
Good news, having built Cawbird against NixOS-unstable with the latest GnuTLS release, I haven't encountered any TLS issues for several hours so far \0/ Good job.
I just debugged it. The GnuTLS people knew the protocols enough to understand what to do to fix it!
I'm running Ubuntu 20.04 with GnuTLS 3.6.13 and still getting the error.
Well, I'm running the snap, which has an 18.04 base, so maybe that's the issue then.
I've never stopped having this error the first time I try to open any tweet in the timeline.
As far as I can tell, it looks like Ubuntu 18.04 is still using an old version of GnuTLS (possibly 3.5.18). From what I understand of Snap, I think it bundles its own version of many libraries and I assume it is also outdated if you're seeing the errors. There's nothing we can do. The changes in this ticket put in the best work-arounds we can without causing different issues. The only real solution is for distros and "application distribution systems" that insist on bundling their own copies of libraries to update their libraries to fix the real bug.
I've filed a bug with the snap dev of Cawbird as this is fixable in snaps.
Patch to fix it in 18.04 and 19.10 was submitted by the snap maintainer.
Sometimes (possibly just on one network) I get TLS connection errors. Normally this happens when RTing, but occasionally it happens when tweeting. Normally, Corebird/Cawbird keeps the last tweet, but in one specific situation, which I think is related to the TLS errors, it forgets it. My assumption is that it doesn't handle the error correctly and treats the tweet as posted.