Many export errors #11
Comments
I ran the process again and verified that those 56 attachments are NOT created during export. |
Thank you for the useful report. Signal's database format was updated, and this script hadn't been updated. If you run this:
It'll check out some changes that I've just pushed up. Then run the thing again; maybe you'll get some different / more exciting errors. |
I don't have git on my system. Is there an easier way than setting up that whole environment on Windows? |
You could download the updated file directly: |
|
Could you please dive into Signal settings and see what version of Signal you're running? It'd be useful if I could detect the Signal version from the Signal backup file, but I don't see that information in there, so we need to retrieve it manually. |
It's already uninstalled, but it was updated just this morning because Signal refused to let me delete my account until I updated. Prior to that I was still on 5.59 or something thereabouts. |
The version was 6.6.8. |
So would this backup file have been generated from the older version of Signal or from the newest version? |
The older version. |
Is the SQLite structure pretty simple? Can I look in there easily to see if those attachments are really missing? |
Ah ok - then the newer version I just patched isn't going to be much good for you. Revert back to the original one here: Feel free to post exact copy-paste of the error that's occurring. The sqlite schema is a little hairy, but there is a table named And some info in this thread about examining the sqlite database: |
All 56 of the errors are substantially the same, just different message IDs and attachments.
|
I think maybe the parent application signalbackup-tools didn't extract those attachments into files for some reason. |
signalbackup-tools pulls out all the file attachments (MMSes) and dumps them as files into the |
Right. I'm trying to figure out a way to manually see if those attachments are there, if they're a specific file type, corrupted, or something. I notice the "name" attribute is null for all of them. Not sure if that's true for the ones that are successfully extracted. |
I also noticed that it took about 3 hours to import the XML file and it appears that MMS groups are not maintained. The messages are stored under each individual sender. |
Unfortunately I'm not sure how groups are modelled in MMSes - it's possible that someone could perform a standard sms/mms export via SMS Backup and Restore, then find some group messages in that XML file, and then document that format somewhere, and then I might be able to adjust this script to create that same format. (the lack of groups thing wasn't much of a problem for me, as I didn't have any Signal groups) |
I figured out how to look into it and those 56 messages have a part_count=1 in the mms table, implying they should have a corresponding entry in the parts table, but there is no corresponding row in the parts table. It looks to be a Signal issue. |
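For anyone wanting to repeat that check, here is a minimal sketch of the kind of query involved. The `mms` table and its `part_count` column come from the comment above; the `part` table name, the `mid` link column, and the sample rows are assumptions made for illustration and may not match a given Signal schema version.

```python
import sqlite3

# Hypothetical, minimal stand-ins for the two tables discussed above.
# Only mms.part_count is taken from the thread; part.mid is an assumed link column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mms  (_id INTEGER PRIMARY KEY, part_count INTEGER);
    CREATE TABLE part (_id INTEGER PRIMARY KEY, mid INTEGER);
    INSERT INTO mms  VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO part VALUES (10, 1);
""")

def find_orphaned_mms(conn):
    """Return ids of MMS rows that claim to have parts but have no part row."""
    rows = conn.execute("""
        SELECT m._id
        FROM mms AS m
        LEFT JOIN part AS p ON p.mid = m._id
        WHERE m.part_count > 0 AND p._id IS NULL
        ORDER BY m._id
    """)
    return [row[0] for row in rows]

orphaned = find_orphaned_mms(conn)
print(orphaned)
```

Against a real backup, the same query would be run on the decrypted database file instead of the in-memory one, after adjusting the table and column names to whatever the schema actually uses.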
Looks like group MMS look like this in the XML:
Hope that helps. The 12025551236 number is my phone. I think type="137" indicates the originator of the message. |
Perfect thank you! I've pushed a change which should make the No guarantee it'll import the groups correctly - but unless I've missed something the format now looks very similar to what you've posted. You'll need to fetch the latest copy of the script again. If it still doesn't create the groups correctly, then I guess there's something else going on. |
This was the change I added:
|
It's doing some weird stuff on group messages. Here are some I found in the output XML:
Two things I see:
|
Re 1. I've just pushed some code which removes a bit of error suppression - if you download the latest version and run it again - an error will occur and you can let me know what it is (the longer the copy-paste the more helpful).
|
It seems to me that the addr structure should be that the sender's address has type=137 and the other members of the group addresses are type=151. Let me run the latest |
I didn't experience any new errors. Just the same errors with attachments as before. |
There's something a bit weird with determining who the sender is when it comes to group messages. There's a field on each MMS and it states who receives the message. The field is named This There's no field I can see that specifies this person was the sender, and that person was the receiver though... I wonder if it's just the first of the recipients. I.e. if there are 4 recipients to an MMS, just assume that the first number was the sender, and everyone else is the receiver. I can add a modification to the code if you think that's a goer. |
This is definitely the same "older" backup file that we were working with earlier? No file attachments at all? Quite sure? Or just particular messages don't have parts? |
|
I'm still trying to figure out why there is no
structure in the output. |
OK. I have fixed the node issue and made it write more readable XML. Here's the raw file. Working on getting the message text into a <part> and not duplicating parts. |
I verified that the type=137 on the sender's <address> is how authorship is determined. Still working. |
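A rough sketch of that addressing rule using `xml.dom.minidom`: the sender gets type 137 and every other group member gets type 151. The element names `addrs` and `addr`, the `address` attribute, the helper name `build_addrs`, and the two extra phone numbers are assumptions for illustration, not necessarily what the script emits.

```python
from xml.dom import minidom

SENDER_TYPE = "137"     # marks the author of the message
RECIPIENT_TYPE = "151"  # marks every other group member

def build_addrs(doc, sender, members):
    """Build an <addrs> element with one <addr> child per group member."""
    addrs = doc.createElement("addrs")
    for number in members:
        addr = doc.createElement("addr")
        addr.setAttribute("address", number)
        # Only the sender's own number is tagged as the originator.
        addr.setAttribute("type", SENDER_TYPE if number == sender else RECIPIENT_TYPE)
        addrs.appendChild(addr)
    return addrs

doc = minidom.Document()
# 12025551236 is the number quoted earlier in the thread; the others are made up.
addrs = build_addrs(doc, "12025551236", ["12025551234", "12025551235", "12025551236"])
print(addrs.toxml())
```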
OK. Now I have it properly creating a <part> for the mms.body text. Stand by for fixing the sender. |
Fixed the python bug with newlines. Still working on sender info. |
Got the sender issue fixed. Only issue left I can see is that messages with an attachment AND text aren't working properly. Still working. |
OK. Everything's working. The only issue I'm having is that a couple of the messages are blank (no text and no attachment). I am researching the issue. Uploading here to make a snapshot of what I've done so far. |
The blank messages are when somebody sent me a shared contact. They do not appear in the message body nor in the part table. I am trying to see where those are stored. |
The data is in the database under mms.shared_contacts. The problem is that it looks like SMS Backup & Restore doesn't know how to deal with those. I can tell because when I use it to export one of those messages it's also blank. Not sure how to approach this one yet. |
This one's a far more complicated nut to crack. It looks like "SMS Backup & Restore" can handle vCard info just fine. The problem is translating what's stored in the Signal database into vCard format. The database stores it as flat text instead of the usual base64 encoded vCard format that appears to be standard, and the text is not in vCard format. |
Here's my work in a diff format so it's easier to see what I've done. |
Cleaned it up a little more by "monkey patching" the minidom bug and redefining the writexml() call to produce formatted XML without having to use prettyXML. |
This is all epic. Well done. The monkey patch thing didn't work for me - it produced an error - but the rest is great. I'm just running an end-to-end test and if all good I'll merge your work into master asap. |
I patched the monkey patch ... lol. The way it is in there right now works. I'm working on unrolling the vCard data into a legitimate vCard attachment. I'm hoping that I can just create some Python dictionaries with the existing data and unroll them in the code to produce a vCard and then base64 encode it. |
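The dictionary-to-vCard idea could look roughly like the sketch below. The shape of the `contact` dictionary and both helper names are hypothetical; the thread does not show how Signal lays out `mms.shared_contacts`, so the real data would need to be mapped into this shape first. Escaping of commas and semicolons inside values is left out for brevity.

```python
import base64

def contact_to_vcard(contact):
    """Render a small contact dict as vCard 3.0 text with CRLF line endings."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{contact['family']};{contact['given']};;;",
        f"FN:{contact['given']} {contact['family']}",
    ]
    for phone in contact.get("phones", []):
        lines.append(f"TEL;TYPE={phone['type']}:{phone['number']}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"

def vcard_to_base64(vcard_text):
    """Base64-encode the vCard text so it can be stored as attachment data."""
    return base64.b64encode(vcard_text.encode("utf-8")).decode("ascii")

# Made-up contact, purely for illustration.
contact = {
    "given": "Jane",
    "family": "Doe",
    "phones": [{"type": "CELL", "number": "+12025550100"}],
}
vcard = contact_to_vcard(contact)
encoded = vcard_to_base64(vcard)
print(vcard)
```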
Yeah I'm definitely getting an error with the monkey patch - what version of python are you running? When I run the code, I do so via I'm a bit surprised about the vcard stuff - what problem are you trying to solve there? (I'm tempted to suggest to not worry about it - but I don't understand what you're trying to fix) |
The monkey patch error:
|
Yes. I just ran into that. Testing a fix now. I am just not doing the .replace if |
All the messages where someone has texted me a contact information (vCard) show up as blank messages. I am trying to make those messages appear with the vCard attachment like they do in Signal and stock Android Messenger. |
Nope. Still blowing up. I'll keep trying to figure it out. |
Have updated master, you may want to fetch again - to simplify your diff |
It looks like you've fixed the mms authorship btw! One catch on that is in a group chat that has both mms and sms, the problem I described last night is still occurring, the sender of the sms is not being set. |
I think I got it. The monkey patch is quite old, so I just copy/pasted from the current xml.dom.minidom and modified it with the relevant stuff from the monkey patch. Running on my 10 GB Signal database now. |
That was the hardest part.
I didn't know there was such a thing as group SMS. I was under the impression that all group messages were MMS. |
I might be mis-stating what I'm seeing. It's like there's this Signal group chat, and there's a bunch of messages in there that are media messages (stored in the mms table), and also text only messages (that are stored in the sms table). But Signal presents them all in the same group chat. |
Btw I haven't had a lot of luck with the various sms applications for verifying the results of an import. The only one that seems to be rendering the entire message history ok is (my phone's a bit de-googled so I don't have google messages) |
It's fixed on my end. It doesn't blow up any more. The issue was just that the patch was stale in relation to the current xml.dom.minidom code base. I updated the patch and everything ran OK over here. Let me know. As to the messages in the SMS table showing up in your Signal under a group chat, are those messages all from you by any chance? I've heard of people having a problem where they reply to a group message but their message is sent out as individual SMS to each group member rather than as an MMS to the group. |
Thanks @jbaker6953 it ran successfully for me now too! You've really nailed your goals here. Super impressed😍 I've just re-formatted the patch a bit so it passes my linter. With the sms in group chat problem, unfortunately they are messages not just sent by me, but by others in the group too. It's a Signal group chat that has messages in it that are in both the |
I don't think I have anything like that. Perhaps because all of my group chats involve at least one person who's not using Signal, which makes it default to MMS. I bet your groups where this is an issue involve all Signal users. I'd be curious to see how those are represented in the database. As long as we can build some kind of JOIN query to link them all up it shouldn't be too difficult to fix. Happy to help people be free of vendor lock-in of their own personal data. I'd hate for Signal to pull the plug on us and have no way to make our data portable, and if I'm going to do it for myself I might as well let everyone benefit from the work. |
There's a major bug in my code. I would suggest you revert it until I fix it. It probably explains why you were seeing the issue you described. I think I can fix it tonight or tomorrow. |
I believe this is fixed. I'll be curious to see if it addresses your issue too. I also discovered a bug where we were writing "None" in XML attributes when we should have been writing "null." Fixed that, too. Also fixed a bug where naked 'bad' characters in the text message (e.g., "<", ">", "/", etc.) weren't being escaped the way I had the monkey patch before. Tried to improve resource use on very large databases by only selecting columns used in XML generation. |
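A small sketch of the two attribute fixes described above, assuming attributes are set through `xml.dom.minidom`. The helper name `attr_value` and the `sms`, `body`, and `subject` names are illustrative only. Stock minidom escapes `&`, `<`, `>`, and `"` in attribute values when serializing, so the main thing a custom writer has to do is not bypass that step.

```python
from xml.dom import minidom

def attr_value(value):
    """Render a database value as an attribute string, using "null" for None."""
    return "null" if value is None else str(value)

doc = minidom.Document()
sms = doc.createElement("sms")
body = 'a < b & "c" > d'
sms.setAttribute("body", attr_value(body))
sms.setAttribute("subject", attr_value(None))  # would otherwise become "None"

xml = sms.toxml()
print(xml)

# Parsing the serialized element back recovers the original text unchanged.
round_tripped = minidom.parseString(xml).documentElement.getAttribute("body")
```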
I got it to handle vCards stored before Signal fixed the vCard bug, but I still can't figure out where the vCard data is stored after they fixed that bug. I'm going to have to dig into the Signal source to investigate. |
A lot of errors.
because local variable 'phone' referenced before assignment
because No such file or directory: 'bits/Attachment_8161_1670890804128.bin'
or similar filenames. The files actually do not exist. There were no errors listed in creating the files in the bits directory. The only thing that catches my eye is the double slash