Issue while compiling signalbackup-tools under Fedora Live 30 (Hardware) #4
Thanks. Is this running in a VM? I can reproduce this in VM, and just pushed a fix for that. Could you try the same commands again, and let me know? If you are not running in a virtual machine, I don't know what's happening, but you might be able to fix the build by just changing line 7 of the script to |
thank you for your quick reply. Fedora Live does not run in a VM. Unfortunately, the error also occurs with the fix. The adjustment of the code in line 7 leads to the same error message. Can I create a more detailed error log to better narrow down the error? Error with modified line 7:
Error with line 7 unmodified ( NUMCPU=$(nproc) ):
|
Hm, that should work just fine, I just did the exact same thing here with a F30 Live usb, so it should work just the same. I think the problem is with the If it still fails, maybe you could try not running the script at all and just run:
or, the same with the
|
thank you, that worked (compiling), but now the signalbackup-tools command doesn't start:
|
Good! To run executables in linux they need to be in your path (which this one is not), or you need to specify the full location. Long story short, use |
I'm sorry for asking such dumb questions but I tried to solve this for 30 minutes now, it still won't start the executable: |
Nope, I don't see anything wrong. If the build succeeded you should have the executable in the directory, can you see it when you type |
I think it is there: signalbackup-tools and BUILDSCRIPT.sh are the only green elements among the blue and white ones:
Since the signal backup file is 4-5 GB I can't put the backup file within the signalbackup-tools folder (that's why I have to put the path to the volume/backup file in the command) but I can't execute the executable right now... |
Hm, I'm not sure what's going on then, as far as I can tell it should just work. I've made a little video of the process, right from the start of booting the Live image (it hangs for a bit while installing the packages, so it's slightly long, but maybe check it out, see where the difference is): https://send.firefox.com/download/e9706d671e830f1f/#tGohrdcjyzc0ezuc4i7lnw It ends with an error btw, only because I don't supply any arguments to the program, this is expected. |
Thank you so much! It's working now. The tool is processing my corrupted backup to a new backup file. Unfortunately, it does not seem to correct the error. When trying to import the new signal backup file in Signal the import process stops at the count of 67101 messages. Is there anything in the syntax that can fix my Backup? The terminal's log of signalbackup-tool was this
|
Hm.. it looks very much like your backup file is not corrupted, but simply incomplete! That is quite a big problem, since the end of the backup file contains important data for the backup. (There is a small possibility of corruption though, if the size of that last attachment is incorrect it might try to read past the end of the file.) I need to think about a way to verify what's going on and how to fix it. Obviously, the missing data is simply gone, but I might be able to at least get the data that is still there imported. I'm assuming this is somewhat important to you because it is going to be a somewhat complicated procedure (if it works at all), so be prepared for some complicated instructions. Also it will probably take a while for me to think up possible solutions. |
Thank you so much for your help! Yes, for personal reasons it is very important for me to restore as many conversations as possible from this backup so that I accept every effort for it. If you find a way to restore the conversations up to the EOF I would be happy to pay you for the time you invested, I appreciate the effort! If the backup was aborted prematurely or if it is a corrupt attachment, I can't tell. I think that the backup process ran smoothly and there was enough space on the device. |
I will be working on this, I have some ideas, but first: I can't find it in this thread (maybe you deleted it?), but in my mailbox I have a message from you where you say you copied the backup file to a FAT32 formatted usb stick? Is this true and still the case? Earlier you also said the backup file was 4-5GB? Files on a FAT32 filesystem have a maximum size of just under 4 GiB (4,294,967,295 bytes); if you copy a larger file onto it, the copy will either fail or end up truncated. Can you check the filesizes? Do you still have the original? I suggest trying to format the usb storage to a more modern filesystem (NTFS should work out of the box on both linux and windows). |
Dear bepaald, thank you for your input. It's correct that I copied the backup file to a FAT32 formatted USB-drive to access it from within Fedora Live. I checked the file size, it is identical to the original on my harddisk (4.039.229.440 Bytes). Since there was no error message when copying to the usb-stick, this seems to be still within the FAT32 limit. I am almost 100% sure that I transferred the signal backup directly from the phone to the computer via SmartSwitch. So I don't know why and when the file was cropped. |
Ok, in your message on the signal github you mentioned 4,7GB (signalapp/Signal-Android#7637 (comment)), so I figured that was way too big for FAT32 (maybe you meant 3,7GB, that's about 4039229440 bytes?). Anyway, if you don't have any larger versions of the backup, it doesn't matter how it was cropped, I'll get working on a way to fix it. I've done some investigating and I have an idea to get the messages imported (it may take a little time though). Also from refreshing my memory by looking at the code, I'm pretty much certain there was no corruption but truncation (corruption would have resulted in a bad MAC before reaching eof). |
I had 4.7 GB in memory but the original backup file still exists and has 4,039,229,440 bytes (4.04 GB under macOS). I guess I remember it wrong because I copied the backup file directly from the smartphone to the computer without any detours (via FAT32). If the file is cropped, not much can be missing and my hope would be that at least most of the conversations could still be recovered. |
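As a quick sanity check on the sizes discussed above, here is a minimal Python sketch that converts the reported byte count into decimal GB and binary GiB and compares it against the FAT32 per-file limit of 2^32 - 1 bytes. The helper name `describe_size` is made up for illustration; only the byte count comes from this thread.

```python
# Largest file a FAT32 filesystem can hold: 2^32 - 1 bytes (just under 4 GiB).
FAT32_MAX_FILE_SIZE = 2**32 - 1


def describe_size(n_bytes):
    """Return the size in decimal GB, binary GiB, and whether it fits on FAT32."""
    return {
        "GB": round(n_bytes / 10**9, 2),    # decimal units, as macOS reports them
        "GiB": round(n_bytes / 2**30, 2),   # binary units
        "fits_fat32": n_bytes <= FAT32_MAX_FILE_SIZE,
    }


# Byte count of the backup file as reported in this thread.
backup = describe_size(4_039_229_440)
print(backup)
```

The same file reads as roughly 4.04 GB in decimal units and roughly 3.76 GiB in binary units, which would explain both the "4.04 GB under macOS" figure and the "3,7GB" guess. A file of 4.7 GB (decimal) would not have fit on FAT32.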
I think I can tell from the output you posted earlier it is probably only the last attachment that is missing. From the order in which the backup data are written, if I'm correct, you should have all messages (the text parts of the messages) including messages received after that last incomplete attachment. Unfortunately, there was also important stuff that was written after the attachments that is now gone. I have just pushed a commit that tries to generate the most important tables from the information in the messages. It fills in data for the thread table (otherwise your list of conversations would appear empty, even though the messages are in the db) and the groups info. I was worried about the 'identities' table remaining empty, but from my testing the app seems to accept this, it will just fill in new data after restoring. Check out the current code and compile (no need to edit the buildscript anymore!), then run like this:
I'm extremely tired, but some notes I can think of right now:
hm... that's all I can think of right now. I would love full output of the command above (censor anything you need, but I don't think it reveals much sensitive information). Also if you notice anything about your restored backup (missing or incorrect things) I would like to know. You might also want to test actually sending and receiving messages. |
Dear bepaald, thank you so much for your efforts!
Thanks again so much! |
Whoops, sorry that was a stupid mistake, I fixed it now. If you check the code out again the buildscript should be fixed. However, just adding the "2" should have worked. This looks like the exact same problem you had earlier (#4 (comment)), which I didn't understand either. How did you fix that one? |
Sorry, I just noticed: in your first command you were inside the signalbackup-tools directory ( However, in your second attempt (where you cleverly added the "2") you are in the wrong directory |
@elbrutalo Any luck so far? I could make another video if you need one... |
Dear bepaald, please forgive my late feedback, I was on my way and only now I had the possibility to report back. The recovery process went on until the end, most of the messages could be restored (especially the old messages were all there, at the end some days might have been missing). Not restored were the attachments of the last weeks (photos, voice messages, videos). These appear as speech bubbles in the conversations, but are empty (see screenshots). But I'm overjoyed about the older messages that could still be saved! Therefore I thank you infinitely for your help! I have also sent you via Paypal a small expense allowance and appreciate your commitment here for the community and me very much! Unfortunately, I have now caused a new problem due to carelessness:
In the months since the broken backup I have been working with a new instance of Signal and have received and sent several hundred messages. These messages exist in a second backup file with a different passphrase. This means I now have the recovered backup file (with passphrase 1) and the new one from another signal instance (with passphrase 2). Could I merge them with your tool? I found some threads on the net where people were facing the same problem. There doesn't seem to be a solution. Maybe it is possible in my case because the conditions are favorable. The backup file from the new signal contains messages from the same contacts as in the old corrupt backup. So numbers and contact names are the same. Is there a way to solve this with your script? Best regards, elb |
No problem!
Good! If I had to guess, I'd say all messages that were in the database were restored. I think I was pretty careful about that. Any messages that could not be placed in a thread should have produced some output stating that. For example, with my own (truncated) testing backup:
I don't see any screenshots :) But that's okay, in my testing backup I also had one attachment missing and it also showed an empty bubble. I don't think they will pose a problem, but if the message body is empty (it was just an attachment with no actual text message) you might as well delete the messages to be on the safe side. I think any missing attachments were simply not present in the backup file anymore, but you could test this. The program has had the ability to dump the entire decrypted database to a folder for a while now. You could use this option to see if there were any attachments in the database that haven't been placed in the fixed backup (I would guess not, but it's possible if the message they belong to is gone). To do this:
It will obviously fail at the end, because the database is no good, but it will write the attachment data to the directory as it finds them. The filenames will not be very helpful, but you can manually inspect the attachment files (they should also be pretty much chronologically stored in the backup, so the attachment with the highest number should be the most recent one).
Thanks a lot! I just noticed that this morning and did not know who it was from. Though of course I would have helped you anyway (or tried to at least), I really do appreciate it a lot!
Haha, wow, that is some bad luck! But also good news! I have already implemented this feature a couple of weeks ago! In fact, when I woke up this morning I just decided to post a message in this thread to let those people know they could test it out, but now I will let you be the brave tester. I have only tested with a few handcrafted, very small backups, you are really the first to try it seriously. Be prepared for it not to work (at least not the first time), it may need more work. Also, it is not a fully automated procedure, there are some slightly more complicated instructions than before. Example: Assuming a current backup
Then, you can import a selection of threads from the source file into your current backup and export to a new backup file. Expect tons of output (I really need to clean that up sometime):
The program automatically tries to determine into which thread of the current db the old messages should be inserted. This might fail if one of the backups has a contact with a country code (+316.....) and the other omits it (06....). Please let me know if you need any more help in running this function, I'm not sure the above instructions are very clear. And of course, if you do manage to get it going I would love to hear the results. Good luck! |
Dear bepaald, ah, great that this feature already exists! I'll test it right now, but just to be on the safe side I'll ask the question first: What exactly can go wrong if there are foreign contacts with an international prefix in one of the backups and not in the other? Does the whole process then stop? In my case there are many threads with foreign phone numbers (partly only in the old backup, partly in the new and in the old one). Is there anything else to consider before I start? +41 / +43 / +44 / +17 / +35 / +33 / +21 / +39 etc. |
(some of the following is guesswork, as I said, the code is not extensively tested. Keep in mind that the input backups are opened read-only, so you really can't end up any worse than you start :) ) Well nothing can go wrong exactly, but signal internally uses the phone numbers as the contact id. It is by this id that this program matches the threads in the old and new databases. So nothing can go wrong, just if you have 0611111111 in one backup and +31611111111 in the other, the threads will not be identified as the same person and will not be merged, they will just turn into two separate threads. You could check this by running the tool with
I have very little experience with this, but I kind of expect it to work even if it's not a number, as long as it's the same string in both databases they hopefully get merged.
I can't think of anything else. I just tested merging the fixed backup I had truncated for testing. It seems to have worked fine, even with the missing attachments. There was also a service number in there which also seemed to work. Now all the messages in it are doubled (because I merged it with itself). I think the best way to get answers is to try it! I'm very curious myself actually, so if you have the time and feel up to it, please try it out. PS If you have a lot of threads, it might be tedious to write |
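To illustrate why the two spellings of one phone number end up in separate threads, here is a minimal Python sketch. It assumes, as described above, that contacts are matched by exact string comparison of the number. The helper `to_international` and its `default_country_code` parameter are hypothetical and are not part of signalbackup-tools; they only show what a manual normalization could look like.

```python
def same_contact_exact(a, b):
    """Exact string match, the kind of comparison described in this thread."""
    return a == b


def to_international(number, default_country_code="31"):
    """Hypothetical normalization: rewrite a national number to +<cc> form.

    Strings that do not look like a national or international number
    are returned with only separators stripped.
    """
    cleaned = "".join(ch for ch in number if ch.isdigit() or ch == "+")
    if cleaned.startswith("+"):
        return cleaned                      # already international
    if cleaned.startswith("00"):
        return "+" + cleaned[2:]            # 0031... -> +31...
    if cleaned.startswith("0"):
        return "+" + default_country_code + cleaned[1:]  # 06... -> +316...
    return cleaned


national = "0611111111"
international = "+31611111111"
print(same_contact_exact(national, international))
print(same_contact_exact(to_international(national), international))
```

The first comparison fails because the strings differ, even though they refer to the same person; after normalizing with an assumed Dutch country code they compare equal. The country code has to be known in advance, which is why this cannot be guessed reliably for a backup with contacts from many countries.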
@elbrutalo Did you try it out yet? Obviously, you do not have to if you don't want, but I just glanced over the code changes for the upcoming 4.48 version of Signal (currently in beta testing), and the changes will definitely break my current merging code. So, if you are going to try, please do so before the new version comes out and before updating (or wait until I've updated my code, but it could take quite a while). |
Dear bepaald,
thanks for reminding me of the upcoming changes. I’ll try it tonight and let you know!
Thank you so much!
Patrick
(Sent by email in reply to bepaald's comment of 25.09.2019, 20:41.)
|
Dear bepaald, The problem occurs both when I select all threads (1-587) and when I select only some threads (e.g. 1-4). Since the computer freezes I cannot send you the terminal log but instead only two screenshots of the frozen screen: https://www.dropbox.com/s/ekheqaz1v3yaia2/20190926_121934.jpg?dl=0 Do you have any idea how I can avoid this problem? Thank you so much. |
Hi, Thanks for trying! Not sure what's going on here, but I can imagine the machine is running out of memory. Do you think that is possible? Though I already had some low-memory options prepared, I never bothered to enable them. The merging code was very memory hungry (more than the other functions), and in combination with your huge backup file, I can imagine the machine is getting low on RAM. I've spent the last couple of hours enabling the low-memory mode for the merging routine (I had not maintained it properly, so it took a little work) and testing that I didn't break any other functionality. For my tests (merging 10 threads from two 96MB files) memory usage went from 520MB to around 150MB (also, it should go a bit quicker). Please try again with the current code, I hope it helps. If it still hangs, does it at least get further along?
EDIT: Thinking about it, the max RAM used is probably around the same as the total size of the two backups you are trying to merge (in the new version), so if you have less RAM than the size of your two backups combined you might still get in trouble.
EDIT2: I've further reduced memory usage, if RAM was the problem I don't think it can be anymore. Also, with the number of threads you are merging, output will be waaay too big to capture from the terminal, so if you append EDIT3: |
@bepaald Yep, they belong all to that thread :-)
and the unique IDs are exactly the missing attachment errors from #7 (comment) |
@Esokrates Ok good! Since the attachments are already missing anyway, the best thing to do is just delete these entries from the part database (they contain no other useful information):
I sometimes see messages for missing attachments (like in #7 (comment)) in my own backups, but they are not a problem (I think they're just deleted attachments, where the part-entry is not removed), but none of them have NULL for size, and that seems to be the problem in your case. Let me know if it's fixed! If it's not, we may need to do something with the mms messages these part-entries belong to (i really don't think that's necessary though). But I probably won't have anything new before tomorrow (if at all), because it's getting late here. |
@bepaald Thanks, I'll try that tomorrow, however could you tell me how to exclude Am I understanding this right, that this only deletes the attachments but not the text of the messages? |
Good catch! I'm glad one of us is paying attention :) Change the command to:
Correct, the part database does not hold any of that data. Each entry in the part database belongs to a message in the mms database which holds the actual message body (which might be empty, I often send pictures without any text). You can print out some info on the messages they belong to using this command (it will show the id, the message body, and the date(_received) in milliseconds):
Where that last number is the 'part.mid' (mid == mmsid) as you just posted them (#4 (comment)). Again, we are not actually touching these messages, but if you remember this info, you can hopefully see they are all still there after you restore the backup. To convert the date to a more readable format, you could probably just type Example:
The date string is localized so it will probably look different (more normal) to you when you do it on your own machine. |
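For readers who prefer a locale-independent result, here is a minimal Python sketch that converts a millisecond timestamp, as stored in the date fields mentioned above, to a UTC date. The timestamp value is made up for illustration and does not come from any backup in this thread.

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Made-up example value (milliseconds since 1970-01-01 00:00:00 UTC).
example_ms = 1566756000000
print(ms_to_utc(example_ms).isoformat())  # 2019-08-25T18:00:00+00:00
```

The key step is dividing by 1000 first, since most date tools expect seconds rather than milliseconds.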
@bepaald I tried it and now the thread doesn't crash anymore :-). Now it seems I have some zombie messages that display like this: Example 1: It looks like those are empty messages and in case of example 2 I can't even select that message in Signal, so could you help me find those empty messages and delete them? That would certainly make for a great general option for the tool like If I remember correctly those missing attachments were caused by the sender not being able to send the attached pictures so the spinner kept spinning indefinitely for the attached pictures. |
Yippie!
Hmmm.... I assume these zombie messages correspond to the ones we deleted the 'part' entries from (are there 8 of them, in that same thread, around that same date)? I'm not sure why the app thinks there is still an attachment for them though... I think it has something to do with the fact that the entries in the 'mms' table still have a non-zero 'part_count' field, could you run the following command to list all messages in the mms database with a part_count > 0 and no entry in the 'part' table:
IF the list is exactly the zombie messages (as far as you can tell, I hope the date and address help you determine this) you could delete this same selection of messages by running
That is probably useful info for the bug report you made (assuming someone will look at it sometime).
Excellent
Thank you for reporting and testing!
No, I've tried to write a function that will scan for doubles in your case, but it's hard to test properly. Are there many of these duplicate messages? Is it doable to check the results manually or are there way too many to do that? Try to run the program like this |
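The check described above (messages in 'mms' with part_count > 0 but no matching entry in 'part') can be sketched against a toy database. This is only an illustration in Python with an in-memory SQLite database: the two tables are heavily reduced stand-ins with made-up rows, and the query is one possible way to express the condition, not necessarily the command used in this thread.

```python
import sqlite3

# Toy stand-ins for the real tables; only the columns named in this thread.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE mms  (_id INTEGER PRIMARY KEY, body TEXT, part_count INTEGER);
    CREATE TABLE part (_id INTEGER PRIMARY KEY, mid INTEGER);
    INSERT INTO mms  VALUES (1, 'photo from the trip', 1),
                            (2, NULL, 1),
                            (3, 'text only', 0);
    INSERT INTO part VALUES (10, 1);   -- attachment row for message 1 only
""")

# Messages that claim to have attachments but have no row in 'part'.
ORPHANS = """
    SELECT m._id, m.body
    FROM mms AS m
    WHERE m.part_count > 0
      AND NOT EXISTS (SELECT 1 FROM part AS p WHERE p.mid = m._id)
    ORDER BY m._id
"""
orphans = con.execute(ORPHANS).fetchall()
print(orphans)
```

In this toy data only message 2 is reported: it has a non-zero part_count, no 'part' row, and an empty body, which matches the description of the zombie messages. Listing the rows first and deleting only after checking them by hand mirrors the cautious order suggested above.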
@bepaald First of all, let me say that you are amazing! :-) I would guess Strangely all of the matched messages have a timestamp of Aug. 25 between 18:00 and 22:00, so I see no logical reason for example 1. EDIT:
Is it expected that the address of 837-840 differ from the rest? |
Right, that's good, I actually expected that! They are indeed the same messages, with the NULL for data_size in the part table (which we deleted). You can see the same _id's in your previous message here: #4 (comment)
Yes, that looks correct to me, don't forget to add
You're really getting the hang of this! Soon you won't need my help! ;-)
Well, at least it's not expected that they are the same. In group threads incoming messages have the address of the sender, outgoing messages have the group_id as address, so it's normal for messages in group threads to have many different addresses (all participants' phone numbers and the group_id) |
Yeah I happily noticed that :-), but I am just testing to delete the 3 messages with empty body and see if it fixes the rendering issues, it'll take some time as the files are big.
You're great thanks, but luckily that number is from the old backup and nonexisting anymore :-). |
@bepaald Good news: Deleting the 3 messages with NULL body indeed fixed the problem :-). Regarding the duplicate function, it complains about syntax errors:
Could you tell me how to query sms messages? |
That's great! So, just the duplicate messages left to fix right?
I think there are some issues when the message bodies contain quotes... I'll work on a fix and let you know when I have something.
Well, they are in a table called 'sms', so a simple "SELECT * FROM sms" will print everything (but that's a lot). Messages have a 'type' which is a number which can be bitmasked to get certain properties (see: https://github.com/signalapp/Signal-Android/blob/master/src/org/thoughtcrime/securesms/database/MmsSmsColumns.java#L27). My guess (and I tested this on my own databases, but I have only few examples) is that for signal messages the 'SECURE_MESSAGE_BIT' is set and for normal sms messages it is not. So to get all sms messages in your database (the 4953 you found in your message above), you would do "SELECT * FROM sms WHERE (type & 0x800000) IS 0" |
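A minimal Python sketch of the bitmask test described above, run against a toy in-memory 'sms' table. The mask value 0x800000 is taken from this thread; the table layout is reduced to three columns, and the low type bits (0x17) and message bodies are made up for illustration.

```python
import sqlite3

SECURE_MESSAGE_BIT = 0x800000  # mask value as quoted in this thread


def is_secure(msg_type):
    """True if the secure-message bit is set in a message's 'type' field."""
    return (msg_type & SECURE_MESSAGE_BIT) != 0


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sms (_id INTEGER PRIMARY KEY, body TEXT, type INTEGER)")
con.executemany(
    "INSERT INTO sms VALUES (?, ?, ?)",
    [
        (1, "sent via Signal", SECURE_MESSAGE_BIT | 0x17),  # bit set
        (2, "plain sms", 0x17),                             # bit not set
    ],
)

# Same condition as the query above, with the mask passed as a parameter.
plain_ids = [
    row[0]
    for row in con.execute(
        "SELECT _id FROM sms WHERE (type & ?) = 0 ORDER BY _id",
        (SECURE_MESSAGE_BIT,),
    )
]
print(plain_ids)
```

As reported further down in this thread, the bit alone is not a fully reliable indicator on real data, so the selection should be inspected before anything is deleted.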
Ok, I've fixed the quote-issue. Still don't know if it'll be the solution, but at least it shouldn't give errors now. Please try again |
Yeah :-)
Tested this and it's not completely safe, I found some messages that were not sent unencrypted that are now marked unencrypted. That seems to have been a bug in Signal, because those messages were never delivered to my messages partner, they had only one tick indefinitely. This happened sometimes years ago, but luckily not very often. |
So something is wrong unfortunately. |
Ok, well that's not working. I have in my database 1 message, which I sent as signal message first, and then as a regular sms a minute or so later (in the same thread to the same person), so that's almost your situation I think, except the timestamp will be a bit different between the messages. Could you try to find one of your duplicate messages with some identifying body-text and print some info on them? An example with that one message in my database:
Please post the output, but remember there is a phone number and the message body in there, so censor those (but tell me if they are not the same). Hopefully this way we can see more easily what exactly is the same and what is different in these duplicate messages. (As you can see, in mine only the 'type' is different (and the timestamp), so that's why I thought that would be the way to identify your duplicates) |
This implies that the offset should be I think I'll be reasonably safe with
which outputs messages with ID from 4835 and 9646, so I guess actually it is safe to delete all messages with ID between those numbers. |
Now I am a bit puzzled: The IDs in the output are strictly monotonically increasing, the first message has ID 4835 and the last has ID 9657 which makes for 4812 messages, yet counting the lines with
gives me 4820. How is that possible? EDIT: Ah, I forgot about your debug output which I can't suppress sadly.
so I'm reasonably certain that it's safe to delete all of those, since the messages before 4835 are the same as the last of the matches. |
@bepaald Is it safe to specify an existing backup as output file? Will it safely overwrite that file or may that lead to something unexpected? |
I think I just found why the
Should be safe, as long as it's not the same as the input file (even that might actually work, I just wouldn't dare try it with important data). |
@bepaald The
overall to fix all mentioned problems. As far as I can see now everything works as I wanted it to do. So I'll stick to the merged backup file now. Is there anything I could test for you? I'll wait for your reply before upgrading to the latest Signal release. This won't be my final message here, but I want to say a very very big thank you for the amazing tool and your incredible support and efforts! As I said earlier, you are amazing! Kudos! |
Excellent! Very happy for you. I had forgotten that the plaintext backup would not export the raw status messages, but would decode them first. So indeed the body would not match the original body. I could have tried to write something for that as well, but your script will probably have been a lot quicker.
Thanks! I'm trying to think of something, but I really can't right now. You should probably just upgrade. Make sure to save a backup before updating, just in case something goes wrong during the database migrations the app is going to have to do coming from your old version.
No problem! You certainly had some unexpected problems and requests, but it's fun trying to get everything fixed and even more fun succeeding :-) Very happy to have been able to help! |
@bepaald The upgrade seems to have worked very well. :-) Only issue now is Signal making a backup every day and making one backup sucks battery badly because of the much bigger size now but I guess I'll just disable the backups and manually back up every now and then / request a feature to manually specify the backup interval. A couple of questions / thoughts regarding signalbackup-tools:
|
Good! I seem to remember an editable backup interval has been requested in the past, but nothing seems to have come of it. I would love that feature too though, I'd probably set it weekly.
I plan on it, the normal merge was already ported and I have ported the
Yes. It always used to be like that, but when I added support for the input and output to be an unencrypted directory, I had to remove that because when outputting to a directory, no password is given (as it's unencrypted) and I used the fact that no output password was supplied to determine the export should output to directory. Anyway, locally, I've already made the changes to have the program check the filesystem to find out whether output is a file or a dir, so now the output password is optional again. Also, the program now refuses to overwrite files unless the
Yes, I would love something like that. And I have given it quite a bit of thought as it would be very useful in the test scripts I run after big changes (that's why I haven't pushed, still waiting for the tests to finish). But really I have not come up with any (simple) way to guarantee that two backups are equal in all cases. In simple cases (decode the backup to directory, then pack the directory up using the same password) the backups are bit-for-bit identical so that's easy. When using some operations (split the backup in two parts, then combine them with In your specific use case, it might even be a little more difficult because you actually know and expect the files to be different. I will continue to think about this, but it will certainly take a while. |
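For the simple case mentioned above, where two backups are expected to be bit-for-bit identical, a comparison can be sketched in a few lines of Python by hashing both files in chunks. This covers only the easy case and says nothing about backups that are logically equal but stored differently. The file names and contents below are made-up placeholders written to a temporary directory.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large backups need little RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical(path_a, path_b):
    """True if both files have the same SHA-256 digest."""
    return file_digest(path_a) == file_digest(path_b)


# Demo with small placeholder files instead of real backups.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, data in [("a.backup", b"same"), ("b.backup", b"same"), ("c.backup", b"diff")]:
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(data)
        paths.append(path)
    same = identical(paths[0], paths[1])
    different = identical(paths[0], paths[2])

print(same, different)
```

Reading in fixed-size chunks keeps memory use flat regardless of file size, which matters for multi-gigabyte backups like the ones discussed in this thread.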
Since this issue has not had any comments in almost a year, just to clean up, I am closing this. Feel free to start a new one. |
Fix bepaald#9 Fix bepaald#85 Fix bepaald#70 Fix bepaald#53 Fix bepaald#38 Fix bepaald#4 Fix bepaald#1 Note: we could easily add support for g++-8 by copy/pasting the following changes: https://github.com/InfiniTimeOrg/InfiniSim/pull/83/files
Hello, when compiling under Fedora Live 30 I'm getting the following error:
I did the following steps on a Fedora30 on hardware:
$ git clone https://github.com/bepaald/signalbackup-tools.git
$ sudo dnf install gcc-g++ cryptopp-devel sqlite-devel
$ cd signalbackup-tools && chmod +x BUILDSCRIPT.sh
$ sh BUILDSCRIPT.sh
Can anyone help? My linux skills are limited.