App Data Backup/Restore #30

Closed
MuntashirAkon opened this issue Jul 3, 2020 · 23 comments
Labels
Documentation Improvements or additions to documentation Feature New feature or request Priority: 0 Highest priority

@MuntashirAkon
Owner

MuntashirAkon commented Jul 3, 2020

Backing up and restoring app data is not difficult in itself, but doing it properly is. In this issue, I shall describe how App Manager is going to handle app backups.

Backup Format

All data is backed up using standard Linux shell commands and stored as separate tarballs, one per directory type. Each tarball is then compressed with one of the supported compression methods (gzip for greater compatibility; bz2 might be considered in future). The advantage of this method is that the compressed file will be slightly to substantially smaller than an equivalent zip file and, more importantly, the file permissions are kept as is.
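To illustrate the point about permissions, here is a minimal sketch in Python (not what App Manager does, which relies on the shell `tar` command). It builds a gzip-compressed tarball in memory and reads the file modes back. The helper names `make_tarball` and `list_modes` and the sample file names are made up for this example.

```python
import io
import tarfile

def make_tarball(files: dict) -> bytes:
    """Create a gzip-compressed tarball; `files` maps name -> (mode, content)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, (mode, content) in files.items():
            info = tarfile.TarInfo(name=name)
            info.mode = mode          # permission bits are stored in the tar header
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()

def list_modes(archive: bytes) -> dict:
    """Return the permission bits recorded for every member of the tarball."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return {member.name: member.mode for member in tar.getmembers()}

# Hypothetical sample data, only for demonstration.
archive = make_tarball({
    "shared_prefs/settings.xml": (0o660, b"<map/>"),
    "databases/app.db": (0o600, b"\x00"),
})
modes = list_modes(archive)
```

The modes read back from the archive are the ones that were written, which is the property a plain zip archive does not guarantee.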

Metadata

It is important to preserve some metadata in order to show the user information about the package, to verify signatures, and to decide which folder goes where during the restore process. Each metadata item is a key-value pair, and all of them are stored in a single JSON file.
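As a rough sketch of what such a file could look like, the snippet below serializes a handful of the keys from the Metadata Keys list to JSON and parses them back. The values are invented, and the helper names `dump_metadata` and `load_metadata` are hypothetical; the real file contains all keys.

```python
import json

# Illustrative values only; the key names come from the Metadata Keys list.
metadata = {
    "label": "Example App",
    "package_name": "com.example.app",
    "version_name": "1.2.3",
    "version_code": 10203,
    "is_system": False,
    "data_dirs": ["/data/user/0/com.example.app"],
    "mode": 0,      # 0 = unencrypted backup
    "version": 1,   # metadata version
}

def dump_metadata(meta: dict) -> str:
    """Serialize the metadata as a single JSON document."""
    return json.dumps(meta, indent=2, sort_keys=True)

def load_metadata(text: str) -> dict:
    """Parse a metadata JSON document back into a dict."""
    return json.loads(text)

restored = load_metadata(dump_metadata(metadata))
```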

Metadata Keys

  1. label: (String) App Label
  2. package_name: (String) Name of the package
  3. version_name: (String) Version name of the package being stored
  4. version_code: (long) Version code of the package being stored
  5. source_dir: (String) Source directory
  6. data_dirs: (String[]) Data directories
  7. is_system: (boolean) Whether the app is a system app
  8. is_split_apk: (boolean) Whether the app consists of multiple apks
  9. split_names: (String[]) Name of each split
  10. split_sources: (String[]) Source of each split (same order as above)
  11. has_rules: (boolean) Whether there is a rules file for this app to apply after restoring the backup
  12. backup_time: (long) Backup time of the package being stored
  13. cert_sha256_checksum: (String[]) SHA-256 checksum of the signing certificate
  14. source_dir_sha256_checksum: (String) SHA-256 checksum of the compressed source file
  15. data_dirs_sha256_checksum: (String[]) SHA-256 checksum of the compressed data file (same order as data_dirs)
  16. mode: (int) Backup mode
    • 0 - Unencrypted backup
    • 1 - PGP encrypted backup
  17. version: (int) Metadata version (preserved for future, current value is 1)
  18. apk_name: (String) Name of the apk (usually base.apk)
  19. instruction_set: (String) Name of the instruction set (architecture) where the backup is made (e.g. arm64, x86_64, etc.)
  20. flags: (Integer) Backup flags
    • 0 - Do nothing (nothing is backed up/nothing to be restored)
    • 1 << 0 - Back up source files
    • 1 << 1 - Backup data directories (/data/user and /data/user_de)
    • 1 << 2 - Backup external data directories (e.g., /sdcard/Android/data)
    • 1 << 3 - Exclude cache (from all data directories)
    • 1 << 4 - Backup rules (blocking rules)
    • 1 << 5 - Skip signature verifications (applicable only for restore operation)
    • 1 << 6 - Backup only apk files (instead of the whole source). See Add a flag to backup only apk(s) instead of the whole source directory #64
    • 1 << 7 - Backup obb and media files (e.g., /sdcard/Android/obb, /sdcard/Android/media). See Add flag to choose which files on external storage to copy #65
    • 1 << 8 - Enable multiple backups (the backup will result in a new backup separate from the base backup)
    • 1 << 9 - Enable backups for all users
  21. user_handle: (Integer) The user (internally known as user handle) to whom the backup belongs. Additional options are only available in the App Info tab
  22. tar_type: (String) Compression method: z for GZip and j for BZip2
  23. key_store: (boolean) Whether the app uses Android KeyStore
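The flags value is a bit field, so several options are combined into one integer. A minimal sketch of how that could be modelled follows; the member names are my own labels for the bits listed above, not identifiers from App Manager.

```python
from enum import IntFlag

class BackupFlags(IntFlag):
    """Bit positions follow the flags list; the names are hypothetical."""
    SOURCE = 1 << 0           # back up source files
    DATA = 1 << 1             # back up data directories
    EXT_DATA = 1 << 2         # back up external data directories
    EXCLUDE_CACHE = 1 << 3    # exclude cache
    RULES = 1 << 4            # back up blocking rules
    SKIP_SIG_CHECK = 1 << 5   # skip signature verification (restore only)
    SOURCE_APK_ONLY = 1 << 6  # back up only apk files
    EXT_OBB_MEDIA = 1 << 7    # back up obb and media files
    MULTIPLE = 1 << 8         # enable multiple backups
    ALL_USERS = 1 << 9        # enable backups for all users

# Combine options with bitwise OR, store the plain integer in the metadata.
flags = BackupFlags.SOURCE | BackupFlags.DATA | BackupFlags.EXCLUDE_CACHE
stored = int(flags)            # 1 + 2 + 8
decoded = BackupFlags(stored)  # turn the stored integer back into flags
```

Checking an option on restore is then a membership test such as `BackupFlags.DATA in decoded`.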

Storage

Backup data is stored at the user's preferred location (in future) or at /sdcard/AppManager. Each package has its own dedicated folder named after its package name. The folder structure for each package is given below:

.
|-- package.name
|   |-- 0
|   |   |-- data0.tar.gz  (data dir at zeroth index)
|   |   |-- source0.tar.gz (source dir at zeroth index)
|   |   |-- keystore.tar.gz (optional KeyStore files)
|   |   |-- rules.am.tsv (optional rules file to be applied after restoring the app)
|   |   |-- perms.am.tsv (permissions)
|   |   `-- meta.am.v1
|   `-- 10
|       |-- data0.tar.bz2
|       |-- source0.tar.bz2
|       |-- rules.am.tsv
|       |-- perms.am.tsv
|       `-- meta.am.v1
|-- another.package
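The layout above can be expressed as a small path helper. This is only a sketch: `backup_dir` and `archive_path` are hypothetical names, the root is the default /sdcard/AppManager, and the extension is chosen from the `tar_type` metadata value (z or j).

```python
from pathlib import PurePosixPath

BACKUP_ROOT = PurePosixPath("/sdcard/AppManager")
# tar_type from the metadata: z for GZip, j for BZip2.
EXTENSIONS = {"z": ".tar.gz", "j": ".tar.bz2"}

def backup_dir(package_name: str, user_handle: int) -> PurePosixPath:
    """Directory holding one user's backup of one package."""
    return BACKUP_ROOT / package_name / str(user_handle)

def archive_path(package_name: str, user_handle: int, kind: str,
                 index: int, tar_type: str = "z") -> PurePosixPath:
    """Path of the archive for the source or data directory at `index`."""
    if kind not in ("source", "data"):
        raise ValueError(f"unknown archive kind: {kind}")
    name = f"{kind}{index}{EXTENSIONS[tar_type]}"
    return backup_dir(package_name, user_handle) / name
```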

Backup Process

The backup process begins by creating the metadata based on the user's preferences (described in the UI Design section below). This metadata is then used to back up the files. It is probably important to distinguish system applications, as their source directory can be /system/app or /system/priv-app, in which case the typical restore process may not work as expected. For source directories, the whole directory is backed up instead of just the apk files. This is helpful for people who use a patched odex file, for example. Split apks consist of several apk files, which require special attention (and possibly workarounds).

Restoring

Restoring is performed only after verifying the file contents and the signature (the latter only if the app is currently installed). In future, sensitive information such as checksums will be encrypted using the encryption key. The restore process is somewhat more complicated than the backup process due to several factors, mostly involving source files. First of all, the app has to be installed, updated or reinstalled depending on its current state:

  1. If the app is not installed, it has to be installed first.
  2. If it is already installed and the installed version is less than or equal to the version to be restored, it can simply be updated.
  3. Otherwise, the app has to be uninstalled (without deleting data if only an apk restore is requested) and then installed again.

After applying one of these operations (the app should be force-stopped), the source files have to be restored directly to the source directory and the data files to the data directories. For split apks, pm or cmd package install can easily be utilised.
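The three install cases can be summarised as a small decision function. This is a sketch of the rule as described here, comparing version codes; `InstallAction` and `choose_install_action` are hypothetical names.

```python
from enum import Enum
from typing import Optional

class InstallAction(Enum):
    INSTALL = "install"      # app is not installed yet
    UPDATE = "update"        # installed version <= backup version
    REINSTALL = "reinstall"  # installed version is newer: uninstall, then install

def choose_install_action(installed_version_code: Optional[int],
                          backup_version_code: int) -> InstallAction:
    """Pick the operation to run before restoring source and data files.

    `installed_version_code` is None when the app is not installed.
    """
    if installed_version_code is None:
        return InstallAction.INSTALL
    if installed_version_code <= backup_version_code:
        return InstallAction.UPDATE
    return InstallAction.REINSTALL
```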

UI Design

The backup/restore feature is part of batch ops and is also available individually in the App Info tab. Apps with existing backups show a backup sign at the bottom of their app icon in the Main window, along with the version info. For batch ops, a menu item called backup/restore data is added. Clicking it opens an alert dialog containing a multichoice list (apk, data, external data, exclude cache and blocking rules; all but external data are enabled by default) and three actions: backup (positive), restore (negative) and delete backup (neutral). In the App Info tab, a single backup/restore option is added to the menu, which opens a similar alert dialog. A tag describing the backed-up app version is also displayed in this tab.

Multiple user support (#70)

App Manager currently doesn't support multiple users. On devices with multiple users, restoring will simply fail (#69). Support for multiple users will be added in future.

How multiple user support works (out of order)

Currently, only user 0 is supported (it is also hard-coded, unfortunately), and the backups are stored at /AppManager/package.name/ as explained above. The idea is that data from other users will be saved in the same way but in another directory whose name is the user ID. An optional integer parameter user_handle will be added to the meta file to identify it. At the same time, a Multiple users flag (disabled by default) will be added to the backup/restore dialog (if multiple users are available) so that the user can choose to back up/restore data for multiple users.

Limitations & workarounds

If the user ID doesn't match, the restore process will fail. To solve this issue, an option could be added in settings so that the user can choose a replacement for the user. But this is only needed for Android phones with more than two profiles. If there are only two profiles and backup data is also available for two profiles, the unmatched profile will automatically be marked as matched and data will be restored for that particular user. This is useful for devices with a work profile, for example.
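One possible reading of this matching rule is sketched below. The function name `map_backup_users` and the `replacements` parameter (standing in for the proposed settings option) are assumptions for illustration, not existing App Manager code.

```python
from typing import Dict, Iterable, Optional

def map_backup_users(device_users: Iterable[int],
                     backup_users: Iterable[int],
                     replacements: Optional[Dict[int, int]] = None) -> Dict[int, int]:
    """Map each user ID found in the backup to a user ID on the device."""
    device = list(device_users)
    backup = list(backup_users)
    replacements = replacements or {}
    mapping: Dict[int, int] = {}
    unmatched = []
    for user in backup:
        if user in device:
            mapping[user] = user                  # same ID exists on the device
        elif replacements.get(user) in device:
            mapping[user] = replacements[user]    # replacement chosen by the user
        else:
            unmatched.append(user)
    # Two profiles on both sides: the single leftover profile is matched automatically.
    free = [user for user in device if user not in mapping.values()]
    if len(device) == 2 and len(backup) == 2 and len(unmatched) == 1 and len(free) == 1:
        mapping[unmatched.pop()] = free[0]
    if unmatched:
        raise LookupError(f"no matching device user for backup users {unmatched}")
    return mapping
```

For example, a backup taken for users 0 and 11 restores onto a device with users 0 and 10 (such as a work profile), while a device with three users needs an explicit replacement.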

Multiple backup support (#87)

This implementation is similar to the above except that the folder names will begin with a non-digit letter or symbol to distinguish between the two. If the user chooses to make multiple backups via a batch operation, they will be asked to pick a name for the backups (all packages use the same name inside their respective directories). If the name begins with a digit (despite the warning), an underscore will be added as a prefix, and if the name already exists, a number will be added at the end. Unlike the current behaviour (which is still experimental), each package has folders inside it where the backups are kept. The 0 folder, for example, is used to keep the backups of user 0 (this is decided from the current user handle). However, when restoring backups via batch operations, only backups from user 0 will be restored. In routine operations, the user can choose which backups to restore by specifying a name in the backup profile (if the named backup doesn't exist, restoring is skipped for that package). Other backups can only be restored from the respective App Info tab.
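The naming rule could look roughly like this. The helper `normalize_backup_name` is hypothetical, and the `_<number>` form of the suffix is my assumption, since the exact format of the appended number isn't specified here.

```python
def normalize_backup_name(name: str, existing: set) -> str:
    """Return a folder name that does not start with a digit and is not taken."""
    if not name:
        raise ValueError("backup name must not be empty")
    if name[0].isdigit():
        name = "_" + name  # digit-only prefixes are reserved for user IDs
    if name not in existing:
        return name
    counter = 1
    # Assumed suffix format: name_1, name_2, ...
    while f"{name}_{counter}" in existing:
        counter += 1
    return f"{name}_{counter}"
```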

Misc

Handling large backup files (#60)

A FAT32 partition has a 4GB file size limit, which is a problem for apps with large chunks of data. Although it is highly unlikely for a single app to use such a huge amount of storage, it cannot be ignored. Besides, having a maximum archive size increases portability, as a user might want to transfer files to a computer, the cloud or even a different device as part of upgrading or migrating. To achieve that, each archive can be at most 1GB in size. Since we're using tarballs, it can be done like this:

tar -czf - directory/to/backup | split -b 1G - directory/with/backup/prefix.

Extracting this tar is easy:

cat /directory/with/backup/prefix* | tar -xzf - -C /

Since existing backups already use the .tar.gz extension, the * simply matches nothing for them during restore, so unsplit archives are still picked up.
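For readers who prefer to see the idea outside the shell, here is a minimal Python sketch of the same split-and-rejoin approach. It is not a replacement for the commands above: the helper names are hypothetical, the parts get numeric suffixes (unlike the aa, ab, ... suffixes of `split`), and the part size is tiny so the demo runs instantly.

```python
import io
import tarfile
import tempfile
from pathlib import Path

def split_stream(stream, prefix: Path, part_size: int) -> list:
    """Write `stream` into files <prefix>.0, <prefix>.1, ... of at most `part_size` bytes."""
    parts = []
    while True:
        # A real implementation would copy in small buffers instead of
        # holding a whole part in memory.
        chunk = stream.read(part_size)
        if not chunk:
            break
        part = Path(f"{prefix}.{len(parts)}")
        part.write_bytes(chunk)
        parts.append(part)
    return parts

def join_parts(parts) -> bytes:
    """Concatenate the parts in order, like `cat prefix*`."""
    return b"".join(part.read_bytes() for part in parts)

# Demo: build a small tar.gz in memory, split it into 64-byte parts, rejoin it.
payload = io.BytesIO()
with tarfile.open(fileobj=payload, mode="w:gz") as tar:
    content = bytes(range(256)) * 8
    info = tarfile.TarInfo("data/file.bin")
    info.size = len(content)
    tar.addfile(info, io.BytesIO(content))
archive = payload.getvalue()

with tempfile.TemporaryDirectory() as tmp:
    parts = split_stream(io.BytesIO(archive), Path(tmp) / "data0.tar.gz", part_size=64)
    part_sizes = [part.stat().st_size for part in parts]
    rejoined = join_parts(parts)
```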

Checksum will be the combination of all of these files.
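How the files are combined isn't spelled out here. One straightforward option, assumed in the sketch below, is to feed all parts in order into a single SHA-256 digest, which gives the same result as hashing the unsplit archive. The function name `sha256_of_parts` and the file names are made up.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_parts(parts, block_size: int = 65536) -> str:
    """Hash all parts in order as if they were one continuous file."""
    digest = hashlib.sha256()
    for part in parts:
        with open(part, "rb") as handle:
            # Read in fixed-size blocks so large parts never sit fully in memory.
            for block in iter(lambda: handle.read(block_size), b""):
                digest.update(block)
    return digest.hexdigest()

# Demo with two tiny stand-in parts.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "data0.tar.gz.0"
    second = Path(tmp) / "data0.tar.gz.1"
    first.write_bytes(b"first half,")
    second.write_bytes(b" second half")
    checksum = sha256_of_parts([first, second])
```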

Platform dependency

Since the whole source directory is being backed up, the app restoration process is largely platform dependent if not handled properly. As stated in #58, oat, lib and lib-32 are largely platform dependent (regardless of whether the app is bundled or not). Hence, if the platform of the backup doesn't match the current platform, only the app will be restored instead of the entire source directory. To circumvent this problem, two more keys (namely instruction_set and apk_name) are added (see the Metadata Keys section above for their descriptions).
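A sketch of how the two keys could drive that decision is given below. It assumes "only the app" means the file named by apk_name, ignores split apks for brevity, and uses a hypothetical helper name and sample file list.

```python
from typing import Dict, List

def source_files_to_restore(meta: Dict, device_instruction_set: str,
                            backed_up_files: List[str]) -> List[str]:
    """Restore everything on a matching platform, otherwise only the apk."""
    if meta["instruction_set"] == device_instruction_set:
        return list(backed_up_files)
    # Platform mismatch: oat, lib, etc. would not be usable on this device.
    return [meta["apk_name"]]

# Hypothetical backup contents for demonstration.
meta = {"instruction_set": "arm64", "apk_name": "base.apk"}
files = ["base.apk", "oat/arm64/base.odex", "lib/arm64/libexample.so"]
same_platform = source_files_to_restore(meta, "arm64", files)
other_platform = source_files_to_restore(meta, "x86_64", files)
```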

Encryption

Encryption support will be added shortly. Since the backup data is stored in external storage, it's not secure to store sensitive files without encryption. Therefore, once support for encryption is added, the unencrypted backup option will be removed (unencrypted backups can still be restored). Support for encryption may include OpenKeychain (the most secure option), RSA-4096 and AES-256. Passwords have to be set for the latter encryption methods and will not be remembered by App Manager (but will be remembered for a single session, i.e. as long as the app isn't destroyed, if the user chooses to do so).

Data Integrity (fixed)

It is essential to preserve data integrity. When an app backup is requested, the existing backup is deleted before the new backup is written directly to the same directory. If the new backup fails, there is no way to recover the old backup. The alternative is to write the backup to a separate directory first and, once the new backup is complete, replace the old backup with the new one. This could be done in many ways, but an efficient way is to write the new backup inside the backup directory (for an app package.name the new backup directory is /sdcard/AppManager/package.name~) and, upon completion, delete the old backup and rename the new directory (i.e. mv /sdcard/AppManager/package.name{~,}). This way the old backup will not be lost if the new backup fails. Another similar but quite fatal issue occurs when restoring fails, as there's no way to recover the original data if this happens. Sadly, there's no easy way to solve it (I can take a backup of the current app and its data, but it might result in an infinite loop if it fails continuously).
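The write-then-rename approach can be sketched as follows. This is an illustration in Python rather than the actual implementation; `replace_backup` and the writer callbacks are hypothetical, and a temporary directory stands in for /sdcard/AppManager.

```python
import shutil
import tempfile
from pathlib import Path

def replace_backup(backup_dir: Path, write_backup) -> None:
    """Write a new backup into `<name>~` and swap it in only on success."""
    staging = backup_dir.with_name(backup_dir.name + "~")
    if staging.exists():
        shutil.rmtree(staging)       # leftover from an earlier failed attempt
    staging.mkdir(parents=True)
    try:
        write_backup(staging)
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)
        raise                        # the old backup is still untouched
    if backup_dir.exists():
        shutil.rmtree(backup_dir)    # delete the old backup only now
    staging.rename(backup_dir)

# Demo: a failing backup keeps the old data, a successful one replaces it.
root = Path(tempfile.mkdtemp())
target = root / "package.name"
target.mkdir()
(target / "meta.am.v1").write_text("old")

def failing_writer(directory: Path) -> None:
    raise OSError("simulated failure")

def good_writer(directory: Path) -> None:
    (directory / "meta.am.v1").write_text("new")

try:
    replace_backup(target, failing_writer)
except OSError:
    pass
after_failure = (target / "meta.am.v1").read_text()

replace_backup(target, good_writer)
after_success = (target / "meta.am.v1").read_text()
shutil.rmtree(root)
```

There is still a short window between deleting the old directory and renaming the new one, so this narrows the risk rather than removing it entirely.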

Scheduling (part of #61)

Backup scheduling is considered an important option by many and will be added in a future release.

Split apk support (related to #49)

Support for split apk is added in 5f190a4.

App crashing after restore

Due to several reasons, an app may crash such as:

Compression methods

gzip and bz2 (xz isn't supported by toybox and therefore won't be added).

Android Lollipop (API < 23) support

Since toybox was introduced in API 23, the tar command is not available before API 23 and needs additional support. A possible solution is to ask users to install busybox tools at a preferred location. App Manager is going to supply toybox itself (see #84).

@MuntashirAkon MuntashirAkon added the Feature New feature or request label Jul 5, 2020
@Atrate
Contributor

Atrate commented Jul 14, 2020

This codebase might come in handy:

https://github.com/machiav3lli/oandbackupx

@MuntashirAkon
Owner Author

This codebase might come in handy:

https://github.com/machiav3lli/oandbackupx

Unfortunately, no. OAB* have completely ignored many security issues like signing info verification. OABX has an encryption method which is not convenient (I don't want to give users an illusion of security) and uses a traditional zip archive, which doesn't preserve any permissions. The backup process that I intend to utilise will be very different from what the OAB* authors have implemented.

@Atrate
Contributor

Atrate commented Jul 14, 2020

Would you also consider giving the user a choice when deriving asymmetric/symmetric keys? Like a choice between AES-128/192/256 or RSA-2048/4096 or ECC (preferably Curve25519, no?)

@MuntashirAkon
Owner Author

Would you also consider giving the user a choice when deriving asymmetric/symmetric keys? Like a choice between AES-128/192/256 or RSA-2048/4096 or ECC (preferably Curve25519, no?)

Probably. But it depends largely on how much effort it would take to do that. I'll be supporting OpenKeychain as well if they care to update their API to support AndroidX. I never had to work with real-life encryption before so it'll take some time to understand how the APIs work. But since it's Java, the purest OOP, I expect it to be easy.

@MuntashirAkon
Owner Author

MuntashirAkon commented Jul 25, 2020

@MuntashirAkon MuntashirAkon added this to the v2.5.10 milestone Jul 25, 2020
@Atrate
Contributor

Atrate commented Jul 25, 2020

Suggestion: add the possibility to schedule backups (e.g. 2 times a week, only when charging or only when battery ≥ 80%)

@MuntashirAkon MuntashirAkon modified the milestones: v2.5.10, v2.5.11 Jul 28, 2020
@Atrate
Contributor

Atrate commented Jul 29, 2020

It should also be considered whether AM should offer to back up .odex files, see NeoApplications/Neo-Backup#70

@MuntashirAkon
Owner Author

It should also be considered whether AM should offer to back up .odex files, see machiav3lli/oandbackupx#70

As described in the first comment, I'll back up the entire source directory instead of just the apk file. This way you wouldn't need to worry about your patched odex, split apk or anything.

@MuntashirAkon MuntashirAkon added the Documentation Improvements or additions to documentation label Aug 2, 2020
@MuntashirAkon MuntashirAkon modified the milestones: v2.5.11, v2.5.12 Aug 3, 2020
MuntashirAkon added a commit that referenced this issue Aug 6, 2020
@MuntashirAkon MuntashirAkon modified the milestones: v2.5.12, v2.5.13 Aug 13, 2020
MuntashirAkon added a commit that referenced this issue Aug 13, 2020
Added `apk_name` and `instruction_set` metadata to store app name and architecture respectively. Fixed app data backing up/clearing all applications if the source directory is /data/app.

Closes #63, Related issue #30
Repository owner locked and limited conversation to collaborators Aug 14, 2020
@MuntashirAkon
Owner Author

Would you also consider giving the user a choice when deriving asymmetric/symmetric keys? Like a choice between AES-128/192/256 or RSA-2048/4096 or ECC (preferably Curve25519, no?)

It seems adding support for encryption (other than OpenPGP) requires some work. Since they only rely on keys, it is necessary to secure the key itself which needs a lot of work and I don't have any good examples other than OpenKeychain itself.

@Atrate
Contributor

Atrate commented Sep 29, 2020

Since they only rely on keys, it is necessary to secure the key itself

Android Keystore maybe? Or only enable the option if the keystore is hardware-backed.

@MuntashirAkon
Owner Author

Android Keystore maybe?

Yeah, I was thinking the same thing.

But there are other security issues as well. For example, the backups have to be written to disk before encrypting them which is insecure because a bad app can exploit it easily enough. If I could somehow redirect the tar streams directly to the crypto utils (without writing to a file), that would be more secure. While this isn't impossible, it requires some modifications in the libsu library.

@MuntashirAkon
Owner Author

8d92570 marks the end of this issue. From now on, issues on backup will be discussed separately in their respective issue.

@MuntashirAkon MuntashirAkon unpinned this issue Oct 3, 2020
@MuntashirAkon
Owner Author

KeyStore has another problem: If the user has set password in the OpenPGP client, s/he will be prompted to provide the password. But KeyStore doesn't have such option. So, there's no protection when decrypting the backup. Any app with accessibility permission can in theory launch AM automatically and decrypt the backup files. So, AM needs to implement a custom app lock as well.

@nerd190

nerd190 commented Oct 6, 2020

Hey, sorry I haven't been active in some time now, been hella busy, but I have been keeping up with the project.
How about OpenSSL? a library like this: https://github.com/leenjewel/openssl_for_ios_and_android/blob/master/README.md

My idea?

Use binaries instead:

  • OpenSSL.
  • GnuPG.

If you added a tiny shell (right now the smallest embeddable shell would be the 'NetHunter' terminal, an up-to-date version of JackPal's 'Terminal Emulator' used by Kali NetHunter Android; it's way less than half a megabyte and just needs some 'NetHunter' branding stripped from the res folder to be ready to use), this way 'AM' can fire more shell commands without using root access, like Termux, whilst using actual OpenSSL/GnuPG binaries and skipping the BS that is 'OpenKeyChain' (whilst it's the best so far for Android, that doesn't really say much; there's not much competition for it!). Plus it could run bash rather than mksh, though all backups etc would have to be in mksh syntax obviously.

Why?
People have suggested a terminal emulator & no-root options too. Whilst they are things I do not particularly need, if a shell with OpenSSL/GnuPG can help us encrypt, then this addresses the problem, whilst also giving those who want terminal/no-root options what they want too. Plus there are added benefits, like being able to add other binaries, e.g. sqlite3 for organising meta. I understand that whilst shells can read json, they weren't designed to, so it can be difficult, and most people use a lib to help out with this (or at least I do when making terminal pkgs; I use YAML for this reason). sqlite3 has been good when I can use it though, e.g. if I know it's for Termux, where it is available. Although you probably use Java for querying the json, it's just an example of the other options that open up with a more shell-based backend that can provide utilities that only us rooted users can install to /system.

@MuntashirAkon
Owner Author

MuntashirAkon commented Oct 6, 2020

My idea?
Use binaries instead:

A terminal emulator is beyond what App Manager stands for. Termux is currently struggling with compatibility problems even after dropping support for several Android versions (this time the problem is with Android 11). Therefore, no terminal emulator will be added.

'AM' can fire more shell commands without using root access, like Termux

AM is already running shell commands without root access. Running a shell command is like creating a new process. There's nothing special about shell commands. Since toybox is now built-in, AM can run most Linux commands without problems.

a library like this: https://github.com/leenjewel/openssl_for_ios_and_android/blob/master/README.md

These libraries only support API 23 or later. AM supports API 21 or later. So, they can't be integrated into AM. If you'd followed recent activities, you'd have known that I've already spent a lot of time integrating the toybox binary and I ended up modifying toybox source code in order for it to work up to Android 11 (I've also considered busybox which is even worse).

if a shell with OpenSSL/GnuPG can help us encrypt, then this addresses the problem

They will create an even greater problem. OpenPGP clients such as OpenKeychain are considered (besides the fact that it's audited software) because it's very difficult to store keys without a substantial knowledge of cryptography (I only have a basic knowledge of this field). Using these binaries requires AM to store and manage these keys and provide users the necessary UI to import, export or modify them. In other words, AM has to support cryptographic features similar to OpenKeychain. I'm currently designing something similar in #116 but the features are very limited and the supported encryptions are AES, RSA and EC. This is mainly for people who do not use an OpenPGP client and need an encryption solution that somewhat works.

it could run bash rather than mksh, though all backups etc would have to be in mksh syntax obviously.

This isn't a problem for me. I usually do the necessary processing in Java instead of shell as shell is very slow.

@nerd190

nerd190 commented Oct 6, 2020

Just an idea is all! I use openssl & opengpg freely within terminal emulators, they work both in or out of Termux, making encryption easy, since then I have stopped using OpenKeyChain.
When I looked into Android encryption, it's difficult, it's a mess, unlike on Linux, and I thought, if I had to do it, a small bash shell with openssl/gnupg (a few MBs) would've been the easiest approach to do so, also opening up further opportunity, like mosh, ssh, rsync, rclone etc. etc., all of which run with or without Termux. I saw that Kali's is the smallest available that is updated regularly. Whilst I don't need another "terminal emulator", an independent "shell" that you can add what you see fit to is different: the user doesn't need a UI, as AM will fire the commands, not the user at a command prompt. The user just sees AM's GUI and clicks a button; these buttons are commands that AM completes, no "gpg --list-long ass -command here".
Again, just one solution of many, I thought personally (as I was going to) a tiny shell with proper binaries is easiest, these binaries are used millions of times, every day for years, contributed by thousands, minimal bugs, bugs get fixed quick...
Unlike OpenKeyChain, tink, blahblah... that are MUCH newer, made and fixed by few, have more bugs that take longer to fix, and are simply "a java compatibility layer" to bring some of the functionality that the binaries I named do. e.g. OpenKeyChain is just a java layer for a limited version of OpenGPG etc etc.
I thought, may as well use the binary.
Yes, Android has done something shitty with 11, but Termux has ideas that should continue full functionality, but again, this is only a worry for unrooted devices, I have OpenGPG, OpenSSL, SSH, Rsync, Rclone etc installed on my device already.

@MuntashirAkon
Owner Author

MuntashirAkon commented Oct 6, 2020

Unlike OpenKeyChain, tink, blahblah... that are MUCH newer, made and fixed by few, have more bugs that take longer to fix, and are simply "a java compatibility layer" to bring some of the functionality that the binaries I named do. e.g. OpenKeyChain is just a java layer for a limited version of OpenGPG etc etc.

OpenKeychain isn't a Java compatibility layer. It uses Bouncy Castle library. Bouncy Castle is a widely used library and is as old as GnuPG itself.

@Cyberavater

Cyberavater commented Mar 6, 2021

It should also be considered whether AM should offer to back up .odex files, see machiav3lli/oandbackupx#70

As described in the first comment, I'll back up the entire source directory instead of just the apk file. This way you wouldn't need to worry about your patched odex, split apk or anything.

As I mentioned here, NeoApplications/Neo-Backup#70 (comment) ; backing up (patched) odex doesn't work. Maybe consider not backing them up, since it's useless and can save up some space? (i.e only backing up split apk)

@MuntashirAkon
Owner Author

The apk only flag is enabled by default which means odex/vdex won't be backed up by default. It's obviously possible to use patched odex files. Consider opening a separate issue along with what you've done so that I can further look into it in future.

@Cyberavater

The apk only flag is enabled by default which means odex/vdex won't be backed up by default. It's obviously possible to use patched odex files. Consider opening a separate issue along with what you've done so that I can further look into it in future.

So, what I'll have to do is, unmark apk only, to backup odex and reuse them? Ok, I'll try it and if no luck, I'll file a new issue.

@MuntashirAkon
Owner Author

So, what I'll have to do is, unmark apk only, to backup odex and reuse them?

You have to unmark APK Only during both backup and restore.

@Cyberavater

So, I've now tried the latest version of SB specifically for this one app/issue #473

I found out that SB encrypts internal and external data (the AM equivalents) by default, and this can't be disabled even if the user wants to.

image

Now, something similar was already planned by MA (for which I argued with him, sorry). The thing I didn't like about encryption is that you need to remember a password (at least that's what I knew back then), but if it's implemented the way SB does it (i.e., the user doesn't need to remember anything), then I guess encrypting sensitive data by default won't be a bad idea.

It's possible that MA already had it planned somewhat similarly, I've just added this example here just in case.

Of course, AM can't be 100% like SB, as SB can get its data from Firebase, but AM would also have its own way of handling its data (#237).

@MuntashirAkon
Owner Author

If you want encryption or verification, you have to use a password or secure hash. I agree that it's not possible to remember all the passwords you use, but that's why password managers exist. This is also what we want people to get familiar with, especially in a time when data breaches have become so prevalent.

4 participants