Duplicati creates large files in the roaming part of the user profile on Windows #2222

Closed
1 task done
asteppke opened this issue Jan 2, 2017 · 17 comments

Comments

@asteppke

asteppke commented Jan 2, 2017

I have:

  • searched open and closed issues for duplicates

Version info

Duplicati Version: 2.0.1.32_canary_2016-11-12
Operating System: Windows 7 64
Backend: file

Bug description

The Windows roaming profile grows large after prolonged use of Duplicati. Login on computers that are part of a domain takes a long time because of the large profile. The cause is the files BIMOMVVWCK.sqlite (400 MB) and GBGIAIXHXI.sqlite (100 MB) in the folder C:\Users\username\AppData\Roaming\Duplicati.

Steps to reproduce

  • use current version of Duplicati on a Windows computer that is part of a domain network
  • backup files for several weeks

Actual result: Roaming profile increases in size, increased network traffic, long login delays
Expected result: Duplicati SQL database should not be part of the roaming profile. It belongs to the local machine. It should therefore be stored in C:\Users\username\AppData\Local\Duplicati instead.

@asteppke
Author

The same issue occurred in the earlier Duplicati 1.3 and has been fixed (github issue). Please also apply the same fix, i.e. using the local instead of roaming profile to store large database files.

agrajaghh added this to the mini issues milestone Jan 24, 2017
@FlohEinstein

FlohEinstein commented Jan 31, 2017

Fixed it, but be aware, you have to move your configuration files from the AppData\Roaming\Duplicati to AppData\Local\Duplicati manually if you install this over an existing installation.
(and sorry for the two commits, my bad connection...)

@ptar

ptar commented Feb 2, 2017

Issue #1757 asks that Duplicati ignore its own (locked) files (or handle them gracefully).
Please check that the fix for this issue does not interfere with #1757 (or whether this fix could possibly solve #1757).
Thanks!

@FlohEinstein

FlohEinstein commented Feb 3, 2017

Since with this fix the sqlite files are saved in AppData\Local instead of AppData\Roaming, it will solve #1757 if the user only ticks the box for "Application Data" and leaves "Local Application Data" unticked.
If programs are well written, AppData\Local should not contain anything that needs to be backed up.

@kenkendk
Member

kenkendk commented Feb 3, 2017

I think we need some graceful handling of upgrades. If the user has checked %APPDATA%, it would be a nasty surprise if this folder is no longer being backed up.

@FlohEinstein: Your fix only renames the variable? Is there a conflict with %APPDATA% such that the rename fixes it? If we rename the variable could there be cases where the data is no longer backed up when users upgrade?

For the original issue, the logic is here:
https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Main/DatabaseLocator.cs#L40

The logic should be something like:

  • Check if the desired file exists in the new location (local)
  • Check if we are on Windows, and the desired file is in the old location (roaming)
  • If not found, create in the new location (local)

@kenkendk
Member

kenkendk commented Feb 3, 2017

Would it be too aggressive to move the database file from roaming to local if it is found?

@kenkendk
Member

kenkendk commented Feb 3, 2017

Maybe we should also move the lock file that prevents two instances from running:
https://github.com/duplicati/duplicati/blob/master/Duplicati/Server/SingleInstance.cs#L120

This is a bit harder to decide on, because we really want to "just change it", but on the other hand we would like upgrades to guarantee that there are not two instances running at the same time.

@FlohEinstein

@kenkendk It's not a renaming of the variable, it's what variable is read from Windows. I first thought it would be enough. But that way, the AppData\Roaming folder would not be selectable for backup anymore. That's why I made a better version with comments.
https://github.com/FlohEinstein/duplicati/blob/patch-2/Duplicati/Server/SpecialFolders.cs

Databases are saved into "Duplicati" in the location of where SpecialFolder.ApplicationData points to. While this might be alright for *nix, Windows has 3 different folders (I decided to only use the two that are usually used). When we change SpecialFolder.ApplicationData to point to "AppData\Local" instead of "AppData\Roaming", the databases, config and lock get saved to the right place, however the AppData\Roaming-content isn't backed up anymore. So I added another value to represent this folder.

The normal procedure for a user would be not to back up "AppData\Local", since among other things it contains unnecessary data such as the whole IE/Chrome/Firefox cache: thousands of files. Only "AppData\Roaming" contains the important stuff. Or at least it should.

@kenkendk
Member

kenkendk commented Feb 3, 2017

I like the new version better, perhaps more confusing for the average user though?

Changing SpecialFolder.cs, only affects the location the user can pick, and where that location maps to.

If you look at the code that picks a database, it uses Environment.GetFolderPath, so your patch will not change where the database is stored:
https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Main/DatabaseLocator.cs#L48

@FlohEinstein

FlohEinstein commented Feb 3, 2017

You sure? It should...
System.IO.Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData), "Duplicati") gives you
"C:\Users\Username\AppData\Roaming\Duplicati" with the old code and
"C:\Users\Username\AppData\Local\Duplicati" with my new one
compare "echo %LOCALAPPDATA%" and "echo %APPDATA%" at the commandline


If you want to receive the old folderpath in the new, you need to use
System.IO.Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "Duplicati")

EDIT: Disregard that, you're right... I have to rethink this

@FlohEinstein

OK, I think I got it.
a) The second fix that I added allows the Windows-User to tick/untick all the AppData-Folders. Nice to have, but you're right, it might confuse them.
b) What we need is a fix on https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Main/DatabaseLocator.cs#L48. There it should say
var folder = System.IO.Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "Duplicati")

Since we also have #1352 which is kind of the same problem on Synology, I suggest we address this as a larger project.
a) Allow selection of a database-default folder during setup (defaulting to AppData\Local\Duplicati under Windows)
b) Moving existing DBs to this folder (maybe this needs changes in dbconfig.json too?)

@kenkendk
Member

kenkendk commented Feb 3, 2017

The server and GUI already use the environment variable Duplicati_HOME to decide where to store data, and default to %APPDATA%:
https://github.com/duplicati/duplicati/blob/master/Duplicati/Server/Program.cs#L18

I think that could be applied to the dbconfig.json as well as the databases, such that it looks in Duplicati_HOME and if not supplied, in the LocalApplicationData folder.
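
A minimal sketch of that lookup, written in Python purely for illustration (Duplicati itself is C#). The function name `data_folder` and the dictionary standing in for the process environment are hypothetical; only the names Duplicati_HOME and LOCALAPPDATA come from this thread.

```python
import os


def data_folder(env):
    """Return the folder for config and databases (hypothetical helper).

    `env` is a mapping such as os.environ.
    """
    # An explicit Duplicati_HOME override always wins.
    override = env.get("Duplicati_HOME")
    if override:
        return override
    # Otherwise fall back to the per-machine (non-roaming) profile folder.
    return os.path.join(env["LOCALAPPDATA"], "Duplicati")
```

Calling `data_folder(os.environ)` on Windows would then give the override if one is set and the local profile folder otherwise.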

The trick is to make sure that existing users are not in for a surprise when they upgrade. Everything should preferably work right away.

I suggest that Duplicati looks in the LocalApplicationData folder first, and if it does not find what it needs, it probes the old ApplicationData folder. This ensures that upgrades will work, and some downgrades will work fine as well.

@kees-z

kees-z commented Feb 6, 2017

This is an interesting discussion.
Indeed, in a domain environment, it is not convenient that the database files and Duplicati config are stored in the roaming profile.

Maybe it is even dangerous:
If a user with a roaming profile logs in on another PC in the same domain, the complete Duplicati config is currently copied to that workstation. The local disks of that workstation probably have a completely different folder structure. If Duplicati is installed on that PC, the backup task will run, detect that many or all source folders do not exist, assume they were deleted, and mark them as deleted at the backend. Am I right with this assumption?

Moving the config to the local APPDATA folder fixes this, but some new questions arise about this:
A standard Duplicati installation is intended to back up files on the local host, not files of a single local user. When scheduling a backup task, the user would normally expect the backup to run regardless of which user is logged on.
With the default setup, Duplicati puts a shortcut to Duplicati.GUI.TrayIcon.exe in the user's Startup folder, and Duplicati looks for config files in that user's APPDATA folder (roaming or local).
So if that user (let's call this user User1) logs out and User2 logs in to that computer, none of User1's scheduled tasks will run until User1 logs in again, at which point User2's scheduled tasks will stop.

The solution I use in my environment is to run Duplicati as a service with the SYSTEM account (which also makes using VSS possible, regardless of the user's permissions). But Duplicati should run all scheduled tasks in a standard setup to make it more reliable.

Solution could be to not use %APPDATA% or %LOCALAPPDATA%, but to store config and database in %PROGRAMDATA%. The %PROGRAMDATA% Environment variable points in most situations to C:\ProgramData, which is the same for all user profiles. This way all backup tasks are shared between all users that log on to a single computer. All users will see the same backup tasks and know which backup tasks will run on that PC.

There are some things to think about when using %PROGRAMDATA% for storage of Duplicati files.
When User1 logs on and selects the Documents and Pictures folder, the folders C:\User1\Documents and C:\User1\Pictures are selected as source folders.
So the documents and pictures folders of User2 will not be included in the backup task.
If User2 logs on and edits the backup task, he will see that the Documents and Pictures folders are unselected, but he will see the paths to User1's documents and pictures folders in the source files list.
User2 can add his own Documents and Pictures files to the backup task by clicking the appropriate selection boxes.
Duplicati should somehow make clear that when selecting Documents/Pictures or other libraries, only those libraries of the current user are included in the backup.

If Duplicati runs with the SYSTEM account (for example when running it as a service, or Duplicati.Server.exe is executed from the Windows Task Scheduler using this account), the user profile folders point to C:\Windows\System32\config\systemprofile. This folder will not contain anything interesting for a backup, so if Duplicati runs with the SYSTEM account, the libraries to Documents, Pictures etc. should be hidden.

Third potential issue: what happens if User1 and User2 log on at the same time on a single PC? Will the web interface start on port 8300 and will each scheduled task run twice? This should be handled somehow.

Maybe the best option to fix all issues is to use the %PROGRAMDATA% location for storage of config files and install Duplicati as a service by default (MSI installer could ask for a startup mode: System Tray for current user or all users, as a service or no automatic startup, running as a service should be selected by default).

@FlohEinstein

FlohEinstein commented Feb 6, 2017

OK, now we're getting somewhere :-)
First: I really like @kees-z 's idea about putting all the Duplicati-stuff in %ProgramData%. If run as a service, this is better than putting it in the %appdata% or %localappdata% of the SYSTEM-account.

But keep the following in mind:
If user A generates a backup task only for his personal stuff in the service, and chooses an encryption key, user B can see the encryption key AND restore files. Even if the users don't have access to each other's folders in the first place.

So the documentation needs to reflect that ANY user with access to the machine has access to nearly everything when he can access the WebGUI. Therefore: Password for WebGUI should be set.

But the idea of every user selecting their respective personal folders and adding them to the backup task is pleasant. As long as Duplicati is run as a service by SYSTEM, it has the privileges to access the files of all the users (with the problems mentioned above). If Duplicati is run as an application by a user, it might not have access to all the source folders (e.g. when a user has specifically set ACLs on his folders).

Regarding the other questions:
Duplicati.GUI.TrayIcon.exe should also check if there's a service running. Now, it only starts/connects to localhost:8300. We need it to work for
a) Service installed, not running -> start it
b) Service installed, running -> Allow to connect to it
c) Service installed, running AND User wants to run it as application too.

Two users starting it as application:
I tested that once, accidentally. As far as I remember localhost:8300 showed the WebGUI of the first started instance even when accessing it from the second user.
Yet another reason to make port configuration changes possible in Duplicati.GUI.TrayIcon.exe.

AppData-Folders:
With the current build, Duplicati saves AppData\Roaming when "Application Data" is selected. A user would lose AppData\Local and AppData\LocalLow. Under Windows Environments, "Application Data" should select all three or allow to select

  • Application Data
    |- Local
    |- LocalLow
    |- Roaming

If we keep the configuration in AppData\Roaming and the local databases of the backup tasks in AppData\Local, and the user moves to another computer, taking Roaming with him and leaving Local and LocalLow behind, the task will fail at first, since there's no local DB. The user has to run a database delete and repair.

If the paths to the Personal Stuff change, it gets interesting: Now I'm not completely sure about the backup task itself, but what I hope happens is this:
We get all the checksums of the files through the db repair.
The paths of the files vary, but a lot of files should remain the same. Therefore: Same checksum.
Duplicati says "Oh, I already have that, over there in file xyz" and just points to it in the new dlistfile.
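
A minimal sketch of the behaviour hoped for above, in Python for illustration only. It is an assumption about how content-based deduplication could work, not a description of Duplicati's actual code: whole files are hashed here for brevity, whereas a real backup tool may hash smaller blocks. The names `add_files` and `store` are hypothetical.

```python
import hashlib


def add_files(store, files):
    """Record files by content hash and return a path -> hash listing.

    `store` maps hash -> bytes and stands in for the remote backend.
    `files` maps path -> file content.
    """
    listing = {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in store:
            store[digest] = data  # only content not seen before is stored
        listing[path] = digest    # the listing just points at the content
    return listing
```

Backing up the same content under a new path adds an entry to the new listing but nothing to `store`, which is the "I already have that" case.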

Regarding the main topic I propose the following, slightly varying from @kees-z 's idea:

  • If run as a service:
    -- use %ProgramData% for storage of config, lock and sqlite-files
  • If run as a program:
    -- use %ProgramData% for storage of lockfile (addressing problem of multiple users starting it, unless we allow port changes to run multiple instances)
    -- use %AppData% for config-files, %LocalAppData% for storage of sqlite-files (addressing problem of users gaining access to each others encryption keys)

Changes needed:
MSI Installer:

  • should ask if Duplicati should be run as a service (and yes, this should be default)
    -- if yes, it should urge the user to set a password (needs to be part of the start-command of the service)
    -- and if Duplicati.GUI.TrayIcon.exe should be started for current or all users

Duplicati.GUI.TrayIcon.exe:

  • must know the configuration to address problems mentioned above.
  • might even allow port changes

Program itself:

  • should include a subselection-menu for "Application Data" when run under Windows
  • needs to know if it's run as a service or application
  • if run as a service:
    -- the selection of personal folders (Desktop, My Docs, Music, Application Data...) should work as a wildcard (C:\Users\*\[Selected Option]) instead of the SYSTEM-accounts stuff
    -- config, lock and sqlite should be placed in %PROGRAMDATA%
  • if run as a program:
    -- config in %AppData%
    -- sqlite in %LocalAppData%
    -- lock in %PROGRAMDATA% if we don't allow port-changing (only one instance) or
    -- lock in %LocalAppData% if we allow multiple instances by different concurrent users (by allowing port changes)
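
The proposed placement can be summarised as a small lookup. This is only a sketch of the proposal in this comment, written in Python for illustration; the function `storage_locations` and its parameters are hypothetical names.

```python
def storage_locations(run_as_service, allow_multiple_instances=False):
    """Return the proposed base folder for each kind of file."""
    if run_as_service:
        # A service keeps everything in the machine-wide folder.
        return {
            "config": "%PROGRAMDATA%",
            "sqlite": "%PROGRAMDATA%",
            "lock": "%PROGRAMDATA%",
        }
    # As a program: config roams with the user, databases stay local.
    # The lock is per user only if several instances may run at once.
    lock = "%LocalAppData%" if allow_multiple_instances else "%PROGRAMDATA%"
    return {"config": "%AppData%", "sqlite": "%LocalAppData%", "lock": lock}
```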

Unaddressed:
We know what happens when we try to backup the files locked for the current job.
I spent my whole lunchbreak on this. Damn :-D

@kees-z

kees-z commented Feb 6, 2017

> But keep the following in mind:
> If user A generates a backup task only for his personal stuff in the service, and chooses an encryption key, user B can see the encryption key AND restore files. Even if the users don't have access to each other's folders in the first place.

That makes sense. I did not think about that, but this is a security problem that should be addressed somehow.

> Two users starting it as application:
> I tested that once, accidentally. As far as I remember localhost:8300 showed the WebGUI of the first started instance even when accessing it from the second user.
> Yet another reason to make port configuration changes possible in Duplicati.GUI.TrayIcon.exe.

As far as I know, Duplicati starts the web interface on TCP port 8200 by default, if no port is specified. If port 8200 is in use, Duplicati will use port 8300. I assume it will continue testing ports 8400, 8500 and so on, until an unused port is found. So the same security problem applies here:

User1 starts Duplicati as an application, so this instance listens on port 8200.
If User1 locks his PC and User2 logs on, User2 can access/restore the backup data of User1 by opening the web interface on port 8200. He can even browse the file system by adding a new backup.
If User2 starts Duplicati, port 8200 cannot be used (in use by User1), so Duplicati will start a new instance listening on port 8300.
If User2 doesn't close Duplicati or doesn't log off, User1 can access the data of User2 by opening http://localhost:8300.
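
The port probing assumed above (8200, then 8300, and so on) could look roughly like this. It is a Python sketch for illustration, not Duplicati's code; `first_free_port` is a hypothetical name, and the step of 100 is the assumption stated in this comment.

```python
import socket


def first_free_port(candidates, host="127.0.0.1"):
    """Return the first port in `candidates` that can be bound, else None."""
    for port in candidates:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is in use, try the next candidate
            return port
    return None
```

A call such as `first_free_port(range(8200, 9000, 100))` would walk the ports in the assumed order. Note that the probe only reports which port is free; it does nothing to stop another local user from opening that port in a browser, which is the security problem described here.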

> If we keep the configuration in the AppData\Roaming and the local databases of the backup tasks in AppData\Local, and the user moves to another computer, taking Roaming with him, leaving Local and LocalLow.

I still have doubts about storing any config information in a user's APPDATA folder, it sounds quite dangerous to me. For the roaming folders (Documents, Pictures etc) this will work, but all other folders selected for backup could be deleted.
Suppose I run an FTP server and use C:\FTPHome as the root folder for this FTP server.
Probably C:\FTPHome exists only on 1 PC, the PC that is running the FTP server.
If the Duplicati backup job is saved in Appdata\Roaming and the user logs on to another PC that does not have a folder C:\FTPHome, the complete FTPHome folder will be deleted from the backup.
Because most backup solutions are used to back up a certain host, I feel more comfortable keeping each backup configuration on the host where it was created.
If someone will use the same backup task on another PC, an export of the config on the source computer and importing it on the destination computer will do the trick.

@kenkendk
Member

I finally got around to looking at this, and have implemented it in a backwards compatible way.
The rules are:

  • If the new file exists, we use that
  • If the new file does not exist, and the old file exists we use the old
  • Otherwise we use the new location

This ensures that any new installs will not have the old file, and will thus use the new location.
After the first run they will have the new location and not fall back to the old.
Existing data will stay where it is.
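
The three rules can be sketched as a small function. This is an illustration in Python, not the code that was committed; `pick_database_path` and its parameters are hypothetical names, with `new_path` standing for the file under the local folder and `old_path` for the same file under the roaming folder.

```python
import os


def pick_database_path(new_path, old_path):
    """Choose which file location to use, following the three rules."""
    if os.path.exists(new_path):
        return new_path  # rule 1: the new file exists, use it
    if os.path.exists(old_path):
        return old_path  # rule 2: only the old file exists, keep using it
    return new_path      # rule 3: nothing exists yet, create in the new location
```

Because nothing is moved, an existing installation keeps resolving to its old file, while a fresh install never creates anything in the old location.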

This is a less intrusive approach than moving the data (as suggested in #2304).

I have not addressed the "run as service" things mentioned here. I think they are better suited for #1739.

@kenkendk
Member

I think this issue was fixed, if not, please re-open


6 participants