Not backing up filenames with ANSI encoding #4753

Open
2 tasks done
tom1422 opened this issue Jun 19, 2022 · 4 comments

Comments

@tom1422

tom1422 commented Jun 19, 2022

  • I have searched open and closed issues for duplicates.
  • I have searched the forum for related topics.

Environment info

  • Duplicati version: 2.0.6.3_beta_2021-06-17
  • Operating system: Linux
  • Backend: Docker

Description

Certain files whose names use ANSI (unofficially) or Windows-1252 character encoding fail to back up. These files were created using older versions of Windows and don't show up over SMB, but are seen by Linux over NFS.
This is the error received:

[Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: /source/xxxx�xxxx.xxx
[Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-PathProcessingError]: Failed to process path: /source/xxxx�xxxx.xxx

Steps to reproduce

  1. Create a file with an older version of Windows (not tried yet, but one that uses Windows-1252 encoding for filenames)
  2. Run the backup
  • Actual result:
    Doesn't back up the files
  • Expected result:
    Backs up files regardless of filename encoding

Screenshots

Debug log

@ts678
Collaborator

ts678 commented Jun 26, 2022

> Backs up files regardless of filename encoding

How is it supposed to interpret an unknown (or potentially multiple unknown) 8 bit character sets, e.g. for display purposes?
Code pages have given way to Unicode. What are these files on? Can you convert the names before sharing over a network?

@tom1422
Author

tom1422 commented Jun 26, 2022

Yes, I can convert the names manually, as I'm no longer using these old Windows machines (some old files got backed up at some point and happened to have these characters in their names). The files themselves are stored on the filesystem of my NAS. Only the special characters fail to display properly (which is fine, since there is no concrete way to tell which character set they use, but the letters are universal).
The only issue is that these files are not backed up and this weird error is given. I'm not skilled in C# or knowledgeable in string or filename handling, but my assumption is that the filename is just handled as a list of bytes throughout the program, so the program should not care about what bytes are in the filename. I am more concerned about the reason Duplicati won't back up these files (e.g. is it a problem with the library Duplicati is using?), because I think it should back them up regardless.

@ts678
Collaborator

ts678 commented Jun 27, 2022

> my assumption is that the filename is just handled as a list of bytes

I'm not a C# developer, but I think this is incorrect. Character encoding in .NET (Microsoft) says that a string uses 16-bit Unicode. Windows has supported Unicode since Windows NT in 1993, although the encoding changed from UCS-2 to UTF-16 with Windows 2000, and that is different from the UTF-8 encoding that NFS (at least NFSv4) likes. In both cases, though, what's encoded is Unicode. Decoding an encoded form relies on format rules, and an unpredictable list of 8-bit bytes will eventually break those rules and cause errors.
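To make the point about format expectations concrete, here is a minimal C# sketch. It is only an illustration, not Duplicati's or Mono's actual code path, which this thread does not show. The class name, the helper names, and the sample name "café.txt" are all made up for the example. It decodes raw filename bytes as UTF-8 and checks whether encoding the resulting string reproduces the original bytes.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical demo class; none of these names come from Duplicati.
public static class FileNameEncodingDemo
{
    // Decode raw file-name bytes as UTF-8. Invalid sequences are replaced
    // with U+FFFD by the default UTF-8 decoder.
    public static string DecodeAsUtf8(byte[] rawName)
    {
        return Encoding.UTF8.GetString(rawName);
    }

    // True when decoding to a string and encoding back yields the same bytes.
    public static bool SurvivesRoundTrip(byte[] rawName)
    {
        byte[] reencoded = Encoding.UTF8.GetBytes(DecodeAsUtf8(rawName));
        return reencoded.SequenceEqual(rawName);
    }

    public static void Main()
    {
        // "café.txt" in Windows-1252: 0xE9 is 'é' there, but a lone 0xE9
        // followed by '.' is not a valid UTF-8 sequence.
        byte[] legacyName = { 0x63, 0x61, 0x66, 0xE9, 0x2E, 0x74, 0x78, 0x74 };
        // The same name in UTF-8: 'é' is the two bytes 0xC3 0xA9.
        byte[] utf8Name = { 0x63, 0x61, 0x66, 0xC3, 0xA9, 0x2E, 0x74, 0x78, 0x74 };

        Console.WriteLine(DecodeAsUtf8(legacyName).IndexOf('\uFFFD') >= 0); // True
        Console.WriteLine(SurvivesRoundTrip(legacyName));                   // False
        Console.WriteLine(SurvivesRoundTrip(utf8Name));                     // True
    }
}
```

Once the legacy byte has been replaced with U+FFFD, the string no longer names the file that exists on disk. That would be consistent with the access errors in the report, although the thread does not confirm that this is the exact failure.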

Try getting a better message by watching with About --> Show log --> Live --> Warning. You might need to click the warning line.

If it helps, your second (weirder-looking) error is a little easier to track in the code. It looks like it got a string that choked its query:
(Actually, I now notice that this warning can happen in two other places in the file, but you can use this as an example of the problem.)

private static bool AttributeFilter(
    string path,
    FileAttributes attributes,
    Snapshots.ISnapshotService snapshot,
    Library.Utility.IFilter sourcefilter,
    Options.HardlinkStrategy hardlinkPolicy,
    Options.SymlinkStrategy symlinkPolicy,
    Dictionary<string, string> hardlinkmap,
    FileAttributes fileAttributes,
    Duplicati.Library.Utility.IFilter enumeratefilter,
    string[] ignorenames,
    Queue<string> mixinqueue)
{
    // Step 1, exclude block devices
    try
    {
        if (snapshot.IsBlockDevice(path))
        {
            Logging.Log.WriteVerboseMessage(FILTER_LOGTAG, "ExcludingBlockDevice", "Excluding block device: {0}", path);
            return false;
        }
    }
    catch (Exception ex)
    {
        // Any exception thrown by the query above is logged as PathProcessingError and the path is skipped.
        Logging.Log.WriteWarningMessage(FILTER_LOGTAG, "PathProcessingError", ex, "Failed to process path: {0}", path);
        return false;
    }
    // ... excerpt ends here; the remaining filter steps are not shown
}

@duplicatibot

This issue has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/database-recration-not-really-starting/16948/36

3 participants