Unicode and extra long file paths on Windows #136

Closed
hugbug opened this issue Dec 28, 2015 · 1 comment

@hugbug commented Dec 28, 2015

Currently only characters from the current locale are supported, and even they don't work very well, because when settings are changed via the web interface everything is stored in UTF-8.

For better support of international characters in file paths we should use the Unicode functions of the Windows API instead of the standard functions of the C++ library.

The Unicode functions have one additional advantage: they can work with file paths longer than 260 characters, which is the limit for the ANSI functions. Long paths still require special treatment, though.

Based on #131 we now have a central point that processes all file access requests. The module FileSystem must be improved to use the Unicode functions.

In this topic:

  • rework module FileSystem to use Unicode-functions when on Windows;
  • automatically convert long paths to extended form with the "\\?\" prefix (more details);
  • create new processes with Unicode-functions (starting unrar and scripts);
  • use extended paths if necessary when passing parameters to unrar (see #127);
  • parsing of nzb-files via MSXML requires special encoding for paths;
  • extra long paths on network shares (require special prefix);
  • par2-module requires extra work as it doesn't use our module FileSystem.

When completed the following issues should be solved:

  • #127 (Unpack fails if filename is too long [Windows 10]);
  • #91 (utf8 characters on download folder);
  • make sure the issue fixed in #96 is still fixed after the changes.
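
The conversion from UTF-8 to wide characters that this plan relies on can be illustrated with a small portable sketch. On Windows the conversion would normally be done with `MultiByteToWideChar` and `CP_UTF8`; the hand-written decoder below is a stand-in so that the example compiles anywhere. The function name `Utf8ToUtf16` is hypothetical and is not taken from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-16 code units.
// Malformed bytes are replaced with U+FFFD. Simplification: overlong
// encodings and encoded surrogates are not rejected.
std::u16string Utf8ToUtf16(const std::string& utf8)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;  // 0 means "invalid lead byte"

        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }

        // The whole sequence must fit into the remaining input.
        bool valid = len != 0 && i + len <= utf8.size();

        // Every following byte must be a continuation byte (10xxxxxx).
        for (std::size_t k = 1; valid && k < len; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) valid = false;
            else cp = (cp << 6) | (c & 0x3F);
        }

        if (!valid || cp > 0x10FFFF)
        {
            cp = 0xFFFD;  // replacement character
            len = 1;      // skip only the offending byte
        }

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

The result holds UTF-16 code units, which is what the wide-character Windows functions expect; on Windows `wchar_t` has the same 16-bit width.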

@hugbug hugbug added the feature label Dec 28, 2015

@hugbug hugbug added this to the v17.0 milestone Dec 28, 2015

hugbug added a commit that referenced this issue Dec 28, 2015

#136: Unicode-Windows-API for file operations
- internally all paths are handled in UTF8;
- all paths are stored in config-file in UTF8;
- when calling file access Windows API functions the paths are
converted to wide-chars and Unicode-API is used;
- extra long paths are prefixed with “\\?\” (extended path format).
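
As an illustration of the last point, a minimal sketch of how a path might be switched to the extended format. The helper name `MakeExtendedPath`, the `force` parameter and the threshold handling are assumptions made for this example, not the code from the commit. The path is assumed to be absolute and to use backslashes, and the length is measured in UTF-8 bytes. That is a conservative choice, because the byte count is never smaller than the number of UTF-16 code units.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Classic Windows path limit (MAX_PATH) for the non-extended API.
constexpr std::size_t kMaxPath = 260;

// Hypothetical helper: returns the extended form of an absolute path
// if it is too long for the classic API, or always when `force` is set.
std::string MakeExtendedPath(const std::string& path, bool force = false)
{
    const std::string extPrefix = "\\\\?\\";       // prefix for local paths
    const std::string uncPrefix = "\\\\?\\UNC\\";  // prefix for network shares

    // Already in extended form: nothing to do.
    if (path.compare(0, extPrefix.size(), extPrefix) == 0)
    {
        return path;
    }

    // Short paths work with the classic API and stay unchanged.
    if (!force && path.size() < kMaxPath)
    {
        return path;
    }

    // Network share: the two leading backslashes are replaced by the
    // UNC variant of the prefix.
    if (path.compare(0, 2, "\\\\") == 0)
    {
        return uncPrefix + path.substr(2);
    }

    return extPrefix + path;
}
```

With `force` set to `true` the prefix is added regardless of length, which is useful when the final length of the paths is not known in advance.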

hugbug added a commit that referenced this issue Dec 28, 2015

#136: Unicode Windows-API when calling other programs
- use CreateProcessW;
- pass command-line in Unicode;
- pass environment in Unicode;
- if the current directory is too long, convert it to a short path (8.3
notation), because CreateProcessW doesn’t support extra long paths
(prefixed with “\\?\”) for the current directory.
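
For reference, a sketch of the Unicode environment block layout that `CreateProcessW` accepts when the `CREATE_UNICODE_ENVIRONMENT` flag is set: a sequence of null-terminated `NAME=VALUE` strings followed by one more null. The helper name `BuildEnvironmentBlock` is hypothetical, and the sketch assumes the variables have already been converted to wide strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: packs "NAME=VALUE" entries into one environment block.
// Each entry is terminated by a null character and the whole block is
// terminated by one additional null character.
std::wstring BuildEnvironmentBlock(const std::vector<std::wstring>& vars)
{
    std::wstring block;
    for (const std::wstring& var : vars)
    {
        block += var;
        block.push_back(L'\0');
    }

    // An empty block still needs two terminating nulls.
    if (vars.empty())
    {
        block.push_back(L'\0');
    }

    block.push_back(L'\0');
    return block;
}
```

The address returned by `block.data()` is what would be passed as the environment parameter. On Windows `wchar_t` is 16 bits wide, so the block is UTF-16 there.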

hugbug added a commit that referenced this issue Dec 28, 2015

#136: extra long paths with unrar
- pass extra long paths to unrar using “\\?\” prefix (and in Unicode).

hugbug added a commit that referenced this issue Dec 28, 2015

#136: Unicode in script environment
- pass all data to scripts in Unicode environment;
- a script which doesn’t support Unicode can access the ANSI version of
the environment, which Windows provides automatically from the
Unicode version.

hugbug added a commit that referenced this issue Dec 28, 2015

#136: avoid crash in par-renamer
if the directory content could not be read.

hugbug added a commit that referenced this issue Dec 28, 2015

#136: handling of Unicode paths with MSXML
- when parsing nzb-files (option NzbDir);
- when parsing rss feeds (option TempDir).

hugbug added a commit that referenced this issue Dec 28, 2015

#136: removed unneeded function
deleted functions for converting ANSI <-> UTF which are not used
anymore.

hugbug added a commit that referenced this issue Dec 29, 2015

#136: support for Unicode and extra long paths in par2-module
- par-renamer and par-checker both support Unicode paths and extra long
paths;

hugbug added a commit that referenced this issue Dec 29, 2015

@hugbug commented Dec 29, 2015

Completed.

Related: check how pp-scripts work with extra long paths, especially the included scripts EMail.py and Logger.py, and adjust the scripts if necessary. This belongs in a separate issue, however.

@hugbug hugbug closed this Dec 29, 2015

hugbug added a commit that referenced this issue Dec 29, 2015

#136: adjustment in par2-module
using the built-in latin1-to-utf8 conversion routine instead of the Windows
function (which depends on the current code page).
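
A code-page-independent conversion of this kind is simple, because every Latin-1 byte maps directly to the Unicode code point with the same value. The sketch below shows the idea; the name `Latin1ToUtf8` is hypothetical and the project's own routine may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: converts ISO-8859-1 (Latin-1) text to UTF-8.
// Bytes below 0x80 are plain ASCII and are copied unchanged; bytes from
// 0x80 to 0xFF become a two-byte UTF-8 sequence.
std::string Latin1ToUtf8(const std::string& latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size() * 2);
    for (unsigned char ch : latin1)
    {
        if (ch < 0x80)
        {
            utf8.push_back(static_cast<char>(ch));
        }
        else
        {
            // 110000xx 10xxxxxx: top two bits go into the lead byte,
            // the remaining six bits into the continuation byte.
            utf8.push_back(static_cast<char>(0xC0 | (ch >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (ch & 0x3F)));
        }
    }
    return utf8;
}
```

Because the mapping is fixed, the result does not change with the system's current code page.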

hugbug added a commit that referenced this issue Dec 29, 2015

#136: transcoding of names in par2-module
when running on POSIX the file names must be transcoded as well, not
only on Windows.

hugbug added a commit that referenced this issue Jan 21, 2016

hugbug added a commit that referenced this issue Jan 23, 2016

#136: avoid double slashes in paths
Extra long path names are not normalized automatically by Windows and
therefore must already be in canonical form.
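
A minimal sketch of such a clean-up step, assuming a hypothetical helper named `CollapseSeparators`. It collapses repeated separators but keeps a leading double backslash, since that starts both network share paths and the “\\?\” prefix. Converting forward slashes to backslashes is an extra assumption of this example; the commit message only mentions double slashes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: removes duplicate path separators so that the
// path can be used in extended form, where Windows does no normalization.
std::string CollapseSeparators(const std::string& path)
{
    std::string result;
    result.reserve(path.size());

    // Keep a leading double backslash: it is significant for network
    // shares and for extended paths.
    std::size_t start = 0;
    if (path.compare(0, 2, "\\\\") == 0)
    {
        result = "\\\\";
        start = 2;
    }

    for (std::size_t i = start; i < path.size(); ++i)
    {
        // Assumption of this sketch: forward slashes are treated as
        // separators and converted to backslashes.
        const char ch = (path[i] == '/') ? '\\' : path[i];

        // Skip a separator that directly follows another separator.
        if (ch == '\\' && !result.empty() && result.back() == '\\')
        {
            continue;
        }
        result.push_back(ch);
    }
    return result;
}
```

The function does not resolve `.` or `..` components; a complete canonicalization would have to handle those as well.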

hugbug added a commit that referenced this issue Jan 24, 2016

hugbug added a commit that referenced this issue Jan 30, 2016

#136: fixed: root drive paths sometimes unusable on Windows
Paths like “C:\” were sometimes not usable.

hugbug added a commit that referenced this issue Mar 22, 2016

#136, #127: forcing extended paths with unrar (Windows)
Instead of using extended (extra long) path notation only if the path
is really long, we now always use extended notation since we don’t know
how long the paths inside archive are.

hugbug added a commit that referenced this issue Apr 12, 2016

#136: fixed possible crash when canceling download
and having option “DirectWrite” disabled.

hugbug added a commit that referenced this issue Apr 18, 2016

#136: fixed crash on feed fetch failure (Windows)
Crash caused by nullptr passed to “FileSystem::DeleteFile”.

hugbug added a commit that referenced this issue Jul 29, 2016

hugbug added a commit that referenced this issue Oct 9, 2017

#136, #127: forcing extended paths with unrar (Windows)
Instead of using extended (extra long) path notation only if the path
is really long, we now always use extended notation since we don’t know
how long the paths inside archive are.

hugbug added a commit that referenced this issue Oct 9, 2017

#136: fixed possible crash when canceling download
and having option “DirectWrite” disabled.

hugbug added a commit that referenced this issue Oct 9, 2017

#136: fixed crash on feed fetch failure (Windows)
Crash caused by nullptr passed to “FileSystem::DeleteFile”.