Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parameter loading fails when the filesystem encoding is not UTF-8 #44

Closed
daira opened this issue Oct 25, 2018 · 4 comments
Closed

Parameter loading fails when the filesystem encoding is not UTF-8 #44

daira opened this issue Oct 25, 2018 · 4 comments
Assignees
Labels

Comments

@daira
Copy link
Contributor

daira commented Oct 25, 2018

Most Windows systems when using ANSI APIs, and some Unix systems, do not use UTF-8 encoding for filesystem paths. It's also common for usernames to have non-ASCII characters. In that case parameter loading will fail here:

// These should be valid CStr's, but the decoding may fail on Windows
// so we may need to use OSStr or something.
let spend_path = unsafe { CStr::from_ptr(spend_path) }
.to_str()
.expect("parameter path encoding error")
.to_string();
let output_path = unsafe { CStr::from_ptr(output_path) }
.to_str()
.expect("parameter path encoding error")
.to_string();
let sprout_path = unsafe { CStr::from_ptr(sprout_path) }
.to_str()
.expect("parameter path encoding error")
.to_string();

The comment is correct; we should not be using CStr, since CStr::to_str assumes UTF-8 (see doc just above here).

Strictly speaking the Right Way to do it on Windows is to use "wide" (UTF-16) filesystem APIs. I don't know how well supported that is in Rust, and for now I'd recommend sticking to ANSI APIs and just passing the bytes through as-is.

@daira daira added the bug label Oct 25, 2018
@daira
Copy link
Contributor Author

daira commented Oct 25, 2018

We have a lot of Russian and Chinese users, so this is going to crop up a lot. Note that it's a regression in recent versions of zcashd (post-1.1.2 I think), because this code was not previously used and the C++ code does handle non-ASCII filenames correctly.

@radix42
Copy link

radix42 commented Oct 25, 2018

yes it is post-1.1.2, 2.0.0 didn't build on Windows (because #reasons), so it didn't pop up until now

@ebfull
Copy link
Collaborator

ebfull commented Oct 25, 2018

My suggestion is to use u8string() when constructing sapling_spend_str etc. on the C++ side of things, so the Rust code can be unchanged and just assume UTF-8. Does this work? no

@daira
Copy link
Contributor Author

daira commented Oct 26, 2018

The Rust code can't be unchanged, because it calls CStr::to_str which assumes UTF-8.

str4d added a commit to str4d/librustzcash that referenced this issue Oct 27, 2018
On Windows, the slices are [u16] representing UTF-16. On all other
platforms, the slices are [u8] in the native filesystem encoding.

Closes zcash#44.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants