Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argument file needs support for multi-char and BOM #7474

Closed
Falco20019 opened this issue May 7, 2020 · 6 comments
Closed

Argument file needs support for multi-char and BOM #7474

Falco20019 opened this issue May 7, 2020 · 6 comments
Assignees
Labels

Comments

@Falco20019
Copy link
Contributor

What version of protobuf and what language are you using?
libprotoc 3.3.0 / Grpc.Tools 2.28.1

What operating system (Linux, Windows, ...) and version?
Windows 10 Build 1803

What did you do?

  1. Create test.rsp containing non-ascii:
--plugin=protoc-gen-grpc=ágrpc_csharp_plugin.exe
--grpc_out=grpc_out
test.proto
  1. Run protoc.exe @test.rsp

What did you expect to see
protoc running and creating the files.

What did you see instead?
UTF-8 without BOM: --grpc_out: protoc-gen-grpc: Das System kann die angegebene Datei nicht finden.
UTF-8 with BOM: You seem to have passed an empty string as one of the arguments to C:\Users\dejhbk0l\.nuget\packages\grpc.tools\2.28.1\tools\windows_x64\protoc.exe. This is actually sort of hard to do. Congrats. Unfortunately it is not valid input so the program is going to die now.

Anything else we should know about your project / environment
https://github.com/protocolbuffers/protobuf/blame/master/src/google/protobuf/compiler/command_line_interface.cc#L1326 uses std::ifstream instead of std::wifstream and therefore does not allow BOM or wide characters.

This is a minimal example. We stumbled over this in grpc/grpc#17995 as we distribute protoc and the protoc-gen-grpc through NuGet and the default for dotnet nuget locals global-packages is %userprofile%\.nuget\packages\. For some of our users, the username includes non-ascii characters which leads to protoc-gen-grpc containing non-ascii characters and the argument file not being parsable.

@Falco20019
Copy link
Contributor Author

CC:

@jtattermusch
Copy link
Contributor

jtattermusch commented Jun 7, 2021

I'm not sure if making the ExpandArgumentFile function support std:wifstream is feasible (the functions returns a std::string which is then processed by the command_line_interface.cc logic, and the entire logic is written in terms of std::string and it doesn't know about wide characters at all).

bool CommandLineInterface::ExpandArgumentFile(

@Falco20019
Copy link
Contributor Author

Not too deep into C++, but AFAIK this would mean that the whole chain would need to support std::wstring. So this change would need to go a long way.

It's a bit sad to see that UTF-8 is still not the default in 2021 in so many situations (not just here).

@Master-Code-Programmer
Copy link

Not supporting UTF-8 in the year 2022 is a complete anachronism. This really must be fixed.
But std::wstring isn't the solution: http://utf8everywhere.org.
Best would be to just keep using std::string, but make it Unicode capable.

@jtattermusch
Copy link
Contributor

The original link to command_line_interface.cc is no longer valid. This seems to be the location it was referring to:

std::ifstream file_stream(file.c_str());

@tonydnewell
Copy link
Contributor

BOM doesn't need supporting. Grpc.Tools writes out the argument files as utf-8 without BOM.

Looking at the protobuf code, the fix isn't changing from std::string to std::wstring and using std::wistream in CommandLineInterface::ExpandArgumentFile - in fact I think that is not the correct solution.

Reading each line of the utf-8 encoded file into a std::string is correctly (each line is just a sequence of bytes). It is when the string is used that conversion from utf-8 to std::wstring needs to be done.

io_win32.cc handles conversion from utf8 to std::wstring correctly (in as_windows_path()).

The problem seems to be when calling the plugins in subprocess.cc. Subprocess::Start() uses CreateProcessA instead of CreateProcessW and doesn't do the utf-8 to wstring conversion.

I've tested my own fix locally.

tonydnewell added a commit to tonydnewell/protobuf that referenced this issue Jul 1, 2022
Use CreateProcessW so that non-ascii paths are handled correctly
fowles added a commit that referenced this issue Jul 6, 2022
Fix for grpc.tools #17995 & protobuf #7474 (handle UTF-8 paths in argumentfile)
@fowles fowles closed this as completed Jul 16, 2022
tonydnewell pushed a commit to tonydnewell/protobuf that referenced this issue Oct 5, 2022
…obuf-7474

Fix for grpc.tools #17995 & protobuf protocolbuffers#7474 (handle UTF-8 paths in argumentfile)
mkruskal-google added a commit that referenced this issue Oct 19, 2022
* Force uninstall protobuf in python macos builds

We are seeing failures in brew uninstall protobuf due to no package. Change this to a force install to avoid the error.

* Fix spelling errors (#10717)

* Merge pull request #10200 from tonydnewell/bugfix/protobuf-7474

Fix for grpc.tools #17995 & protobuf #7474 (handle UTF-8 paths in argumentfile)

* Upgrade to kotlin 1.6

* 21.x No longer define no_threadlocal on OpenBSD

* Upgrade kokoro to Xcode 14 (#10732)

* Upgrade kokoro to Xcode 14

* Fix osx errors

* Merge pull request #10770 from protocolbuffers/googleberg-cl-480629524

Mark default instance as immutable first to avoid race during static initialization of default instances.

* Auto capitalize enums name in Ruby (#10454) (#10763)

This closes #1965.

* Edit toolchain to work with absl dep

* Bump upb to latest version after fixes applied (#10783)

* 21.x 202210180838 (#10785)

* Updating version.json and repo version numbers to: 21.8

* Update changelog

Co-authored-by: Protobuf Team Bot <protobuf-team-bot@google.com>

* Update generated protos

Co-authored-by: deannagarcia <69992229+deannagarcia@users.noreply.github.com>
Co-authored-by: Matt Fowles Kulukundis <matt.fowles@gmail.com>
Co-authored-by: Deanna Garcia <deannagarcia@google.com>
Co-authored-by: Brad Smith <brad@comstyle.com>
Co-authored-by: Jerry Berg <107155935+googleberg@users.noreply.github.com>
Co-authored-by: tison <wander4096@gmail.com>
Co-authored-by: Protobuf Team Bot <protobuf-team-bot@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants