Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Forbid calling unsafe fs::path(std::string) constructor and fs::path::string() method #22937

Merged
merged 2 commits into from Oct 15, 2021

Conversation

ryanofsky
Copy link
Contributor

@ryanofsky ryanofsky commented Sep 10, 2021

The fs::path class has a std::string constructor which will implicitly convert from strings. Implicit conversions like this are not great in general because they can hide complexity and inefficiencies in the code, but this case is especially bad, because after the transition from boost::filesystem to std::filesystem in #20744 the behavior of this constructor on windows will be more complicated and can mangle path strings. The fs::path class also has a .string() method which is inverse of the constructor and has the same problems.

Fix this by replacing the unsafe method calls with PathToString and PathFromString function calls, and by forbidding unsafe method calls in the future.

src/addrdb.cpp Outdated Show resolved Hide resolved
@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 10, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #23280 (init: Coalesce Chainstate loading sequence between {,non-}unittest codepaths by dongcarl)
  • #23203 ([POC] build: static musl libc based bitcoind (with LTO) by fanquake)
  • #22910 ([RFC] Encapsulate asmap in NetGroupManager by jnewbery)
  • #22663 (dbwrapper: wrap options in unique_ptr and use DeleteOptions by Crypt-iQ)
  • #22350 (Rotate debug.log file by LarryRuane)
  • #20974 (test: add test for corrupt wallet bdb logs by ryanofsky)
  • #20744 (Use std::filesystem. Remove Boost Filesystem & System by fanquake)
  • #20435 (util: use stronger-guarantee rename method by vasild)
  • #20331 (allow -loadblock blocks to be unsorted by LarryRuane)
  • #19461 (multiprocess: Add bitcoin-gui -ipcconnect option by ryanofsky)
  • #19460 (multiprocess: Add bitcoin-wallet -ipcconnect option by ryanofsky)
  • #10443 (Add fee_est tool for debugging fee estimation code by ryanofsky)
  • #10102 ([experimental] Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@hebasto
Copy link
Member

hebasto commented Sep 10, 2021

mingw complains:

wallet/rpcdump.cpp: In lambda function:
wallet/rpcdump.cpp:553:76: error: use of deleted function ‘fs::path::path(const string&)’
  553 |         file.open(request.params[0].get_str(), std::ios::in | std::ios::ate);
      |                                                                            ^
In file included from ./flatfile.h:11,
                 from ./chain.h:11,
                 from wallet/rpcdump.cpp:5:
./fs.h:40:5: note: declared here
   40 |     path(const std::string& string) = delete;
      |     ^~~~
make[2]: *** [Makefile:10361: wallet/libbitcoin_wallet_a-rpcdump.o] Error 1

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased 1e07139 -> ddbfc51 (pr/u8path.1 -> pr/u8path.2, compare) due to conflict with #21222, also fixing win64 build error https://cirrus-ci.com/task/5157617598201856

src/addrdb.cpp Outdated Show resolved Hide resolved
@ryanofsky
Copy link
Contributor Author

re: #22937 (comment)

mingw complains:

Thanks for headsup, and this should be addressed in the last push

@ryanofsky
Copy link
Contributor Author

ryanofsky commented Sep 10, 2021

This PR is basically ready but I still am looking for feedback about about whether best solution to path.string() method and path(string) constructor brokenness on windows is to make these methods uncallable and replace with them with PathToString PathFromString functions like this PR is doing. Or to just override these methods to keep current boost path behavior, and actually do what we want, but not follow the standard path class behavior. Or to take an alternate approach like using the new (since mid-2019) windows build option to force a UTF-8 code page, and not worry about older windows versions.


Other remaining todos here: improve fs.h documentation and split commit into two parts so fs.h functions are added in the first commit, and code is switched over to call the new functions in a second commit.

@maflcko
Copy link
Member

maflcko commented Sep 10, 2021

Or to take an alternate approach like using the new (since mid-2019) windows build option to force a UTF-8 code page

If this fixes all our issues and avoids developer headaches when writing code (which conversion function to use), that seems most preferable.

@ryanofsky
Copy link
Contributor Author

ryanofsky commented Sep 10, 2021

Or to take an alternate approach like using the new (since mid-2019) windows build option to force a UTF-8 code page

If this fixes all our issues and avoids developer headaches when writing code (which conversion function to use), that seems most preferable.

Sorry, I thought I had posted more details/drawbacks about this alternate approach previously, but I don't think I ever did. Details can be found at https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page#set-a-process-code-page-to-utf-8

The only impact I believe this alternate approach would have for future developers (as opposed to current reviewers) compared to the current approach is that future developers will be able to write p.string() instead of PathToString(p) and write fs::path(s) instead of fs::PathFromString(s). And I think this difference is actually a drawback, not a benefit of that approach, because the fs::path constructor is not explicit, and the implicit conversions can lead to unintended bugs that are less likely to be cause by reviewer. Also I think it is better to have our own PathToString and PathFromString functions because these give a way to document more obviously which functions should be used for our project, instead of leaving developers looking at abstract cppreference pages that do not give concrete information about what the functions actually do on different platforms, or give any guidance on whether it is appropriate to chose .string() or .u8string() or .native() conversion methods in the context of writing bitcoin application code.

Another drawback of the UTF-8 windows code page build option is that is silently ignored by older versions of windows.

Another drawback of the UTF-8 windows code page build option is that it is not yet unimplemented, and will require someone with more knowledge than me of both the MSVC and Mingw builds to implement, while this PR is a straightforward code change that is basically ready to go.


Updated ddbfc51 -> d312e52 (pr/u8path.2 -> pr/u8path.3, compare) to fix https://cirrus-ci.com/task/6285153786920960

@hebasto
Copy link
Member

hebasto commented Sep 10, 2021

Out of curiosity, what is the leveldb approach to handle paths on Windows?

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 6544ea5, only added fsbridge_stem test case, updated comment, and rebased since my previous review. Verified with the following command:

$ git range-diff master bf9197c563cc62847a331df38c0a612433ea8beb 6544ea5035268025207d2402db2f7d90fde947a6

@kiminuo
Copy link
Contributor

kiminuo commented Oct 9, 2021

@hebasto I compiled on my Windows 10 machine using instructions here: https://github.com/ryanofsky/bitcoin/blob/pr/u8path/build_msvc/README.md#building

Note that bitcoin source files are located in folder C:\dev\fs_tests_₿_🏃 on my machine.

Now when I attempt to run bitcoind.exe with a unicode datadir (using PowerShell 7), it just stops for me saying that the directory does not exist:

PS C:\dev\fs_tests_₿_🏃\bitcoin-msvs\build_msvc\x64\Release> .\bitcoind.exe -testnet -datadir="C:\dev\fs_tests_₿_🏃\datadir"
Error: Specified data directory "C:\dev\fs_tests_₿_🏃\datadir" does not exist.

PS C:\dev\fs_tests_₿_🏃\bitcoin-msvs\build_msvc\x64\Release> mkdir "C:\dev\fs_tests_₿_🏃\datadir"

    Directory: C:\dev\fs_tests_₿_🏃

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          09.10.2021     9:51                datadir

PS C:\dev\fs_tests_₿_🏃\bitcoin-msvs\build_msvc\x64\Release> .\bitcoind.exe -testnet -datadir="C:\dev\fs_tests_₿_🏃\datadir"
Error: Specified data directory "C:\dev\fs_tests_₿_🏃\datadir" does not exist.

I'm not sure if I have not made an error somewhere but it would be great to know if this is a bug or not (and whose). And if it is, then if it is replicable on other machines.

@hebasto
Copy link
Member

hebasto commented Oct 9, 2021

@hebasto I compiled on my Windows 10 machine using instructions here: https://github.com/ryanofsky/bitcoin/blob/pr/u8path/build_msvc/README.md#building

Why not using the updated build instructions?

In the Command Prompt everything works as expected:

>git fetch origin pull/22937/merge:pr22937m-1009
>git switch pr22937m-1009
>cd .\build_msvc
>python msvc-autogen.py
>msbuild -property:Configuration=Release -maxCpuCount -verbosity:minimal bitcoin.sln
>cd .\x64\Release
>.\bitcoind.exe -testnet -datadir="C:\dev\fs_tests_₿_🏃\datadir"
Error: Specified data directory "C:\dev\fs_tests_₿_🏃\datadir" does not exist.


>mkdir "C:\dev\fs_tests_₿_🏃\datadir"

>.\bitcoind.exe -testnet -datadir="C:\dev\fs_tests_₿_🏃\datadir"
2021-10-09T08:22:03Z Bitcoin Core version v22.99.0-unk (release build)
2021-10-09T08:22:03Z Assuming ancestors of block 0000000000004ae2f3896ca8ecd41c460a35bf6184e145d91558cece1c688a76 have valid signatures.
2021-10-09T08:22:03Z Setting nMinimumChainWork=0000000000000000000000000000000000000000000005180c3bd8290da33a1a
2021-10-09T08:22:03Z Using the 'standard' SHA256 implementation
2021-10-09T08:22:03Z Default data directory C:\Users\hebasto\AppData\Roaming\Bitcoin
2021-10-09T08:22:03Z Using data directory C:\dev\fs_tests_₿_🏃\datadir\testnet3
...

@kiminuo
Copy link
Contributor

kiminuo commented Oct 9, 2021

Why not using the updated build instructions?

I forgot they have changed. Thanks for letting me know.

Anyway, I have re-compiled the source code (with merged master) and now it works for me with C:\dev\fs_tests_₿_🏃\datadir path. So it looks good. (I'm not entirely sure what was wrong the first time. Sorry for confusion.)

@kiminuo
Copy link
Contributor

kiminuo commented Oct 9, 2021

ACK 6544ea5

@DrahtBot
Copy link
Contributor

Guix builds

File commit 5b7210c
(master)
commit 143235b
(master and this pull)
SHA256SUMS.part c79a21dc3da6e71b... b983ebfb72b583d8...
*-aarch64-linux-gnu-debug.tar.gz 108b2ef989bd0d35... 3898376d7605da5d...
*-aarch64-linux-gnu.tar.gz a5bacdca50e2c3f0... 8a95a295c0788b6a...
*-arm-linux-gnueabihf-debug.tar.gz b21f70b81f75b1f6... 2a6531d4ec7722d0...
*-arm-linux-gnueabihf.tar.gz 6c79c07aa1d2df22... 08265d99eab44622...
*-osx-unsigned.dmg e15275439c514402... fe44836857a69d76...
*-osx-unsigned.tar.gz 23080798bd4790d0... 54e7d3c2230ef7bd...
*-osx64.tar.gz f4afcac2e217616d... 9cbeb8b52d026dee...
*-powerpc64-linux-gnu-debug.tar.gz 7bb92bb01f337a75... 7995f70142a230c2...
*-powerpc64-linux-gnu.tar.gz 085787f59f3c1a58... dad8aef36d96bec4...
*-powerpc64le-linux-gnu-debug.tar.gz 2732fa036e1ba621... 1cae331b80b8d374...
*-powerpc64le-linux-gnu.tar.gz 495e1c10d8950c1a... 30347fbdb5a1f1de...
*-riscv64-linux-gnu-debug.tar.gz 82c896f2e942879b... 39b532b2edc99221...
*-riscv64-linux-gnu.tar.gz ebc9d158e7fe286e... 8f76c93958777a8d...
*-win-unsigned.tar.gz 4a871e218e240ccb... 70f7493a26b2abd0...
*-win64-debug.zip 0ac6d7beea9e7a67... f108c4ac98fb6df1...
*-win64-setup-unsigned.exe 73d5ba8460a67851... be36ccc82ed58c0b...
*-win64.zip 104adfb4cd021131... 1e316e8ba80be236...
*-x86_64-linux-gnu-debug.tar.gz 896c0ecaefff84eb... baab942104ce73f7...
*-x86_64-linux-gnu.tar.gz 9cc9de13c910b2a4... ad1c11a6bd713e48...
*.tar.gz 0d51a99bd81f38ec... eba1905c58e6ac72...
guix_build.log 463c2d4bbaa53638... 3df07f1a2625978b...
guix_build.log.diff e06b14caa048bdb8...

@laanwj
Copy link
Member

laanwj commented Oct 15, 2021

Code review ACK 6544ea5

@laanwj laanwj merged commit 1884ce2 into bitcoin:master Oct 15, 2021
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
…functions

Summary:
There is no change in behavior. This just helps prepare for the
transition from the `boost::filesystem` to the `std::filesystem` path
implementation.

Co-authored-by: Kiminuo <kiminuo@protonmail.com>

This is a backport of [[bitcoin/bitcoin#22937 | core#22937]] [1/2]
bitcoin/bitcoin@b39a477

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10784
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
Summary:
> There is no change in behavior. This just helps prepare for the
> transition from `boost::filesystem` to `std::filesystem` by avoiding calls
> to methods which will be unsafe after the transaction to `std::filesystem`
> to due lack of a `boost::filesystem::path::imbue` equivalent and inability
> to set a predictable locale.
>
> Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
> Co-authored-by: Kiminuo <kiminuo@protonmail.com>
> Co-authored-by: MarcoFalke <falke.marco@gmail.com>

This is a partial backport of [[bitcoin/bitcoin#22937 | core#22937]] [2a/2g]
bitcoin/bitcoin@6544ea5

The backport for this commit was split in to multiple commits to make review easier.

This first commit does mainly two things:
 - add a couple of necessary methods to `fs::path`: `quoted` and `operator+` to be used in subsequent commits to migrate away from methods that will be  unsafe when we will  transition from `boost::filesystem` to `std::filesystem`.
 - use `fs::ofstream` instead of `std::ostream` because the former defines a constructor for `fs::path`, and pass in `fs::path` directly when constructing `fs::ofstream` instead of a c string. This removes the need to to path to string conversions that depend on the the BOOST version.

Depends on D10784, D10781, D10783

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10795
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
Summary:
This is a partial backport of [[bitcoin/bitcoin#22937 | core#22937]] [2b/2g]
bitcoin/bitcoin@6544ea5

The backport for this commit was split in to multiple commits to make review easier. This part deals with converting  `fs::string()` calls that
will become unsafe when migrating from boost::filesystem to  std::filesystem.

Depends on D10795

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10796
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
…s, qt, rpc)

Summary:
This is a partial backport of [[bitcoin/bitcoin#22937 | core#22937]] [2c/2g]
bitcoin/bitcoin@6544ea5

The backport for this commit was split in to multiple commits to make review easier. This part deals with converting  `fs::string()` calls that
will become unsafe when migrating from `boost::filesystem` to  `std::filesystem`.

Depends on D10796

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10797
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
…z, wallet/test)

Summary:
This is a partial backport of [[bitcoin/bitcoin#22937 | core#22937]] [2d/2g]
bitcoin/bitcoin@6544ea5

The backport for this commit was split in to multiple commits to make review easier. This part deals with converting  `fs::string()` calls that
will become unsafe when migrating from `boost::filesystem` to  `std::filesystem`.

Depends on D10797

Test Plan: `ninja all check-all bitcoin-fuzzers`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10798
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
Summary:
This is a partial backport of [[bitcoin/bitcoin#22937 | core#22937]] [2e/2g]
bitcoin/bitcoin@6544ea5

The backport for this commit was split in to multiple commits to make review easier. This part deals with converting  `fs::string()` calls that
will become unsafe when migrating from `boost::filesystem` to  `std::filesystem`.

Depends on D10798

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10799
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
Summary:
This is a partial backport of [[bitcoin/bitcoin#22937 | core#22937]] [2f/2g]
bitcoin/bitcoin@6544ea5

The backport for this commit was split in to multiple commits to make review easier. This part deals with converting  `fs::string()` calls that
will become unsafe when migrating from `boost::filesystem` to  `std::filesystem`.

Depends on D10799

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10800
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
Summary:
This concludes backport of [[bitcoin/bitcoin#22937 | core#22937]] [2g/2g]
bitcoin/bitcoin@6544ea5

This commit disallows passing a `std::string` to `fs::path`, building on previous commits which removed all such calls. This is because the behavior of this constructor  on windows will be more complicated and can mangle path strings after the transition from `boost::filesystem` to `std::filesystem` in [[bitcoin/bitcoin#20744 | core#20744]] .

Depends on D10800

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10801
dekm pushed a commit to unigrid-project/daemon that referenced this pull request Oct 27, 2022
a43b8e9 build: set OSX_MIN_VERSION to 10.15 (fanquake)

Pull request description:

  Taken out of bitcoin#20744, as splitting up some of the build changes was mentioned [here](bitcoin#22937 (comment)).

  This is required to use `std::filesystem` on macOS, as support for it only landed in the libc++.dylib shipped with 10.15. So if we want to move to using `std::filesystem` for `23.0`, this bump is required.

  See also: https://developer.apple.com/documentation/xcode-release-notes/xcode-11-release-notes

  > Clang now supports the C++17 \<filesystem\> library for iOS 13, macOS 10.15, watchOS 6, and tvOS 13.

  macOS 10.15 was released in October 2019. macOS OS's seem to have a life of about 3 years, so it's possible that 10.14 will become officially unsupported by the end of 2021 and prior to the release of 23.0.

  Guix builds:
  ```bash
  bash-5.1# find guix-build-$(git rev-parse --short=12 HEAD)/output/ -type f -print0 | env LC_ALL=C sort -z | xargs -r0 sha256sum
  abc8b749be65f1339dcdf44bd1ed6ade2533b8e3b5030ad1dde0ae0cede78136  guix-build-a43b8e955558/output/dist-archive/bitcoin-a43b8e955558.tar.gz
  1edcc301eb4c02f3baa379beb8d4c78e661abc24a293813bc9d900cf7255b790  guix-build-a43b8e955558/output/x86_64-apple-darwin19/SHA256SUMS.part
  e9dbb5594a664519da778dde9ed861c3f0f631525672e17a67eeda599f16ff44  guix-build-a43b8e955558/output/x86_64-apple-darwin19/bitcoin-a43b8e955558-osx-unsigned.dmg
  11b23a17c630dddc7594c25625eea3de42db50f355733b9ce9ade2d8eba3a8f3  guix-build-a43b8e955558/output/x86_64-apple-darwin19/bitcoin-a43b8e955558-osx-unsigned.tar.gz
  257ba64a327927f94d9aa0a68da3a2695cf880b3ed1a0113c5a966dcc426eb5e  guix-build-a43b8e955558/output/x86_64-apple-darwin19/bitcoin-a43b8e955558-osx64.tar.gz
  ```

ACKs for top commit:
  hebasto:
    ACK a43b8e9
  jarolrod:
    ACK a43b8e9

Tree-SHA512: 9ac77be7cb56c068578860a3b2b8b7487c9e18b71b14aedd77a9c663f5d4bb19756d551770c02ddd12f1797beea5757b261588e7b67fb53509bb998ee8022369
delta1 added a commit to delta1/elements that referenced this pull request May 11, 2023
@bitcoin bitcoin locked and limited conversation to collaborators Aug 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants