Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: ParseByteUnits - Parse a string with suffix unit #23249

Merged
merged 1 commit into from
Nov 24, 2021

Conversation

dougEfresh
Copy link
Contributor

@dougEfresh dougEfresh commented Oct 11, 2021

A convenience utility for parsing human readable strings sizes e.g. 500G is 500 * 1 << 30

The argument/setting maxuploadtarget now accept human readable byte units [k|K|m|M|g|G||t|T]
This change backward compatible, defaults to M if no unit specified.

src/init.cpp Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/util/strencodings.h Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
@dougEfresh
Copy link
Contributor Author

dougEfresh commented Oct 13, 2021

Updated with

  • Replacing size_t in favor of uint64_t (windows/mac didn't play nice with size_t)
  • Adding InitError on parsing failure
  • Simplified case statement in ParseByteUnit
  • 1000 for lowercase 1024 base for UPPERCASE

What I need help with is overflows. I use __builtin_mul_overflow but there must be a better/easier solution.

@dougEfresh dougEfresh changed the title util: ParseByteUnits - Parse a string with a suffix unit util: ParseByteUnits - Parse a string with suffix unit Oct 13, 2021
@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 14, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #22834 (net: respect -onlynet= when making outbound connections by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@dougEfresh dougEfresh force-pushed the util-parsebyteunit branch 2 times, most recently from 933f7ff to 0d786a1 Compare October 15, 2021 17:04
Copy link
Member

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. However, I think the implementation can be improved. Why not check the last char first? If it's not a digit then extract it and validate to get the multiplier. Only then parse the number.

src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review and lightly tested ACK c5d4a59. I left some suggestions below, but none are important so feel free to ignore them.

It might be good to add a release note in the doc/ directory mentioning the new argument verification, since in theory it could make a configuration that used to work previously now cause an error on startup.

It would also be good to mention new behavior in the PR description, since current description is a little ambiguous about whether this is only adding a utility function or changing behavior.

src/init.cpp Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/test/util_tests.cpp Outdated Show resolved Hide resolved
src/test/util_tests.cpp Outdated Show resolved Hide resolved
@dougEfresh
Copy link
Contributor Author

It might be good to add a release note in the doc/ directory mentioning the new argument verification, since in theory it could make a configuration that used to work previously now cause an error on startup.

Added to release docs

It would also be good to mention new behavior in the PR description, since current description is a little ambiguous about whether this is only adding a utility function or changing behavior.

Updated PR description.

Thanks for feedback @ryanofsky

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK f1f8a77. Just suggested changes since last review (thanks!)

src/init.cpp Outdated Show resolved Hide resolved
src/test/util_tests.cpp Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK f3f8806

Some styling nits below, which would be good to address before merge.

The style rules are defined in src/.clang-format. You can use contrib/devtools/clang-format-diff.py to properly format the patch.

src/util/strencodings.h Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 6ebb476

@dougEfresh
Copy link
Contributor Author

@MarcoFalke would you mind reviewing? Thanks

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 4451a00

Copy link
Member

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 4451a00.

src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
}

uint64_t num;
if (!ParseUInt64(unit ? str.substr(0, str.size() - 1) : str, &num) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this breaking change (no longer ignoring whitespace), which I think is fine, it might be best to also reject sign characters (+-) and use ToIntegral<uint64_t>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tests for +-, whitespace, invalid unit...etc
I'm now using ToIntegral.
I've added an assert(default_multiplier > 0) but I don't know how to test this. Any ideas examples?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot have the test trigger an assert() and continue successfully. Instead, you can throw std::out_of_range from the function and try/catch it from the test (BOOST_CHECK_THROW()). But in this case maybe it is best/simpler to just remove the parameter. Then there is no need to test edge cases for it :)

* @returns optional uint64_t bytes from str or nullopt
* if ParseUInt64 is false, str is empty, trailing whitespace or overflow
*/
std::optional<uint64_t> ParseByteUnits(const std::string& str, uint64_t default_multiplier = 1024UL * 1024UL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the default multiplier have a default? It seems like the default is something that is per-arg and not project-wide.

Copy link
Contributor Author

@dougEfresh dougEfresh Nov 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I will remove the default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about 1 as a default multiplier ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing the argument? It's not needed. Either way please update doc comment above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default of 1 seems natural, so my latest change defaults to 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing the argument? It's not needed.

Indeed! I would remove that unused code. No parameter, no problem :) (what should be its default value? should it have default value? how to test 0 if we have assert()?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do need this multiplier argument. the intent of this function is to be used with maxupload, dbcache...etc. they inheritly have a default multiplier (mostly MiB). the main purpose is maintain backwards capability, e. g. maxupload=300 becomes 300m.
As I write this, i think I will change the argument to a an enum with the valid units [k|K|m|M...]
with no default value. This makes it clear and easy to verify/test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasild @MarcoFalke @promag
I added an enum ByteUnitMultiplier to src/util/strencodings.src/util/strencodings.h
Let me know what you think.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feel free to ignore the style nits

src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/util/strencodings.cpp Outdated Show resolved Hide resolved
src/util/strencodings.h Outdated Show resolved Hide resolved
@dougEfresh dougEfresh force-pushed the util-parsebyteunit branch 2 times, most recently from 7f101cc to 02df252 Compare November 13, 2021 15:16
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 1f51a25. Main changes since last review are removing default multiplier, asserting multiplier is positive, adding enum

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks ok (1f51a25). Some nits below, feel free to ignore.

I think the 3 commits should be squashed into one - subsequent ones just change stuff added by the first commit.

src/test/util_tests.cpp Outdated Show resolved Hide resolved
src/util/strencodings.h Outdated Show resolved Hide resolved
A convenience utility for human readable arguments/config e.g. -maxuploadtarget=500g
@dougEfresh
Copy link
Contributor Author

@MarcoFalke
hopefully, last review 21b58f43
Main changes since last review are removing assigning a default multiplier, adding enum for ByteUnit(s)

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 21b58f4

Just squash and 2 minor nits since 1f51a25, verified with

diff -u <(git diff 1f51a25aa5~3..1f51a25aa5) <(git show 21b58f430f) |vim -

Thanks!

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 21b58f4. Only changes since last review are dropping optional has_value call, fixing comment punctuation, squashing commits.

BOOST_CHECK_EQUAL(ParseByteUnits("1", noop).value(), 1);
BOOST_CHECK_EQUAL(ParseByteUnits("0", noop).value(), 0);

BOOST_CHECK_EQUAL(ParseByteUnits("1k", noop).value(), 1000ULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could also add a test that default_multiplier is ignored completely if passed in via the string? Maybe also add a fuzz test?

@maflcko maflcko merged commit 73ac195 into bitcoin:master Nov 24, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 24, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 24, 2021
@rebroad
Copy link
Contributor

rebroad commented Dec 24, 2021

This fails to compile on some (unsupported, e.g. cygwin) systems:

  CXX      bitcoind-bitcoind.o
In file included from ./netaddress.h:19,
                 from ./chainparams.h:11,
                 from bitcoind.cpp:10:
./util/strencodings.h:41:10: warning: multi-character character constant [-Wmultichar]
   41 |     m = 1'000'000ULL,
      |          ^~~~~

@sipa
Copy link
Member

sipa commented Dec 24, 2021

@rebroad You need a C++17 compiler.

delta1 added a commit to delta1/elements that referenced this pull request May 30, 2023
@bitcoin bitcoin locked and limited conversation to collaborators Sep 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants