New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: introduce single-separator split helper (boost::split replacement) #22953
refactor: introduce single-separator split helper (boost::split replacement) #22953
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems ok to use the already existing split function.
e332fcd
to
e3c7fe4
Compare
Force-pushed with changes as suggested by MarcoFalke (#22953 (comment)). Pinging @practicalswift as someone who has always shown interest in potential boost replacements and string processing helpers, with lots of contributions in that area. Maybe you feel like taking a look :) |
e3c7fe4
to
742fc50
Compare
I would still add a unit test even though there are tests for the internal helper. Maybe it's a little bit over the top so feel free to ignore my suggestion. (Something like this possibly: kiminuo@ac6dd5a) |
Any reason the tests aren't fixed up as well?
Also, if you add a unit test, might as extend the |
There non-tests occurrences too:
If I'm not missing something. edit: Ah, I'm not aware what the difference is between |
742fc50
to
25915e4
Compare
I agree that it's a good idea to add a unit test.
Thanks! As quick feedback, can you change the commit subject to be more verbose (e.g. "test: add unit tests for SplitString helper") and put the test into
Oh, I only |
Modified here: kiminuo@4516727. Anyway, feel free to modify the commit in whatever way you like :) |
The fuzz test I mentioned: (might not compile) diff --git a/src/test/fuzz/string.cpp b/src/test/fuzz/string.cpp
index 0c1b45b86c..e5a09c0013 100644
--- a/src/test/fuzz/string.cpp
+++ b/src/test/fuzz/string.cpp
@@ -127,6 +127,7 @@ FUZZ_TARGET(string)
int64_t amount_out;
(void)ParseFixedPoint(random_string_1, fuzzed_data_provider.ConsumeIntegralInRange<int>(0, 1024), &amount_out);
}
+ (void)SplitString(random_string_1, fuzzed_data_provider.ConsumeIntegral<char>());
{
(void)Untranslated(random_string_1);
const bilingual_str bs1{random_string_1, random_string_2}; |
Oh, I didn't see that in your previous comment. It compiles, I added another commit 👌 |
a30f721
to
8a9f940
Compare
Concept ACK - this is low-overhead for dropping more Boost dependence. Surprised we can't also nuke anything from |
Concept ACK. I don't think this boost use is very bad (it's a header-only library) but if we can replace it with functionality we already have internally, who not.
Well, |
8a9f940
to
ed24bf5
Compare
Force-pushed the alternative variant as suggested by @martinus, changing |
ed24bf5
to
8949509
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of forcing devs to remember to construct a std::string manually each time before calling the function.
src/test/util_tests.cpp
Outdated
|
||
// Case-sensitivity of the separator. | ||
{ | ||
std::vector<std::string> result = SplitString(std::string("AAA"), 'a'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems dangerous to offer a function that silently does the wrong thing when passed the "wrong" type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. This is fixed now by using an explicit interface instead of an alias, as suggested by martinus in #22953 (comment).
8949509
to
36ac60d
Compare
Windows build error seems unrelated, in
|
See #18548
On Thu, 23 Sep 2021 at 08:54, Martin Leitner-Ankerl < ***@***.***> wrote:
Windows build error seems unrelated, in wallet_address_types.py:
OSError: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#22953 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AH3PXPQDTK3VEUUU4R2KEY3UDK6JVANCNFSM5D246WWQ>
.
--
…--
Hennadii Stepanov
|
36ac60d
to
049e609
Compare
Rebased on master. |
…-util.cpp 3bb9627 refactor: remove unused boost header include in bitcoin-util.cpp (Sebastian Falbesoner) Pull request description: This header was included since the introduction of bitcoin-util in commit 13762bc, but boost was actually never used (see `git log -S boost ./src/bitcoin-util.cpp`). Cherry-picked out of bitcoin#22953, which currently needs rebase. This commit could just be merged on its own. ACKs for top commit: MarcoFalke: review ACK 3bb9627 Tree-SHA512: 201ee1aa4d49074056654203db73a473479c2b92c49df8dbf8e35979f85178013c66540a665f0f6dc0a2efef88eb091e2b088bebff85d840033dffd8ae719349
This helper uses spanparsing::Split internally and enables to replace all calls to boost::split where only a single separator is passed. Co-authored-by: Martin Ankerl <Martin.Ankerl@gmail.com> Co-authored-by: MarcoFalke <falke.marco@gmail.com>
1d066fc
to
a62e844
Compare
Rebased on master again (there was a conflict in src/rest.cpp after #24098 has been merged). |
@martinus want to take another look here? |
Code review ACK a62e844. Ran all tests. I also like that with After that PR there is only one usage of |
…r (boos… …t::split replacement)
f849e63 fuzz: SplitString with multiple separators (Martin Leitner-Ankerl) d1a9850 http: replace boost::split with SplitString (Martin Leitner-Ankerl) 0d7efcd core_read: Replace boost::split with SplitString (Martin Leitner-Ankerl) b7ab9db Extend Split to work with multiple separators (Martin Leitner-Ankerl) Pull request description: As a followup of #22953, this removes the remaining occurrences of `boost::split` and replaces them with our own `SplitString`. To be able to do so, this extends the function `spanparsing::Split` to work with multiple separators. Finally this removes 3 more files from `lint-includes.py`. ACKs for top commit: theStack: Code-review ACK f849e63 Tree-SHA512: f37d4dbe11cab2046e646045b0f018a75f978d521443a2c5001512737a1370e22b09247d5db0e5c9e4153229a4e2d66731903c1bba3713711c4cae8cedcc775d
…litString f849e63 fuzz: SplitString with multiple separators (Martin Leitner-Ankerl) d1a9850 http: replace boost::split with SplitString (Martin Leitner-Ankerl) 0d7efcd core_read: Replace boost::split with SplitString (Martin Leitner-Ankerl) b7ab9db Extend Split to work with multiple separators (Martin Leitner-Ankerl) Pull request description: As a followup of bitcoin#22953, this removes the remaining occurrences of `boost::split` and replaces them with our own `SplitString`. To be able to do so, this extends the function `spanparsing::Split` to work with multiple separators. Finally this removes 3 more files from `lint-includes.py`. ACKs for top commit: theStack: Code-review ACK f849e63 Tree-SHA512: f37d4dbe11cab2046e646045b0f018a75f978d521443a2c5001512737a1370e22b09247d5db0e5c9e4153229a4e2d66731903c1bba3713711c4cae8cedcc775d
backport: merge bitcoin#18289, bitcoin#19090, bitcoin#22859, bitcoin#18758, bitcoin#21016, bitcoin#22953, bitcoin#25027, bitcoin#20461, bitcoin#25025, bitcoin#25057, bitcoin#25068 (deboostification)
This PR adds a simple string split helper
SplitString
that takes use of the spanparsingSplit
function that was first introduced in #13697 (commit fe8a7dc). This enables to replace most calls toboost::split
, in the cases where only a single separator character is used. Note that while previous attempts to replaceboost::split
were controversial (e.g. #13751), this one has a trivial implementation: it merely uses an internal helper (that is unit tested and in regular use with output descriptiors) and converts its result from spans to strings. As a drawback though, not allboost::split
instances can be tackled.As a possible optimization, one could return a vector of
std::string_view
s (available since C++17) instead of strings, to avoid copies. This would need more carefulness on the caller sites though, to avoid potential lifetime issues, and it's probably not worth it, considering that none of the places where strings are split are really performance-critical.