New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StringUtils.h cleanup #553
StringUtils.h cleanup #553
Conversation
b170f0b
to
3a7144a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
I have Some small suggestions and very few questions,
but overall it is good, if we remove our own hacky implementations of stuff if it's inside
std::
, boost::
, absl::
....
} | ||
return -1; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were all these functions completely unused in the current QLever codebase?
I think this utils files was copied over from a previous project. I think most of them became unused,
when we refactored the parsing to use (more) proper and generic approaches.
I am not sure whether we should leave utf8ToUpper
in as the "dual" to the lowercasing, but then again, we don't
really need it, and it can be reimplemented easily, so we should probably remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with lstrip
, strip
and rstrip
are used but I don't think it's worth to keep them, especially because it looks like absl already has those kind of functions:
https://github.com/abseil/abseil-cpp/blob/e3fdd9b16a2a90c9e01e00de46605ce59bebc661/absl/strings/ascii.h#L230-L234
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some functions even had their tests in-place but the tests were the only usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would suggest replacing strip
by absl::StripAsciiWhitespace
and rstrip
by absl::StripTrailingAsciiWhitespace
.
They also have better names and reduce the technical burden of maintining the StringUtils
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that rstrip is currently taking a non-whitespace argument, so replacing is not possible unfortunately contrary to my initial assumption. I also wasn't able to find an equivalent function in absl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often do we use rstrip
with specific characters and for what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rstrip
is used once to trim trailing zeroes, haven't checked the usages of strip
though, but I assume it is less than 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strip
is also only used once, I'm not even sure if this single use makes sense from a semantic standpoint.
If you want we can talk about those 2 remaining usages
@joka921 PR comments have been addressed, please have another look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think,
If this passes the tests, we can merge it.
I will also have another close look at many of the utils soon, we don't
have to remove all of them at once.
@@ -214,7 +214,7 @@ class LocaleManager { | |||
size_t prefixLengthSoFar = 1; | |||
SortKey completeSortKey = getSortKey(s, Level::PRIMARY); | |||
while (numContributingCodepoints < prefixLength || | |||
!completeSortKey.get().starts_with(sortKey.get())) { | |||
!completeSortKey.starts_with(sortKey)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In line 221 the get()
can also go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will merge this as soon as the tests pass
Just some small adjustments to prefer built-in functions over self written ones