Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StringUtils.h cleanup #553

Merged
merged 17 commits into from Jan 27, 2022
Merged

Conversation

RobinTF
Copy link
Collaborator

@RobinTF RobinTF commented Jan 25, 2022

Just some small adjustments to prefer built-in functions over self written ones

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much.
I have Some small suggestions and very few questions,
but overall it is good, if we remove our own hacky implementations of stuff if it's inside
std:: , boost::, absl::....

src/index/CompressedRelation.cpp Show resolved Hide resolved
src/index/CompressedRelation.h Show resolved Hide resolved
src/index/Index.Text.cpp Outdated Show resolved Hide resolved
src/index/IndexBuilderTypes.h Show resolved Hide resolved
src/parser/RdfEscaping.cpp Outdated Show resolved Hide resolved
src/parser/RdfEscaping.cpp Outdated Show resolved Hide resolved
src/parser/TurtleParser.cpp Outdated Show resolved Hide resolved
}
return -1;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were all these functions completely unused in the current QLever codebase?

I think this utils files was copied over from a previous project. I think most of them became unused,
when we refactored the parsing to use (more) proper and generic approaches.

I am not sure whether we should leave utf8ToUpper in as the "dual" to the lowercasing, but then again, we don't
really need it, and it can be reimplemented easily, so we should probably remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with lstrip, strip and rstrip are used but I don't think it's worth to keep them, especially because it looks like absl already has those kind of functions:
https://github.com/abseil/abseil-cpp/blob/e3fdd9b16a2a90c9e01e00de46605ce59bebc661/absl/strings/ascii.h#L230-L234

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some functions even had their tests in-place but the tests were the only usage

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would suggest replacing strip by absl::StripAsciiWhitespace and rstrip by absl::StripTrailingAsciiWhitespace.
They also have better names and reduce the technical burden of maintining the StringUtils class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that rstrip is currently taking a non-whitespace argument, so replacing is not possible unfortunately contrary to my initial assumption. I also wasn't able to find an equivalent function in absl

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How often do we use rstrip with specific characters and for what?

Copy link
Collaborator Author

@RobinTF RobinTF Jan 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rstrip is used once to trim trailing zeroes, haven't checked the usages of strip though, but I assume it is less than 5

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strip is also only used once, I'm not even sure if this single use makes sense from a semantic standpoint.
If you want we can talk about those 2 remaining usages

src/util/StringUtils.h Show resolved Hide resolved
src/util/StringUtils.h Show resolved Hide resolved
@RobinTF
Copy link
Collaborator Author

RobinTF commented Jan 27, 2022

@joka921 PR comments have been addressed, please have another look

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think,
If this passes the tests, we can merge it.
I will also have another close look at many of the utils soon, we don't
have to remove all of them at once.

src/index/Index.Text.cpp Outdated Show resolved Hide resolved
src/index/Index.Text.cpp Show resolved Hide resolved
@@ -214,7 +214,7 @@ class LocaleManager {
size_t prefixLengthSoFar = 1;
SortKey completeSortKey = getSortKey(s, Level::PRIMARY);
while (numContributingCodepoints < prefixLength ||
!completeSortKey.get().starts_with(sortKey.get())) {
!completeSortKey.starts_with(sortKey)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line 221 the get() can also go

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will merge this as soon as the tests pass

@joka921 joka921 merged commit 3618a89 into ad-freiburg:master Jan 27, 2022
@RobinTF RobinTF deleted the string-utils-refactoring branch January 27, 2022 12:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants