StringUtils.h cleanup #553

Merged
17 commits merged on Jan 27, 2022
4 changes: 2 additions & 2 deletions src/index/Index.Text.cpp
@@ -482,7 +482,7 @@ void Index::calculateBlockBoundariesImpl(
if (forcedBlockStartSortKey.get() >= prefixSortKey.get()) {
break;
}
- if (prefixSortKey.get().starts_with(forcedBlockStartSortKey.get())) {
+ if (prefixSortKey.starts_with(forcedBlockStartSortKey)) {
prefixSortKey = std::move(forcedBlockStartSortKey);
prefixLength = MIN_WORD_PREFIX_SIZE;
return;
@@ -518,7 +518,7 @@ void Index::calculateBlockBoundariesImpl(
// The `startsWith` also correctly handles the case where
// `nextPrefixSortKey` is "longer" than `MIN_WORD_PREFIX_SIZE`, e.g. because
// of unicode ligatures.
- bool samePrefix = nextPrefixSortKey.get().starts_with(prefixSortKey.get());
+ bool samePrefix = nextPrefixSortKey.starts_with(prefixSortKey);
if (tooShortButNotEqual || !samePrefix) {
blockBoundaryAction(i);
numBlocks++;
2 changes: 2 additions & 0 deletions src/index/IndexBuilderTypes.h
@@ -139,6 +139,8 @@ auto getIdMapLambdas(std::array<ItemMapManager, Parallelism>* itemArrayPtr,
map.assignNextId(ad_utility::convertToLanguageTaggedPredicate(
lt._triple[1], lt._langtag));
auto& spoIds = *(res[0]); // ids of original triple
+ // TODO replace the std::array by an explicit IdTriple class,
+ // then the emplace calls don't need the explicit type.
// extra triple <subject> @language@<predicate> <object>
res[1].emplace(
std::array<Id, 3>{spoIds[0], langTaggedPredId, spoIds[2]});
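The TODO above can be illustrated with a minimal sketch. It assumes that the entries of `res` behave like `std::optional`; the names `IdTriple`, `makeArrayTriple`, `makeIdTriple`, and `checkTripleSketch`, as well as the `Id` alias, are hypothetical and not taken from the code base.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption: a plain integer stands in for the project's Id type.
using Id = std::uint64_t;

// Hypothetical triple type with a constructor, as the TODO suggests.
struct IdTriple {
  Id subject;
  Id predicate;
  Id object;
  IdTriple(Id s, Id p, Id o) : subject{s}, predicate{p}, object{o} {}
};

// With std::array, emplace needs the explicit type, because std::array
// has no constructor that takes three separate elements.
inline std::optional<std::array<Id, 3>> makeArrayTriple(Id s, Id p, Id o) {
  std::optional<std::array<Id, 3>> result;
  result.emplace(std::array<Id, 3>{s, p, o});
  return result;
}

// With IdTriple, emplace forwards the three ids to the constructor.
inline std::optional<IdTriple> makeIdTriple(Id s, Id p, Id o) {
  std::optional<IdTriple> result;
  result.emplace(s, p, o);
  return result;
}

// Small self-check: both variants store the same three ids.
inline void checkTripleSketch() {
  auto viaArray = makeArrayTriple(1, 2, 3);
  auto viaClass = makeIdTriple(1, 2, 3);
  assert((*viaArray)[0] == viaClass->subject);
  assert((*viaArray)[1] == viaClass->predicate);
  assert((*viaArray)[2] == viaClass->object);
}
```

Whether an explicit class is worth it here depends on how many call sites construct triples; the sketch only shows why the explicit type would become unnecessary.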
2 changes: 1 addition & 1 deletion src/index/StringSortComparator.h
@@ -214,7 +214,7 @@ class LocaleManager {
size_t prefixLengthSoFar = 1;
SortKey completeSortKey = getSortKey(s, Level::PRIMARY);
while (numContributingCodepoints < prefixLength ||
- !completeSortKey.get().starts_with(sortKey.get())) {
+ !completeSortKey.starts_with(sortKey)) {
Review comment (Member): The `get()` in line 221 can also be removed.

auto [numCodepoints, prefix] =
ad_utility::getUTF8Prefix(s, prefixLengthSoFar);
auto nextLongerSortKey = getSortKey(prefix, Level::PRIMARY);
4 changes: 4 additions & 0 deletions test/StringSortComparatorTest.cpp
@@ -237,6 +237,10 @@ TEST(LocaleManager, PrefixSortKey) {

ASSERT_GT(a.size(), b.size());
ASSERT_TRUE(a.starts_with(b));
+ // Also test the defaulted consistent comparison.
+ ASSERT_GT(a, b);
+ ASSERT_EQ(a, a);
+ ASSERT_NE(a, b);
ASSERT_FALSE(comp("vivæ", "vivae", LocaleManager::Level::PRIMARY));
ASSERT_FALSE(comp("vivæ", "vivae", LocaleManager::Level::PRIMARY));
}