Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile QLever with C++20 #385

Merged
merged 7 commits into from May 2, 2021
Merged

Compile QLever with C++20 #385

merged 7 commits into from May 2, 2021

Conversation

joka921
Copy link
Member

@joka921 joka921 commented May 2, 2021

This PR sets the Dockerfile and CMakeLists.txt to use C++20. It is the future and will allow us to write much more readable code, especially when it comes to template-metaprogramming.

This PR applies also several small fixes to make QLever actually compile with C++20, most notably:

  • u8-Strings like u8"äpfel" now have type char8_t[] which is not compatible with char. I have removed this prefix, because it is not necessary (we enforce Utf-8 everywhere anyway).

  • Small fixes for the stxxl-related submodules (I have contacted the team, to find out, whether stxxl is still maintained or whether we will have to maintain our own fork in the future).

# Conflicts:
#	src/util/Serializer/Serializer.h
…th plain char. The simple fix is to remove it

- We generally enforce Utf-8 everywhere, so this is fine.
Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this!

There are now no more u8 prefixes, that much I understood. I did not yet understand when you need to add the fs aka fixed_string (several cases) and when you need to add the s suffix (one case) to a string. And in RuntimeInformation.h there still are a few occurrences of u8.

Comment on lines 56 to 57
out << indentStr(indent - 1) << reinterpret_cast<const char*>(u8"├─ ")
<< _descriptor << '\n';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you keep the u8 prefix here, but not in, say, Tokenizer.h ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason, I will remove them.

*/
static constexpr auto PnCharsBaseString = fixed_string("A-Za-z");

static constexpr auto PnCharsUString = PnCharsBaseString + u8"_";
static constexpr auto PnCharsUString = PnCharsBaseString + fs("_");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the definition of the fs lambda in the code above, but did not understand why it's needed (as opposed to just calling fixed_string). Can you explain?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

  1. fs("content") is equivalent to ctll::fixed_string("content") . It is just shorter, which helps in this local case because it is needed a lot, and then the "actual" contents of the string become more visible.

  2. There is an overload for operator+ for fixed_string and compile time known char[] constants. Thus we do not have to write fs(... or fixed_string for every little string when concatenating, but only for one of them. In the places where I changed u8"content" to 'fs("content")I could also just have gone"content"`. Do you want me to make this consistent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.

@1: OK, but then maybe the comment, where the fs lambda is defined could be clarified. Right now it says "Is needed for some magical reasons".

@2: What about expressions like PnCharsUString + fs("\\-0-9") or cls(PnCharsBaseString) + fs("(\\.") + ..., is it needed there? Or is it only needed for the first string in a sum?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@1 I will.
@2 Since the + is left-to-right associative, it suffices, if the firs or second element of a long sum has the type fixed_string. I will adapt this for readability

@@ -141,7 +141,7 @@ bool TurtleParser<T>::predicateSpecialA() {
_tok.skipWhitespaceAndComments();
if (auto [success, word] = _tok.template getNextToken<TokId::A>(); success) {
(void)word;
_activePredicate = u8"<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";
_activePredicate = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"s;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the s needed here and not in other places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not needed. The s suffix makes a std::string constant, whereas the "" is just a const char* which can also be implicitly converted to a std::string. If it confuses you, I can remove it again.

constexpr ctre::fixed_string r1(R"(\xc3([\x80-\x96]|[\x98-\xb6]|[\xB8-\xBF]))");
constexpr ctre::fixed_string r2(R"([\xc4-\xcb][\x80-\xBF])");
constexpr ctre::fixed_string r3(
constexpr ctll::fixed_string r1(R"(\xc3([\x80-\x96]|[\x98-\xb6]|[\xB8-\xBF]))");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ctll:: instead of ctre:: now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't ask me. They stopped compiling with the ctre:: namespace, but it's the same type.

Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again!

Looks to me like there are now so few applications of fs that the lambda is no longer needed. But I leave that to you, it's fine either way

@joka921 joka921 merged commit 9a5735c into ad-freiburg:master May 2, 2021
@joka921 joka921 deleted the f.C++20 branch May 2, 2021 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants