@jsteemann
Contributor

Scope & Purpose

Clean up the AQL register planning code and simplify its APIs.

  • Strictly new functionality (i.e. a new feature / new option, no need for porting)
  • The behavior change can be verified via automatic tests

Testing & Verification

This change is already covered by existing tests, such as scripts/unittest shell_server_aql, scripts/unittest catch.

http://172.16.10.101:8080/view/PR/job/arangodb-matrix-pr/9164/

@jsteemann jsteemann added this to the devel milestone Mar 25, 2020
@jsteemann
Contributor Author

@jsteemann jsteemann marked this pull request as ready for review March 26, 2020 10:37
@jsteemann jsteemann requested a review from goedderz March 26, 2020 10:37
@jsteemann
Contributor Author

jsteemann commented Mar 26, 2020

@mpoeter
Contributor

mpoeter commented Mar 26, 2020

Just out of curiosity - why do you sort the RegisterId vectors? I did not see anything that relies on the order (though I could have easily missed that).

@jsteemann
Contributor Author

You are right, nothing relies on the order of the registers in the vector.
However, I am sorting them so that when we walk over them, we access the registers in ascending order. Often, when a register is accessed, its underlying data is accessed as well. If we access registers in ascending rather than random order, we may benefit from cache prefetching, at least in theory, though I doubt this will have a great effect in practice.
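
A minimal sketch of the access pattern described above, not code from this PR. It assumes `RegisterId` is a plain unsigned integer; `sortedCopy` and `sumRegisters` are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: stand-in for the RegisterId type used in the PR.
using RegisterId = unsigned int;

// Hypothetical helper: returns the registers in ascending order, so that a
// later walk visits them (and their backing data) in increasing index order.
std::vector<RegisterId> sortedCopy(std::vector<RegisterId> regs) {
  std::sort(regs.begin(), regs.end());
  // Same kind of check the PR performs with TRI_ASSERT in maintainer mode.
  assert(std::is_sorted(regs.begin(), regs.end()));
  return regs;
}

// Hypothetical consumer: reads one value per register, in the order given.
// With a sorted `regs`, the reads from `row` move forward monotonically.
long sumRegisters(std::vector<long> const& row,
                  std::vector<RegisterId> const& regs) {
  long total = 0;
  for (RegisterId r : regs) {
    total += row[r];
  }
  return total;
}
```

The result of `sumRegisters` does not depend on the order of `regs`; sorting only changes the order in which memory is touched, which is why nothing in the planning code needs to rely on it.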

Member

@goedderz goedderz left a comment

Thanks for the cleanup in the RegisterPlan!

I have two notes, though:

  • I'd like to keep the set of input registers in the ExecutorInfos. While it's currently not in use, it is useful and I want to use it when working on the register planning in the coming weeks.
  • The change from std::unordered_set<RegisterId> to a sorted std::vector<RegisterId> is too global for my taste. A local copy of the set in ExecutorInfos or similar would have sufficed for the performance gain, while keeping the changes more local and not encumbering all future work with sorting a vector everywhere, which someone will surely not know or not remember when making a change. Alternatively, some kind of minimal wrapper would help, so that not everyone has to remember to sort it every time (e.g. a class RegisterSet with only a constructor, emplace, begin and end).
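
One possible shape for the minimal wrapper suggested in the second note, as a sketch only. The class name `RegisterSet` and its four members come from the comment above; everything else (the `RegisterId` alias, the initializer-list constructor, the return type of `emplace`, the private member name) is an assumption made for illustration and is not taken from the ArangoDB code base.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <utility>
#include <vector>

// Assumption: stand-in for the RegisterId type used in the PR.
using RegisterId = unsigned int;

// Sketch: a set-like wrapper around a vector that keeps itself sorted,
// so callers never have to remember to sort.
class RegisterSet {
 public:
  using const_iterator = std::vector<RegisterId>::const_iterator;

  RegisterSet() = default;
  RegisterSet(std::initializer_list<RegisterId> regs) {
    for (RegisterId r : regs) {
      emplace(r);
    }
  }

  // Inserts reg at its sorted position. Returns {position, true} on
  // insertion and {position, false} if reg was already present.
  std::pair<const_iterator, bool> emplace(RegisterId reg) {
    auto it = std::lower_bound(_regs.begin(), _regs.end(), reg);
    if (it != _regs.end() && *it == reg) {
      return {it, false};
    }
    it = _regs.insert(it, reg);
    assert(std::is_sorted(_regs.begin(), _regs.end()));
    return {it, true};
  }

  // Iteration is always in ascending register order.
  const_iterator begin() const { return _regs.begin(); }
  const_iterator end() const { return _regs.end(); }

 private:
  std::vector<RegisterId> _regs;  // invariant: sorted ascending, no duplicates
};
```

With a wrapper along these lines, the type name documents the set semantics and the sortedness invariant lives in one place, while consumers can still iterate over contiguous, ascending storage.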

// effects when peeking at the registers' values in the AqlItemBlock later).
#ifdef ARANGODB_ENABLE_MAINTAINER_MODE
TRI_ASSERT(std::is_sorted(_registersToKeep.begin(), _registersToKeep.end()));
TRI_ASSERT(std::is_sorted(_registersToClear.begin(), _registersToClear.end()));
Member

Did you intentionally omit a similar check for _outRegs?

Suggested change
- TRI_ASSERT(std::is_sorted(_registersToClear.begin(), _registersToClear.end()));
+ TRI_ASSERT(std::is_sorted(_registersToClear.begin(), _registersToClear.end()));
+ TRI_ASSERT(std::is_sorted(_outRegs.begin(), _outRegs.end()));

// sic: It's possible that a current output register is immediately cleared!
TRI_ASSERT(regToClear < nrOutputRegisters);
- TRI_ASSERT(_registersToKeep->find(regToClear) == _registersToKeep->end());
+ TRI_ASSERT(std::find(_registersToKeep.begin(), _registersToKeep.end(), regToClear) == _registersToKeep.end());
Member

Suggested change
- TRI_ASSERT(std::find(_registersToKeep.begin(), _registersToKeep.end(), regToClear) == _registersToKeep.end());
+ TRI_ASSERT(!std::binary_search(_registersToKeep.begin(), _registersToKeep.end(), regToClear));
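
For context, a small sketch of why the suggested replacement is valid once the vector is known to be sorted. The helper names `containsLinear` and `containsSorted` and the `RegisterId` alias are hypothetical and used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: stand-in for the RegisterId type used in the PR.
using RegisterId = unsigned int;

// Linear scan: O(n), correct for any vector, sorted or not.
bool containsLinear(std::vector<RegisterId> const& regs, RegisterId reg) {
  return std::find(regs.begin(), regs.end(), reg) != regs.end();
}

// Binary search: O(log n), but only correct if regs is sorted ascending.
bool containsSorted(std::vector<RegisterId> const& regs, RegisterId reg) {
  assert(std::is_sorted(regs.begin(), regs.end()));  // precondition
  return std::binary_search(regs.begin(), regs.end(), reg);
}
```

On a sorted vector both helpers give the same answer for every register. On an unsorted vector `std::binary_search` may return a wrong result, which is the maintenance risk raised elsewhere in this review.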

for (auto const& regToKeep : _registersToKeep) {
TRI_ASSERT(regToKeep < nrInputRegisters);
- TRI_ASSERT(_registersToClear->find(regToKeep) == _registersToClear->end());
+ TRI_ASSERT(std::find(_registersToClear.begin(), _registersToClear.end(), regToKeep) == _registersToClear.end());
Member

Suggested change
- TRI_ASSERT(std::find(_registersToClear.begin(), _registersToClear.end(), regToKeep) == _registersToClear.end());
+ TRI_ASSERT(!std::binary_search(_registersToClear.begin(), _registersToClear.end(), regToKeep));

*
* @return The indices of the input registers.
*/
std::shared_ptr<std::unordered_set<RegisterId> const> getInputRegisters() const;
Member

I would like to keep this information (the set of input registers). It will make it easy to add useful assertions to AQL that will immediately help when rewriting the register planning.

Comment on lines +59 to +60
// TODO: improve on this
if (std::find(registers.begin(), registers.end(), col) == registers.end()) {
Member

Suggested change
- // TODO: improve on this
- if (std::find(registers.begin(), registers.end(), col) == registers.end()) {
+ if (!std::binary_search(registers.begin(), registers.end(), col)) {

totalNrRegs++;
}

void RegisterPlan::registerVariable(Variable const* v) {
Member

👍


/// @brief get registers to clear
- std::unordered_set<RegisterId> const& getRegsToClear() const;
+ std::vector<RegisterId> const& getRegsToClear() const;
Member

This is unnecessary and counterproductive. While I agree that there might be a performance gain when this is used in the OutputAqlItemRow, everywhere else it's just harder to read (that something is a vector gives me much less information than something being some kind of set) and error-prone. Anyone doing changes to any register set now has to know and remember that it has to be sorted.

@jsteemann
Contributor Author

Obsoleted by #11385

@jsteemann jsteemann closed this Apr 3, 2020
@jsteemann jsteemann deleted the bug-fix/cleanup-register-planning branch March 9, 2022 13:33