Reverse lookup for DataFlowAnalyzer #14112

Open
wants to merge 1 commit into
base: develop
1 change: 1 addition & 0 deletions Changelog.md
@@ -4,6 +4,7 @@ Language Features:


Compiler Features:
+ * Optimizer: Introduced an optimized variable assignment map to improve the performance of Yul optimizer steps that rely on data flow analysis.


Bugfixes:
2 changes: 2 additions & 0 deletions libyul/CMakeLists.txt
@@ -183,6 +183,8 @@ add_library(yul
optimiser/UnusedPruner.h
optimiser/VarDeclInitializer.cpp
optimiser/VarDeclInitializer.h
+ optimiser/VariableAssignmentMap.cpp
+ optimiser/VariableAssignmentMap.h
optimiser/VarNameCleaner.cpp
optimiser/VarNameCleaner.h
)
7 changes: 3 additions & 4 deletions libyul/optimiser/DataFlowAnalyzer.cpp
@@ -270,7 +270,7 @@ void DataFlowAnalyzer::handleAssignment(std::set<YulString> const& _variables, E
auto const& referencedVariables = movableChecker.referencedVariables();
for (auto const& name: _variables)
{
- m_state.references[name] = referencedVariables;
+ m_state.references.set(name, referencedVariables);
if (!_isDeclaration)
{
// assignment to slot denoted by "name"
@@ -353,9 +353,8 @@ void DataFlowAnalyzer::clearValues(std::set<YulString> _variables)
// Also clear variables that reference variables to be cleared.
std::set<YulString> referencingVariables;
for (auto const& variableToClear: _variables)
- for (auto const& [ref, names]: m_state.references)
- if (names.count(variableToClear))
- referencingVariables.emplace(ref);
+ if (auto&& references = m_state.references.getReversedOrNullptr(variableToClear))
+ referencingVariables += *references;

// Clear the value and update the reference relation.
for (auto const& name: _variables + referencingVariables)
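
The change to `clearValues` above replaces a scan over every entry of the forward map with a single lookup in a reverse index. The following is a minimal sketch of that difference, not the PR's code: `Name` is a `std::string` stand-in for `YulString`, and `ReferenceMap`, `referencingByScan` and `referencingByIndex` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>

// Stand-in for YulString (assumption made for this sketch only).
using Name = std::string;
using ReferenceMap = std::unordered_map<Name, std::set<Name>>;

// Before: visit every entry of the forward map and test set membership.
std::set<Name> referencingByScan(ReferenceMap const& _forward, Name const& _cleared)
{
    std::set<Name> result;
    for (auto const& [variable, names]: _forward)
        if (names.count(_cleared))
            result.emplace(variable);
    return result;
}

// After: a single lookup in the reverse index.
std::set<Name> referencingByIndex(ReferenceMap const& _reversed, Name const& _cleared)
{
    auto it = _reversed.find(_cleared);
    if (it == _reversed.end())
        return {};
    return it->second;
}

// Both strategies agree on the ``Ref 1`` data from VariableAssignmentMap.h.
void example()
{
    ReferenceMap forward = {{"f", {"g"}}, {"c", {"b", "d", "e"}}, {"a", {"b", "c"}}};
    ReferenceMap reversed = {{"g", {"f"}}, {"d", {"c"}}, {"c", {"a"}}, {"e", {"c"}}, {"b", {"a", "c"}}};
    assert(referencingByScan(forward, "b") == referencingByIndex(reversed, "b"));
}
```

The scan touches every variable with a recorded expression for each cleared variable, while the index lookup only depends on the size of the result, at the cost of keeping both maps in sync on every update.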
9 changes: 5 additions & 4 deletions libyul/optimiser/DataFlowAnalyzer.h
@@ -25,6 +25,7 @@

#include <libyul/optimiser/ASTWalker.h>
#include <libyul/optimiser/KnowledgeBase.h>
+ #include <libyul/optimiser/VariableAssignmentMap.h>
#include <libyul/YulString.h>
#include <libyul/AST.h> // Needed for m_zero below.
#include <libyul/SideEffects.h>
@@ -104,7 +105,7 @@ class DataFlowAnalyzer: public ASTModifier

/// @returns the current value of the given variable, if known - always movable.
AssignedValue const* variableValue(YulString _variable) const { return util::valueOrNullptr(m_state.value, _variable); }
- std::set<YulString> const* references(YulString _variable) const { return util::valueOrNullptr(m_state.references, _variable); }
+ std::set<YulString> const* references(YulString _variable) const { return m_state.references.getOrderedOrNullptr(_variable); }
std::map<YulString, AssignedValue> const& allValues() const { return m_state.value; }
std::optional<YulString> storageValue(YulString _key) const;
std::optional<YulString> memoryValue(YulString _key) const;
@@ -179,9 +180,9 @@ class DataFlowAnalyzer: public ASTModifier
{
/// Current values of variables, always movable.
std::map<YulString, AssignedValue> value;
- /// m_references[a].contains(b) <=> the current expression assigned to a references b
- std::unordered_map<YulString, std::set<YulString>> references;
-
+ /// references.m_ordered[a].contains(b) <=> the current expression assigned to a references b
+ /// references.m_reversed[b].contains(a) <=> b is referenced by the current expression assigned to a
+ VariableAssignmentMap references;
Environment environment;
};

39 changes: 39 additions & 0 deletions libyul/optimiser/VariableAssignmentMap.cpp
@@ -0,0 +1,39 @@
#include <libyul/optimiser/VariableAssignmentMap.h>
cameel (Member) commented on May 26, 2023:

Something felt off to me about this file and I finally realized why. It looks too clean. We can't have such nice things here :P You must add the ugly license boilerplate.


using std::set;
using namespace solidity::yul;

void VariableAssignmentMap::set(YulString const& _variable, std::set<YulString> const& _references)
{
erase(_variable);
m_ordered[_variable] = _references;
for (auto&& reference: _references)
m_reversed[reference].emplace(_variable);
}

void VariableAssignmentMap::erase(YulString const& _variable)
{
for (auto&& reference: m_ordered[_variable])
if (m_reversed.find(reference) != m_reversed.end())
{
if (m_reversed[reference].size() > 1)
m_reversed[reference].erase(_variable);
else
// Only fully remove an entry if no variables other than _variable
// are contained in the set pointed to by reference.
m_reversed.erase(reference);
}
m_ordered.erase(_variable);
Comment on lines +16 to +26 (Member):

There's an easy micro-optimization here - you're looking up reference 3 times, while you could instead just find an iterator to the element and then use that. Similar with _variable - 2 separate lookups.

It's a fixed number of times so it won't change the complexity of the whole algorithm but it's very easy to do. I'm curious if it will make any kind of difference in benchmark results given that we perform this operation a lot. The lookup is something like O(log n) so nothing compared to the linear search we had before but still worth a try.

}
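
The micro-optimization suggested in the comment above could look roughly like the sketch below, which performs one `find` per key and reuses the iterator. This is an illustration under stated assumptions, not the PR's code: `Name` is a `std::string` stand-in for `YulString`, and `ReverseIndexSketch` is a hypothetical struct. It also assumes the invariant that the two maps mirror each other, so removing `_variable` and then dropping an empty set matches the original size-based branch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>

// Stand-in for YulString (assumption made for this sketch only).
using Name = std::string;

// Hypothetical, simplified model of the two maps; not the class from the PR.
struct ReverseIndexSketch
{
    std::unordered_map<Name, std::set<Name>> ordered;
    std::unordered_map<Name, std::set<Name>> reversed;

    void erase(Name const& _variable)
    {
        // One lookup for _variable; find() never inserts, unlike operator[].
        auto orderedIt = ordered.find(_variable);
        if (orderedIt == ordered.end())
            return;
        for (Name const& reference: orderedIt->second)
        {
            // One lookup per reference; the iterator is reused below.
            auto reversedIt = reversed.find(reference);
            if (reversedIt == reversed.end())
                continue;
            reversedIt->second.erase(_variable);
            // Drop the entry once no variable references `reference` any more.
            if (reversedIt->second.empty())
                reversed.erase(reversedIt);
        }
        ordered.erase(orderedIt);
    }
};

// Erasing ``c`` from the ``Ref 1`` data leaves the state shown in the erase() docstring.
void example()
{
    ReverseIndexSketch index;
    index.ordered = {{"f", {"g"}}, {"c", {"b", "d", "e"}}, {"a", {"b", "c"}}};
    index.reversed = {{"g", {"f"}}, {"d", {"c"}}, {"c", {"a"}}, {"e", {"c"}}, {"b", {"a", "c"}}};
    index.erase("c");
    assert(index.ordered.count("c") == 0);
    assert(index.reversed.count("d") == 0);
    assert(index.reversed.at("b") == std::set<Name>{"a"});
}
```

Whether this changes benchmark results is an open question in the thread; the sketch only shows that the number of hash lookups per reference drops from three to one.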

set<YulString> const* VariableAssignmentMap::getOrderedOrNullptr(YulString const& _variable) const
{
auto&& it = m_ordered.find(_variable);
return (it != m_ordered.end()) ? &it->second : nullptr;
}

set<YulString> const* VariableAssignmentMap::getReversedOrNullptr(YulString const& _variable) const
{
auto&& it = m_reversed.find(_variable);
return (it != m_reversed.end()) ? &it->second : nullptr;
}
85 changes: 85 additions & 0 deletions libyul/optimiser/VariableAssignmentMap.h
@@ -0,0 +1,85 @@
/*
This file is part of solidity.

solidity is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

solidity is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with solidity. If not, see <http://www.gnu.org/licenses/>.
*/

#pragma once

#include <libyul/YulString.h>

#include <set>
#include <unordered_map>

namespace solidity::yul
{

/**
* Class that implements a reverse lookup for an ``unordered_map<YulString, set<YulString>>`` by wrapping
* two such maps - one ordered, and one reversed, e.g.
*
* m_ordered        m_reversed
Member:

This naming is a bit confusing. What specifically is ordered here? Its type is unordered_map so it surely can't be referring to the order of elements, can it?

Maybe something like m_assignments and m_uses would be better names? Or maybe m_lValues and m_rValues?

Collaborator Author:

It's ordered in the sense that it's the opposite of reversed. m_assignments and m_uses aren't really correct either, e.g. if we have

a = x + y;

then (a,x) and (a,y) could be assignments - but what are the uses? (x, a), (y, a)? That doesn't really make sense to me either. `lvalues` and `rvalues` do make sense, but are then somewhat confusing in terms of C++ semantics. In any case, I've spent quite a while trying to come up with names for these, and I'm still convinced these are the best, especially since the whole purpose of this PR is to implement a reverse lookup.

* f -> (g,)        g -> (f,)
* c -> (b,d,e,)    d -> (c,)
* a -> (b,c,)      c -> (a,)
*                  e -> (c,)
*                  b -> (a,c,)
*
* The above example will from here onwards be referenced as ``Ref 1``.
*
* This allows us to have simultaneously managed insertion and deletion via a single interface, instead of manually
* managing this at the point of usage (see ``DataFlowAnalyzer``).
*/
class VariableAssignmentMap
{
public:
VariableAssignmentMap() = default;
/**
* Insert a set of values for the provided key ``_variable`` into ``m_ordered`` and ``m_reversed``.
* This method will erase all references of ``_variable`` from both maps before performing the insertion,
* akin to container assignment with subscript operator, i.e. container[index] = value.
* For example, if ``_variable`` is ``x`` and ``_references`` is ``{"y", "z"}``, the following would be added to ``Ref 1``:
*
* m_ordered        m_reversed
* x -> (y, z,)     y -> (x,)
*                  y -> (z,)
Member:

Suggested change
- * y -> (z,)
+ * z -> (x,)

*
* @param _variable current expression variable
* @param _references all referenced variables in the expression assigned to ``_variable``
*/
void set(YulString const& _variable, std::set<YulString> const& _references);

/**
* Erase entries in both maps based on provided ``_variable``. The behaviour is the same as ``set("x", {})``.
* For example, after deleting ``c`` for ``Ref 1``, ``Ref 1`` would contain the following:
Contributor:

Can we say that erase(x) is exactly the same as set(x, {})?

Collaborator Author:

Eh, yes and no - since we do m_ordered[_variable] = _references;, which will still make an insertion for key x. I don't think it would ultimately affect the behaviour. I could insert an empty check for _references however, in which case your statement would be fully correct.

edit: Actually, I'm assuming you already knew that this will insert a key with an empty value, so yes, it's exactly the same as set(x, {}).

Member:

I don't get your answer here. You're saying that it's not the same but then that it's the same after all. It doesn't look the same to me.

> I could insert an empty check for _references however, in which case your statement would be fully correct.

I'd add this check because I don't think we care about distinguishing x being assigned an empty set of variables from not being assigned anything. The former can't even be expressed in the language. With this check the behavior of the container will be more consistent - currently with getOrderedOrNullptr() you have to check for an empty set explicitly, while with getReversedOrNullptr() you can assume you'll always get nullptr instead.

Member:

Actually, now that I think about it, it can be expressed after all. x can just be assigned a constant expression that does not depend on other variables. So yeah, depends on whether we want the ability to express that. Does not seem to me like we're using that distinction for anything currently.

Collaborator Author:

The edit's the final answer - i.e. yes, it's exactly as Chris suggested; adding the empty _references check would alter the behaviour of the analysis (I would assume we'd see failing tests, but I'd have to check). I.e. we fetch by key, and then use the value set to either perform arithmetic (i.e. add two sets together), or lookup, neither of which need an empty set check.

Member:

ok, you're right about this altering the analysis.

But I'm still confused as to why you think erase(x) and set(x, {}) would be equivalent. This just does not seem true to me. Are you referring to the fact that m_ordered[_variable] will modify m_ordered and insert the key if it's not there? You're still doing m_ordered.erase() at the end of the function, so yes, it will technically insert the key, but the key won't be there when the function finishes. So I don't think it's true that "The behaviour is the same as ``set("x", {})``". This bit should be removed from the docstring unless I'm missing something here.
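
The difference discussed in this thread can be checked with a small model. The sketch below condenses the `set` and `erase` bodies from VariableAssignmentMap.cpp as shown in this PR, with `std::string` standing in for `YulString` (an assumption) and a hypothetical class name `AssignmentMapSketch`. It is meant only to show which keys remain after each call, not to settle how the docstring should be worded.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>

// Stand-in for YulString (assumption made for this sketch only).
using Name = std::string;

// Hypothetical condensed copy of the class under review.
class AssignmentMapSketch
{
public:
    void set(Name const& _variable, std::set<Name> const& _references)
    {
        erase(_variable);
        // operator[] creates the key even when _references is empty.
        m_ordered[_variable] = _references;
        for (auto const& reference: _references)
            m_reversed[reference].emplace(_variable);
    }

    void erase(Name const& _variable)
    {
        for (auto const& reference: m_ordered[_variable])
            if (m_reversed.find(reference) != m_reversed.end())
            {
                if (m_reversed[reference].size() > 1)
                    m_reversed[reference].erase(_variable);
                else
                    m_reversed.erase(reference);
            }
        // Any key created by operator[] above is removed again here.
        m_ordered.erase(_variable);
    }

    std::set<Name> const* getOrderedOrNullptr(Name const& _variable) const
    {
        auto it = m_ordered.find(_variable);
        return it != m_ordered.end() ? &it->second : nullptr;
    }

    std::set<Name> const* getReversedOrNullptr(Name const& _variable) const
    {
        auto it = m_reversed.find(_variable);
        return it != m_reversed.end() ? &it->second : nullptr;
    }

private:
    std::unordered_map<Name, std::set<Name>> m_ordered;
    std::unordered_map<Name, std::set<Name>> m_reversed;
};

void example()
{
    AssignmentMapSketch refs;
    refs.set("x", {});
    // After set(x, {}) the key exists and maps to an empty set.
    assert(refs.getOrderedOrNullptr("x") != nullptr);
    assert(refs.getOrderedOrNullptr("x")->empty());
    refs.erase("x");
    // After erase(x) the key is gone.
    assert(refs.getOrderedOrNullptr("x") == nullptr);
}
```

In this model the two calls leave the reverse map in the same state but differ in the forward map: `set(x, {})` keeps `x` with an empty set, while `erase(x)` removes the key, so `getOrderedOrNullptr` distinguishes them.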

*
* m_ordered        m_reversed
* f -> (g,)        g -> (f,)
* a -> (b,c,)      c -> (a,)
*                  b -> (a,)
*
* @param _variable variable to erase
*/
void erase(YulString const& _variable);
std::set<YulString> const* getOrderedOrNullptr(YulString const& _variable) const;
std::set<YulString> const* getReversedOrNullptr(YulString const& _variable) const;

private:
/// m_ordered[a].contains(b) <=> the current expression assigned to ``a`` references ``b``
std::unordered_map<YulString, std::set<YulString>> m_ordered;
/// m_reversed[b].contains(a) <=> the current expression assigned to ``a`` references ``b``
std::unordered_map<YulString, std::set<YulString>> m_reversed;
};

} // solidity::yul