Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yul literal value as struct of u256 + optional formatting hint #15112

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

clonker
Copy link
Member

@clonker clonker commented May 16, 2024

Changed the definition of the solidity::yul::Literal to carry its value not as YulString but as LiteralValue struct, consisting of a u256 and an optional string representation hint. Upon converting from its data back to a string representation, it is first checked if the hint is not empty and in that case, whether value == parseValue(hint).

nikola-matic
nikola-matic previously approved these changes May 17, 2024
libyul/AsmJsonImporter.cpp Outdated Show resolved Hide resolved
libyul/AsmJsonImporter.cpp Outdated Show resolved Hide resolved
libyul/Utilities.cpp Outdated Show resolved Hide resolved
@nikola-matic nikola-matic dismissed their stale review May 17, 2024 13:00

Wanted to comment and not approve.

);
unlimited = member(_node, "unlimited").get<bool>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need a changelog entry for this, since it's a visible change (i.e. user facing).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair point, i postponed doing anything changelog related before it's clear that everyone agrees this is the right approach

Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a complete review yet. Halfway through reading it I started realizing that I thought I knew how the hint and the unlimited flag worked but I actually didn't. I'm more and more convinced that this flag is not exactly the right solution here and we need to change it a bit. It's not really a property of the literal itself. But see my comments for details.

And some style nitpicking :)

libyul/AST.h Outdated Show resolved Hide resolved
libyul/AST.h Outdated Show resolved Hide resolved
libyul/Utilities.cpp Outdated Show resolved Hide resolved
test/tools/yulInterpreter/Interpreter.cpp Outdated Show resolved Hide resolved
libyul/AsmJsonConverter.cpp Outdated Show resolved Hide resolved
libyul/AsmParser.h Outdated Show resolved Hide resolved
libyul/AsmJsonImporter.cpp Outdated Show resolved Hide resolved
libyul/AST.h Outdated
Comment on lines 54 to 55
LiteralValue(Data const& _data, std::optional<std::string> const& _hint = std::nullopt, bool _unlimited = false):
m_data(_data), m_hint(_hint ? std::make_shared<std::string>(*_hint) : nullptr), m_unlimited(_unlimited) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assert here that the hint matches the data if present?

formatLiteral() ignores m_data when m_hint is present so we must ensure that these two are always consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'd recommend this style:

Suggested change
LiteralValue(Data const& _data, std::optional<std::string> const& _hint = std::nullopt, bool _unlimited = false):
m_data(_data), m_hint(_hint ? std::make_shared<std::string>(*_hint) : nullptr), m_unlimited(_unlimited) {}
LiteralValue(Data const& _data, std::optional<std::string> const& _hint = std::nullopt, bool _unlimited = false):
m_data(_data),
m_hint(_hint ? std::make_shared<std::string>(*_hint) : nullptr),
m_unlimited(_unlimited)
{}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm asserting it in this place is going to be difficult without knowing the LiteralKind. That is what determines the validity in the end. I have added a check to formatLiteral in the utilities, though, so that the hint isn't blindly taken but it is compared to the stored u256 value.

Alternatively, we could think about storing the LiteralKind in the LiteralValue as well. Then an assert in the constructor is possible and failure would occur earlier.

libyul/AsmAnalysis.cpp Outdated Show resolved Hide resolved
libyul/AST.h Outdated Show resolved Hide resolved
Comment on lines 130 to 136
if (_literal.kind == LiteralKind::Boolean)
{
yulAssert(_literal.value() == 0 || _literal.value() == 1, "Could not format boolean literal");
result = _literal.value() == 1 ? "true" : "false";
}
else
result = _literal.value().str();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like you're formatting strings and integers the same way. How can that work? I.e. how does it preserve the distinction between 1234 and "1234"?

Actually, does the hint include the quotes? Because if it does not, then I can't see how we can distinguish values like '1234' and "1234" or '1234' and hex'1234'.

Also, if there's no distinction, the value cannot be unambiguously recovered from the hint (which matters if we want to enforce that they match).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, it would be good to have those examples as test cases if we don't have them somewhere already.

I'd also add something like this:

object "A" {
    code {
        pop(datasize(hex'616263'))
    }
    data 'abc' "1234"
}

I actually only now realized that we're allowing anything other than plain strings for literal arguments, not sure we even have it covered (only for arguments though, not in data).

Copy link
Member Author

@clonker clonker May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quoting and escaping of string literals is done in AsmPrinter, so the LiteralKind is also always required to unambiguously print and/or recover a literal. In your example you'd get

object "A" {
    code {
        pop(datasize(hex'616263'))
    }
    data 'abc' "1234"
}
// ----
// step: disambiguator
//
// object "A" {
//     code { pop(datasize("abc")) }
//     data "abc" hex"31323334"
// }

as for almost ambiguous literals

{
    let a := 1234
    let a_hex := 0x4d2
    let b := "1234"
    let b_hex := "0x4d2"
    let c := '1234'
    let c_hex := '0x4d2'
    let d := hex"1234"
    let d_hex := hex"04d2"
}
// ----
// step: disambiguator
//
// {
//     let a := 1234
//     let a_hex := 0x4d2
//     let b := "1234"
//     let b_hex := "0x4d2"
//     let c := "1234"
//     let c_hex := "0x4d2"
//     let d := "\x124"
//     let d_hex := "\x04\xd2"
// }

…limited'

- unlimited meaning an argument of a builtin function with corresponding entry in
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants