Skip to content

Conversation

nickrolfe
Copy link
Contributor

No description provided.

@nickrolfe nickrolfe marked this pull request as ready for review November 14, 2022 15:34
@nickrolfe nickrolfe requested a review from a team as a code owner November 14, 2022 15:34
Comment on lines +292 to +293
input = "Argument[self].Element[any]" and
output = ["Argument[block].Parameter[0]", "ReturnValue.Element[?]"] and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a difference between Element[any] and Element[?]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any means any known or unknown index, while ? means an unknown index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Element[any] in an output spec means the same as Element[?].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure exactly what this meant, but after some testing it seems like Element[?] in an output spec behaves like Element[any] (i.e. both mean "an element at a known or unknown index"). I will update #10899 to document this.

As an aside, it would be nice if ? and any had consistent semantics for input and output contexts. Is there any reason we can't make that change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One would never be interested to have an output spec that writes to any content, as that would result in a huge fan-out. The reason why Element[any] in an output is still allowed, is because we need to support writes to Element[any] in "reverse stores" as well.

Assume we have a flow summary

input = Argument[0].Element[any]
output = ReturnValue

for a method get. Then when we have a call like

get(x).foo = taint

the get(x) needs to be translated into a reverse store, such that data is stored inside field foo, inside an element of [post update] x. And here we interpret the store into Element[any] as a store into the unknown element content (Element[?]); again to avoid the fan-out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when the shared library derives reverse store steps for flow summaries, it must convert Element[any] into Element[?] (and it seems we do the same for things like Element[1..]?). That makes sense to me. But it seems we could still forbid the use of Element[any] in the output of explicitly written flow summaries. Your example doesn't use Element[any] in the output, and I can find only one example in the codeql-ruby codebase (the summary for Hash#with_indifferent_access).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would absolutely also use Element[?] instead of Element[any] in output specs of summaries (we should update the summary that you mention); I was just mentioning, that Element[any] would in fact mean the same thing.

@nickrolfe nickrolfe merged commit 8d854e0 into main Nov 15, 2022
@nickrolfe nickrolfe deleted the nickrolfe/active_support_enumerable branch November 15, 2022 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants