Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inference annotations that can be read by an external tool #35845

Merged
merged 2 commits into from
Jul 1, 2020

Conversation

Keno
Copy link
Member

@Keno Keno commented May 12, 2020

This builds on top of #35831, letting inference emit a custom message
whenever it gives up on infering something. These messages are intended
to be displayed by external tools, either to debug what inference is doing
(e.g. for Cthulhu) or, if an external compiler needs to disallow certain
kinds of missing information (e.g. no dynamic dispatch on GPUs), can be
used to improve diagnostics. This is mostly a proof of concept, I think
these messages/hooks need to be a bit richer for a full solution, though
I think this can be already useful if we hook up something like Cthulhu.
As a proof of concept, I hacked up a 10 line function that reads these messagse.
It works something like the following:

function bar()
    sin = eval(:sin)
    sin(1)
end
foo() = bar()
julia> Compiler3.analyze_static(Tuple{typeof(foo)})
In function: bar()
ERROR: The called function was unknown
1| function bar()
2|     sin = eval(:sin)
=>     sin(1)
4| end

[1] In foo()

The reason this needs to sit on top of #35831 is that it needs to run
with a completely fresh cache, which #35831 provides the primitives for.
Otherwise, while you could still get the annotations out, that would
only work the first time something is inferred anywhere in the system.
With a fresh cache, everything is analyzed again, and any messages like
these that are opted in to can be collected.

cc @vchuravy for Cthulhu, @maleadt @jpsamaroo for GPU-related concerns

Base automatically changed from kf/cache_refactor to master June 9, 2020 03:10
@Keno Keno changed the title WIP: Inference annotations that can be read by an external tool Inference annotations that can be read by an external tool Jun 9, 2020
@Keno
Copy link
Member Author

Keno commented Jun 29, 2020

Can I go ahead with this? Not having it on master is causing me lots of rebase issues when I want to make patches against master from my branch.

@Keno Keno requested review from JeffBezanson, vtjnash and vchuravy and removed request for JeffBezanson June 29, 2020 00:36
code_cache(interp)[result.linfo] = CodeInstance(result, min_valid, max_valid)
code_cache(interp)[result.linfo] = CodeInstance(result, min_valid, max_valid,
may_compress(interp), may_discard_trees(interp))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like these got accidentally added from a different commit, but otherwise seems okay

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, these are useful to have the remark emitting AbstractInterpreter force results to be kept such that it can correlate the remarks with the original source, but you're right, it's conceptually separate.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, maybe just needs a separate commit then, or at least mention in the commit message, so we don't drop this later?

I'd also perhaps have preferred we just added these to the existing configuration set for this (InferenceParams), rather than creating a duplicate strategy with dispatch.

by the native interpreter, but can be used by external tooling to annotate
inference results.
"""
mark_dynamic!(ni::NativeInterpreter, sv, s) = nothing
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This naming seems rather poor, since there's not really a sense in which the places this got called is related to "when inference widens a result". The only place that happens is tmerge. Perhaps this should be called simply add_remark?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. The idea here was that this answers the question "why did a dynamic dispatch get introduced here". I'm fine with just generically calling it add_remark though. Eventually we might want to have multiple classes of remarks, not just about inference precision, so that seems like a fine direction to go.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but the places we're putting it (in inference) aren't introducing dynamic dispatch (that's always there), they're merely decision points where we prune our analysis of the call graph.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what you mean by dynamic dispatch ;). I meant calls to jl_apply_generic which is often what happens when we prune the analysis here. The reason to care about that is that that'd be what's disallowed in e.g. a GPU or XLA context. Anyway, add_remark! is fine.

@Keno
Copy link
Member Author

Keno commented Jun 29, 2020

Renamed to add_remark

@@ -327,6 +343,7 @@ function abstract_call_method(interp::AbstractInterpreter, method::Method, @nosp
# avoid widening when detecting self-recursion
# TODO: merge call cycle and return right away
if call_result_unused(sv)
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded self-recursion")

Would be good if the annotations are unique so that one can grep for them. Not sure if my recommendation is the best here.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented below, no widening is happening here (that's why we're okay with returning this early)

Suggested change
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
add_remark!(interp, sv, "Delaying call result analysis until after optimization due to bounded recursion")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we really delaying analysis, aren't we just skipping it entirely? When will we come back to do this analysis?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're skipping this here because we know it's already being analyzed and we don't care about the result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but we never go back to actually annotate that result, right? In that case, I'd like to inform the user that the return type they'd be seeing in Cthulhu may not match what inference knows about this callsite. Maybe something like: "Unbounded recursion detected with unused result. Annotated return type may be wider than true result".

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bounded recursion (in the type domain), but yeah, we're putting 'Any' here specifically because we know it can't be used by the optimizer anyways, and thus want to avoid the re-work costs of putting the real type here later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So "Unbounded recursion detected with unused result. Annotated return type may be wider than true result."? Or just leave off the remark entirely and have Cthulhu check when it seems a situation like this?

base/compiler/types.jl Outdated Show resolved Hide resolved
Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some alternative wording suggestions

isdefined(ftname, :mt) || return Any # not callable. should be Bottom, but can't track this backedge right now
if !isdefined(ftname, :mt)
# should be Bottom, but can't track this backedge right now
add_remark!(interp, sv, "The function is not callable")
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_remark!(interp, sv, "The function is not callable")
add_remark!(interp, sv, "Dispatch on this function type value is currently disallowed")

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, since it's probably not useful to attempt to maintain the distinction between these:

Suggested change
add_remark!(interp, sv, "The function is not callable")
add_remark!(interp, sv, "Could not identify method table for call")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth having distinct messages for every call site, so people can grep the inference source and see what happened if necessary.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but the only reason these are separate is that we're doing a bad job at reimplementing jl_method_table_for here. These reasons shouldn't actually be distinguishable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fair enough. What's the reason we're not calling jl_method_table_for here? Or is there just not good reason?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just got forgotten, I'm guessing

base/compiler/abstractinterpretation.jl Outdated Show resolved Hide resolved
base/compiler/abstractinterpretation.jl Outdated Show resolved Hide resolved
base/compiler/abstractinterpretation.jl Outdated Show resolved Hide resolved
@@ -407,6 +424,7 @@ function abstract_call_method(interp::AbstractInterpreter, method::Method, @nosp
# continue inference, but note that we've limited parameter complexity
# on this call (to ensure convergence), so that we don't cache this result
if call_result_unused(sv)
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is misleading: there's never widening that happens here (except if the code-path would be discovered to stack overflow instead), since the result is known to be unused, though we also don't know what will happen next (though it's unlikely inlining would succeed regardless due to the graph cycle).

Suggested change
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
add_remark!(interp, sv, "Delaying call result analysis until after optimization due to unbounded recursion")

@@ -327,6 +343,7 @@ function abstract_call_method(interp::AbstractInterpreter, method::Method, @nosp
# avoid widening when detecting self-recursion
# TODO: merge call cycle and return right away
if call_result_unused(sv)
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented below, no widening is happening here (that's why we're okay with returning this early)

Suggested change
add_remark!(interp, sv, "Widening unused call result early to avoid unbounded recursion")
add_remark!(interp, sv, "Delaying call result analysis until after optimization due to bounded recursion")

base/compiler/abstractinterpretation.jl Outdated Show resolved Hide resolved
@@ -49,6 +57,7 @@ function abstract_call_gf_by_type(interp::AbstractInterpreter, @nospecialize(f),
if !isa(tname, DataType)
# can't track the backedge to the ctor right now
# for things like Union
add_remark!(interp, sv, "The called constructor is too complicated")
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_remark!(interp, sv, "The called constructor is too complicated")
add_remark!(interp, sv, "Deciding to skip computing the result due to arbitrary reasons")

I have no clue what this ftname === _TYPE_NAME condition means (from 0b8f0b7), but looks really arbitrarily random these days to return from here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying we should just remove this whole code path? The suggested message seems like it could cause frustration for people seeing it.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I couldn't tell you why these specific conditions are so bad. "Too complicated" seemed like it was trying to be too PC and cause the same frustrations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put up a PR to remove it and see what breaks :)

@Keno
Copy link
Member Author

Keno commented Jun 30, 2020

I've applied most of the suggestions and asked two questions in others (will probably still apply those, but I'd like to understand better).

Keno added a commit that referenced this pull request Jun 30, 2020
`@vtjnash` said in #35845 that he didn't know what this code was for
and didn't think it was necessary. Since he's the one that added it
and the comment says it's about backedges, I'm inclined to believe
him :). In any case, removing it passes base tests, so let's try it.
Keno added a commit that referenced this pull request Jun 30, 2020
In #35845, `@vtjnash` was saying that this function tried to
re-implement jl_method_table_for, but badly. Try to just use
it directly and see if anything breaks (Base tests seem to
pass).
Keno added a commit that referenced this pull request Jul 1, 2020
`@vtjnash` said in #35845 that he didn't know what this code was for
and didn't think it was necessary. Since he's the one that added it
and the comment says it's about backedges, I'm inclined to believe
him :). In any case, removing it passes base tests, so let's try it.
Keno added a commit that referenced this pull request Jul 1, 2020
In #35845, `@vtjnash` was saying that this function tried to
re-implement jl_method_table_for, but badly. Try to just use
it directly and see if anything breaks (Base tests seem to
pass).
@Keno Keno changed the base branch from master to kf/remwat July 1, 2020 00:16
Keno added 2 commits June 30, 2020 20:17
As well as turning off optimizations. This is useful for Cthulhu to
be able to request all unoptimized, noncompressed CodeInfos and
correlate them with inference remarks (see next commit).
This builds on top of #35831, letting inference emit a custom message
whenever it gives up on infering something. These messages are intended
to be displayed by external tools, either to debug what inference is doing
(e.g. for Cthulhu) or, if an external compiler needs to disallow certain
kinds of missing information (e.g. no dynamic dispatch on GPUs), can be
used to improve diagnostics. This is mostly a proof of concept, I think
these messages/hooks need to be a bit richer for a full solution, though
I think this can be already useful if we hook up something like Cthulhu.
As a proof of concept, I hacked up a 10 line function that reads these messagse.
It works something like the following:

```
function bar()
    sin = eval(:sin)
    sin(1)
end
foo() = bar()
```

```
julia> Compiler3.analyze_static(Tuple{typeof(foo)})
In function: bar()
ERROR: The called function was unknown
1| function bar()
2|     sin = eval(:sin)
=>     sin(1)
4| end

[1] In foo()
```

The reason this needs to sit on top of #35831 is that it needs to run
with a completely fresh cache, which #35831 provides the primitives for.
Otherwise, while you could still get the annotations out, that would
only work the first time something is inferred anywhere in the system.
With a fresh cache, everything is analyzed again, and any messages like
these that are opted in to can be collected.
@Keno
Copy link
Member Author

Keno commented Jul 1, 2020

Alright, rebased on top of #36477 and split out the may_optimize stuff into a separate commit. Also adjusted the recursion message as discussed. Will plan to merge once #36477 is in.

Base automatically changed from kf/remwat to master July 1, 2020 18:57
@Keno Keno merged commit d5cf73f into master Jul 1, 2020
@Keno Keno deleted the kf/static_errors branch July 1, 2020 20:22
simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
`@vtjnash` said in JuliaLang#35845 that he didn't know what this code was for
and didn't think it was necessary. Since he's the one that added it
and the comment says it's about backedges, I'm inclined to believe
him :). In any case, removing it passes base tests, so let's try it.
simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
In JuliaLang#35845, `@vtjnash` was saying that this function tried to
re-implement jl_method_table_for, but badly. Try to just use
it directly and see if anything breaks (Base tests seem to
pass).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants