
It is too hard to reason about when Julia specializes arguments #51423

Open
jakobnissen opened this issue Sep 22, 2023 · 4 comments
Labels
performance Must go faster

Comments

@jakobnissen
Contributor

jakobnissen commented Sep 22, 2023

Background

Julia users quickly learn that Julia specializes code for the types of every argument, and that every function is its own type.
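For example (a quick REPL check; output elided since printing varies across Julia versions):

```julia
julia> isodd isa Function
true

julia> typeof(isodd) === typeof(iseven)  # every function has its own singleton type
false
```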
However, invariably, users will try to write a function like this:

function foo(f, v)
    v2 = map(f, v)
    y = 0
    for i in v2
        y += i
    end
    y
end

And find that it allocates like crazy:

julia> x = collect(1:1_000_000);

julia> @time foo(isodd, x);
  0.084467 seconds (2.00 M allocations: 31.448 MiB)

This is extraordinarily unintuitive, but of course it is a well-known issue.

Strangely, there are various ways you can regain specialization - for example, by using f in the body of the function:

function foo(f, v)
    v2 = map(f, v)
    y = f(first(v)) - f(first(v)) # zero, but uses f
    for i in v2
        y += i
    end
    y
end

You can also enable specialization by annotating the argument as f::F with a where F type parameter.
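For instance, a sketch of the same function with specialization forced via a type parameter:

```julia
# Annotating the argument with a type parameter forces specialization on f
function foo(f::F, v) where {F}
    v2 = map(f, v)
    y = 0
    for i in v2
        y += i
    end
    y
end
```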
And you can disable specialization using @nospecialize - but look at what happens if you do these together:

function foo(@nospecialize(f::F), v) where F
    v2 = map(f, v)
    y = 0
    for i in v2
        y += i
    end
    y
end

julia> @time foo(isodd, x);
  0.001473 seconds (3 allocations: 976.625 KiB)

@nospecialize is not respected. Things get even stranger when you apply @nospecialize to the whole function:

julia> @nospecialize function foo(f::F, v) where F
           v2 = map(f, v)
           y = 0
           for i in v2
               y += i
           end
           y
       end

julia> @time foo(isodd, x);
ERROR: UndefVarError: `foo` not defined

Oops - @nospecialize silently deleted the function there.

It seems very hard to understand when Julia specializes and when it does not.
Luckily, we have code introspection tooling like @code_warntype - but of course, this doesn't work in this case and actively misleads the investigator as to what is wrong (a known issue: #32834).

Furthermore, it's quite unclear what kind of specialization exactly is blocked by @nospecialize - as a user, the mental model of @nospecialize is presumably: "Julia just treats this @nospecialize(x::Integer) as an Integer, nothing else". But this isn't the case, and the actual semantics of @nospecialize is kind of complex, see #41931.
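As an illustration, one way to inspect which specializations actually exist for a method is Base.specializations. Note this is internal API (an assumption here: it is available on Julia 1.10 or later, and both its availability and printing may change):

```julia
# Hypothetical inspection sketch, using the internal Base.specializations
foo(f, v) = map(f, v)

foo(isodd, [1, 2, 3])  # trigger compilation

m = only(methods(foo))
# Lists the MethodInstances compiled for `foo`; whether an entry mentions
# `typeof(isodd)` reveals whether Julia specialized on the function argument.
collect(Base.specializations(m))
```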

In short: it is very hard to predict when Julia specializes, and the tools for investigating it can actively mislead.

This issue

While each of the individual issues linked above is already known, I think the overall issue really is this: what can we expect a Julia user to intuit or learn about specialization?

If specialization is critical for fast code, then users must be able to have their code specialize, which means they must be able to have a working understanding of when it specializes. I argue that it is currently so hard to understand what is happening that you need to be a Julia compiler developer, or have years of experience with Julia, to really understand it.

If specialization is considered a compiler implementation detail, then it is unacceptable that Julia chooses to despecialize code seemingly arbitrarily.

Possible solutions

Perhaps the simplest solution would be: Julia always specializes, and always respects @nospecialize.

Perhaps effects could be used to achieve this: in the original function in this post, map(f, v) could be made to specialize because Julia would know that the result of the map is used in the rest of the function. This way, lack of specialization would never affect type stability and could therefore be more of a compiler implementation detail, with the semantics being that Julia always specializes. This would also make #32834 less critical to fix, and would enable us to remove the corresponding section from the Julia performance tips.

@mbauman
Sponsor Member

mbauman commented Sep 22, 2023

Related (maybe even a duplicate?): #11339

@brenhinkeller brenhinkeller added the performance Must go faster label Sep 22, 2023
@andrewjradcliffe
Contributor

andrewjradcliffe commented Sep 27, 2023

Not mentioned above, but there is another place where specialization (or lack thereof) matters: type inference that depends on the recursion depth permitted by the Julia compiler. That is, recursive types which, with some coaxing (without using internals), the Julia compiler will infer. The pain begins when the compiler heuristics change, invalidating whatever scheme permitted inference to work when the code was written.

The need to avoid combinatorial explosion is quite real with type inference; thus, your humble author understands why such heuristics must be in place. Unfortunately, this is the reason I avoid writing Julia code, as I find that I must rewrite it a few months later, despite the fact that I am not doing anything which involves language internals. It very well could be a case of user error, i.e. such code should simply not be written.

@jakobnissen
Contributor Author

I would argue that this is not a matter of specialization: in the situation you describe, it is not that Julia chooses to despecialize; it simply cannot specialize recursive functions in general, since inference would otherwise never terminate.
In contrast, in the examples above, Julia could absolutely specialize.

@andrewjradcliffe
Contributor

Under specific conditions, this repository exhibited said problem, or, at least it did at the time of writing (tests were performed using master at August 2022, which was 1.9); here are some concrete examples.

To explain what is going on there: it is a sequence of function calls, each of which is wrapped in a type; one can (and should) view it as a singly-linked list of static size, expressed in the type domain. The list is used as follows: unwrap a node, apply the function, then proceed to the next node, stopping (via dispatch on the Nothing type) when the next node is nothing. The list is just a parametric type which, when a concrete instantiation is used, is consumed via recursive calls; each call leads to a function invocation on a new type (if the initial list had N elements, the next call sees a list of N-1 elements).
What is it used for? Re-constructing trees from various tabular representations.
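A minimal sketch of such a type-domain list (hypothetical names, not the repository's actual code):

```julia
# A singly-linked list of functions, expressed in the type domain
struct Node{F, T}
    f::F     # the function stored at this node
    next::T  # either another Node or nothing
end

# Consume the list recursively: apply this node's function, recurse on the rest.
# Each call sees a new, smaller type, so inference must recurse too.
apply(n::Node, x) = apply(n.next, n.f(x))
apply(::Nothing, x) = x

list = Node(x -> x + 1, Node(x -> 2x, nothing))
apply(list, 3)  # (3 + 1) * 2 = 8
```

Whether inference succeeds on apply for a long list depends on the compiler's recursion heuristics, which is exactly the fragility described above.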
