
Error constructing a multivector #7

Closed
MasonProtter opened this issue Feb 26, 2019 · 27 comments

Comments

@MasonProtter

MasonProtter commented Feb 26, 2019

julia> using Grassmann

julia> basis"+++"
(⟨+++⟩, v, v₁, v₂, v₃, v₁₂, v₁₃, v₂₃, v₁₂₃)

julia> v + v1 + v2
Internal error: encountered unexpected error in runtime:
MethodError(f=typeof(Base.string)(), args=(Expr(:call, Symbol("!"), Expr(:., :typ, :(:mutable))),), world=0x0000000000000eb9)
rec_backtrace at /Users/mason/julia/src/stackwalk.c:94
record_backtrace at /Users/mason/julia/src/task.c:217
jl_throw at /Users/mason/julia/src/task.c:417
jl_method_error_bare at /Users/mason/julia/src/gf.c:1649
jl_method_error at /Users/mason/julia/src/gf.c:1667
jl_lookup_generic_ at /Users/mason/julia/src/gf.c:2195
jl_apply_generic at /Users/mason/julia/src/gf.c:2215
lift_leaves at ./compiler/ssair/passes.jl:300
getfield_elim_pass! at ./compiler/ssair/passes.jl:658
run_passes at ./compiler/ssair/driver.jl:123
optimize at ./compiler/optimize.jl:164
typeinf at ./compiler/typeinfer.jl:35
typeinf_ext at ./compiler/typeinfer.jl:576
typeinf_ext at ./compiler/typeinfer.jl:613
jfptr_typeinf_ext_1 at /Users/mason/julia/usr/lib/julia/sys.dylib (unknown line)
jl_apply_generic at /Users/mason/julia/src/gf.c:2219 [inlined]
jl_apply at /Users/mason/julia/src/./julia.h:1571 [inlined]
jl_type_infer at /Users/mason/julia/src/gf.c:277
jl_compile_method_internal at /Users/mason/julia/src/gf.c:1819
jl_fptr_trampoline at /Users/mason/julia/src/gf.c:1863
do_call at /Users/mason/julia/src/interpreter.c:323
eval_stmt_value at /Users/mason/julia/src/interpreter.c:362 [inlined]
eval_body at /Users/mason/julia/src/interpreter.c:759
jl_interpret_toplevel_thunk_callback at /Users/mason/julia/src/interpreter.c:885
Interpreter frame (ip: 0)
Core.CodeInfo(code=Array{Any, (2,)}[
  Expr(:call, :+, :v, :v1, :v2),
  Expr(:return, SSAValue(1))], codelocs=Array{Int32, (2,)}[1, 1], method_for_inference_limit_heuristics=nothing, ssavaluetypes=2, linetable=Array{Any, (1,)}[Core.LineInfoNode(mod=Main, method=Symbol("top-level scope"), file=:none, line=0, inlined_at=0)], ssaflags=Array{UInt8, (0,)}[], slotflags=Array{UInt8, (0,)}[], slotnames=Array{Any, (0,)}[], inferred=false, inlineable=false, propagate_inbounds=false, pure=false)jl_interpret_toplevel_thunk at /Users/mason/julia/src/interpreter.c:894
jl_toplevel_eval_flex at /Users/mason/julia/src/toplevel.c:764
jl_toplevel_eval at /Users/mason/julia/src/toplevel.c:773 [inlined]
jl_toplevel_eval_in at /Users/mason/julia/src/toplevel.c:793
eval at ./boot.jl:328
eval_user_input at /Users/mason/julia/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:85
macro expansion at /Users/mason/julia/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:117 [inlined]
#26 at ./task.jl:259
jl_apply at /Users/mason/julia/src/./julia.h:1571 [inlined]
start_task at /Users/mason/julia/src/task.c:572
1 + 1v₁ + 1v₂
@chakravala
Owner

That is actually not a consequential error, because you still get the result at the end, and if you evaluate the input another time you do not get any error. This is very strange indeed; I have noticed this behavior before in some rare cases. It would help to know what exactly causes it.

Thanks for sharing your version of the error message, on my computer it is like this

julia> using Grassmann

julia> basis"+++"
(⟨+++⟩, v, v₁, v₂, v₃, v₁₂, v₁₃, v₂₃, v₁₂₃)

julia> v+v1+v2
Internal error: encountered unexpected error in runtime:
MethodError(f=typeof(Base.string)(), args=(Expr(:call, Symbol("!"), Expr(:., :typ, :(:mutable))),), world=0x0000000000000eb9)
unknown function (ip: 0x7f0edd30ed58)
jl_throw at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0edd2c69fa)
unknown function (ip: 0x7f0edd2c6b19)
jl_apply_generic at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0ed1fd6525)
unknown function (ip: 0x7f0ed1fde2ee)
unknown function (ip: 0x7f0ed2011f97)
unknown function (ip: 0x7f0ed2013025)
unknown function (ip: 0x7f0ed20143ae)
unknown function (ip: 0x7f0ed2015300)
unknown function (ip: 0x7f0ed2015733)
unknown function (ip: 0x7f0ed2015b00)
jl_apply_generic at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0edd2c8721)
unknown function (ip: 0x7f0edd2c907e)
jl_fptr_trampoline at /usr/bin/../lib/libjulia.so.1 (unknown line)
jl_apply_generic at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0edd421057)
unknown function (ip: 0x7f0edd420d92)
unknown function (ip: 0x7f0edd421a4f)
unknown function (ip: 0x7f0edd422144)
Interpreter frame (ip: 0)
Core.CodeInfo(code=Array{Any, (2,)}[
  Expr(:call, :+, :v, :v1, :v2),
  Expr(:return, SSAValue(1))], codelocs=Array{Int32, (2,)}[1, 1], method_for_inference_limit_heuristics=nothing, ssavaluetypes=2, linetable=Array{Any, (1,)}[Core.LineInfoNode(mod=Main, method=Symbol("top-level scope"), file=:none, line=0, inlined_at=0)], ssaflags=Array{UInt8, (0,)}[], slotflags=Array{UInt8, (0,)}[], slotnames=Array{Any, (0,)}[], inferred=false, inlineable=false, propagate_inbounds=false, pure=false)unknown function (ip: 0x7f0edd42260c)
unknown function (ip: 0x7f0edd2f979c)
jl_toplevel_eval_in at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0ed2050f51)
jl_apply_generic at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0ed21e83f5)
unknown function (ip: 0x7f0ed21e865b)
jl_apply_generic at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f0edd2e0617)
unknown function (ip: 0xffffffffffffffff)
1 + 1v₁ + 1v₂

julia> v+v1+v2
1 + 1v₁ + 1v₂

as you can see, the error does not pose any real problem yet, since the result is still computed correctly

@chakravala
Owner

I'm labeling this won't-fix, because I'm not entirely sure what is going on with this yet

it does not seem to actually cause any issues with obtaining any results though

@MasonProtter
Author

Removing all uses of @pure solves this issue (https://github.com/MasonProtter/Grassmann.jl/commit/54f7a7ff0b08458f5e2de91b56fd865bbabb2c2c)

@Seelengrab

Ref and Ref.

Just to reiterate: @pure in Julia is not equivalent to the commonly used definition of a pure function in other languages and shouldn't be used unless you know EXACTLY what it means.

At the very least (looking through the quoted parts) there are A LOT of places where @pure was used for mutable stuff, which (as explained by the docs) is already not allowed:

A pure function can only depend on immutable information. This also means a @pure function cannot use any global mutable state, including generic functions. Calls to generic functions depend on method tables which are mutable global state. Use with caution, incorrect @pure annotation of a function may introduce hard to identify bugs. Double check for calls to generic functions.
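The warning quoted above can be made concrete with a minimal, hypothetical sketch (the `table`, `unsafe_lookup`, and `f` names are invented for illustration and appear in no package): when a @pure function reads mutable global state, the compiler is allowed to constant-fold the call, so a later mutation of that state may be silently ignored.

```julia
# ILLUSTRATIVE ONLY: exactly the kind of @pure misuse the docs warn about.
const table = Dict{Int,Int}(1 => 10)   # mutable global state

Base.@pure unsafe_lookup(n::Int) = get(table, n, 0)   # INVALID: reads mutable state

f() = unsafe_lookup(1)   # the compiler may fold this call to the constant 10

f()              # 10
table[1] = 99    # mutate the state the "pure" function depends on
f()              # may STILL return 10 if the old result was constant-folded
```

Whether the stale value actually appears depends on the compiler version and inlining decisions, which is precisely what makes such bugs hard to identify.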

@chakravala
Owner

chakravala commented Feb 27, 2019

There are many situations where I used @pure in a valid way, and some places where I hastily made changes to code, where the situation may have changed. So this is definitely not final, but I'm not going to change anything until I have the exact source lines of code for the problem narrowed down.

In some cases, it may not look like a valid usage of @pure to non-authors of this package, because they are not familiar with the data flow of the variables and the precise intention behind everything.

This is going to require a more precise analysis, which will be very difficult for non-authors to do independently, since they do not know the implicit intentions and principles used in this code.

Also, I am constantly making foundational changes to the design of the code, so this package is not in a stable state yet... this is v0.0.x

@StefanKarpinski

There are many situations where I used @pure in a valid way

Almost no non-trivial Julia method definitions are actually @pure: if a definition calls a generic function it is impure since dispatch depends on method tables and method tables are mutable global state. Do your @pure-annotated functions actually call no generic functions that might ever have methods added to them?

Again, why do you feel it's necessary to use the non-exported Base.@pure annotation so much?

@chakravala
Owner

In this particular package, the entire design is centered around passing parametric type data around ahead of compile time, in order to make the generated code more efficient, while also remaining fully flexible for both numerical and symbolic computation with many different dimensional spin algebras.

That is why, if you naively remove all usage of @pure, you will see drastic performance costs. In the JuliaLang discourse I have already explained why I used it; without @pure you are not going to be able to get efficient code in many instances ahead of compile time.

Its usage probably needs to be investigated more, but it is part of the fundamental design here.

@StefanKarpinski

That sounds like it might be a better use case for generated functions.

@eschnett
Contributor

In https://github.com/eschnett/FastArrays.jl, I am using generated functions in many places. (Recently, Julia's compiler became better, so after benchmarking I am able to convert some generated functions to generic functions.) I do this for the same reason: I need these functions to be fast, and any kind of genericity that still remains at run time has an unacceptable performance cost. If the compiler completely specializes functions based on rank, dimension, types, etc., then LLVM can optimize and inline away many operations.

For example, if you have a (3 x N) array, and the compiler does not know that the first dimension is 3, then there needs to be an integer multiplication at run time. If the compiler does know, however, then the generated code becomes much more efficient.
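That point can be sketched with a hypothetical `FixedRows` wrapper (not part of any package): encoding the number of rows in the type makes the row stride a compile-time constant, so the index arithmetic can be folded instead of multiplied at run time.

```julia
# Illustrative sketch: the row count R lives in the type, not in a field.
struct FixedRows{R}
    data::Vector{Float64}    # column-major storage, R rows per column
end

# Linear index for element (i, j); R is a compile-time constant here,
# so `(j - 1) * R` can be strength-reduced or folded by the compiler.
Base.getindex(a::FixedRows{R}, i::Int, j::Int) where {R} =
    a.data[(j - 1) * R + i]

a = FixedRows{3}(collect(1.0:12.0))   # a 3×4 array
a[2, 3]                                # row 2, column 3 → 8.0
```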

I assume similar issues are true for this package, and I'd also assume that generated functions are the way to go.

@chakravala
Owner

chakravala commented Feb 28, 2019

Let's make this discussion a bit more concrete by recalling the example cited from my discussion post

Since then, I have significantly changed the implementation of this particular example in this repository already, but this outdated version from the post makes for a good continued discussion example:

struct data{N,G} end
const binomsum_cache = [[1]];

Base.@pure function pure_binomsum(n::Int, i::Int)
    for k=length(binomsum_cache)+1:n
        push!(binomsum_cache, cumsum([binomial(k,q) for q=0:k]))
    end
    i!=0 ? binomsum_cache[n][i] : 0
end

Which is then used to define a test function, for which we require precompiled binomsum data

pure(::data{N,G}) where {N,G} = float(pure_binomsum(N,G));

What I want is for the code to look like this

julia> @code_warntype pure(data{5,4}())
Body::Float64
1 ─     return 26.0

However, as demonstrated in my discussion post, using @pure yielded the best code, while not using it, or resorting to @inline instead, makes things somewhat worse either way.

To move the discussion forward, it would be best if you could demonstrate your proposals with this test.

@eschnett
Contributor

This becomes:

struct data{N,G} end
const binomsum_cache = [[1]]

@generated function binomsum(::Val{n}, ::Val{i}) where {n, i}
    for k=length(binomsum_cache)+1:n
        push!(binomsum_cache, cumsum([binomial(k,q) for q=0:k]))
    end
    res = i!=0 ? binomsum_cache[n][i] : 0
    quote
        $res
    end
end

pure(::data{N,G}) where {N,G} = float(binomsum(Val(N), Val(G)))

with

julia> @code_warntype pure(data{5,4}())
Body::Float64
1 ─     return 26.0

The main trick with generated functions is that one essentially writes a macro, but the types of the function arguments are known. One can use Val to convert values to types. This allows a clear distinction between calculations that happen at compile time and those that happen at run time. If there are many non-trivial calculations left at run time, then the resulting quoting can become messy, and splitting things into two functions makes sense (one for compile-time and another for run-time calculations). Here, of course, all calculations happen at compile time, so the quoting is trivial.
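The two-function split described above can be sketched as follows, with illustrative names (`compute_offsets` and `runtime_work` are invented for this example): the helper is ordinary code called only while the generated body is being built, and the returned quote contains only the run-time work, with the compile-time result spliced in as a constant.

```julia
# Ordinary function: runs only at code-generation time.
compute_offsets(n) = cumsum(1:n)

@generated function runtime_work(x, ::Val{n}) where {n}
    offs = compute_offsets(n)     # compile-time calculation
    quote
        x + $(offs[end])          # run-time work, with constant spliced in
    end
end

runtime_work(10, Val(4))          # 10 + (1+2+3+4) = 20
```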

@eschnett
Contributor

... since thread-safety was mentioned somewhere: I don't think the code above is thread-safe. One needs a mutex around the for loop that pushes to the global variable. Since this for loop is only executed at compile time, this does not affect run-time performance.
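A sketch of the locking suggested above, assuming the cache from the earlier example (`grow_cache!` is an illustrative name): a ReentrantLock around the cache-growing loop makes concurrent first-time calls safe, and since the loop runs only at code-generation time, the lock never appears on the run-time path.

```julia
const binomsum_cache = [[1]]
const cache_lock = ReentrantLock()

# Grow the cache up to n under a lock, so concurrent callers cannot
# push! to the global vector at the same time.
function grow_cache!(n::Int)
    lock(cache_lock) do
        for k = length(binomsum_cache)+1:n
            push!(binomsum_cache, cumsum([binomial(k, q) for q = 0:k]))
        end
    end
end

grow_cache!(5)
binomsum_cache[5]   # cumsum of binomial(5, 0:5) = [1, 6, 16, 26, 31, 32]
```

Note that `binomsum_cache[5][4] == 26` is exactly the value appearing as 26.0 in the @code_warntype output above.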

@chakravala
Owner

chakravala commented Feb 28, 2019

since thread-safety was mentioned somewhere: I don't think the code above is thread-safe.

Actually, the thread safety has already been completely addressed since then, in a separate follow-up discussion, where it turned out that the thread-safety issue was caused by a logging message that has since been completely removed, so it is now resolved.

Thank you for your example, I will take that into consideration for the next release, which will have a variety of planned updates and modifications.

@chakravala
Owner

chakravala commented Mar 2, 2019

@StefanKarpinski @eschnett your solution is unfortunately too much of a degradation in performance

pur_summage(x) = sum([pure_binomsum(x,y) for y ∈ 0:x])
gen_summage(x) = sum([binomsum(Val(x),Val(y)) for y ∈ 0:x])

this _summage test shows that the @pure option is better

julia> @btime pur_summage(7)
  69.492 ns (1 allocation: 144 bytes)
448

julia> @btime gen_summage(7)
  48.298 μs (3 allocations: 208 bytes)
448

Perhaps also @JeffBezanson or @vtjnash could comment on this. If there is something the Julia language could improve to solve this, it would be to have some kind of softer @pure macro, one which doesn't have such strict requirements.

Perhaps this needs an entirely new macro name, because it is not @inline and not @pure, and the proposed @generated is not exactly it either.

@MasonProtter
Author

MasonProtter commented Mar 2, 2019

The difference in performance between the pure and generated functions is all due to dynamic dispatch of the generated function in the loop.

For this function, you don't actually need @generated or @pure.

struct data{N,G} end
const binomsum_cache = [[1]];

Base.@pure function pure_binomsum(n::Int, i::Int)
    for k=length(binomsum_cache)+1:n
        push!(binomsum_cache, cumsum([binomial(k,q) for q=0:k]))
    end
    i!=0 ? binomsum_cache[n][i] : 0
end

function reg_binomsum(n::Int, i::Int)
    for k=length(binomsum_cache)+1:n
        push!(binomsum_cache, cumsum([binomial(k,q) for q=0:k]))
    end
    i!=0 ? binomsum_cache[n][i] : 0
end

pur_summage(x) = sum([pure_binomsum(x,y) for y ∈ 0:x])
reg_summage(x) = sum([reg_binomsum(x,y) for y ∈ 0:x])

julia> @btime pur_summage(7);
   81.026 ns (1 allocation: 144 bytes)

julia> @btime reg_summage(7)
   81.525 ns

The reason for this is that you've already implemented an (unsafe, due to constant redefinition) version of memoization in the actual definition of binomsum.

@chakravala
Owner

chakravala commented Mar 2, 2019

Thank you for the benchmarks, but the purpose of @pure or @generated is to help get this

julia> @code_warntype pure(data{5,4}())
Body::Float64
1 ─     return 26.0

That's why the function needs a macro like that. My point, however, is that maybe a different macro is needed, one which has much less strict requirements than @pure but is also faster than @generated.

Perhaps this macro does not exist yet, and needs to be added to the language?

It needs both performance, and needs to be able to be precomputed ahead of time from type data.

@MasonProtter
Author

and needs to be able to be precomputed ahead of time from type data.

The point that I believe you are missing is that the function already is precomputed ahead of time without the @pure annotation because it's memoized.

The first time one computes reg_binomsum(n, i) it's slow because it needs to build up the cache of values and all the subsequent times it's fast because everything is already computed and stored in a cache.
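The pattern being described can be sketched with a classic stand-in (Fibonacci here, purely illustrative; `memo_fib` and `fib_cache` are invented names): the first call for a given input fills the cache, and every later call is a dictionary lookup.

```julia
const fib_cache = Dict{Int,Int}()

function memo_fib(n::Int)
    haskey(fib_cache, n) && return fib_cache[n]            # fast path: cached
    result = n ≤ 1 ? n : memo_fib(n - 1) + memo_fib(n - 2) # slow path: compute once
    fib_cache[n] = result
    return result
end

memo_fib(30)   # first call builds the cache; repeated calls are O(1) lookups
```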

@chakravala
Owner

chakravala commented Mar 2, 2019

The point that I believe you are missing is that the function already is precomputed ahead of time without the @pure annotation because it's memoized.

You are right about that, but it is irrelevant to what I'm talking about. Whether the value is memoized or not is completely irrelevant to obtaining code like this:

julia> @code_warntype pure(data{5,4}())
Body::Float64
1 ─     return 26.0

You only get that kind of code when you use @pure or @generated currently. It has nothing to do with whether the function is memoized or cached or not. The caching is irrelevant to this aspect.

@MasonProtter
Author

Yes; instead, the code it produces is essentially answer in cache ? answer : compute answer, which in practice is just as performant, without needing to violate the assumptions of the compiler.

Caching a value in a lookup table or using @pure both have the effect of only computing the function once for each set of inputs and re-using them on all subsequent calls with those inputs. Yes, the output of @code_warntype is different, but in practice that isn't relevant to what you're trying to do here.

@chakravala
Owner

chakravala commented Mar 2, 2019

You don't understand the point of the test function. The point is not to cache a value; it is to test the compiler's ability to simplify the result ahead of time based on data{N,G}. Since I'm the one who invented the test and its goals for my purposes here, I would know what's relevant. The goal is not to cache a value, but to have the compiler simplify the code ahead of time.

The fact that I have a cache built-in is simply an extra fact added onto it, but is not relevant to whether I chose to use @pure or not here.

@eschnett
Contributor

eschnett commented Mar 2, 2019

In the code discussed here, certain calculations happen at compile time, and others happen at run time. The memoization lookup table is only there to speed up compiling the code. It is likely not very important whether compiling the code is a few times faster or slower. What matters is that, at run time, only the absolutely necessary calculations are performed: here actually none at all, as the result 26.0 can be obtained at compile time.

In my example above I use Val, which is slow. I didn't mention that I assumed that all calculations involving Val happen at compile time. In this way, the function gen_summage above would need to be @generated as well, ensuring that the sum loop is unrolled so that it is type-stable, and the argument x would need to be a Val argument as well.
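That suggestion can be sketched as follows (with `binomsum_value` as an illustrative compile-time helper, not code from the package): when the argument is also a Val, the whole sum collapses to a constant while the code is generated, so no Val dispatch survives to run time.

```julia
# Plain helper, called only while generating code.
binomsum_value(n, i) = i != 0 ? cumsum([binomial(n, q) for q = 0:n])[i] : 0

@generated function gen_summage(::Val{x}) where {x}
    total = 0
    for y = 0:x
        total += binomsum_value(x, y)   # all computed at code-generation time
    end
    :( $total )                          # the body is just the constant
end

gen_summage(Val(7))   # same value as pur_summage(7), i.e. 448
```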

Are functions such as gen_summage exposed to the user, or are they only used internally? If the former, then generated functions don't look like a good approach.

@eschnett
Contributor

eschnett commented Mar 2, 2019

@chakravala In hindsight, an internal function such as for binomials might not have been a good choice to demonstrate how to use generated functions. Do you have an example of a function that is exported and which uses binomsum or a similar function? I assume that when it is rewritten to be a generated function, then it would call a non-generated (non-pure) version of binomsum, but would only do so at compile time, which should be acceptable.

@chakravala
Owner

chakravala commented Mar 2, 2019

The gen_summage example is completely fictional and used for testing versatility. The binomsum values are needed in many different contexts and should be fast overall, especially because for high-dimensional algebras a lot of code generation will happen, which requires calls to these methods to place static values into compiled functions for specific binomial values. The speed of progressive code generation at high dimensions, and the speed of some full multivector binary operations, depend on their performance. It is not desirable for me to cut performance on this front.

Due to the exponential nature of these algebras, these small differences can make a big difference in how fast your algebra will be at very high dimensions.

How deep must the layering of @generated functions typically go?

@eschnett
Contributor

eschnett commented Mar 2, 2019

I didn't expect that code generation would itself be a bottleneck; I have no experience with this. Generally, all the functions that need to be efficient at run time need to be generated. Functions that are only called at compile time don't need to be generated; standard software engineering (lookup tables, conditionals, etc.) should suffice there. I am not familiar with how @pure is implemented in Julia, i.e. how fast its code generation is (although I assume that at least all of LLVM needs to run) or how fast the respective lookup needed for a function call is (i.e. whether it is faster than a regular method-table lookup).

@chakravala
Owner

In response to the other post, the binomsum method is used in pretty much everything. You will probably see it mentioned in the code as text only a few times, because I used variables to shorten my code size.

https://github.com/chakravala/Grassmann.jl/blob/master/src/utilities.jl#L170

The insert_expr method is used to splice binomsum into a huge number of methods.

@chakravala
Owner

chakravala commented Mar 2, 2019

The algebras I am looking at sometimes require 2^48=281,474,976,710,656 dimensions, so at that point the code generation has a huge factor in performance with so many basis elements.

@chakravala
Owner

chakravala commented Mar 4, 2019

This issue is now being closed, because the originally reported behavior now appears to be fixed.

julia> using Grassmann

julia> basis"+++"
(⟨+++⟩, v, v₁, v₂, v₃, v₁₂, v₁₃, v₂₃, v₁₂₃)

julia> v + v1 + v2
1 + 1v₁ + 1v₂

this is on the current master branch which resolves the original issue.. 00330ca 1edc6b5

I don't know what exactly caused the error, but I've changed around some of the fundamental operations on multivectors in the last few commits, and that seems to have fixed it
