Support defining methods of a function in multiple cells #177

karlwessel · 2020-06-10T16:28:55Z

The other solution is much harder to implement consistently - i.e. being able to detect multiple methods with the same argument types (instead of just detecting multiple functions with the same name) and deleting only the methods from one cell, while leaving the others.

All of this is possible within the scope of Pluto, but not something for the next couple of months :)

Originally posted by @fonsp in #109 (comment)

karlwessel · 2020-06-10T16:45:50Z

I would like to track this feature in an actual issue, since at least I am really really missing this.

Multiple dispatch and therefore multiple methods of the same function is a key feature of Julia. Often those methods are implemented in completely different contexts (for example Base.show). Having to put them all in the same cell makes exploratory implementation of custom types really hard.

fonsp · 2020-06-11T10:11:52Z

(split into two issues: #178)

fonsp · 2020-06-16T20:15:59Z

It would be helpful if someone with an eye for edge cases (@karlwessel) would write the test cases for this before thinking up a solution.

(Writing test cases is fairly easy, and they don't need to be perfect yet.)

fonsp · 2020-06-16T22:16:54Z

Strictly speaking, two cells f(x::Int) = 1 & f(x::Int) = 2 should not be allowed, and we can find those cases and throw a MultipleDefinitionsError. However, there are some very difficult situations like:

f(x::Int) = 1 & f(x::Int64) = 2 (built in type synonyms)

A = Number & f(x::A) = 1 & f(x::Number) = 2 (dynamic types)

A = rand(Bool) ? Number : String & f(x::A) = 1 & f(x::Number) = 2 (aaah)

f = 1 & f(x) = 1 (variables vs methods)

f(x::Int) = 1; f(x::String) = 2 && f(x::String) = 3; f(x::Vector) = 4 (multiple methods per cell)

f(x::Int) = 1; f(x::Int) = 2 && f(x::Vector) = 3 (number of methods in code might not equal number of defined methods)

f() = 1 && f(; s) = 2 (kwargs)

f(x::T) where T = 1 && f(x) = 2

f(x::(rand(Bool) ? Int64 : Float64)) = 1 &
f(x::(rand(Bool) ? Int64 : Float64)) = 2

If all of these situations are allowed without a MultipleDefinitionsError (which is the easy solution), then a notebook could be ambiguous (ignoring the SECRET fact that the order in which cells are placed is used as a tie-breaker when two cells are mutually independent.) They need to be handled correctly.

karlwessel · 2020-06-17T15:15:42Z

Is it possible to detect those cases during execution of a notebook by comparing the methodswith count for each function for which there are cells defining methods for? And throw a runtime exception if one cell that should increase the count doesn't because it replaced another method?

fonsp · 2020-06-17T21:08:53Z

That's going in the right direction! (Although my 5th example would not get detected with that method)

karlwessel · 2020-06-17T21:17:11Z

Well, maybe we don't have to catch all corner cases...

fonsp · 2020-08-05T16:59:25Z

Building on your counting methods idea:

It's very nice if we can keep all of the syntax analysis in one place, before running the code, and then run cells in a precomputed order. Checking the number of defined methods after running cells (in the middle of a reactive run), and then taking that information back to do something special makes that tricky. But there are three reasons to do this kind of "dynamic" code analysis instead of static anaylsis:

The common hurdles in these three are:

the dependency graph can only be determined after runtime
the dependency graph depends on the global state
we'd like to know things about external packages without
- analysing their source code, nor
- importing them into the server process

Solution?

We need to do some dynamic analysis, but we can try to keep it simple and isolated. The runner process will have three new APIs:

do_signatures_overlap(signatures::Array{NTuple{Type}})::Bool

symbols_exported_by_module(name::Symbol)::Set{Symbol}

macroexpand(expr::Expr)::Expr

to solve these three problems, respectively.

`do_signatures_overlap`?

Create a new test module, import all globals into the test module, and then define a dummy functions with the given signatures. If the number of methods is less than the number of signatures, then they overlap. (Thanks @karlwessel!)

`symbols_exported_by_module`

Create a new test module, call the using statement there, and then return Core.eval(testmodule, :(names($(name_of_usinged_module))) ).

`macroexpand`

This is just Base.macroexpand :)

Separate using+import from all other code

using and import already get some of special treatment - they always execute first in a "tie" scenario of a reactive run. This is a heuristic: we know that our syntax analysis fails for using so this is a hacky way to fix most scenarios. All using X sub-expressions are also extracted from cell code and stored separately.

To support the solutions above, we need to go from:

syntax analysis (in server process)
reactive run (in worker process)

to:

mini syntax analysis to find using (in server process)
call the using statements (in worker process)
syntax analysis (in server process), with symbols_exported_by_module to fill in the gaps
reactive run (in worker process), but use do_signatures_overlap during the run to somehow figure out if cells should err.

fonsp · 2020-08-05T17:42:35Z

We can simplify the do_signatures_overlap quite a bit by dropping support for:
A = B
f(x::A) = 1
f(x::B) = 2

Although it is common to have user-defined types, I hope that it's uncommon for user defined aliases to cause this kind of trouble. So:

`do_signatures_overlap`?

Create a new test module, ~~import all globals into the test module~~, create dummy types for all unknown types, and then define dummy functions with the given signatures. If the number of methods is less than the number of signatures, then they overlap.

So:

function f(x::Int, y::MyType, z::random_type())
    asdf
end

will become

struct var"MyType" end
struct var"random_type()" end

function f(x::Int, y::var"MyType", z::var"random_type()")
end

(var"..." is how you define variables with possibly illegal names in Julia).

mini syntax analysis to find using (in server process)
call the using statements (in worker process)
syntax analysis (in server process), with symbols_exported_by_module and do_signatures_overlap to fill in the gaps
reactive run (in worker process)

karlwessel · 2020-09-10T11:14:20Z

f(x::Int) = 1 & f(x::Int64) = 2 (built in type synonyms)

A = Number & f(x::A) = 1 & f(x::Number) = 2 (dynamic types)

A = rand(Bool) ? Number : String & f(x::A) = 1 & f(x::Number) = 2 (aaah)

f = 1 & f(x) = 1 (variables vs methods)

f(x::Int) = 1; f(x::String) = 2 && f(x::String) = 3; f(x::Vector) = 4 (multiple methods per cell)

f(x::Int) = 1; f(x::Int) = 2 && f(x::Vector) = 3 (number of methods in code might not equal number of defined methods)

f() = 1 && f(; s) = 2 (kwargs)

f(x::T) where T = 1 && f(x) = 2

f(x::(rand(Bool) ? Int64 : Float64)) = 1 &
f(x::(rand(Bool) ? Int64 : Float64)) = 2

I guess the next step is me implementing those tests :-)

fonsp · 2020-09-10T12:42:34Z

Yes please! Although I think we will be going with a solution where half of those tests are broken :)

karlwessel · 2020-09-10T15:40:24Z

Yes please! Although I think we will be going with a solution where half of those tests are broken :)

Why is that? Is the method-counting version to complicated?

fonsp · 2020-09-10T15:50:04Z

Why is that? Is the method-counting version to complicated?

Counting methods is not too complicated, but then there needs to be a feedback loop: too few methods were detected, so we error those cells manually, delete the methods, and then continue.

Instead of:
Syntax analysis > topological ordering > execution

where each of those steps does not depend on the results of later steps

karlwessel · 2020-09-10T16:02:59Z

It thought that was the process you described in your comment above. What is the simpler alternative?

Pangoraw · 2020-10-04T13:55:23Z

Hi!
I made a little proof of concept to solve this issue using a shallow module where all the types referenced are redefined using struct var"$name" end and where the functions have an empty body (following @fonsp's idea). Iterating on @karlwessel's idea of counting the number of methods defined, we visit the methods table to extract all defined methods for a function. We can then reduce by using the signature as a key to identify which signature is defined more than once. This prevent one problem I have identified with counting: We couldn't have overridden method from the Base package for example since the number of methods would have stayed the same.

The complete process would be:

Collect the function signatures in ExpressionExplorer so that the symbols states contains the type signature and not just the references.
Before a reactive run if a function name is assigned in multiple cells we can construct this shallow module with a empty definition for every type.
Define a get_errors function which can identify which methods are duplicated
For every function name which is assigned more than once run get_errors to identify whether or not there are duplicate methods
Match the duplicate methods with their respective cells using a LineNumberNode containing the cell id for each shallow function
Dispatch errors for those cells
Remove previously dispatched MultipleDefinitions errors, topology[cell].assignments will no longer contains symbols for function names

What don't work yet

Handle curly types Union{A, B}, Option{A} and so on...
Handle optional parameters a::Bool=false
Handle ::Any Parameters (parameters without types) -> f(a) === f(a::Any)
Handle type aliases A = Int64
Currently I am "polluting" the Main module with these shallow modules, using another process with Distributed will be cleaner
Currently, the detection is run every time there is a change on every cells, there must be a way to filter every function that has not changed
Plenty of tests -> #471

A quick demo of the progress

I have pushed my changes on this branch if you want to take a look at the implementation and compare the changes. If the methods implemented seems good to you I can open a draft PR to discuss more technical details of the implementation 🎇

fonsp · 2020-10-06T12:13:24Z

(ramble)

Hi @Pangoraw!

Thanks so much for your contribution! I've sent you an email in case you would like to call!

Your work looks promising, but motivated by @karlwessel (#177 (comment)) I think we should take a different direction:
These tricky corner cases are very difficult to solve, and will probably make Pluto's internals much more complicated (if we go with the crazy solution I proposed, although I see some good middle ground solutions).

So instead, I think we should only attempt to catch super simple cases like f(x) = 1 vs f(wow::Any) = 2, and leave anything more complicated for later.

I'm also not sure whether I like having the function signature info in its SymbolState. I think it would be better to store it in its name. I have started working on this in a new branch, so far it's just refactoring old code.

During my refactoring, I found that a bunch of tricky code can be eliminated! #536

Co-authored-by: Karl Wessel <karl.wessel@stud.uni-goettingen.de> Co-authored-by: fonsp <fonsvdplas@gmail.com> Co-authored-by: Jelmar Gerritsen <jelmargerritsen@gmail.com> Co-authored-by: Michiel Dral <m.c.dral@gmail.com>

karlwessel mentioned this issue Jun 11, 2020

Cyclic references between methods in cells #178

Closed

Repository owner deleted a comment from karlwessel Jun 11, 2020

fonsp added the reactivity The Pluto programming paradigm label Jun 11, 2020

fonsp added the user request Requested using the feedback form inside Pluto notebooks label Jul 22, 2020

fonsp mentioned this issue Sep 10, 2020

Macros and DSLs #196

Closed

This comment has been minimized.

Sign in to view

fonsp mentioned this issue Sep 10, 2020

Pluto complains about multiple definitions #410

Closed

karlwessel mentioned this issue Sep 11, 2020

Pluto complains about multiple definitions #409

Closed

karlwessel pushed a commit to karlwessel/Pluto.jl that referenced this issue Sep 22, 2020

implemented tests towards solving issue fonsp#177

768980b

karlwessel mentioned this issue Sep 22, 2020

implemented tests towards solving issue #177 #471

Closed

fonsp mentioned this issue Oct 2, 2020

Handle parametric type assignments, closes #517 #521

Merged

fonsp linked a pull request Oct 7, 2020 that will close this issue

👒 Include function signature in reactive edge - fix #177 #538

Merged

fonsp mentioned this issue Oct 7, 2020

👒 Include function signature in reactive edge - fix #177 #538

Merged

fonsp linked a pull request Oct 7, 2020 that will close this issue

add duplicate methods detection using signatures instead of names #543

Closed

fonsp closed this as completed in #538 Oct 7, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support defining methods of a function in multiple cells #177

Support defining methods of a function in multiple cells #177

karlwessel commented Jun 10, 2020

karlwessel commented Jun 10, 2020 •

edited

fonsp commented Jun 11, 2020

fonsp commented Jun 16, 2020

fonsp commented Jun 16, 2020 •

edited

karlwessel commented Jun 17, 2020

fonsp commented Jun 17, 2020

karlwessel commented Jun 17, 2020

fonsp commented Aug 5, 2020 •

edited

fonsp commented Aug 5, 2020 •

edited

karlwessel commented Sep 10, 2020 •

edited

fonsp commented Sep 10, 2020

This comment has been minimized.

karlwessel commented Sep 10, 2020

This comment has been minimized.

This comment has been minimized.

fonsp commented Sep 10, 2020 •

edited

karlwessel commented Sep 10, 2020

Pangoraw commented Oct 4, 2020

fonsp commented Oct 6, 2020 •

edited

Support defining methods of a function in multiple cells #177

Support defining methods of a function in multiple cells #177

Comments

karlwessel commented Jun 10, 2020

karlwessel commented Jun 10, 2020 • edited

fonsp commented Jun 11, 2020

fonsp commented Jun 16, 2020

fonsp commented Jun 16, 2020 • edited

karlwessel commented Jun 17, 2020

fonsp commented Jun 17, 2020

karlwessel commented Jun 17, 2020

fonsp commented Aug 5, 2020 • edited

Solution?

do_signatures_overlap?

symbols_exported_by_module

macroexpand

Separate using+import from all other code

fonsp commented Aug 5, 2020 • edited

do_signatures_overlap?

karlwessel commented Sep 10, 2020 • edited

fonsp commented Sep 10, 2020

This comment has been minimized.

karlwessel commented Sep 10, 2020

This comment has been minimized.

This comment has been minimized.

fonsp commented Sep 10, 2020 • edited

karlwessel commented Sep 10, 2020

Pangoraw commented Oct 4, 2020

What don't work yet

A quick demo of the progress

fonsp commented Oct 6, 2020 • edited

karlwessel commented Jun 10, 2020 •

edited

fonsp commented Jun 16, 2020 •

edited

fonsp commented Aug 5, 2020 •

edited

`do_signatures_overlap`?

`symbols_exported_by_module`

`macroexpand`

fonsp commented Aug 5, 2020 •

edited

`do_signatures_overlap`?

karlwessel commented Sep 10, 2020 •

edited

fonsp commented Sep 10, 2020 •

edited

fonsp commented Oct 6, 2020 •

edited