Jeff Bezanson PhD #8839

Closed
stevengj opened this Issue Oct 28, 2014 · 174 comments

@stevengj
Member

One of the core Julia authors, @JeffBezanson, has become a problematic developer. He needs to graduate from MIT, ideally by January 2015. Dependencies:

  • Thesis proposal.
  • Thesis outline.
  • Meeting with thesis committee.
  • Rough draft sent to advisor(s).
  • Schedule defense.
  • Final draft sent to committee.
  • Defense.
  • Alcohol.

This is a priority issue, to ensure that arms are not broken and to guarantee long-term viability of the Julia project.

cc: @alanedelman, @jiahao, @StefanKarpinski, @ViralBShah, @samanamarasinghe, @gjs

Edit (VS): This issue is closed with the following thesis. I am putting it up here, since many people will be interested in finding it.
https://github.com/JeffBezanson/phdthesis/blob/master/main.pdf

@stevengj stevengj added this to the 0.4 milestone Oct 28, 2014
@johnmyleswhite
Member

cc @fperez who also is interested in this outstanding issue

@jiahao
Member
jiahao commented Oct 28, 2014

Supporting information attached.
[Images: p1170816, p1170832]

@JeffBezanson
Member

As a procedural issue, closing this might require work in a separate repository. The thesis should perhaps be included in Base to make it easier for others to contribute.

Also the ordering of the last task is misleading; it will actually recur frequently throughout the process.

@stevengj
Member

+1 for including it in Base, or at least in julia/doc/thesis. Or maybe theses to allow for future needs.

(Please go ahead and open up a thesis branch, Jeff.)

@jiahao
Member
jiahao commented Oct 28, 2014

> Also the ordering of the last task is misleading; it will actually recur frequently throughout the process.

*has also already recurred

@stevengj
Member

I'm looking forward to being present at the Close issue ceremony.

@timholy
Member
timholy commented Oct 28, 2014

I can't reproduce this issue locally; is it MIT-specific?

@jakebolewski
Member

> One of the core Julia authors, @JeffBezanson, has become a problematic ~~developer~~ academic

@tknopp
Contributor
tknopp commented Oct 28, 2014

Is this the GitHub version of a PhD thesis? Jeff has to open a PR with his proposal and the committee will decide whether to merge or not...

@ViralBShah
Member

+Inf for speedy resolution of this one!

@Carreau
Carreau commented Nov 1, 2014

I had the same problem on the IPython repo a few months ago; thankfully it was fixed 32 days ago.
I'm pretty sure it involved coffee, annoying paperwork and last minute change of plan because of jackhammers.

Good luck !

@stevengj
Member

Updated: Jeff met with his thesis committee and gave us a rough outline.

@timholy
Member
timholy commented Nov 19, 2014

Glad to hear progress is being made!

But git log --date=short --all --since=22.days.ago --author="Jeff Bezanson" still makes one wonder how he has time for writing a thesis. Either that, or he's a superhero. Actually, scratch that: we all know he is a superhero, so never mind.

@jiahao
Member
jiahao commented Nov 19, 2014

The commits involving juliatypes.jl record our attempts to describe Julia's type system, which is directly relevant thesis work.

@johnmyleswhite
Member

The type system work seems to already be hitting some nerves: https://twitter.com/plt_hulk/status/535045242920378369

@StefanKarpinski
Member

I kind of doubt that's directly in response to Jeff's work, although I could be wrong. Hilarious tweet either way though.

@timholy
Member
timholy commented Nov 19, 2014

@jiahao, my comment was mostly tongue-in-cheek---I kinda wondered about that very thing. I, for one, tend to have a lot of commits when I'm polishing something up for presentation.

@jiahao
Member
jiahao commented Nov 19, 2014

@timholy humor noted. :)

It would be remiss to not mention the lovely Magritte homage our local theory collaborator @jeanqasaur made and posted on twitter:

[Image: magritte_type_with_types]

"The Treachery of Types" has a nice ring to it, no?

@tonyhffong
Member

That's quite funny.

@timholy
Member
timholy commented Nov 19, 2014

Love it!

@stevengj
Member

Call for help: Jeff is looking for nice examples that show off multiple dispatch (and maybe staged functions), things which would be much harder/slower in languages without those features.

@tonyhffong
Member

er, show?

@timholy
Member
timholy commented Dec 19, 2014

> (and maybe staged functions)

subarray.jl and subarray2.jl should serve rather nicely. Design document is at http://docs.julialang.org/en/latest/devdocs/subarrays/

@johnmyleswhite
Member

I think the Distributions package really makes multiple dispatch seem useful. Having things like rand(Gamma(1, 1), 5, 5) vs rand(Normal(0, 1), 3) is a huge gain in expressivity at no performance cost because of multiple dispatch.

@jakebolewski
Member

I don't see how that is the best example as it really is showing off
single-dispatch. How is it different than Gamma(1,1).rand(5,5) which you
would do in a more traditional OO language like Python or Java?

@johnmyleswhite
Member

Ok. Replace that with examples of computing KL-divergences using analytic results: kl(Normal(0, 1), Normal(0, 1)) vs kl(Normal(0, 1), Gamma(1, 1)).
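
A minimal sketch of what dispatching on both arguments could look like for this case. The types `MyNormal` and `MyGamma` and the function `kl` are hypothetical stand-ins written for illustration (current Julia syntax), not the Distributions package API. The Normal/Gamma method returns `Inf` because a normal distribution puts mass on negative numbers, where a gamma density is zero.

```julia
# Hypothetical stand-ins for distribution types; not the Distributions API.
abstract type Dist end

struct MyNormal <: Dist
    μ::Float64
    σ::Float64
end

struct MyGamma <: Dist
    shape::Float64
    scale::Float64
end

# Closed form for two normals. The method is selected by the types of
# BOTH arguments, which single dispatch on the first argument cannot express.
function kl(p::MyNormal, q::MyNormal)
    return log(q.σ / p.σ) + (p.σ^2 + (p.μ - q.μ)^2) / (2 * q.σ^2) - 0.5
end

# A normal has mass where the gamma density is zero, so the divergence
# is infinite.
kl(p::MyNormal, q::MyGamma) = Inf

# Fallback for pairs that have no analytic result in this sketch.
kl(p::Dist, q::Dist) =
    error("no analytic KL divergence for $(typeof(p)) and $(typeof(q))")
```

With these definitions, `kl(MyNormal(0, 1), MyNormal(0, 1))` takes the closed-form method and `kl(MyNormal(0, 1), MyGamma(1, 1))` takes the second one; any other pair falls through to the error.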

@timholy
Member
timholy commented Dec 19, 2014

I should have also added, there were some potentially-useful statistics on what life would have been like without staged functions in my initial post in #8235. The take-home message: generating all methods through dimension 8 resulted in > 5000 separate methods, and required over 4 minutes of parsing&lowering time (i.e., a 4-minute delay while compiling julia). By comparison, the stagedfunction implementation loads in a snap, and of course can go even beyond 8 dimensions.

@vtjnash
Member
vtjnash commented Dec 19, 2014

Vs single-dispatch, it still demonstrates the unification of what other OO languages describe as functions vs methods. You could contrast it with Python's sorted(a) vs a.sort(). In comparison to "traditional" OO languages, it dramatically changes what it means for a function to be "associated with" a class.

You could point out how it replaces the need for static vs instance variables, since you can dispatch on the instance or the type. I might have some more ideas from some recent IRC conversations, when I can get to a computer.

@eschnett
Contributor

I have an implementation of fmap. This traverses several arrays and
applies a function to each set of elements. This implementation is actually
very slow since the number of arrays can be arbitrary. To make this useful,
I have manually created specialization of this for various numbers of
arguments yss. I've always wanted to write a staged function for this,
but haven't done so yet.

The staged function would need to evaluate in particular the map call
that produces the arguments to f.

-erik

function fmap{T,D}(f::Function, xs::Array{T,D}, yss::Array...;
                   R::Type=eltype(f))
    [@assert size(ys) == size(xs) for ys in yss]
    rs = similar(xs, R)
    @simd for i in 1:length(xs)
        @inbounds rs[i] = f(xs[i], map(ys->ys[i], yss)...)
    end
    rs::Array{R,D}
end

@eschnett
Contributor

Sorry, ignore the call to eltype(f) in the function signature in my code,
that's non-standard.

-erik
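
A rough sketch of how the staged version might look, assuming current Julia, where staged functions are written with `@generated`. This is not Erik's code: the keyword argument `R` is dropped, and the result element type is taken from the first application of `f`, which is an assumption of this sketch (empty inputs are not handled). Inside the generator, `yss` is bound to the tuple of argument types, so the number of arrays is known while the code is being built.

```julia
# Sketch of a staged ("generated") fmap. The generator runs once per
# combination of argument types and returns the code to compile.
@generated function fmap(f, xs::AbstractArray, yss::AbstractArray...)
    N = length(yss)  # number of extra arrays, known at generation time
    # Spell out yss[1][i], ..., yss[N][i] explicitly, replacing the
    # runtime `map(ys->ys[i], yss)...` of the original.
    args = [:(yss[$k][i]) for k in 1:N]
    call = :(f(xs[i], $(args...)))
    quote
        for ys in yss
            size(ys) == size(xs) || throw(DimensionMismatch("array sizes differ"))
        end
        # Assumption: the element type comes from the first result.
        i = firstindex(xs)
        rs = similar(xs, typeof($call))
        for i in eachindex(xs)
            rs[i] = $call
        end
        return rs
    end
end
```

For example, `fmap(+, [1, 2, 3], [10, 20, 30], [100, 200, 300])` generates a loop body that calls `f` with exactly three indexed arguments. The built-in `map(f, xs, yss...)` also accepts several arrays, so this is only meant to illustrate the staging idea.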

@stevengj
Member

@timholy, since Matlab and NumPy have subarrays/slices too, why can we argue that multiple dispatch is essential here?

@ivarne
Contributor
ivarne commented Dec 19, 2014

Ease of implementation? As far as I can tell you can simulate multiple dispatch in any language, so it isn't essential for anything.

Maybe it's not good to suggest something we haven't yet decided we want. In #9297, there is a proposal that allows us to have both efficient UTF-8 buffer positions and convenient indexing without holes, so that you can do convenient arithmetic when you want that. Regex and search will return the wrapped internal index, but s[2] can give the second character regardless of how many bytes were used to encode the first.

@timholy
Member
timholy commented Dec 19, 2014

Can they make efficient subarrays/slices of AbstractArrays, or does their implementation only work for contiguous blocks of memory? This isn't a terribly hard problem to solve if you can assume your parent array has contiguous memory; it gets more interesting when you don't make that assumption.

@stevengj
Member

Yes, that's the key feature we are looking for: not just that something can be done nicely with multiple dispatch and/or staged functions, but that the lack of these features in other languages made implementation of the feature much harder (ideally, so much harder that no one has even attempted it).

@timholy, a NumPy array is characterized by a fixed stride for each dimension, not necessarily contiguity (essentially equivalent to our DenseArray). This property is preserved under slicing, so slices themselves can be sliced etcetera.

@vtjnash
Member
vtjnash commented Dec 20, 2014

aberrant had some good questions on IRC along these lines. I've tried to extract just the relevant bits of comments (from among the unrelated conversations and notifications) below:

2014-12-10 (EST)
11:41 aberrant: “Organizing methods into function objects rather than having named bags of methods “inside” each object ends up being a highly beneficial aspect of the language design.”
11:41 aberrant: why?
12:20 Travisty: aberrant: I can’t speak for them, but I imagine that the argument is that it’s a nice separation of concerns. I have data (which I will represent with types) and routines for operating on that data (which I will represent as functions), and rather than having some routines belong to specific types, they are kept separate
12:21 aberrant: Travisty: I sort of understand the argument, but I’m not sure I agree with it :)
12:22 Travisty: aberrant: Yeah, sure. This is the sort of thing that may be hard to argue about from first principles, and it may be useful to look at examples. I think one place where this design simplified things was in implementing the standard mathematical functions on all of the numeric types, and dealing with conversion
12:22 Travisty: I’m not sure, but I think the solution in julia is quite elegant because of this design and it would be a bit trickier to do it in the traditional OOP setting
12:23 aberrant: Travisty: perhaps. I need to think about it some more. I really like pure OO, and this is a bit of a change that I need to wrap my head around.
...
12:54 vtjnash: julia has a convention that a method name will end in a ! to signify that the method will mutate one of its arguments
12:56 aberrant: that’s one thing I sorely miss in python. foo.sort() vs foo.sorted() always confused me.
12:57 vtjnash: except that in python, isn't it sort(foo) vs sorted(foo) ?
12:57 aberrant: it might be :)
12:58 aberrant: no
12:58 aberrant: it’s foo.sort vs sorted(foo)
12:58 vtjnash: ah
12:58 aberrant: foo.sort modifies foo in place.
12:58 aberrant: see?
12:58 aberrant: that’s what I mean.
12:58 vtjnash: well, anyways, that's an unintentional example of why . oriented programming is a pain
12:58 aberrant: sort(foo) vs sort!(foo) makes much more sense.
12:59 vtjnash: python made a reasonable choice there
12:59 vtjnash: and tries to help you remember
12:59 vtjnash: but it still was forced to make some decision
2014-12-14 (EST)
15:25 aberrant: there’s no way to do type constants, I guess?
15:25 aberrant: http://dpaste.com/18AEHBG
15:25 aberrant: like that
15:27 vtjnash: no. that declares a local variable inside the type (can be seen by the constructors and other methods in there)
15:27 vtjnash: instead, define `y(::Foo) = 6`
15:28 aberrant: is that mutable?
15:29 aberrant: hm, yeah, that’s not what I want though.
15:29 aberrant: but I guess I can use it.
15:30 vtjnash: not what you want, or not what other languages do?
15:31 vtjnash: multiple dispatch in julia allows you to collapse 4 or 5 or more different constructs needed by other OO languages into one abstraction
15:33 aberrant: oh, I see how it works.
15:33 aberrant: well, it’s a function and therefore more overhead
15:33 aberrant: basically, I want to set the “bitwidth” of an IPv4 address to be 32, and the “bitwidth” of an IPv6 address to be 128, and then be able to write a function that takes ::IPAddr and uses the appropriate bitwidth.
15:34 aberrant: I can do this with the function, but it seems like overhead to have a function return a constant.
15:35 aberrant: e.g., http://dpaste.com/3RXRCAG
15:36 vtjnash: don't assume that a function has more overhead
15:36 vtjnash: in this case, it would actually have less overhead
15:54 aberrant: wow, ok
15:54 aberrant: I don’t see how, but I’ll take your word for it :)
15:59 vtjnash: inlining
...
18:04 aberrant: there’s no way to associate a constant inside a type?
18:04 aberrant: it would make my life a whole lot easier.
18:04 mlubin: aberrant: t.constant or constant(t) is just a syntax difference
18:04 mlubin: julia uses the latter
18:05 aberrant: mlubin: the issue is that you have to instantiate the type first.
18:05 aberrant: mlubin: I need the equivalent of a class property.
18:05 mlubin: constant(::Type{Int32}) = 10
18:05 mlubin: constant(Int32)
18:05 aberrant: oh. wow.
18:06 Travisty: The only member of Type{T} is T, which is why this works
18:06 mlubin: mind=blown? ;)
18:06 aberrant: yeah
18:06 aberrant: that’s jacked up
18:07 aberrant: there’s NO WAY I would have ever thought of that on my own :(
18:07 mlubin: once you see it for the first time it becomes a lot more intuitive
18:07 aberrant: ipaddrwidth(::Type{IPv4}) = uint8(32)
18:08 aberrant: w00t
18:10 aberrant: can I do a const in front of that?
18:11 Travisty: I don’t think so, but usually when julia generates code, it should be inlined so that just the constant uint8(32) appears, instead of a function call
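
The pattern from the log above, written out as a small sketch in current Julia syntax. The types `V4` and `V6` and the functions `bitwidth` and `nbytes` are hypothetical names chosen for illustration; they are not the address types from Base.

```julia
# Hypothetical address types for illustration only.
abstract type Addr end

struct V4 <: Addr
    host::UInt32
end

struct V6 <: Addr
    host::UInt128
end

# "Class constants": methods that dispatch on the type itself, so no
# instance is needed to ask for the value.
bitwidth(::Type{V4}) = 32
bitwidth(::Type{V6}) = 128

# Instance-level access forwards to the type-level method.
bitwidth(a::Addr) = bitwidth(typeof(a))

# Generic code written against the abstract type picks up the right constant.
nbytes(a::Addr) = bitwidth(a) ÷ 8
```

Both `bitwidth(V4)` and `bitwidth(V6(1))` work, which covers what other languages split into static and instance members. Whether the call is actually reduced to a constant can be checked in the REPL with `@code_llvm nbytes(V4(0x7f000001))`.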

if you're looking for more examples: #7291

Net code reduction is nice from 6K C++ to about 1K Julia. Initial performance benchmarks show it's just under 2x the native C++.

how many languages do you know that can claim implementation of printf, from the operators (+-*/, Int, Float) to the numerical output formatting, in the language itself, and benchmark within a small margin of the libc versions? Maybe C++? C can't even claim this (it doesn't have operator overloading).

Python/MATLAB/etc. might be able to claim bits of this, but have you ever tried to use the string operators in MATLAB?

@tknopp
Contributor
tknopp commented Dec 20, 2014

This is a very interesting discussion and I'd like to add some points:

  • What does not really seem to be answered is how Julia compares to polymorphism in C++ speed-wise. In C++ one can have static polymorphism through templates or dynamic polymorphism through OO. But the latter requires a vtable and is always a little slower, especially when calling such things in a tight loop. For this reason one would never overload the array access operator, since this would not be fast. Eigen, for instance, uses static polymorphism.
  • In Julia there is no distinction between a virtual and a non-virtual function, and we still don't have performance penalties. My understanding is that the huge advantage is that Julia can generate code "on the fly" and thus still generate fast code, whereas in C++ "runtime" means one cannot modify the generated code anymore. Additionally, in Julia we are able to do inlining at runtime, so there is no function call overhead. And in Julia we don't have to think about all this because it is done automatically.
  • One huge thing (again) is generics without overhead and without all the pain C++ has (looking at compile times). There are different ways to implement generics, and IMHO Julia uses the same model as C#, so this might not be entirely new (for Jeff's thesis). But I still think this could be something to discuss in the thesis.

@StefanKarpinski
Member

How hard it is to explain how effective multiple dispatch is, is itself quite interesting.

@timholy
Member
timholy commented Dec 20, 2014

Glad we're on the same page now. Yes, our new SubArrays don't rely on having strides, nor do they require linear indexing (although they can use it, and do by default if it happens to be efficient). So they work efficiently for any AbstractArray, and support "non-strided" (Vector{Int} indexes) views as well.

To clarify, I presented our new SubArrays primarily as an example of staged functions, not multiple dispatch. That said, the current scheme would fall apart at the construction step without multiple dispatch: we call completely different constructor methods for slice(::AbstractArray, ::UnitRange{Int}, ::Int, ::Vector{Int}) and slice(::AbstractArray, ::Int, ::UnitRange{Int}, ::UnitRange{Int}), etc. Those constructors are generated by staged functions, but we need multiple dispatch for the system to work.

Without separate functions, a varargs constructor has to face the fact that looping over the entries in the indexes tuple is not type-stable: to process individual indexes in a loop, you have to assign them to a named scalar variable, and that variable is guaranteed not to be well-typed even though the input tuple can be. Of course, the staged function does oh-so-much more: almost all the "hard" decisions---are we dropping this as a sliced dimension, or keeping it, and what are the consequences for the internal representation of the object?---can be made at compile time, reducing the runtime constructor to a remarkably trivial operation. It just so happens that the particular trivial operation is different for each combination of input types, so to exploit this triviality and make construction fast you need to be calling different methods customized for each combination of input types. This is why you need both staged functions and multiple dispatch.

Personally, I doubt that it would be possible to do what we've done without the combination of staged functions and multiple dispatch, and I think it's likely that julia now has the most flexible and efficient array views of any language. But I don't pretend to have undertaken any kind of actual study of this problem.
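
To make the type-stability point concrete, here is a small sketch of processing an index tuple through dispatch instead of a loop. `kept_dims` is a hypothetical helper written for illustration in current Julia syntax; it is not part of the SubArray implementation. Each call peels one index off the argument list, and the method chosen for that index depends on its type, so no single variable ever has to hold indexes of different types.

```julia
# Hypothetical helper: count how many dimensions a view would keep.
# A scalar index drops its dimension; a range, vector, or colon keeps it.
kept_dims() = 0
kept_dims(::Integer, rest...) = kept_dims(rest...)
kept_dims(::AbstractVector, rest...) = 1 + kept_dims(rest...)
kept_dims(::Colon, rest...) = 1 + kept_dims(rest...)
```

For example, `kept_dims(1:3, 2, [1, 4])` walks through three different methods and returns 2. A staged function can go further and make this kind of decision once per combination of index types while generating the constructor.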

@timholy
Member
timholy commented Dec 20, 2014

This makes me wonder if I should write up our SubArrays for publication---it does seem we have something new here.

@tknopp
Contributor
tknopp commented Dec 20, 2014

@StefanKarpinski: Let's be self-critical: Maybe we just have not written it up yet?

@tknopp
Contributor
tknopp commented Dec 20, 2014

@timholy I actually see two things here: the staged functions (which I have to admit I have not fully understood) and how Julia fits into the C++ virtual functions vs templates universe (C# and Java are similar here).

For the C++ comparison it would be very interesting to do timings and prove experimentally that we can reach compile-time polymorphism (which I am sure we do).

Another thing I wanted to formulate for some time is how Julia compares to C++ with regards to C++ Concepts (lite). The subtyping we have is a large part of C++ Concepts.

@timholy
Member
timholy commented Dec 20, 2014

Does one even need to write a bunch of C++ code and run timings to check this?

A = rand(3,5)
@code_llvm slice(A, 2, :)
@code_llvm slice(A, :, 2)

There's essentially nothing but load and store operations (i.e., memory access) in there.

@ViralBShah
Member

@timholy I do think it is worth writing this up. I believe (but have not verified) that many systems that implement Nd arrays have specialized implementations for the first few dimensions, and then generally fall back to something slow otherwise, and that shouldn't be the case for the new implementation.

Another related case is writing an efficient sparse N-d store (arrays being the common special case).

@tknopp
Contributor
tknopp commented Dec 20, 2014

Tim,

yes indeed looking at the resulting assembly code is a great tool. But still I think this has to be compared in a larger context, where (in the C++ implementation) virtual functions would have to be taken into account.

My guess is really that it is not multiple dispatch that is so fast, but that we can inline at runtime and thus transform virtual functions (which our generic functions are) into efficient instructions without function call indirection.

If this is true (please correct me if I am wrong @JeffBezanson @vtjnash @stevengj @timholy ), the "multiple" in multiple dispatch is not the reason Julia is so fast, but a neat side effect that lets certain code be formulated more nicely (where single dispatch is limiting).

@timholy
Member
timholy commented Dec 20, 2014

I'm probably just not understanding, but I'm not sure of the distinction you're making. In Julia, "inline" and "runtime" don't really seem to go together; inlining is performed during type inference, i.e., at compile-time. Multiple dispatch is used to select the appropriate method for the inferred types.

@tknopp
Contributor
tknopp commented Dec 20, 2014

The confusion here is that "compile time" and "runtime" of C++ cannot be compared with that of Julia. Codegen can happen during "runtime" so yes I think when I do include("myscript.jl") inlining is performed. And even if "runtime" is not the right word for that from a C++ perspective it is "runtime".

And the dispatching on different types is like a vtable but more general, no?

@johnmyleswhite
Member

> This makes me wonder if I should write up our SubArrays for publication---it does seem we have something new here.

It's a little far from standard topics, but you might consider submitting to JSS. We need more Julia articles.

@ViralBShah
Member

I would love it if Tim writes a blog post to describe this work to start with, as it will get a lot of people up to speed. JSS is a great idea, and perhaps some of the foundational work that had been done in data frames and distributions is worth writing up too? I certainly would enjoy reading it.

@timholy
Member
timholy commented Dec 21, 2014

Well, what made me think of it is that much of it has been written up: http://docs.julialang.org/en/latest/devdocs/subarrays/. For a publication you'd want to go into a lot more detail, but this hits a fair number of the main big-picture points.

@ViralBShah
Member

To the question @stevengj raised about multiple dispatch, I would say that without multiple dispatch, it is pretty difficult to write our base library in julia. I am saying what is already known by everyone here, and wonder if this is not a compelling example for the reasons brought up here.

Details such as operations on numbers and the conversion/promotion are intricately tied to multiple dispatch. Since multiple dispatch is what essentially exposes type inference in the compiler to the way types are used in code, we are able to write a generic and fast numerical base library. To quote a statement you made - it helps separate the policy making out of the compiler and into libraries. For example, @JeffBezanson showed me once how the Scheme spec spends 1/3 of its space on numerical details.

Many interpreted systems end up having general types and inspecting the types of their objects at runtime to make decisions on what code to execute. They often then have a separate implementation in C/C++/Fortran in the base library for each type, leading to a large and difficult-to-debug codebase. Often these are generated through an external macro system, but increasingly the use of C++ templates has avoided this specific issue. The issue of two languages and type inference still remains in these cases.

At some level, vectorization is how many scientific languages amortize the cost of determining types at runtime and doing the appropriate code selection. In Julia, with the combination of type inference, multiple dispatch and generic programming, our costs for both are significantly lower allowing us to write generic devectorized code - C without types.
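
As a small illustration of promotion policy living in library code rather than in the compiler, here is a sketch in current Julia syntax. The type `Half` is hypothetical and exists only for this example. The promotion rule and the conversion are ordinary methods, and the generic arithmetic fallbacks in Base pick them up through dispatch.

```julia
# Hypothetical number type storing multiples of one half (illustration only).
struct Half <: Real
    n::Int   # represents the value n / 2
end

# Conversion to Float64 is an ordinary method on the Float64 constructor.
Base.Float64(h::Half) = h.n / 2

# The promotion policy is stated once, as a method on a pair of types.
Base.promote_rule(::Type{Half}, ::Type{Int}) = Float64
```

With only these two methods, `Half(3) + 1` and `Half(1) * 4` work: the generic methods for mixed `Number` arguments promote both operands to `Float64` and then call the `Float64` arithmetic.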

@simonbyrne
Contributor

One example is inequality comparison with MathConst (representing irrational numbers) in PR #9198:

https://github.com/JuliaLang/julia/pull/9198/files#diff-e247e18c426659d185379c7c96c1899dR29

  • FloatingPoint vs MathConst compares the float with the float above/below the constant
  • Rational{T} vs MathConst (for bounded integer types T) finds the closest rational representable by the type; the result then depends on whether that rational is above or below the true irrational value.

What makes this feasible is easy access to the Rational and BigFloat functionality at compile time. Although it might be possible using something like a macro preprocessor, it would require essentially two implementations of all the functionality.

@JeffBezanson
Member

I have a small update on this issue. The document is now open to the public: https://github.com/JeffBezanson/phdthesis

Today I'm releasing a sort-of draft by self-imposed deadline. It's not much of a draft; many pieces are entirely missing. I also claim no responsibility for anything that is inside TeX comments, or present only in the past version history :)

All feedback welcome! In particular, if you have nice examples of julia code that show off the language well, especially anything that would be hard to do without it. Always looking for good non-trivial multiple dispatch examples.

@johnmyleswhite
Member

Thanks for making this public. I was surprised when I tried to follow a link SGJ used and hit a 404.

@andreasnoack
Member

Maybe the triangular matrix arithmetic could be an example of the usefulness of Julia's multiple dispatch. Here UpperTriangular+UpperTriangular=UpperTriangular, but UpperTriangular+LowerTriangular=Matrix.

We also talked about promotion of element types. I like that you can avoid promotion when it is not necessary, e.g.

julia> Base.LinAlg.UnitLowerTriangular(ones(Int, 3, 3))\[1,2,3]
3-element Array{Int64,1}:
 1
 1
 1

julia> Base.LinAlg.UnitLowerTriangular(ones(Int, 3, 3))\[1,2,3.0]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

I can't say how specific this is to Julia, but as we saw, it appeared that at least Eigen will not manage promotion, but require that the element types are stable under the linear algebra operation.
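
A toy sketch of how the result type can depend on both operands. The wrapper types `Upper` and `Lower` and the helper `full` are hypothetical and written for illustration in current Julia syntax; the real implementations live in the linear algebra library and are considerably more careful about efficiency.

```julia
# Hypothetical wrapper types; only the relevant triangle of `data` is used.
struct Upper{T}
    data::Matrix{T}
end

struct Lower{T}
    data::Matrix{T}
end

# Materialize the dense matrix, zeroing the ignored triangle.
full(u::Upper) = [i <= j ? u.data[i, j] : zero(eltype(u.data))
                  for i in axes(u.data, 1), j in axes(u.data, 2)]
full(l::Lower) = [i >= j ? l.data[i, j] : zero(eltype(l.data))
                  for i in axes(l.data, 1), j in axes(l.data, 2)]

# Same structure on both sides: the structure is preserved.
Base.:+(a::Upper, b::Upper) = Upper(full(a) + full(b))
Base.:+(a::Lower, b::Lower) = Lower(full(a) + full(b))

# Mixed structure: the result is an ordinary dense matrix.
Base.:+(a::Upper, b::Lower) = full(a) + full(b)
Base.:+(a::Lower, b::Upper) = full(a) + full(b)
```

Here `Upper(A) + Upper(A)` is again an `Upper`, while `Upper(A) + Lower(A)` is a plain `Matrix`. Element types are promoted only when the two operands differ, because the dense addition underneath handles that.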

@ViralBShah ViralBShah removed the feature label Feb 14, 2015
@stevengj
Member
stevengj commented May 6, 2015

Jeff is going into the final stretch now, we hope, so feedback on and corrections to his thesis (see the abovementioned github repository) would be especially welcome now.

Feel free to submit PRs for typos, missing references, etc.

@yuyichao
Member
yuyichao commented May 6, 2015

@stevengj Should we expect an announcement of the defense schedule here?

@stevengj
Member
stevengj commented May 6, 2015

Yes.

@ScottPJones
Contributor

I hope he has a job lined up at MIT already... (or anywhere where he'll continue the great Julia work)... reading his thesis now... great stuff, IMO!

@mschauer
Contributor
mschauer commented May 6, 2015

@andreasnoack A bit esoteric, but funny anyway: define an arrow matrix ring type and apply the Cholesky factorization from Base to see what the structure of a Cholesky factorization of an arrow matrix of arrow matrices looks like. https://gist.github.com/mschauer/c325ff89cd378fe7ecd6 ("A" arrow matrix, "F" full matrix, "L" lower half arrow).

@JeffBezanson
Member

@ScottPJones thanks!!

Defense scheduled: Wed May 13, 1pm Stata D-463.

@jiahao
Member
jiahao commented May 8, 2015

@JeffBezanson are you planning to have it recorded through replay.csail.mit.edu?

@JeffBezanson
Member

...maybe I'd rather forget about it... :)

@ihnorton
Member
ihnorton commented May 8, 2015

Isn't that what the last checkbox is for?

@ScottPJones
Contributor

Are you really sure you should be giving the address out? You might get mobbed by fan boys, who want signed copies of your thesis! 😀

@timholy
Member
timholy commented May 8, 2015

I've already got my tickets. I hear the price is already up to $800 on the black market.

@ScottPJones
Contributor

Stata D-463 wasn't there in my day... (I've only been there to visit friends and for the NE Database Day)... Is it going to be big enough? Surely he's going to need 10-250!

@JeffBezanson
Member

Officially D-463 fits 48 chairs. If we think that won't be enough, we can look into getting a bigger room.

@ScottPJones
Contributor

I seriously don't think you really appreciate what you've done! If all of your fans showed up, maybe 10-250 wouldn't be big enough... book Kresge now!

@stevengj
Member

Abstraction in Technical Computing

  • PhD Candidate: Jeff Bezanson
  • Thesis Supervisor: Prof. Alan Edelman
  • Thesis Committee: Prof. Steven Johnson, Prof. Saman Amarasinghe, Prof. Gerry Sussman
  • Date: Wednesday May 13, 2015
  • Time: 1pm
  • Location: MIT campus, 32-D463

Array-based programming environments are popular for scientific and technical computing. These systems consist of built-in function libraries paired with high-level languages for interaction. Although the libraries perform well, it is widely believed that scripting in these languages is necessarily slow, and that only heroic feats of engineering can at best partially ameliorate this problem.

This thesis argues that what is really needed is a more coherent structure for this functionality. To find one, we must ask what technical computing is really about. This thesis suggests that this kind of programming is characterized by an emphasis on operator complexity and code specialization, and that a language can be designed to better fit these requirements.

The key idea is to integrate code _selection_ with code _specialization_, using generic functions and data-flow type inference. Systems like these can suffer from inefficient compilation, or from uncertainty about what to specialize on. We show that sufficiently powerful type-based dispatch addresses these problems. The resulting language, Julia, achieves a Quine-style "explication by elimination" of many of the productive features technical computing users expect.

@ScottPJones
Contributor

Will there be printed copies of his thesis available for spectators?

@yuyichao
Member
@ScottPJones
Contributor

That's what I'm hoping for... I figure I can get some big bucks on Ebay selling an autographed copy in 10-15 years (when I'll really need it to pay off my kids' college education... esp. if they go where they have said they want to go... 😀 )

@aviks
Member
aviks commented May 12, 2015

Good luck, Jeff!

@Mikewl
Contributor
Mikewl commented May 13, 2015

Good luck! If I was in the area I would be one of the fanboys @ScottPJones mentions

@ViralBShah ViralBShah modified the milestone: 0.4.0, 0.4.x May 13, 2015
@mbauman
Member
mbauman commented May 13, 2015

Break a leg, Jeff!

@johnmyleswhite
Member

Good luck, Jeff!

@xianyi
xianyi commented May 13, 2015

Good luck! @JeffBezanson

@yuyichao
Member

@JeffBezanson Good Luck!!

Hopefully I can get a seat.

@wlbksy
Member
wlbksy commented May 13, 2015

good luck

@MithrandirMiles

I told him he should have booked 10-250!

@kshyatt
Contributor
kshyatt commented May 13, 2015

Good luck!

@garrison
Member

Ditto!

@ScottPJones
Contributor

Luckily I got in early enough to get a seat... and I was right (as usual 😀) he really should have booked a bigger room!

@ScottPJones
Contributor

img_0994

@yuyichao
Member

Congratulations

@JeffBezanson
Member

Fixed.

@timholy
Member
timholy commented May 13, 2015

There's still one checkbox left unchecked. Better get working.

@Carreau
Carreau commented May 13, 2015

:shipit: 🍹 🍻 🍸 🎉 !

@garrison
Member

👍

@rsrock
Member
rsrock commented May 13, 2015

Congratulations, Dr. Bezanson!

@kmsquire
Member
@nolta
Member
nolta commented May 13, 2015

+1

@catawbasam
Contributor

Bravo

@vtjnash
Member
vtjnash commented May 13, 2015

the last checkbox has been filled now with a Kentucky Bourbon Whiskey.

@Ismael-VC
Contributor

Congratulations Jeff!

@cdsousa
Contributor
cdsousa commented May 13, 2015

Congratulations!

@shashi
Contributor
shashi commented May 13, 2015

Congrats Jeff! 🍻 🍻 😄

@mbauman
Member
mbauman commented May 13, 2015

This is wonderful! Congratulations, Jeff. 🎆

@tshort
Member
tshort commented May 13, 2015

Great news. Congratulations, Jeff!

@toivoh
Member
toivoh commented May 13, 2015

Congratulations!

@abhijitiitr

Congrats

@sjkelly
Contributor
sjkelly commented May 13, 2015

Woop woop congrats Jeff!

@aviks
Member
aviks commented May 13, 2015

Congratulations, Dr Bezanson!

@stevengj
Member

Thanks for posting the photo, @ScottPJones. @jiahao, I think you have some photos as well?

@tknopp
Contributor
tknopp commented May 13, 2015

Congratulations Jeff. Your work that ended in this PhD has influenced many of us. It will have and already has a huge impact on scientific computing! Rock on!

@shabbychef
Contributor

congratulations.

@mschauer
Contributor

Great!

@rahuldave
Member

Congrats Jeff!!

@dpsanders
Contributor

Fantastic, congrats!!

@ViralBShah
Member

Congrats Dr. Bezanson!

@ViralBShah
Member

The thesis, for those who'd like to take a peek:

https://github.com/JeffBezanson/phdthesis/blob/master/main.pdf

@tonyhffong
Member

Congrats!

@mattjj
mattjj commented May 14, 2015

Congratulations!

@lindahua
Member

Congratulations!

@Gnimuc
Gnimuc commented May 14, 2015

Congrats 👍

@jiahao
Member
jiahao commented May 14, 2015

I don't know why GitHub decided to rotate all my pictures, but here they are.

2015-05-13 13 11 16

2015-05-13 14 00 17

2015-05-13 14 26 23

2015-05-13 14 45 00

2015-05-13 14 46 56

Issue closing ceremony video: http://youtu.be/hGpLOZX6CEY

@staticfloat
Member

Aleph-zero congratulations to you, Jeff! If you're ever in Seattle, let me buy you a congratulatory drink. :)

@amitmurthy
Member

Congrats Jeff. It feels very nice to celebrate your work as a community.

@tkelman
Member
tkelman commented May 14, 2015

Congratulations!

@jiahao, might be worth re-running the latest world of Julia as supplementary material for the acknowledgements page :)

@JeffBezanson
Member

Thank you everybody.

Unfortunately my thesis is not actually done, but hopefully will be soon...

@ScottPJones
Contributor

I've got a couple more pictures, I'll be sending them on to Jeff after I wake up (for his mother, a very nice lady!) Dr. Bezanson can post them here if he wishes...

@ScottPJones
Contributor

Not done??? Did Gerry ask you to remove all of that bloody "syntax" and just let him write with s-expressions?

@ViralBShah
Member

@JeffBezanson using a mac is not a picture I ever expected to see!

@carlobaldassi
Member

Yay!

@dancasimiro
Contributor

Congratulations

@sherrinm

Well done from all your fans this side of the pond.

@JeffBezanson
Member

Not done??? Did Gerry ask you to remove all of that bloody "syntax" and just let him write with s-expressions?

You nailed it! I'm not kidding. But he will settle for an optional s-expr mode.

@ScottPJones
Contributor

I’d been talking to him after the presentation... he liked your stuff, but really didn’t care for all the syntax... there’s just so much of it... he reminded me of how small the Scheme manual is 😀 When I had him for 6.001 [first semester it was taught], we had to implement a small Scheme interpreter in MacLisp... since Scheme was so small, it was pretty doable...

@ScottPJones
Contributor

and I’m sure you can knock that out pretty quickly, with that bottle of bourbon for company (if there’s any left! ;-) )

@JeffBezanson
Member

Speaking of the length of the scheme manual, it's funny: a large percentage of it is devoted to documenting the behavior of numbers, while in Julia that's defined in libraries. Julia could potentially be a smaller core language than scheme (unless of course you stapled the LLVM spec to it).

For it or against it, there's too much emphasis on syntax! Also check out section 7.1 of http://www.schemers.org/Documents/Standards/R5RS/HTML/. Scheme syntax is more complex than people think!

@ScottPJones
Contributor

Julia could potentially be a smaller core language than scheme (unless of course you stapled the LLVM spec to it).

Yes, Julia has good bones!

@toivoh
Member
toivoh commented May 14, 2015

Time to dig out the old @sexpr macro?

@trezitorul

Congratulations on graduating and on what you pulled off with Julia!

@yurivish
Contributor

🍰 Congratulations!

@stepchowfun
Contributor

Congrats @JeffBezanson!

@dustinvtran

congratulations :)

@interhive

👍

@StefanKarpinski
Member

@boyers! Long time no see!

@stepchowfun
Contributor

miss you guys :)

@nicola-gigante

Congratulations!

Just a question: the PDF has neither links in the ToC nor a browsable index in the PDF metadata.
Why is it so difficult for an MIT-graduated compiler-writer to add \usepackage{hyperref} in his preamble?

@ViralBShah
Member

That is because you are supposed to read the whole thing and not just skip around. ;-)

But yes, hyperref would make this a lot more accessible.

@nicola-gigante

I promise you to read the whole thing if you add the links :P

@rtoip
rtoip commented May 16, 2015

Congratulations. Freedom awaits.

@nlazarevic

Congrats, Dr. Bezanson!

@timholy
Member
timholy commented May 16, 2015

@nicola-gigante, you can make a pull request 😄.

@jpfairbanks

can we lobby for a best dissertation award?

http://awards.acm.org/doctoral_dissertation/nominations.cfm

@mseri
mseri commented May 16, 2015

Congratulations!!!

@ScottPJones
Contributor

@jpfairbanks - good idea! It has to be submitted by his thesis advisor though... pester Alan Edelman, I think...

@jpfairbanks
@ScottPJones
Contributor

Is it limited to 5 letters of support? Also, the real number of people who think he deserves it is at least a few orders of magnitude larger!

@ninjin
Contributor
ninjin commented May 16, 2015

@JeffBezanson: Congratulations!

@alanedelman: I think @jpfairbanks has a good point, Jeff should get nominated for the ACM Doctoral Dissertation Award.

@hayd
Member
hayd commented May 16, 2015

@JeffBezanson Well done and well deserved!

@ScottPJones "In addition, at least 3, and not more than 5, supporting letters should be included from experts in the field who can provide additional insights or evidence of the dissertation’s impact."

@ScottPJones
Contributor

Maybe Gerry will write him one (after he's added the s-expr mode!) 😀

@fabianlischka
Contributor

Hacker News front page :-) Congratulations, Jeff.

@JeffBezanson
Member

@nicola-gigante You're right, I will add hyperref.

Thank you to everybody again. All of your appreciation is the ultimate prize.

@owenversteeg

@jiahao It's because they have EXIF rotation data, which browsers don't care about. Most of the time. Chrome only honors EXIF data if the picture is in its own window. Right-click and "Open in new tab" to see them in the correct orientation.

You can use an EXIF metadata stripper to take that off and rotate them the "proper" way.
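
For the curious, here is a rough sketch of what "rotating them the proper way" amounts to. It assumes the standard EXIF Orientation values 1 through 8 and uses a tiny list-of-rows "image" instead of a real photo; the function name and the sample data are made up for illustration, and a real tool would also clear or reset the tag afterwards.

```python
# Illustrative only: applies an EXIF Orientation value (1-8) to a tiny "image"
# stored as a list of pixel rows. Function and variable names are hypothetical.

def apply_exif_orientation(pixels, orientation):
    """Return a new grid laid out the way the orientation tag asks for."""
    rows = [list(row) for row in pixels]
    if orientation == 2:   # stored mirrored left-right
        return [row[::-1] for row in rows]
    if orientation == 3:   # stored upside down: turn 180 degrees
        return [row[::-1] for row in rows[::-1]]
    if orientation == 4:   # stored mirrored top-bottom
        return rows[::-1]
    if orientation == 5:   # stored flipped across the main diagonal
        return [list(col) for col in zip(*rows)]
    if orientation == 6:   # needs a 90-degree clockwise turn
        return [list(col) for col in zip(*rows[::-1])]
    if orientation == 7:   # stored flipped across the anti-diagonal
        return [list(col) for col in zip(*[row[::-1] for row in rows[::-1]])]
    if orientation == 8:   # needs a 90-degree counter-clockwise turn
        return [list(col) for col in zip(*rows)][::-1]
    return rows            # 1 (or unknown): already upright


stored = [[1, 2],
          [3, 4]]
upright = apply_exif_orientation(stored, 6)
print(upright)  # [[3, 1], [4, 2]]
```

Once the pixels are rewritten like this and the orientation tag is dropped, it no longer matters whether a browser reads EXIF data or not.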

@ViralBShah
Member

+1 for the ACM dissertation award nomination.

@ViralBShah
Member

Nomination deadline is Oct 31, 2015.

http://awards.acm.org/doctoral_dissertation/nominations.cfm

I don't like it that ACM requires a copyright transfer to them and exclusive publication rights. It is still valuable and prestigious in any case.

@Funfun
Funfun commented May 17, 2015

Congrats!

@ScottPJones
Contributor

Exclusive publication rights!?! I've got dibs on an autographed copy (or two... need one to sell on e-bay in 10-15 years, another to keep for myself 😀)... got to get that before it's locked up by the ACM!
(Of course, that probably means that it would be available on the ACM's digital library for members)

@alanedelman
Contributor

Once the dissertation is handed to me to sign, I have every intention of nominating it for various awards :-)

@ScottPJones
Contributor

No comments from @JeffBezanson... I hope he's celebrating (more checking off his last box), and hasn't been locked in a room somewhere by GJS until he gets an s-expr mode working!

@dikshie
dikshie commented May 17, 2015

Congratulations!

@mbaz
Contributor
mbaz commented May 19, 2015

Congratulations, @JeffBezanson!

@milktrader
Member

(belated) Congrats!

@JeffBezanson
Member

Update: just submitted the document. The submitted version is https://github.com/JeffBezanson/phdthesis/tree/876be73a5aab9b034fac3eb9ea9d8f96713f786f .

I'm sure it's deficient in many ways. Oh well. Hopefully there is nothing really bad in there.

@johnmyleswhite
Member

Nice. Congrats on truly finishing your PhD.

@Keno
Member
Keno commented May 20, 2015

Congratulations, Jeff! Great accomplishment. I'm sure you're glad it's over.

@StefanKarpinski
Member

party

@timholy
Member
timholy commented May 20, 2015

I've already found your thesis fills some holes in Julia's documentation, so it's clearly going to be quite useful. All documents come with flaws; the rest of us are impressed by the strengths! Congrats!

@milktrader
Member

And thank you for sharing it!

@ssfrr
Contributor
ssfrr commented May 20, 2015

Woot! Congratulations on finishing the defense and thesis, and all the other work that they signify and represent. It's nice to have this milestone to recognize all the things you've accomplished. I hope you're proud!

@ViralBShah
Member

@JeffBezanson I hope you are planning to take a few days off to chill and celebrate, or perhaps you are planning to celebrate with type system overhaul or something. :-)

Nice that the new pdf has hyperref - for those wanting to browse. We should also get it up on the julia publications list.
