New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pure function notation #414

Closed
si14 opened this Issue Feb 20, 2012 · 45 comments

Comments

Projects
None yet
@si14

si14 commented Feb 20, 2012

There are a lot of optimization opportunities coming from the ability to distinguish pure functions from unpure. The most important one (IMO) is deforestation: when you have something like sum(transpose(transpose(generate_random_matrix(M, N)) * SCALAR)), you can optimize away an entire matrix construction if you know that sum consumes matrix, generate_random_matrix builds one and all functions that was applied to generated matrix are pure. So it would be handy to have an ability to say that the particular function is pure.

@ViralBShah

This comment has been minimized.

Show comment
Hide comment
@ViralBShah

ViralBShah Feb 20, 2012

Member

We have had a fair amount of discussion in the past about this. It's good to have this issue to discuss so that it remains on the priority list.

Member

ViralBShah commented Feb 20, 2012

We have had a fair amount of discussion in the past about this. It's good to have this issue to discuss so that it remains on the priority list.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Feb 20, 2012

Member

@si14: do you mean the ability to annotate a function as pure or for the compiler to determine that the function is pure?

Member

StefanKarpinski commented Feb 20, 2012

@si14: do you mean the ability to annotate a function as pure or for the compiler to determine that the function is pure?

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Feb 20, 2012

Member

Oh, nevermind. I just read the issue subject and it's clear. I once proposed procedure for impure and function for pure, but it was deemed that it would drive users coming from many programming languages and backgrounds completely crazy.

Member

StefanKarpinski commented Feb 20, 2012

Oh, nevermind. I just read the issue subject and it's clear. I once proposed procedure for impure and function for pure, but it was deemed that it would drive users coming from many programming languages and backgrounds completely crazy.

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Feb 20, 2012

Member

Yeah, it's good to be able to declare functions pure (optional, unchecked declaration). Even C has that now.

Member

JeffBezanson commented Feb 20, 2012

Yeah, it's good to be able to declare functions pure (optional, unchecked declaration). Even C has that now.

@ViralBShah

This comment has been minimized.

Show comment
Hide comment
@ViralBShah

ViralBShah Feb 20, 2012

Member

I guess the interface could be @pure function.... What happens if functions marked pure are actually not pure? Would the program run slower, or would it give wrong answers, or undefined behaviour?

-viral

On 20-Feb-2012, at 11:53 PM, JeffBezanson wrote:

Yeah, it's good to be able to declare functions pure (optional, unchecked declaration). Even C has that now.


Reply to this email directly or view it on GitHub:
#414 (comment)

Member

ViralBShah commented Feb 20, 2012

I guess the interface could be @pure function.... What happens if functions marked pure are actually not pure? Would the program run slower, or would it give wrong answers, or undefined behaviour?

-viral

On 20-Feb-2012, at 11:53 PM, JeffBezanson wrote:

Yeah, it's good to be able to declare functions pure (optional, unchecked declaration). Even C has that now.


Reply to this email directly or view it on GitHub:
#414 (comment)

@ivanmantova

This comment has been minimized.

Show comment
Hide comment
@ivanmantova

ivanmantova Feb 20, 2012

Contributor

If we can't be sure the procedure is indeed pure, we wouldn't be able to optimize it without risking (potentially really bad) undefined behavior.

As for possible optimizations, something like C++ AMP comes to mind.

Contributor

ivanmantova commented Feb 20, 2012

If we can't be sure the procedure is indeed pure, we wouldn't be able to optimize it without risking (potentially really bad) undefined behavior.

As for possible optimizations, something like C++ AMP comes to mind.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Feb 20, 2012

Member

Pure functions would only be allowed to call other pure functions — calling and impure function would be an error (probably run-time, but maybe compile-time). That way you can be sure a function is pure. Each intrinsic would need to be marked as pure or impure; likewise, one would want to be able to indicate whether each ccall is pure or impure. Of course one could lie or be mistaken there, in which case all hell could break loose, but that's what you get for lying.

Member

StefanKarpinski commented Feb 20, 2012

Pure functions would only be allowed to call other pure functions — calling and impure function would be an error (probably run-time, but maybe compile-time). That way you can be sure a function is pure. Each intrinsic would need to be marked as pure or impure; likewise, one would want to be able to indicate whether each ccall is pure or impure. Of course one could lie or be mistaken there, in which case all hell could break loose, but that's what you get for lying.

@si14

This comment has been minimized.

Show comment
Hide comment
@si14

si14 Feb 21, 2012

@StefanKarpinski
The question here is balance between let's say Haskell (where all of your functions have controlled side effects and they are reflected in signatures) and JS (where you don't have any control on types). Your architecture looks sane enough (quite easy to implement and a lot of errors would be handled at compile time). Moreover, it can be expanded to an arbitrary "effect system" when you can annotate functions with "effects" like "memory allocation", "mutates arguments" or "can throw exception", with effects propagating through the call stack (the only problem here is that such feature would probably cause a lot of noise in the source file without proper effect inference — maybe one can generate "intermediate" file with all inferred types and effects shown so programmer can check if everything is right). There are a lot of possibilities in the effect system world, for example you can warn your user if he puts some "allocating" function in the tight loop (and this is clearly bad for performance).

si14 commented Feb 21, 2012

@StefanKarpinski
The question here is balance between let's say Haskell (where all of your functions have controlled side effects and they are reflected in signatures) and JS (where you don't have any control on types). Your architecture looks sane enough (quite easy to implement and a lot of errors would be handled at compile time). Moreover, it can be expanded to an arbitrary "effect system" when you can annotate functions with "effects" like "memory allocation", "mutates arguments" or "can throw exception", with effects propagating through the call stack (the only problem here is that such feature would probably cause a lot of noise in the source file without proper effect inference — maybe one can generate "intermediate" file with all inferred types and effects shown so programmer can check if everything is right). There are a lot of possibilities in the effect system world, for example you can warn your user if he puts some "allocating" function in the tight loop (and this is clearly bad for performance).

@si14

This comment has been minimized.

Show comment
Hide comment
@si14

si14 Feb 21, 2012

Here is an example of what I mean: #249 — this "!" is in fact an "effect".

si14 commented Feb 21, 2012

Here is an example of what I mean: #249 — this "!" is in fact an "effect".

@si14

This comment has been minimized.

Show comment
Hide comment
@si14

si14 Feb 21, 2012

And here is an example of "deforestation" optimization: http://arma.sourceforge.net/ . This library uses C++ templates to implement such optimization, and as you can see here http://arma.sourceforge.net/speed.html it's fast.

si14 commented Feb 21, 2012

And here is an example of "deforestation" optimization: http://arma.sourceforge.net/ . This library uses C++ templates to implement such optimization, and as you can see here http://arma.sourceforge.net/speed.html it's fast.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Feb 21, 2012

Member

Armadillo is really, really interesting. We've talked a lot about doing things like this to avoid creation of temporaries when executing linear algebra code. The main limiting factor actually has been a collective aversion to doing things that are not dead obvious and dead simple — I'd much rather give the programmer the ability to write this easily in an explicit way than to have the compiler do it automatically in a clever way that might not be obvious to the programmer.

At some point (a while ago), I was arguing for the ! at the end of function names to be enforced — effectively what we're talking about with pure versus impure. The idea was that all functions without a ! at the end would be required to be pure while functions with ! at the end would be allowed to be impure. This idea didn't go over too well, however, I think largely because it was deemed to draconian to force people to put ! everywhere that a side-effect might happen. I wasn't even talking about I/O either, just things that affect global state.

Member

StefanKarpinski commented Feb 21, 2012

Armadillo is really, really interesting. We've talked a lot about doing things like this to avoid creation of temporaries when executing linear algebra code. The main limiting factor actually has been a collective aversion to doing things that are not dead obvious and dead simple — I'd much rather give the programmer the ability to write this easily in an explicit way than to have the compiler do it automatically in a clever way that might not be obvious to the programmer.

At some point (a while ago), I was arguing for the ! at the end of function names to be enforced — effectively what we're talking about with pure versus impure. The idea was that all functions without a ! at the end would be required to be pure while functions with ! at the end would be allowed to be impure. This idea didn't go over too well, however, I think largely because it was deemed to draconian to force people to put ! everywhere that a side-effect might happen. I wasn't even talking about I/O either, just things that affect global state.

@si14

This comment has been minimized.

Show comment
Hide comment
@si14

si14 Feb 22, 2012

@StefanKarpinski , the thing that you are talking about is somewhat different from doing IO, for example. There are different aspects of "purity" and it would be better to have a uniform way to denote different aspects of function behaviour. Let me show you an example in a form of pseudocode.

def first(&a)[mutates]:
    a += 1

def second()[io]:
    print "hello world"

def third()[???]:
    a = 1
    first(a)
    second()

Let's think about this ???. Should we propogate "mutates" effect to the third function? Clearly not. What about "io" effect? Looks like so, because if we now that function will make some IO we can't optimize it away because of unused return, for example. Both first and second are impure, but their impurities are different and so does optimization that can be done on them.
Another examples of impurity are memory allocation or exceptions that can be thrown. And, again, that should be handled by compiler in a different ways.

si14 commented Feb 22, 2012

@StefanKarpinski , the thing that you are talking about is somewhat different from doing IO, for example. There are different aspects of "purity" and it would be better to have a uniform way to denote different aspects of function behaviour. Let me show you an example in a form of pseudocode.

def first(&a)[mutates]:
    a += 1

def second()[io]:
    print "hello world"

def third()[???]:
    a = 1
    first(a)
    second()

Let's think about this ???. Should we propogate "mutates" effect to the third function? Clearly not. What about "io" effect? Looks like so, because if we now that function will make some IO we can't optimize it away because of unused return, for example. Both first and second are impure, but their impurities are different and so does optimization that can be done on them.
Another examples of impurity are memory allocation or exceptions that can be thrown. And, again, that should be handled by compiler in a different ways.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Feb 22, 2012

Member

Annotating things in that much detail is a non-starter. It's way too much book-keeping to foist on the programmer. The language has to be usable by people who don't know and don't care what different kinds of functional impurity are. On the other hand, if the compiler can compute and track all of these things for the programmer, then it's cool.

Member

StefanKarpinski commented Feb 22, 2012

Annotating things in that much detail is a non-starter. It's way too much book-keeping to foist on the programmer. The language has to be usable by people who don't know and don't care what different kinds of functional impurity are. On the other hand, if the compiler can compute and track all of these things for the programmer, then it's cool.

@si14

This comment has been minimized.

Show comment
Hide comment
@si14

si14 Feb 22, 2012

@StefanKarpinski Of course compiler can infer effects, it's the whole point :) Moreover, compiler should do it even if programmer explicitly annotate function in some way to control if programmer was mistaken. Look, rules for inferring effects are kinda easy (though different for different effects): some will simply propagate through functions (like "allocation" one — you can't build a function that uses allocating function and don't allocate itself), some will be little harder to infer (like "mutates it's arguments" — "outer" function will be "mutative" only if it passes it's arguments to "mutative" function), but it's definitely useful.
BTW, we can meet in IRC to discuss in more realtime fashion :)

si14 commented Feb 22, 2012

@StefanKarpinski Of course compiler can infer effects, it's the whole point :) Moreover, compiler should do it even if programmer explicitly annotate function in some way to control if programmer was mistaken. Look, rules for inferring effects are kinda easy (though different for different effects): some will simply propagate through functions (like "allocation" one — you can't build a function that uses allocating function and don't allocate itself), some will be little harder to infer (like "mutates it's arguments" — "outer" function will be "mutative" only if it passes it's arguments to "mutative" function), but it's definitely useful.
BTW, we can meet in IRC to discuss in more realtime fashion :)

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Feb 22, 2012

Member

I'll hop on IRC at some point, but can't today. I'm on vacation in Buenos Aires...

Member

StefanKarpinski commented Feb 22, 2012

I'll hop on IRC at some point, but can't today. I'm on vacation in Buenos Aires...

@o-jasper

This comment has been minimized.

Show comment
Hide comment
@o-jasper

o-jasper Aug 3, 2012

Member

What about 'semipure' functions. For instance, memoized functions;

memoize_hash = dict((Number,),(Number,))
function memoized_function(x::Number)
   val = get(memoize_hash,x, nothing) #hmm, in the real world, maybe you want to round x instead
   if isnothing(val)
     val = very_expensive_function(x)
     assign(memoize_hash,x, val)
   end
   return val
end

It isn't thread-safe, but f(memoized_function(x),memoized_function(x)) would be better off converted to tmp=memoized_function(x); f(tmp,tmp), it isn't really destructive either.

Member

o-jasper commented Aug 3, 2012

What about 'semipure' functions. For instance, memoized functions;

memoize_hash = dict((Number,),(Number,))
function memoized_function(x::Number)
   val = get(memoize_hash,x, nothing) #hmm, in the real world, maybe you want to round x instead
   if isnothing(val)
     val = very_expensive_function(x)
     assign(memoize_hash,x, val)
   end
   return val
end

It isn't thread-safe, but f(memoized_function(x),memoized_function(x)) would be better off converted to tmp=memoized_function(x); f(tmp,tmp), it isn't really destructive either.

@diegozea

This comment has been minimized.

Show comment
Hide comment
@diegozea

diegozea Feb 6, 2013

Member

Maybe a silly comment (sorry if it is), but... Maybe @pure can check for not I/O operations, for only use arguments and variables defined on the function or const if are global and for the absent of assignments or applications of mutating functions into the arguments and trow and error if something of this is break. Is there more cases of side effects?

@StefanKarpinski I hope did you enjoy your vacations here? :)

Member

diegozea commented Feb 6, 2013

Maybe a silly comment (sorry if it is), but... Maybe @pure can check for not I/O operations, for only use arguments and variables defined on the function or const if are global and for the absent of assignments or applications of mutating functions into the arguments and trow and error if something of this is break. Is there more cases of side effects?

@StefanKarpinski I hope did you enjoy your vacations here? :)

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Jun 13, 2013

Member

This discussion is getting out of control. It's way more important to have an unchecked pure annotation first; you can worry about compiler checking later. If the user annotates an impure function as "pure" then the results are simply undefined.

The ability to annotate pure functions is incredibly important in numerical applications in order to implement constant-folding optimizations, because constant expressions like float64(1//3) or sin(3) are very common in numerical code.

Hence pure should mean, "with constant arguments, this function may be safely evaluated at compile-time." (This is essentially the usual meaning of pure function, of course, not counting memory allocation as a side-effect.)

(For the same reason, I suspect that this needs to be a keyword recognized by the parser, not a macro, so that the pure annotation can be passed through to the compiler.)

Member

stevengj commented Jun 13, 2013

This discussion is getting out of control. It's way more important to have an unchecked pure annotation first; you can worry about compiler checking later. If the user annotates an impure function as "pure" then the results are simply undefined.

The ability to annotate pure functions is incredibly important in numerical applications in order to implement constant-folding optimizations, because constant expressions like float64(1//3) or sin(3) are very common in numerical code.

Hence pure should mean, "with constant arguments, this function may be safely evaluated at compile-time." (This is essentially the usual meaning of pure function, of course, not counting memory allocation as a side-effect.)

(For the same reason, I suspect that this needs to be a keyword recognized by the parser, not a macro, so that the pure annotation can be passed through to the compiler.)

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Jun 13, 2013

Member

(By the way, it seems to me that @si14's original example is not pure, because pseudo-random-number generation both depends on and modifies global state, assuming generate_random_matrix is pseudorandom. But I suppose "random_matrix" was only meant in the informal sense of "some arbitrary constant matrix".)

Member

stevengj commented Jun 13, 2013

(By the way, it seems to me that @si14's original example is not pure, because pseudo-random-number generation both depends on and modifies global state, assuming generate_random_matrix is pseudorandom. But I suppose "random_matrix" was only meant in the informal sense of "some arbitrary constant matrix".)

@si14

This comment has been minimized.

Show comment
Hide comment
@si14

si14 Jun 16, 2013

@stevengj you are definitely right about purity of generate_random_matrix :)

si14 commented Jun 16, 2013

@stevengj you are definitely right about purity of generate_random_matrix :)

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Dec 20, 2013

Member

I think we should learn from Rust's experience and just close this issue:

http://thread.gmane.org/gmane.comp.lang.rust.devel/3674/focus=3855

If this couldn't be made to work in Rust, which is statically compiled and relatively much more willing to make its users just through hoops (their target: professional programmers who write large, complex systems software; our target: scientists), then there's no way this will ever fly in Julia.

Member

StefanKarpinski commented Dec 20, 2013

I think we should learn from Rust's experience and just close this issue:

http://thread.gmane.org/gmane.comp.lang.rust.devel/3674/focus=3855

If this couldn't be made to work in Rust, which is statically compiled and relatively much more willing to make its users just through hoops (their target: professional programmers who write large, complex systems software; our target: scientists), then there's no way this will ever fly in Julia.

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Dec 20, 2013

Member

Note that Rust apparently made everything pure by default and required a keyword for non-pure functions, according to that post, which proved unworkable for them. That would definitely not fly for our audience, I agree. Nor, as I argued above, should we worry about compiler-enforced purity (any more than Julia enforces @inbounds).

But there have been plenty of working examples of optional pure annotations for functions in other languages. For example, __attribute__((pure)) in gcc. Why is gcc such a terrible model to follow?

Member

stevengj commented Dec 20, 2013

Note that Rust apparently made everything pure by default and required a keyword for non-pure functions, according to that post, which proved unworkable for them. That would definitely not fly for our audience, I agree. Nor, as I argued above, should we worry about compiler-enforced purity (any more than Julia enforces @inbounds).

But there have been plenty of working examples of optional pure annotations for functions in other languages. For example, __attribute__((pure)) in gcc. Why is gcc such a terrible model to follow?

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Dec 20, 2013

Member

It would be a shame for x::FloatingPoint + (2//3) to always be slow in Julia because the floating-point conversion of 2//3 cannot be performed at compile-time.

Member

stevengj commented Dec 20, 2013

It would be a shame for x::FloatingPoint + (2//3) to always be slow in Julia because the floating-point conversion of 2//3 cannot be performed at compile-time.

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Dec 20, 2013

Member

Impure-by-default with a pure keyword would certainly be the way to go. But the other points in the Rust discussion are relevant: for example [1,2] is pretty darn pure, but can't be evaluated fully at compile time since it creates a new mutable data structure on each call.

Member

JeffBezanson commented Dec 20, 2013

Impure-by-default with a pure keyword would certainly be the way to go. But the other points in the Rust discussion are relevant: for example [1,2] is pretty darn pure, but can't be evaluated fully at compile time since it creates a new mutable data structure on each call.

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Dec 20, 2013

Member

@JeffBezanson, I still don't see the problem. It's the programmer's responsibility to use pure safely. Julia should be free to assume that anything that is declared pure and evaluated with (for example) pointer-free immutable constant arguments can be evaluated at compile time (along with implications for loop hoisting and CSE, but IMO compile-time constant expressions are the most important), and if this causes a problem it is programmer error.

e.g. declaring a constructor for a mutable type to be pure would be be a programmer error, no different from using @inbounds on an expression that has not been previously bounds-checked in some way.

As a result, the applications of pure will be fairly restricted, but the cases where they apply will be extremely useful (as in the rational-constant examples above).

Member

stevengj commented Dec 20, 2013

@JeffBezanson, I still don't see the problem. It's the programmer's responsibility to use pure safely. Julia should be free to assume that anything that is declared pure and evaluated with (for example) pointer-free immutable constant arguments can be evaluated at compile time (along with implications for loop hoisting and CSE, but IMO compile-time constant expressions are the most important), and if this causes a problem it is programmer error.

e.g. declaring a constructor for a mutable type to be pure would be be a programmer error, no different from using @inbounds on an expression that has not been previously bounds-checked in some way.

As a result, the applications of pure will be fairly restricted, but the cases where they apply will be extremely useful (as in the rational-constant examples above).

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Dec 20, 2013

Member

I do think we could get away with the definition that x===y => f(x)===f(y) and f doesn't do I/O or write to global state.

Member

JeffBezanson commented Dec 20, 2013

I do think we could get away with the definition that x===y => f(x)===f(y) and f doesn't do I/O or write to global state.

@stevengj stevengj reopened this Dec 20, 2013

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Dec 20, 2013

Member

That definition seems very clean.

Member

stevengj commented Dec 20, 2013

That definition seems very clean.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Dec 20, 2013

Member

In that case, we'd need to distinguish pure and impure ccalls.

Member

StefanKarpinski commented Dec 20, 2013

In that case, we'd need to distinguish pure and impure ccalls.

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Dec 20, 2013

Member

I think we could leave it as a julia function attribute, and the compiler would just trust that it's correct. That would handle wrapped functions like sin. Wanting to do a direct ccall and say that it's pure is a pretty specialized case.

Member

JeffBezanson commented Dec 20, 2013

I think we could leave it as a julia function attribute, and the compiler would just trust that it's correct. That would handle wrapped functions like sin. Wanting to do a direct ccall and say that it's pure is a pretty specialized case.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Dec 20, 2013

Member

Yeah, ok, fair enough.

Member

StefanKarpinski commented Dec 20, 2013

Yeah, ok, fair enough.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Dec 21, 2013

Member

I'm thinking that this would have to be at the level of generic functions, since having it be per-method is a bit of a mess. Together with being able to annotate overall generic function type behavior, it seems like we're ripe for some function-level declaration/annotation ability.

Member

StefanKarpinski commented Dec 21, 2013

I'm thinking that this would have to be at the level of generic functions, since having it be per-method is a bit of a mess. Together with being able to annotate overall generic function type behavior, it seems like we're ripe for some function-level declaration/annotation ability.

@lindahua

This comment has been minimized.

Show comment
Hide comment
@lindahua

lindahua Dec 21, 2013

Member

From a pragmatic standpoint, if we allow people to declare pure functions and just trust them (and leverage this information to do code optimization), we'd better have a mechanism to turn off the pure-function-based optimization globally or for individual functions.

In practice, you can expect people to make a lot of mistakes (e.g. declaring things to be pure when they are not), which would, as you may imagine, result in disasters, especially for inexperienced programmers (e.g. the compiler will generate error messages that are hard to figure out how they are related to the actual code, or it simply do not raise errors, but yield very weird results). Being able to turn off the such mechanism may help debugging.

BTW, this is also needed for things like @inbounds.

Member

lindahua commented Dec 21, 2013

From a pragmatic standpoint, if we allow people to declare pure functions and just trust them (and leverage this information to do code optimization), we'd better have a mechanism to turn off the pure-function-based optimization globally or for individual functions.

In practice, you can expect people to make a lot of mistakes (e.g. declaring things to be pure when they are not), which would, as you may imagine, result in disasters, especially for inexperienced programmers (e.g. the compiler will generate error messages that are hard to figure out how they are related to the actual code, or it simply do not raise errors, but yield very weird results). Being able to turn off the such mechanism may help debugging.

BTW, this is also needed for things like @inbounds.

@lindahua

This comment has been minimized.

Show comment
Hide comment
@lindahua

lindahua Dec 21, 2013

Member

Maybe we should have a debug mode or something, where all code optimization based on user declarations (@inbounds, pure functions, etc) are disabled.

Member

lindahua commented Dec 21, 2013

Maybe we should have a debug mode or something, where all code optimization based on user declarations (@inbounds, pure functions, etc) are disabled.

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Dec 22, 2013

Member

@StefanKarpinski, following Jeff's suggested definition of pure, then it has to be per-method and not per-generic function. sin(::Float64) is pure, but sin(::Array{Float64}) is not (because it allocates a new array).

Member

stevengj commented Dec 22, 2013

@StefanKarpinski, following Jeff's suggested definition of pure, then it has to be per-method and not per-generic function. sin(::Float64) is pure, but sin(::Array{Float64}) is not (because it allocates a new array).

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Dec 22, 2013

Member

@lindahua, +1 on a "safe" mode. (I would prefer "safe" to "debug".)

Member

stevengj commented Dec 22, 2013

@lindahua, +1 on a "safe" mode. (I would prefer "safe" to "debug".)

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Dec 22, 2013

Member

@StefanKarpinski, following Jeff's suggested definition of pure, then it has to be per-method and not per-generic function. sin(::Float64) is pure, but sin(::Array{Float64}) is not (because it allocates a new array).

Yes, that's kind of my point. When writing any code that's generic, you can very easily find yourself in a situation where you have to split the exact same definition into pure and impure versions based on very subtle differences. Nobody is realistically going to do that. At that point, purity is a property that's better off automatically inferred rather than manually annotated. The only thing standing in the way of pretty decent bottom-up inference of purity, at least in situations where we've figured out exactly what method gets called, is annotation of pure vs. impure ccalls.

Member

StefanKarpinski commented Dec 22, 2013

@StefanKarpinski, following Jeff's suggested definition of pure, then it has to be per-method and not per-generic function. sin(::Float64) is pure, but sin(::Array{Float64}) is not (because it allocates a new array).

Yes, that's kind of my point. When writing any code that's generic, you can very easily find yourself in a situation where you have to split the exact same definition into pure and impure versions based on very subtle differences. Nobody is realistically going to do that. At that point, purity is a property that's better off automatically inferred rather than manually annotated. The only thing standing in the way of pretty decent bottom-up inference of purity, at least in situations where we've figured out exactly what method gets called, is annotation of pure vs. impure ccalls.

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Dec 22, 2013

Member

That is a good point; f(x) = sin(x) could not be declared pure.

Member

JeffBezanson commented Dec 22, 2013

That is a good point; f(x) = sin(x) could not be declared pure.

@GlenHertz

This comment has been minimized.

Show comment
Hide comment
@GlenHertz

GlenHertz Dec 23, 2013

Contributor

Doesn't sin(::Float64) allocate memory too (64 bytes)? If that is the criteria then purity wouldn't be that common.

Contributor

GlenHertz commented Dec 23, 2013

Doesn't sin(::Float64) allocate memory too (64 bytes)? If that is the criteria then purity wouldn't be that common.

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Dec 23, 2013

Member

That's the beauty of ===. It takes care of most of these issues if you just
go by my definition above. Allocating memory is not considered a side
effect by itself; the caller has to be able to observe the difference.
Also in most uses sin will not allocate anything; it is probably boxing
some values due to the repl.
On Dec 22, 2013 10:18 PM, "Glen Hertz" notifications@github.com wrote:

Doesn't sin(::Float64) allocate memory too (64 bytes)? If that is the
criteria then purity wouldn't be that common.


Reply to this email directly or view it on GitHubhttps://github.com/JuliaLang/julia/issues/414#issuecomment-31102763
.

Member

JeffBezanson commented Dec 23, 2013

That's the beauty of ===. It takes care of most of these issues if you just
go by my definition above. Allocating memory is not considered a side
effect by itself; the caller has to be able to observe the difference.
Also in most uses sin will not allocate anything; it is probably boxing
some values due to the repl.
On Dec 22, 2013 10:18 PM, "Glen Hertz" notifications@github.com wrote:

Doesn't sin(::Float64) allocate memory too (64 bytes)? If that is the
criteria then purity wouldn't be that common.


Reply to this email directly or view it on GitHubhttps://github.com/JuliaLang/julia/issues/414#issuecomment-31102763
.

@RauliRuohonen

This comment has been minimized.

Show comment
Hide comment
@RauliRuohonen

RauliRuohonen Dec 23, 2013

I think Jeff's rule is sort of right and sort of wrong - it's basically what the compiler should pretend the "pure" declaration means, with the understanding that some functions declared pure will not actually obey the rule. The point is to enable optimizations, so when the programmer declares a function to be pure, it should mean that the compiler is free to evaluate it as many times as it wishes whenever it wishes for a particular argument, and at any time return any one of those evaluations in whatever arbitrary way, and the programmer asserts that the program won't break because of it. It should also be method-specific, not generic-function-specific (this is an optimization, so the narrow claim is safer and sometimes the wider claim won't even be possible).

It is entirely reasonable to declare sin(::Array{Float64})) pure. The result won't always be a new copy depending on CSE optimizations, but then again, that's not anything new with array expressions. In e.g. Python if you do A[1:3], you won't get a copy either, and people accept that for performance reasons, using A[1:3].copy() when they really want a copy. That's not in principle any different from sin(A) not returning a new unshared array, and you can still use sin(A).copy() if desired. I'm not saying that the base sin() function should do this, just that it is not unreasonable to declare the implementation of such a function pure. If the base version does not do that, it is easy enough to implement a module with pure-declared wrappers of functions when such semantics are desired. If the compiler simply accepts the pure declaration at face value without any checking, it leaves the users the ability to implement functions with that kind of API. Even Haskell has the unsafePerformIO escape hatch for when you want something to be declared pure but not really technically be so.

I think Jeff's rule is sort of right and sort of wrong - it's basically what the compiler should pretend the "pure" declaration means, with the understanding that some functions declared pure will not actually obey the rule. The point is to enable optimizations, so when the programmer declares a function to be pure, it should mean that the compiler is free to evaluate it as many times as it wishes whenever it wishes for a particular argument, and at any time return any one of those evaluations in whatever arbitrary way, and the programmer asserts that the program won't break because of it. It should also be method-specific, not generic-function-specific (this is an optimization, so the narrow claim is safer and sometimes the wider claim won't even be possible).

It is entirely reasonable to declare sin(::Array{Float64})) pure. The result won't always be a new copy depending on CSE optimizations, but then again, that's not anything new with array expressions. In e.g. Python if you do A[1:3], you won't get a copy either, and people accept that for performance reasons, using A[1:3].copy() when they really want a copy. That's not in principle any different from sin(A) not returning a new unshared array, and you can still use sin(A).copy() if desired. I'm not saying that the base sin() function should do this, just that it is not unreasonable to declare the implementation of such a function pure. If the base version does not do that, it is easy enough to implement a module with pure-declared wrappers of functions when such semantics are desired. If the compiler simply accepts the pure declaration at face value without any checking, it leaves the users the ability to implement functions with that kind of API. Even Haskell has the unsafePerformIO escape hatch for when you want something to be declared pure but not really technically be so.

@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Dec 23, 2013

Member

There's a difference between a function like A[1:3] reliably referencing the same underlying data, vs. a function that may or may not return a new array based on what the compiler did.

Member

JeffBezanson commented Dec 23, 2013

There's a difference between a function like A[1:3] reliably referencing the same underlying data, vs. a function that may or may not return a new array based on what the compiler did.

@RauliRuohonen

This comment has been minimized.

Show comment
Hide comment
@RauliRuohonen

RauliRuohonen Dec 23, 2013

True, but still there is a common use case where you'd want a pure-declared sin(::Array{Float64})) to enable CSE and compile-time evaluation, and it is entirely safe. This happens when you write entirely pure mathematical functions, where by "pure" I mean pure-as-in-Haskell. You don't modify anything, so you don't care whether something is a reference or a copy. You could have a module with pure-declared versions of functions like sin() designed for that use case. If the final result of such a computation is an array (instead of e.g. a scalar) and you want to ensure that it is a new one, then you can do a copy as a final step to hide any optimization effects.

Basically what I'm trying to say that the simplest implementation of "pure" is probably the most useful one: the programmer asserts that the compiler can do its optimizations with no checking, and that's it, much like it is with @inbounds.

True, but still there is a common use case where you'd want a pure-declared sin(::Array{Float64})) to enable CSE and compile-time evaluation, and it is entirely safe. This happens when you write entirely pure mathematical functions, where by "pure" I mean pure-as-in-Haskell. You don't modify anything, so you don't care whether something is a reference or a copy. You could have a module with pure-declared versions of functions like sin() designed for that use case. If the final result of such a computation is an array (instead of e.g. a scalar) and you want to ensure that it is a new one, then you can do a copy as a final step to hide any optimization effects.

Basically what I'm trying to say that the simplest implementation of "pure" is probably the most useful one: the programmer asserts that the compiler can do its optimizations with no checking, and that's it, much like it is with @inbounds.

@StefanKarpinski

This comment has been minimized.

Show comment
Hide comment
@StefanKarpinski

StefanKarpinski Dec 23, 2013

Member

Haskell is a completely off-base model here since you can't semantically modify anything. In Julia – or any language with semantically mutable arrays – deciding arbitrarily whether to return a copy or the same array is totally nuts.

Member

StefanKarpinski commented Dec 23, 2013

Haskell is a completely off-base model here since you can't semantically modify anything. In Julia – or any language with semantically mutable arrays – deciding arbitrarily whether to return a copy or the same array is totally nuts.

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Oct 13, 2015

Member

With #13555, there is now an (unchecked, unexported) @pure annotation that allows functions to be evaluated at compile-time, but it needs several compiler optimizations to be hooked into it in order to be generally useful (e.g. CSE, loop-invariance, constant propagation, and to interact properly with inlining).

It would also be nice to have a julia --safe flag to turn off this and other unsafe annotations. Or maybe just --pure=no.

Member

stevengj commented Oct 13, 2015

With #13555, there is now an (unchecked, unexported) @pure annotation that allows functions to be evaluated at compile-time, but it needs several compiler optimizations to be hooked into it in order to be generally useful (e.g. CSE, loop-invariance, constant propagation, and to interact properly with inlining).

It would also be nice to have a julia --safe flag to turn off this and other unsafe annotations. Or maybe just --pure=no.

@vtjnash vtjnash closed this Nov 18, 2015

@stevengj

This comment has been minimized.

Show comment
Hide comment
@stevengj

stevengj Nov 18, 2015

Member

@vtjnash, maybe we need a followup issue to track @pure optimizations?

Update: created #14324.

Member

stevengj commented Nov 18, 2015

@vtjnash, maybe we need a followup issue to track @pure optimizations?

Update: created #14324.

KristofferC added a commit that referenced this issue Jul 3, 2018

Add support for quoted args in REPL mode (#414)
* Add `start` command to REPL mode

* Add support for quoted args in REPL mode

* Add support for quoted args

* Add more tests for quoted args in REPL

* Remove function accidentally commited to master

* Replace accidentaly removed UUID from Project.toml

* Remove double quote from tests (not supported by Windows)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment