
Improve pure functions spec #2627

Open · wants to merge 6 commits into base: master
Conversation

andralex
Member

I've redone the pure functions section to improve definitions. I'm aiming for a narrow and precise definition of strongly pure functions; following that, aggressive optimizations can be applied. cc @WalterBright @tgehr @dnadlinger @JohanEngelen @redstar

@dlang-bot
Contributor

Thanks for your pull request, @andralex!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@andralex
Member Author

N.B.: this is best viewed as a split diff.

@thewilsonator
Contributor

Searching for trailing whitespace
./spec/function.dd:411:$(P A pure function that passes all tests for strongly pure functions except (4) is called a 

spec/function.dd Outdated
pure int fun(S* object);
---

$(LI The function returns `void`. Example:)
Contributor

Why?

Member Author

Currently such functions are already accepted (this diff attempts to clarify the spec without altering it). They don't make sense as strongly pure functions, because then they would simply never be called at all.

One use case is a deallocation function.

tgehr (Contributor) commented Apr 18, 2019

Deallocation functions are not a good match for weak purity, because "this will at most modify what is reachable through the parameters" and "this needs to be called to reclaim memory" are not compatible. One is a guarantee, the other is a restriction.

Contributor

Note that if we require exceptions and non-termination to be preserved, then strongly pure functions returning void may not be elided unless they follow an identical function call.
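A minimal sketch of that concern (the function name and body are made up for illustration): if exception behavior must be preserved, a strongly pure function returning `void` cannot simply be dropped.

```d
// Hypothetical example: a pure function returning void whose only
// observable effect is a possible exception.
pure void check(int x)
{
    if (x < 0)
        throw new Exception("negative input");
}

void main()
{
    import std.exception : assertThrown;

    check(1);                          // eliding this call changes nothing observable
    assertThrown!Exception(check(-1)); // eliding this one would drop the throw
}
```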

Member Author

@tgehr I just want to make sure odd cases like that are put in the weakly pure category. That means no moving calls around, no elision, and no nasty surprises. Recall that "strongly pure" is the elite; "weakly pure" is the rest. We need a good argument to put something in the strongly pure category. Weakly pure is the default.

Contributor

That does not make a lot of sense. Why is a void return an "odd case" and returning an empty struct instance is not an "odd case"? The only way this frivolous special casing is acceptable is by noting that we can remove it in the future, but I fear that you might try to double down on it by coupling it with some unrelated feature to save on syntax.

More generally, I am very much against creating categories of pure functions based on the function signature such that the semantics in one category are more relaxed than in another category even though they can call each other. If function inlining can make the set of allowed reorderings smaller, then your semantics is broken.

Member Author

@tgehr I'm just keeping things simple. Going with void is simple and obvious. Going with "all data types without state" would work but is more complicated with no increase in expressive power. Simplicity is something to be appreciated.

More generally, I am very much against creating categories of pure functions based on the function signature such that the semantics in one category are more relaxed than in another category even though they can call each other. If function inlining can make the set of allowed reorderings smaller, then your semantics is broken.

My perception is there's a lot of good experience with distinguishing between weak and strong purity. Initially we only had strong purity, which was very difficult to use. When we added weak purity things got suddenly a ton better. We could of course eliminate them and just go with one definition and let compilers optimize as they find fit. But you'd need a good amount of convincing that these are useless.

So what's your take?

Contributor

@tgehr I'm just keeping things simple. Going with void is simple and obvious. Going with "all data types without state" would work but is more complicated with no increase in expressive power. Simplicity is something to be appreciated.
...

Adding an unnecessary special case is not keeping it simple. It is the opposite of that.

We do not even need a way to say: this function is only weakly pure even though it cannot mutate any state. And making it implicit with a void return is not even a good way to specify that, even if you do want that feature for some reason. Especially now, when you have allowed out parameters on strongly pure functions (as well as pure factory functions).

More generally, I am very much against creating categories of pure functions based on the function signature such that the semantics in one category are more relaxed than in another category even though they can call each other. If function inlining can make the set of allowed reorderings smaller, then your semantics is broken.

My perception is there's a lot of good experience with distinguishing between weak and strong purity. Initially we only had strong purity, which was very difficult to use. When we added weak purity things got suddenly a ton better. We could of course eliminate them and just go with one definition and let compilers optimize as they find fit. But you'd need a good amount of convincing that these are useless.

So what's your take?

My take is that your fluffy paragraph fails to address my point while setting up a straw man. (It also wasted a good bit of my time, spent verifying that this is indeed the case.)

Member Author

Sorry for wasting your time. I don't know what I'm doing.

What I want here is to give people a chance to do systemy things like this:

pure void deallocate(immutable(void)[] array);

At some point there's going to be a need for such in a systems programming language. And it should be well defined and work well - no elision, no reordering, just straight call.

Without the special rule, the function above is strongly pure. Right? Then an implementation looks at the rules and figures there's no need to ever call it.

I don't know how to do this another way. Please advise.

spec/function.dd Outdated
pure S* fun(int);
---

$(LI The function takes zero arguments. Example:)
Contributor

Why?

andralex (Member Author) commented Apr 18, 2019

Such functions are already accepted and considering them strongly pure would not be useful - they'd be constants. So I put them in the weakly pure category.

Weakly pure means "somewhat restricted, can be called by any pure function, but not reordered at will".
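For illustration, here is a typical weakly pure function (a sketch with made-up names): it touches no global state, yet it mutates data reachable through a parameter, so its calls cannot be freely reordered or elided.

```d
// Weakly pure: no globals are read or written, but the mutable slice
// parameter may be modified, so the call is not freely reorderable.
pure void scale(int[] data, int factor)
{
    foreach (ref x; data)
        x *= factor;
}

void main()
{
    auto a = [1, 2, 3];
    scale(a, 2);
    assert(a == [2, 4, 6]);
}
```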

Contributor

Constants or factory functions.

Member Author

@tgehr cool! So those should be weakly pure, correct?

tgehr (Contributor) commented Apr 18, 2019

No, they should be (potentially) strongly pure. Note that your pull request says they can't be factory functions.
(Here, I'm ignoring issues like exceptions and termination etc. If we don't do that, then they would not necessarily be constants.)

@tgehr
Contributor

tgehr commented Apr 18, 2019

I think there may be the idea that strong and weak purity mean "can optimize" and "cannot optimize", but this is not true. (For example, a weakly pure function that may only modify data that is not accessed later can be elided. There can also be optimizations based on pure functions accessing only const arguments.) I believe however that this is what probably motivated the special case for void return.

I have no idea why there would be a special case for functions with zero arguments. This does not exist now and it contradicts existing compiler behavior:

int[] foo() pure
{
    return [1, 2, 3];
}
void main()
{
    immutable a = foo(); // ok
}

@tgehr
Contributor

tgehr commented Apr 18, 2019

Also, I think it is a bit weird that the spec defines "weakly pure" and then "strongly pure" as "pure but not weakly pure".

@andralex
Member Author

I think there may be the idea that strong and weak purity mean "can optimize" and "cannot optimize", but this is not true. (For example, a weakly pure function that may only modify data that is not accessed later can be elided. There can also be optimizations based on pure functions accessing only const arguments.)

I'm trying to define strong purity as "definitely have freedom to optimize in the functional tradition, even when the definition is not available". Weak purity is "may or may not depending on information at hand".

Also: the weakly pure definition is more conservative than it could be. Consider:

pure void fun(int input, out int output);

This could be categorized as a strongly pure function, because it's just an awkward way of writing:

pure int fun(int input);

By this PR, however, it's categorized as weakly pure. Making it strongly pure would complicate the definition while not allowing additional useful cases.

@andralex
Member Author

andralex commented Apr 18, 2019

I have no idea why there would be a special case for functions with zero arguments. This does not exist now and it contradicts existing compiler behavior:

int[] foo() pure
{
    return [1, 2, 3];
}
void main()
{
    immutable a = foo(); // ok
}

Can you please explain that further? By this PR the function would need to return fresh memory every time, which is the right thing to do.

@andralex
Member Author

Also, I think it is a bit weird that the spec defines "weakly pure" and then "strongly pure" as "pure but not weakly pure".

What would be a better way?

The way I went for maximum clarity was to define weakly pure first. Then to make clear that a function can be either weakly pure or strongly pure (but not neither or both), I mention that a strongly pure function is a pure function that is not weakly pure. THEN just to make things crystal clear the doc goes ahead and enumerates the conditions for strong purity.

Are you thinking it's better to define strongly pure first and then just say "if it's pure but not strongly pure, it's weakly pure"?

@andralex
Member Author

andralex commented Apr 18, 2019

@tgehr Consider:

int[] foo() pure
{
    return [1, 2, 3];
}
void main()
{
    immutable a = foo(); // ok
    auto b = foo();
    assert(b == a);
    assert(b !is a);
}

This currently passes, which is good. If foo() were strongly pure it would return the same array.


@tgehr
Contributor

tgehr commented Apr 18, 2019

I think there may be the idea that strong and weak purity mean "can optimize" and "cannot optimize", but this is not true. (For example, a weakly pure function that may only modify data that is not accessed later can be elided. There can also be optimizations based on pure functions accessing only const arguments.)

I'm trying to define strong purity as "definitely have freedom to optimize in the functional tradition, even when the definition is not available".

Besides not being necessary for that, it is also not sufficient (with the current language definition). E.g., Haskell has non-deterministic exception semantics, to allow more rewrites. Furthermore, Haskell does not expose reference identities for immutable values.

Weak purity is "may or may not depending on information at hand".
...

Strong vs weak purity isn't really about optimizations. I think it's mostly a marketing thing to avoid the claim that D does not have "true" functional purity (which it really does not because e.g., you can't have an immutable list data type with value semantics due to the is operator).

The important distinction is between what you call weakly pure functions and pure factory functions, because it affects implicit conversions. Currently, the pull request says pure factory functions must have at least one argument.

Also: the weakly pure definition is more conservative than it could be. Consider:

pure void fun(int input, out int output);

This could be categorized as a strongly pure function, because it's just an awkward way of writing:

pure int fun(int input);

By this PR, however, it's categorized as weakly pure. Making it strongly pure would complicate the definition while not allowing additional useful cases.

It's weakly pure because it modifies a parameter by reference. That does not mean the compiler should not optimize `fun(input, output1); fun(input, output2);` into `fun(input, output1); output2 = output1;`
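A runnable sketch of why that rewrite is valid (the body of `fun` is made up for illustration): an `out` parameter is reset on entry, so the result depends only on `input`, and two calls with equal inputs must produce equal outputs.

```d
// Hypothetical body for fun: deterministic in `input`, writes only `output`.
pure void fun(int input, out int output)
{
    output = input * 2;
}

void main()
{
    int output1, output2;
    fun(3, output1);
    fun(3, output2); // a compiler could rewrite this to: output2 = output1;
    assert(output1 == output2);
}
```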

@tgehr
Contributor

tgehr commented Apr 18, 2019

I have no idea why there would be a special case for functions with zero arguments. This does not exist now and it contradicts existing compiler behavior:

int[] foo() pure
{
    return [1, 2, 3];
}
void main()
{
    immutable a = foo(); // ok
}

Can you please explain that further? By this PR the function would need to return fresh memory every time, which is the right thing to do.

By this PR the implicit conversion to immutable should not compile because foo has no arguments hence is not a pure factory function. Also, I think that's not "the right thing to do". Guaranteeing to preserve reference identities for immutable references is costly.

@tgehr
Contributor

tgehr commented Apr 18, 2019

Also, I think it is a bit weird that the spec defines "weakly pure" and then "strongly pure" as "pure but not weakly pure".

What would be a better way?
...

Are you thinking it's better to define strongly pure first and then just say "if it's pure but not strongly pure, it's weakly pure"?

Yes, I think it is, but of course that's fully up to you, as it does not influence correctness.

@tgehr
Contributor

tgehr commented Apr 18, 2019

@tgehr Consider:

int[] foo() pure
{
    return [1, 2, 3];
}
void main()
{
    immutable a = foo(); // ok
    auto b = foo();
    assert(b == a);
    assert(b !is a);
}

This currently passes, which is good. If foo() were strongly pure it would return the same array.

"Might", not "would". Also, you are justifying why you think strongly pure functions should not have indirections in their return values, not why they must have arguments. It really does not make sense to require them to have arguments. (The single argument could be of an empty struct type.)

spec/function.dd Outdated
struct S; // defined elsewhere
// These functions are weakly pure:
pure int fun(S object);
pure int fun(immutable S object);
Contributor

Why is this not strongly pure? Clearly it does not have mutable indirections because of transitivity of immutability.

Member Author

@tgehr this is to leave us room for __metadata.

tgehr (Contributor) commented Apr 18, 2019

It contradicts existing behavior:

struct S;
int[] foo(immutable S* s) pure;

immutable(int[]) bar(immutable S* s)
{
    return foo(s); // ok
}

__metadata shouldn't influence strong purity anyway. Otherwise we can't actually implement lazy evaluation correctly.

Member Author

@tgehr I don't understand the example. Can you please clarify?

__metadata shouldn't influence strong purity anyway. Otherwise we can't actually implement lazy evaluation correctly.

I don't understand this either, please clarify. Thanks!

Contributor

@tgehr I don't understand the example. Can you please clarify?
...

The function foo computes an int[] from a pointer to an immutable instance of an opaque datatype. The result converts to immutable implicitly. The PR says this is not the case.

__metadata shouldn't influence strong purity anyway. Otherwise we can't actually implement lazy evaluation correctly.

I don't understand this either, please clarify. Thanks!

Simplest example is you have an immutable data type and you change some field from being eagerly initialized to being lazily initialized. Factory functions that take an immutable instance of your type will break, because their results suddenly cannot be implicitly converted to immutable anymore.

Member Author

@tgehr thanks, I see. The short of it is I don't know how to define __metadata and pure so as to work together properly without blowing complexity up. If you have any ideas, they'd be welcome.

The longer of it is I'm willing to impose additional restrictions on unknown data types if that makes matters simple, even if we need to break code that works in obscure cases. Optimizing strongly pure functions is low-impact anyway (we have zero such optimizations right now and there's no blood in the streets), and it's very rare that types are incompletely defined in D to start with.

@andralex
Member Author

Strong and weak purity isn't really about optimizations. I think it's mostly a marketing thing to avoid the claim that D does not have "true" functional purity (which it really does not because e.g., you can't have an immutable list data type with value semantics due to the is operator).

I'm not that concerned about marketing as much as about making decisions that keep things simple and meaningful. In that sense allowing optimizations is a litmus test - if we can't do it we might have taken the wrong turn someplace. Also I don't want to define pure so as to lock us away from __metadata, or __metadata to lock us away from good optimizations of pure.

I think I have a response to the parenthetical remark - revised version upcoming which restricts what "equivalent parameters" means.

@andralex
Member Author

@tgehr eliminated the no-args requirement in the upcoming revision

@andralex
Member Author

@tgehr followed your advice by placing definition of strongly pure first. Then made weakly pure definition as the complement. Things seem a lil cleaner. Thx.

@tgehr
Contributor

tgehr commented Apr 18, 2019

Strong and weak purity isn't really about optimizations. I think it's mostly a marketing thing to avoid the claim that D does not have "true" functional purity (which it really does not because e.g., you can't have an immutable list data type with value semantics due to the is operator).

I'm not that concerned about marketing as much as about making decisions that keep things simple and meaningful.

I was concerned about conflating strongly pure, which originated as a marketing concept used to justify weak purity, with language semantics. The PR makes more sense with your latest edits. (Or I missed the part restricting reordering before. I have no way to tell as the previous version disappeared after a force push.)

In that sense allowing optimizations is a litmus test - if we can't do it we might have taken the wrong turn someplace.

That's fine, as long as it is with an understanding that the set of specified optimizations is a small subset of what is possible, and not a comprehensive list.

Also I don't want to define pure so as to lock us away from __metadata, or __metadata to lock us away from good optimizations of pure.
...

Neither do I, but that might make it necessary to specify the complete set of allowed rewrites. (Explicitly or implicitly.)

Some possible bad outcomes that I really want to avoid:

  • void returns become optimization blockers.
  • weakly pure functions become optimization blockers.
  • __metadata in arguments become optimization blockers, turning off purity-based optimizations in the presence of lazy evaluation or reference counting.
  • __metadata in arguments implies at most weak purity.

spec/function.dd Outdated
pure double gun(double);
void hun(double x)
{
auto y = fun(x) + gun(x); // evaluate fun(x) and gun(x) in any order
tgehr (Contributor) commented Apr 18, 2019

What about the case where fun is not nothrow, gun is nothrow, and fun throws an exception exactly when gun does not terminate?

spec/function.dd Outdated
not carry to the thrown object.)

$(P Note: a strongly pure function may still have behavior inconsistent with memoization by e.g.
using `cast`s or by changing behavior depending on the address of its parameters. An implementation
Contributor

I think you defined away memoization for equal immutable arguments with different addresses? (Not that I like it, but this seems to eliminate this concern.)

Member Author

Yes, that was the intent. Here's what I had in mind:

pure string identity(string s) { return s; }
auto s1 = "abc", s2 = s1.idup;
auto s3 = identity(s1), s4 = identity(s2);

If we allow s1 to be considered equivalent to s2, identity will always return s1.

Also: when in doubt, make it weakly pure.

Member Author

I see, you're saying I should eliminate that text. Will do.

Member Author

@tgehr will still keep it because:

pure int* hmmm(int x) { return &x; }

Member Author

@tgehr simplified the text to, well, simplify

tgehr (Contributor) commented Apr 18, 2019

Yes, my comment was purely about the parameter addresses. (By which I assumed you meant the actual address of references/pointers within the arguments, otherwise it should have said address of local variables.)

@andralex
Member Author

I was concerned about conflating strongly pure, which originated as a marketing concept used to justify weak purity, with language semantics. The PR makes more sense with your latest edits. (Or I missed the part restricting reordering before. I have no way to tell as the previous version disappeared after a force push.)

Yeah, I switched to no-amend commits going forward.

In that sense allowing optimizations is a litmus test - if we can't do it we might have taken the wrong turn someplace.

That's fine, as long as it is with an understanding that the set of specified optimizations is a small subset of what is possible, and not a comprehensive list.

Cool!

Also I don't want to define pure so as to lock us away from __metadata, or __metadata to lock us away from good optimizations of pure.
...

Neither do I, but that might make it necessary to specify the complete set of allowed rewrites. (Explicitly or implicitly.)

That would be acceptable. And don't forget - fewer is not a disaster.

Some possible bad outcomes that I really want to avoid:

Allow me to put on my pointy-haired manager hat (or wig) and intersperse the top-level view with these:

  • void returns become optimization blockers.

Very low impact. Pure void functions will be very rare if any, and don't forget we currently do zero optimizations based on purity. Yet nobody holds a gun to our head.

  • weakly pure functions become optimization blockers.

Low impact.

  • __metadata in arguments become optimization blockers, turning off purity-based optimizations in the presence of lazy evaluation or reference counting.

Low impact.

  • __metadata in arguments implies at most weak purity.

Low impact.

What is high impact:

  • Reference counted data structures usable with immutable and in pure functions
  • Lazy initialization of members

These are what matters. I would very happily give away any and all items on your list to get these two. And it would be great if you did, too.

spec/function.dd Outdated
@tgehr
Contributor

tgehr commented Apr 19, 2019

...
Also I don't want to define pure so as to lock us away from __metadata, or __metadata to lock us away from good optimizations of pure.
...

Neither do I, but that might make it necessary to specify the complete set of allowed rewrites. (Explicitly or implicitly.)

That would be acceptable. And don't forget - fewer is not a disaster.
...

The point of the rewrites is to define the meaning of pure and immutable.

Some possible bad outcomes that I really want to avoid:

Allow me to put on my pointy-haired manager hat (or wig) and intersperse the top-level view with these:
...

It was you who said: "Also I don't want to define pure so as to lock us away from __metadata, or __metadata to lock us away from good optimizations of pure."

Pure void functions will be very rare if any, and don't forget we currently do zero optimizations based on purity. Yet nobody holds a gun to our head.
...

That's not the point. Currently, pure and immutable have well-defined meanings from which such optimizations can be derived (note that pure function elision can't even be derived if we require pure functions to preserve non-termination). With __mutable this goes out of the window, so you need an alternative way to specify what they mean. I was aiming for a state where we don't lose what we already have with pure and immutable.

What is high impact:

  • Reference counted data structures usable with immutable and in pure functions

I guess you mean nominally immutable. I don't really see the point of that. It's rare to want to use the same complex data structure as both mutable and immutable, and immutable structs and class references can't be reassigned, limiting their usefulness. Furthermore, it can make sense to store mutable data within a persistent data structure.

  • Lazy initialization of members
    ...

I can do lazy initialization of members today, even within a weakly pure function.

These are what matters.

I thought you wanted to support general manual memory management within pure functions, not just reference counting?

I would very happily give away any and all items on your list to get these two. And it would be great if you did, too.

I'm just not very excited to have merely nominal immutability in the type system, and I am certainly not the only one. It means whenever there is an immutable in the code, I have to worry about whether or not it is a lie. Libraries will require const arguments in the misguided name of "const correctness", arguing that I should just use __mutable fields if I really need to do some mutation in a callback. If I want to avoid the synchronization overhead, then I will need to make sure nobody can create an immutable instance of my type. At least right now, function signatures are honest. I don't really see the point of watering down attributes such that they can be legally slapped even on code that does not behave the way the attribute would suggest, not even logically.

@andralex
Member Author

It was you who said: "Also I don't want to define pure so as to lock us away from __metadata, or __metadata to lock us away from good optimizations of pure."

Yes, but that's not a hard requirement. In fact if we drop strongly pure entirely and stay with weakly pure that's acceptable if there's no other way to get work done on reference counting.

Pure void functions will be very rare if any, and don't forget we currently do zero optimizations based on purity. Yet nobody holds a gun to our head.
...

That's not the point. Currently, pure and immutable have well-defined meanings from which such optimizations can be derived (note that pure function elision can't even be derived if we require pure functions to preserve non-termination). With __mutable this goes out of the window, so you need an alternative way to specify what they mean. I was aiming for a state where we don't lose what we already have with pure and immutable.

Maybe we have too much with pure and immutable, and much of it is low-impact. Like, the ability to do optimizations that have not been done in a decade. What we need is high impact: reference counting without compiler magic.

  • Reference counted data structures usable with immutable and in pure functions

I guess you mean nominally immutable. I don't really see the point of that. It's rare to want to use the same complex data structure as both mutable and immutable, and immutable structs and class references can't be reassigned, limiting their usefulness. Furthermore, it can make sense to store mutable data within a persistent data structure.

This might be the case with complex data structures, but it is not with strings and slices. Currently string and generally T[] work nicely in pure functions and in conjunction with immutable. A simple rephrasing of the challenge is: design a type rcstring and a type Slice!T that do what string and T[] do, just that they manage their own memory instead of relying on the tracing collector.

Can you lead such an effort?

  • Lazy initialization of members
    ...

I can do lazy initialization of members today, even within a weakly pure function.

Not in an immutable or const object. This is the problem, and a very real one.

These are what matters.

I thought you wanted to support general manual memory management within pure functions, not just reference counting?

I'd be happy with just reference counting if that's all that's possible.

I would very happily give away any and all items on your list to get these two. And it would be great if you did, too.

I'm just not very excited to have merely nominal immutability in the type system, and I am certainly not the only one. It means whenever there is an immutable in the code, I have to worry about whether or not it is a lie.

NO it's not a lie! The point is to allow controlled use of metadata that is transparent to the user of the immutable data structure.

Is tracing GC a lie? Because it does make mutable memory into immutable memory (construction) and immutable memory back into modifiable memory (reclamation). It is also easy to circumvent in any number of ways. Yet nobody is spending nights losing sleep over it.

Libraries will require const arguments in the misguided name of "const correctness", arguing that I should just use __mutable fields if I really need to do some mutation in a callback.

That is working just fine for C++. A good success story to learn from.

If I want to avoid the synchronization overhead, then I will need to make sure nobody can create an immutable instance of my type.

I don't understand this.

At least right now, function signatures are honest. I don't really see the point of watering down attributes such that they can be legally slapped even on code that does not behave the way the attribute would suggest, not even logically.

I understand, and I don't want to be there either. But I also reckon that without workable reference counting we're dead in the water.

Sadly we have re-entered an unhelpful pattern.

  1. I try to break the stalemate

  2. I don't know what I am doing, and both you and I know I don't know what I'm doing.

  3. You rightly poke holes in my proposal, pointing out that I'm making something worse and overall that I don't know what I'm doing.

  4. I give up and we're back to where we were. How can I not? You know what you're doing, and your counterarguments are correct.

  5. There is no progress - no reference counting for the next ten years.

How can we get from this pattern to a pattern whereby you actually do steer things to the positive? What steps can we take to get reference counted data structures while at the same time preserving most of the good qualities of pure and immutable? That's the real challenge, not to say it can't be done. My mailman can tell me it can't be done. We need your good skill put to work in the right direction.

@anon17

anon17 commented Apr 19, 2019

You can't elide strongly pure calls down to zero, because they can throw exceptions; the function must be called at least once. So a void function is fine as strongly pure; a use case for it is data validation. A void function that does nothing - that's low impact.
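The elide-to-one-but-not-to-zero point, sketched with hypothetical code:

```d
// A strongly pure function that may throw.
int twice(int x) pure
{
    if (x < 0) throw new Exception("negative input");
    return 2 * x;
}

void user()
{
    // The two calls may be folded into a single evaluation...
    int a = twice(3);
    int b = twice(3);
    // ...but the one remaining call cannot be dropped: if the
    // argument were negative, eliding it would swallow the throw.
}
```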

Maybe you can reorder two pure nothrow calls relative to each other, but nothing else, again because of exceptions.

Lazy initialization doesn't seem to conflict with purity. Reference counting can be done as weakly pure:

void* addRef() pure immutable
{
  _counter++;
  return null;
}

Postblit should be weakly pure, too.

new and malloc are special cases and should be in the list of special cases.

@andralex
Member Author

You can't elide strongly pure calls to zero, because they can throw exceptions, it must be called at least once.

Good insight, thanks. A call can throw errors or abort the program even if nothrow. I can't believe I'm saying this, but that's a relief :o).

@andralex
Member Author

Lazy initialization doesn't seem to conflict with purity. Reference counting can be done as weakly pure:

void* addRef() pure immutable
{
  _counter++;
  return null;
}

You can't update the counter of an immutable structure. (It's actually a pointer to a counter.)

Also postblit should be weakly pure too.

Good news - we don't need to worry about postblit :o).

new and malloc are special cases and should be in the list of special cases.

Yes to new, no to malloc.

@andralex
Member Author

Eliminated the void return test.

@SSoulaimane
Member

@andralex I'd say just do the ref counting. escape the type system. improve later.

@andralex
Member Author

@SSoulaimane we have that already... dlang/druntime#2348

@anon17

anon17 commented Apr 22, 2019

Mutable data can be stored like this:

struct __mutable(T)
{
    private size_t _ref;
    this(T val){ _ref=cast(size_t)val; }
    T unwrap() const
    {
        return cast(T)_ref;
    }
}

struct A
{
    __mutable!(int*) rc;
    this(int) immutable
    {
        rc=new int;
    }
    int inc() immutable
    {
        return ++*rc.unwrap;
    }
}

int main()
{
    immutable A b=0;
    assert(b.inc==1);
    assert(b.inc==2);
    return 0;
}

@andralex
Member Author

@anon17 thanks for the suggestion. It's in line with some of what @edi33416 and myself have tried over time. It has a number of small issues (not safe, the GC may free the memory prematurely) that can be fixed as follows:

@safe:

struct __mutable(T)
{
    private union { T _unused; size_t _ref; }
    this(T val) pure immutable { _ref=cast(size_t)val; }
    T unwrap() const pure @system
    {
        return cast(T)_ref;
    }
}

struct A
{
    __mutable!(int*) rc;
    this(int) immutable pure
    {
        rc=new int;
    }
    int inc() immutable pure @trusted
    {
        return ++*rc.unwrap;
    }
}

int main() pure
{
    immutable A b=0;
    assert(b.inc==1);
    assert(b.inc==2);
    return 0;
}

This code compiles and runs. The larger problem is that A.inc is a strongly pure function (pure, operating on immutable data). However, that's a trick: unbeknownst to the type system, the code ends up changing immutable data. Per the spec (with or without this PR), the call to b.inc can be memoized, which would render the code incorrect.
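Spelled out, a conforming optimizer could rewrite the main above as if b.inc were memoized (a hypothetical transformation, assuming the definition of A given earlier):

```d
int main() pure
{
    immutable A b = 0;
    // b.inc is strongly pure per its signature, so both calls
    // may be replaced by a single memoized evaluation:
    immutable __memo = b.inc;
    assert(__memo == 1); // passes
    assert(__memo == 2); // now fails: the hidden mutation ran only once
    return 0;
}
```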

@anon17

anon17 commented Apr 23, 2019

Weakly pure variant:

struct A
{
    __mutable!(int*) rc;
    this(int) immutable pure
    {
        rc=...; //PrefixAllocator is fine
    }
    private void* addRef() pure const @trusted
    {
        ++*rc.unwrap;
        return null;
    }
    private void* release() pure const @trusted
    {
        int counter=--*rc.unwrap;
        if(counter==0)deallocate();
        return null;
    }
    private void deallocate() pure const
    {
        ...
    }
    ~this() pure const
    {
        release();
    }
}

Also, this way the change is not visible from outside, so the type system can't really tell whether anything changes; it only needs to allow calls to weakly pure functions.

@andralex
Member Author

@anon17 thanks! @edi33416 will take a look at this

}
pure S* f4(int);
---

Maybe have an example for ref return?

{
if (i == n) break;
result += __t;
}
@ntrel ntrel Mar 3, 2023

Given that fun defined above always does return, I think the above code should actually be a valid lowering. So the example is confusing.
