Mark some GC functions pure and/or nothrow. #198

Merged
merged 1 commit into from Jun 1, 2012

@alexrp
Member
alexrp commented Apr 27, 2012

A few things worth noting:

  • These functions actually can be nothrow, because the only exceptions they can throw are Errors.
  • Yes, marking the functions with these modifiers means future implementations have to follow them. I don't think this is a problem, however, and the current situation (no modifiers at all) is very annoying.
  • I haven't marked the actual functions inside the GC with these modifiers. I didn't feel that cascading them all the way to the GC implementation was necessary; the GC is a low-level component of the runtime and thus deserves as much freedom as it can get. That said, I did verify that none of the functions marked nothrow can throw.
  • I'm not sure if the application of pure is entirely correct. I think it makes sense in terms of weak purity, but please correct me if I'm wrong.
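A minimal sketch of the first bullet, assuming a hypothetical helper named `requirePositive` that is not part of druntime: D's `nothrow` only forbids `Exception`s from escaping, so a function that can throw nothing but an `Error` may still carry the attribute.

```d
// Hypothetical helper (not from druntime): nothrow rules out escaping
// Exceptions, but throwing an Error is still permitted.
void requirePositive(int x) nothrow
{
    if (x <= 0)
        throw new Error("x must be positive"); // an Error, not an Exception
}

unittest
{
    bool caught;
    try
        requirePositive(-1);
    catch (Error e)
        caught = true;
    assert(caught);
}
```

The block can be exercised with `dmd -unittest -main -run file.d`.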
@alexrp
Member
alexrp commented May 11, 2012

Ping?

@alexrp
Member
alexrp commented May 23, 2012

I've been thinking about the application of pure to some of the functions here. While the getAttr, clrAttr, and setAttr functions (and the like) do rely on the GC's global state, I think it could be argued that this state is an implicit parameter to all functions in the core.memory module. Anyone who calls these functions will know that they rely on fetching information from the garbage collector instance. As such, marking those functions pure is fine IMHO, since their purity only depends on two inputs: the GC itself and the memory address.

Besides, there's no good alternative to doing it this way, and this module is currently not very usable in pure/nothrow code. We need to fix that silly situation.

Thoughts?
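For reference, a small sketch of the attribute functions under discussion, using the public `core.memory.GC` wrappers. The block size of 64 bytes is arbitrary; the point is only that the result of `getAttr` depends on the address passed in and on state held by the collector.

```d
import core.memory : GC;

unittest
{
    void* p = GC.malloc(64);

    // Set an attribute bit on the block, then read it back from the GC.
    GC.setAttr(p, GC.BlkAttr.NO_SCAN);
    assert(GC.getAttr(p) & GC.BlkAttr.NO_SCAN);

    // Clearing the bit changes what the same getAttr call reports.
    GC.clrAttr(p, GC.BlkAttr.NO_SCAN);
    assert(!(GC.getAttr(p) & GC.BlkAttr.NO_SCAN));
}
```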

@schveiguy
Member

I think most of these can be marked pure as well.

As long as it's weak purity, it can be marked pure. What we don't want is someone for instance doing:

void *p1 = gc_malloc(10);
void *p2 = gc_malloc(10);

and having the compiler rewrite as:

void *p2 = p1;

Given that most of these return mutable data, they should be weakly pure, right?
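The concern above can be restated as a runnable check. This is a sketch that assumes a druntime in which `GC.malloc` is declared `pure nothrow`; `distinctBlocks` is a hypothetical name used only for illustration.

```d
import core.memory : GC;

// Hypothetical check: because GC.malloc returns a pointer with mutable
// indirection, the two calls must not be folded into one.
bool distinctBlocks() pure nothrow
{
    void* p1 = GC.malloc(10);
    void* p2 = GC.malloc(10);
    return p1 !is p2;
}

unittest
{
    assert(distinctBlocks());
}
```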

@alexrp
Member
alexrp commented May 23, 2012

Makes sense. But this brings up the question: Should free be pure? I find that one tough, since it may seem weakly pure, but it can have all sorts of arbitrary side-effects. For example, a GC implementation might choose to run the finalizer when the data is freed, or whatever.

@schveiguy
Member

I think free can be pure.  It does not call the finalizer (and would be illegal if it did).  It's one of the main reasons we are deprecating delete, deallocation should not be conflated with finalization.



@alexrp
Member
alexrp commented May 23, 2012

Speaking of which, we probably should fix the docs: http://dlang.org/phobos/core_memory.html#free

But I'm honestly not sure what's the preferred construct these days.

Anyway, as long as free is guaranteed to not call arbitrary functions with side-effects, then it's fine I guess.

What about the enable, disable, minimize, and collect functions? I think making the latter two pure would be very, very wrong, since they on the other hand can result in finalizers being enqueued.

@schveiguy
Member

Minimize and collect should be callable IMO.  If we are enabling gc_malloc as pure, gc_malloc may call collect as a side effect.  Really, you have to think of it like this:

a) running a collect cycle or not running one shouldn't alter the logical execution of the program.  So if a call to collect is optimized out, no big deal.
b) the collection cycle runs in its own context, it's almost like it's on its own thread, but it's "borrowing" the current running thread's stack for execution.  It really doesn't affect the current-running thread's data whatsoever (except for things the pure function isn't legally allowed to see, such as global data).

As far as enable/disable, I think they should not be pure, because you don't want to optimize out a call to one of those!  However, my gut tells me we should be able to build a way to call these in pairs inside a pure scope.

So for example, it should be valid to call gc_disable, if you then call gc_enable (or restore to previous enabled state).
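One way the paired disable/enable idea could look in user code is sketched below. `withCollectionsPaused` is a hypothetical helper, not a druntime function, and it is deliberately left without `pure`; the sketch only shows how `scope (exit)` keeps the two calls balanced.

```d
import core.memory : GC;

// Hypothetical helper: pauses collections for the duration of `work`
// and always re-enables them, even if `work` throws.
void withCollectionsPaused(scope void delegate() work)
{
    GC.disable();
    scope (exit) GC.enable();
    work();
}

unittest
{
    int[] data;
    withCollectionsPaused({ data = new int[](1000); });
    assert(data.length == 1000);
}
```

Since `GC.disable` and `GC.enable` nest, the helper restores whatever state was in effect before the call rather than unconditionally turning collections on.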



@alexrp
Member
alexrp commented May 23, 2012

a) running a collect cycle or not running one shouldn't alter the logical execution of the program. So if a call to collect is optimized out, no big deal.

I think this is a big deal: It is very common to force collections in order to reduce memory use and force finalizers to be run ASAP. We need to account for this case, and under no circumstances optimize it out. Also, keep in mind that finalizers can run arbitrary code, so if a collection causes finalizers to be fired off, this operation is by no means pure. Ideally, finalizers should be programmed to not access state outside of the object they belong to, but making this assumption is not pragmatic IMHO (just because finalization order is undefined doesn't mean accessing global state is useless - think a simple scenario like atomically decreasing a global counter).

b) the collection cycle runs in its own context, it's almost like it's on its own thread, but it's "borrowing" the current running thread's stack for execution. It really doesn't affect the current-running thread's data whatsoever (except for things the pure function isn't legally allowed to see, such as global data).

That depends. In order to implement weak references, I've used a very dirty trick where I store a GC pointer into a NO_SCAN area. This means that a collection actually can have side-effects in that it'll render weak references collected. You could argue that this is how weak references are supposed to work, but it's a side-effect no less.

As far as enable/disable, I think they should not be pure, because you don't want to optimize out a call to one of those! However, my gut tells me we should be able to build a way to call these in pairs inside a pure scope.

Yeah, it could probably be useful for certain optimized algorithms. Unfortunately, I can't think of a good way to do this.

@schveiguy
Member

Well, the call itself shouldn't be optimized out (maybe we have to include a void * parameter to ensure this, a weak attribute would be nice here...), I was thinking more in the case that a strong pure function that contains the call could be optimized out.

And you can't stop the GC from running a cycle at any time, it could run on a malloc call, it could run because of another thread (which will pause your pure function).  The runtime has to deal with the possibility that collect is run in the middle of a pure function call.



@schveiguy
Member

Speaking of which, we probably should fix the docs: http://dlang.org/phobos/core_memory.html#free

From that page:

The block will not be finalized regardless of whether the FINALIZE attribute is set.

Seems correct to me...

@alexrp
Member
alexrp commented May 23, 2012

I was getting at it recommending using delete.

@schveiguy
Member
@alexrp
Member
alexrp commented May 23, 2012

Well, anyway, you make a good point: We already allow pure functions to cause GC allocations, so this should be no different. But I have to admit I don't like it... it seems... dirty.

Well, I'll make the changes and rebase.

@schveiguy
Member

Well, don't mark gc_enable and gc_disable pure, because they have no parameters/return value, they will be interpreted as strong-pure and might be optimized out. We need to find a way to make these weak-pure. I'll post a NG message querying on how to do it.

I guess same thing for gc_collect and gc_minimize.

@alexrp
Member
alexrp commented May 23, 2012

Rebased. I'm not sure what to do about the range management functions, though. Do we want those to be pure?

@schveiguy
Member

I'm not sure what to do about the range management functions, though. Do we want those to be pure?

I think these should be pure, but we can hold off for now. Adding ranges and roots is done when you allocate outside of the GC, and I don't think any of those functions (i.e. C's malloc and free) are marked pure yet.

@alexrp
Member
alexrp commented May 23, 2012

It will have to be eventually, otherwise writing a pure allocator interface is going to suck...

@schveiguy
Member

Reviewing your new version:

gc_setAttr, gc_getAttr, gc_clrAttr cannot be marked pure as-is, because they could be interpreted as strong-pure if called with immutable pointer. I'm pretty sure the current compiler does not optimize this, but we don't want to leave a bomb there for when it does optimize it.

It would make sense, actually, for those to be marked without 'in' (since a GC implementation might actually store the bits inside the block), but that would likely break some existing code.

gc_reserve does not have any mutable reference parameters or return values, so it will be interpreted as strong pure. It cannot be marked pure without a solution as we are discussing in the NG, but I think this is no big loss for now.

Continuing review...

@schveiguy
Member

OK, so other than those, I think it looks good.

@alexrp
Member
alexrp commented May 23, 2012

It would make sense, actually, for those to be marked without 'in' (since a GC implementation might actually store the bits inside the block), but that would likely break some existing code.

Then we should do that breakage sooner rather than later. I agree that these pointers really should not be marked with 'in' at all.

gc_reserve does not have any mutable reference parameters or return values, so it will be interpreted as strong pure. It cannot be marked pure without a solution as we are discussing in the NG, but I think this is no big loss for now.

I think for both gc_reserve and the attribute functions, we can just mark them pure for now and add a TODO comment as a reminder that it should be fixed when we get proper weak purity (or whatever we want to call it).

@alexrp
Member
alexrp commented May 25, 2012

Is this good to go?

@schveiguy
Member

gc_reserve will be marked strong pure today. I think we can't make it pure, or it will not work properly.

The attribute functions, I think, should not be marked pure. Instead, add a TODO at the top of the function list: these should be pure when forced weak purity is possible.

@alexrp
Member
alexrp commented May 25, 2012

Why can't the attribute ones be pure? We just make the arguments not 'in' (as they really should be) and they will be weakly pure by definition.

@schveiguy
Member

Because it would break code. The bar has to be set very high to break existing code, and there just isn't enough benefit, seeing as how the adverse behavior won't happen without an improved optimizer in the compiler.

We have two choices:

  1. mark them as pure and leave in specified on the void*. This should work today, as I don't think the compiler will optimize out any calls based on this. Then "remember" to fix it if the compiler does start optimizing.
  2. Do not mark them as pure, and change it properly when it is possible to mark something as explicitly weak pure.

My vote is for option 2, because the chances we remember some time in the future to come back and fix it for option 1 is pretty slim.

What is the fallout if we use option 2? That is, if we can't mark those functions pure, what other functions now cannot be marked pure?

@alexrp
Member
alexrp commented May 25, 2012

There is the fallout that writing pure allocators against the GC interface would basically be impossible (in particular, I suspect that setting NO_SCAN will be very common).

@schveiguy
Member

There is the fallout that writing pure allocators against the GC interface would basically be impossible (in particular, I suspect that setting NO_SCAN will be very common).

Yes, that makes sense.  I don't know that it passes the bar for breaking code, though.  I think also, before this patch is incorporated, we need other opinions.  I think based on the NG thread I started about forced weak purity, not everyone agrees with these changes.

I thought of another possible solution:

Create setAttrPure which does exactly the same thing, but just doesn't mark the parameter as in (and of course, is marked pure).  When we can force weak purity, we can replace it with an alias.

@alexrp
Member
alexrp commented May 25, 2012

I really wish more people would weigh in here, but they seem busier arguing about the language's purity design than giving input here. ;)

Create setAttrPure which does exactly the same thing, but just doesn't mark the parameter as in (and of course, is marked pure). When we can force weak purity, we can replace it with an alias.

Can't we simply overload it? Keep in mind, it's a D linkage function, not C. So, we overload it and deprecate the old overload?

@schveiguy
Member

Can't we simply overload it? Keep in mind, it's a D linkage function, not C. So, we overload it and deprecate the old overload?

I'm seeing this:

extern (C) uint gc_getAttr( in void* p );

This needs to be changed.

Yes, you can overload the GC.getAttr version.  But don't deprecate the old overload. 

Here is the timeline:

  1. We have gc_getAttrPure(void *p) pure and gc_getAttr(in void *p), neither is marked as deprecated.  Inside the wrapper GC struct, we have getAttr(void *) pure and getAttr(in void *), neither marked deprecated
  2. Once forced weak purity is possible, we now have gc_getAttr(in void *) weakpure and alias gc_getAttr gc_getAttrPure.  Inside the wrapper GC struct, we have getAttr(in void *) weakpure

And that's it.

@alexrp
Member
alexrp commented May 25, 2012

gc_getAttr is a completely internal function. We just change it to accept void*. Then we make an extra set of overloads inside the GC struct, and, in the ones that take an in void*, we just use a cast hack (this isn't problematic because we're dealing with const, not immutable).

@schveiguy
Member

gc_getAttr is a completely internal function. We just change it to accept void*. Then we make an extra set of overloads inside the GC struct, and, in the ones that take an in void*, we just use a cast hack (this isn't problematic because we're dealing with const, not immutable).

if you are dealing with const you are dealing with possibly immutable.

But you are right, it's private.  So just add gc_getAttrPure and when we can properly mark things as weak, get rid of it, don't deprecate it.  If anyone is using undocumented internal functions from the GC, their code deserves to be broken :)

@klickverbot
Member

@schveiguy: Cue Walter complaining that his private DLL tests don't work any longer… ;)

@alexrp
Member
alexrp commented May 25, 2012

if you are dealing with const you are dealing with possibly immutable.

I posted a long rant on the NG on why this really doesn't matter in real world compilers: http://forum.dlang.org/thread/jooo4k$shb$2@digitalmars.com?page=4#post-joq5hk:24bit:241:40digitalmars.com

But we can make pure versions if you prefer.

@schveiguy
Member

Cue Walter complaining that his private DLL tests don't work any longer… ;)

ooooh good point! @alexrp any changes we make to the private functions have to be made to the gcstub directory as well.

@schveiguy
Member

if you are dealing with const you are dealing with possibly immutable.

I posted a long rant on the NG on why this really doesn't matter in real world compilers

No, that's not what I'm talking about. But in any case, I think actually, we are easily able to cast away const for the prototype because the actual function is const. So here's now what I think we can do:

gc_getAttr(in void *) -> gc_getAttr(void *) pure
GC.getAttr(in void *) -> leave as is (but insert a cast to void * for passing to gc_getAttr)
add GC.getAttr(void *) pure

Then when weakpure is markable, fix things back, we won't need to deprecate anything.
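The overload plan can be sketched in isolation as follows. All names here (`attrHook`, `Memory`, `queryFromPure`) are hypothetical stand-ins rather than the actual druntime declarations, and the hook body is a dummy; the sketch only illustrates how a mutable-parameter `pure` overload can sit next to the legacy `in void*` one.

```d
// Hypothetical stand-in for the runtime's internal hook.
uint attrHook(void* p) pure nothrow
{
    return p is null ? 0 : 1;
}

// Hypothetical wrapper mirroring the overload pair described above.
struct Memory
{
    // Legacy signature: still accepts const pointers, not marked pure.
    static uint getAttr(in void* p) nothrow
    {
        return attrHook(cast(void*) p); // cast away const for the hook
    }

    // New overload: mutable parameter, so it is only weakly pure.
    static uint getAttr(void* p) pure nothrow
    {
        return attrHook(p);
    }
}

// Compiles only because the mutable argument selects the pure overload.
uint queryFromPure(void* p) pure nothrow
{
    return Memory.getAttr(p);
}

unittest
{
    int x;
    void* m = &x;
    const(void)* c = &x;
    assert(queryFromPure(m) == 1);
    assert(Memory.getAttr(c) == 1); // const argument uses the `in` overload
}
```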

@alexrp
Member
alexrp commented Jun 1, 2012

@schveiguy I've rebased with the changes we discussed, and have added FIXME notes that we should attend to once weak purity can be properly marked. Can you please review?

@schveiguy
Member

addrOf can be pure outright, it returns a mutable void * (though this might not be the best idea, maybe it should be fixed to return const(void)* )

Don't forget to update gcstub!

@alexrp
Member
alexrp commented Jun 1, 2012

addrOf can be pure outright, it returns a mutable void * (though this might not be the best idea, maybe it should be fixed to return const(void)* )

Too restrictive IMO; addrOf is often used to get from an interior pointer in an array or structure to the base, so forcing it to be const would be unreasonable.

Don't forget to update gcstub!

What is there to update? The functions I modified have C linkage, so all these attribute and storage class changes shouldn't affect linking (I think... doesn't on Linux, anyway).

Note also that I didn't change the GC implementation either.

@schveiguy
Member

addrOf can be pure outright, it returns a mutable void * (though this might not be the best idea, maybe it should be fixed to return const(void)* )

Too restrictive IMO; addrOf is often used to get from an interior pointer in an array or structure to the base, so forcing it to be const would be unreasonable.

Should be inout then.

Don't forget to update gcstub!

What is there to update? The functions I modified have C linkage, so all these attribute and storage class changes shouldn't affect linking (I think...).

No idea, all I know is that Walter goes on a rampage whenever it doesn't get updated ;)  Maybe that's only when functions are added/removed...

@alexrp
Member
alexrp commented Jun 1, 2012

Should be inout then.

Can you clarify how it should be declared? I'm not too familiar with inout yet, to be honest.

No idea, all I know is that Walter goes on a rampage whenever it doesn't get updated ;) Maybe that's only when functions are added/removed...

I think so. For C linkage functions, this stuff really shouldn't matter, since all the D 'annotations' are removed from the mangling.

@schveiguy
Member

Can you clarify how it should be declared? I'm not too familiar with inout yet, to be honest.

inout(void)* addrOf(inout(void)* p)

I think so. For C linkage functions, this stuff really shouldn't matter, since all the D 'annotations' are removed from the mangling.

Just checked, yes you need to update gcstub, because it duplicates the GC struct, which is D linkage.
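To illustrate what the `inout` declaration buys, here is a sketch with a hypothetical function `baseOf` that has the same shape as the proposed `addrOf` signature but simply returns its argument. The qualifier on the argument carries through to the result, so mutable callers get a mutable pointer back while const and immutable callers keep their qualifier.

```d
// Hypothetical function with the same signature shape as the proposed
// addrOf; the body is a placeholder that returns the pointer unchanged.
inout(void)* baseOf(inout(void)* p) pure nothrow
{
    return p;
}

// The result type follows the argument's qualifier.
static assert(is(typeof(baseOf(cast(void*) null)) == void*));
static assert(is(typeof(baseOf(cast(const(void)*) null)) == const(void)*));
static assert(is(typeof(baseOf(cast(immutable(void)*) null)) == immutable(void)*));

unittest
{
    int x;
    void* p = &x;
    assert(baseOf(p) is p);
}
```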

@alexrp
Member
alexrp commented Jun 1, 2012

Just checked, yes you need to update gcstub, because it duplicates the GC struct, which is D linkage.

I don't see it... only occurrence of struct GC is in core.memory.

@schveiguy
Member

Yes, you are right, I am wrong. Sorry.

@alexrp
Member
alexrp commented Jun 1, 2012

Well, just pushed the inout change. All good?

@schveiguy
Member

Almost right. We still need the overloads, because inout is like const, so could potentially be treated as optimizable on a pure function if called with immutable arguments.

Sorry :( should have mentioned that when I said it could be pure outright!

@alexrp
Member
alexrp commented Jun 1, 2012

Done.

@schveiguy
Member

Still not right.

Should be like this:

static inout(void)* addrOf( inout(void)* p ) /*pure*/ nothrow
{
   return cast(inout(void)*)addrOf(cast()p);
}
@alexrp
Member
alexrp commented Jun 1, 2012

OK, how about this time? :)

@schveiguy
Member

OK, looks good! Let's wait and see how the pull tester does with it.

@schveiguy
Member

Indeed, it looks good to merge.

@schveiguy schveiguy merged commit ef8b2a6 into dlang:master Jun 1, 2012
@denis-sh
denis-sh commented Jun 2, 2012

WTF?! Sorry, but this breaks everything I know about pure functions in D. They are compiler checkable, not user checkable. E.g. strlen can't be pure because its argument is a pointer, not an array (or we have to consider that if a pure function accepts a pointer the function result depends on all process memory).

All gc_* functions depend on global GC state and can't be pure unless this state is passed as an argument (such a change will break binary compatibility, but we can also leave the old functions and add new ones).

Do you understand that you don't understand what the argument value and the returned value of a pure function are when we are dealing with pointers? I just filed Issue 8185 - Pure functions and pointers.

Revert it as soon as possible, please!

@klickverbot
Member

@denis-sh: You might want to get your facts straight first – pure functions in D can take/return pointers just fine. See the spec, and I recently wrote an article about it. As for the GC state, managing garbage collected memory is allowed in pure functions (making it possible to e.g. use new). This change gives some low-level functions the same treatment.

@denis-sh
denis-sh commented Jun 2, 2012

@klickverbot:

You might want to get your facts straight first – pure functions in D can take/return pointers just fine. See the spec, and I recently wrote an article about it.

OK, I'm human and can make a mistake. But I still don't see this. If my mistake is obvious, just quote me some sentences from the spec, etc.

As for the GC state, managing garbage collected memory is allowed in pure functions (making it possible to e.g. use new).

According to specs new is clearly stated as an exception. An arbitrary C function isn't stated as an exception (even a C function from a specific module).

@klickverbot, have you seen my Issue 8185 - Pure functions and pointers?

@alexrp
Member
alexrp commented Jun 2, 2012

@denis-sh As discussed above in this pull request, we decided to do what we did here because the GC is a special case. Yes, in an ideal world, all purity would be compiler-checkable, but then pure would be next to useless. You would be unable to write a pure generic allocator using the GC as allocation mechanism for example. These changes are made because we've taken a pragmatic stance on the matter. Even Haskell, for example, has to call low-level functions in its runtime that aren't quite pure.

You are correct in saying that in reality, the GC functions operate on global state. But conceptually they operate on the GC class instance inside the gc.gcx module, and that can be considered a hidden parameter. Yes, it could be made an actual parameter, but there's no point. Further, there's no point in trying to actually enforce pure and nothrow inside the GC implementation; in fact, it would be impossible due to some of the functions it uses.

The GC (or more generally, druntime) is a very low-level aspect of the language. What we've done here is a faith-based approach where we assume people who hack on the GC know what they're doing and don't break the guarantees made in the core.memory module.

@denis-sh
denis-sh commented Jun 2, 2012

@alexrp I'm not against any hacking with hidden global state in pure functions. My point is that a compiler can do some unexpected things if a function is pure, because it assumes that pure functions produce the same results for the same arguments and don't read/write global state. And I don't see a definition of same arguments/same results for pointers anywhere (see Comment 3 on Issue 8185).

But at least the following example has to be considered:

Example 1

size_t res = gc_reserve(100);
  • Expectations: Reserve some memory.
  • Reality: Will (may) not be actually called if gc_reserve has ever been called with this argument.
  • Note: This even violates the gc_reserve documentation, because it can return any amount of memory for the same argument.

Example 2

if(!gc_reserve(400 * MiB))
    return false; // This is assumed to be a common case
  • Expectations: We will tell the user to stop allocating such big buffers.
  • Reality: Same as above.

Example 3

gc_reserve(100);
  • Expectations: Reserve some memory.
  • Reality: Will (may) produce a compilation error/warning and never ever be called.

Example 4

immutable(void)*[] pArr;
for(; ;) {
    immutable p = gc_malloc(100 * MiB);
    if(!p) break;
    pArr ~= p;
    size_t random = 0;
    foreach(b; cast(immutable(ubyte)[])p[0 .. 100 * MiB])
        random += b;
    writeln("random: ", random);
}
  • Expectations: Prints memory hashes until break.
  • Reality: Will (may) call gc_malloc once so the random will always be the same and it will be an infinite loop.

P.S.

I hope this post is constructive enough.

@alexrp
Member
alexrp commented Jun 2, 2012

Example 1
Example 2
Example 3

You are right. gc_reserve should only be nothrow until we can mark weak purity. @schveiguy Do you agree?

Example 4

Wrong. gc_malloc returns a pointer with mutable indirection, so it is weakly pure. The compiler is not allowed to optimize it. It may only use the purity information for the type system.

@denis-sh
denis-sh commented Jun 2, 2012

I forgot to mention that Example 2 leads to an OOM error, because one can think the system just gave him lots of memory and then try to use it with gc_malloc.

@denis-sh
denis-sh commented Jun 2, 2012

Example 4

Wrong.

Have you noticed there is no cast(immutable)? You probably have to look at @klickverbot's Purity in D article "pure and immutable – again?" section

@schveiguy
Member

Alex is right and this is why we did not mark gc_reserve pure yet.

@alexrp
Member
alexrp commented Jun 2, 2012

I actually did (if you look at the diff) - either of us must have missed fixing/reviewing that at some point. Anyway, I'll make a pull request to revert it to just nothrow.

@alexrp
Member
alexrp commented Jun 2, 2012

See #237.

@klickverbot
Member

@denis-sh: Sorry if my earlier post sounded like a personal attack, it certainly wasn't intended that way. I thought you were implying that pure functions taking pointers are a problem in general, and Alex/Steven were stupid for not seeing that – even though the issues were discussed at length before the request was merged. ;)

@schveiguy
Member

Have you noticed there is no cast(immutable)? You probably have to look at @klickverbot's Purity in D article, "pure and immutable – again?" section (http://klickverbot.at/blog/2012/05/purity-in-d/#_and___again)

That doesn't matter. The fact that it returns mutable makes it weak pure (the optimizer cannot remove any calls to gc_malloc).

That section of the article confirms what we are saying.

@denis-sh
denis-sh commented Jun 2, 2012

That doesn't matter. The fact that it returns mutable makes it weak pure (the optimizer cannot remove any calls to gc_malloc)

Then I have no idea how pure works. Why this

immutable int[] arr1 = pureFunction(1);
...
immutable int[] arr2 = pureFunction(1);

can't be optimized to this

immutable int[] arr1 = pureFunction(1);
...
immutable int[] arr2 = arr1;

? The only answer I see is that compiler knows that pureFunction can depend on some global state but I've never heard about such "feature" of pure functions and it looks inconsistent.

@schveiguy
Member

It's a new concept D has embraced. Since such a function that takes or returns mutable data cannot be pure optimized, it normally cannot be pure. But, if it doesn't access global variables, or shared data, then it still can't be optimized, but it can be called by an optimizable pure function!

So a pure function that created an immutable(int)[] with the numbers 1 to n could be pure and call malloc with no issues. The call to malloc could not be optimized, but the call to the number generating function could.

Of course, malloc DOES modify global shared state, but in an independent and thread safe way. So we must force malloc to be weakpure.

In fact , it might be reasonable to detect the implicit immutable cast, and optimize out calls to malloc, since you cannot change the data returned anyway. So the example might not work like you expect, but that's because your expectations are wrong.
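The weak/strong split described in this comment can be sketched with ordinary D code. The function names `fillAscending` and `oneToN` are hypothetical; the sketch uses `new` rather than a direct malloc call, and relies on the language rule that the result of a pure function whose parameters have no mutable indirections may be implicitly converted to immutable.

```d
// Weakly pure: writes through a mutable parameter, so calls to it
// cannot be elided, but it touches no global state.
void fillAscending(int[] buf) pure nothrow
{
    foreach (i, ref x; buf)
        x = cast(int) i + 1;
}

// Takes only a value parameter; it may allocate and call the weakly
// pure helper while remaining pure itself.
int[] oneToN(size_t n) pure nothrow
{
    auto buf = new int[](n);
    fillAscending(buf);
    return buf;
}

unittest
{
    // The freshly built array converts implicitly to immutable.
    immutable(int)[] numbers = oneToN(5);
    assert(numbers == [1, 2, 3, 4, 5]);
}
```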

@denis-sh
denis-sh commented Jun 2, 2012

It's a new concept D has embraced. ....

I completely agree and thank you for doing this. My point is that some functions you call pure aren't logically pure.

In fact , it might be reasonable to detect the implicit immutable cast, and optimize out calls to malloc, since you cannot change the data returned anyway. So the example might not work like you expect, but that's because your expectations are wrong.

Looks like it contradicts your previous post. Does it? Doesn't matter. Anyway, I'm still very confused by pure, and I request participation in Issue 8185 - Pure functions and pointers to reach an understanding and then convert the conclusion into docs and/or @klickverbot's article.

@schveiguy
Member

In fact , it might be reasonable to detect the implicit immutable cast, and optimize out calls to malloc, since you cannot change the data returned anyway. So the example might not work like you expect, but that's because your expectations are wrong.

Looks like it contradicts your previous post. Does it?

Yes and no.  It is possible to optimize using that, but the current code does not do it.  I hadn't considered that optimization possibility, but I think it's still valid, even given the weak pure status of malloc.
