
new, but never delete #2262

Merged
merged 1 commit into from Jun 28, 2013

Conversation

WalterBright
Member

This is good for a 20+% speedup in dmd. The only deletes are in init.c and interpret.c. Don, you might want to look at the latter.

    heapleft -= m_size;
    void *p = heapp;
    heapp = (void *)((char *)heapp + m_size);
Member


no alignment considered?

Member Author


Oops! Good catch.

@andralex
Member

nice!

@ghost

ghost commented Jun 28, 2013

I did a tiny benchmark:

timeit dmd -unittest -main std\algorithm.d

Before: Done in 46.824 secs, used 1,100 MB
After: Done in 31.524 secs, used 954 MB

@CyberShadow
Member

Thoughts:

  • Allocate memory from the OS directly, if possible - the C heap will have bookkeeping information to allow deleting memory, which we don't need. It might not be aligned to the page boundary, even though the allocation size is. We have existing D code for this in the D GC: https://github.com/D-Programming-Language/druntime/blob/master/src/gc/os.d
  • Consecutive allocations of 2049 bytes will waste almost 50% memory. Using different values for the direct-allocation threshold and bulk allocation amount would help.
  • I think there's no harm in allocating bulk memory in bigger pieces (say, 64K). Modern OS features (overcommit) allocate physical RAM when it's written to for the first time.

@ghost

ghost commented Jun 28, 2013

What about using something like jemalloc?

@WalterBright
Member Author

@CyberShadow The malloc overhead for these types of allocations is so small as to not be relevant. I agree that the block size could be bigger.

@AndrejMitrovic Since memory isn't being deleted anyway, why bother with jemalloc? Also, jemalloc requires including a license with the binary. I don't wish to further complicate the licensing of dmd.

@WalterBright
Member Author

Amended to fix alignment, manifest constant, and increase chunk size.

@WalterBright
Member Author

Dang, I keep pressing the wrong button!

@WalterBright
Member Author

@donc Please look at this, as it essentially disables some "delete" operations in interpret.c.

@braddr
Member

braddr commented Jun 28, 2013

Care to place any bets at how quickly we'll come to seriously regret this change?

@ibuclaw
Member

ibuclaw commented Jun 28, 2013

> Care to place any bets at how quickly we'll come to seriously regret this change?

@braddr it does seem to reinforce a joke I made earlier this week. ;)

http://forum.dlang.org/post/mailman.1430.1372151657.13711.digitalmars-d@puremagic.com

@AndrejMitrovic is that more or less allocating done? I can't tell. :)

@WalterBright
Member Author

@braddr I've used similar techniques in many compilers, including DMC++. Your notion that we'll quickly regret it needs explanation.

@WalterBright
Member Author

@ibuclaw your comment that a compiler is essentially a memory allocating program isn't that far off the mark. For every compiler I've profiled, malloc topped the list.

@braddr
Member

braddr commented Jun 28, 2013

The memory usage issues of dmd are far more often complained about than the compilation speed.

@WalterBright
Member Author

@braddr While I understand your point, this actually reduces memory consumption slightly. It's #if'd in, so can be turned on and off easily. rmem.malloc is still there, and is used by OutBuffer because OutBuffer uses rmem.free fairly regularly. I think this is a reasonable change for the moment, and doesn't back us into a corner.

@WalterBright
Member Author

BTW, the cumulative speed up from 2.063 through this one in compiling:

dmd -main -unittest std\algorithm

is 22.5 seconds down to 11.5 seconds, nearly doubling the speed.

@don-clugston-sociomantic
Contributor

Disabling the deletes in interpret.c is no problem, they're only called after an error happened. So actually they only reduce memory consumption in the case where compilation fails anyway! Pretty much useless.

@donc
Collaborator

donc commented Jun 28, 2013

LGTM. I made a pull which deletes the CTFE deletes.
#2268

@ibuclaw
Member

ibuclaw commented Jun 28, 2013

@WalterBright - I thought I might also bring up: should the frontend be making use of mem.malloc and friends rather than calling malloc & co. directly? I've noticed that code using those functions tends to mix and match the two in the same source file.

@donc
Collaborator

donc commented Jun 28, 2013

There is one more delete, in template.c, in the rehashing of template instances. I presume it was recently added.
TemplateDeclaration::addInstance()

Should be next on the list of things to look at.

donc pushed a commit that referenced this pull request Jun 28, 2013
@donc donc merged commit 19a9003 into dlang:master Jun 28, 2013
@ghost

ghost commented Jun 28, 2013

@WalterBright: I think D is getting seriously fast with these recent pulls. Also the recursive build pull (once it's properly implemented) will speed things up even more dramatically.

I wonder if we could even make the claim that D is the fastest language around to compile. I know without benchmarks it's hard to make such claims, but it feels seriously fast compared to C/C++ (and even Pascal for some projects I've tried out).

@mihails-strasuns

Beating C/C++ is easy when it comes to compilation speed, benchmarking vs Pascal and Go can be interesting though.

@ibuclaw
Member

ibuclaw commented Jun 28, 2013

@AndrejMitrovic - you mean DMD? GDC/LDC aren't quite up to speed with their compile-time ratios. But at least the frontend is blazing fast compared to the backend: it takes 8 seconds for the frontend to parse, run semantic analysis, and generate codegen to send to GCC for gdc -funittest std/algorithm.d - and to be even more clear, that is 9455 functions built. ;)

FYI: libphobos build time in GDC:
before: 1m 44.745s
after: 1m 37.588s

Hmm... looks like only 7% from a cursory test. :)

@WalterBright
Member Author

@ibuclaw the reason for mem.malloc rather than malloc is so I can easily try different allocators. The reason I haven't switched mem.malloc to be allocate but never free is because OutBuffer successfully uses mem.free to reduce memory consumption.

@WalterBright
Member Author

Timings for doing a full release build of phobos.lib (which includes running the C compiler to build zlib):

2.063: 9.84
now: 7.1

@WalterBright WalterBright deleted the never-delete branch June 28, 2013 18:58
donc pushed a commit that referenced this pull request Jun 28, 2013
@leandro-lucarella-sociomantic
Contributor

On Fri, Jun 28, 2013 at 11:58:06AM -0700, Walter Bright wrote:

> Timings for doing a full release build of phobos.lib (which includes running the C compiler to build zlib):
>
> 2.063: 9.84
> now: 7.1

What about memory usage? You can use: /usr/bin/time -f '%M\n'
