Contributor

def- commented May 14, 2018

No description provided.

Contributor

PMunch commented May 14, 2018

This makes quite a big difference in speed: testing locally, I went from 5.84 to 2.14 by using -d:release rather than --opt:speed.

ghost commented May 14, 2018

Yes, this is the official method of compiling release binaries for Nim.

ghost commented May 14, 2018

@frol can you merge this and re-run benchmark? Nim will be much faster with that :)

frol merged commit 5fba67c into frol:master May 14, 2018

Owner

frol commented May 14, 2018

I am so sorry for that! I will update the numbers ASAP!

Contributor

PMunch commented May 14, 2018

You might want to run with --gc:markAndSweep as well, as it is quite a bit faster.

Owner

frol commented May 14, 2018

I will give it a try!

ghost commented May 14, 2018

@frol you can also use Clang with Nim:
nim c -d:release --cc:clang file.nim
You could add both the GCC and Clang results (they would probably differ).

frol added a commit that referenced this pull request May 14, 2018

Owner

frol commented May 14, 2018

@Yardanico clang makes it slower:

| Options | Time, seconds | Memory, MB |
| --- | --- | --- |
| --opt:speed | 4.0 | 0.5 |
| -d:release | 0.95 | 0.5 |
| -d:release --gc:markAndSweep | 0.55 | 5 |
| -d:release --gc:markAndSweep --cc:clang | 0.6 | 5 |

Notice the memory usage. Is that expected?

I have updated the results!

Contributor

narimiran commented May 14, 2018

@frol can you keep both versions in the tables? One with -d:release (for low memory) and the other with -d:release --gc:markAndSweep (for maximum speed).

EDIT: Also, in the compiler column, it would be better to state: Nim 0.18 / GCC 8.1.0

Contributor

PMunch commented May 14, 2018

> Notice the memory usage. Is that expected?

The different GCs use different amounts of memory. I don't have the memory checking utility installed, but it would be interesting to run the benchmark with each of the GCs and compare the speed/memory trade-off between them.
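As a rough in-process alternative to an external memory checker, a program can report its own CPU time and allocator statistics. The sketch below is only an illustration: `workload` is a hypothetical stand-in for the benchmark, and `getOccupiedMem`/`getTotalMem` report what Nim's allocator holds, which is not the same as the peak resident memory an external tool would measure.

```nim
import times

proc workload(n: int): int =
  # Hypothetical stand-in for the benchmark: allocates many short-lived seqs.
  for i in 0 ..< n:
    let s = newSeq[int](16)
    result += s.len

when isMainModule:
  let start = cpuTime()
  let checksum = workload(100_000)
  let elapsed = cpuTime() - start
  echo "checksum: ", checksum
  echo "CPU time, s: ", elapsed
  # Bytes currently in use by live allocations.
  echo "occupied, bytes: ", getOccupiedMem()
  # Bytes the allocator has obtained from the OS so far.
  echo "total from OS, bytes: ", getTotalMem()
```

Compiling the same file once per GC option and comparing the printed numbers would give a first approximation of the trade-off discussed here.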

Owner

frol commented May 14, 2018

@narimiran Done!

Contributor

PMunch commented May 14, 2018

Not sure why you are seeing such high memory usage. I installed cgmemtime and these are my results from all the different GCs: https://i.imgur.com/1Xz3Elo.png

EDIT: Compiled with -d:release and averaged over three runs
EDIT2: My bad, forgot I had done some minor changes: http://ix.io/1aiF/Nim
EDIT3: Created a PR with my changes, along with some style fixes: #6

Owner

frol commented May 14, 2018

It seems that object vs ref object is a big deal: memory usage is 376 KB now!

-d:release --gc:markAndSweep now has the lowest time and memory footprint! Should I drop the line representing the results for -d:release without --gc:markAndSweep from the main README?

Contributor

PMunch commented May 14, 2018

Well, markAndSweep isn't the default GC, so it makes sense to show both. But if you feel that it adds unnecessary clutter, I wouldn't be opposed to just showing markAndSweep. You could also try the other GCs; as you can see from my graph, the regions GC is potentially even better.

Contributor

PMunch commented May 14, 2018

It makes sense that using objects vs. ref objects makes a bit of a difference. An object is more or less the same as a struct and will be allocated on the stack. This means that as soon as the calling proc returns the memory is freed, but it might end up costing you some memory copying if you're not careful. A ref object, on the other hand, is again like a struct, but this time allocated on the heap. It lives on until the garbage collector figures out that it is no longer in use and frees it.

What you are seeing here is that the object I changed from ref to a simple object is only used to pass multiple return values (we could also have used a tuple for this, or an array of static length 3). This means that all these objects are no longer floating around on the heap waiting for garbage collection. If we turn off the garbage collector completely with --gc:none, we can see that our memory consumption skyrockets to (on my machine) 41 MiB, as none of these are ever freed. By simply changing them to use object instead of ref object, still compiling with --gc:none, the memory consumption drops to 504 KiB.

So it's not the garbage collector using more memory; it's just that we allocate a lot of memory on the heap by using ref object, which then has to be GC'ed away. To learn more about how objects and references work in Nim you can read my post on it here.

EDIT: Tried to change it to a tuple and an array. Some speedup going to a tuple, not much going from tuple to an array.
EDIT2: Created a PR for using tuples instead of an object for the return type as it turns out to be even slightly faster
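The three ways of returning several values mentioned above can be sketched side by side. This is a minimal illustration, not the benchmark's actual code: the type names, field names, and procs are all hypothetical.

```nim
# All names here are hypothetical; this is not the benchmark's code.
type
  # Heap-allocated and traced by the garbage collector.
  SplitRef = ref object
    lower, equal, greater: int
  # Plain value type, comparable to a struct.
  SplitObj = object
    lower, equal, greater: int
  # Named tuple, also a plain value type.
  SplitTuple = tuple[lower, equal, greater: int]

proc splitRef(x: int): SplitRef =
  # Every call allocates a new object on the heap.
  SplitRef(lower: x - 1, equal: x, greater: x + 1)

proc splitObj(x: int): SplitObj =
  # Returned by value; no heap allocation is needed.
  SplitObj(lower: x - 1, equal: x, greater: x + 1)

proc splitTuple(x: int): SplitTuple =
  (lower: x - 1, equal: x, greater: x + 1)

when isMainModule:
  var total = 0
  for i in 0 ..< 1_000_000:
    total += splitRef(i).equal
    total += splitObj(i).equal
    # Tuples can be unpacked directly at the call site.
    let (lo, _, hi) = splitTuple(i)
    total += hi - lo
  echo total
```

Only the `splitRef` variant leaves work for the garbage collector, which is the effect described in the comment above.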

Owner

frol commented May 14, 2018

We have just uncovered another unfortunate bug in the Nim benchmark implementation. Testing the resulting code again, the memory footprint jumped to 5 MB using the markAndSweep GC. Could you please confirm that this is the case?

Another question I couldn't find an answer to: how do I enable LTO for Nim? We are going to enable it for all the languages.

ghost commented May 14, 2018

@frol you can pass arguments to the C compiler with --passC and to the linker with --passL (see the docs for how to use them).
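One way this might look for LTO, as an untested sketch: Nim also offers `passC` and `passL` pragmas that forward options from inside the source file. The `-flto` flag is an assumption here (it is what GCC and Clang accept for link-time optimization; the thread does not name the flags), and `fib` is just a hypothetical placeholder workload.

```nim
# Assumption: the C backend is GCC or Clang, both of which accept -flto.
{.passC: "-flto".}  # forwarded to the C compiler
{.passL: "-flto".}  # forwarded to the linker

proc fib(n: int): int =
  # Placeholder workload so the file compiles and runs on its own.
  if n < 2: n else: fib(n - 1) + fib(n - 2)

when isMainModule:
  echo fib(20)
```

The command-line equivalent would presumably be `nim c -d:release --passC:-flto --passL:-flto file.nim`; whether LTO helps this particular benchmark would need to be measured.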

Contributor

PMunch commented May 14, 2018

@frol, running it on my machine I only see a jump to 768 KiB. I'll investigate further and see if I can get it back down.

ghost commented May 14, 2018

@frol Can you also try with Nim devel? If you don't want to compile it yourself, you can use choosenim (it can handle different Nim versions and compile devel automatically).

Contributor

PMunch commented May 14, 2018

Ah yes, it turns out that I do in fact get 5 MiB when using Nim 0.18.0; using the devel version drops it to 768 KiB.

PMunch pushed a commit to PMunch/completely-unscientific-benchmarks that referenced this pull request May 15, 2018