New View: Compiletime/size Test #137

Closed
crtrott opened this issue Nov 23, 2015 · 4 comments

crtrott commented Nov 23, 2015

Trilinos: Tpetra with ETI
New test for high-dimensionality views: does Irina have tensors in Intrepid?

crtrott added this to the New Views MUST WORK NOW milestone Nov 23, 2015

crtrott commented Dec 3, 2015

I am starting to record compile times here for Trilinos on my machine. The basic configure script is attached. For the experimental view, the -DKOKKOS_USING_EXPERIMENTAL_VIEW flag will be set; otherwise only the compilers will be changed. I will go back and edit this post. I am using make -j 24 to build on an otherwise empty machine (dual Haswell with a total of 24 cores, 128GB RAM, PCIe SSDs; compilers and TPLs are mounted via the network from SEMS; hard disks are encrypted).

Updated the data for commit 1d0b6591cf9471d0b8711b8ca01335240526afc1 of Trilinos:

| Compiler | TimeOld | TimeNew | TpetraLibOld | TpetraLibNew | KernelsLibOld | KernelsLibNew |
| --- | --- | --- | --- | --- | --- | --- |
| gcc/4.7.2 | 11:54 | 11:42 | 84 | 97 | 16 | 21 |
| gcc/4.9.2 | 11:39 | 11:38 | 64 | 77 | 8.5 | 15 |
| gcc/5.1.0 | 12:22 | 12:10 | 65 | 66 | 8.7 | 9.5 |
| intel/15.0.2/gcc/4.9.2 | 23:58 | 23:13 | 146 | 139 | 32 | 31 |
| clang/3.6.1/gcc/4.9.2 | | | | | | |
| cuda/7.5.18/gcc/4.9.2 | 36:40* | 35:20* | 182 | 46 | 202 | 57 |

Note: the CUDA build for this configuration had some errors, for example in Seacas (which doesn't even use Kokkos).

Update for commit 872a11a5c30f31c41ea1da86ad035239b1788ce8:

| Compiler | TpetraLibOld | TpetraLibNew | KernelsLibOld | KernelsLibNew |
| --- | --- | --- | --- | --- |
| gcc/4.7.2 | 84 | 97 | 16 | 21 |
| gcc/4.9.2 | 78 | 73 | 8.5 | 14 |
| gcc/5.1.0 | 65 | 66 | 8.7 | 9.5 |
| intel/15.0.2/gcc/4.9.2 | 146 | 136 | 32 | 29 |

do-configure.txt


crtrott commented Dec 9, 2015

Here is LAMMPS data. This was run on a dual SandyBridge with 2x8 cores.
I did: make yes-manybody; make yes-kspace; make yes-user-reaxc; make yes-user-cg-cmm; make yes-molecule; make yes-kokkos; make -j 16

This will generate 464 object files.

| Compiler | TimeOld/TimeNew | SizeOld/SizeNew |
| --- | --- | --- |
| GCC/4.8.4 | 2:39/2:32 | 98/95 |
| GCC/4.9.2 | 3:02/3:10 | 96/92 |
| GCC/5.1.0 | fail | fail |
| Intel/15.0.2/GCC/4.9.2 | 3:49/3:47 | 127/129 |
| Clang/3.6.1 | fail | fail |
| Cuda/7.5.18/GCC/4.9.2 | 11:05/11:14 | 190/177 |

After identifying the NFS-mounted compilers as a bottleneck, here is new data with the compilers copied to a local disk. I also switched to my Haswell machine, which was about 10% faster than the Sandy Bridge one when using the NFS-mounted compilers (for example, GCC/4.9.2 with -j48 took 2:40). These builds used -j48; also note that a good 10-20 seconds are spent just on linking.

| Compiler | TimeOld/TimeNew | SizeOld/SizeNew |
| --- | --- | --- |
| GCC/4.8.4 | 0:49/0:49 | 98/95 |
| GCC/4.9.2 | 0:53/0:52 | 96/92 |
| GCC/5.1.0 | fail | fail |
| Intel/15.0.2/GCC/4.9.2 | 1:30/1:36 | 127/129 |
| Clang/3.6.1 | fail | fail |
| Cuda/7.5.18/GCC/4.9.2 | xx/xx | 190/177 |
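
To make the old-versus-new comparison easier to read, here is a minimal sketch that turns the local-disk LAMMPS build times from the table above into relative changes. It assumes the times are in minutes:seconds, which the thread does not state explicitly; the helper names `to_seconds` and `relative_change` are made up for this illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '2:39' to seconds.

    Assumption: the times in the thread are minutes:seconds.
    """
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def relative_change(old: str, new: str) -> float:
    """Percent change from old to new build time; negative means faster."""
    old_s, new_s = to_seconds(old), to_seconds(new)
    return 100.0 * (new_s - old_s) / old_s


# LAMMPS build times (old view, new view) from the local-disk table.
# Rows reported as "fail" or "xx" are left out.
lammps_local = {
    "GCC/4.8.4": ("0:49", "0:49"),
    "GCC/4.9.2": ("0:53", "0:52"),
    "Intel/15.0.2/GCC/4.9.2": ("1:30", "1:36"),
}

changes = {c: relative_change(old, new) for c, (old, new) in lammps_local.items()}

for compiler, pct in changes.items():
    print(f"{compiler}: {pct:+.1f}%")
```

With builds this short, a difference of a few seconds is a large percentage, so these numbers should be read together with the note about linking time.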


crtrott commented Dec 10, 2015

To summarize, there is no strong argument against the new view, with the possible exception of the Tpetra kernels library for GCC 4.7.2 and GCC 4.9.2, where the library size increased substantially. On the other hand, the binary size of LAMMPS actually decreased for most compilers.

Compile times were not impacted.
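
As a rough check on the library-size point, the following sketch computes the percent growth of the Tpetra kernels library from the KernelsLibOld/KernelsLibNew columns reported for Trilinos commit 1d0b6591cf9471d0b8711b8ca01335240526afc1. The thread does not state the units of the sizes, so only relative changes are computed; `percent_growth` is a hypothetical helper name.

```python
def percent_growth(old: float, new: float) -> float:
    """Percent change in size from old view to new view; positive means larger."""
    return 100.0 * (new - old) / old


# (KernelsLibOld, KernelsLibNew) as reported in the thread.
# Units are not stated there, so the values are used only as ratios.
kernels_lib = {
    "gcc/4.7.2": (16, 21),
    "gcc/4.9.2": (8.5, 15),
    "gcc/5.1.0": (8.7, 9.5),
    "intel/15.0.2/gcc/4.9.2": (32, 31),
}

growth = {c: percent_growth(old, new) for c, (old, new) in kernels_lib.items()}

# List compilers from largest to smallest growth.
for compiler, pct in sorted(growth.items(), key=lambda item: item[1], reverse=True):
    print(f"{compiler}: {pct:+.1f}%")
```

On these numbers the two compilers singled out in the summary, GCC 4.9.2 and GCC 4.7.2, show the largest growth, while the Intel build shrinks slightly.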


crtrott commented Dec 15, 2015

Nalu data for commit 9d30a9f9a448919c9c1a4cad393bf5da64aac056 of crtrott/Nalu

gcc/4.7.2 74 82
gcc/4.9.2 70 67
gcc/5.1.0 60 62
intel/15.0.2/gcc/4.9.2 143 140
