
Building and Running Octotiger


Warning: If you're using jemalloc, there is currently a bug in both master and version 5.0. Use version 4.5 instead!
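
If you need to point HPX at jemalloc 4.5 explicitly, a hedged sketch of the relevant CMake switches (HPX_WITH_MALLOC and JEMALLOC_ROOT follow HPX's build-system conventions; the install path is a placeholder):

cmake -DHPX_WITH_MALLOC=jemalloc \
      -DJEMALLOC_ROOT=/path/to/jemalloc-4.5 \
      /path/to/hpx/source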

Build-scripts for Ubuntu for the Very Impatient

To build OctoTiger with all dependencies on a recent Ubuntu machine:

  • Create a folder where you want to build OctoTiger with all dependencies
  • Uncompress the scripts from this archive to the build folder
  • Change the paths in build_octotiger_with_dependencies.sh so that they match your build folder
  • Execute build_octotiger_with_dependencies.sh. This installs any missing packages, clones all git repositories, and compiles Boost, Vc, HPX, and OctoTiger (see the sketch below).
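
A rough walk-through of these steps, as a sketch (the folder and archive names are placeholders, not the actual file names):

mkdir -p $HOME/octotiger-all && cd $HOME/octotiger-all   # the build folder
tar xf octotiger-build-scripts.tar.gz                    # the archive linked above
vi build_octotiger_with_dependencies.sh                  # adjust the paths to your build folder
./build_octotiger_with_dependencies.sh                   # installs packages, clones repos, builds everything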

Building HPX on a Desktop/Notebook Machine

In order to use the Vc library with OctoTiger, HPX must be configured with it; OctoTiger then picks up the appropriate settings from the HPX build system.
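
A hedged configure sketch for such an HPX build (the option names HPX_WITH_DATAPAR_VC, Vc_ROOT, and BOOST_ROOT are assumptions based on HPX's CMake conventions of this era; check the documentation of your HPX version):

cmake -DCMAKE_BUILD_TYPE=Release \
      -DHPX_WITH_DATAPAR_VC=On \
      -DVc_ROOT=/path/to/vc/install \
      -DBOOST_ROOT=/path/to/boost/install \
      /path/to/hpx/source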

Build Instructions for HPX (with APEX) and Octotiger on Cori@NERSC for the KNLs

HPX, Vc, and octotiger should be built with the script provided by @khuck, which can be found in the /project/projectdirs/xpress/hpx-lsu-cori-II directory on NERSC resources. See the README there.

Warning: Due to a Cori bug, you have to manually run module unload darshan in every new shell; putting the command in your .bashrc.ext does not resolve the issue. If you forget to do this, your applications will link against darshan, which is not available on Cori Phase 2.
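
For example, at the start of every session:

module unload darshan
module list 2>&1 | grep -i darshan   # should print nothing (module list usually writes to stderr)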

Instead of building octotiger with all its dependencies yourself, you can also use the prebuilt libraries in the knl-build subfolder. Using them might require some minor changes to the build scripts.

Running Octotiger with REAL SCIENCE RESULTS

(from an email by @dmarce1):

For test scaling runs, I have created three startup files of the same initial problem at three different maximum levels of refinement. (The problem setup is a q=0.20 double white dwarf system in near equilibrium)

  1. 7 level run with 1513 subgrids
  2. 8 level run with 4297 subgrids
  3. 9 level run with 11241 subgrids.

(Each step increases the workload by roughly 2.7×: 4297/1513 ≈ 2.8 and 11241/4297 ≈ 2.6.)

I have uploaded the restart files to google drive here (also saved in /project/projectdirs/xpress/hpx-lsu-cori-II/restart-files): https://drive.google.com/drive/folders/0B_Hf1bEwvJEkS21VMzBSdUg1S0k?usp=sharing

The command line to start from these files is:

./octotiger --hpx:threads 20 -Problem=dwd -Max_level=$LEVEL -Xscale=4.0 -Eos=wd -Angcon=1 -Restart=restart$LEVEL.chk -Stoptime=0.01

where $LEVEL is 7, 8, or 9. Make sure to use the latest octotiger; a couple of the command line options above require the latest version.
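
To run all three problem sizes back to back, a small wrapper loop works (a sketch using the same options as above):

for LEVEL in 7 8 9; do
    ./octotiger --hpx:threads 20 -Problem=dwd -Max_level=$LEVEL -Xscale=4.0 -Eos=wd -Angcon=1 -Restart=restart$LEVEL.chk -Stoptime=0.01
done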

On the SuperMIC computer, using 32 nodes with 20 cores each, the 8 level problem takes 70 steps and runs in about a minute and a half. Stoptime can be adjusted for longer or shorter runs.

Additional large files

  1. level 10? with ~50,000 subgrids - Google Drive
  2. level 11? with ~300,000 subgrids - Google Drive

Running Octotiger (one locality)

To run octotiger with APEX:

salloc -N 1 -p knl -C knl,quad,flat -t 30:00 -A xpress
# (wait for allocation...)
rm -rf *.chk
rm -rf step.dat
rm -rf OTF2_archive
export APEX_OTF2=1  # enables OTF2 output
export APEX_PROC_STAT=0  # disables background checking of system counters (there's a bug Kevin needs to fix)
export APEX_PROCESS_ASYNC_STATE=0  # reduces overhead of APEX from ~1.5% to even less than that
srun -n 1 -N 1 ./src/octotiger-build/octotiger -Disableoutput -Problem=moving_star -Max_level=4 -Xscale=32 -Odt=0.5 -Stopstep=0 --hpx:threads=68

If APEX isn't enabled:

salloc -N 1 -p knl -C knl,quad,flat -t 30:00 -A xpress
# (wait for allocation...)
srun -n 1 -N 1 ./src/octotiger-build/octotiger -Disableoutput -Problem=moving_star -Max_level=4 -Xscale=32 -Odt=0.5 -Stopstep=0 --hpx:threads=68

Note that to control MCDRAM/DRAM placement on the KNLs, you have to use numactl:

salloc -N 1 -p knl -C knl,quad,flat -t 30:00 -A xpress
# (wait for allocation...)
srun -n 1 -N 1 numactl -m 1 ./src/octotiger-build/octotiger -Disableoutput -Problem=moving_star -Max_level=4 -Xscale=32 -Odt=0.5 -Stopstep=0 --hpx:threads=68

-m 1 indicates that the MCDRAM (NUMA node 1 in quad/flat mode) should be used for allocations. For more details, see the NERSC website.
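
To verify which NUMA node is the MCDRAM on your allocation before pinning, inspect the topology first (in quad/flat mode the MCDRAM typically shows up as a NUMA node without CPUs):

srun -n 1 -N 1 numactl -H   # the CPU-less node (usually node 1) is the MCDRAM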

If Octotiger Crashes

Insufficient pages

If the error message mentions insufficient resources, similar to this:

  what():  mmap() failed to allocate thread stack due to insufficient resources, increase /proc/sys/vm/max_map_count or add -Ihpx.stacks.use_guard_pages=0 to the command line: HPX(unhandled_exception)

Then add the flag -Ihpx.stacks.use_guard_pages=0.

On some machines this flag seems to cause some performance degradation; check whether the function pageblock_pfn_to_page shows up in a profiler.
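
If you would rather apply the kernel-side fix suggested in the error message, raise the mmap limit instead (requires root, so this only works on machines you administer; the value below is an illustrative choice):

sudo sysctl -w vm.max_map_count=1048576   # the kernel default is usually 65530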

Stack overflow

If octotiger crashes with a segfault, the stack space may be too small. Increase the stack size by adding the parameter --hpx:ini=hpx.stacks.small_size=0xC0000 (up from 0xC000 bytes).
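
For example, the single-locality run from above with the larger stacks (0xC0000 bytes = 768 KiB per HPX thread stack):

srun -n 1 -N 1 ./src/octotiger-build/octotiger --hpx:ini=hpx.stacks.small_size=0xC0000 -Disableoutput -Problem=moving_star -Max_level=4 -Xscale=32 -Odt=0.5 -Stopstep=0 --hpx:threads=68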

Enable integration with VTune

HPX can be integrated with the Intel VTune Amplifier and Intel Inspector tools through the open ITTNotify interface exposed by those tools. The integration makes the following information available to the Intel tools:

  • Setting kernel thread names to be displayed in VTune
  • HPX-threads, HPX performance counters, and HPX Parcel send/receive events
  • HPX specific synchronization primitives
  • HPX memory allocation tracking

This integration requires both compile-time and run-time settings to be enabled. Add -DHPX_WITH_ITTNOTIFY=On and -DAMPLIFIER_ROOT=<amplifier base directory> to cmake at configuration time to compile the functionality into the HPX core library. This can be done even if no runtime integration with the Intel tools is planned; it creates no additional runtime overhead unless you actually run under one of the Intel tools.

Note, however, that you cannot integrate HPX with both the Intel tools and APEX at the same time; only one of the two integrations is possible for a particular build of HPX. If you enable both options (-DHPX_WITH_ITTNOTIFY=On and -DHPX_WITH_APEX=On), you will get only the APEX integration, as APEX itself hooks into HPX partly through the ITTNotify interface.

In order to activate the integration with the Intel tools at runtime, add the command line option --hpx:ini=hpx.use_itt_notify=1.
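
Putting both halves together, a sketch (the VTune install path is a placeholder, and amplxe-cl -collect hotspots is just one example of a VTune collection):

# compile time: bake ITTNotify support into the HPX core library
cmake -DHPX_WITH_ITTNOTIFY=On -DAMPLIFIER_ROOT=/opt/intel/vtune_amplifier_xe /path/to/hpx/source

# run time: collect a profile with the ITTNotify hooks active
amplxe-cl -collect hotspots -- ./octotiger --hpx:ini=hpx.use_itt_notify=1 <other options>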