Merged
198 commits
7c8a54e
Extending CMake configuration to use CTest
helq Dec 6, 2022
7031778
Packet source terminal is notified on packet delay delivery
helq Dec 14, 2022
eaffbb5
Improve strategy to store latency of packets from terminal to terminal
helq Dec 19, 2022
90f623f
Sending packages directly to terminal instead of network
helq Dec 21, 2022
46bf23d
"Network bypassing" implementation can be toggled on/off now
helq Dec 23, 2022
a85b983
Scaffolding for custom latency predictor done
helq Jan 15, 2023
43d152b
Basic Director implementation to switch to and from surrogate mode
helq Jan 17, 2023
d7d972a
Allowing code to compile when tie breaker is deactivated
helq Jan 20, 2023
3fe1b8e
Moving surrogate from ping-pong to global file
helq Jan 30, 2023
eff8784
Tracking new parameter: timestamp at which a packet was injected in m…
helq Feb 2, 2023
64230f2
Capturing in queue (input terminal buffer) delay per packet
helq Feb 6, 2023
6e23955
Initial implementation of network freezing
helq Feb 20, 2023
bb91444
Some assertions to check for sanity of gvt trigger
helq Feb 21, 2023
59d3399
The state of the network simulation is now being *truly* frozen
helq Feb 21, 2023
04103ab
Sequential implementation of switching mechanism completed
helq Feb 22, 2023
baa787a
Loading router buffer occupancies via the config file
helq Feb 22, 2023
e152bc6
Fixing implementation of switching mechanism (mostly in ROSS)
helq Feb 23, 2023
b151224
Selection of events to freeze is now done by the topology/network model
helq Feb 23, 2023
25af13a
Moving procedure to save packet delay info to memory out of event han…
helq Feb 23, 2023
0dfb747
Adding some `free`s that were previously missing on rollback!
helq Feb 24, 2023
8f755eb
Fiddling a bit more with the idle events. They are tricky to get righ…
helq Feb 24, 2023
80362d6
Using injection bandwith infraestructure instead of by hand strategy
helq Feb 24, 2023
cac772f
Anothe quick fix and a configuration option to turn on or off the fre…
helq Feb 24, 2023
062bef7
Adding surrogate stats output for models
helq Feb 25, 2023
f64161f
merge IIT updates to kronos branch, and add support of Union in drago…
Apr 11, 2023
4e889d7
add doc for union online workload
Apr 11, 2023
72db8aa
typo fix
Apr 11, 2023
453b9ab
Merge branch 'kronos' into kronos-union
Apr 18, 2023
45899ef
Enabling surrogate switch even when tie-breaker mechanism is disabled
helq Apr 24, 2023
27f9e9e
Extending functionality to compile and run without tie-breaker mechanism
helq May 3, 2023
92355ca
Simplifying zombie notification strategy
helq May 10, 2023
4fe5d52
Extending predictor to predict delay to process next packet in the queue
helq May 21, 2023
3080130
Including reproducibility pads23 scripts
helq Jul 9, 2023
d6c979e
Dragonfly's surrogate model can now handle msg_sz > packet_sz
helq Jul 9, 2023
d78644a
Extending unit tests to include several cases for ping-pong workload
helq Jul 25, 2023
e2537e7
Fixing additional bugs on surrogate mode
helq Aug 11, 2023
480f259
Small reformating of surrogate and two bugs found
helq Aug 11, 2023
a6922c9
Ignoring next-in-line delay for packel-latency average predictor
helq Aug 17, 2023
d977677
output iteration time to file
Aug 31, 2023
7edd333
add temporary test folder
Aug 31, 2023
4a3e46a
Merge branch 'kronos-union' into kronos-ml-speedup
helq Aug 31, 2023
3518083
Updating to allow Union to run alongside surrogate
helq Sep 1, 2023
2a236fc
Updated compilation instructions
helq Sep 1, 2023
1b14c9f
Fixing UNION and SWM compilation of Argobots
helq Sep 1, 2023
166d3dc
Harcoded application surrogate. Skipping several iterations of simula…
helq Sep 1, 2023
dc9568f
Merge remote-tracking branch 'origin/kronos-ml-application' into kron…
helq Sep 1, 2023
f96ac23
In some systems the an assert assumption was not valid
helq Sep 1, 2023
1f35268
Fixing non-determinism bug on dragonfly-dally
helq Sep 15, 2023
54ea148
Updating tests to run in AiMOS with no errors
helq Sep 15, 2023
04fe07a
Fixing compilation error for some non-compliant compilers (empty defa…
helq Sep 18, 2023
e1e136e
Refactoring surrogate code to separate it into multiple files
helq Sep 22, 2023
a570ca0
Connecting (Py)Torch model to packet latency surrogate
helq Sep 22, 2023
816ecba
Fixing bug on switch to and from surrogate failure
helq Sep 27, 2023
5bf3c73
Adding extra check to surrogate tests (making them regression unit te…
helq Sep 28, 2023
9e8c0e7
Tracking time spent in surrogate mode
helq Oct 2, 2023
9794a32
Disabling Torch-JIT if it is not found in the system
helq Oct 5, 2023
8848b55
Updating torch-jit predictor to feed model an integer
helq Oct 11, 2023
50961f3
Fixiing average predictor table size
helq Oct 15, 2023
628907d
Tracking packets not chunks to determine if message has been completed
helq Oct 17, 2023
3010118
Extending examples with uniform random traffic example
helq Oct 22, 2023
4fd9ea4
Fixed NaN bug when packet predictor hasn't been fed a packet with ano…
helq Oct 27, 2023
1707f1f
New strategy to feed predictor implemented
helq Nov 2, 2023
0b53bb8
Small fixes to allow more experiment configurations to run
helq Nov 2, 2023
2f23503
Fixing double deallocation (free())
helq Nov 2, 2023
46d72f8
Updating scripts to check output data (packet latency / port occupanc…
helq Nov 3, 2023
b0be25b
Revert "Extending examples with uniform random traffic example"
helq Nov 3, 2023
ff831a8
Updating scripts to find out packet latency
helq Nov 8, 2023
2a48197
Another update to the plotting scripts (more general)
helq Nov 11, 2023
0c2678c
Refactoring plot generation scripts
helq Nov 13, 2023
59ffe08
Misc changes to plot packet latency scripts
helq Nov 25, 2023
97830b4
Allowing to disable torch if library is present
helq Nov 27, 2023
f924aca
Quick (and good enough) fix to keep the buffer at the terminal small
helq Nov 27, 2023
d3b192b
Forcing variable to not be optimized out
helq Nov 28, 2023
3f63c9c
Fixed little bug on next in queue time
helq Nov 30, 2023
4d35048
Merge branch 'kronos-torch-jit' of github.com:codes-org/codes into kr…
helq Nov 30, 2023
8e0f450
Partial fix for progressive adaptive's algo
helq Dec 11, 2023
3d1f29c
Tweaking the plotting script for packet-latency
helq Dec 13, 2023
743f853
Merge remote-tracking branch 'origin/fix-tiebreaker-compilation' into…
helq Jan 12, 2024
98aba5e
Fixing reverse computation VC patch
helq Jan 12, 2024
eaf97df
Merge branch 'kronos-torch-jit-prog-adap-patch' into kronos-torch-jit
helq Jan 12, 2024
4fcda47
Updating port-occupancy script
helq Feb 8, 2024
94ae872
Renaming variable to avoid confusion
helq Feb 12, 2024
6a67816
Refactoring director function to generalize (first step to define API…
helq Feb 26, 2024
9b32a71
Hardcoded example skipping iterations for TWO applications (MILC and …
helq Apr 11, 2024
42f7cd5
Improving figure generation script
helq Apr 18, 2024
44d5f69
Fix: replacing O(n) table lookup for O(1)
helq Apr 29, 2024
a155f6d
zmqml src
kazutomo May 7, 2024
b7cba7c
additional zmqml src
kazutomo May 7, 2024
8184b44
C++ API fixes and training demo files
kazutomo May 7, 2024
b51ffbf
notes
kazutomo May 7, 2024
32f0482
data location change
kazutomo May 7, 2024
c589d49
Injecting iteration time as an argument
helq May 8, 2024
9f605f0
Fixing compilation warning `incompatible-pointer-types`
helq Jul 7, 2024
54936f2
Merge remote-tracking branch 'origin/kronos-develop' into kronos-develop
helq Jul 7, 2024
1df7bb7
Updating code after ROSS change on gvt hook
helq Jul 9, 2024
6af7eb1
Fixing compilation warning `incompatible-pointer-types`
helq Jul 7, 2024
472cc5a
Removing hardcoded test and we can pass a config file now
helq Jan 23, 2025
57fc7e3
Fixing a memory bug when reading from file
helq Jan 24, 2025
2711b6b
Allowing to run without skipping configuration file
helq Jan 24, 2025
1412a4e
Saving apps iteration logs into single files per PE
helq Jan 24, 2025
bb5b369
Guaranteeing that "workload period" config works in parallel
helq Jan 25, 2025
a4e052a
Changing time in period file to double (from long)
helq Jan 25, 2025
795628d
Stdout for surrogate only from PE 0
helq Feb 18, 2025
a7121ec
Implementing custom LP status printing for model-net-lps
helq Feb 18, 2025
ca30320
Fixing small bug found when rollbacking model-net-event
helq Feb 18, 2025
c2afcd1
Cleaning up some structs and fixing a reverse handler case
helq Feb 24, 2025
c4c1491
Refactoring struct in model-net-mpi-replay
helq Feb 24, 2025
9a5bf98
Print function for struct codes_workload_op and enum codes_workload_o…
helq Feb 24, 2025
9da3d36
Implementing deep copy/check/print for LP state: nw_state
helq Feb 24, 2025
6e97889
Fixing minor reversibility bugs in LP type nw_state
helq Feb 24, 2025
a3e638e
Adding checkpointer functionality to model-net sub-models
helq Mar 2, 2025
e430fea
Moving implementation of linked list equality to quicklist.h
helq Mar 2, 2025
8b95a70
Fixing some potential memory errors (from Valgrind)
helq Mar 2, 2025
c9729d8
Extending implementation of model-net checkpointer
helq Mar 2, 2025
7bc29c2
Implementing FCFS checkpointer
helq Mar 2, 2025
d48898a
Removing never used struct param `entry_time`
helq Mar 4, 2025
fab09e8
Printing lp states and events with a prefix (prettier printing)
helq Mar 5, 2025
ca89cf1
Small implementation fixes (typo and exporting function name)
helq Mar 7, 2025
2dd6db5
Implementing base deep-copy/clean/comparison/print for dragonfly lps
helq Mar 7, 2025
7aa4c11
Printing sub_message contents of model-net message
helq Mar 7, 2025
f3818d0
Implementing (an almost complete) deep-copy of terminal_state
helq Mar 10, 2025
0898c37
Fixing reversibility bug in terminal_state (dragonfly-dally)
helq Mar 10, 2025
d3d7621
Commenting what has is left to be implemented to fully deep-copy `str…
helq Mar 11, 2025
41680da
Implementing deep-copy of member terminal_msgs in terminal_state
helq Mar 12, 2025
f8c5163
Fixing copy of C++ non-initialized members
helq Mar 12, 2025
4a1819b
Some members of terminal_state are not be deep-copied in surrogate mode
helq Mar 12, 2025
d2cf6ae
Fixing surrogate switch
helq Mar 12, 2025
4b6bc9a
Fixing state that wasn't properly reversed
helq Mar 12, 2025
8c8ccbc
Fixing rollback of member `remaining_sz_packets` in terminal_state
helq Mar 12, 2025
ba77a08
Fixing faulty logic when rollbacking event for background traffic
helq Mar 12, 2025
f195835
Merge remote-tracking branch 'origin/hotfix-8k-scalability-issue' int…
helq Mar 12, 2025
ddf1981
Fixing condition for surrogate switch
helq Mar 13, 2025
896b70b
Merge remote-tracking branch 'origin/kronos-gvt-director-refactor' in…
helq Mar 13, 2025
03e5fd4
Fixing the switch from high-fidelity to surrogate
helq Mar 18, 2025
a4cac4d
Adding ability to delete events at director call
helq Mar 18, 2025
dde0551
Fixing some debug output in surrogate switch
helq Mar 18, 2025
18d300e
Adding deep-copy/check/print functions for router_state
helq Mar 21, 2025
244f98a
Fixing reversibility bug in router_state
helq Mar 21, 2025
e7e7535
Updating tie-breaker related code from ROSS update
helq Mar 23, 2025
e0cc46e
Finishing missing components to check in deep-copy/check/print functi…
helq May 29, 2025
0e76693
Moving general PDES code into ROSS
helq May 29, 2025
01e6bf6
Renaming surrogate as network-surrogate
helq May 30, 2025
ab3b951
Renaming network average predictor to allow for more predictors
helq May 30, 2025
77964ab
Network predictors do not need to allocate memory when initialized
helq May 30, 2025
8c65ec2
Each computer node tracks its own workload id
helq May 30, 2025
d3f75b8
Renaming another variable from surrogate to network-surrogate
helq May 31, 2025
9bfa926
Adding some documentation for nw_state
helq Jun 2, 2025
81099b7
Initial implementation of director for application iteration
helq Jun 5, 2025
53f51c4
Fixing bug on predictor when app is not fully distributed across all PEs
helq Jun 9, 2025
ffea77b
Refactoring/renaming some fields to aid legibility
helq Jun 9, 2025
763a71f
De-harcoding parameters passed down by model-net-mpi-replay at init
helq Jun 9, 2025
86c25cd
Configuring application surrogate through config file
helq Jun 9, 2025
fa56d85
Passing data from non-synthetic workloads to CODES through interface
helq Jun 10, 2025
2433b8b
Allowing surrogate to run in sequential mode
helq Jun 10, 2025
26fa2ac
Minor cosmetic change
helq Jun 10, 2025
1b0bdab
Light refactoring of a large function in the application predictor
helq Jun 10, 2025
553f492
Removing old (hardcoded) application surrogate
helq Jun 10, 2025
650fd9e
Refactoring strategy to freeze network in network director
helq Jun 11, 2025
7db95be
Refactor network director to use separate queue for frozen events ins…
helq Jun 11, 2025
c16965f
Network surrogate should be enabled through a custom parameter
helq Jun 11, 2025
9835040
Bug fix - tw_now has been moved out of commit time
helq Jun 12, 2025
bfdfba9
Hooking network surrogate to application surrogate
helq Jun 12, 2025
8e9521b
Wrap dummy event logic with compile-time flag for simulation reproduc…
helq Jun 12, 2025
10edcec
Modifying tests. They all pass now!
helq Jun 12, 2025
9126863
Adding missing garbage collection and print statement
helq Jun 13, 2025
cd766b1
Refactoring some of the common values between surrogates
helq Jun 13, 2025
4969737
Fixing position of bracket
helq Jun 13, 2025
b4b6362
Updating tests
helq Jun 13, 2025
36bc317
Adding tests for UNION
helq Jun 13, 2025
03d7da6
Fixing bug where MILC would not work with network surrogate when free…
helq Jun 16, 2025
fcdf824
Fixed a bug on reading setting from file
helq Jun 16, 2025
1a41fda
Resetting predictor when turning back into full fidelity
helq Jun 16, 2025
e70f540
Adding more tests for Union and the application surrogacy
helq Jun 16, 2025
9b9a1ed
Fixing a bug and adding a test to check for different sizes of chunks…
helq Jun 17, 2025
66511d9
potential/partial fix to unmatched receives bug
helq Jun 17, 2025
ecae4b8
Fixing bugs that show up with Jacobi and chunk size != packet size
helq Jun 17, 2025
737c702
Adding more tests :)
helq Jun 17, 2025
5755e06
Changed terminal_dally_message_list to work with terminal_dally_messa…
helq Jun 17, 2025
1710290
Fixing small silent bug at terminal initialization
helq Jun 17, 2025
3d1b55c
Refactoring routers usage of custom double linked-list for qlist
helq Jun 18, 2025
274f020
Allowing director to be called after simulation ended, to repopulate …
helq Jun 18, 2025
ba7b826
Updating README and compile instructions
helq Jun 18, 2025
07a4002
Small print changes
helq Jun 20, 2025
8be98f9
Allowing conc-online to load json files from config path
helq Jun 20, 2025
25ab4c9
If we pass on a `workload_json_files` conf file, we allow a job to ta…
helq Jun 20, 2025
64c6cce
Extending iterator predictor to predict when to restart the simulation
helq Jun 23, 2025
b992e4a
Making post_init_share_ending_iteration intent clearer
helq Jun 24, 2025
82a69f8
Fixed cross-platform fscanf EOF handling
helq Jun 24, 2025
3295653
Fixing some errors found with valgrind
helq Jun 26, 2025
73cdbd5
Updating CODES-compile-instructions.sh
helq Jun 26, 2025
667dc28
Saving to file when an iteration has been skipped by the surrogate
helq Jun 27, 2025
ed57788
Merge branch 'director-app-automatic' into develop
helq Jun 27, 2025
789c469
Updating compilation instructions
helq Jun 29, 2025
45453ad
Max iteration per app should be computed across all MPI ranks
helq Jun 29, 2025
242707e
Updating compilation instructions
helq Jul 15, 2025
34275e3
Removing support for Autoconf
helq Jul 22, 2025
3d2b726
Adding some of Neil's and Elkin's contributions from the past 5 years
helq Jul 22, 2025
ed9edf5
Updating compilation script
helq Jul 23, 2025
8 changes: 7 additions & 1 deletion .gitignore
@@ -37,4 +37,10 @@
# generated files from test runs
ross.csv

install-mastiff/include/codes/model-net-method.h
install-mastiff/include/codes/model-net-method.h

# commonly used building stuff
/build*/
/build*
.cache
compile_commands.json
86 changes: 66 additions & 20 deletions CMakeLists.txt
@@ -1,4 +1,4 @@
cmake_minimum_required(VERSION 3.10)
cmake_minimum_required(VERSION 3.17)

# set the project name and version
project(codes LANGUAGES C CXX VERSION 2.0)
@@ -22,12 +22,13 @@ SET(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)

set(ROSS_PKG_CONFIG_PATH "" CACHE PATH "Where is the ROSS PKG_CONFIG installed?")
set(SWM_PKG_CONFIG_PATH "" CACHE PATH "Where is the SWM PKG_CONFIG installed?")
set(UNION_PKG_CONFIG_PATH "" CACHE PATH "Where is the Union PKG_CONFIG installed?")
set(ARGOBOTS_PKG_CONFIG_PATH "" CACHE PATH "Where is the Argobots PKG_CONFIG installed? Necessary for SWM")
set(DAMARIS_PKG_CONFIG_PATH "" CACHE PATH "Where is the damaris PKG_CONFIG installed?")


find_package(PkgConfig REQUIRED)
set(ENV{PKG_CONFIG_PATH} "${ROSS_PKG_CONFIG_PATH}:${SWM_PKG_CONFIG_PATH}:${ARGOBOTS_PKG_CONFIG_PATH}")
set(ENV{PKG_CONFIG_PATH} "${ROSS_PKG_CONFIG_PATH}:${SWM_PKG_CONFIG_PATH}:${UNION_PKG_CONFIG_PATH}:${ARGOBOTS_PKG_CONFIG_PATH}")
pkg_check_modules(ROSS REQUIRED IMPORTED_TARGET ross)

# MPI
@@ -57,28 +58,50 @@ else(DUMPI_LIB)
set(USE_DUMPI true)
endif()

## SWM
# SWM and UNION (both require ARGOBOTS to function)
pkg_check_modules(SWM IMPORTED_TARGET swm)
if(NOT SWM_FOUND)
message(STATUS "SWM Library Not Found, Online workloads disabled")
message(STATUS "SWM Library Not Found, Online workloads disabled")

else(SWM_FOUND)
message(STATUS "SWM Library Found: ${SWM_LIBRARIES}")
pkg_check_modules(ARGOBOTS REQUIRED IMPORTED_TARGET argobots)
if(NOT ARGOBOTS_FOUND)
message(STATUS "Argobots Library Not Found, Online workloads disabled")
else(ARGOBOTS_FOUND)
message(STATUS "Argobots Library Found: ${ARGOBOTS_LIBRARIES}")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${ARGOBOTS_CFLAGS} -I${ARGOBOTS_INCLUDE}")

pkg_get_variable(SWM_DATAROOTDIR swm datarootdir)
cmake_print_variables(SWM_DATAROOTDIR)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SWM_CFLAGS} -I${SWM_INCLUDE}")
add_definitions(-DUSE_ONLINE=1)
set(USE_ONLINE true)
message(STATUS "SWM Library Found: ${SWM_LIBRARIES}")
pkg_check_modules(ARGOBOTS REQUIRED IMPORTED_TARGET argobots)
if(NOT ARGOBOTS_FOUND)
message(STATUS "Argobots Library Not Found, Online workloads disabled")

else(ARGOBOTS_FOUND)
message(STATUS "Argobots Library Found: ${ARGOBOTS_LIBRARIES}")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${ARGOBOTS_CFLAGS} -I${ARGOBOTS_INCLUDE}")

pkg_get_variable(SWM_DATAROOTDIR swm datarootdir)
cmake_print_variables(SWM_DATAROOTDIR)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SWM_CFLAGS} -I${SWM_INCLUDE}")
add_definitions(-DUSE_ONLINE=1)
set(USE_ONLINE true)

pkg_check_modules(UNION IMPORTED_TARGET union)
if(NOT UNION_FOUND)
message(STATUS "UNION Library Not Found, SWM-only online workloads enabled")
add_definitions(-DUSE_SWM=1)
set(USE_SWM true)
else(UNION_FOUND)
message(STATUS "UNION Library Found: ${UNION_LIBRARIES}")
pkg_get_variable(UNION_DATAROOTDIR union datarootdir)
cmake_print_variables(UNION_DATAROOTDIR)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${UNION_INCLUDE}")
foreach(INCLUDE_OPT ${UNION_CFLAGS})
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${INCLUDE_OPT}")
endforeach()

add_definitions(-DUSE_UNION=1)
set(USE_UNION true)
endif()
endif()
endif()


## RECORDER
option(USE_RECORDER "use recorder io workload" ON)
if(USE_RECORDER)
@@ -96,11 +119,34 @@ endif()
# set(USE_DAMARIS true)
# endif()

## TORCH loading ML models
if((NOT DEFINED USE_TORCH) OR USE_TORCH)
find_package(Torch)
if(Torch_FOUND)
set(CMAKE_CXX_STANDARD 17)
add_definitions(-DUSE_TORCH)
set(USE_TORCH true)
message(STATUS "Loading TORCH models enabled.")
else()
set(USE_TORCH false)
message(STATUS "Torch library not found. Loading TORCH models disabled.")
endif()
else()
message(STATUS "Loading TORCH models NOT enabled.")
endif()

cmake_print_variables(CMAKE_C_FLAGS)
add_subdirectory(src)


configure_file(codes_config.h.in codes_config.h)

configure_file(codes_config.h.cmake.in codes_config.h)

add_subdirectory(doc/example)

string(COMPARE NOTEQUAL "RELEASE" "${CMAKE_BUILD_TYPE}" not_release)
if(BUILD_TESTING AND not_release)
include(CTest)
set(CODES_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}")
set(CODES_BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}")
add_subdirectory(tests)
endif()
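
Reviewer sketch (not part of the diff): the hunks above add three configure-time knobs, `UNION_PKG_CONFIG_PATH`, `USE_TORCH`, and `BUILD_TESTING`. The helper below, whose name `codes_configure_cmd` and example paths are hypothetical, only prints a `cmake` command line that exercises them. Two behaviours seem worth keeping in mind when choosing values: `BUILD_TESTING` is read before `include(CTest)` runs, so it likely has to be passed explicitly, and `string(COMPARE NOTEQUAL ...)` is case-sensitive, so only a build type spelled exactly `RELEASE` would skip the `tests` subdirectory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cmake invocation for a CODES build directory.
# Usage: codes_configure_cmd <ross_prefix> <use_torch: true|false> [union_pkgconfig_dir]
codes_configure_cmd() {
    local ross_prefix="$1" use_torch="$2" union_pc="${3:-}"
    local args=(
        -DCMAKE_PREFIX_PATH="$ross_prefix"
        -DCMAKE_BUILD_TYPE=Debug
        -DBUILD_TESTING=ON
        -DUSE_TORCH="$use_torch"
    )
    # Union is optional; without it the build falls back to SWM-only online workloads.
    if [ -n "$union_pc" ]; then
        args+=(-DUNION_PKG_CONFIG_PATH="$union_pc")
    fi
    printf 'cmake ..'
    printf ' %s' "${args[@]}"
    printf '\n'
}

# Example paths are placeholders, not locations used by the project.
codes_configure_cmd /opt/ross false
codes_configure_cmd /opt/ross true /opt/union/lib/pkgconfig
```

Printing the command instead of running it keeps the sketch free of side effects; from inside a build directory the output can be reviewed and then executed by hand.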
136 changes: 136 additions & 0 deletions CODES-compile-instructions.sh
@@ -0,0 +1,136 @@
#!/bin/bash -x

# Switches
swm_enable=1
union_enable=1
torch_enable=0

# Uncomment below for MPICH
#export PATH=/usr/local/mpich-4.1.2/bin/:"$PATH"
# Note: remember to compile MPICH with nemesis not with UCX support

################## Actual script starts here ##################

# SWM has to be enabled for UNION to work
if [ $union_enable = 1 ]; then
swm_enable=1
fi

# What to compile
CUR_DIR="$PWD"

##### Downloading everything #####

git clone https://github.com/codes-org/codes --depth=100 --branch=v1.5.0
git clone https://github.com/ross-org/ross --depth=100 --branch=v8.1.0

if [ $swm_enable = 1 ]; then
git clone https://github.com/pmodels/argobots --depth=1
git clone https://github.com/codes-org/swm-workloads --branch=v1.2
fi

if [ $union_enable = 1 ]; then
# Downloading conceptual
curl -L https://sourceforge.net/projects/conceptual/files/conceptual/1.5.1b/conceptual-1.5.1b.tar.gz -o conceptual-1.5.1b.tar.gz
tar xvf conceptual-1.5.1b.tar.gz
# Downloading union
git clone https://github.com/SPEAR-UIC/Union
pushd Union && git checkout 99b3df3 && popd
fi

##### COMPILING #####

mkdir ross/build
pushd ross/build
cmake .. -DROSS_BUILD_MODELS=ON -DCMAKE_INSTALL_PREFIX="$(realpath ./bin)" \
-DCMAKE_C_COMPILER=mpicc -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_FLAGS="-g -Wall"
#make VERBOSE=1
make install -j4
err=$?
[[ $err -ne 0 ]] && exit $err
popd

if [ $swm_enable = 1 ]; then
pushd swm-workloads/swm
./prepare.sh
mkdir build
pushd build
../configure --disable-shared --prefix="$(realpath ./bin)" CC=mpicc CXX=mpicxx CFLAGS=-g CXXFLAGS=-g
#make V=1 && make install
make -j4 && make install
err=$?
[[ $err -ne 0 ]] && exit $err
popd && popd

pushd argobots
./autogen.sh
mkdir build
pushd build
#../configure --enable-debug=all --disable-fast --disable-shared --prefix="$(realpath ./bin)" CC=mpicc CXX=mpicxx CFLAGS=-g CXXFLAGS=-g
../configure --disable-shared --prefix="$(realpath ./bin)" CC=mpicc CXX=mpicxx CFLAGS=-g CXXFLAGS=-g
#make V=1 && make install
make -j4 && make install
err=$?
[[ $err -ne 0 ]] && exit $err
popd && popd
fi

if [ $union_enable = 1 ]; then
pushd conceptual-1.5.1b
PYTHON=python2 ./configure --prefix="$(realpath ./install)" LIBS=-lm
make -j4 && make install
err=$?
[[ $err -ne 0 ]] && exit $err
popd

pushd Union
# Python 2 override. Union expects Python 2 ONLY
mkdir -p python-override
ln -s /usr/bin/python2 python-override/python
# compiling
./prepare.sh
PYTHON=python2 ./configure --disable-shared --with-conceptual="$(realpath ../conceptual-1.5.1b/install)" --with-conceptual-src="$(realpath ../conceptual-1.5.1b)" --prefix="$(realpath ./install)" CC=mpicc CXX=mpicxx
PATH="$PWD/python-override:$PATH" make -j4 && make install
err=$?
[[ $err -ne 0 ]] && exit $err
popd
fi


mkdir codes/build
pushd codes/build

make_args_codes=(
-DCMAKE_PREFIX_PATH="$(realpath "$CUR_DIR/ross/build/bin")"
-DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc
-DCMAKE_C_FLAGS="-g -Wall"
-DCMAKE_CXX_FLAGS="-g -Wall"
-DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTING=ON
-DCMAKE_INSTALL_PREFIX="$(realpath bin)"
)
if [ $swm_enable = 1 ]; then
make_args_codes=(
"${make_args_codes[@]}"
-DSWM_PKG_CONFIG_PATH="$(realpath "$CUR_DIR/swm-workloads/swm/build/maint")"
-DARGOBOTS_PKG_CONFIG_PATH="$(realpath "$CUR_DIR/argobots/build/maint")"
)
fi
if [ $union_enable = 1 ]; then
make_args_codes=(
"${make_args_codes[@]}"
-DUNION_PKG_CONFIG_PATH="$(realpath "$CUR_DIR/Union/install/lib/pkgconfig")"
)
fi
if [ $torch_enable = 1 ]; then
make_args_codes=("${make_args_codes[@]}" -DUSE_TORCH=true)
else
make_args_codes=("${make_args_codes[@]}" -DUSE_TORCH=false)
fi

cmake .. "${make_args_codes[@]}"
#make VERBOSE=1
make -j4
err=$?
[[ $err -ne 0 ]] && exit $err

popd
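
Reviewer sketch (not part of the diff): the script repeats the pattern `mkdir`, `pushd`, build, `err=$?`, `popd` for every dependency. One possible way to factor that out is shown below, assuming a hypothetical helper named `run_step`. It uses `mkdir -p` so a re-run does not fail on an existing directory, and a subshell so the caller's working directory is restored even when the step fails.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command inside a directory (created if missing)
# and return that command's exit status.
run_step() {
    local dir="$1"
    shift
    mkdir -p "$dir" || return 1
    # The subshell confines the directory change, so no pushd/popd pairing is needed.
    ( cd "$dir" && "$@" )
}

# Demonstration in a throwaway directory; a real step would call cmake or make.
demo_dir="$(mktemp -d)"
run_step "$demo_dir/build" pwd || exit $?
rm -rf "$demo_dir"
```

With such a helper a build step could read `run_step ross/build make install -j4 || exit $?`, which keeps the exit-on-failure behaviour of the original `err=$?` checks.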
18 changes: 18 additions & 0 deletions CONTRIBUTORS.md
@@ -20,6 +20,8 @@ Contributors to date (with affiliations at time of contribution)
- Lee Savoie, Univ. of Arizona
- Ning Liu, Rensselaer Polytechnic Institute
- Jason Cope, Argonne National Laboratory
- Kevin A. Brown, Argonne National Laboratory
- Elkin Cruz, Rensselaer Polytechnic Institute

Contributions:

@@ -40,6 +42,8 @@ Neil McGlohon (RPI)
- Merged 1-D dragonfly and 2-D dragonfly network models.
- Updated adaptive routing in megafly and 1-D dragonfly network models.
- Extended slim fly network model's dual-rail mode to arbitrary number of rails (pending).
- Implemented Quality of Service (QoS) in 1-D dragonfly network.
- Implemented changes needed to allow ROSS's tiebreaker mechanism.

Nikhil Jain, Abhinav Bhatele (LLNL)
- Improvements in credit-based flow control of CODES dragonfly and torus network models.
@@ -78,3 +82,17 @@ Caitlin Ross (RPI):
- Added instrumentation so that network models can report sampled
statistics over virtual time (pending).
- Bug reporter for CODES models.

Elkin Cruz (RPI)
- Added network surrogate for 1-D Dragonfly model (dragonfly-dally).
- Added application surrogate for MPI replay (model-net-mpi-replay).
- Implemented an API to allow network and application surrogates to switch as
the simulation runs (a.k.a. hybrid simulation).
- Added network and application level directors, which coordinate data
transfer between model and predictor.
- Added simple average-based network and application predictors (they are
given simulation data and are in charge of predicting future states of the
simulation, skipping computation).
- Implemented necessary scaffolding to check for bugs in reversible
computation (to be used with SEQUENTIAL_ROLLBACK_CHECK option in ROSS).
- Fixed reversible computation bugs on 1-D Dragonfly network.
22 changes: 0 additions & 22 deletions LICENSE.md

This file was deleted.

40 changes: 0 additions & 40 deletions Make.rules

This file was deleted.
