High Performance Computing
DynaSim offers several methods for utilizing professional HPC assets. Note that you can use all of the below options simultaneously for extremely powerful distributed simulation using only a single command!
- DynaSim can compile your simulation code by setting the compile_flag to 1.
- DynaSim can run parallel simulations using multiple cores on a single computer by setting parfor_flag to 1.
- DynaSim can run parallel simulations using multiple nodes on a computer cluster by setting cluster_flag to 1.
See the best practices page for recommendations on how to optimize your use of HPC assets.
Here is an example model with parameter variation to get started using the HPC options of DynaSim:
eqns='HH.pop'; % predefined Hodgkin-Huxley neuron
vary={'HH','Iapp',[0 10 20]}; % parameter space to explore
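For reference when timing the compiled runs below, you can first time a plain, uncompiled local run of this model (a minimal sketch using only options already introduced above):
% time a single uncompiled simulation as a baseline
tic
data=dsSimulate(eqns);
toc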
- How to: set compile_flag to 1
- Dependency: MATLAB Coder Toolbox
Simulating large models can be sped up significantly by compiling the simulation before running it; this is done by automatically translating the simulation into C code using the MATLAB Coder. DynaSim makes this easy to do using the compile_flag option in dsSimulate. Note: compiling the model can take several seconds to minutes; however, it only compiles the first time it is run and is significantly faster on subsequent runs.
To see the effects of running a local simulation with compilation:
tic
data=dsSimulate(eqns, 'compile_flag',1);
toc
% Now run again:
tic
data=dsSimulate(eqns, 'compile_flag',1);
toc
To combine compilation and multicore parallelization to maximize computational speed locally:
data=dsSimulate(eqns, 'compile_flag',1, 'parfor_flag',1, 'vary', vary);
dsPlot(data);
To run a set of simulations on a cluster with compilation:
dsSimulate(eqns, 'save_data_flag',1, 'study_dir','demo_cluster_4','compile_flag',1,...
'vary',vary, 'cluster_flag',1, 'overwrite_flag',1, 'verbose_flag',1);
- How to: set parfor_flag to 1
- Dependency: MATLAB Parallel Computing Toolbox
DynaSim can automatically use parfor from the MATLAB Parallel Computing Toolbox to run multiple simulations in parallel on the same machine. An example:
data = dsSimulate(eqns, 'tspan',[0 250], 'vary',vary, 'parfor_flag',1);
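By default, parfor uses whatever parallel pool MATLAB creates automatically. If you want to control how many workers are used, one option is to open a pool of a specific size yourself before calling dsSimulate; here is a sketch using the standard parpool and gcp functions from the toolbox (the worker count of 4 is just an example):
% open a pool with an explicit number of workers if none exists yet
if isempty(gcp('nocreate'))
    parpool(4); % example: 4 workers; adjust to the cores on your machine
end
data = dsSimulate(eqns, 'tspan',[0 250], 'vary',vary, 'parfor_flag',1);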
- How to: set cluster_flag to 1
- Dependency: you must be running MATLAB on a cluster node with Sun Grid Engine software (i.e., one that recognizes the qsub command).
Users are advised to check whether their institution has a Linux cluster/supercomputer. If not, users may get a free allocation on the Neuroscience Gateway (NSG). The NSG provides a web portal for using computational neuroscience tools in the browser. See the NSG tutorials for more details.
DynaSim creates m-files called 'jobs' that run dsSimulate for one or more simulations. Jobs are saved in automatically-created folders in '~/batchdirs/<study_dir>' and are submitted to the cluster queue using the Sun Grid Engine command qsub. Standard out and error logs for each job are saved in ~/batchdirs/<study_dir>/pbsout.
The following cluster examples can be executed from any type of node/computer on the cluster, including 'login' nodes:
Run three simulations in parallel jobs and save the simulated data:
dsSimulate(eqns, 'save_data_flag',1, 'study_dir','demo_cluster_1',...
'vary',vary, 'cluster_flag',1, 'overwrite_flag',1, 'verbose_flag',1);
Tips for checking job status from within MATLAB:
!qstat -u <YOUR_USERNAME>
!cat ~/batchdirs/demo_cluster_1/pbsout/sim_job1.out
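You can also inspect the generated job files and logs directly with shell escapes; the paths below follow the ~/batchdirs/<study_dir> layout described above (exact file names may vary between DynaSim versions):
% list the generated job m-files for this study
!ls ~/batchdirs/demo_cluster_1
% list the standard out/error logs for each job
!ls ~/batchdirs/demo_cluster_1/pbsout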
Once simulations are finished, you can load and plot the data as before:
data = dsImport('demo_cluster_1');
dsPlot(data);
To run a cluster simulation but also save plotted data:
dsSimulate(eqns, 'save_data_flag',1, 'study_dir','demo_cluster_2',...
'vary',vary, 'cluster_flag',1, 'overwrite_flag',1, 'verbose_flag',1,...
'plot_functions',@dsPlot);
!cat ~/batchdirs/demo_cluster_2/pbsout/sim_job1.out
To run a cluster simulation, save multiple plots, AND pass custom options to each plotting function:
dsSimulate(eqns, 'save_data_flag',1, 'study_dir','demo_cluster_3',...
'vary',vary, 'cluster_flag',1, 'overwrite_flag',1, 'verbose_flag',1,...
'plot_functions',{@dsPlot,@dsPlot},...
'plot_options',{{},{'plot_type','power'}});
!cat ~/batchdirs/demo_cluster_3/pbsout/sim_job1.out
Post-simulation analyses can be performed similarly by passing analysis function handles and options using analysis_functions and analysis_options in calls to dsSimulate.
Note: options will be passed to plot and analysis functions in the order given. You can pass handles and options for any built-in, pre-packaged, or custom functions.
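As an illustration, here is a sketch of a cluster run that also computes a firing-rate analysis for each simulation; it assumes dsCalcFR is available as a built-in analysis function in your DynaSim installation, and the study name 'demo_cluster_5' and the empty options cell are placeholders:
dsSimulate(eqns, 'save_data_flag',1, 'study_dir','demo_cluster_5',...
'vary',vary, 'cluster_flag',1, 'overwrite_flag',1, 'verbose_flag',1,...
'analysis_functions',{@dsCalcFR},...
'analysis_options',{{}}); % empty cell = default options for the analysis function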
Multiple simulations can be grouped into a single cluster job using the sims_per_job option. Grouping simulations into a single job saves time by reducing the number of times MATLAB must be started; this is beneficial whenever the simulation run time exceeds the MATLAB startup time. See the dsSimulate help in the Function Reference for more information.
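For example, the following sketch groups the three simulations defined by vary above into a single cluster job (the study name 'demo_cluster_6' is only a placeholder):
% run all three parameter values within one cluster job
dsSimulate(eqns, 'save_data_flag',1, 'study_dir','demo_cluster_6',...
'vary',vary, 'cluster_flag',1, 'sims_per_job',3,...
'overwrite_flag',1, 'verbose_flag',1);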
Several examples of options to supply with qsub for controlling memory usage are available here: https://www.bu.edu/tech/support/research/system-usage/running-jobs/parallel-batch/
BU SCC Examples:
- #$-l mem_per_core=8G
- #$-pe omp 16
This should be enough for jobs of up to roughly 86GB; for smaller jobs:
- #$-l mem_per_core=8G
- #$-pe omp 8
If some jobs require more than 86GB, you can try either:
- #$-l mem_per_core=16G
- #$-pe omp 16
or
- #$-pe omp 28
BU SCC Specific Notes:
- All the 28-core nodes have at least 256GB of memory.
- Do not request broadwell nodes.
Update: A summary of these commands can be found here: http://www.bu.edu/tech/support/research/system-usage/running-jobs/allocating-memory-for-your-job/
The following are several example commands for monitoring job usage on the cluster, to help debug issues surrounding memory usage or other factors that cause unexpected termination of jobs.
View history of all jobs over the past 5 days:
qacct -o username -d 5 -j
- maxvmem tells the maximum memory usage of the job
First, run qstat -u username to get your currently active jobs.
Let's pretend you have a job with ID 1234567 and it is running on a node scc-wm3:
1234567 0.10000 myjob username r 12/02/2017 14:17:15 w@scc-wm3.scc.bu.edu 1
Method 1:
Run: qstat -j 1234567
You will get output that looks something like:
job_number: 1234567
exec_file: job_scripts/2918951
submission_time: Thu Dec 7 14:26:12 2017
owner: stanleyd
....
usage 1: cpu=02:54:19, mem=23493.17654 GBs, io=0.77390, vmem=2.246G, maxvmem=2.246G
scheduling info: (Collecting of scheduler job information is turned off)
Pay attention to the "usage 1" line. At the end, the value maxvmem=2.246G shows that this job has used up to 2.246G of memory so far (though it may use more in the future!).
Method 2:
Log in to the compute node where your job is running:
ssh scc-wm3
Then run the "top" command:
top -u username
and check the "VIRT" and "RES" columns, which show your job's current virtual and resident memory.
(Do not forget to log out of the compute node.)