Running SPECFEM2D with Slurm #924
Hi,
This is very likely unrelated to SPECFEM (we use Slurm here on several
machines, without noticing any problem). You should probably contact
your system administrator to ask him/her to check the installation of
SLURM (or your submission scripts).
Best regards,
Dimitri.
…On 04/23/2018 07:10 AM, tianzeliu wrote:
I was trying to run SPECFEM2D on a computer cluster that uses Slurm as the
workload manager. In this case I need to submit jobs as opposed to running
the code directly. Although I could get SPECFEM2D running, it never
generated any output. Instead it would keep running until the time
allocated by Slurm was up and then crash (the time should be enough for
the simulation to finish, as I have tested it on my own machine). Do you
have any idea why this happened? Thanks a lot.
--
Dimitri Komatitsch, CNRS Research Director (DR CNRS)
Laboratory of Mechanics and Acoustics, Marseille, France
http://komatitsch.free.fr
Hi Tianze,
Here is one:
#!/bin/bash
#SBATCH -J job_name
#SBATCH --nodes=2
#SBATCH --ntasks=48
#SBATCH --ntasks-per-node=24
#SBATCH --threads-per-core=1
#SBATCH --mem=1GB
#SBATCH --time=00:30:00
#SBATCH --output job_name.output
module purge
module load intel
module load openmpi
srun --mpi=pmi2 -K1 --resv-ports -n $SLURM_NTASKS ./my_executable param1 param2 ...
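Assuming the script above is saved under some file name (job_script.slurm is used below purely for illustration), it would typically be submitted with sbatch and monitored with squeue. One quick sanity check before submitting is that --ntasks equals --nodes times --ntasks-per-node, as it does in the sample script (2 × 24 = 48):

```shell
# Values copied from the sample script above.
nodes=2
ntasks_per_node=24
ntasks=48

# --ntasks should equal --nodes * --ntasks-per-node for this layout;
# a mismatch is a common cause of jobs hanging or being rejected.
if [ $((nodes * ntasks_per_node)) -eq "$ntasks" ]; then
  echo "task layout consistent"
fi

# Typical workflow on the cluster (requires a working Slurm installation):
#   sbatch job_script.slurm     # submit the job
#   squeue -u $USER             # check its state in the queue
```

The sbatch/squeue commands are the standard Slurm workflow; the exact module names (intel, openmpi) and the --mpi=pmi2 flag depend on how the cluster is configured.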
Best regards,
Dimitri.
…On 04/23/2018 08:18 PM, tianzeliu wrote:
Hi Dimitri,
Thank you for the quick response. Could you provide a sample submission
script that I could modify? Thanks a lot.
Best,
Tianze
Hi Tianze,
Did you manage to successfully run some of the examples that are in the
EXAMPLES directory? If not (i.e. if they also fail), then the problem is
very likely specific to the cluster you use, i.e. that cluster has an
installation problem. If you managed to run several of the examples
successfully then please let me know and we will investigate the
particular case that crashes.
Thank you,
Best regards,
Dimitri.
…On 04/27/2018 01:59 AM, tianzeliu wrote:
Hi Dimitri,
The submission script you provided seemed to work. However, the code now
breaks at a fixed point of the simulation, which only happens on the
cluster not on my local machine. The only difference between the cluster
and the local machine is that the version on the cluster may be newer (I
installed it just now). Here is the error message it gives:
Backtrace for this error:
#0 0x7F3BEF41E6F7
#1 0x7F3BEF41ED3E
#2 0x7F3BEE70726F
#3 0x45078D in compute_forces_viscoelastic_ at compute_forces_viscoelastic.F90:465
#4 0x45572E in compute_forces_viscoelastic_main_ at compute_forces_viscoelastic_calling_routine.F90:67
#5 0x49A0D6 in iterate_time_ at iterate_time.F90:165
It does not seem to be a problem with an unstable time scheme, because
the CFL number and the suggested minimum time step both look OK. Thanks
a lot!
Tianze
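As a side note on the stability check mentioned above: the CFL condition that SPECFEM2D's suggested minimum time step reflects can be sketched independently. The function below is an illustrative version of the standard CFL bound, not SPECFEM's internal routine, and all names and numbers in it are hypothetical: the time step must not exceed a Courant constant times the smallest grid spacing divided by the highest wave speed in the model.

```python
def max_stable_dt(h_min, v_max, courant=0.5):
    """Illustrative CFL bound: dt_max = C * h_min / v_max.

    h_min   -- smallest grid spacing in the mesh (m)
    v_max   -- highest wave speed in the model (m/s)
    courant -- Courant constant; the safe value depends on the scheme
    """
    return courant * h_min / v_max

# Hypothetical example values: 100 m spacing, 5000 m/s P-wave speed.
dt_max = max_stable_dt(100.0, 5000.0)
print(dt_max)  # any dt at or below this value satisfies the bound
```

If this check passes, as the poster reports, an unstable time scheme is indeed an unlikely explanation for the crash.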