Change Log

22-JAN 2024
- Replaced anaconda3 with miniforge3 in job.psb, as NSCC no longer supports anaconda3.
18-OCT 2023
- Simplified the tutorial so that it uses only conda and no longer requires Singularity.
- Corrected the project ID for our group.
3-APR 2023
- Added instructions for using Anaconda3.
10-MAR 2023
- Revised job.psb to include examples of requesting more than one GPU.
3-MAR 2023
- Revised job.psb so that the container knows the GPU index. This prevents the bug AssertionError: Invalid device id when loading a checkpoint. The bug is caused by PBS Pro assigning an unrecognizable GPU index to the variable CUDA_VISIBLE_DEVICES, which in turn makes torch.cuda.is_available() return True while torch.cuda.device_count() returns 0. The revised job.psb assigns the GPU index manually instead.
1-MAR 2023
- Uploaded the correct container file (for option 2).

Table of Contents

Step 1: Preparation (In Local)
Step 2: Run (In NSCC)
Endnote: Useful commands in NSCC

Step 1: Preparation (In Local)

First, we need to prepare the Python environment for our code. Log in to your NSCC account and type

module avail

to list all the available modules. Find Miniforge3. (It is an open-source, conda-compatible alternative to Anaconda; the two are used in the same way.) It should look something like miniforge3/23.10. Load it by typing

module load miniforge3/23.10

Now you can use the conda command, and the rest works as usual: conda create your environment, then conda install or pip install your packages. This option is available on the new NSCC system. With it, you no longer need to deal with the messy Singularity container.
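To confirm that your code really runs inside the conda environment you created, a small check like the one below can help. It is only a sketch: the function name describe_environment is made up for illustration, and it relies on conda setting the CONDA_DEFAULT_ENV variable when an environment is activated.

```python
import os
import sys


def describe_environment():
    """Collect a few facts about the interpreter that is currently running."""
    return {
        # Name of the active conda environment; None if no environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Inside a conda environment, the prefix is the environment's directory.
        "prefix": sys.prefix,
        "executable": sys.executable,
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If conda_env prints None, or the prefix points outside your environment's directory, the environment was probably not activated before Python was started.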

Step 2: Run (In NSCC)

Return to Table of Contents

First, edit your job definition. See job.psb for details; its examples and comments cover everything.

Then, upload your dataset, code, and job.psb to NSCC. I always put job.psb and main.py in the same directory for convenience. Moreover, following NSCC's instructions, large files such as datasets should be stored in the ~/scratch directory.
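Since the dataset lives under ~/scratch while main.py sits next to job.psb, the code needs some way to locate the data. One possible sketch is shown below. The helper dataset_dir, the dataset name my_dataset, and the layout ~/scratch/<name> are all assumptions made for illustration, not something the repository prescribes.

```python
from pathlib import Path


def dataset_dir(name, scratch_root=None):
    """Return the directory of a dataset stored under the scratch area.

    The default layout (~/scratch/<name>) is an illustrative assumption.
    """
    root = Path(scratch_root) if scratch_root is not None else Path.home() / "scratch"
    return root / name


if __name__ == "__main__":
    path = dataset_dir("my_dataset")  # hypothetical dataset name
    print(path, "(exists)" if path.is_dir() else "(not found)")
```

Resolving the path from the home directory, rather than hard-coding an absolute path, keeps the same main.py usable by other members of the group.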

Finally, in the NSCC terminal, cd to the directory containing main.py and type

qsub job.psb

to submit your job. If your main.py needs arguments, and you have already edited your job.psb accordingly (see job.psb for an example), pass them with the -v flag as a comma-separated list:

qsub -v bs=32,e=100 job.psb

To pass a list, quote the space-separated values:

qsub -v bs=32,e=100,modal="visual audio" job.psb
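On the Python side, main.py has to accept these values. The sketch below shows one way to do it with argparse. It assumes that job.psb forwards the variables as command-line flags, for example python main.py --bs $bs --e $e --modal $modal; the flag names, the defaults, and the reading of bs as batch size and e as number of epochs are assumptions, so check the actual job.psb before copying this.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that job.psb is assumed to forward to main.py."""
    parser = argparse.ArgumentParser(description="Sketch of argument handling for main.py")
    parser.add_argument("--bs", type=int, default=32, help="batch size (assumed meaning)")
    parser.add_argument("--e", type=int, default=100, help="number of epochs (assumed meaning)")
    # nargs="+" collects one or more space-separated values into a list.
    parser.add_argument("--modal", nargs="+", default=["visual"], help="one or more modalities")
    return parser.parse_args(argv)


# Simulates what main.py would receive for: qsub -v bs=32,e=100,modal="visual audio" job.psb
args = parse_args(["--bs", "32", "--e", "100", "--modal", "visual", "audio"])
print(f"bs={args.bs} e={args.e} modal={args.modal}")
# prints: bs=32 e=100 modal=['visual', 'audio']
```

Note that for the list to arrive as two separate values, $modal would have to be left unquoted in the forwarding line, so that the shell splits "visual audio" on the space.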

Endnote: Useful commands in NSCC

Return to Table of Contents

qstat: shows the job IDs and status of your submitted jobs, but not the variables you passed to each job.
qstat -x -f: shows the full details of your recently submitted jobs, including any variables passed to them.
qdel <jobid>: kills a job.
qdel -W force <jobid>: force-kills a job; use this when a normal kill does not work.

:)

sucv/NSCC_tutorial_for_ASPIRE2
