# Welcome to Jupyter Notebooks on the Intel DevCloud for oneAPI Projects! 
This document covers the basics of JupyterLab access to the Intel DevCloud for oneAPI Projects. It is not a tutorial on JupyterLab itself. Rather, we will run through a few examples of how to use the computational resources available on the DevCloud *beyond* the notebook.

The diagram below illustrates the high-level organization of the DevCloud. This tutorial explains how to navigate this organization. 

<img src="https://devcloud.intel.com/oneapi/static/images/svg/cluster-jn-organization.svg" style="max-width:600px;" />


## Service Terms

By using the Intel DevCloud for oneAPI Projects, you agree to the terms linked in the footer of the Intel DevCloud website: <br />
<a href="https://devcloud.intel.com/oneapi/">https://devcloud.intel.com/oneapi/</a>

## Table of Contents
1. [Notebook Basics](#sec-basics)
2. [Compute Power and Limits](#sec-limits)
3. [Job Queue](#sec-queue)
4. [Final Notes](#sec-final)

<a id="sec-basics"></a>
## 1. Notebook Basics

You can find detailed documentation on using the JupyterLab software at <a href="https://jupyter.org/">jupyter.org</a>. For this tutorial, you just need to know the following:
- When you see a cell like the one below (the line that begins with `!echo "Running..."`), this is code that you can run. 
- If you mouse-click on the cell, you will be able to edit the code. 
- While you are in the cell, press Ctrl+Enter, and the code in the cell will run.
- In the top-right corner of the page, the indicator <i class="kernel_idle_icon"></i> will change to <i class="kernel_busy_icon"></i>. This means that the kernel is busy. 
- If the code begins with "!", it will run in the Bash shell. Otherwise, it is treated as Python code.

Go ahead, click the cell below and then press Ctrl+Enter. You should see "Running..." and, a few seconds later, "...done!".

In [3]:
!echo "Running..."; sleep 3; echo "...done!"

Running...
...done!
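
For readers who wonder what the `!` prefix does, here is a rough Python counterpart. It is a sketch only, not a description of how JupyterLab implements the prefix: the helper name `run_shell` is hypothetical, and `subprocess` hands the command to the system's default shell (`/bin/sh` on Linux), which is not necessarily Bash.

```python
import subprocess


def run_shell(command: str) -> str:
    """Run a command line in the system shell and return its standard output."""
    completed = subprocess.run(
        command,
        shell=True,           # pass the whole string to the shell, as "!" does
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


if __name__ == "__main__":
    # Same command line as in the cell above, with a shorter pause.
    print(run_shell('echo "Running..."; sleep 1; echo "...done!"'), end="")
```

Unlike the `!` prefix, this helper returns the output only after the command has finished, so nothing appears while the command is still running.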


<a name="sec-limits"></a>
## 2. Compute Power and Limits
For the most part, the Notebook session on the DevCloud should be familiar to JupyterLab users. However, there are a few limitations that you should be aware of.

* ### Session Time
 Your JupyterLab session has a time limit. 
 If your session runs out of time, you can start a new one by refreshing the page or going to <a href="https://jupyter.oneapi.devcloud.intel.com">jupyter.oneapi.devcloud.intel.com</a> again. However, keep in mind that
 * The contents of the Notebook will not be automatically saved when the session time runs out. Save your work!
 * All running processes (the notebook itself, the kernels running in it, terminals) are terminated. If you want to run calculations that survive outside the notebook, use the <a href="#sec-queue">job queue</a> as described below.

* ### Number of Cores
 Your Notebook is running on a powerful computing server, but other people may be running Notebooks on the same server. They cannot access your files, but you do share the pool of the CPU cores. For heavy workloads (e.g., neural network training), you can get access to more computing power by submitting scripts to the <a href="#sec-queue">job queue</a> as discussed below.
 
* ### Amount of Memory
 Your Notebook is also sharing the computing server's operating memory with other tenants. If you need more memory for calculations, use the <a href="#sec-queue">job queue</a>.
 
Run the code in the cell below to query the limits of your JupyterLab environment.

In [4]:
!echo "* How many seconds are left in my JupyterLab session?"
!qstat -f $PBS_JOBID | grep Walltime.Remaining

!echo "* How many logical CPUs do I have for the Notebook?"
!taskset -c -p $$

!echo "* How much RAM can I use in the Notebook?"
!/usr/local/bin/qstat -f $PBS_JOBID | grep vmem

* How many seconds are left in my JupyterLab session?
    Walltime.Remaining = 12434
* How many logical CPUs do I have for the Notebook?
pid 2747249's current affinity list: 6-11,18-23
* How much RAM can I use in the Notebook?
    resources_used.vmem = 6815176kb
    Resource_List.vmem = 94gb
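
The raw values above are easier to read after a little post-processing. The sketch below uses two hypothetical helpers, `count_cpus` and `format_walltime`, to turn an affinity list into a CPU count and a number of seconds into hours, minutes and seconds. It assumes the affinity list contains only single CPU numbers and simple `first-last` ranges, as in the sample output.

```python
import os


def count_cpus(affinity_list: str) -> int:
    """Count the CPUs in a taskset-style list such as '6-11,18-23'."""
    total = 0
    for part in affinity_list.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            total += int(last) - int(first) + 1  # ranges are inclusive
        else:
            total += 1
    return total


def format_walltime(seconds: int) -> str:
    """Format a number of seconds as HH:MM:SS."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Values copied from the sample output above.
    print(count_cpus("6-11,18-23"))
    print(format_walltime(12434))
    # On Linux, Python can also report the CPUs available to the current process.
    if hasattr(os, "sched_getaffinity"):
        print(len(os.sched_getaffinity(0)))
```

For the sample output, the affinity list corresponds to 12 logical CPUs and the remaining wall time to 03:27:14.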


<a name="sec-queue"></a>
## 3. Job Queue
The job queue is the only method for accessing the full capacity of the computing resources available on the DevCloud. This section explains how you can interact with the queue from the JupyterLab environment. You can also submit to the queue from a terminal session. A more detailed guide on queue usage is available on the <a href="https://devcloud.intel.com/oneapi/learn/job-submission/">Intel DevCloud website</a>.


### Creating a Job Script
To submit a job to the queue, create a Bash script containing the commands that you want to run. 
You can do this from the Notebook using the `%%writefile` magic. The following example creates a job script called `hello-world-example`. The line `cd $PBS_O_WORKDIR` changes the working directory to the directory from which the job was submitted (here, the directory that contains the script). Everything else runs in the Bash shell on the designated compute server.

In [5]:
%%writefile hello-world-example
cd $PBS_O_WORKDIR
echo "* Hello world from compute server `hostname`!"
echo "* The current directory is ${PWD}."
echo "* Compute server's CPU model and number of logical CPUs:"
lscpu | grep 'Model name\|^CPU(s)'
echo "* Python available to us:"
which python
python --version
echo "* The job can create files, and they will be visible back in the Notebook." > newfile.txt
sleep 10
echo "*Bye"
# Remember to have an empty line at the end of the file; otherwise the last command will not run


Writing hello-world-example


You should now see the file `hello-world-example` when you go to the tree menu, or if you run the `%ls` magic.

In [6]:
%ls

[0m[01;34massets[0m/                            c1_c_python_benchmarking.ipynb
c1_a_intro_intel_devcloud.ipynb    c1_README.md
c1_b_python_multiprocessing.ipynb  hello-world-example


Note that only Bash job scripts are supported. If you need to run a Python application, add the corresponding Python launch line to the job script. For example:

    %%writefile my_job_script
    echo "Running myapplication.py"
    python myapplication.py
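
If you prefer to generate job scripts from Python rather than with `%%writefile`, a sketch along the following lines may help. The helper name `build_job_script` is hypothetical. It prepends the `cd $PBS_O_WORKDIR` line used in the `hello-world-example` script and always ends the text with a newline, so that the last command is not lost.

```python
from typing import Iterable


def build_job_script(commands: Iterable[str]) -> str:
    """Return the text of a Bash job script that runs the given commands."""
    lines = ["cd $PBS_O_WORKDIR"]  # start in the directory the job was submitted from
    lines.extend(commands)
    return "\n".join(lines) + "\n"  # trailing newline so the last command runs


if __name__ == "__main__":
    text = build_job_script([
        'echo "Running myapplication.py"',
        "python myapplication.py",
    ])
    print(text, end="")
```

The returned text can be saved with, for example, `pathlib.Path("my_job_script").write_text(text)` and then submitted like any other job script.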


### Submitting a Job to the Queue

Now you can submit this script as a job using the `qsub` command. Go ahead and execute the cell below:

In [7]:
!qsub hello-world-example

1901400.v-qsvr-1.aidevcloud


You have submitted a job to the queue. You should see an output line of the form `[number].[server name]`, such as `1901400.v-qsvr-1.aidevcloud` in the run shown here. 
The number at the front is the Job ID. We will be using this number to retrieve the output of the job.
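
If you want to handle the Job ID in Python, the numeric part can be split off the `qsub` output as sketched below. The names `JobId` and `parse_job_id` are hypothetical, and the code assumes a plain numeric ID followed by a dot and the server name, as in the output above.

```python
from typing import NamedTuple


class JobId(NamedTuple):
    number: str  # numeric part, used in the names of the output files
    server: str  # everything after the first dot


def parse_job_id(qsub_output: str) -> JobId:
    """Split a line such as '1901400.v-qsvr-1.aidevcloud' into its parts."""
    number, _, server = qsub_output.strip().partition(".")
    if not number.isdigit():
        raise ValueError(f"unexpected qsub output: {qsub_output!r}")
    return JobId(number, server)


if __name__ == "__main__":
    job = parse_job_id("1901400.v-qsvr-1.aidevcloud\n")
    print(job.number, job.server)
```

In a notebook cell, the output of a shell command can be captured with `out = !qsub hello-world-example`, after which `parse_job_id(out[0])` gives the parts of the ID.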

### Checking the Queue Status
Once the job has been placed in the queue, you can find the current status of the job by running the following command in a cell. 

In [8]:
!qstat

Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1901390.v-qsvr-1           ...ub-singleuser u134923         00:00:46 R jupyterhub     
1901400.v-qsvr-1           ...world-example u134923                0 R batch          


If you ran `qstat` soon enough, you will see a job with the name `...world-example`. In the column `S` you will see a letter indicating its status: "Q" is for "queued", "R" is for "running", and "E" is either an error, or a transition to a normal job completion. If you still see an entry for `...world-example`, keep re-running the above cell a few times until the "hello world" job completes and disappears from the list. If you don't see this entry, proceed to the next section to view the results of our job.
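
Instead of reading the table by eye, you can extract the status letters in Python. The sketch below is one way to do it. The function name `parse_qstat` is hypothetical, and the code assumes the six-column default layout shown above, with the status letter in the second-to-last column.

```python
STATE_MEANINGS = {"Q": "queued", "R": "running", "E": "error or finishing"}

SAMPLE = """\
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1901390.v-qsvr-1           ...ub-singleuser u134923         00:00:46 R jupyterhub
1901400.v-qsvr-1           ...world-example u134923                0 R batch
"""


def parse_qstat(text: str) -> dict:
    """Map each job ID in default `qstat` output to its one-letter status."""
    states = {}
    in_table = False
    for line in text.splitlines():
        if line.startswith("---"):
            in_table = True  # rows after the dashed line describe jobs
            continue
        fields = line.split()
        if in_table and len(fields) >= 6:
            states[fields[0]] = fields[-2]  # first column: ID; second to last: S
    return states


if __name__ == "__main__":
    for job_id, state in parse_qstat(SAMPLE).items():
        print(job_id, state, STATE_MEANINGS.get(state, "other"))
```

When a job ID no longer appears in the returned dictionary, the job has left the queue and its output files should be available.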

### Getting the result
Once the job is completed, the resulting output and error streams (stdout and stderr) are placed in two separate text files. These output files have the following naming convention: 

* stdout: [Job Name].o[Job ID].    Example: `hello-world-example.o12345`
* stderr: [Job Name].e[Job ID].    Example: `hello-world-example.e12345`

[Job Name] is either the script name, or a custom name — for example, the name specified by the `-N` parameter of `qsub`. 

[Job ID] is the number you got from the output of the `qsub` command. 
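
The naming convention is simple enough to express in a few lines of Python. The sketch below builds both file names and, optionally, polls until a file appears. The helper names `output_file_names` and `wait_for_file` are hypothetical, and the polling assumes that a file which exists is ready to be read, which is a simplification.

```python
import os
import time


def output_file_names(job_name: str, job_number: str) -> dict:
    """Return the expected stdout and stderr file names for a job."""
    return {
        "stdout": f"{job_name}.o{job_number}",
        "stderr": f"{job_name}.e{job_number}",
    }


def wait_for_file(path: str, timeout_s: float = 60.0, poll_s: float = 1.0) -> bool:
    """Poll until `path` exists; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    names = output_file_names("hello-world-example", "12345")
    print(names["stdout"], names["stderr"])
```

Here `12345` is the placeholder Job ID from the examples above; substitute the number printed by your own `qsub` call.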

Let's find the output file produced by the `hello-world-example` job by running the `%ls` magic again.

In [9]:
%ls hello-world-example*

hello-world-example  hello-world-example.e1901400  hello-world-example.o1901400


To view this file, go to File -> Open... and click on the `hello-world-example.o*` file. Alternatively, you can view the contents of the file inside JupyterLab using the `%cat` magic command. Run the cell below to view the result of the "hello world" job.

In [10]:
%cat hello-world-example.o*


########################################################################
#      Date:           Sun 08 May 2022 05:48:41 PM PDT
#    Job ID:           1901400.v-qsvr-1.aidevcloud
#      User:           u134923
# Resources:           neednodes=1:batch:ppn=2,nodes=1:batch:ppn=2,walltime=06:00:00
########################################################################

* Hello world from compute server s001-n009!
* The current directory is /home/u134923/csula-ee3445-final-project/c1_intro_parallel_computing.
* Compute server's CPU model and number of logical CPUs:
CPU(s):                          24
Model name:                      Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz
* Python available to us:
/glob/development-tools/versions/oneapi/2022.1.2/oneapi/intelpython/latest/bin/python
Python 3.9.7 :: Intel Corporation
 *Bye

########################################################################
# End of output for job 1901400.v-qsvr-1.aidevcloud
# Date: Sun 08 May 2022 05:48:56 PM PDT
###

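The error stream goes to the .e* files. With Python versions before 3.4, `python --version` writes to the error stream, so the version would appear there even though no error occurred. With the Python 3.9.7 used in this run, the version is written to standard output (as seen above), so the error file is empty: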

In [11]:
%cat hello-world-example.e*

Finally, any files created by the job will be visible after its completion. Useful for data processing tasks!

In [12]:
%cat newfile.txt

* The job can create files, and they will be visible back in the Notebook.


### Colfax Magic for Job Submission

We have created a custom cell magic command `%%qsub` to simplify job submission from Notebooks. 
The magic is defined as a part of the `cfxmagic` module. You can use it after you have imported the module. Run the cells below to see how it works.

In [13]:
%%writefile PythonDemo.py
# Creating an example Python application. 
# You can do it with the %%writefile magic like we are doing here,
# or you can go to the File menu, choose Open, and from there
# either upload your code or create a .py file and compose it in the Notebook
print ("Hello world from Python!")

Writing PythonDemo.py


In [14]:
import cfxmagic

In [15]:
%%qsub
cd $PBS_O_WORKDIR
python PythonDemo.py

1901401.v-qsvr-1.aidevcloud



This will submit the contents of the cell as a job named STDIN, without writing the script into a separate file. 

Wait a few moments and then view the output of the job by running the cell below:

In [16]:
%ls STDIN.*
%cat STDIN.o*

ls: cannot access 'STDIN.*': No such file or directory
cat: 'STDIN.o*': No such file or directory


### Job Parameters
Both the `!qsub <file>` command and the `%%qsub` magic can take a variety of parameters that you can set. For example, the following command requests a wall clock time limit of 24 hours and passes a command line argument equal to "13.2" to the job.

In [17]:
!qsub hello-world-example -l walltime=24:00:00 -F "13.2"

1901402.v-qsvr-1.aidevcloud
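
When you submit many jobs with different parameters, it can be convenient to assemble the `qsub` command line in Python. The sketch below does only that. The function name `build_qsub_command` is hypothetical, and it uses only the options mentioned in this document (`-N`, `-l walltime=...` and `-F`). It places the options before the script name, which is the conventional order for command-line tools.

```python
import shlex


def build_qsub_command(script, walltime=None, script_args=None, name=None):
    """Return the qsub command line as a list of arguments."""
    command = ["qsub"]
    if name:
        command += ["-N", name]                    # custom job name
    if walltime:
        command += ["-l", f"walltime={walltime}"]  # wall clock limit, HH:MM:SS
    if script_args:
        command += ["-F", script_args]             # arguments passed to the script
    command.append(script)
    return command


if __name__ == "__main__":
    cmd = build_qsub_command("hello-world-example",
                             walltime="24:00:00", script_args="13.2")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` on a system where `qsub` is available.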


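The limits of the `batch` queue, including the maximum wall clock time (`resources_max.walltime`), can be listed with `qmgr`:

In [18]: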
!qmgr -c 'p q batch'

#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch max_queuable = 5000
set queue batch max_user_queuable = 20
set queue batch max_running = 90
set queue batch resources_max.nodes = 2
set queue batch resources_max.walltime = 24:00:00
set queue batch resources_default.nodes = 1:batch:ppn=2
set queue batch resources_default.walltime = 06:00:00
set queue batch resources_available.nodect = 60
set queue batch max_user_run = 5
set queue batch enabled = True
set queue batch started = True
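
The `qmgr` listing has a regular `set queue <name> <attribute> = <value>` shape, so it can be turned into a dictionary with a few lines of Python. The following is a sketch with hypothetical helper names (`parse_queue_settings`, `walltime_to_seconds`); it handles only plain `=` assignments like the ones shown above.

```python
SAMPLE = """\
create queue batch
set queue batch queue_type = Execution
set queue batch resources_max.walltime = 24:00:00
set queue batch resources_default.walltime = 06:00:00
set queue batch max_user_run = 5
"""


def parse_queue_settings(text: str, queue: str = "batch") -> dict:
    """Collect 'set queue <queue> <attribute> = <value>' lines into a dict."""
    prefix = f"set queue {queue} "
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix) and " = " in line:
            key, _, value = line[len(prefix):].partition(" = ")
            settings[key.strip()] = value.strip()
    return settings


def walltime_to_seconds(walltime: str) -> int:
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    settings = parse_queue_settings(SAMPLE)
    print(settings["resources_max.walltime"],
          walltime_to_seconds(settings["resources_max.walltime"]))
```

All values are kept as strings; convert them as needed, as done here for the wall time.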


### Submitting Multiple Jobs

You can submit a number of jobs at once. If enough compute servers are available, all jobs will run simultaneously. Otherwise, they will stay in the queue waiting for their turn to run. 

When you submit a lot of jobs, be aware that the queue has a fair share-based scheduling policy, so the more you run, the more often your jobs will yield to other users' calculations.

You can learn about the pool of compute servers available for your jobs by running the commands below.

In [19]:
!echo "* How many compute servers are available?"
!pbsnodes | grep "^s" | wc -l

!echo "* How many of them are free?"
!pbsnodes | grep "state = free" | wc -l

!echo "* What are the time limits for queued jobs?"
!qmgr -c 'p q batch' | grep walltime

!echo "* What is the configuration of the available compute servers?"
!pbsnodes | grep properties | sort | uniq

* How many compute servers are available?
282
* How many of them are free?
190
* What are the time limits for queued jobs?
set queue batch resources_max.walltime = 24:00:00
set queue batch resources_default.walltime = 06:00:00
* What is the configuration of the available compute servers?
     properties = core,cfl,i9-10920x,ram32gb,net1gbe,gpu,iris_xe_max,gpu
     properties = xeon,cfl,e-2176g,ram64gb,net1gbe,gpu,gen9
     properties = xeon,clx,ram192gb,net1gbe,batch,extended,fpga,stratix10,fpga_runtime
     properties = xeon,icx,gold6348,ramgb,netgbe,jupyter,batch
     properties = xeon,icx,plat8358,ram256gb,net1gbe,batch
     properties = xeon,icx,plat8380,ram2tb,net1gbe,batch
     properties = xeon,skl,gold6128,ram192gb,net1gbe,fpga_runtime,fpga,arria10
     properties = xeon,skl,gold6128,ram192gb,net1gbe,jupyter,batch
     properties = xeon,skl,gold6128,ram192gb,net1gbe,jupyter,batch,fpga_compile
     properties = xeon,skl,ram384gb,net1gbe,renderkit
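
To pick out the server types that carry a particular property, the `properties` lines can be filtered in Python. The sketch below uses a shortened copy of the listing above; `parse_properties` and `with_tags` are hypothetical helper names.

```python
SAMPLE = """\
     properties = xeon,cfl,e-2176g,ram64gb,net1gbe,gpu,gen9
     properties = xeon,icx,plat8358,ram256gb,net1gbe,batch
     properties = xeon,skl,gold6128,ram192gb,net1gbe,jupyter,batch
     properties = xeon,skl,ram384gb,net1gbe,renderkit
"""


def parse_properties(text: str) -> list:
    """Return one list of property tags per 'properties = ...' line."""
    nodes = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "properties":
            nodes.append(value.strip().split(","))
    return nodes


def with_tags(nodes: list, *tags: str) -> list:
    """Keep only the entries that carry every one of the given tags."""
    return [props for props in nodes if all(tag in props for tag in tags)]


if __name__ == "__main__":
    for props in with_tags(parse_properties(SAMPLE), "batch"):
        print(",".join(props))
```

Tags are compared as whole words, so `fpga` would not match an entry that lists only `fpga_compile`.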



### Running the JupyterLab Code as a Job
JupyterLab sessions are designed for interactive computing, which is the opposite of what the job queue is designed for. However, if you structured your Jupyter Notebook code as a non-interactive Python application, you can submit your code to the queue. 

For example, suppose that your Notebook 
1. Imports some Python modules, 
2. Loads a dataset, 
3. Sets up a neural network,
4. Trains it, and 
5. Writes the resultant model weights into a file. 

This is the kind of workload that can benefit from access to a powerful compute server and does not require interactivity. Therefore, you can submit it to the job queue. 
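
To make the five steps concrete, here is a deliberately tiny, non-interactive script with the same structure. It is a sketch under simplifying assumptions: a two-parameter linear model fitted by gradient descent stands in for the neural network, the dataset is synthetic, and the output file name `model_weights.json` is hypothetical.

```python
import json  # step 1: import modules


def load_dataset():
    """Step 2: load a dataset (here, synthetic points on the line y = 2x + 1)."""
    xs = [0.1 * i for i in range(41)]
    ys = [2.0 * x + 1.0 for x in xs]
    return xs, ys


def train(xs, ys, steps=2000, learning_rate=0.05):
    """Steps 3 and 4: set up the model y = w*x + b and fit it by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n  # d(MSE)/dw
        grad_b = 2.0 * sum(errors) / n                             # d(MSE)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return {"w": w, "b": b}


def save_weights(weights, path):
    """Step 5: write the fitted parameters to a file."""
    with open(path, "w") as handle:
        json.dump(weights, handle)


def main(output_path="model_weights.json"):
    xs, ys = load_dataset()
    weights = train(xs, ys)
    save_weights(weights, output_path)
    print(f"Saved weights to {output_path}")


if __name__ == "__main__":
    main()
```

Because the script needs no input from the user and leaves its result in a file, it can run unattended on a compute server.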

You can dump the code of all cells in a Notebook into a Python script using the `jupyter` command shown below. Suppose that you have a Notebook saved in the file `mynotebook.ipynb`. The following cell converts the Notebook into a Python script `mynotebook.py`.

In [20]:
# Do not try to run this unless you already have a file called mynotebook.ipynb
# This is just an illustration.
!jupyter nbconvert --to script "mynotebook.ipynb"

This application is used to convert notebook files (*.ipynb)
        to various other formats.


Options
The options below are convenience aliases to configurable class-options,
as listed in the "Equivalent to" description-line of the aliases.
To see all configurable class-options for some <cmd>, use:
    <cmd> --help-all

--debug
    set log level to logging.DEBUG (maximize logging output)
    Equivalent to: [--Application.log_level=10]
--show-config
    Show the application's configuration (human-readable format)
    Equivalent to: [--Application.show_config=True]
--show-config-json
    Show the application's configuration (json format)
    Equivalent to: [--Application.show_config_json=True]
--generate-config
    generate default config file
    Equivalent to: [--JupyterApp.generate_config=True]
-y
    Answer yes to any questions instead of prompting.
    Equivalent to: [--JupyterApp.answer_yes=True]
--execute
    Execute the notebook prior to export.
    Equivalent to: [--ExecutePr

In [21]:
# This will view the resultant Python code mynotebook.py
!cat "mynotebook.py"

cat: mynotebook.py: No such file or directory
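
To see what such a conversion involves, recall that an `.ipynb` file is a JSON document with a list of cells. The sketch below collects the source of the code cells into one script. It is a simplified illustration, not a replacement for `jupyter nbconvert`: the function name `code_cells_to_script` is hypothetical, and lines that start with `!` or `%` are copied unchanged, so a notebook that uses them would not yield valid Python here.

```python
def code_cells_to_script(notebook: dict) -> str:
    """Concatenate the source of all code cells of a parsed notebook."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines or one string
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


if __name__ == "__main__":
    # A minimal in-memory notebook, used in place of a real .ipynb file.
    demo = {
        "cells": [
            {"cell_type": "markdown", "source": ["# Title\n"]},
            {"cell_type": "code", "source": ["import math\n"]},
            {"cell_type": "code", "source": "print(math.pi)"},
        ]
    }
    print(code_cells_to_script(demo), end="")
```

For a real file, load the notebook first with `json.load(open("mynotebook.ipynb"))` and pass the resulting dictionary to the function.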


The above commands should have created a Python script named `mynotebook.py`. You can submit this script to the queue as follows:

In [22]:
%%qsub
# Do not submit; this is an illustration.
cd $PBS_O_WORKDIR
python mynotebook.py

<a name="sec-final"></a>
## 4. Final Notes
This document covered some of the basics of using the JupyterLab environment on the DevCloud. 

JupyterLab is not the only way to access the DevCloud. You can also log in with an SSH client or a file transfer application based on the SSH protocol (e.g., WinSCP or FileZilla). This may be a more convenient access mode for advanced users who already have the code base developed, and who want to execute their code on powerful compute resources.
