# What is Lambda Server?

The faculty HPC server cluster is composed of a gateway server, lambda, into which you log in with SSH, and five compute nodes lambda1-5 which run the actual computations. The gateway server is relatively weak and has no attached GPUs, so it should not be used for running computations.

## Connect to Technion Network - VPN

Connect to the Technion VPN as usual.

## Connect to Lambda Server

Connect using a simple SSH command:
    
**ssh -X user_name@lambda.cs.technion.ac.il**
    
Notes:
1. The username is your @campus.technion.ac.il username.
2. The password is the same as the one for that account.
3. If the host name is not recognized, you can use the IP address instead: 132.68.39.159.

The recommended GUI app is Bitvise SSH Client.

Installation and setup guide: https://cswp.cs.technion.ac.il/bitvise-ssh-client-installation-setup/

In [2]:
from IPython.display import HTML
HTML('<img src="bitvise_client.png" alt="Your Image" style="width:500px;height:300px;">')

## Install required packages

### Install Miniconda:
cd ~
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 
sh Miniconda3-latest-Linux-x86_64.sh

### Create a link to the conda binary
ln miniconda3/bin/conda .local/bin/conda

### Create and activate the conda environment:
conda create --name tf23-gpu python=3.8
conda activate tf23-gpu

If you get a "conda init must be [...] try restarting your terminal" or a similar message while running this line, you need to restart your terminal. If you're working with Bitvise, just close and reopen the terminal window; otherwise, reconnect to lambda. After that, try running the line again.

## Running Script

For example, we want to run the following script:

In [None]:
# script begin

from numba import cuda
import numpy

@cuda.jit
def my_kernel(io_array):
    # Thread id in a 1D block
    tx = cuda.threadIdx.x
    # Block id in a 1D grid
    bx = cuda.blockIdx.x
    # Block width, i.e. number of threads per block
    bw = cuda.blockDim.x
    # Compute flattened index inside the array
    pos = tx + bx * bw
    if pos < io_array.size:  # Check array boundaries
        io_array[pos] *= 2 # do the computation


# Create the data array - usually initialized some other way
data = numpy.ones(256)

data_send = cuda.to_device(data)

# Set the number of threads in a block
threadsperblock = 32

# Calculate the number of thread blocks in the grid
blockspergrid = (data.size + (threadsperblock - 1)) // threadsperblock

# Now start the kernel
my_kernel[blockspergrid, threadsperblock](data_send)

data = data_send.copy_to_host()

# Print the result
print(data)

# script end
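
The grid-size arithmetic in the script can be checked on the CPU, without a GPU or Numba. The sketch below is plain Python and only emulates the index computation `pos = tx + bx * bw`; the helper names `blocks_per_grid` and `covered_positions` are hypothetical and introduced here purely for illustration. It shows that the ceiling division yields enough blocks even when the array size is not a multiple of the block width, and that the bounds check keeps every write in range.

```python
def blocks_per_grid(n_items, threads_per_block):
    """Ceiling division: smallest number of blocks that covers n_items."""
    return (n_items + (threads_per_block - 1)) // threads_per_block


def covered_positions(n_items, threads_per_block):
    """Emulate the kernel's index computation on the CPU.

    Returns the sorted list of in-bounds positions that the
    (block, thread) pairs would write to.
    """
    positions = []
    for bx in range(blocks_per_grid(n_items, threads_per_block)):
        for tx in range(threads_per_block):
            pos = tx + bx * threads_per_block
            if pos < n_items:  # same bounds check as in my_kernel
                positions.append(pos)
    return sorted(positions)


if __name__ == "__main__":
    print(blocks_per_grid(256, 32))  # 256 elements, 32 threads per block
    print(blocks_per_grid(250, 32))  # size not a multiple of the block width
    print(covered_positions(250, 32) == list(range(250)))
```

Both sizes need 8 blocks; with 250 elements the last block has 6 idle threads, which is why the kernel checks `pos < io_array.size` before writing.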

**Make sure the env is active:**

conda activate tf23-gpu
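
One way to confirm this from inside Python is to read the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated. This is a minimal sketch, not part of the course setup: the helper name `active_conda_env` is hypothetical, and the expected name `tf23-gpu` is taken from the environment created earlier in this guide.

```python
import os
import sys


def active_conda_env():
    """Return the name of the active conda env, or None if none is active.

    conda sets the CONDA_DEFAULT_ENV environment variable on activation.
    """
    return os.environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    expected = "tf23-gpu"  # env name created earlier in this guide
    name = active_conda_env()
    if name == expected:
        print(f"OK: running in '{name}' ({sys.executable})")
    else:
        print(f"Active env is {name!r}, expected {expected!r}; "
              f"run: conda activate {expected}")
```
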

### Run the script:

srun --gres=gpu:1 -c 2 --pty python3 script.py

--gres=gpu:1 - number of GPUs to allocate (here 1)
-c 2 - number of CPU cores to allocate (here 2)
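
If you launch jobs with different resource requests, it can help to assemble the command programmatically. The sketch below builds the same `srun` line shown above from a GPU count and a core count; the function `build_srun_command` and its parameter names are hypothetical, and it only prints the command rather than running it.

```python
import shlex


def build_srun_command(script, gpus=1, cores=2):
    """Assemble the srun invocation used in this guide as an argument list."""
    return [
        "srun",
        f"--gres=gpu:{gpus}",  # number of GPUs to allocate
        "-c", str(cores),      # number of CPU cores to allocate
        "--pty",               # run the task in pseudo-terminal mode
        "python3", script,
    ]


if __name__ == "__main__":
    # Prints: srun --gres=gpu:1 -c 2 --pty python3 script.py
    print(shlex.join(build_srun_command("script.py")))
```

Returning a list rather than a single string means the result could also be passed to `subprocess.run` directly, without shell quoting concerns.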

More info about different ways to run on Lambda server can be found in the following link:

https://vistalab-technion.github.io/cs236781/assignments/hpc-servers