# PyTorch Simple
This is my first attempt at running PyTorch on the UAHPC.
Modified from pytorch_gpus.py (stored in this directory).
Updated 11/17/2020.

## Outline
0. Workflow
1. Links and Resources
2. Setting up your workspace
3. Run Python Script

## 0. Workflow
This section documents how to run a PyTorch script from a Jupyter Notebook on UAHPC.

    1. Log on to a Jupyter Notebook Session from https://ood.hpc.arizona.edu

    2. Choose the number of hours and CPUs, then click 'Connect to Jupyter'. (Note that to use a GPU you will need to request more than one CPU.)
    
    3. Navigate to an existing Jupyter Notebook or create a new one

    4. Set up your Workspace: A note on modules, environments, and python packages
        * Jupyter Notebooks come with some pre-baked Python packages and will automatically load the Python 'module'
        * However, many 'custom' Python packages need to be installed manually
            * For more info, see the 'Setting up your workspace' section below
        * Be careful to set up virtual environments that respect
            1. The structure of the UAHPC file directory. (For me, I access packages here: "/home/u8/roberthull/mypyenv_gato/lib/python3.5/site-packages")
            2. The version of Python that you are running. (For me, python3.5)
            3. The supercomputer you run on. (For me, ElGato, but it could also be Ocelote.) It can be helpful to name your virtual environments descriptively so that you have one for each supercomputer

    5. Run your PyTorch Script

## 1. Links and Resources
1. An introduction to using Python on Ocelote and El Gato
https://public.confluence.arizona.edu/display/UAHPC/Using+and+Installing+Python
    * Scroll to the bottom to view Jupyter-specific Python Recommendations
2. An introduction to Jupyter Notebooks on UAHPC
https://public.confluence.arizona.edu/display/UAHPC/Jupyter+Notebook+-+Python
    * Generates a simple Jupyter Notebook script
3. Description of GUI / On Demand Services available through HPC
https://public.confluence.arizona.edu/display/UAHPC/Open+On+Demand

## 2. Setting up your workspace
### Set up your directory locations
You need to activate your environment (if you want to install custom packages), and you need to direct the system to look where the packages are installed.

In [None]:
# initialize directory locations
env_dir = "~/mypyenv_gato/bin/activate"
sys_dir = "/home/u8/roberthull/mypyenv_gato/lib/python3.5/site-packages"
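
Because the site-packages path depends on both the environment name and the Python version, it can be derived instead of typed by hand. The following is a minimal sketch, assuming the usual Linux virtualenv layout (`<env>/lib/pythonX.Y/site-packages`); the helper name `site_packages_for` is hypothetical and not part of the original notebook.

```python
import os
import sys


def site_packages_for(env_root, version_info=sys.version_info):
    """Return the site-packages directory of a virtualenv for a given Python version.

    Assumes the standard Linux layout: <env_root>/lib/pythonX.Y/site-packages
    """
    # e.g. (3, 5, ...) -> "python3.5"
    py_tag = "python{}.{}".format(version_info[0], version_info[1])
    return os.path.join(os.path.expanduser(env_root), "lib", py_tag, "site-packages")


# Hypothetical usage with the environment name used in this notebook
sys_dir = site_packages_for("~/mypyenv_gato")
print(sys_dir)
```

This keeps the path in step with the Python version of the running kernel, which is one of the things the workflow notes ask you to be careful about.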

### Install custom packages
Only do this once, i.e., if the packages already exist, don't install them again!

In [None]:
# %% install custom package (each call must be done on one line)
# # syntax
# # !source </path/to/virtualenv>/bin/activate && pip install <package> && pip show <package>

# # pycurl
# !source ~/mypyenv_gato/bin/activate && pip install pycurl && pip show pycurl

# # pytorch
# !source ~/mypyenv_gato/bin/activate && pip install torch && pip show torch

### Check to see if a package has already been installed
As in the previous cell, `pip show` reports the details of a package that is installed.

In [None]:
# pip show <module name>
# ex
pip show torch
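
The same check can be done from Python, which may be handy when deciding whether an install is needed at all. This is a rough sketch using the standard-library `importlib.util.find_spec`; the helper name `is_installed` is hypothetical. Note that it looks at the import name (which can differ from the pip package name) and at the kernel's current `sys.path`, so packages in your virtual environment will only be found once its site-packages directory has been added to `sys.path`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Report on the two custom packages used in this notebook
for name in ("torch", "pycurl"):
    status = "found" if is_installed(name) else "not found, install it first"
    print("{}: {}".format(name, status))
```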

### Ensure that the system knows to look for packages in your environment
Append the site-packages directory of your virtual environment (`sys_dir`, defined above) to `sys.path` so that the imports in the next section can find your custom packages.

In [None]:
# Check system Paths
import sys
# add environment directory
sys.path.append(sys_dir)
sys.path
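
Re-running the cell above appends the same directory again each time. A small guard avoids that and also catches a mistyped path. This is only a sketch; the helper name `add_site_packages` is an assumption for illustration.

```python
import os
import sys


def add_site_packages(path):
    """Append `path` to sys.path once; return True if it is now on sys.path."""
    path = os.path.expanduser(path)
    if not os.path.isdir(path):
        # A missing directory usually means a typo or the wrong cluster/environment
        print("Directory not found: {}".format(path))
        return False
    if path not in sys.path:
        sys.path.append(path)
    return True


# Path taken from this notebook; it will report "not found" on other systems
add_site_packages("/home/u8/roberthull/mypyenv_gato/lib/python3.5/site-packages")
```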

### Import Modules
'Nuff said. Make sure that any custom modules (those that don't come with Jupyter by default) have already been installed.

In [None]:
# %% Modules
# Import numpy, pycurl, torch, and time
import numpy as np
import pycurl
import torch as th
import time

## 3. Run Python Script
The script below checks whether a GPU is available (as reported by PyTorch) and, if so, times the same matrix multiplication on the CPU and on the GPU. Note that the GPU will only show as available if more than one CPU is being used in your session.

In [None]:
# %% First attempt (using th.cuda.is_available and .to())

if th.cuda.is_available():
  # Create tensors
  x = th.ones(1000, 1000)
  y = 2 * x + 3
  # Do the calculation on cpu (default)
  start_time = time.time()
  # Matrix multiplication (for benchmark purpose)
  results = th.mm(x, y)
  time_cpu = time.time() - start_time
  
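  # Do the same calculation but on the gpu
  # First move tensors to gpu
  x = x.to("cuda")
  y = y.to("cuda")
  # CUDA calls are asynchronous, so wait for pending work before and after timing
  th.cuda.synchronize()
  start_time = time.time()
  # Matrix multiplication (for benchmark purpose)
  results = th.mm(x, y)
  th.cuda.synchronize()
  time_gpu = time.time() - start_time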
  
  print("Time on CPU: {:.5f}s \t Time on GPU: {:.5f}s".format(time_cpu, time_gpu))
  print("Speed up: Computation was {:.0f}X faster on GPU!".format(time_cpu / time_gpu))
  
else:
  print("You need to enable GPU acceleration")
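
The benchmark above times a single call, so the result can be dominated by one-time start-up costs. A slightly more robust pattern is to do a warm-up run, average over several repeats, and wait for pending work before reading the clock. The sketch below shows that pattern with the standard library only; `time_call` and its parameters are hypothetical names, and the workload is a stand-in so the cell runs without a GPU.

```python
import time


def time_call(fn, sync=None, warmup=1, repeats=5):
    """Return the mean wall-clock seconds of `fn()` over `repeats` runs.

    `sync` is an optional zero-argument callable that blocks until pending
    work has finished (for CUDA work, th.cuda.synchronize is the usual choice).
    """
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-time start-up costs
    if sync is not None:
        sync()  # make sure warm-up work is done before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    return (time.perf_counter() - start) / repeats


# Stand-in workload so the sketch runs without a GPU
elapsed = time_call(lambda: sum(range(100000)))
print("Mean time per call: {:.5f}s".format(elapsed))
```

In this notebook it might be called as `time_call(lambda: th.mm(x, y))` for the CPU tensors and as `time_call(lambda: th.mm(x, y), sync=th.cuda.synchronize)` once `x` and `y` have been moved to the GPU.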