# 1. Introduction
---

_Python_ has become arguably the most widely used language for developing deep learning algorithms and models. These models define a means of recognizing patterns within data, similar to how we believe humans do. They allow computers to parse vast quantities of data and build representations in order to understand a scene, or even enable advanced capabilities in robotics, such as performing complex pick-and-place tasks with highly variable objects.

The ability of deep learning to work with large amounts of data, however, is also a significant burden. In order to build suitable representations and abilities for understanding, gigabytes upon gigabytes of data are often needed for some of the more complex scenarios such as [ImageNet](http://www.image-net.org/). These models often contain hundreds of thousands, or even millions, of parameters. Processing inputs until a model converges can take an extremely long time. Thankfully, both hardware and software have advanced to make development a more manageable task.

## 1.1. Deep Learning Frameworks
---

On the software side, a number of frameworks have been built on the Python language to support the development of deep learning models. Key to the popularity of these frameworks is how they allow us to interface with hardware accelerators such as GPUs for many common math operations. Popular frameworks include (but are not limited to):

* [TensorFlow](https://www.tensorflow.org/) -- Google
* [PyTorch](https://pytorch.org/) -- Facebook
* [Keras](https://keras.io/) -- François Chollet (Google)
* [Chainer](https://chainer.org/) -- Preferred Networks (Japanese Company)
* [MXNet](https://mxnet.apache.org/) -- Apache
* [Neon](https://ai.intel.com/neon/) -- Nervana (acquired by Intel)

## 1.2. Python Package Managers
---

Many deep learning frameworks are built to interface with common Python libraries such as _numpy_ or _scikit-learn_. In addition to these two, there are myriad others that one may choose to use during development to make the process a little easier.
_Package managers_ such as [Anaconda](https://conda.io/docs/) are used to simplify the process of maintaining an environment or workspace. These environments are often associated with a specific project and Python distribution. Consider the following:

* You are working on a project that requires Python=2.7
* You are working on another project in parallel that requires Python=3.6

By using Anaconda to manage these environments, you can seamlessly switch between tasks without interfering with the global Python installation. __It is good practice to always create a new environment whenever you start a new project__. This ensures that the library dependencies of an existing project don't get contaminated or changed when you want to try something new.

# 2. Install the Anaconda Package Manager
---

To use the Conda package manager on the Compute Canada resources, we'll use a distribution called [miniconda](https://conda.io/miniconda.html). Miniconda is a minimal installer that includes only conda, Python, and their dependencies, which lets us individually install each of the packages we'll be using.


## 2.1. Task: Downloading Miniconda3
---

Log on to the _graham_ cluster with your Compute Canada username and password:

```shell
ssh graham.computecanada.ca
```

Download the miniconda installer using the [Wget](https://www.gnu.org/software/wget/manual/wget.html) utility:

```shell
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
```
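Before running the installer, it can be worth confirming that the download completed. The following is a minimal sketch, assuming the `sha256sum` utility is available on the login node; `check_installer` is a hypothetical helper name, not part of Miniconda. It fails on a missing or empty file and otherwise prints a SHA-256 digest that you can compare by eye against a published hash, if one is available for your installer version.

```shell
# check_installer is a hypothetical helper, not part of Miniconda.
check_installer() {
    installer="$1"
    # -s is true only when the file exists and is not empty.
    if [ ! -s "$installer" ]; then
        echo "missing or empty: $installer" >&2
        return 1
    fi
    # Print the SHA-256 digest so it can be compared against a published value.
    sha256sum "$installer"
}

check_installer Miniconda3-latest-Linux-x86_64.sh || echo "Download the installer again with wget."
```

A matching digest only shows that the file arrived intact; it does not replace reading the installer's prompts carefully.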

## 2.2. Task: Install Miniconda3
---

On the command line, enter the following:

```shell
bash Miniconda3-latest-Linux-x86_64.sh
```

* __TIP:__ In Linux shells such as bash, there is a form of "auto completion" that is triggered by pressing the &lt;Tab&gt; key. Try typing the prefix "Mini" and pressing &lt;Tab&gt; to see it in action. If many files in the current directory share the same prefix, you can view your choices by pressing &lt;Tab&gt; several times.

Once the miniconda installer has started to run, press &lt;Enter&gt; until you get to the screen that requires you to accept the terms and conditions. Enter "yes" and press &lt;Enter&gt;, press &lt;Enter&gt; again (when prompted) to accept the default installation directory, and then &lt;Enter&gt; once more to prepend the Miniconda install location to the PATH in your .bashrc file. The [.bashrc](https://unix.stackexchange.com/questions/129143/what-is-the-purpose-of-bashrc-and-how-does-it-work) file contains commands for setting up your bash environment, and is automatically run every time you log in to the Compute Canada nodes. By choosing this option, miniconda added the following lines to the bottom of the file:

```shell
# added by Miniconda3 installer
export PATH="/home/mveres/miniconda3/bin:$PATH"
```

This line places miniconda's executables at the front of your PATH so that the shell finds them first, and is akin to adding environment variables in the Windows ecosystem. Verify the installation was successful by typing ```which python``` in the terminal, which should display the following:

```shell
mveres@gra-login2:~$ which python
~/miniconda3/bin/python
```

If it doesn't display this, you may need to refresh your bash environment. On the command line, type:

```shell
source ~/.bashrc
```

and try typing "which python" again.
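To see why the position of a directory in PATH matters, the following sketch reproduces the effect with a throwaway directory instead of the real miniconda3 folder. The fake `python` script and the variable names (`demo_dir`, `old_path`, `resolved`) are illustrative assumptions; nothing here touches your actual installation.

```shell
# Build a throwaway directory containing a fake "python" executable.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "fake python"\n' > "$demo_dir/python"
chmod +x "$demo_dir/python"

# Prepend the directory, as the installer's export line does with miniconda3/bin.
old_path="$PATH"
export PATH="$demo_dir:$PATH"

# The shell searches PATH from left to right and stops at the first match.
resolved=$(command -v python)
echo "$resolved"

# Restore the original PATH and clean up.
export PATH="$old_path"
rm -r "$demo_dir"
```

Because the lookup stops at the first match, the prepended copy shadows any system-wide `python` without replacing it.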

### 2.2.1. Using the Conda Module
---

Within the Compute Canada HPC environment, there are objects called _modules_. These modules allow one to quickly change their environment -- by swapping out different code compilers, package managers, etc. Whenever you want to use a module, type "module load" followed by the module's name. To load the miniconda3 module, for example, you would type the following:

```shell
module load miniconda3
```

If you type ```module list``` on the command line, you should see "miniconda3/4.x.xx" in the returned list. This command also shows you the other modules currently loaded in your session. If you're using C++ and need a particular compiler, for example, you can swap the loaded compiler module for another one.
Whenever you want to use a different package manager, you can _unload_ the module as follows:

```shell
module unload miniconda3
```
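The load and list commands can be combined so that a module is only loaded when it is missing, which is handy in scripts that may run more than once. This is a minimal sketch: `ensure_module` is a hypothetical helper name, and it assumes that `module list` may print to standard error, so both output streams are searched.

```shell
# ensure_module is a hypothetical helper, not part of the module system.
ensure_module() {
    name="$1"
    # Merge stderr into stdout, since "module list" may write to stderr.
    if module list 2>&1 | grep -q "$name"; then
        echo "$name already loaded"
    else
        module load "$name" && echo "$name loaded"
    fi
}

# Only call the helper where the module command exists (for example, on the cluster).
if command -v module >/dev/null 2>&1; then
    ensure_module miniconda3
fi
```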

# 3. Task: Create a Python Virtual Environment
---

For this workshop, we're going to be using Python 3.6 and the PyTorch 0.4 library. If you haven't already loaded the miniconda module, load it:

```shell
module load miniconda3
```

Then create a virtual environment called "pytorch4":

```shell
conda create -n pytorch4 python=3.6
```

Type ```y``` when prompted and press &lt;Enter&gt;. This may take several minutes to complete as it will install a number of base packages required for use. 
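Once the command finishes, you can confirm that the environment exists by searching the output of `conda env list`. The sketch below assumes the usual layout of that output, with the environment name in the first column; `env_exists` is a hypothetical helper name.

```shell
# env_exists is a hypothetical helper built on "conda env list".
env_exists() {
    # Match the requested name against the first column of each line.
    conda env list | awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only run the check where conda is on the PATH.
if command -v conda >/dev/null 2>&1; then
    if env_exists pytorch4; then
        echo "pytorch4 is ready"
    else
        echo "pytorch4 not found; re-run conda create"
    fi
fi
```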


## 3.1. Starting the Virtual Environment
---

Whenever you want to use this python environment, type the following on the command line:

```shell
source activate pytorch4
```
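When an environment is activated, conda records its name in the `CONDA_DEFAULT_ENV` variable, which a script can inspect before doing any work. The following is a minimal sketch of such a guard; `require_env` is a hypothetical helper name.

```shell
# require_env is a hypothetical helper; CONDA_DEFAULT_ENV is set by conda on activation.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        echo "environment $1 is active"
    else
        echo "environment $1 is not active; run: source activate $1" >&2
        return 1
    fi
}

# Report the status without aborting the surrounding script.
require_env pytorch4 || true
```

Placing a check like this at the top of a job script helps avoid accidentally running against the wrong Python installation.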

## 3.2. Closing the Virtual Environment
---

Whenever you are done using the environment (for example you wish to switch to a different one), type:

```shell
source deactivate
```

# 4. Task: Install PyTorch -- A Deep Learning Framework
---

Thus far, we have installed a Python _package manager_ and created an environment within it called _pytorch4_, which we will populate with the packages and libraries needed to run the workshop. Within this environment, we will also install the PyTorch deep learning framework. First, activate the environment:

```shell
source activate pytorch4
```

Install the latest PyTorch framework:

```shell
conda install pytorch torchvision -c pytorch
```
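A quick way to confirm the installation is to import the package from the environment's Python and print its version. The sketch below wraps this in a hypothetical helper named `report_torch`; it also prints the result of `torch.cuda.is_available()`, which may be `False` on a node without a GPU even when the installation is fine.

```shell
# report_torch is a hypothetical helper name.
report_torch() {
    if python -c "import torch" >/dev/null 2>&1; then
        # Version string, then whether a CUDA-capable GPU is visible to PyTorch.
        python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
    else
        echo "torch is not importable; is the pytorch4 environment active?" >&2
        return 1
    fi
}

# Report the status without aborting the surrounding script.
report_torch || true
```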

# 5. Task: Install Helper Libraries
---

In the next notebook, we'll take a look at training a neural network within the HPC environment. To do so, we'll make use of two additional libraries: _tqdm_ for monitoring training progress, and _matplotlib_ for visualizing results. With the _pytorch4_ environment active, install them:

```shell
conda install tqdm
conda install matplotlib
```
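After installing, you can check that both libraries are importable from the active environment. This is a minimal sketch; `missing_modules` is a hypothetical helper name that prints every module the current `python` cannot import.

```shell
# missing_modules is a hypothetical helper: prints each name the active python cannot import.
missing_modules() {
    for mod in "$@"; do
        python -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}

missing=$(missing_modules tqdm matplotlib)
if [ -z "$missing" ]; then
    echo "all helper libraries are importable"
else
    # Left unquoted on purpose so the names print on a single line.
    echo "still missing:" $missing
fi
```

If a library is reported as missing, check that the _pytorch4_ environment is active before re-running the install commands.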