## Python Practice notebook 
This notebook collects some of the practices I have used in the past when training networks in PyTorch.

### Creating subdirectories for your output
Often it is necessary to create a folder and subdirectories for your output. This is done with the Python package ```os```. You first need to import the package:
```import os```

Then write the following conditional statements:

```
folder_name = '<insert_folder_name_and_location>'
if not os.path.exists(folder_name):
    os.mkdir(folder_name)
if not os.path.exists(folder_name+'/saved_models'):
    os.mkdir(folder_name+'/saved_models')
if not os.path.exists(folder_name+'/recons'):
    os.mkdir(folder_name+'/recons')
if not os.path.exists(folder_name+'/plots'):
    os.mkdir(folder_name+'/plots')
```
 
The ```os.path.exists``` function checks whether the folder exists and returns ```True``` if it does, so ```if not os.path.exists(...)``` runs the indented line only when the folder is missing. The ```os.mkdir``` function then creates the directory.
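
The same directory layout can also be created more compactly with ```os.makedirs```, whose ```exist_ok=True``` argument removes the need for an explicit existence check. This is only a minimal sketch, not the approach used above; the helper name ```make_output_dirs``` is hypothetical.

```python
import os


def make_output_dirs(folder_name, subdirs=("saved_models", "recons", "plots")):
    """Create folder_name and its subdirectories; return their paths.

    make_output_dirs is a hypothetical helper, not part of any library.
    """
    paths = []
    for sub in subdirs:
        path = os.path.join(folder_name, sub)
        # makedirs also creates missing parent folders, and exist_ok=True
        # makes the call a no-op when the directory is already there.
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths
```

Because the call is safe to repeat, the helper can sit at the top of a training script and run on every launch.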


### Creating a GPU flag
By creating a GPU flag you will not have to change your code in order to run it on a CPU or a GPU. The flag detects which hardware is available and informs PyTorch. One major benefit is that it allows you to prototype on your personal computer and then run the same code on a GPU cluster.

You will need to import the torch package and create the device variable as follows:

```
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# device.type is the string "cuda" or "cpu"
if device.type == "cuda":
    print('Using GPU')
else:
    print('Using CPU')
```

The device variable is a ```torch.device``` object whose type is cuda or cpu, depending on which resource is available. Once device is defined, apply it to the network and the loss function with

```
net = FCNet().to(device)
content_criterion = nn.MSELoss().to(device)
```

The ```.to(device)``` operation moves an object to the selected device, in this case the loss function and the network. NOTE: You must also move each batch of inputs and targets to the device inside the training loop:

```
for i, data in enumerate(train_loader, 0):
    inputs, target = data
    # Variable is deprecated since PyTorch 0.4; tensors can be moved directly
    inputs = inputs.to(device)
    target = target.to(device)
```
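
Putting these pieces together, a device-agnostic training step might look like the sketch below. ```TinyNet``` and the random tensors are hypothetical stand-ins for ```FCNet``` and a batch from ```train_loader```, neither of which is defined in this notebook, and the SGD optimizer is an arbitrary choice.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyNet(nn.Module):
    """Hypothetical stand-in for FCNet."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc1(x)


net = TinyNet().to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

# Random tensors standing in for one batch from train_loader.
inputs = torch.randn(8, 4).to(device)
target = torch.randn(8, 2).to(device)

# One optimization step; nothing here depends on which device was chosen.
optimizer.zero_grad()
loss = criterion(net(inputs), target)
loss.backward()
optimizer.step()
```
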

### Creating a log file
It is often helpful to create a log file, particularly when you cannot see the evaluation of the loss function during training. It is also useful for comparing architectures afterwards. A simple text log file can be created using the following:

```
import datetime
now = datetime.datetime.now()
# str(now) contains spaces and colons; colons are not allowed in Windows file names
with open(folder_name + '/log' + str(now) + '.txt', 'a') as log:
    log.write('Epochs {}, batch size {}, learning rate:{:.6f}\n'.format(num_epochs, batch_size, learning_rate))
```
    
The now variable records when the log is created and is used to name the log. We give ```open``` the location and name of the log file; the ```'a'``` mode appends to the file and creates it if it does not exist yet. The first line of the log file records the number of epochs, the batch size and the learning rate. Take care to end each addition to the log with \n so that the next entry starts on a new line.

Now that the log has been created, it can be extended in exactly the same way: open the file in append mode and write to it.

One thing I find helpful is printing the GPU or CPU flag output to the log in order to make sure I am running it on the appropriate device. 
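
The logging steps above could be wrapped in two small helpers, sketched below. ```start_log``` and ```append_to_log``` are hypothetical names, and the ```strftime``` pattern is just one possible choice that keeps spaces and colons out of the file name.

```python
import datetime
import os


def start_log(folder_name, device, num_epochs, batch_size, learning_rate):
    """Create a timestamped log file and return its path.

    start_log is a hypothetical helper, not part of any library.
    """
    # Format the creation time without spaces or colons.
    stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    log_path = os.path.join(folder_name, "log_" + stamp + ".txt")
    with open(log_path, "a") as log:
        # Record the device first, to confirm where the run executed.
        log.write("Device: {}\n".format(device))
        log.write("Epochs {}, batch size {}, learning rate: {:.6f}\n".format(
            num_epochs, batch_size, learning_rate))
    return log_path


def append_to_log(log_path, epoch, loss):
    """Append one line with the epoch number and loss value."""
    with open(log_path, "a") as log:
        log.write("Epoch {}, loss {:.6f}\n".format(epoch, loss))
```

Calling ```append_to_log``` once per epoch gives a plain-text record of the loss that can be inspected while a cluster job is still running.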

### Saving the Models 

Often it is important to save the model during training. The important items to store, either to continue training or to perform inference, are the model parameters, the optimizer state, the loss, and so on.

PyTorch stores the parameters in a dictionary, which can be viewed with

```print("The state dict keys: \n\n", model.state_dict().keys())```

The state dict keys: 

 ```odict_keys(['fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias', 'fc4.weight', 'fc4.bias', 'fc5.weight', 'fc5.bias'])```

The state_dict holds only the parameter tensors, not the architecture, so at loading time we must rebuild the model exactly as it was during training. To resume training we also store some extra information alongside the state_dict. First create the dictionary 'checkpoint':

```
checkpoint = {'epoch': epoch,
              'model_state_dict': model.state_dict(),
              'optimizer_state_dict': optimizer.state_dict(),
              'loss': loss,
              # ... any other items you want to store
              }
```

To save the checkpoint we use
```torch.save(checkpoint, 'checkpoint.pth')```

It should be noted that you can save anything you want within the checkpoint dictionary but the most important for training later are the state_dict and the optimizer state_dict.

It is worth noting that it is convention to save the model with a ```.pth``` extension. I typically also include the epoch number in the name of the saved file.

```
epoch_save_autoencoder = folder_name+'/saved_models/autoencoder_Epoch_'+str(epoch)+'.pth'
torch.save(checkpoint,epoch_save_autoencoder)
```

### Loading the models
In order to load the model we first need to create the model and the optimizer, and then load their state dicts from the checkpoint:

```
model = TheModelClass(*args,**kwargs)

optimizer = TheOptimizerClass(*args, **kwargs)

checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss'] 


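# put dropout and batch normalization layers in evaluation mode for inference
model.eval()
# or keep them in training mode to resume training
model.train()
```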


https://pytorch.org/tutorials/beginner/saving_loading_models.html
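
As a rough end-to-end sketch of the save and load steps above, the example below round-trips a checkpoint through a temporary file. ```TinyNet```, the SGD optimizer and the file name are hypothetical choices. ```map_location``` is an argument of ```torch.load``` that remaps saved tensors to the given device, which allows a checkpoint written on a GPU to be loaded on a CPU-only machine.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model class."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc1(x)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyNet().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Save a checkpoint as if epoch 3 had just finished.
path = os.path.join(tempfile.mkdtemp(), "autoencoder_Epoch_3.pth")
torch.save({"epoch": 3,
            "model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict()}, path)

# Rebuild the same architecture and optimizer, then restore their state.
restored = TinyNet().to(device)
restored_optimizer = torch.optim.SGD(restored.parameters(), lr=0.01)
checkpoint = torch.load(path, map_location=device)
restored.load_state_dict(checkpoint["model_state_dict"])
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"] + 1  # resume from the following epoch
```
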

## Singularity Basics
There are newer versions of Singularity available. The IT assistants are best placed to help you with your installation, but there are a couple of basics you should know. For the most part, your mentor or the IT department should have working containers or recipes.

### Recipe 
A Singularity recipe is a text file that lists the steps used to build a container, including installing all the packages that you need.


```
Bootstrap: shub
From: singularityhub/ubuntu

%runscript
    exec echo "The runscript is the containers default runtime command!"

%files
   /home/vanessa/Desktop/hello-kitty.txt        # copied to root of container
   /home/vanessa/Desktop/party_dinosaur.gif     /opt/the-party-dino.gif #

%environment
    VARIABLE=MEATBALLVALUE
    export VARIABLE

%labels
   AUTHOR vsochat@stanford.edu

%post
    apt-get update && apt-get -y install python3 git wget
    mkdir /data
    echo "The post section is where you can install, and configure your container."
```

To build the recipe (saved here in a file named Singularity) use

```sudo singularity build ubuntu.simg Singularity```


### Interacting with Singularity containers

There are two ways to interact with a container: you either execute a command in the container or operate it in a shell.

Within a shell you access the inside of a container like a small virtual machine. On a cluster you would first use ```srun``` to start an interactive session and then run

```
singularity shell hello-world.simg
Singularity: Invoking an interactive shell within container...
```

The alternative is to execute a command. This is more appropriate when you send a job to the cluster. This would be a line in your bash submission script that reads

```singularity exec hello-world.simg python3 python_script.py```