# <font color=blue>_Week 1 - Introduction to KLC_</font>

This notebook provides basic shell commands to use on the Kellogg Linux Cluster (KLC).

For your reference, we provide some additional Linux commands on the Research Support website here: 
https://www.kellogg.northwestern.edu/research-support/computing/kellogg-linux-cluster/linux-tips.aspx

For week 1, this notebook primarily serves as a reference for topics covered during the Zoom session.


## Cloning a GitHub Repo

To follow along with each subsequent workshop, you will need to set up a GitHub account at https://github.com/. 

From a terminal window, type the following in KLC to clone the repo for this workshop:
```bash
$ git clone https://github.com/rs-kellogg/empirical_workshop_2021 
```
Git may then prompt you for your username and password. The prompt needs an interactive terminal, so please run this command outside of this notebook.

To make changes to the contents, please fork the original GitHub repo and clone your fork on KLC instead:


```bash
$ git clone https://github.com/<username>/empirical_workshop_2021 
```

## Changing Directories

A Jupyter notebook enables you to run Python commands on KLC. To run a shell command, type "!" in front of the line. Each "!" line runs in its own temporary shell, however, so a directory change made with `!cd` does not carry over to later lines; to change the notebook's working directory, use the `%cd` magic instead (as shown below).

By default, you are in your home directory on KLC. Please navigate to the folder for Week 1 by typing:

In [None]:
# change directories
%cd <directory name here>
# go to your home directory
%cd ~
# go one folder up
%cd ..

In [None]:
# change directories to the week 1 folder
%cd /home/<net ID>/empirical_workshop_2021/1_reproducibility_KLC_Intro

## See Working Directory

You can see the present working directory you are in with the following:

In [None]:
!pwd

## List Directory Contents

To list the contents of this folder, type:

In [None]:
!ls
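
The navigation commands above have direct counterparts in plain Python, which can be convenient inside a notebook. The following is a minimal sketch, assuming a scratch directory created with `tempfile` stands in for the workshop folder; the helper name `describe_directory` is hypothetical, and the file name `time.py` is reused only for illustration.

```python
import os
import tempfile
from pathlib import Path


def describe_directory(path):
    """Change into `path`, then return (working directory, sorted contents)."""
    os.chdir(path)                                       # like `cd <path>`
    here = Path.cwd()                                    # like `pwd`
    contents = sorted(p.name for p in here.iterdir())    # like `ls`
    return here, contents


start = Path.cwd()  # remember where we started
with tempfile.TemporaryDirectory() as scratch:
    # Create one empty file so the listing has something to show.
    Path(scratch, "time.py").touch()
    here, contents = describe_directory(scratch)
    print(here)
    print(contents)
    os.chdir(start)  # leave the scratch directory before it is deleted
```

Unlike a `!cd` line, `os.chdir` changes the working directory of the notebook's own Python process, so the change persists across cells until you change it again.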

# Using Python on KLC

No modules are preloaded on KLC. You will have to load anything you'd like to use.

To see which versions of Python are available, type the following outside of this notebook:

```bash
$ module avail python 
```

To run python outside of this notebook, please load it.

```bash
$ module load python/anaconda3.6
```

You can view or make basic changes to any file with the nano editor.  For instance, view the time.py file by typing:
```bash
$ nano time.py
```

Thereafter, you can run a python file by typing the word "python" before the file's name:

```bash
$ python time.py
```

To stop a running script, press CTRL + C.

Today, we will use a basic file that returns the time every 20 seconds:

In [None]:
# libraries used
import time

# print the time after every 20 seconds
for i in range(10000):
    print("The time is now: " + time.strftime("%X") + ". Time flies when you are on KLC.")
    time.sleep(20)
    
# In this notebook, you can stop this code by pressing the 
# black square above to interrupt the kernel.
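
In Python, both CTRL + C in a terminal and interrupting the notebook kernel raise a `KeyboardInterrupt` inside the running loop. The sketch below shows one way a script like the one above could catch that exception and exit cleanly. It is an illustrative variation rather than the workshop's time.py: the function name `report_time` and its parameters are hypothetical, and the loop is kept short so the example finishes quickly.

```python
import time


def report_time(iterations, interval_seconds, sleep=time.sleep):
    """Print the time `iterations` times and return how many lines were printed."""
    printed = 0
    try:
        for _ in range(iterations):
            print("The time is now: " + time.strftime("%X"))
            printed += 1
            sleep(interval_seconds)
    except KeyboardInterrupt:
        # CTRL + C in a terminal, or interrupting the kernel, lands here.
        print("Stopped after", printed, "lines.")
    return printed


# Short run: three lines with no pause between them.
report_time(iterations=3, interval_seconds=0)
```

Passing the sleep function as an argument is a design choice for this sketch only; it lets the pause be replaced when trying the function out.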


# Continue a Process after Logging Off

To continue running processes after logging off KLC, you can launch a job through the FastX web browser. The web browser will open a session (Terminal, Stata, R, SAS, etc.) in a new tab. If you close the tab without terminating the job, said process will continue to run after you log off KLC.

Likewise, you can use the _nohup_ command in a Terminal session. The _nohup_ command runs the program given as its argument while ignoring hangup signals (such as the one sent when you sign off KLC); the trailing `&` runs it in the background. To use _nohup_, type:

```bash
$ nohup <command> &
```

By default, the output of this command is saved to a _nohup.out_ file in the current directory. You can see the jobs you have running on a given node by typing:

```bash
$ ps -U <netID>
```

This command returns a list of processes and their corresponding IDs.  To terminate a _nohup_ job, type:

```bash
$ kill <Process_ID>
```
More information on _nohup_ is provided here: https://linuxize.com/post/linux-nohup-command/
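
A comparable effect can be sketched from Python itself. The example below is an illustration rather than a description of what _nohup_ does internally: it starts a command in its own session with `subprocess.Popen(..., start_new_session=True)` (POSIX only) and sends its output to a log file. The helper name `launch_detached` and the log file name `job.out` are hypothetical, and whether such a process survives a logoff on KLC is an assumption you should verify.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def launch_detached(command, log_path):
    """Start `command` in its own session, appending stdout/stderr to `log_path`."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,    # the job does not read from the terminal
            stdout=log,                  # plays the role of nohup.out
            stderr=subprocess.STDOUT,
            start_new_session=True,      # new session, separate from the terminal's
        )
    finally:
        log.close()  # the child keeps its own copy of the file descriptor


# Demonstration with a very short job; job.out is a hypothetical file name.
log_path = Path(tempfile.mkdtemp()) / "job.out"
job = launch_detached([sys.executable, "-c", "print('job finished')"], log_path)
job.wait()  # waiting is only for the demonstration; a real job would keep running
print(log_path.read_text())
```

The returned object exposes `job.pid`, which is the same process ID that `ps` reports and `kill` accepts.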

Lastly, you can also use the _screen_ command from a Terminal session by typing:

```bash
$ screen
```

This opens a new session on the node. Pressing CTRL + A, then D detaches from the screen session and leaves it running; `screen -r <ID>` reattaches to it. You can list your screen sessions with:

```bash
$ screen -ls
```

This provides a list of screen sessions with IDs. To terminate a screen session, type:

```bash
$ screen -X -S <ID> quit
```

More information on _screen_ is available here: https://linuxize.com/post/how-to-use-linux-screen/






# Version Control - Commit Changes Back to GitHub

In the nano editor, we can change the interval in the time.py file to 10 seconds.

```bash
$ nano time.py
```

In [None]:
# libraries used
import time

# print the time after every 10 seconds
for i in range(10000):
    print("The time is now: " + time.strftime("%X") + ". Time flies when you are on KLC.")
    time.sleep(10)

You can save the changes you make to files in your GitHub repo by typing:

```bash
$ git add time.py
$ git commit -m "Every 10 seconds."
$ git push
```

Finally, you can check the status of your repo with the following:

```bash
$ git status
```

# Using R on KLC

Much like Python, you will need to load R to use it on KLC.

```bash
$ module load R/4.0.0 
```

We will look at a file that produces basic regression output and returns a time series figure. To view the file, type:

```bash
$ nano swiss.R
```

To run an R script on KLC, type:

```bash
$ Rscript swiss.R
```

You can also launch the RStudio GUI for R with the following:

```bash
$ module load R/4.0.0
$ rstudio
```

# Transferring Files with Cyberduck

To install Cyberduck, go to: https://cyberduck.io/download/ 

To access KLC from Cyberduck:
 - Select Open Connection
 - Select SFTP
 - Enter the server: klc01.ci.northwestern.edu 
 - Type your Net ID
 - Type your Net ID password

# Checking the status of processes on KLC

You can see the processes running on a KLC node, along with their CPU and memory use, by typing the following (press q to exit):

```bash
$ top
```

To list your processes, type:

```bash
$ ps -U <your net ID>
```

This will provide a list of processes and their IDs.  To stop a process by ID, type:

```bash
$ kill <process ID>
```
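
The list-then-stop workflow above can also be scripted. The following is a minimal sketch, assuming a Linux node where `ps` accepts `-o pid= -o comm=` to print bare process IDs and command names; the helper names `parse_ps`, `list_processes`, and `stop_process` are hypothetical. Plain `kill` sends SIGTERM, which `os.kill` with `signal.SIGTERM` mirrors here, and the demonstration only ever stops a child process that it started itself.

```python
import os
import signal
import subprocess
import sys


def parse_ps(output):
    """Turn `ps -o pid= -o comm=` output into a list of (pid, command) pairs."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            rows.append((int(parts[0]), parts[1].strip()))
    return rows


def list_processes(user):
    """Run ps for one user (for example, your Net ID) and parse the result."""
    result = subprocess.run(
        ["ps", "-U", user, "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


def stop_process(pid, force=False):
    """Send SIGTERM like plain `kill`, or SIGKILL like `kill -9` when force=True."""
    os.kill(pid, signal.SIGKILL if force else signal.SIGTERM)


# Demonstration: start a sleeping child process, then stop it by ID.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; print('ready', flush=True); time.sleep(60)"],
    stdout=subprocess.PIPE,
)
child.stdout.readline()   # wait until the child reports that it is running
stop_process(child.pid)
child.wait(timeout=10)
child.stdout.close()
print(child.returncode)   # negative signal number when stopped by a signal
```

Trying SIGTERM first gives the process a chance to clean up; `force=True` is best kept for processes that do not respond.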