# Git + Bash 101
In this badge, we will go through the absolute basics of the bash command line (Linux, macOS, WSL) and git, with zero prior experience needed. By the end of the course, you should be able to navigate a computer purely through the command line, connect to servers, and track your progress using git and GitHub.

## Outline
1. Concepts
    - server == computer
    - everything is made up of files


2. Bash basics

    2.1 structure of commands (\<command\> \<flags\> \<args\>)
    
    2.2 navigating files (ls, cd, ln -s, dir) -> where things are (home, root, cp, mv, etc)
    
    2.3 manipulating files (cat, nano/vim, head/tail, touch, rm)
    
    2.4 Installing stuff
    
    2.5 Python Scripting


3. Advanced Bash

    3.1 SSH/SFTP
    
    3.2 Permissions (chmod and sudo)
    
    3.3 Background Processes


4. Git Concepts TBD
    
    
5. Git Basics TBD 


6. Git Specials TBD

## 1. Command Line Concepts
The 'terminal' is the command-line window into the inner workings of a computer. From your computer's terminal, you can do most of the things you normally do and much more: edit and manipulate files, open files, organize folders, run programs, etc. 

Just as your local computer has an operating system, files, folders, and programs, so do servers. This means that whatever you can do on your own computer, you can also do on a remote computer that you are connected to through your terminal!

Another thing to know is that everything is made up of files. Even complex programs are made up of many text files of various formats, interpreted by other programs, which are themselves made up of files. When you're doing a computational biology project, you will have to handle a _boatload_ of files. There are files to control configurations, python files that contain your code, binary files you won't be able to edit, sequence files you _can_ edit, PDFs, etc. 

#### Terminal vs Bash vs Shell?
These terms have come up already, and you will see them a lot when looking through tutorials and articles online. So what are they exactly? 

A shell is a program that reads the commands you type, runs them, and displays their output. A terminal, on the other hand, is the program (the window) that hosts a shell. 'bash' is one of the most common shells; others include zsh, csh, and ksh. Therefore, a user opens a _terminal_ to access a _shell_, usually _bash_, and types in commands to do stuff.

## 2. Bash Basics
We start with the basic commands you will have to use multiple times a day, ideally without even thinking about it. These are for file navigation and manipulation, so they do the same thing as you waving your mouse around a bunch of files and folders, including copying, pasting, creating shortcuts, etc.

First of all, let's try to open a bash terminal. See the Jupyter 101 badge to see how to do it. But in essence:
1. If you're in jupyter**lab**, click on the big blue `+` button on the top left, then select `$_ Terminal`.
2. If you're in jupyter **notebooks**, go back to the tree view (where you can see all your files), click on `new`, then click on `terminal`.

You can type all the commands into the terminal you just opened. But, for the sake of this course, we will use jupyter (this notebook) as the interface to the shell. All you need is the `!` prefix, which tells the notebook to run the rest of the line as a shell command.

In [1]:
# Example command
!pwd

/home/jovyan/compbio-badges-dev/badge-git_bash


You should see the current working directory printed out above.

### 2.1 Commands and Their Structure
A very typical line in bash involves a command, flags (options), and arguments. Here are some example combinations:

| Example Command | Structure | Function | 
| --- | --- | --- |
| `ls` | `<command>` | lists files in the current directory |
| `pwd` | `<command>` | prints the working directory (your current location) |
| `ls -a` | `<command> <flag>` | lists all files in directory, including hidden ones |
| `ls -l -a` | `<command> <flag> <flag>` | lists all files in directory, including hidden ones, with details |
| `ls -la` | `<command> <2 flags>` | same as above, with the two flags combined |
| `cat myfile.txt` | `<command> <arg>` | prints the text inside myfile.txt |
| `cat --help` | `<command> <flag>` | prints the help text for the command 'cat' |
| `mv myfile.txt yourfile.txt` | `<command> <arg> <arg>` | renames myfile.txt to yourfile.txt |
| `samtools view -o aln.bam aln.sam.gz` | `<command> <subcommand> <flag> <flag argument> <required argument>` | reads aln.sam.gz and saves the output as aln.bam |
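
To see the command / flag / argument pattern from the table in action, here is a minimal sketch. It assumes a throwaway directory made with `mktemp -d`, and the file names `notes.txt` and `.secret` are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Create one normal file and one hidden file (its name starts with a dot).
touch notes.txt .secret

# <command>: hidden files are not listed.
plain_listing=$(ls)
echo "ls     -> $plain_listing"

# <command> <flag>: -a also lists hidden entries (plus . and ..).
all_listing=$(ls -a)
echo "ls -a  ->" $all_listing

# <command> <flags> <arg>: two combined flags plus a path argument.
ls -la "$demo_dir"
```

The same command behaves differently depending on the flags you add, while the argument tells it what to act on.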





### 2.2 Navigation

In [2]:
# First we want to get rid of having to use "!". You have to run this cell but don't have to know what it does.
%automagic 1


Automagic is ON, % prefix IS NOT needed for line magics.


When you open your terminal (or jupyter), it is 'located' in a certain directory, called the 'working directory'. The `pwd` command ("print working directory") shows where that is.

In [3]:
pwd

'/home/jovyan/compbio-badges-dev/badge-git_bash'

This 'location' is just a pointer, kind of like opening a specific folder visually. You can do stuff to the files in this folder too, such as view all files in this directory:

In [4]:
ls

git_bash-101.ipynb  supermarket/


The highlighted items (shown with a trailing `/`) are directories, and the plain ones are files. You can enter directories and open files.

You can also change your location. `cd` ("change directory") moves you into a specific folder, for example one inside your current directory:

In [5]:
cd supermarket

/home/jovyan/compbio-badges-dev/badge-git_bash/supermarket


Or move 'up' a directory using `..` (for reference, `.` is the current directory):

In [6]:
cd ..

/home/jovyan/compbio-badges-dev/badge-git_bash


You can also jump to different locations, such as your 'home' using `~`:

In [7]:
cd ~

/home/jovyan


Or to any other folder, by giving its path:

In [8]:
cd ~/compbio-badges-dev/badge-git_bash/supermarket

/home/jovyan/compbio-badges-dev/badge-git_bash/supermarket
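
The `cd` variants above can be tried safely in a scratch location. This is only a sketch: it builds its own small `supermarket/aisle_1` tree under a temporary directory (an assumption, so it does not depend on the notebook's folders) and prints where it ends up after each move.

```shell
# Build a small folder tree in a temporary location
# (cd + pwd gives a clean absolute path to it).
base=$(cd "$(mktemp -d)" && pwd)
mkdir -p "$base/supermarket/aisle_1"

cd "$base/supermarket"      # absolute path: works from anywhere
echo "now in: $(pwd)"

cd aisle_1                  # relative path: a folder inside the current one
inside=$(pwd)
echo "now in: $inside"

cd ..                       # go up one level
up_one=$(pwd)
echo "now in: $up_one"

cd ~                        # jump to your home directory
echo "now in: $(pwd)"

cd "$base/supermarket/aisle_1"
cd ../..                    # '..' can be chained to go up two levels
up_two=$(pwd)
echo "now in: $up_two"
```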


In [9]:
ls

[0m[01;34maisle_1[0m/  [01;34maisle_2[0m/


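We can then do many things! For example, copy the apples.txt file to aisle 2 with `cp <source> <destination>`. If the source file does not exist at that path, `cp` stops with a `cannot stat` error like the one below: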

In [10]:
cp aisle_1/apples.txt aisle_2/

cp: cannot stat 'aisle_1/apples.txt': No such file or directory


In [11]:
ls aisle_1 && ls aisle_2

[0m[01;36maisle_2[0m@  bananas.txt
apples.txt  potato_chips.txt  sodas.txt


Then we can delete files with `rm`:

In [12]:
rm aisle_2/apples.txt

In [13]:
ls aisle_1 && ls aisle_2

[0m[01;36maisle_2[0m@  bananas.txt
potato_chips.txt  sodas.txt


We can also move files altogether, or rename them (both with `mv <source> <destination>`). As with `cp`, a missing source file gives a `cannot stat` error:

In [14]:
mv aisle_1/apples.txt aisle_2/

mv: cannot stat 'aisle_1/apples.txt': No such file or directory


In [15]:
ls aisle_1 && ls aisle_2

[0m[01;36maisle_2[0m@  bananas.txt
potato_chips.txt  sodas.txt
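
For a run of `cp`, `rm`, and `mv` that starts from a known state, the sketch below recreates a tiny supermarket in a temporary directory. The folder layout mirrors the one above, but the file contents and the new name `green_apples.txt` are made up.

```shell
# Recreate a small supermarket in a temporary directory.
shop=$(mktemp -d)
mkdir "$shop/aisle_1" "$shop/aisle_2"
echo "apple 1" > "$shop/aisle_1/apples.txt"
cd "$shop"

# cp <source> <destination>: the original stays where it was.
cp aisle_1/apples.txt aisle_2/
ls aisle_1 aisle_2

# rm <file>: removes the copy (there is no undo).
rm aisle_2/apples.txt

# mv <source> <destination>: moves the file...
mv aisle_1/apples.txt aisle_2/
# ...and the same command renames it when the destination is a new name.
mv aisle_2/apples.txt aisle_2/green_apples.txt

aisle_1_after=$(ls aisle_1)
aisle_2_after=$(ls aisle_2)
echo "aisle_1: [$aisle_1_after]  aisle_2: [$aisle_2_after]"
```

After the last step, `aisle_1` is empty and `aisle_2` holds the renamed file.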


We can also create shortcuts (symbolic links, made with `ln -s <target> <link location>`) to go from one folder to another. In this case, we want to reach aisle_2 from inside aisle_1:

In [16]:
!ln -s /home/jovyan/compbio-badges-dev/badge-git_bash/supermarket/aisle_2 ~/compbio-badges-dev/badge-git_bash/supermarket/aisle_1

ln: failed to create symbolic link '/home/jovyan/compbio-badges-dev/badge-git_bash/supermarket/aisle_1/aisle_2': File exists


In [17]:
ls && pwd

[0m[01;34maisle_1[0m/  [01;34maisle_2[0m/
/home/jovyan/compbio-badges-dev/badge-git_bash/supermarket


Going to aisle_2 through aisle_1!

In [21]:
cd aisle_1/aisle_2/ 

[Errno 2] No such file or directory: 'aisle_1/aisle_2/'
/home/jovyan/compbio-badges-dev/badge-git_bash


In [19]:
cd ..

/home/jovyan/compbio-badges-dev/badge-git_bash
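
The symbolic-link step is easier to follow from a clean slate. The sketch below is an illustration in a temporary directory, with assumed folder and file names; it creates the link, walks through it, and shows one way around the `File exists` error when the link is already there.

```shell
shop=$(cd "$(mktemp -d)" && pwd)
mkdir "$shop/aisle_1" "$shop/aisle_2"
echo "soda 1" > "$shop/aisle_2/sodas.txt"

# ln -s <target> <link>: the link is created at the second path.
ln -s "$shop/aisle_2" "$shop/aisle_1/aisle_2"

# ls -l marks a link with a leading 'l' and shows where it points.
ls -l "$shop/aisle_1"

# Walking through the link reaches the real folder's contents.
cd "$shop/aisle_1/aisle_2"
via_link=$(cat sodas.txt)
echo "read through the link: $via_link"

# Re-running plain ln -s would fail with 'File exists'.
# -f replaces the existing link; -n stops ln from following it into aisle_2.
ln -sfn "$shop/aisle_2" "$shop/aisle_1/aisle_2"
```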


### 2.3 Manipulating Files

We can use `cat` to simply view files. However, be careful since this prints out _everything_ in the file. If your file is 1000 lines long, it will print out all 1000 lines. 

In [28]:
cat aisle_2/sodas.txt

soda 1
soda 2
soda 3
soda 4
soda 5
soda 6
soda 7
soda 8
soda 9
soda 10
soda 11
soda 12
soda 13
soda 14
soda 15
soda 16
soda 17
soda 18
soda 19


Instead, we can use `head` to view the first 10 lines, or `tail` to view the last 10 lines. If you want to view a different number of lines, pass the number as a flag: `-n <number>`, or the shorthand `-<number>` used below.

In [31]:
!head aisle_2/sodas.txt

soda 1
soda 2
soda 3
soda 4
soda 5
soda 6
soda 7
soda 8
soda 9
soda 10


In [32]:
!head -4 aisle_2/sodas.txt

soda 1
soda 2
soda 3
soda 4


In [33]:
!tail aisle_2/sodas.txt

soda 10
soda 11
soda 12
soda 13
soda 14
soda 15
soda 16
soda 17
soda 18
soda 19


In [35]:
!tail -2 aisle_2/sodas.txt

soda 18
soda 19
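
Here is a self-contained sketch of the same idea using the explicit `-n` form. It generates its own 19-line file in a temporary directory (an assumption, so it does not rely on the notebook's `sodas.txt`) and also uses `wc -l`, a command not covered above, to count lines before printing anything.

```shell
# Build a 19-line file like sodas.txt in a temporary directory.
sodas="$(mktemp -d)/sodas.txt"
i=1
while [ "$i" -le 19 ]; do
  echo "soda $i"
  i=$((i + 1))
done > "$sodas"

first_three=$(head -n 3 "$sodas")   # explicit line count with -n
last_two=$(tail -n 2 "$sodas")
line_count=$(wc -l < "$sodas")      # count lines before deciding to cat

echo "$first_three"
echo "..."
echo "$last_two"
echo "total lines:" $line_count
```

Checking the line count first is a cheap way to avoid flooding the screen with `cat` on a huge file.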


Once you can view files, you can also create, edit, and delete them. 
- To create a file, use `touch <file>`.
- To edit a file, use a built-in text editor like `vim <file>` or `nano <file>`
- To delete a file, use `rm <file>`. There is no trash can or undo, so double-check before deleting. Deleting has advanced uses too, such as deleting empty dirs (`rmdir <dir>`), deleting full dirs (`rm -r <dir>`), selectively deleting files, etc.
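
The create / delete commands from the list can be tried without risk in a temporary directory. This is a sketch with made-up file names; it writes the file contents with `printf` and a redirect because an interactive editor like `nano` cannot be scripted here.

```shell
work=$(mktemp -d)
cd "$work"

touch shopping_list.txt                     # creates an empty file
printf 'milk\neggs\n' > shopping_list.txt   # fills it without opening an editor
cat shopping_list.txt

mkdir empty_dir full_dir
touch full_dir/receipt.txt

rm shopping_list.txt     # delete a single file
rmdir empty_dir          # delete an empty directory
rm -r full_dir           # delete a directory and everything inside it

remaining=$(ls -A)
echo "left over: [$remaining]"
```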

In [37]:
ls

[0m[01;34maisle_1[0m/  [01;34maisle_2[0m/


### 2.4 Installing Stuff

There are two types of things you will most likely have to install: python packages and bash programs. 

**Python packages**

For python packages, it's fairly straightforward. You can use either `pip install <package_name>`, which ships with most Python installations, or `conda install <package_name>`. Most packages are available through both installers, but some aren't. Just check the installation instructions of your specific package.

*Note: The conda command is not built-in, so you'd have to install it first. However, conda is nice in that it also lets you make 'environments', each with its own set of installed packages. This way, you can have different environments for different projects that require their own unique set of packages.

**Bash Programs**

For bash programs, there are more ways of doing it. Debian/Ubuntu-based systems come with a few installers: `apt`, `apt-get`, and `dpkg`. We are only going to use `apt install <program_name>`. 

*Note on sudo: Sometimes, when we need administrative privileges, we prepend `sudo` to the command. It doesn't work well in datahub but you can use it on your local terminal on your own time.

Sometimes, you need to install things from files you download online. For example, Miniconda (a lightweight version of conda). Here is how you would install it:

First, you have to download the file with `wget <file_link>`:

`wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`


Then, you run the script (.sh means shell script) using `bash <script.sh>`: 

`bash Miniconda3-latest-Linux-x86_64.sh`
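
Before installing anything, it can help to check what is already available. The sketch below does that with `command -v`, then shows the `bash <script.sh>` step using a tiny stand-in script instead of the real Miniconda installer. The script name `fake_installer.sh` and its message are hypothetical, and nothing is downloaded or installed.

```shell
# Report which installers and download tools are already on this machine.
for tool in pip conda apt wget; do
  if command -v "$tool" > /dev/null 2>&1; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not installed"
  fi
done

# A stand-in for a downloaded installer: a tiny, hypothetical shell script.
script="$(mktemp -d)/fake_installer.sh"
echo 'echo "pretending to install into $1"' > "$script"

# bash <script.sh> <args> runs it; nothing is actually installed here.
installer_output=$(bash "$script" "$HOME/miniconda3")
echo "$installer_output"
```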

### 2.5 Python Scripting

Speaking of scripts, one thing we will do a lot is write python files, then run them. You may be familiar with python notebooks (.ipynb), but python also has its original script format (.py). These scripts are run with `python <script.py>`. The cool thing is that a python script run this way can take flags/options and arguments, just like a bash command!

In [42]:
# Let's create a small python script.
# This is a quick way of writing into a file: `echo` is the bash equivalent of a print statement,
# and ">" redirects the output from the terminal into the file instead.
!echo "print('Items: cereal')" > basket.py

In [44]:
# Running the script
!python basket.py

Items: cereal


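We can also read command-line arguments with `sys`, a built-in python module: `sys.argv[0]` is the script name and `sys.argv[1]` is the first argument.

*note: "\\" (backslash) at the end of a line is used in both python and bash to continue a long command on the next line, so the two lines are treated as one.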

*note: ">>" is _append_, whereas ">" is (over)_write_ 

In [72]:
!echo 'import sys' > basket.py
!echo 'print(f"Owner of basket: {sys.argv[1]}")' >> basket.py

In [73]:
!python basket.py Ivan

Owner of basket: Ivan
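
The two notes about `>` versus `>>` and the trailing backslash are easy to verify side by side. A minimal sketch, writing to a made-up `basket.txt` in a temporary directory:

```shell
out="$(mktemp -d)/basket.txt"

echo "cereal" > "$out"      # '>' creates the file (or overwrites it)
echo "milk"  >> "$out"      # '>>' appends a new line
echo "bread" >> "$out"
after_append=$(cat "$out")

echo "eggs" > "$out"        # '>' again: the three earlier lines are gone
after_overwrite=$(cat "$out")

# A trailing backslash continues one command on the next line.
joined=$(echo "one" \
              "two" \
              "three")

echo "$after_append"
echo "--"
echo "$after_overwrite"
echo "$joined"
```

Mixing up `>` and `>>` is a common way to lose a results file, so it is worth trying both on a scratch file first.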


## 3. Advanced Bash

### 3.1 SSH/SFTP
SSH is the protocol used to connect your terminal to a remote server/computer's terminal. You typically do this with `ssh <username>@<ip_address> -p <port_number>` (the port defaults to 22 if you leave `-p` out).

The `ip_address` can also be a domain, depending on the server. For example, for Savio, we connect to `<user>@hpc.brc.berkeley.edu`. The port number can also be changed depending on your/their needs.

SFTP (SSH File Transfer Protocol, often called Secure File Transfer Protocol), on the other hand, is geared towards file transfers rather than running scripts and commands on the remote server.

With both SSH and SFTP, you can use a third-party client to view/manipulate/download/upload to/from the remote server's directory (WinSCP, Filezilla, Cyberduck, etc). You can also connect VSCode, a code editor, directly to a remote server through SSH. This way, you can directly edit files on the server, as well as do everything you'd normally do on your local computer, remotely, such as running scripts, running commands, debugging, github stuff, etc.
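
As a rough sketch of what these commands look like, the block below assembles them as text and prints them instead of connecting. The username `myuser` and the file `results.txt` are hypothetical, and `scp` (a copy command that runs over SSH) is not covered above; it is included only for comparison.

```shell
# Hypothetical login details; replace them with your own.
user="myuser"
host="hpc.brc.berkeley.edu"
port=22

# Interactive login. ssh takes the port with a lowercase -p.
ssh_cmd="ssh -p $port $user@$host"

# File transfer. sftp and scp take the port with an uppercase -P.
sftp_cmd="sftp -P $port $user@$host"
scp_cmd="scp -P $port results.txt $user@$host:~/"

# Only print the commands; nothing connects when this block runs.
printf '%s\n' "$ssh_cmd" "$sftp_cmd" "$scp_cmd"
```

The lowercase/uppercase difference in the port flag is an easy detail to trip over when switching between the tools.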

### 3.2 Permissions
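With permissions, the concepts are fairly straightforward but the implementation can be tricky. Every file and folder has a set of things that can be done to it (read, write, and execute), granted separately to three classes of users: the owning user, the owning group, and everyone else (public). To change them, use:

`chmod [-R] <mode> <file | directory>`

Here `-R` applies the change recursively to everything inside a directory. You can then view the permissions by using `ls -l <file/folder>` and looking at the long string in the left column (e.g. `-rwxr-xr--`).

The `<mode>` can be symbolic, such as `u+x` (add execute for the owning user) or `o-w` (remove write for others), or a three-digit octal number. Each digit is the sum of read (4), write (2), and execute (1), given in the order user, group, others. It ranges from `000` (strictest: nobody can read, write or execute) to `777` (everyone can read, write and execute). For example, `640` means read+write for the user, read-only for the group, and nothing for everyone else.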
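
To see how modes map onto the permission string from `ls -l`, here is a small sketch on a scratch file (the file name `script.sh` is made up). It sets an octal mode, then a symbolic one, and prints the first ten characters of the `ls -l` line each time.

```shell
f="$(mktemp -d)/script.sh"
echo 'echo hello' > "$f"

# Octal mode: 6 = read+write (4+2) for the user, 4 = read for the group, 0 = nothing for others.
chmod 640 "$f"
perms_640=$(ls -l "$f" | cut -c1-10)
echo "after chmod 640: $perms_640"

# Symbolic mode: add execute permission for the owning user only.
chmod u+x "$f"
perms_ux=$(ls -l "$f" | cut -c1-10)
echo "after chmod u+x: $perms_ux"

# 755 is a common choice for scripts: the user can edit, everyone can read and run.
chmod 755 "$f"
perms_755=$(ls -l "$f" | cut -c1-10)
echo "after chmod 755: $perms_755"
```

Reading the string left to right: the first character is the file type, followed by three `rwx` triplets for user, group, and others.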

**Ownership**
Similarly, every file/folder is owned by some user AND some group (usually some default group containing all the users). Changing the owners changes which permissions apply to which people. For example, suppose a file has permission RWX for its user owner, but RW- for everyone else. If you change the owner from person A to person B, then person A loses execute privileges, whereas person B gains them. A similar story applies to groups.

To change the user owner, use `sudo chown <user> <file/dir>`, and to change the group owner, use `sudo chgrp <group> <file/dir>` or `sudo chown <user(opt)>:<group> <file/dir>`.
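
Changing owners usually needs `sudo`, but looking at them does not. The sketch below, on a made-up scratch file, only inspects ownership: it compares the numeric owner shown by `ls -ln` with your own user ID from `id -u`, and leaves the actual `chown` as a comment.

```shell
f="$(mktemp -d)/owned.txt"
touch "$f"

# ls -l shows the owning user and group in the third and fourth columns.
ls -l "$f"

# id reports who you are; a file you create is owned by you.
echo "I am: $(id -un) (uid $(id -u)), primary group: $(id -gn)"

# ls -ln prints numeric IDs, which are easy to compare with id -u.
file_uid=$(ls -ln "$f" | awk '{print $3}')
my_uid=$(id -u)
echo "file uid: $file_uid, my uid: $my_uid"

# Changing the owner to someone else needs sudo, so it is only shown here:
#   sudo chown <user>:<group> "$f"
```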

If you want to learn more, visit https://ss64.com/bash/syntax-permissions.html and https://linuxcommand.org/lc3_lts0090.php

### 3.3 Background jobs

When you type in `python myscript.py` or `bash myscript.sh`, it runs in the terminal you're on, and is terminated whenever you close that terminal. On local devices, this is fine, but when working on remote servers, you are unlikely to stay connected for multiple days, since you have to close/sleep/shut down your computer. With the script running in the foreground, closing your SSH connection will terminate it. So, how do you solve this? Ignore the hangup signal!

1. `nohup` ("no hangup") is a command that lets you do just that. Prepending a command with `nohup` makes that process ignore the 'hangup' signal (SIGHUP) sent when the terminal (or ssh connection) closes. For example, `nohup python myscript.py` will keep running even when the terminal is closed.
2. `&`, when appended to a command, puts it in the background. But by itself it doesn't prevent the process from being killed when the terminal closes. E.g. `python myscript.py &` sends it to the background, so you can use the terminal for other things, although its printouts will still show up in that terminal unless you redirect them.
3. Both! Using both is typically the best option. You can just type `nohup python myscript.py &` and your script will run in the background, while you go about your day as the process finishes by itself. 

Some useful tips:
- Since you can't see the printouts when using nohup + &, nohup appends them to a file called `nohup.out`, normally in the directory you ran the command from. If you want the output saved in a file with a name of your choosing, use the ">" operator, like `nohup python myscript.py > my_printouts.txt &`.
- The method above works for most commands, but python buffers its printouts when they go to a file, so they may only show up in large chunks or when the process finishes (this is called buffering). To solve this, we want python to print stuff out 'unbuffered', using the flag `-u`. E.g. `nohup python -u myscript.py > my_printouts.txt &`.
- Now python is writing your printouts line by line to your output file. You can certainly `cat` it, but it doesn't change live. What do you do to monitor it in real time then? Introducing: `tail -F <file>`! You can view the end of the file (your printouts) live as it updates in real time!
- I have access to the printouts, but I don't want to monitor them constantly. How do I know if my job's running or if it was killed? You can view the jobs that are running/recently ran by typing in `jobs` without any args. Note that `jobs` only knows about processes started from the current shell session; after logging out and back in, look for your process with `ps -u <username>` instead.
- Okay, so I want to cancel my job. How do I do that? First, make sure to note down the number printed when you started the process with `&` (it looks like `[1] 12345`). The second number, typically 4-6 digits, is its process ID ('pid'). To stop the process, type in `kill <pid>`; if it refuses to stop, `kill -9 <pid>` forces it. That's it.
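
The whole workflow (start in the background, log to a file, check on it, cancel it) can be rehearsed with a harmless stand-in job. In the sketch below, a short `sh -c` loop plays the role of `python -u myscript.py` (an assumption, so the example runs without any python script), and `$!` is used to capture the pid rather than copying it from the screen.

```shell
log="$(mktemp -d)/my_printouts.txt"

# Start a short stand-in job with nohup + &, sending all output to a log file.
nohup sh -c 'for i in 1 2 3; do echo "step $i"; sleep 1; done' \
  > "$log" 2>&1 < /dev/null &
job_pid=$!                      # $! holds the pid of the last background job
echo "started job with pid $job_pid"

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$job_pid" 2> /dev/null; then echo "job is running"; fi

wait "$job_pid"                 # only for this demo: block until it finishes
last_line=$(tail -n 1 "$log")
echo "last line of log: $last_line"

# Cancelling a job: start a long sleep, then stop it by pid.
sleep 60 &
sleep_pid=$!
kill "$sleep_pid"               # polite stop; kill -9 is the forceful fallback
wait "$sleep_pid" 2> /dev/null
if kill -0 "$sleep_pid" 2> /dev/null; then still_alive=yes; else still_alive=no; fi
echo "sleep job still alive: $still_alive"
```

On a real server you would skip the `wait` lines, log out, and come back later to check the log with `tail -F`.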
