Shell Recap
============
The shell is a program where users can type commands. With the shell, a single line of code can invoke anything from a complicated program, such as climate modeling software, to a simple command that creates an empty directory. The most popular Unix shell is Bash. Bash is the default shell on most modern implementations of Unix and in most packages that provide Unix-like tools for Windows.

The grammar of a shell allows you to combine existing tools into powerful pipelines and handle large volumes of data automatically. Sequences of commands can be written into a script, improving the reproducibility of workflows.

The command line is often the easiest way to interact with remote machines and supercomputers. Familiarity with the shell is nearly essential to run a variety of specialized tools and resources including high-performance computing systems. As clusters and cloud computing systems become more popular for scientific data crunching, being able to interact with the shell is becoming a necessary skill. We can build on the command-line skills covered here to tackle a wide range of scientific questions and computational challenges.

Let's start with this recap!

**DISCLAIMER:** since we are in a Jupyter environment, every command or cell that uses bash functionality is preceded by "**!**" or "**%%bash**". This is necessary so that Jupyter correctly interprets the source code as bash code. 
To run the same commands in a real bash shell, remove the "**!**" and "**%%bash**".

## The Basics

### The variables
Variables in the shell behave differently from those in other languages. Their declaration is simple (note that there must be no spaces around the ```=```):

In [None]:
%%bash
NAME="Shell Scripting is Fun!"

The command used to print a variable is ```echo```, with the variable name preceded by ```$```:

In [None]:
%%bash
NAME="Shell Scripting is Fun!"
echo $NAME

Running the same command with the variable name alone, without the ```$```, prints the name itself rather than the content of the variable.

In [None]:
%%bash
NAME="Shell Scripting is Fun!"
echo NAME
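
Two details of variable expansion are easy to trip over: quoting the expansion preserves the whitespace inside the value, and braces mark where the variable name ends. A minimal sketch (the names `COURSE` and `GREETING` are made up for illustration):

```shell
# No spaces around '=' in an assignment.
COURSE="MAPD"
GREETING="Hello   from   $COURSE"   # double quotes still expand $COURSE

# Unquoted: the shell splits the value on whitespace, so echo receives
# separate words and joins them with single spaces.
echo $GREETING

# Quoted: the value is passed as one argument, inner spaces intact.
echo "$GREETING"

# Braces separate the variable name from the text that follows it.
echo "${COURSE}_2020"
```

Quoting expansions as `"$VAR"` is a reasonable default habit, since it also protects values that contain spaces in file names.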

### The print command

The ```echo``` command prints a variable, or any other data you need to show, to the screen.
As we have seen in the previous cells, with ```echo``` you can also combine variables, command output and strings. Let's see.

In [None]:
%%bash
APPLES=3
echo "Pippo has $APPLES apples"


In [None]:
%%bash

echo "I'm in the working folder $(pwd)"

The operators ```>``` and ```>>``` redirect the output of a command to a file: ```>``` creates the file or overwrites its content, while ```>>``` appends to the end of it. Combined with ```echo```, they let you write whatever you would print into a file.
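
The difference between the two redirections can be sketched as follows. The file name `notes.txt` is hypothetical, and the example works inside a temporary directory created with `mktemp -d` so that no existing file is touched:

```shell
# Throwaway directory for the example.
workdir=$(mktemp -d)
notes="$workdir/notes.txt"   # hypothetical file name

echo "old" > "$notes"    # '>' creates the file
echo "new" > "$notes"    # '>' again: the previous content is overwritten
echo "more" >> "$notes"  # '>>' appends after the existing content

# The file now holds two lines: "new" and "more".
cat "$notes"
```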

### The first command:

The first command that you have to remember is ```pwd```, which stands for *print working directory*.
It shows which folder we are currently working in.
Let's see an example:

In [None]:
! pwd

### How to see what's in the folder:

The ```ls``` command is used to see what the folder contains.

In [None]:
%%bash 
ls

The ```ls``` command has several parameters that let you tailor the listing. One of those parameters is ```-F```. 
This parameter, which can be combined with others, marks each entry in the list so you can tell whether it is a file or a folder (folders get a trailing ```/```). To see the difference when you run this cell on your PC, create a temporary test folder in the working folder first.


In [None]:
%%bash
ls -F

For most commands, the available parameters can be listed by running the command with the ```--help``` parameter.

In [None]:
%%bash

ls --help

### Let's create a new folder

The ```mkdir``` command creates a new folder, by default inside the folder you are in. You can also pass the whole path where the new folder should be created.
Let's try!

In [None]:
%%bash
mkdir newTemporaryFolder
ls -F

To create a new folder inside the one just created, the command becomes:

In [None]:
%%bash 

mkdir newTemporaryFolder/newTemporaryFolderInside
ls -F ./newTemporaryFolder
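
When several levels of folders are missing, creating them one by one becomes tedious. The `-p` option of `mkdir` creates all the missing parent folders in one go. A small sketch, with hypothetical folder names inside a temporary directory:

```shell
workdir=$(mktemp -d)

# Without -p this would fail, because 'outer' and 'outer/middle' do not exist yet.
mkdir -p "$workdir/outer/middle/inner"

# With -p, mkdir also succeeds quietly when the folder already exists.
mkdir -p "$workdir/outer/middle/inner"

ls -F "$workdir/outer/middle"
```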

### How to create a simple empty file?

The ```touch``` command creates an empty file with the name and extension that you want. Here's a simple example:

In [None]:
%%bash

touch tempFile.txt
ls -F

### How to go in/out from a folder?

The ```cd``` command has two functions: getting into a folder or getting out of it. To enter a folder the command becomes ```cd folderName```. If you want to enter a sub-folder, concatenate the folders that contain it: ```cd folderName/folder2Name```. 

If you want to exit a folder the command becomes ```cd ..```. As with entering, if you want to get out of several folders, concatenate the dots as ```cd ../..```.

If you want to come back to your $HOME directory from wherever you are, ```cd``` with no argument (or ```cd ~```) is the shortcut.
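
A short sketch of the two ways to jump home, assuming `/tmp` exists on your system and `$HOME` is set (both are true on a typical Linux or macOS machine):

```shell
cd /tmp
echo "Now in: $(pwd)"

cd            # no argument: go to $HOME
echo "Back home: $(pwd)"

cd /tmp
cd ~          # the tilde expands to $HOME as well
echo "Home again: $(pwd)"
```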

Let's try to cd in a folder and then get out.

In [None]:
%%bash

cd newTemporaryFolder
echo "I'm in:" $(pwd)
cd ..
echo "Now I'm in:" $(pwd)

### How to delete a file or a folder

The ```rm``` command deletes a file. To delete a folder together with its content, the command becomes ```rm -r```. Deleted files do not go to a trash folder, so double-check the path first.
Let's see a couple of examples:

In [None]:
%%bash

echo "Before deleting the file:"
ls -F
echo "******************************************"

echo "After deleting the file:"
rm tempFile.txt
ls -F
echo "******************************************"

echo "After deleting the folder:"
rm -r newTemporaryFolder
ls -F

### How to Copy or Move a file or a folder

To copy a file or a folder use ```cp``` (with the ```-r``` option for folders), while ```mv``` lets you move a file or a folder.
The ```mv``` command can also be used to rename a file: "*Moving a file into another name*"



In [None]:
%%bash 
mkdir startingPath
mkdir destinationPath
touch startingPath/file.txt

mkdir pippo

cp startingPath/file.txt destinationPath/file.txt


In [None]:
! ls destinationPath

In [None]:
%%bash 

rm  startingPath/file.txt
mv destinationPath/file.txt startingPath/file.txt

In [None]:
! ls destinationPath

In [None]:
! ls startingPath

But ```mv``` can also be used to rename...

In [None]:
%%bash

mv pippo pluto

In [None]:
!ls 
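
The cells above copy a single file. Copying a folder with everything inside needs the `-r` (recursive) option of `cp`. A minimal sketch with a hypothetical folder layout inside a temporary directory:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project"                 # hypothetical folder names
mkdir "$workdir/project/data"
touch "$workdir/project/data/run1.txt"

# Plain 'cp' refuses to copy a folder; -r copies it with its whole content.
cp -r "$workdir/project" "$workdir/project_backup"

# The original is untouched and the copy has the same structure.
ls -F "$workdir"
ls -F "$workdir/project_backup/data"
```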

### How to see the content of a file

For this purpose, there are several options. The most common built-in commands in the Unix shell are:
+ cat
+ tail
+ head

The ```cat``` command prints the whole content of a file, or of a set of files, to the screen regardless of the file format (binary files included). It is very useful, but impractical for files with thousands or millions of rows, which are quite common in the Big Data era.

The ```head``` command prints the first n rows of a file, where n is a parameter of the command. It is useful when you need a quick check of a file's structure or when you are looking for something in the first rows.

The ```tail``` command prints the last n rows of a file, where n is a parameter of the command. It is particularly helpful when you have to check the log of a program (e.g. you want to check just the last iteration/execution of a program and not the whole log file). Moreover, with the ```-f``` option, ```tail``` can "*listen*" to a file that is being written and print in real time whatever is appended to it. This is quite useful when you have a log of your cluster and you want to check what is happening to the computation.


In [None]:
%%bash 

touch tempFile.txt
echo "Hello to MAPD 19/20" > tempFile.txt
cat tempFile.txt


In [None]:
%%bash 

echo "The first lecture is about bash" >> tempFile.txt
cat tempFile.txt

In [None]:
%%bash
head -n 1 tempFile.txt

In [None]:
%%bash 
tail -n 1 tempFile.txt
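
The "listening" mode of `tail` is its `-f` (follow) option. In a terminal, `tail -f` keeps running until you interrupt it with Ctrl+C, so the sketch below runs it in the background and stops it with `kill` instead. The file names `run.log` and `followed.txt` are hypothetical, and the `sleep` calls only give `tail` time to notice the new line:

```shell
workdir=$(mktemp -d)
log="$workdir/run.log"          # hypothetical log file
echo "iteration 1" > "$log"

# Follow the log in the background; what tail prints is saved to a second
# file so that it can be inspected afterwards.
tail -f "$log" > "$workdir/followed.txt" &
tail_pid=$!

sleep 1
echo "iteration 2" >> "$log"    # simulate a program appending to its log
sleep 2

kill "$tail_pid"                # stop following
cat "$workdir/followed.txt"     # contains both iterations
```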

### Search a file/directory

The ```find``` command searches for a file or a folder. It has a large set of options for building really complex searches. Let's see a simple example:


In [None]:
%%bash 
find . -name tempFile.txt

The command above searches for tempFile.txt, starting from our working directory and descending into its sub-folders. The next cell searches in the same way for all the files with the txt extension. To search a different directory, replace the ```.``` with the path of the folder you want to search in.

In [None]:
%%bash 
find . -name '*.txt' ## quote the pattern so the shell does not expand the * before find sees it

In [None]:
%%bash 

mkdir newTemporaryFolder
touch ./newTemporaryFolder/anotherTestFile.txt
find ./newTemporaryFolder -name '*.txt' ## quote the pattern so the shell does not expand the * before find sees it

rm -r newTemporaryFolder
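
Since `find` looks for both files and folders, it is often handy to restrict the search to one of the two with the `-type` option (`d` for directories, `f` for regular files). A small sketch with hypothetical names inside a temporary directory:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/reports"                       # hypothetical folder
touch "$workdir/reports/summary.txt" "$workdir/notes.txt"

# Only directories whose name starts with "rep".
find "$workdir" -type d -name 'rep*'

# Only regular files with the txt extension, at any depth.
find "$workdir" -type f -name '*.txt'
```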

## Search in a file

The ```grep``` command can be used to search something in a file. 
Let's see an example:

In [None]:
%%bash
grep MAPD tempFile.txt

```grep``` extracts all the lines that contain the word we are searching for. It can also work with regular expressions.

In [None]:
%%bash
grep -E "[0-9]+" tempFile.txt
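
A few standard `grep` options cover most everyday searches. The sketch below uses a hypothetical file `sample.txt` in a temporary directory, so it does not depend on the cells above:

```shell
workdir=$(mktemp -d)
sample="$workdir/sample.txt"   # hypothetical file
echo "Hello to MAPD 19/20" > "$sample"
echo "The first lecture is about bash" >> "$sample"
echo "mapd in lowercase" >> "$sample"

grep -n "MAPD" "$sample"   # -n prefixes each matching line with its line number
grep -i "mapd" "$sample"   # -i ignores case: matches the first and the third line
grep -c "bash" "$sample"   # -c prints only the number of matching lines
grep -v "MAPD" "$sample"   # -v inverts the match: lines without "MAPD"
```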

## Counting words or lines of a file:

The ```wc``` command lets you count the number of words in a file:

In [None]:
%%bash
wc -w tempFile.txt

or the number of lines:

In [None]:
%%bash
wc -l tempFile.txt

## Sorting
The ```sort``` command sorts the lines of a file or of a list of elements. By default it compares them as text; the ```-n``` option sorts them numerically.

In [None]:
%%bash

echo "5" > orderable.txt
echo "3" >> orderable.txt
echo "6" >> orderable.txt
echo "2" >> orderable.txt

sort orderable.txt
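
With single-digit numbers, text order and numeric order coincide, which hides an important detail: plain `sort` compares lines as text. The sketch below (hypothetical file `numbers.txt` in a temporary directory) shows the difference once a two-digit number is involved:

```shell
workdir=$(mktemp -d)
numbers="$workdir/numbers.txt"   # hypothetical file
echo "10" > "$numbers"
echo "9" >> "$numbers"
echo "2" >> "$numbers"

sort "$numbers"       # text order: 10, 2, 9 (the character "1" sorts before "2")
sort -n "$numbers"    # numeric order: 2, 9, 10
sort -nr "$numbers"   # numeric order, reversed: 10, 9, 2
```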

## Complex but useful commands

### How can we build a pipeline of commands?

The answer is the ```|``` character (the pipe). 
This special character chains commands into a pipeline: they are listed in order, and the output of each command becomes the input of the next one. 
For example when searching in a file:

In [None]:
%%bash 
cat tempFile.txt | grep MAPD

Or it can be used to order a set of files by the number of words they contain (ascending; ```wc``` also adds a final ```total``` line):

In [None]:
%%bash 
echo "one two three" > file1.txt
echo "one two three four" > file2.txt
echo "one" > file3.txt


wc -w file* | sort -n 
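
Pipelines are not limited to two commands. As a sketch, the same idea can be extended with a third stage to keep only the file with the fewest words, and `$(...)` can capture the result of a pipeline in a variable. The variable names are made up, and the files are recreated in a temporary directory so the example is self-contained:

```shell
workdir=$(mktemp -d)
echo "one two three" > "$workdir/file1.txt"
echo "one two three four" > "$workdir/file2.txt"
echo "one" > "$workdir/file3.txt"

# Three stages: count the words per file, sort numerically, keep the first line.
# The "total" line that wc adds is the largest number, so it ends up last.
shortest=$(wc -w "$workdir"/file*.txt | sort -n | head -n 1)
echo "Fewest words: $shortest"

# Two stages: list the txt files, then count the lines of that list.
txt_count=$(find "$workdir" -name '*.txt' | wc -l)
echo "Number of txt files: $txt_count"
```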

## Getting the list of the processes in the system

```ps``` is a command that lists the processes running in the system.


In [None]:
! ps aux


Where:
+ a = show processes for all users
+ u = display the process user/owner
+ x = also show processes not attached to a terminal

The PID of a process, shown in the second column of the ```ps aux``` output, can be passed to the ```kill``` command to stop a stuck process (e.g. a Python script that hangs).
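
Stopping a process by PID can be sketched as follows. A `sleep 300` command stands in for the stuck program; in practice you would read the PID from the `ps aux` output instead of from `$!`:

```shell
# Stand-in for a stuck program: a process that just sleeps for 5 minutes.
sleep 300 &
stuck_pid=$!          # $! holds the PID of the last background process
echo "PID: $stuck_pid"

kill "$stuck_pid"     # ask the process to terminate (sends SIGTERM)
wait "$stuck_pid" 2>/dev/null || true   # let the shell collect the ended process

# 'kill -0' sends no signal: it only checks whether the PID still exists.
if kill -0 "$stuck_pid" 2>/dev/null; then
    echo "Still running; 'kill -9 $stuck_pid' would force it to stop"
else
    echo "Process $stuck_pid is gone"
fi
```

Plain `kill` gives the program a chance to shut down cleanly, which is why `kill -9` is usually kept as a last resort.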


The ```ps``` command can be used with the ```grep``` command through the ```|``` character, in order to search for a particular process.

In [None]:
%%bash 
ps aux | grep python

## Figuring out the IP address
The ```ifconfig``` command can be used to retrieve the IP address. 
This command is not part of every Unix system, and in some cases it must be installed. To install it on Ubuntu you need to run this command:
    ```sudo apt-get install net-tools```

In [None]:
%%bash

ifconfig

The output of the command lists all the network interfaces ("network cards") and their IP addresses.
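
On recent Linux distributions the `ip` command from the iproute2 package usually reports the same information and is often installed when `ifconfig` is not. The sketch below, which assumes nothing about your machine, simply uses whichever of the two tools is available (`net_tool` is a made-up variable name):

```shell
# Pick whichever tool is installed, preferring 'ip'.
if command -v ip >/dev/null 2>&1; then
    net_tool="ip"
    ip addr show || echo "ip is installed but could not query the interfaces"
elif command -v ifconfig >/dev/null 2>&1; then
    net_tool="ifconfig"
    ifconfig || echo "ifconfig is installed but could not query the interfaces"
else
    net_tool="none"
    echo "Neither ip nor ifconfig is installed"
fi
echo "Tool used: $net_tool"
```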

### Other useful commands for the near future

Other commands that may be useful in the near future are ```ssh```, which lets you connect remotely to a server, cluster or HPC system, and ```scp```, which copies files and resources from local to remote or from remote to local.

[https://www.digitalocean.com/community/tutorials/ssh-essentials-working-with-ssh-servers-clients-and-keys](https://www.digitalocean.com/community/tutorials/ssh-essentials-working-with-ssh-servers-clients-and-keys)

[https://haydenjames.io/linux-securely-copy-files-using-scp/](https://haydenjames.io/linux-securely-copy-files-using-scp/)

Those two commands use the same SSH protocol to communicate with a remote device.
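
The typical shape of the two commands is sketched below. The user `student`, the host `cluster.example.org` and all the paths are hypothetical placeholders. Since no real server is involved, the sketch is a dry run: `echo` prints each command instead of executing it.

```shell
# Hypothetical account, host and paths: replace them with your own.
remote_user="student"
remote_host="cluster.example.org"
remote="$remote_user@$remote_host"

# Dry run: 'echo' prints each command instead of executing it.
# Remove 'echo' to run the command for real.
echo ssh "$remote"                                   # open a shell on the remote machine
echo scp results.txt "$remote:/home/$remote_user/"   # local -> remote
echo scp "$remote:/home/$remote_user/job.log" .      # remote -> local
echo scp -r analysis "$remote:/home/$remote_user/"   # -r copies a whole folder
```

In `scp`, the part before the colon says which machine a path belongs to, and a path without it is local.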