Shell Recap
============
The shell is a program where users can type commands. With the shell, it’s possible to invoke complicated programs like climate modeling software or simple commands that create an empty directory with only one line of code. The most popular Unix shell is Bash. Bash is the default shell on most modern implementations of Unix and in most packages that provide Unix-like tools for Windows.

The grammar of a shell allows you to combine existing tools into powerful pipelines and handle large volumes of data automatically. Sequences of commands can be written into a script, improving the reproducibility of workflows.

The command line is often the easiest way to interact with remote machines and supercomputers. Familiarity with the shell is nearly essential to run a variety of specialized tools and resources including high-performance computing systems. As clusters and cloud computing systems become more popular for scientific data crunching, being able to interact with the shell is becoming a necessary skill. We can build on the command-line skills covered here to tackle a wide range of scientific questions and computational challenges.

Let's start with this recap!

**DISCLAIMER:** since we are working in a Jupyter environment, every command or cell that uses bash functionality is preceded by "**!**" or by "**%%bash**". This is what allows Jupyter to correctly interpret the source code as bash code. 
To run the same commands in a real bash shell, you have to remove the "**!**" and the "**%%bash**".

## The Basics

### The variables
Variables in the shell behave differently from those in many other languages. Declaring one is simple: write the name, an = sign with no spaces around it, and the value:

In [1]:
%%bash
NAME="Shell Scripting is Fun!"

The command used to print a variable is echo, followed by the variable name prefixed with a $:

In [2]:
%%bash
NAME="Shell Scripting is Fun!"
echo $NAME

Shell Scripting is Fun!


Running the same command with the variable name alone, without the $, prints the name itself and not the content of the variable.

In [3]:
%%bash
NAME="Shell Scripting is Fun!"
echo NAME

NAME
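
A few details of assignment and expansion are easy to get wrong, so here is a minimal sketch for a plain bash shell. The variable names `COURSE`, `GREETING`, `LITERAL` and `SUFFIXED` are made up for illustration. It shows that double quotes expand `$COURSE`, single quotes do not, and braces mark where the variable name ends.

```bash
COURSE="MAPD"                   # no spaces around '=' in an assignment
GREETING="Welcome to $COURSE"   # double quotes: the variable is expanded
LITERAL='Welcome to $COURSE'    # single quotes: the text is kept as-is
SUFFIXED="${COURSE}_2020"       # braces delimit the variable name

echo "$GREETING"    # Welcome to MAPD
echo "$LITERAL"     # Welcome to $COURSE
echo "$SUFFIXED"    # MAPD_2020
```

Without the braces, the shell would look for a variable called `COURSE_2020`, which does not exist and expands to an empty string.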


### The print command

The ```echo``` command prints a variable, or any other data you need to show, to the screen.
With echo you can also combine variables and the output of other commands with strings. Let's see.

In [46]:
%%bash
APPLES=3
echo "Pippo has $APPLES apples"


Pippo has 3 apples


In [47]:
%%bash

echo "I'm in the working folder $(pwd)"

I'm in the working folder /home/scampese/Repository/ManagementAndAnalysisOfPhysicsDatasetsB/lecture_1


The operators ">" and ">>" redirect the output of a command to a file: ">" writes to the file, replacing any previous content, while ">>" appends to the end of it. Combined with the echo command, they let you put whatever you want to print into a file.
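
The difference between the two redirections can be sketched as follows. This is only an illustration: the scratch directory comes from `mktemp -d` and the file name `notes.txt` is hypothetical, so no real file is touched.

```bash
WORKDIR=$(mktemp -d)            # scratch directory for the demo
LOG="$WORKDIR/notes.txt"        # hypothetical file name

echo "first line"  > "$LOG"     # '>' creates the file (or empties an existing one)
echo "second line" >> "$LOG"    # '>>' adds to the end of the file
AFTER_APPEND=$(cat "$LOG")

echo "replaced" > "$LOG"        # '>' again: the two previous lines are lost
AFTER_OVERWRITE=$(cat "$LOG")

echo "$AFTER_APPEND"            # first line, then second line
echo "$AFTER_OVERWRITE"         # replaced
rm -r "$WORKDIR"                # clean up
```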

### The first command:

The first command to remember is ```pwd```, which stands for *print working directory*.
It shows which folder we are currently working in.
Let's see an example:

In [5]:
! pwd

/home/scampese/Repository/ManagementAndAnalysisOfPhysicsDatasetsB/lecture_1


### How to see what's in the folder:

The 'ls' command is used to see what the folder contains.

In [1]:
%%bash 
ls

anotherTestFile.txt
bash_shell_recap.ipynb
exercise_file_1.txt
exercise_file_2.txt
exercise_file_3.txt
exercise_file_4.txt
exercises.ipynb
file1.txt
file2.txt
file3.txt
lol.txt
orderable.txt
tempFile.txt


The ls command has several parameters that control what is shown. One of those parameters is ```-F```. 
This parameter can be used to tell whether each entry in the list is a file or a folder: folders are marked with a trailing "/". To run this cell on your PC, you have to create a temporary test folder in the working folder.


In [10]:
%%bash
ls -F

bash_shell_recap.ipynb
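
Since the listing above contains no folder, the effect of `-F` is easier to see on a small scratch directory. The sketch below assumes a plain bash shell and uses `mkdir` and `touch`, which are covered later in this recap; the names `data` and `notes.txt` are made up.

```bash
DEMO=$(mktemp -d)          # scratch directory for the demo
mkdir "$DEMO/data"         # one folder...
touch "$DEMO/notes.txt"    # ...and one regular file

LISTING=$(ls -F "$DEMO")
echo "$LISTING"
# data/
# notes.txt
rm -r "$DEMO"              # clean up
```

Only the folder gets the trailing `/`.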


In general, all the available parameters of a command can be seen by running the command with the ```--help``` parameter.

In [11]:
%%bash

ls --help

Usage: ls [OPTION]... [FILE]...
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.

Mandatory arguments to long options are mandatory for short options too.
  -a, --all                  do not ignore entries starting with .
  -A, --almost-all           do not list implied . and ..
      --author               with -l, print the author of each file
  -b, --escape               print C-style escapes for nongraphic characters
      --block-size=SIZE      scale sizes by SIZE before printing them; e.g.,
                               '--block-size=M' prints sizes in units of
                               1,048,576 bytes; see SIZE format below
  -B, --ignore-backups       do not list implied entries ending with ~
  -c                         with -lt: sort by, and show, ctime (time of last
                               modification of file status information);
                               with -l:

### Let's create a new folder

The ```mkdir``` command creates a new folder inside the one you are in. You can also pass the command the full path where you want to put the new folder.
Let's try!

In [32]:
%%bash
mkdir newTemporaryFolder
ls -F

bash_shell_recap.ipynb
newTemporaryFolder/


To create a new folder inside the one just created, the commands become:

In [33]:
%%bash 

mkdir newTemporaryFolder/newTemporaryFolderInside
ls -F ./newTemporaryFolder

newTemporaryFolderInside/
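
When several nested folders are missing, `mkdir -p` creates the whole chain in one step instead of one `mkdir` per level. A minimal sketch, with a made-up path `run1/raw/2020` inside a scratch directory:

```bash
BASE=$(mktemp -d)                    # scratch directory for the demo
mkdir -p "$BASE/run1/raw/2020"       # creates run1, run1/raw and run1/raw/2020
mkdir -p "$BASE/run1/raw/2020"       # running it again is not an error

LISTING=$(ls -F "$BASE/run1/raw")
echo "$LISTING"                      # 2020/
rm -r "$BASE"                        # clean up
```

Without `-p`, the first call would fail because `run1` and `run1/raw` do not exist yet.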


### How to create a simple empty file?

The ```touch``` command is a simple command that allows you to create an empty file with the name and the extension that you want. Let's see a simple example:

In [34]:
%%bash

touch tempFile.txt
ls -F

bash_shell_recap.ipynb
newTemporaryFolder/
tempFile.txt


### How to go in/out from a folder?

The ```cd``` command has two uses: getting into or out of a folder. To enter a folder, the command is ```cd folderName```. To go straight into a sub-folder, chain the folder names along the path, as in ```cd folderName/folder2Name```. 

To exit from a folder, the command is ```cd ..```. Similarly, to go up several levels, chain the dots, as in ```cd ../..```.

To come back to your $HOME directory from wherever you are, use the shortcut ```cd``` with no argument (or ```cd ~```).

Let's try to cd into a folder and then go back out.

In [29]:
%%bash

cd newTemporaryFolder
echo "I'm in:" $(pwd)
cd ..
echo "Now I'm in:" $(pwd)

I'm in: /home/scampese/Repository/ManagementAndAnalysisOfPhysicsDatasetsB/lecture_1/newTemporaryFolder
Now I'm in: /home/scampese/Repository/ManagementAndAnalysisOfPhysicsDatasetsB/lecture_1
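
The shortcuts for `cd` can be tried with the sketch below. It assumes that `$HOME` is set and points to an existing directory. `cd -`, which jumps back to the previous directory, is not covered above and is shown here as an extra.

```bash
START=$(pwd)           # remember where we started
DEMO=$(mktemp -d)      # scratch directory for the demo

cd "$DEMO"             # enter the scratch directory
cd                     # no argument: jump to $HOME
AT_HOME=$(pwd)
cd - > /dev/null       # back to the previous directory (cd - also prints it)
BACK=$(pwd)

echo "home: $AT_HOME"
echo "back in: $BACK"

cd "$START"            # return to the starting directory
rmdir "$DEMO"          # remove the (empty) scratch directory
```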


### How to delete a file or a folder?

The ```rm``` command deletes a file. To delete a folder and its content as well, the command becomes ```rm -r```. Deleted files do not go to a trash folder, so double-check the arguments before running it.
Let's see a couple of examples:

In [35]:
%%bash

echo "Before deleting the file:"
ls -F
echo "******************************************"

echo "After deleting the file:"
rm tempFile.txt
ls -F
echo "******************************************"

echo "After deleting the folder:"
rm -r newTemporaryFolder
ls -F

Before deleting the file:
bash_shell_recap.ipynb
newTemporaryFolder/
tempFile.txt
******************************************
After deleting the file:
bash_shell_recap.ipynb
newTemporaryFolder/
******************************************
After deleting the folder:
bash_shell_recap.ipynb


### How to see the content of a file?

For this purpose, there are several options. The most common built-in commands of the Unix shell are:
+ cat
+ tail
+ head

The ```cat``` command prints the whole content of a file, or of a set of files, to the screen, whatever the file format (binary files included). This command is very useful, but it is impractical when you have to deal with files of thousands or millions of rows, which is quite common in the Big Data era.

The ```head``` command prints the first n rows of a file, where n is a parameter of the command. It is useful for a quick check of a file's structure or for searching for something in the first rows.

The ```tail``` command prints the last n rows of a file, where n is a parameter of the command. It is particularly helpful when you have to check the log of a program (e.g. you want to check just the last iteration/execution of a program and not the whole log file). Moreover, with the ```-f``` option, ```tail``` allows you to "*listen*" to a file that is being written and print in real time what is appended to it. This is quite useful when you have a log of your cluster and you want to check what is happening to the computation.


In [56]:
%%bash 

touch tempFile.txt
echo "Hello to MAPD 19/20" > tempFile.txt
cat tempFile.txt


Hello to MAPD 19/20


In [57]:
%%bash 

echo "The first lecture is about bash" >> tempFile.txt
cat tempFile.txt

Hello to MAPD 19/20
The first lecture is about bash


In [63]:
%%bash
head -n 1 tempFile.txt

Hello to MAPD 19/20


In [60]:
%%bash 
tail -n 1 tempFile.txt

The first lecture is about bash
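
With only two lines in the file, `head` and `tail` look almost the same, so here is a sketch on a slightly longer file. It assumes the `seq` command is available (it prints the numbers 1 to 10, one per line) and writes to a scratch file created with `mktemp`.

```bash
DATA=$(mktemp)               # scratch file for the demo
seq 1 10 > "$DATA"           # ten lines: 1, 2, ..., 10

FIRST=$(head -n 3 "$DATA")   # first three lines
LAST=$(tail -n 2 "$DATA")    # last two lines

echo "$FIRST"                # 1 2 3 (one per line)
echo "$LAST"                 # 9 10 (one per line)
rm "$DATA"                   # clean up
```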


### Search a file/directory

The ```find``` command searches for a file or a folder. It has a huge set of options that make really complex searches possible. Let's see a simple example:


In [73]:
%%bash 
find . -name lol.txt

./lol.txt


The command above searches for lol.txt, starting from our working directory. The next cell shows the command that searches for all the files with the txt extension under our working directory. To change the search directory, replace the . with the path where you want to search.

In [74]:
%%bash 
find . -name '*.txt' ## the quotes stop the shell from expanding *.txt before find sees it

./lol.txt
./tempFile.txt


In [100]:
%%bash 

mkdir newTemporaryFolder
touch ./newTemporaryFolder/anotherTestFile.txt
find ./newTemporaryFolder -name '*.txt' ## the quotes stop the shell from expanding *.txt before find sees it

rm -r newTemporaryFolder

./newTemporaryFolder/anotherTestFile.txt
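
Since `find` matches folders as well as files, the `-type` option is a handy way to restrict a search to one kind of entry. A minimal sketch on a scratch directory, with made-up names (`results`, `run1.txt`, `summary.txt`):

```bash
START=$(pwd)
ROOT=$(mktemp -d)                              # scratch directory for the demo
mkdir "$ROOT/results"
touch "$ROOT/results/run1.txt" "$ROOT/summary.txt"
cd "$ROOT"

ONLY_DIRS=$(find . -type d -name 'r*')         # -type d: folders only
ONLY_FILES=$(find . -type f -name 'r*')        # -type f: regular files only

echo "$ONLY_DIRS"                              # ./results
echo "$ONLY_FILES"                             # ./results/run1.txt

cd "$START"
rm -r "$ROOT"                                  # clean up
```

Without `-type`, the same pattern would return both entries.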


## Search in a file

The ```grep``` command can be used to search for something inside a file. 
Let's see an example:

In [101]:
%%bash
grep MAPD tempFile.txt

Hello to MAPD 19/20


grep extracts all the lines that contain the word we are searching for. This command can also work with regular expressions, enabled here with the -E option.

In [103]:
%%bash
grep -E "[0-9]+" tempFile.txt

Hello to MAPD 19/20
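
A few other `grep` options come up often: `-i` ignores case, `-n` prefixes each match with its line number, and `-c` prints only the number of matching lines. The sketch below tries them on a scratch file whose content is made up for the demo.

```bash
NOTES=$(mktemp)                                   # scratch file for the demo
echo "Hello to MAPD 19/20" > "$NOTES"
echo "The first lecture is about bash" >> "$NOTES"
echo "Bash is a shell" >> "$NOTES"

ANY_CASE=$(grep -i "bash" "$NOTES")               # matches 'bash' and 'Bash'
NUMBERED=$(grep -n "lecture" "$NOTES")            # adds the line number
COUNT=$(grep -c "MAPD" "$NOTES")                  # number of matching lines

echo "$ANY_CASE"    # lines 2 and 3 of the file
echo "$NUMBERED"    # 2:The first lecture is about bash
echo "$COUNT"       # 1
rm "$NOTES"         # clean up
```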


## Counting the words or lines of a file:

The ```wc``` command allows you to count the number of words in a file:

In [115]:
%%bash
wc -w tempFile.txt

4 tempFile.txt


or the number of lines:

In [116]:
%%bash
wc -l tempFile.txt

1 tempFile.txt
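
When only the number is needed, for example to store it in a variable, one option is to feed the file to `wc` through the input redirection `<`, so that no file name is printed. This is a sketch on a scratch file; the variable names are made up, and some `wc` implementations pad the number with leading spaces.

```bash
TEXT=$(mktemp)                                    # scratch file for the demo
echo "Hello to MAPD 19/20" > "$TEXT"
echo "The first lecture is about bash" >> "$TEXT"

N_WORDS=$(wc -w < "$TEXT")     # reads from standard input: no file name in the output
N_LINES=$(wc -l < "$TEXT")

echo "words:" $N_WORDS         # words: 10
echo "lines:" $N_LINES         # lines: 2
rm "$TEXT"                     # clean up
```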


## Sorting
The command ```sort``` can be used to sort the elements in a file or a list of elements. Lines are compared as text by default (use ```-n``` for a numeric comparison):

In [117]:
%%bash

echo "5" > orderable.txt
echo "3" >> orderable.txt
echo "6" >> orderable.txt
echo "2" >> orderable.txt

sort orderable.txt

2
3
5
6
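
The example above works because every number has a single digit. With longer numbers the default text comparison gives a surprising order, which the sketch below contrasts with `-n` (numeric) and `-r` (reverse). The scratch file and its content are made up for the demo.

```bash
NUMS=$(mktemp)                        # scratch file for the demo
echo "10" > "$NUMS"
echo "9" >> "$NUMS"
echo "2" >> "$NUMS"

AS_TEXT=$(sort "$NUMS")               # text order: "10" comes before "2"
AS_NUMBERS=$(sort -n "$NUMS")         # numeric order
DESCENDING=$(sort -n -r "$NUMS")      # numeric order, largest first

echo "$AS_TEXT"                       # 10 2 9 (one per line)
echo "$AS_NUMBERS"                    # 2 9 10 (one per line)
echo "$DESCENDING"                    # 10 9 2 (one per line)
rm "$NUMS"                            # clean up
```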


## Complex and useful commands

### How can we build a pipeline of commands?

The answer is the ```|``` character. 
This special character chains commands into a pipeline: they are executed in order, and the output of each command becomes the input of the next. 
For example, to search in a file:

In [104]:
%%bash 
cat tempFile.txt | grep MAPD

Hello to MAPD 19/20


Or it can be used to order a set of files by the number of words inside (ascending):

In [108]:
%%bash 
echo "one two three" > file1.txt
echo "one two three four" > file2.txt
echo "one" > file3.txt


wc -w file* | sort -n 

 1 file3.txt
 3 file1.txt
 4 file2.txt
 8 total
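
Pipelines are not limited to two commands. The sketch below recreates the same three files in a scratch directory and chains commands already seen in this recap; the variable names are made up.

```bash
START=$(pwd)
DIR=$(mktemp -d)                       # scratch directory for the demo
cd "$DIR"
echo "one two three" > file1.txt
echo "one two three four" > file2.txt
echo "one" > file3.txt

# Three stages: count the words, sort the counts, keep the smallest one
SHORTEST=$(wc -w file*.txt | sort -n | head -n 1)
# Join the files, then count the words of the combined text
TOTAL_WORDS=$(cat file*.txt | wc -w)
# List the matching files, then count the lines of the list
N_FILES=$(find . -name 'file*.txt' | wc -l)

echo "$SHORTEST"                       # the line for file3.txt
echo "total words:" $TOTAL_WORDS       # total words: 8
echo "files:" $N_FILES                 # files: 3

cd "$START"
rm -r "$DIR"                           # clean up
```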


## Get the list of the processes in the system

```ps``` is a command that lists all the processes that are running in the system.


In [122]:
! ps aux


USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0 225444  9012 ?        Ss   08:25   0:03 /sbin/init spla
root         2  0.0  0.0      0     0 ?        S    08:25   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        I<   08:25   0:00 [rcu_gp]
root         4  0.0  0.0      0     0 ?        I<   08:25   0:00 [rcu_par_gp]
root         6  0.0  0.0      0     0 ?        I<   08:25   0:00 [kworker/0:0H-k
root         9  0.0  0.0      0     0 ?        I<   08:25   0:00 [mm_percpu_wq]
root        10  0.0  0.0      0     0 ?        S    08:25   0:01 [ksoftirqd/0]
root        11  0.0  0.0      0     0 ?        I    08:25   0:17 [rcu_sched]
root        12  0.0  0.0      0     0 ?        S    08:25   0:00 [migration/0]
root        13  0.0  0.0      0     0 ?        S    08:25   0:00 [idle_inject/0]
root        14  0.0  0.0      0     0 ?        S    08:25   0:00 [cpuhp/0]
root        15  0.0  0.0      0     0 ?        S    08

Where:
+ a = show processes for all users
+ u = display the process's user/owner
+ x = also show processes not attached to a terminal

The PID of a process can be used to kill a stuck process by passing it to the kill command (e.g. when a Python script you launched hangs).
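
Killing a process by PID can be sketched as follows. A background `sleep 300` stands in for the stuck program, and `$!` holds the PID of the most recent background command, so there is no need to look it up with `ps`. The variable names are made up.

```bash
sleep 300 &                       # stand-in for a stuck program, run in the background
PID=$!                            # PID of the background process
echo "started process $PID"

kill "$PID"                       # ask the process to terminate (SIGTERM)
wait "$PID" 2>/dev/null || true   # wait until it is really gone

# 'kill -0' sends no signal: it only checks whether the PID still exists
if kill -0 "$PID" 2>/dev/null; then
    STATE="still running"
else
    STATE="terminated"
fi
echo "$STATE"                     # terminated
```

If a process ignores the default signal, `kill -9` followed by the PID forces it to stop, at the cost of giving the program no chance to clean up.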


The ```ps``` command can be used with the ```grep``` command through the ```|``` character, in order to search for a particular process.

In [123]:
%%bash 
ps aux | grep python

root       907  0.0  0.0 170544 17244 ?        Ssl  08:25   0:00 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
root      1186  0.0  0.0 187248 19900 ?        Ssl  08:25   0:00 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
scampese  1910  0.0  0.2 2535596 97536 ?       Sl   08:25   0:00 /usr/bin/python3 /usr/bin/remarkable
root      3352  0.0  0.0 167912 17052 ?        Sl   08:28   0:00 /usr/bin/python3 /usr/share/apt-xapian-index/update-apt-xapian-index-dbus
scampese 17693  0.0  0.1 356248 64776 pts/3    Sl+  10:40   0:08 /usr/bin/python3 /usr/local/bin/jupyter-notebook
scampese 18156  0.0  0.1 627188 50684 ?        Ssl  10:42   0:04 /usr/bin/python3 -m ipykernel_launcher -f /home/scampese/.local/share/jupyter/runtime/kernel-74d6adf9-9d07-4614-a451-4a59f1019aca.json
scampese 18994  0.0  0.0  14432  1092 ?        S    16:16   0:00 grep python
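
The last line of the output is the `grep python` process itself, which also contains the word being searched for. A common trick, sketched below, is to put one letter of the pattern in square brackets: the regular expression still matches the same text, but the command line of `grep` no longer does. A background `sleep 4321` is used as a hypothetical target process.

```bash
sleep 4321 &                                # hypothetical process to look for
PID=$!

# '[s]leep 4321' matches the text 'sleep 4321' but not the text '[s]leep 4321'
MATCHES=$(ps aux | grep "[s]leep 4321")
echo "$MATCHES"

kill "$PID"                                 # clean up the background process
wait "$PID" 2>/dev/null || true
```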


## Figuring out the IP address
The ```ifconfig``` command can be used to retrieve the IP address. 
This command is not a standard command of the Unix system, and in some cases it must be installed. To install it on Ubuntu, run this command:
    ```sudo apt-get install net-tools```

In [124]:
%%bash

ifconfig

enp8s0f1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 80:fa:5b:73:35:77  txqueuelen 1000  (Ethernet)
        RX packets 402954  bytes 377524034 (377.5 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 311904  bytes 145796146 (145.7 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 46155  bytes 25987791 (25.9 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 46155  bytes 25987791 (25.9 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::7b3c:817c:dd1b:75e3  prefixlen 64  scopeid 0x20<link>
        ether 4c:1d:96:6d:28:06  txq

The output of the command lists all the network interfaces (the "network cards") and their IP addresses.
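
To pull just the IPv4 addresses out of that output, `grep` can be combined with a regular expression. The sketch below runs on a saved sample, a few lines copied from the output above into a scratch file, so that it gives the same result on any machine. On a live system the same `grep` would be applied to `ifconfig` through a pipe. The `-o` option prints only the matching part of each line.

```bash
SAMPLE=$(mktemp)                     # scratch file holding the sample output
cat > "$SAMPLE" <<'EOF'
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
wlp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255
EOF

# 'inet ' followed by four dot-separated groups of digits; 'inet6' lines do not match
ADDRESSES=$(grep -oE "inet [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" "$SAMPLE")
echo "$ADDRESSES"
# inet 127.0.0.1
# inet 192.168.1.10
rm "$SAMPLE"                         # clean up
```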