# Episode 1 - Introduction and setup
This notebook is based on a snapshot of [Episode 1](https://kmichali.github.io/SC-shell-novice/01-intro/index.html) of the [Unix Shell lesson](https://kmichali.github.io/SC-shell-novice/) from [Software Carpentry](https://software-carpentry.org). The original material has more detail.

### Questions:
- What is a command shell and why would I use one?
- How is this lesson going to work?

### Objectives:
- Explain how the shell relates to the keyboard, the screen, the operating system, and users’ programs.
- Explain when and why command-line interfaces should be used instead of graphical interfaces.

### Video
Learn with [video](https://imperial.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=abd2d252-6c29-4171-81b9-abd500efe82a).

<hr style="border: solid 1px red; margin-top: 1.5% ">

In [1]:
echo $SHELL

$SHELL


## Background
<hr style="border: solid 1px gray; margin-top: 1.5% ">

Humans and computers commonly interact in many different ways, such as through a keyboard and mouse, touch screen interfaces, or using speech recognition systems. The most widely used way to interact with personal computers is called a graphical user interface (GUI). With a GUI, we give instructions by clicking a mouse and using menu-driven interactions.

While the visual aid of a GUI makes it intuitive to learn, this way of delivering instructions to a computer scales very poorly. Imagine the following task: run two programs on multiple data files and save a section from the result files in one final document.

Using a GUI, you would not only be clicking at your desk for several hours, but you could also make an error while completing this repetitive task. This is where we take advantage of the Unix shell, which is very good at doing repetitive tasks automatically. With the proper commands, the shell can repeat tasks, with or without some modification, as many times as we want.


## The Shell
<hr style="border: solid 1px gray; margin-top: 1.5% ">

The shell is a program where users can type commands. With the shell, it’s possible to invoke complicated programs like climate modeling software or simple commands that create an empty directory with only one line of code. The most popular Unix shell is Bash (the Bourne Again SHell — so-called because it’s derived from a shell written by Stephen Bourne). Bash is the default shell on most modern implementations of Unix and in most packages that provide Unix-like tools for Windows.

Using the shell will take some effort and some time to learn. However, with just a few commands, you will be able to accomplish complicated tasks.  You can also save a sequence of commands and repeat those later to reproduce your experiments.


In addition, the command line is often the easiest way to interact with remote machines and supercomputers. Familiarity with the shell is nearly essential for running a variety of specialized tools and resources, including high-performance computing systems.

## Shell and commands
<hr style="border: solid 1px gray; margin-top: 1.5% ">

Below is a picture of a shell running bash version 4.2:

![Shell](../fig/shell.png)

The **`$`** symbol is called the prompt.  One types **commands and their arguments** beside the prompt and executes them by pressing **`Enter`**. For this reason the shell is often called the command line.

- If a command exists and is in the correct format, the shell will execute it.  Depending on the command, output may be printed before the shell displays a new instance of the prompt.

- If a command does not exist or if its format is not as expected, the shell will produce an error. Below is an example of an error message that resulted from the user typing a command that does not exist.

![Shell](../fig/shell_error.png)
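The two cases above can be tried with a short sketch like the following. The command name `no_such_command_xyz` is a made-up name that is assumed not to exist on your system; the exact wording of the error message may differ between shells.

```shell
# A command (echo) followed by one argument (hello): the shell runs it
# and prints the output before returning to the prompt.
echo hello

# A command name that is assumed NOT to exist (hypothetical name).
# The shell prints a "command not found" style error message.
status=0
no_such_command_xyz || status=$?

# Shells such as bash set the exit status to 127 when a command is not found.
echo "exit status: $status"
```

The exit status is a number every command leaves behind; 0 means success and any other value signals a problem.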

In principle, there are two categories of commands:
- **commands that are part of the Linux system** - commands for general tasks that have to do with file, directory and system management, programming language interpreters and compilers, text editors etc.

- **user-defined commands** - commands that run software packages and scripts that are installed by a user; for example scientific software, commercial software packages etc.
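As a rough illustration of the two categories, the sketch below asks the shell where it finds a system command and then makes a tiny user-installed command available. The script name `mytool` and the scratch directory are hypothetical and exist only for this example.

```shell
# A system command: ask the shell where it finds ls.
ls_path=$(command -v ls)
echo "ls is provided by: $ls_path"

# A user-installed command: a tiny script in a scratch directory
# (the name mytool is hypothetical).
tooldir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from my tool"\n' > "$tooldir/mytool"
chmod +x "$tooldir/mytool"

# Adding the directory to PATH lets the shell find the script by name.
PATH="$tooldir:$PATH"
tool_output=$(mytool)
echo "$tool_output"

# Clean up the scratch directory.
rm -r "$tooldir"
```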

## Nelle's pipeline: a typical problem
<hr style="border: solid 1px gray; margin-top: 1.5% ">

Nelle Nemo, a marine biologist, has just returned from a six-month survey of the [North Pacific Gyre](https://en.wikipedia.org/wiki/North_Pacific_Gyre), where she has been sampling gelatinous marine life in the [Great Pacific Garbage Patch](https://en.wikipedia.org/wiki/Great_Pacific_garbage_patch). She has 1520 samples that she’s run through an assay machine to measure the relative abundance of 300 proteins. She needs to run these 1520 files through an imaginary program called **`goostats`** she inherited. On top of this huge task, she has to write up results by the end of the month so her paper can appear in a special issue of Aquatic Goo Letters.

The bad news is that if she has to run **`goostats`** by hand using a GUI, she’ll have to select and open a file 1520 times. If **`goostats`** takes 30 seconds to run each file, the whole process will take more than 12 hours of Nelle’s attention. With the shell, Nelle can instead assign her computer this mundane task while she focuses her attention on writing her paper.
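The estimate above can be checked with shell arithmetic. This is a minimal sketch; the variable names are chosen for illustration only.

```shell
# Figures taken from the text: 1520 files, 30 seconds per file.
samples=1520
seconds_per_file=30

# Integer arithmetic with $(( ... )).
total_seconds=$((samples * seconds_per_file))
hours=$((total_seconds / 3600))
minutes=$(( (total_seconds % 3600) / 60 ))

echo "${total_seconds} seconds = ${hours} h ${minutes} min"
# prints: 45600 seconds = 12 h 40 min
```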

The next few lessons will explore the ways Nelle can achieve this. More specifically, they explain how she can use a command shell to run the **`goostats`** program, using loops to automate the repetitive steps of entering file names, so that her computer can work while she writes her paper.

As a bonus, once she has put a processing pipeline together, she will be able to use it again whenever she collects more data.
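To give a flavour of what such a loop might look like, here is a minimal sketch. Because **`goostats`** is imaginary, the example defines a stand-in shell function with the same name that only reports the file it was given; the file names and the scratch directory are also hypothetical. Loops are covered properly in a later episode.

```shell
# Stand-in for the imaginary goostats program (hypothetical behaviour):
# it only reports which file it was asked to process.
goostats() {
    echo "processing $1"
}

# Create a scratch directory with three empty example files (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/sample1.txt" "$workdir/sample2.txt" "$workdir/sample3.txt"

# Run the command once for every matching file.
count=0
for datafile in "$workdir"/*.txt
do
    goostats "$datafile"
    count=$((count + 1))
done

echo "processed $count files"

# Clean up the scratch directory.
rm -r "$workdir"
```

The same loop would work unchanged for 3 files or 1520 files, which is what makes the approach attractive for Nelle.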

## This lesson is in Jupyter Notebooks
<hr style="border: solid 1px gray; margin-top: 1.5% ">

This lesson is based on the materials by [Software Carpentry](https://kmichali.github.io/SC-shell-novice/).

They are an excellent resource for self-study but do not contain interactive examples.  I recreated a streamlined version in Jupyter Notebooks in an attempt to provide an environment where students can follow in-class instruction, execute commands and practice in one place.

Jupyter Notebooks include non-interactive text and interactive cells (in light grey with "In[  ]" to the left). **To execute a cell, place the mouse cursor in it and press Shift+Enter.**

Below is an executable cell containing the Linux command **`ls`**.  Execute it with Shift+Enter to test your notebook. If all is well, you should see a list of files and directories after you execute the cell.

In [1]:
ls

 Volume in drive D is Data
 Volume Serial Number is B87B-0116

 Directory of d:\OneDrive - Imperial College London\ykOD\icStorage\IC\IcInterns\231120 The Linux Command Line for Scientific Computing\RCDS-comm-line\notebooks

2023/11/19  23:32    <DIR>          .
2023/11/19  23:32    <DIR>          ..
2023/11/19  21:36            10,824 01_introduction_and_setup.ipynb
2023/11/19  21:36            14,627 02_navigating_files_directories.ipynb
2023/11/19  21:36            19,052 03_working_files_directories.ipynb
2023/11/19  21:36            15,513 04_pipes_filters.ipynb
2023/11/19  21:36            13,966 05_for_loops.ipynb
2023/11/19  21:36            21,290 06_shell_scripts.ipynb
2023/11/19  21:36            20,365 07_finding_things.ipynb
2023/11/19  21:36            25,943 08_Nelles_script.ipynb
2023/11/19  23:32    <DIR>          data-shell
               8 File(s)        141,580 bytes
               3 Dir(s)  194,841,853,952 bytes free


## Notebook or the shell?
<hr style="border: solid 1px gray; margin-top: 1.5% ">

Jupyter Notebooks are great to start with because they combine interactive elements and instructions in one place.  However, most of your work will eventually be done in a "real" shell environment that contains only commands.  In this lesson, we start with Jupyter Notebooks and move to the shell for the later parts of the course.


### Magic command
Jupyter Notebooks can execute shell commands and instructions in various programming languages, but a default notebook cell expects its content to be Python. One can instruct the notebook to use bash instead by typing **`%%bash`** on the first line of the cell.  This is a so-called "magic" command and is not used outside Jupyter Notebooks.  With this magic command, each notebook cell becomes a single instance of the shell.  I use the magic command **`%%bash`** in all cells except those that contain only a **`cd`** (change working directory) command.

For those of you who are familiar with Jupyter and interested in a more elegant solution, one can install a [custom bash kernel](https://github.com/takluyver/bash_kernel/tree/master/bash_kernel). This means that each cell will run bash by default (and one can skip the magic command altogether).


### Cell order matters 
Finally, **the notebook cells have to be executed from top to bottom**.  Executing them out of order may produce errors because some cells depend on the output of the cells above.

<hr style="border: solid 1px red; margin-top: 1.5% ">

## Key points
- A shell is a program whose primary purpose is to read commands and run other programs.
- The shell’s main advantages are its high action-to-keystroke ratio, its support for automating repetitive tasks, and its capacity to access networked machines.
- The shell’s main disadvantages are its primarily textual nature and how cryptic its commands and operation can be.
- For the first part of this lesson, we will emulate the shell using Jupyter Notebooks.
