# Unix/Linux, Shell, and Git

![tar](fig/tar.png)

## Introduction to Operating Systems

An operating system (OS) is the software layer that connects the
computer hardware to users and applications (and now AI agents).
Instead of writing instructions that directly manipulate processors,
memory chips, or disk drives, we interact with the OS, which manages
these resources for us.

### The Structure of an Operating System

![Kernel, Shell, and Applications](fig/Unix.png)

An OS typically consists of three main parts:

* Kernel:
  the core component.
  It directly manages hardware (CPU, memory, devices) and enforces
  rules for resource sharing.
* System Programs and Applications:
  provide services built on top of the kernel, such as file utilities,
  compilers, or networking tools.
* Shell and User Interface:
  the layer through which users interact with the OS.
  This can be:
  * Command-line shells (e.g., `bash`, `zsh`), where users type commands, or
  * Graphical interfaces (e.g., desktops, windows, icons).

In this lab, we will focus on the shell, because computational
astrophysicists often work on large remote systems (HPC clusters and
the cloud) where the command line is the most efficient and sometimes
the only available interface.

### Common Features of Operating Systems

Despite differences, most operating systems share these
responsibilities:
* Process Management:
  starting, stopping, and scheduling programs.
* Memory Management:
  allocating, tracking, and protecting system memory.
* File Systems:
  organizing data into files and directories.
* Device Management:
  controlling access to hardware like disks and network cards.
* Security and Access Control:
  permissions, authentication, and isolation.
* User Interfaces:
  shells or graphical environments for interaction.

### Unix

![Ken Thompson and Dennis Ritchie](fig/ken+dmr.png)

Unix, developed at Bell Labs in the 1960s-70s by
[Ken Thompson](https://en.wikipedia.org/wiki/Ken_Thompson) and
[Dennis Ritchie](https://en.wikipedia.org/wiki/Dennis_Ritchie),
set the standard for many OS design principles:
* A multi-user, multi-tasking architecture.
* A hierarchical file system.
* "Everything is a file" (even devices).
* Small, composable programs connected via pipes.

### Linux

![Linus Torvalds](fig/Torvalds.png)

Linux is a Unix-like operating system (technically only the
[kernel](https://github.com/torvalds/linux)) created by
[Linus Torvalds](https://en.wikipedia.org/wiki/Linus_Torvalds)
in 1991.
Unlike traditional Unix systems, it was built independently.
Its open-source license
([GPLv2](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html))
lets anyone study, modify, and redistribute the code.

Here is the original humble email that changed the world:
```
Hello everybody out there using minix -

I'm doing a (free) operating system (just a hobby, won't be big and
professional like gnu) for 386(486) AT clones.  This has been brewing
since april, and is starting to get ready.  I'd like any feedback on
things people like/dislike in minix, as my OS resembles it somewhat
(same physical layout of the file-system (due to practical reasons)
among other things).

I've currently ported bash(1.08) and gcc(1.40), and things seem to work.
This implies that I'll get something practical within a few months, and
I'd like to know what features most people would want.  Any suggestions
are welcome, but I won't promise I'll implement them :-)

              Linus (torvalds@kruuna.helsinki.fi)

PS.  Yes - it's free of any minix code, and it has a multi-threaded fs.
It is NOT protable (uses 386 task switching etc), and it probably never
will support anything other than AT-harddisks, as that's all I have :-(.
```

### Unix Philosophy

![The Art of Unix Programming](fig/taoup.png)

The power of Unix and Linux comes not just from their technical
features, but from a
[design philosophy](http://www.catb.org/~esr/writings/taoup/html/).
Some of the guiding principles are:
* Do one thing well.
  Each program should have a single, focused purpose.
  To solve a new problem, build a new tool rather than
  overcomplicating an old one.
* Build programs to work together.
  The output of one program should serve as the input to another.
  This encourages simple text-based interfaces and avoids unnecessary
  formatting.
* Prototype early and refine.
  Software should be tested quickly, with the freedom to discard
  clumsy parts and rebuild better versions.
* Rely on tools, not manual effort.
  Create reusable tools to simplify tasks, even if they are only
  needed temporarily.

Another core idea is that "everything is a file".
As a result, devices, processes, and data can all be accessed through
a unified file interface.
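
As a small illustration of this unified interface, the sketch below uses ordinary file tools on two device files.
It assumes that `/dev/null` and `/dev/urandom` exist, which is the case on Linux and macOS; the variable name `nbytes` is chosen only for this example.

```shell
# Writing to the "null" device works like writing to a file,
# except that the data is simply discarded.
echo "this text disappears" > /dev/null

# Reading from the random device works like reading a regular file:
# take the first 16 bytes and count them.
# (tr strips the padding that some versions of wc add.)
nbytes=$(head -c 16 /dev/urandom | wc -c | tr -d ' ')
echo "read $nbytes bytes from /dev/urandom"   # prints: read 16 bytes from /dev/urandom
```

Neither `echo`, `head`, nor `wc` knows anything about devices; they only see files.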

Because of these simple yet powerful design choices, Unix and Linux
are extremely flexible and extensible.

Unix evolved into a broad family of operating systems, including the
[BSDs](https://en.wikipedia.org/wiki/Berkeley_Software_Distribution)
(FreeBSD, OpenBSD, NetBSD), Solaris, and eventually
[NeXTSTEP](https://en.wikipedia.org/wiki/NeXTSTEP),
which became macOS (Mac OS X).
Linux, meanwhile, has grown into an ecosystem with countless
[distributions](https://en.wikipedia.org/wiki/List_of_Linux_distributions).

Today, Linux has surpassed both traditional Unix and Windows in many
domains and become the #1 OS for the internet and scientific
computing:
* Runs directly on bare-metal servers in data centers and on virtual
  machines in the cloud.
* Powers the [fastest supercomputers](https://top500.org/) in the
  world.
* Serves as the backbone of scientific computing, HPC, machine
  learning, and AI.
* Provides the kernel for Android smartphones, used by billions of
  people worldwide.

![Unix Family Tree](fig/Unix_history-simple.svg)

### Shells and Terminals

The terms "shell" and "terminal" are often used interchangeably today,
but they actually refer to different parts of the system:
* Terminal (or terminal emulator): A text-based interface that lets
  you interact with the operating system.
  On modern computers this is usually a software application (e.g.,
  Terminal on macOS, GNOME Terminal on Linux).
* Shell: A program that runs inside the terminal.
  It interprets the commands you type, sends them to the operating
  system, and prints the results back.
  Examples include `sh`, `bash`, and `zsh`.

In [None]:
# HANDSON: find out what OS you are running.
#
# Method 1: on Mac or Linux, open a terminal, type `uname -a`.
#
# Method 2: on Windows, make sure that Windows Subsystem for Linux
#           (WSL) is enabled, run the "Linux GUI apps", then type
#           `uname -a`.
#
# Method 3: "shell out" a single line in Jupyter notebook by adding a
#           "!" before your command in a Jupyter cell, i.e.,

! uname -a

In [None]:
# Method 4: "shell out" a whole cell in Jupyter notebook by adding
#           `%%bash` at the beginning of a Jupyter cell, i.e.,

In [None]:
%%bash

uname -a

Here are some commands every Unix/Linux user should know:

* Shells:
  * `sh`, `bash`, or `zsh`: start a new shell session.

* Navigation:
  * `pwd`: Print the current working directory.
  * `ls`: List directory contents.
    * `ls -l` long format (permissions, owner, size, date).
    * `ls -a` show hidden files (those starting with `.`).
  * `cd DIR`: change into the directory `DIR`; `cd ~` (or plain `cd`) returns to your home directory.

* Basic File Management:
  * `touch newfile.txt`: create an empty file or update its timestamp.
  * `mv newfile.txt newname.txt`: move or rename files.
  * `cp newname.txt newname2.txt`: copy files.
  * `rm newname.txt newname2.txt`: remove (delete) files.

* Printing Output:
  * `echo hello world`: print a message to the screen.
  * `printf "hello world x %d\n" 3`: formatted print using C-style format strings.

* Viewing Files:
  * `cat file.txt`: print the file contents to the screen.
  * `head file.txt`: show the first 10 lines of a file.
  * `tail file.txt`: show the last 10 lines of a file.

Many of these commands deal with the file system, which is
intentional: in Unix/Linux, "everything is a file".
Hence, regular files, directories, devices, and even some processes
are all accessed using the same interface.
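
A short session that strings several of these commands together might look like the sketch below.
It works inside a scratch directory created with `mktemp -d` (an assumption, so that no real files are touched); the file names and the variable `workdir` are hypothetical.

```shell
# Create a fresh, empty scratch directory and move into it.
workdir=$(mktemp -d)
cd "$workdir"
pwd                       # print the directory we are now in

touch notes.txt           # create an empty file
mv notes.txt draft.txt    # rename it
cp draft.txt backup.txt   # make a copy
ls -l                     # long listing shows backup.txt and draft.txt

rm draft.txt backup.txt   # delete both files
ls -a                     # only the hidden entries . and .. remain
```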

In [None]:
%%bash

# HANDSON: try out some of the above commands
#
# Specifically, try out both `touch` and `ls -l` to verify that
# `touch` does update the timestamp of a file.


In [None]:
%%bash

# HANDSON: on Linux, what "files" are available inside `/proc`?
# What do you get if you `cat` these files?


If you want to learn more about a command, you can usually:
* Run `man COMMAND` to open its manual page, which provides detailed
  documentation.
* Or use `COMMAND -h` (or sometimes `COMMAND --help`) to see a short
  help message with common options.

The shell can automatically expand patterns into lists of files or
strings, saving you from typing them out manually.

* Wildcards, Globbing, and Brace Expansion:
  * Wildcards (`*`, `?`, `[ ]`): match files by name patterns.
  * Brace expansion (`{ }`): expand a sequence or set of strings.
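
The sketch below shows both mechanisms side by side.
It assumes `bash` (or `zsh`), since brace expansion is not available in every plain `sh`; the scratch directory from `mktemp -d` and all file names are made up for illustration.

```shell
# Work in a scratch directory so the patterns only see our own files.
demo=$(mktemp -d)
cd "$demo"

# Brace expansion generates strings before the command runs:
# this creates file1.dat, file2.dat, file3.dat, and file10.dat.
touch file{1..3}.dat file10.dat
# A comma-separated set expands to each listed string.
touch {notes,todo}.txt

# Wildcards are matched against files that actually exist.
echo *.txt              # prints: notes.txt todo.txt
echo file?.dat          # prints: file1.dat file2.dat file3.dat
echo file[1-2].dat      # prints: file1.dat file2.dat

# Brace expansion does not need matching files at all.
echo run_{a,b}_{1..2}   # prints: run_a_1 run_a_2 run_b_1 run_b_2
```

Using `echo` instead of `ls` is a convenient way to preview what a pattern expands to before handing it to a command such as `rm`.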

In [None]:
%%bash

# HANDSON: try out at least the following
#
# ls *.txt          # List all files ending in .txt
# ls file?.dat      # Matches file1.dat, file2.dat ... but not file10.dat
# ls file[1-3].txt  # Matches file1.txt, file2.txt, file3.txt


Unix programs are designed to work together.
The shell provides simple mechanisms to connect these small tools into
powerful workflows.

* Redirection and piping operators:
  * Redirect output `>`: send the output of a program into a file
                         (overwrites the file if it exists).
  * Append output `>>`: just like `>`, but adds to the end of the file
                        instead of overwriting.
  * Pipes `|`: connect programs directly, using the output of one as the
               input to another.
               This idea was proposed by
               [Doug McIlroy](https://en.wikipedia.org/wiki/Douglas_McIlroy)
               at Bell Labs and added to Unix in 1973, and it became one
               of the defining features of Unix.
  * Command substitution (``` `cmd` ``` or `$(cmd)`): use the output
                                                      of one command
                                                      as part of
                                                      another command.
* Filters:
  * `grep` search text: finds lines matching a pattern.
  * `sed` stream editor: performs simple transformations on text
    streams.
  * `awk` pattern scanning and processing: a small programming
    language for structured text.
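
The operators and filters above can be combined as in the following sketch.
The log contents and the variable `log` are invented for this example, and the scratch file comes from `mktemp` so that nothing important is overwritten.

```shell
# Scratch file to hold a tiny, made-up simulation log.
log=$(mktemp)

# `>` creates or overwrites the file; `>>` appends to it.
echo "run1 ok 0.52"     >  "$log"
echo "run2 failed 0.00" >> "$log"
echo "run3 ok 0.47"     >> "$log"

# grep keeps only the lines that match a pattern.
grep ok "$log"                          # prints the run1 and run3 lines

# A pipe feeds grep's output into awk, which prints the first column.
grep ok "$log" | awk '{ print $1 }'     # prints run1 and run3, one per line

# sed rewrites text as it streams past; the file itself is unchanged.
sed 's/failed/FAILED/' "$log"           # run2 line now reads "run2 FAILED 0.00"

# Command substitution inserts a command's output into another command.
echo "successful runs: $(grep -c ok "$log")"   # prints: successful runs: 2
```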

In [None]:
%%bash

# HANDSON: try out at least the following
#
# ls / > ~/list
# cat /proc/cpuinfo | grep ^processor
# echo "Today is $(date)"


* Variables and String Manipulation
  * Assigning variables
    ```
    NAME="Alice"
    echo $NAME
    ```
  * Command substitution inside variables
    ```
    DATE=$(date)
    echo "Today is $DATE"
    ```
  * Environment variables: special variables used by the system and programs
    ```
    echo $HOME
    echo $PATH
    ```
  * String operations: bash supports simple string manipulations
    ```
    FILE="astr501.txt"
    echo ${FILE%.txt}   # prints astr501  (remove suffix)
    echo ${FILE#astr}   # prints 501.txt  (remove prefix)
    ```

* Control structures in the Shell:
  The shell is not only an interface for running commands, but also a
  scripting language.
  The most common control structures are for conditions and loops.
  * Basic `if ... then ... elif ... then ... else`
    ```
    x=15

    if [ $x -lt 10 ]; then
        echo "x is less than 10"
    elif [ $x -lt 20 ]; then
        echo "x is between 10 and 19"
    else
        echo "x is 20 or more"
    fi
    ```
  * `for` loops:
    run a command repeatedly over a list of items.
    ```
    for i in {1..5}; do
        echo "Run $i"
    done
    ```
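
Variables, string operations, conditions, and loops are usually used together.
The sketch below loops over a few hypothetical file names, strips the suffix and prefix, and counts the data files; `$(( ))`, which was not covered above, performs integer arithmetic.

```shell
count=0

# The file names are made up; no files need to exist for this to run.
for f in snap_a.dat snap_b.dat notes.txt; do
    base=${f%.dat}                  # remove the .dat suffix, if present
    if [ "$base" != "$f" ]; then    # the name changed, so it ended in .dat
        count=$((count + 1))
        echo "data file: ${base#snap_}"   # prints a, then b
    else
        echo "skipping $f"          # prints: skipping notes.txt
    fi
done

echo "found $count data files"      # prints: found 2 data files
```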

In [None]:
%%bash

# HANDSON: Using the commands we just learned, do the following:
#
# 1. Create files 1.txt, 2.txt, ..., 100.txt.
#
# 2. Rename them to 001.txt, 002.txt, ..., 100.txt.
#
# 3. Rename them to sim001.txt, sim002.txt, ..., sim100.txt.


### Text Editors

![Editors](fig/real_programmers.png)

To work effectively on Unix/Linux systems, you need a text editor to
create and modify files such as code, configuration files, or
scripts within a terminal.
The three most common editors you will encounter are `nano`, `vim`, and
`emacs`.
* `nano`: Simple and Beginner-Friendly
  * Command: `nano file.txt`
  * Easy to learn: commands are listed at the bottom of the screen.
  * Use `Ctrl+O` to save, `Ctrl+X` to exit.
  * Great for quick edits or when you are just starting out.
* `vim`: Powerful but Minimal
  * Command: `vim file.txt`
  * Modal editor:
    * Normal mode: default, used for navigation, editing commands.
    * Insert mode: typing text, entered by pressing `i`.
    * Visual mode: select characters, whole lines, or rectangular blocks; entered by pressing `v`, `V`, or `Ctrl-v`, respectively.
    * Command mode: colon commands, e.g., `:w` to save, `:q` to quit.
  * Famous learning curve ![Exit `vim`](fig/exit_vim.png)
  * Comes preinstalled on almost every Linux system.
* `emacs`: Extensible and Feature-Rich
  * Command: `emacs -nw file.txt`
  * Full-featured editor that is also an environment.
  * Key commands: `Ctrl+X Ctrl+S` to save, `Ctrl+X Ctrl+C` to quit.
  * Highly customizable with its own programming language (Emacs Lisp).