# Linux

This section focuses on the options for using and configuring Linux.

## Bash

Bash is a shell commonly used in Linux-based operating systems. Mastering Bash allows you to combine results from Linux utilities and create your own, which is a straightforward way to optimize tasks. 
Find out more on the [specific page](linux/bash.ipynb) or in the [official manual](https://www.gnu.org/software/bash/manual/bash.html).

---

The following example fetches raw JSON with the `curl` command, pretty-prints it with the `jq` tool, and converts it to YAML with the `yq` tool. All of this takes just two Bash constructs: the pipe `|` and command substitution `$(...)`.

In [19]:
docker run -itd --name temp_http --rm -p 80:80 kennethreitz/httpbin

json_output=$(curl -s localhost:80/json)
echo "$json_output" | jq '.'
echo "============================================================"
echo "$json_output" | jq '.' | yq -P

docker stop temp_http

5d54c061747cf404cb1b6623515f3836d562367470208a9e008c4f7c7f0c2b75
{
  "slideshow": {
    "author": "Yours Truly",
    "date": "date of publication",
    "slides": [
      {
        "title": "Wake up to WonderWidgets!",
        "type": "all"
      },
      {
        "items": [
          "Why <em>WonderWidgets</em> are great",
          "Who <em>buys</em> WonderWidgets"
        ],
        "title": "Overview",
        "type": "all"
      }
    ],
    "title": "Sample Slide S

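The same two constructs can be tried without Docker or a network connection. The following is only a minimal sketch built on standard utilities (`printf`, `sort`, `wc`, `head`); the fruit names and variable names are made up for illustration.

```bash
# $(...) captures the output of a command in a variable,
# and | sends the output of one command to the input of the next.
sorted=$(printf 'pear\napple\nfig\n' | sort)

# Reuse the captured text as input for other utilities.
count=$(printf '%s\n' "$sorted" | wc -l)
first=$(printf '%s\n' "$sorted" | head -n 1)

echo "sorted items: $count, first: $first"
```
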
## Tiny commands

Some commands are so simple that they have only a handful of options. Giving each of them a separate section would not make sense, so they are all covered in this section.

Print the number of processing units available to the current process, which may be less than the number of online processors.

In [1]:
nproc

32


Print how long the system has been running.

In [2]:
uptime

 16:29:57 up  3:13,  1 user,  load average: 1,13, 1,39, 1,19


Show or set the system's hostname.

In [3]:
hostname

MBD843AE246AC7


Print working directory.

In [4]:
pwd

/home/f.kobak@maxbit.local/Documents/knowledge/other


Pause the execution flow for a certain amount of time. The following example shows that the second execution of `uptime` is delayed by `sleep` when compared to the first execution of `uptime`.

In [5]:
uptime
sleep 3
uptime

 16:29:59 up  3:13,  1 user,  load average: 1,04, 1,37, 1,18
 16:30:02 up  3:13,  1 user,  load average: 1,04, 1,37, 1,18
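
The length of the pause can also be measured directly. The following is a minimal sketch, assuming that `date +%s` (seconds since the epoch, as supported by GNU `date`) is available; the variable names are illustrative.

```bash
start=$(date +%s)   # whole seconds since the epoch, before the pause
sleep 1
end=$(date +%s)     # the same clock, after the pause

elapsed=$((end - start))
echo "paused for about ${elapsed} second(s)"
```

Because the clock is read in whole seconds, the reported value can be one second larger than the requested pause.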


## Variables

You can define variables in the current shell session using the syntax `<variable_name>=<value>`. To access a variable, use the syntax `$variable_name`. The shell substitutes the value into the command before running it.

One special kind of variable is widely used in practice: environment variables. **Environment variables** are variables inherited by child processes, including nested shells. The environment passed to any executed command includes the shell’s initial environment. Use the syntax `export <variable_name>=<value>` to set an environment variable.

Check:

- [Dedicated page](linux/variables.ipynb) on this website.
- [Environment](https://www.gnu.org/software/bash/manual/html_node/Environment.html) description in the Bash documentation.

---

The following code demonstrates defining a variable and substituting its value into the `echo` command.

In [1]:
MY_VAR=10
echo $MY_VAR

10


The following cell defines an environment variable and a regular variable, then attempts to access both from a nested `bash` shell:

In [6]:
export env_var="environment"
just_var="local"

bash -c "echo \"\$just_var \$env_var\""

 environment


As a result, only the value for the environment variable exists in the nested shell.
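
Inheritance works in one direction only: the nested shell receives a copy of the environment, so changes made there do not reach the parent. A small sketch of this behaviour (the variable name `GREETING` is made up for illustration):

```bash
export GREETING="hello"

# The child shell gets its own copy and changes only that copy.
child_value=$(bash -c 'GREETING="changed in child"; echo "$GREETING"')

echo "child:  $child_value"
echo "parent: $GREETING"
```
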

## List contents (ls)

The `ls` command lists the files and directories in a folder.

The following Python code creates some random files and folders so we can see what they look like in the output of the `ls` command.

In [6]:
import os
import random
import string

def random_name(length=8):
    'Function to create a random file/directory name'
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for i in range(length))

experimental_path = "linux_files/ls"
os.mkdir(experimental_path)

for i in range(10):
    new_path = experimental_path + "/" + random_name()
    if random.choice([True, False]):
        os.mkdir(new_path)
    else:
        with open(new_path, "w") as f:
            f.write("some content")

With plain `ls`, the names are listed in alphabetical order, sorted down the columns. But we can't tell which of them are directories and which are files, when they were last modified, which user owns them, and so on.

In [7]:
!ls linux_files/ls

clctvira  kmloyjbo  notogjkx  nyxxrzni	rxmzznlq
itklpuxw  mojhtfqq  nvqttwne  rgnmwfrt	zoiqrlne


When using the `ls -l` command, you receive additional information in a detailed, table-like format. The columns provide the following details:

- File type and permissions: the first character is `d` for a directory and `-` for a regular file.
- Number of hard links to the item.
- The user who owns the item and the Unix group it belongs to (third and fourth columns).
- Size of the item in bytes.
- Time at which the item was last modified.
- Name of the item.

In [8]:
!ls linux_files/ls -l

total 40
drwxr-xr-x 2 f.kobak@maxbit.local domain users@maxbit.local 4096 ліп 12 17:16 clctvira
drwxr-xr-x 2 f.kobak@maxbit.local domain users@maxbit.local 4096 ліп 12 17:16 itklpuxw
-rw-r--r-- 1 f.kobak@maxbit.local domain users@maxbit.local   12 ліп 12 17:16 kmloyjbo
drwxr-xr-x 2 f.kobak@maxbit.local domain users@maxbit.local 4096 ліп 12 17:16 mojhtfqq
drwxr-xr-x 2 f.kobak@maxbit.local domain users@maxbit.local 4096 ліп 12 17:16 notogjkx
drwxr-xr-x 2 f.kobak@maxbit.local domain users@maxbit.local 4096 ліп 12 17:16 nvqttwne
-rw-r--r-- 1 f.kobak@maxbit.local domain users@maxbit.local   12 ліп 12 17:16 nyxxrzni
-rw-r--r-- 1 f.kobak@maxbit.local domain users@maxbit.local   12 ліп 12 17:16 rgnmwfrt
-rw-r--r-- 1 f.kobak@maxbit.local domain users@maxbit.local   12 ліп 12 17:16 rxmzznlq
drwxr-xr-x 2 f.kobak@maxbit.local domain users@maxbit.local 4096 ліп 12 17:16 zoiqrlne


**Note:** To keep your system clean, don't forget to remove temporary directories and files.

In [9]:
!rm -r linux_files/ls
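
The first column of `ls -l` can also be used to tell directories and files apart in a script. The sketch below is self-contained and makes a few assumptions that are not part of the original example: it works in a temporary directory created by `mktemp -d`, uses the made-up names `some_dir` and `some_file`, and relies on `awk` to keep the first character of the permissions column together with the name.

```bash
workdir=$(mktemp -d)                 # temporary playground
mkdir "$workdir/some_dir"
echo "some content" > "$workdir/some_file"

# NR > 1 skips the "total" line; $1 is the permissions column and
# $NF is the last column (the name, assuming it contains no spaces).
listing=$(ls -l "$workdir" | awk 'NR > 1 {print substr($1, 1, 1), $NF}')
echo "$listing"

rm -r "$workdir"                     # clean up
```

Rows starting with `d` are directories and rows starting with `-` are regular files.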

## Find

The Linux `find` command searches for files in a directory tree. It has the following syntax: `find <directory-to-search> <criteria> <action>`, where:

- `<directory-to-search>`: Specifies the directory where you want to begin the search.
- `<criteria>`: Defines the properties of the files you are searching for. This can include the file name, size, modification date, permissions, and more.
- `<action>`: Specifies what to do with the found files. By default, it prints the path to the files, but it can also execute other commands on them.

The following Python code creates a random tree of folders and puts `text.txt` in a random place.

In [10]:
import os
import random
import string

def random_directory_name(length=8):
    'Function to create a random directory name'
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for i in range(length))

os.mkdir("linux_files/find")
folders = ["linux_files/find"]

for i in range(10):
    fold = random.choice(folders)
    new_dir = fold + "/" + random_directory_name()
    os.mkdir(new_dir)
    folders.append(new_dir)

with open(random.choice(folders) + "/" + "text.txt", "w") as f:
    f.write("Message to aliens")

As a result, we have the following file tree.

In [11]:
!tree linux_files/find

linux_files/find
└── gptiiiab
    ├── iipubngm
    ├── ngvixpsi
    ├── pluqbiln
    └── yfjphojg
        ├── bosqqrcn
        ├── fopmjtfu
        └── rctkvqsm
            └── kegxfokz
                ├── text.txt
                └── xsvcplwo

10 directories, 1 file


We can get the full path to `text.txt` by using `-name text.txt` as the criteria.

In [12]:
%%bash
find linux_files/find -name text.txt
rm -r linux_files/find

linux_files/find/gptiiiab/yfjphojg/rctkvqsm/kegxfokz/text.txt
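
Criteria can be combined, and the action can run a command on every match. The following is a minimal sketch under a few assumptions: it uses a temporary directory from `mktemp -d` and made-up file names, and it adds the standard `-type f` criterion and `-exec` action, which the example above does not use.

```bash
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
echo "Message to aliens" > "$workdir/a/b/text.txt"
echo "not a match" > "$workdir/a/notes.md"

# Criteria: regular files (-type f) whose name ends with .txt.
# Action: -exec runs cat on every match; {} stands for the found path.
content=$(find "$workdir" -type f -name '*.txt' -exec cat {} \;)
echo "$content"

rm -r "$workdir"
```
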


## Disk usage (du)

The `du` command is used to check disk usage by different paths in the filesystem. It provides information about how much space is being used by files and directories.

---

The following cell creates several folders and files. Notably, `linux/du_example/megabytes_file` is created with a size of exactly 2.5 MiB (2 MiB plus 512 KiB), whereas `linux/du_example/folder/small_file` contains only a single short line, making it an extremely small file.

In [22]:
mkdir linux/du_example
mkdir linux/du_example/folder

dd if=/dev/zero of=linux/du_example/megabytes_file bs=1M count=2 &> /dev/null
dd if=/dev/zero of=linux/du_example/megabytes_file bs=512K count=1 oflag=append conv=notrunc &> /dev/null

echo "this is short message" >> linux/du_example/folder/small_file

Now let's try the `du` command with the following options:

- `-a`: reports files as well as directories.
- `-h`: displays sizes in a human-readable format.

These options are really useful in my opinion.

In [26]:
du -ah linux/du_example/

2,5M	linux/du_example/megabytes_file
4,0K	linux/du_example/folder/small_file
8,0K	linux/du_example/folder
2,6M	linux/du_example/


Finally, don't forget to remove the folder that was used for the experiments.

In [27]:
rm -r linux/du_example
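
When only the total matters, `du` can print a single summary line. The sketch below assumes a temporary directory from `mktemp -d` and made-up file names; `-s` (summarize) and `-k` (sizes in 1024-byte units) are standard `du` options that are not covered above. The reported number depends on the filesystem, so no expected output is shown.

```bash
workdir=$(mktemp -d)
mkdir "$workdir/folder"
dd if=/dev/zero of="$workdir/big_file" bs=1024 count=1024 2> /dev/null
echo "this is short message" > "$workdir/folder/small_file"

# du separates the size and the path with a tab; cut -f1 keeps the size.
total_kb=$(du -sk "$workdir" | cut -f1)
echo "total: ${total_kb} KiB"

rm -r "$workdir"
```
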

## Process status (ps)

The `ps` (process status) command in Linux displays information about running processes. By default, it lists only the processes of the current user that are attached to the current terminal, as a table where each row represents a process and the columns include:

- `PID`: the unique process identifier.
- `TTY`: the terminal associated with the process.
- `TIME`: the cumulative CPU time used by the process.
- `CMD`: the command that initiated the process.

The following example shows the output of the `ps` command.

In [13]:
!ps

    PID TTY          TIME CMD
 213863 pts/2    00:00:00 ps


To print all processes in the system, use the `-e` option.

The following cell shows the result of `ps -e`. Only the first 10 rows are printed because the full list is long.

In [14]:
!ps -e | head -n 10

    PID TTY          TIME CMD
      1 ?        00:00:05 systemd
      2 ?        00:00:00 kthreadd
      3 ?        00:00:00 rcu_gp
      4 ?        00:00:00 rcu_par_gp
      5 ?        00:00:00 slub_flushwq
      6 ?        00:00:00 netns
     11 ?        00:00:00 mm_percpu_wq
     12 ?        00:00:00 rcu_tasks_kthread
     13 ?        00:00:00 rcu_tasks_rude_kthread


## System limits (ulimit)

`ulimit` is a shell builtin that lets you view and set resource limits for the current shell and the processes it starts. It provides an option for each type of limit, enabling you to control various aspects of system resource usage. To display all current limits, use the `-a` option.

In [15]:
!ulimit -a

real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 126734
max locked memory           (kbytes, -l) 4065160
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1048576
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 126734
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
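
A single limit can be read or changed by passing only its option. The following sketch lowers the soft limit for open files inside a command substitution, which runs in a subshell, so the current shell is not affected. The value `64` is arbitrary and is assumed to be below the current limit.

```bash
current=$(ulimit -n)    # soft limit on open files in this shell

# Command substitution runs in a subshell: the lowered limit lives only there.
lowered=$(ulimit -S -n 64; ulimit -S -n)

echo "inside subshell: $lowered"
echo "current shell:   $(ulimit -n)"
```
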


## Curl

`curl` is a command-line tool for transferring data using various network protocols. In Linux, it’s commonly used to interact with web servers.

Find out more on the [specific page](linux/curl.ipynb).

---

Here is an example of a request to the remote server using curl.

In [2]:
curl https://httpbin.org/anything

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Host": "httpbin.org", 
    "User-Agent": "curl/7.81.0", 
    "X-Amzn-Trace-Id": "Root=1-66a891dd-4956ac101aa26bcc365e22d5"
  }, 
  "json": null, 
  "method": "GET", 
  "origin": "212.98.168.102", 
  "url": "https://httpbin.org/anything"
}


## Heredoc (<< delimiting_identifier)

Using the `<<` symbol followed by a *delimiter identifier*, you can define a multiline string (here document) that will be passed as input to the chosen command. The block of text should be terminated by the same *delimiter identifier*.

In the following cell, the `cat` command is passed a multiline expression, which is then printed in the output — this is exactly what the `cat` command does.

In [9]:
cat << EOF
hello
my name 
is fedor
EOF

hello
my name 
is fedor


To understand better what exactly it does, here is another example with a different delimiting identifier and a different command. Here we use `grep` to find the line that contains `FIND ME`. The beginning and end of the document are marked by the `SSS` identifier.

In [8]:
grep "FIND ME" << SSS
this is some line
it's great FIND ME that
something strange in this line 
SSS

it's great FIND ME that
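
By default, the shell expands variables inside a here document; quoting the delimiting identifier turns the expansion off. A short sketch of the difference (the variable `name` is only an illustration):

```bash
name="fedor"

# Unquoted delimiter: $name is expanded before cat receives the text.
expanded=$(cat << EOF
my name is $name
EOF
)

# Quoted delimiter: the text is passed to cat literally.
literal=$(cat << 'EOF'
my name is $name
EOF
)

echo "$expanded"
echo "$literal"
```
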


## Cron

Cron is a utility that allows you to schedule commands in a Linux system. You can manage cron jobs using `crontab`, a special utility that handles the file defining jobs for cron.

To set up a cron job, type `crontab -e` to open the editor where you can define jobs using the following syntax:

```bash
minute hour day month weekday command
```

Each line represents a new cron job. 

Find out more at the [particular page](linux/crontab.ipynb).

---

The following cell shows the default cron schedule for Alpine Linux, allowing you to learn the typical format of a cron schedule.

In [2]:
docker run --rm -it alpine crontab -l

# do daily/weekly/monthly maintenance
# min	hour	day	month	weekday	command
*/15	*	*	*	*	run-parts /etc/periodic/15min
0	*	*	*	*	run-parts /etc/periodic/hourly
0	2	*	*	*	run-parts /etc/periodic/daily
0	3	*	*	6	run-parts /etc/periodic/weekly
0	5	1	*	*	run-parts /etc/periodic/monthly
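
To see how one schedule line breaks down into the fields described above, the shell's `read` builtin can split it. This is just a sketch: the line is copied from the Alpine schedule, and the variable names simply mirror the column names.

```bash
line='*/15 * * * * run-parts /etc/periodic/15min'

# read assigns one word per variable; the last variable takes the rest of the line.
read -r minute hour day month weekday command << EOF
$line
EOF

printf 'minute:  %s\n' "$minute"
printf 'weekday: %s\n' "$weekday"
printf 'command: %s\n' "$command"
```

Here `*/15` in the minute field means every 15 minutes, and `*` in the other fields means every hour, day, month, and weekday.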

