This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://github.com/donnemartin/data-science-ipython-notebooks).

# Linux Commands

* Disk Usage
* Splitting Files
* Grep, Sed
* Compression
* Curl
* View Running Processes
* Terminal Syntax Highlighting
* Vim

## Disk Usage

Display human-readable (-h) free disk space:

In [1]:
!df -h

Filesystem      Size  Used Avail Use% Mounted on
none            734G   20G  684G   3% /
tmpfs            32G     0   32G   0% /dev
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/md126p1    734G   20G  684G   3% /etc/hosts
shm              64M     0   64M   0% /dev/shm


Display human-readable (-h) disk usage statistics:

In [2]:
!du -h ./

16K	./.ipynb_checkpoints
112K	./featured/pandas-cookbook/cookbook/images
1.8M	./featured/pandas-cookbook/cookbook
54M	./featured/pandas-cookbook/data
44K	./featured/pandas-cookbook/.git/hooks
4.0K	./featured/pandas-cookbook/.git/branches
8.0K	./featured/pandas-cookbook/.git/logs/refs/heads
8.0K	./featured/pandas-cookbook/.git/logs/refs/remotes/origin
12K	./featured/pandas-cookbook/.git/logs/refs/remotes
24K	./featured/pandas-cookbook/.git/logs/refs
32K	./featured/pandas-cookbook/.git/logs
4.0K	./featured/pandas-cookbook/.git/refs/tags
8.0K	./featured/pandas-cookbook/.git/refs/heads
8.0K	./featured/pandas-cookbook/.git/refs/remotes/origin
12K	./featured/pandas-cookbook/.git/refs/remotes
28K	./featured/pandas-cookbook/.git/refs
8.0K	./featured/pandas-cookbook/.git/info
8.0K	./featured/pandas-cookbook/.git/objects/fb
56K	./featured/pandas-cookbook/.git/objects/a6
8.0K	./featured/pandas-cookbook/.git/objects/5a
8.0K	./featured/pandas-cookbook/.git/objects/76
20K	./feat

Display human-readable (-h) disk usage statistics, showing only the total usage (-s):

In [3]:
!du -sh ../

1.0G	../


Display human-readable (-h) disk usage statistics, summarizing each argument (-s) and adding a grand total across all arguments (-c):

In [4]:
!du -csh ./

234M	./
234M	total
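To see which subdirectories take the most space, the per-directory summaries from du can be piped through sort. This is a minimal sketch, assuming GNU coreutils (sort -h understands human-readable sizes); the scratch directory and the names small and large are made up for illustration. In a notebook cell, prefix each command with ! or use a %%bash cell.

```shell
# Build a scratch tree with two subdirectories of different sizes
# (hypothetical names, for illustration only).
workdir=$(mktemp -d)
mkdir -p "$workdir/small" "$workdir/large"
head -c 4096 /dev/urandom > "$workdir/small/a.bin"
head -c 1048576 /dev/urandom > "$workdir/large/b.bin"

# Summarize each subdirectory (-s), human-readable (-h), largest first.
du -sh "$workdir"/* | sort -rh

# Keep only the path of the largest entry (du separates size and path with a tab).
largest=$(du -sh "$workdir"/* | sort -rh | head -n 1 | cut -f2)
echo "$largest"
```

Replacing head -n 1 with head -n 10 gives a quick "top ten" view of a directory.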


## Splitting Files

Count number of lines in a file with wc:

In [18]:
!wc -l < file.txt

100


Count the number of non-empty lines in a file with grep (the pattern "." matches any line that contains at least one character):

In [19]:
!grep -c "." file.txt

100
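The two counts agree only when the file has no empty lines. The sketch below uses a throwaway three-line sample (an assumption for illustration) to show where wc -l and grep -c diverge.

```shell
# Three lines, one of them empty (sample data for illustration).
sample=$(mktemp)
printf 'alpha\n\nbeta\n' > "$sample"

all_lines=$(wc -l < "$sample")           # counts newline characters
nonempty_lines=$(grep -c "." "$sample")  # counts lines with at least one character
every_line=$(grep -c "" "$sample")       # the empty pattern matches every line

echo "wc -l: $all_lines, grep -c '.': $nonempty_lines, grep -c '': $every_line"
```

Here wc -l and grep -c "" both report 3, while grep -c "." reports 2 because it skips the empty line.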


Split a file into multiple files based on line count:

In [20]:
!split -l 20 file.txt new

Split a file into multiple files based on line count, use suffix of length 1:

In [None]:
!split -l 802 -a 1 file.csv dir/part-user-csv.tbl-
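A quick way to check a split is to count the pieces and concatenate them back together. The following is a sketch under stated assumptions: file.txt is generated with seq in a scratch directory rather than taken from the notebook, and the prefix new matches the earlier example.

```shell
workdir=$(mktemp -d)
seq 1 100 > "$workdir/file.txt"

# 100 lines at 20 lines per piece gives 5 pieces: newaa ... newae.
split -l 20 "$workdir/file.txt" "$workdir/new"
part_count=$(ls "$workdir"/new* | wc -l)

# Concatenating the pieces in glob (alphabetical) order restores the original.
if cat "$workdir"/new* | cmp -s - "$workdir/file.txt"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "$part_count pieces, round trip: $roundtrip"
```

The default two-letter suffixes sort alphabetically, which is why a plain glob reassembles the pieces in the right order.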

## Grep, Sed

List files matching ".txt", then count them:

In [22]:
!ls -1 | grep .txt

file.txt


In [21]:
!ls -1 | grep .txt | wc -l

1
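Note that grep treats the unescaped dot in .txt as "any character", so the pattern can match more names than intended. A small sketch with hypothetical file names (fed on stdin instead of a real directory listing) shows the difference between the loose pattern and an anchored, escaped one.

```shell
# Hypothetical file names, one per line.
names=$(printf 'file.txt\nnotes_txt\nfile.txt.bak\n')

loose=$(printf '%s\n' "$names" | grep -c '.txt')      # "." matches any character
strict=$(printf '%s\n' "$names" | grep -c '\.txt$')   # literal dot, anchored at end of name

echo "loose: $loose, strict: $strict"
```

The loose pattern matches all three names, while the strict one matches only file.txt.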


Check number of MapReduce records processed, outputting the results to the terminal:

In [10]:
!cat folder/part* | grep -c "foo"


Delete matching lines in place:

In [None]:
!sed -i '/Important Lines: /d' original_file
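Because an in-place delete cannot be undone, it can be safer to keep a backup. The sketch below assumes a sed that accepts a suffix attached to -i (both GNU and BSD sed do); the file contents are invented for illustration.

```shell
workdir=$(mktemp -d)
printf 'keep 1\nImportant Lines: 42\nkeep 2\n' > "$workdir/original_file"

# -i.bak edits in place and saves the untouched file as original_file.bak.
sed -i.bak '/Important Lines: /d' "$workdir/original_file"

remaining=$(wc -l < "$workdir/original_file")
backed_up=$(wc -l < "$workdir/original_file.bak")
echo "remaining: $remaining, in backup: $backed_up"
```

Once the result looks right, the .bak file can be deleted.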

## Compression

In [None]:
# Compress zip
!zip -r archive_name.zip folder_to_compress

# Compress zip without invisible Mac resources
!zip -r -X archive_name.zip folder_to_compress

# Extract zip
!unzip archive_name.zip

# Compress TAR.GZ
!tar -zcvf archive_name.tar.gz folder_to_compress

# Extract TAR.GZ
!tar -zxvf archive_name.tar.gz

# Compress TAR.BZ2
!tar -jcvf archive_name.tar.bz2 folder_to_compress

# Extract TAR.BZ2
!tar -jxvf archive_name.tar.bz2

# Extract GZ
!gunzip archive_name.gz

# Uncompress all tar.gz in current directory to another directory
!for i in *.tar.gz; do echo "working on $i"; tar xvzf "$i" -C directory/ ; done
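The tar commands above can be combined into a round trip that lists an archive before extracting it and then verifies the result. This is a minimal sketch in a scratch directory; the restored directory name and the sample file a.txt are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/folder_to_compress" "$workdir/restored"
echo "hello" > "$workdir/folder_to_compress/a.txt"

# -C changes directory first, so the archive stores relative paths.
tar -zcf "$workdir/archive_name.tar.gz" -C "$workdir" folder_to_compress

# -t lists the archive contents without extracting anything.
listing=$(tar -ztf "$workdir/archive_name.tar.gz")
echo "$listing"

# Extract into a separate directory and compare with the original tree.
tar -zxf "$workdir/archive_name.tar.gz" -C "$workdir/restored"
if diff -r "$workdir/folder_to_compress" "$workdir/restored/folder_to_compress"; then
    verified=yes
else
    verified=no
fi
echo "verified: $verified"
```

Listing with -t first is a cheap way to confirm where files will land before running an extract.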

## Curl

In [None]:
# Display the curl output:
!curl donnemartin.com

# Save the curl output to a file using shell redirection:
!curl donnemartin.com > donnemartin.html

# Save the output to a file with a chosen name -o
!curl -o image.png http://i1.wp.com/donnemartin.com/wp-content/uploads/2015/02/splunk_cover.png

# Save the output to a file, keeping the original file name -O
!curl -O http://i1.wp.com/donnemartin.com/wp-content/uploads/2015/02/splunk_cover.png

# Download multiple files, attempting to reuse the same connection
!curl -O url1 -O url2

# Follow redirects -L
!curl -L url

# Resume a previous download -C -
!curl -C - -O url

# Authenticate -u
!curl -u username:password url

## View Running Processes

In [26]:
# Display sorted info about processes
!top

top - 03:54:38 up 21 days, 12:00,  0 users,  load average: 6.21, 6.37, 6.48
Tasks:   7 total,   1 running,   6 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.9 us,  0.5 sy,  0.0 ni, 97.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  65861244 total, 24436864 used, 41424380 free,  1443728 buffers
KiB Swap:        0 total,        0 used,        0 free.  2994604 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   13 jovyan    20   0  628912  31560   6240 S   6.6  0.0   0:03.84 python

In [27]:
# Display all running processes
!ps aux

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4216   348 ?        Ss   02:48   0:00 tini -- /bin/sh
root         5  0.0  0.0   4328   652 ?        S    02:48   0:00 /bin/sh -c star
root         6  0.0  0.0  46352  1600 ?        S    02:48   0:00 su jovyan -c en
jovyan       8  0.0  0.1 354916 67016 ?        Ssl  02:48   0:02 /opt/conda/bin/
jovyan      13  0.3  0.0 628912 31560 ?        Ssl  03:35   0:03 /opt/conda/envs
jovyan     126  0.0  0.0   4328   632 pts/0    Ss+  03:55   0:00 /bin/sh -c ps a
jovyan     127  0.0  0.0  19092  1308 pts/0    R+   03:55   0:00 ps aux


In [28]:
# Display all matching running processes with full formatting
!ps -ef | grep python

jovyan       8     6  0 02:48 ?        00:00:02 /opt/conda/bin/python /opt/conda/bin/jupyter-notebook --NotebookApp.base_url=user/7ocldL5AjDdO --NotebookApp.allow_origin=* --port=8888
jovyan      13     8  0 03:35 ?        00:00:03 /opt/conda/envs/python2/bin/python -m ipykernel -f /home/jovyan/.local/share/jupyter/runtime/kernel-fe767d89-a7b3-45c4-bb57-ce8f234554da.json
jovyan     128    13  0 03:56 pts/0    00:00:00 /bin/sh -c ps -ef | grep python
jovyan     130   128  0 03:56 pts/0    00:00:00 grep python


In [34]:
# See processes run by user jovyan
!ps -u jovyan

  PID TTY          TIME CMD
    8 ?        00:00:02 jupyter-noteboo
   13 ?        00:00:03 python
  140 pts/0    00:00:00 sh
  141 pts/0    00:00:00 ps


In [35]:
# Display running processes as a tree
!pstree

/bin/sh: 1: pstree: not found


## Terminal Syntax Highlighting

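Check the current values of the relevant variables, then add the export and alias lines below (without the leading !) to your ~/.bash_profile: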

In [38]:
!echo $CLICOLOR

1


In [39]:
!echo $LSCOLORS




In [40]:
!echo $PS1

$


In [41]:
!export PS1='\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\W\[\033[00m\]\$ '

In [None]:
!export CLICOLOR=1

In [42]:
!export LSCOLORS=ExFxBxDxCxegedabagacad

In [43]:
!alias ls='ls -GFh'

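Reload .bash_profile from a bash terminal (notebook ! commands run in a short-lived /bin/sh process, which here has no source builtin, and their environment changes do not persist):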

In [44]:
!source ~/.bash_profile

/bin/sh: 1: source: not found


In [46]:
!echo $PS1

$
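PS1 is unchanged above because each ! command runs in its own child shell, and a child cannot modify the environment of its parent. The sketch below demonstrates the same behavior in plain shell; DEMO_COLOR is a hypothetical variable used only for illustration.

```shell
unset DEMO_COLOR

# An export made inside a child shell disappears when the child exits.
sh -c 'export DEMO_COLOR=1'
after_child=${DEMO_COLOR:-unset}

# An export in the current shell is inherited by the children it starts.
export DEMO_COLOR=1
seen_by_child=$(sh -c 'echo "${DEMO_COLOR:-unset}"')

echo "after child export: $after_child, seen by child: $seen_by_child"
```

This is why the settings belong in ~/.bash_profile, which bash reads when a login shell starts.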


## Vim

In [None]:
Normal mode:  esc

Basic movement:  h, j, k, l
Word movement:  w, W, e, E, b, B

Go to matching parenthesis:  %
Go to start of the line:  0
Go to end of the line:  $

Find character:  f

Insert mode:  i
Append to line:  A

Delete character:  x
Delete command:  d
Delete line:  dd

Replace command:  r
Change command:  c

Undo:  u (U for all changes on a line)
Redo:  CTRL-R

Copy the current line:  yy
Paste the copied line:  p (P to paste above the cursor)

Quit without saving changes:  :q!
Write the current file and quit:  :wq

Run the following command to start the tutorial:

In [None]:
!vimtutor

Run the following commands to enable syntax colors: