# UNIX Bash for Data Science
Some references for frequently used UNIX commands:
* http://faculty.tru.ca/nmora/Frequently%20used%20UNIX%20commands.pdf
* https://www.tjhsst.edu/~dhyatt/superap/unixcmd.html

## UNIX Manual

In [None]:
command = input('Which UNIX command manual do you want?: ')
!echo ' ' && echo Manual for: $command && echo '' && man $command | head -20

### Basic file and directory commands

mkdir: make directory  
rmdir: remove directory  
cd: change directory  
cp: copy file  
mv: move file  
rm: remove file  
cmp: compare 2 files  
more: output per window  
chmod: change file permissions  


#### Symbols

**.**  - working directory   
**..** - parent directory to working directory  
**~**  - home directory  
**/**  - root directory  
**\***  - wildcard for any string of characters  
**?**  - one character wildcard  
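
The two wildcards can be tried out from Python with the standard-library `fnmatch` module. This is only a sketch: the file names are made up for illustration, and `fnmatch` differs from the shell in small ways (for example, the shell's `*` does not match a leading dot in a file name).

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only for illustration.
names = ["data.csv", "data1.csv", "data12.csv", "notes.txt"]

# '*' matches any run of characters, including an empty one.
star_matches = [n for n in names if fnmatchcase(n, "data*.csv")]

# '?' matches exactly one character.
question_matches = [n for n in names if fnmatchcase(n, "data?.csv")]

print(star_matches)      # ['data.csv', 'data1.csv', 'data12.csv']
print(question_matches)  # ['data1.csv']
```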

#### Directing and piping commands

command > file         - redirects the output of 'command' to 'file' instead of to standard output (screen)  
command >> file        - appends the output of 'command' to 'file' instead of to standard output (screen)  
command < file         - takes input for 'command' from file  
command1 | command2    - pipe standard output of command1 to standard input of command2  
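
The same four operations can be sketched with Python's `subprocess` module. The helper `say`, the file name `out.txt` and the use of `sys.executable` as the child command are assumptions made for this example so that it does not depend on any particular shell tool.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def say(text):
    """Build a child command that prints one line (hypothetical helper)."""
    return [sys.executable, "-c", f"print({text!r})"]


# Child command that upper-cases whatever arrives on standard input.
upper = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"]

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "out.txt"

    # command > file : mode "w" truncates the file before writing
    with open(out, "w") as fh:
        subprocess.run(say("first"), stdout=fh, check=True)

    # command >> file : mode "a" appends to the existing content
    with open(out, "a") as fh:
        subprocess.run(say("second"), stdout=fh, check=True)

    # command < file : the file becomes standard input
    with open(out) as fh:
        from_file = subprocess.run(
            upper, stdin=fh, capture_output=True, text=True, check=True
        ).stdout

# command1 | command2 : stdout of the first process feeds stdin of the second
p1 = subprocess.Popen(say("piped"), stdout=subprocess.PIPE)
p2 = subprocess.run(upper, stdin=p1.stdout, capture_output=True, text=True, check=True)
p1.stdout.close()
p1.wait()
piped = p2.stdout

print(from_file)
print(piped)
```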

### User and system info

In [None]:
date = !date
os_ = !uname -s
os_version = !uname -r
user = !whoami
host = !hostname -f
network_ip = !ifconfig | grep "inet " | cut -f2 -d' '
local_ip = !ipconfig getifaddr en0
host_ip = !curl http://icanhazip.com
python_version = !python --version
python_path = !which -a python

In [None]:
# !ifconfig | grep -Eo 'inet (addr:)?([0-9]*\.){3}[0-9]*' | grep -Eo '([0-9]*\.){3}[0-9]*' | grep -v '127.0.0.1'
# !ifconfig | sed -En 's/127.0.0.1//;s/.*inet (addr:)?(([0-9]*\.){3}[0-9]*).*/\2/p'
# !uname -a && echo $NET_IP
# !ip -o route get to 8.8.8.8 | sed -n 's/.*src \([0-9.]\+\).*/\1/p'
# !curl http://ifconfig.me/ip
# !wget http://ipecho.net/plain -O - -q
!ifconfig | grep "inet " | cut -f2 -d' '
# !curl http://icanhazip.com
!ipconfig getifaddr en0

In [None]:
print('''
System info:
-----------------------------------------------------------------
date: {}
os: {} ({})
user: {}
host: {} {}
host_ip: {}
python: {} 
path: {}
'''.format(date[0], os_[0], os_version[0], user[0], host[0], network_ip, host_ip[-1], python_version[0], python_path))

### Getting around the filesystem

File permissions in numeric (octal) format and their meaning:  
0 – no permissions  
1 – execute only  
2 – write only  
3 – write and execute  
4 – read only  
5 – read and execute  
6 – read and write  
7 – read, write and execute  
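e.g.: `chmod 400 filename` makes the file read-only for its owner and removes all access for group and others (the three digits apply to owner, group and others, in that order).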
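
Each digit in the table is the sum of read (4), write (2) and execute (1). A small sketch of that arithmetic follows; `describe_mode` is a hypothetical helper written for this example, not a standard function.

```python
def describe_mode(mode: str) -> str:
    """Translate an octal mode such as '754' into the 'rwxr-xr--' notation of ls -l."""
    parts = []
    for digit in mode:
        bits = int(digit, 8)
        parts.append(
            ("r" if bits & 4 else "-")
            + ("w" if bits & 2 else "-")
            + ("x" if bits & 1 else "-")
        )
    return "".join(parts)


print(describe_mode("400"))  # r--------
print(describe_mode("754"))  # rwxr-xr--
```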

In [None]:
!pwd

In [None]:
!ls

In [None]:
!ls -ltr # long listing sorted by modification time, oldest first

In [None]:
!ls ..

In [None]:
!ls -FGlAhp | grep ''

In [None]:
!file utils_.ipynb

In [None]:
!find .

### Processes

In [None]:
!ps -ax | head

In [None]:
!ps -ef | head

In [None]:
!ps -ef | grep -i python | grep -v grep # -v inverts the match, which drops the grep process itself

In [None]:
!lsof -i:8888 | grep -i python

In [None]:
!who # users currently logged on

In [None]:
!finger

In [None]:
!ping -c 5 127.0.0.1

In [None]:
!ping -c 5 lauthom.nl

### Print to terminal

In [None]:
filename = '../_data/shakespeare.txt'
!echo $filename

In [None]:
!head -n 5 $filename && tail $filename

In [None]:
!cat $filename | head

In [None]:
!cat $filename | more # output per window

In [None]:
!less $filename  # output per window
# CTRL+F – forward one window
# CTRL+B – backward one window

### Create new file

In [None]:
!echo this is a file made in terminal > ../_data/new_file2.txt 
!echo appended text on next line >> ../_data/new_file2.txt 
!cat ../_data/new_file2.txt

In [None]:
!touch ../_data/new_file.txt  # create an empty file (or update its timestamp)
!echo "this is a file made by touch at $(date)" > ../_data/new_file.txt
!cat ../_data/new_file.txt

In [None]:
%%writefile new_file.txt

In [None]:
%%writefile -a append_file.txt

### Sort

In [None]:
!head $filename | sort | head

In [None]:
# !head -n 5 $filename | sort  # by the first letter in each line
!head $filename | sort -t' ' -k2  # sort on the 2nd field (-k2), with a space as field delimiter (-t' ')

### Wordcount (wc)
-l: lines  
-c: bytes (`-m` counts characters)
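
As a rough Python counterpart of these counts, the sketch below works on a made-up two-line sample. It includes one non-ASCII character so that the byte count (`wc -c`) and the character count (`wc -m`) differ.

```python
# Made-up sample text; 'é' takes two bytes in UTF-8.
text = "to be or not to be\ncafé\n"

n_lines = text.count("\n")           # like wc -l: number of newline characters
n_words = len(text.split())          # like wc -w: whitespace-separated words
n_bytes = len(text.encode("utf-8"))  # like wc -c: bytes
n_chars = len(text)                  # like wc -m: characters

print(n_lines, n_words, n_bytes, n_chars)  # 2 7 25 24
```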

In [None]:
!wc $filename && echo bytes: && wc -c $filename && echo lines: && wc -l $filename

In [None]:
!sort $filename | uniq -u | wc -l  # number of lines that occur exactly once

### Split to words  
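On macOS (BSD sed), a newline or carriage return inside a sed expression has to be written with the shell's ANSI-C quoting, `$'...'`:  

 - convert a DOS file to Unix by deleting the last character of each line, which assumes every line ends in `\r\n`: `sed 's/.$//' filename`
 - replace spaces with newlines: `sed 's/ /\'$'\n/g'`
 - remove carriage returns: `sed $'s/\r//g'`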
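
For comparison, the same two clean-up steps can be sketched in plain Python on a short made-up string. Unlike `sed 's/.$//'`, replacing `\r\n` explicitly leaves lines that already have Unix endings untouched.

```python
# Made-up sample with DOS line endings.
dos_text = "To be, or not\r\nto be\r\n"

# DOS -> Unix: drop the carriage return that precedes each newline.
unix_text = dos_text.replace("\r\n", "\n")

# One word per line: turn spaces into newlines, then drop empty entries.
words = [w for w in unix_text.replace(" ", "\n").split("\n") if w]

print(repr(unix_text))  # 'To be, or not\nto be\n'
print(words)            # ['To', 'be,', 'or', 'not', 'to', 'be']
```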

In [None]:
!sed 's/.$//' $filename > ../_data/shakespeare_unix.txt
!head ../_data/shakespeare_unix.txt
!head ../_data/shakespeare.txt

### Reverse lines of content

In [None]:
!sed -n '1!G;h;$p' ../_data/shakespeare_unix.txt | head

In [None]:
!sed -e 's/ /\'$'\n/g' -e $'s/\r//g' < $filename | head  # one word per line, carriage returns removed

### Most frequent words
- remove empty lines: `sed '/^$/d'` (`^`: start of line, `$`: end of line, `d`: delete)
- sort words: `sort`
- count runs of identical adjacent lines: `uniq -c` (which is why the input is sorted first)
- sort numerically in reverse order: `sort -nr`
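
The role of each step can be sketched in Python on a tiny made-up word list. `itertools.groupby` merges only adjacent equal items, much like `uniq -c`, which shows why the words are sorted first. `collections.Counter` is the usual shortcut for the whole pipeline.

```python
from collections import Counter
from itertools import groupby

# Made-up word list, one "line" per word.
words = ["the", "king", "the", "queen", "the", "king"]

# uniq -c on unsorted input: every run has length 1, nothing is merged.
unsorted_runs = [(len(list(group)), word) for word, group in groupby(words)]

# sort | uniq -c: identical words are now adjacent and get counted together.
sorted_runs = [(len(list(group)), word) for word, group in groupby(sorted(words))]

# sort -nr: highest count first.
ranked = sorted(sorted_runs, reverse=True)

print(unsorted_runs)
print(ranked)                        # [(3, 'the'), (2, 'king'), (1, 'queen')]
print(Counter(words).most_common())  # [('the', 3), ('king', 2), ('queen', 1)]
```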

In [None]:
!sed -e 's/ /\'$'\n/g' -e $'s/\r//g' $filename  | sed '/^$/d'| sort | uniq -c | sort -nr | head -15

In [None]:
!sed -e 's/ /\'$'\n/g' -e $'s/\r//g' $filename  | sed '/^$/d'| sort | uniq -c | sort -nr > ../_data/count_vs_words

In [None]:
!head ../_data/count_vs_words

### xargs

In [None]:
!head ../_data/count_vs_words | xargs echo

### grep
- **g**lobal **r**egular **e**xpression **p**rint
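
To make the idea concrete, here is a minimal Python sketch of line matching with a regular expression. The function `grep` and its parameters `ignore_case` and `after` are hypothetical names that mimic the `-i` and `-A` flags. It is not a full reimplementation (for instance, it omits the `--` separator that grep prints between context groups).

```python
import re


def grep(lines, pattern, ignore_case=False, after=0):
    """Return matching lines, each followed by `after` lines of context."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            keep.update(range(i, min(i + after + 1, len(lines))))
    return [lines[i] for i in sorted(keep)]


# Made-up sample lines.
sample = ["Liberty! Freedom!", "Tyranny is dead", "cry liberty", "the end"]

print(grep(sample, "liberty"))                    # ['cry liberty']
print(grep(sample, "liberty", ignore_case=True))  # ['Liberty! Freedom!', 'cry liberty']
print(grep(sample, "Liberty", after=1))           # ['Liberty! Freedom!', 'Tyranny is dead']
```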

In [None]:
!grep -i 'CDROMS' ../_data/shakespeare.txt

In [None]:
!grep -A 10 -i 'CDROMS' ../_data/shakespeare.txt

In [None]:
# Recursive search
!grep -r 'shakespeare' *

In [None]:
!grep 'Liberty' $filename             # add -i to make case insensitive
!grep -i 'liberty' $filename | wc -l  # number of matching lines

### sed
 - **s**tream **ed**itor  
 - like grep + replacement
 - **`s/from/to/g`**, **s**: substitute, **g**: global (every match on a line, not only the first)
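
The effect of the trailing `g` can be illustrated with Python's `re.sub` on a made-up line: `count=1` behaves like `s/from/to/`, while the default replaces every match, like `s/from/to/g`.

```python
import re

# Made-up sample line containing the search term twice.
line = "old parchment and new parchment"

# Like sed 's/parchment/manuscript/': only the first match on the line.
first_only = re.sub("parchment", "manuscript", line, count=1)

# Like sed 's/parchment/manuscript/g': every match on the line.
all_matches = re.sub("parchment", "manuscript", line)

print(first_only)   # old manuscript and new parchment
print(all_matches)  # old manuscript and new manuscript
```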

In [None]:
!sed -e 's/parchment/manuscript/g' $filename > ../_data/temp_shakespeare.txt

In [None]:
!grep -i 'manuscript' ../_data/temp_shakespeare.txt
!grep -i 'parchment' ../_data/shakespeare.txt

### find 

In [None]:
!find .. | grep -i shakespeare

### TODO

In [None]:
# default aliases
#%unalias <alias>
%alias

### Make executable bash script file

In [None]:
%%!
# Run once. Note: this overwrites any existing ~/.bash_aliases
echo '#!/bin/bash' > ~/.bash_aliases
echo alias mop='"Monty Python Shell"' >> ~/.bash_aliases
# chmod +x ~/aliases.command  # make executable
source ~/.bash_aliases    # load the aliases
# export BASH_ENV='"~/.aliases"'
# source ~/aliases.command    # load the aliases
source ~/.profile
source ~/.bash_profile

In [None]:
%%!
# Run once
echo '#!/bin/bash' > ~/aliases.command
echo alias mop='"Monty Python Shell"' >> ~/aliases.command
chmod +x ~/aliases.command  # make executable
source ~/aliases.command    # load the aliases
export BASH_ENV="$HOME/aliases.command"  # startup file read by non-interactive bash shells
source ~/aliases.command    # reload the aliases
source ~/.profile
source ~/.bash_profile

In [None]:
# !find ~ | grep .bash_

#### All UNIX commands can be used in combination with Python. Moreover, they can be shortened with aliases.