# Intermediate (?) Bash
Continuing from part one, I'm going to introduce some slightly more advanced topics in Bash. This section is only a glimpse of what you can do with Bash; rather than following a particular guide, it focuses on commands I use frequently in my research projects.

We're going to continue with our examination of the [data](https://swcarpentry.github.io/shell-novice/data/shell-novice-data.zip) from the very extensive Software Carpentry tutorial on Bash.

## Searching for that one line in a file ... `grep`

In the previous section, David taught you how to look inside a file using `less`. Let's `cd` (change directory) into the `creatures` subdirectory and peek inside `basilisk.dat`.

In [1]:
ls

BashPart2.ipynb	README.md	data-shell


In [2]:
cd data-shell; ls

creatures		north-pacific-gyre	solar.pdf
data			notes.txt		writing
molecules		pizza.cfg


In [3]:
cd creatures; ls

basilisk.dat	unicorn.dat


In [4]:
head basilisk.dat

COMMON NAME: basilisk
CLASSIFICATION: basiliscus vulgaris
UPDATED: 1745-05-02
CCCCAACGAG
GAAACAGATC
ATTAGAAGAT
CTGTCGCGAA
CCGCACCTCT
CCTATCTACA
TGTTTGTCTC


What if we want to find all rows where some property is defined by a colon (`:`)? We can do that using `grep`. The syntax for the basic `grep` command is as follows:

        grep PATTERN file.txt
        
In our case then, we try 
        
        grep : basilisk.dat

In [5]:
grep : basilisk.dat

COMMON NAME: basilisk
CLASSIFICATION: basiliscus vulgaris
UPDATED: 1745-05-02


We can even try this in our other creature file.

In [6]:
grep : unicorn.dat

COMMON NAME: unicorn
CLASSIFICATION: equus monoceros
UPDATED: 1738-11-24
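
Two other `grep` options pair well with this kind of search: `-n` prefixes each match with its line number, and `-c` prints only the number of matching lines. Here is a minimal sketch of both. It assumes a small made-up file, `sample.dat`, in a temporary directory; neither the file nor its contents are part of the tutorial data.

```shell
# Build a tiny stand-in for a creature file (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' 'COMMON NAME: sample' 'UPDATED: 1700-01-01' 'ACGTACGTAC' 'TTGACCATGA' > "$tmpdir/sample.dat"

# -n shows the line number of each matching line.
numbered=$(grep -n : "$tmpdir/sample.dat")
echo "$numbered"

# -c prints only how many lines matched.
count=$(grep -c : "$tmpdir/sample.dat")
echo "$count"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

The count is handy when you only need to know how many header lines a file has, not what they say.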


`grep` even has a nice way to get all lines *except* those matching the pattern - feed in a `-v` option, i.e. `grep -v : basilisk.dat`. 

But before we do that, let's check the line count (`wc -l`) to see if we're going to get a lot of text.

In [7]:
wc -l basilisk.dat

     163 basilisk.dat


Yes, that's a little too much text, so let's just look at the first 10 lines without a `:` in them.

## Chaining commands together ... `|` (pipe)

If you recall from the previous section, the first 10 lines of a file can be obtained using `head`. Here, though, we want the first 10 lines of the result of a Bash operation. **What can we do???**

Luckily, Bash has a very convenient syntax for chaining operations together, using the pipe (`|`) - this symbol is usually located above your return key, and is typed by `SHIFT + \`.

So we want to run `grep`, and then feed its output to `head` as input. So:

In [8]:
grep -v : basilisk.dat | head

CCCCAACGAG
GAAACAGATC
ATTAGAAGAT
CTGTCGCGAA
CCGCACCTCT
CCTATCTACA
TGTTTGTCTC
TGGGTGGGGA
TCCATAGGCA
GCATTACCAG
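
A pipeline is not limited to two commands; each `|` passes the output of the command on its left to the command on its right. Here is a minimal sketch with three stages. It assumes a made-up `sample.dat` in a temporary directory rather than the tutorial files. The `tr -d ' '` at the end is only there because some versions of `wc` pad their output with spaces.

```shell
# Build a tiny stand-in for a creature file (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' 'COMMON NAME: sample' 'ACGTACGTAC' 'TTGACCATGA' 'GGCATTACCA' 'CCGATAGCTA' > "$tmpdir/sample.dat"

# Stage 1 drops the header line, stage 2 keeps the first 3 remaining lines.
first_three=$(grep -v : "$tmpdir/sample.dat" | head -n 3)
echo "$first_three"

# A third stage counts how many lines made it through the first two.
n_lines=$(grep -v : "$tmpdir/sample.dat" | head -n 3 | wc -l | tr -d ' ')
echo "$n_lines"

# Clean up the temporary directory.
rm -r "$tmpdir"
```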


## Looping through (all files in a directory or other things) ... `for` loops

What if we wanted to run our `grep` operation on all `.dat` files in a directory? Here, we may only have two `.dat` files, but in practice, many of us have more than two samples, and sometimes far too many to handle manually. This is where `for` loops are very handy.

In [11]:
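# Loop over every .dat file in the current directory
for file in *.dat; do
    echo "$file"                 # print the file name as a header
    grep -v : "$file" | head     # first 10 lines without a colon
    echo -e '\n'                 # blank lines between files
done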

basilisk.dat
CCCCAACGAG
GAAACAGATC
ATTAGAAGAT
CTGTCGCGAA
CCGCACCTCT
CCTATCTACA
TGTTTGTCTC
TGGGTGGGGA
TCCATAGGCA
GCATTACCAG


unicorn.dat
AGCCGGGTCG
CTTTACCTTA
AAGCCGAGGG
GGGTGGTACG
CCGAACATAA
ACGCTTTAAC
GTCCCTCCAG
GCTGATAATC
GTTTAAGCAC
ACGTGGTCTA
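
Two habits make loops like this more robust: quoting the loop variable, so that file names containing spaces stay in one piece, and skipping the iteration when the glob matches nothing. Here is a minimal sketch of both. It assumes two made-up files in a temporary directory, and the file names and the `summary` variable are hypothetical. It uses `grep -vc` to count the lines without a colon in each file.

```shell
# Build two tiny stand-in files; one name deliberately contains a space.
tmpdir=$(mktemp -d)
printf '%s\n' 'COMMON NAME: one' 'ACGTACGTAC' 'TTGACCATGA' > "$tmpdir/sample one.dat"
printf '%s\n' 'COMMON NAME: two' 'GGCATTACCA' > "$tmpdir/sample_two.dat"

summary=""
for file in "$tmpdir"/*.dat; do
    # If no file matched, the pattern is left as literal text; skip it.
    [ -e "$file" ] || continue
    # Quoting "$file" keeps a name with spaces as a single argument.
    n=$(grep -vc : "$file")
    summary="${summary}$(basename "$file"): ${n} sequence lines
"
done
printf '%s' "$summary"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Without the quotes, `sample one.dat` would be split into two words, and `grep` would look for two files that do not exist.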





## Splitting strings ... introducing `awk` 

## Find and replace all ... introducing `sed`