# The for-loop

So far we have used single commands or pipelines to analyze or modify files. But what if you want to run the same command over and over again? Do you have to type it every time?

Luckily, a computer is happy to do the same thing multiple times if you structure your command accordingly. One way to do this is using a for-loop, which is available in nearly all programming languages.

In BASH you start a for-loop by specifying a group of elements. The elements are loaded into a variable one at a time, and the computer will then **do** a command that you specify **for** each element in the group.
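
The general shape of a for-loop can be sketched with an explicit list of words instead of files. The sample names below are made up for illustration. The loop is spread over several lines here, which is equivalent to writing it on one line with `;` between the parts.

```bash
# Hypothetical sample names, used only for illustration.
count=0
for sample in liver kidney brain
do
    # The body runs once per element; $sample holds the current one.
    echo "processing: $sample"
    count=$((count + 1))
done
echo "iterations: $count"
```

After the loop has finished, the variable still holds the last element it was given.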

One easy way to create one of these groups is by using a wildcard:

In [2]:
%%bash
ls ./example_data/

caeel_3_1.fasta
caeel_3_2.fasta
caeel_3_3.fasta
caeel_3_4.fasta


The command `echo` simply prints its arguments to the terminal. You can use it to see what is currently loaded into your variable.

In [10]:
%%bash
for file in ./example_data/caeel_*; do echo "now loaded: $file"; done

now loaded: ./example_data/caeel_3_1.fasta
now loaded: ./example_data/caeel_3_2.fasta
now loaded: ./example_data/caeel_3_3.fasta
now loaded: ./example_data/caeel_3_4.fasta


Knowing what's in your variable, you can start using the commands you know:

In [11]:
%%bash
for file in ./example_data/caeel_*; do grep ">" "$file" | wc -l; done

10
8
7
6
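
The bare numbers above do not say which count belongs to which file. One possible way to fix that is to print the file name next to each count, as in the sketch below. To keep it runnable on its own, it first builds two small throwaway FASTA files in a temporary directory; the file names and sequences are invented for illustration. It also assumes two commands that may be new to you: `basename`, which strips the directory part of a path, and `grep -c`, which counts matching lines directly and gives the same number as `grep ... | wc -l`.

```bash
# Create a temporary directory with two made-up FASTA files.
tmpdir=$(mktemp -d)
printf '>seq1\nACGT\n>seq2\nGGCC\n' > "$tmpdir/demo_1.fasta"
printf '>seq3\nTTAA\n' > "$tmpdir/demo_2.fasta"

# Print "file name: header count" for every file.
# The redirection after "done" collects the output of the whole loop.
for file in "$tmpdir"/demo_*.fasta; do
    echo "$(basename "$file"): $(grep -c ">" "$file")"
done > "$tmpdir/counts.txt"

cat "$tmpdir/counts.txt"
# prints:
# demo_1.fasta: 2
# demo_2.fasta: 1

# When you are done, the directory can be removed with: rm -r "$tmpdir"
```

Placing `> file` after `done` writes everything the loop prints into a single file, instead of overwriting the file in every iteration.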


You give the variable a name in the first part of the loop, and you then call the variable by putting a `$` in front of its name. (**Hint:** BASH does not expand variables that are enclosed in single quotes `'`; the text is printed literally. Use double quotes `"` if you want the value.)

In [14]:
%%bash
for kangaroo in ./example_data/*.fasta; do echo "$kangaroo"; done
for kangaroo in ./example_data/*.fasta; do echo '$kangaroo'; done

./example_data/caeel_3_1.fasta
./example_data/caeel_3_2.fasta
./example_data/caeel_3_3.fasta
./example_data/caeel_3_4.fasta
$kangaroo
$kangaroo
$kangaroo
$kangaroo


Instead of loading file names into a variable, you can also loop over the contents of a file. For this, you simply use one of the commands you already know. One special rule applies here: when calling a command in the first part of a for-loop, you have to enclose it in `$()`.

In [4]:
%%bash
for header in $(grep ">" ./example_data/caeel_3_4.fasta); do echo "$header"; done

>caeel_3_11110
>caeel_3_11111
>caeel_3_11113
>caeel_3_11115
>caeel_3_11118
>caeel_3_11119
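
One caveat that the example above does not run into: `for ... in $(...)` splits the command output on any whitespace, not only on line breaks. FASTA headers often carry a description after the ID, and each word of it would then become its own loop element. A common alternative is a `while read` loop, which receives one whole line per iteration. The sketch below uses a made-up FASTA file to show the difference; the gene names and descriptions are invented.

```bash
# Create a made-up FASTA file whose headers contain spaces.
tmpfile=$(mktemp)
printf '>gene1 hypothetical protein\nACGT\n>gene2 kinase\nGGCC\n' > "$tmpfile"

# for-loop: the output is split on spaces as well as on line breaks.
words=0
for header in $(grep ">" "$tmpfile"); do
    echo "element: $header"
    words=$((words + 1))
done
echo "for-loop elements: $words"

# while-read loop: each iteration receives one complete line.
grep ">" "$tmpfile" | while IFS= read -r header; do
    echo "whole line: $header"
done

rm "$tmpfile"
```

The file has only two header lines, but the for-loop runs five times, once per word. The `while read` loop runs twice and keeps each header intact.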


## Tasks

In this directory, you will find the genome annotation of *A. baumannii* AYE in [GFF format](https://en.wikipedia.org/wiki/General_feature_format). You will also find a list of gene names in the file `genes_of_interest.txt`.

1. Use a for-loop to extract the locus tag ID for each gene name of interest. Save the list of all CDS locus tags in a file.

In [None]:
%%bash


2. Save each locus tag in a separate file. Use the respective gene name as the file name (plus a suitable file extension).

In [None]:
%%bash
