<a href="https://colab.research.google.com/github/TurkuNLP/ATP_kurssi/blob/master/ATP_2025_Notebook_10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Topics of this notebook:

*  Thu 04 Dec 2025
   * for-loops
*  Mon 08 Dec 2025
   * extra for-loops
   * where to continue from here?
   * send your questions!
*  Thu 11 Dec 2025
   * FULL RECAP OF EVERYTHING + Q&A!


# For-loops

With for-loops we can execute the same command on several items. E.g., in

```
for f in cat dog mouse ; do echo $f ; done
```
we execute the command `echo` on all the strings `cat`, `dog` and `mouse` with just one command, instead of running three commands separately:

`echo cat`

`echo dog`

`echo mouse`

The item on which the command is executed in a for-loop can be a file. This means that we can **execute the same command on multiple files** with one command. This is practical, since we can also run **scripts** on multiple files with just one command.

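To pick the files for a for-loop we need **wildcards** such as `*` (file name patterns, also called globs). They look a little like regular expressions but follow their own, simpler rules.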






















### For-loops in practice

A for-loop begins with `for f`, where `f` is the variable that holds the item the command is currently executed on (you can think of it as 'file': "for file in the folder"). After `in` comes the path of the folder with the files. You can specify whether the loop should apply to all files or only to a subset.

* For all files:
`for f in path/to/folder/*`

* For files ending with csv:
`for f in path/to/folder/*csv`

* For files starting with myfi:
`for f in path/to/folder/myfi*`


If you are already in the folder you wish to operate on:

* For all files in the folder:
`for f in *`

* For all files ending with .txt:
`for f in *.txt`


After specifying which files to operate on, you write `; do` and then the commands you want to execute. After the commands, you write `; done` to signal that your for-loop is finished.

`for f in *; do [insert commands here]; done`

The commands can be anything (pipes, scripts). Within the commands, `$f` refers to the file the command is executed on. For example:

* List all text files in a folder:

`for f in *.txt ; do echo $f ; done`

* For every file in folder, print lines that have the word 'snow':

`for f in path/to/folder/* ; do cat $f | egrep "snow" ; done`

* Run a script on every file in folder:

`for f in path/to/folder/* ; do cat $f | ./script.sh ; done`
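The patterns above can be tried out safely in a throwaway folder. The following is only a minimal sketch: the file names and their contents are made up for the demo, `mktemp -d` is assumed to be available for creating a temporary folder, and `grep -E` is used as the modern spelling of `egrep`.

```bash
# Create a temporary folder with three small demo files (hypothetical names).
demo=$(mktemp -d)
cd "$demo"
printf 'snow falls\nrain falls\n' > a.txt
printf 'the snow melts\n' > b.txt
printf 'x,y\n1,2\n' > data.csv

# Only the .txt files: the wildcard *.txt skips data.csv.
# (The result is stored in a variable just so it can be printed in one go.)
txt_names=$(for f in *.txt ; do echo "$f" ; done)
echo "$txt_names"

# Lines containing "snow" from every .txt file.
# Quoting "$f" keeps the loop working for file names with spaces.
snow_lines=$(for f in *.txt ; do cat "$f" | grep -E "snow" ; done)
echo "$snow_lines"
```

The first `echo` should list `a.txt` and `b.txt`, and the second should show one matching line from each of the two files.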







## Directing output into a file


The output can be directed into a file in two ways, depending on how many files you want as a result:

1) At the end, after `done` (all output in **one file**)

`for f in path/to/folder/* ; do cat $f | egrep "snow" ; done > output.txt`

2) Within the command (each output in a **separate file**)

`for f in path/to/folder/* ; do cat $f | ./script.sh > $f-output.txt ; done`
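Both ways of redirecting can be compared side by side. This is a sketch under assumptions: the folders `in` and `out`, the file names and the file contents are invented for the demo, and `grep -E "snow"` stands in for whatever command or script you actually run.

```bash
# Set up a temporary input folder and a separate output folder.
demo=$(mktemp -d)
mkdir "$demo/in" "$demo/out"
printf 'snow in Turku\nsun in Oulu\n' > "$demo/in/north.txt"
printf 'more snow\n' > "$demo/in/south.txt"

# 1) Redirect after `done`: all matches end up in ONE file.
for f in "$demo"/in/* ; do cat "$f" | grep -E "snow" ; done > "$demo/out/all.txt"

# 2) Redirect inside the loop: ONE output file per input file.
#    basename strips the folder part, so the result can go into another folder.
for f in "$demo"/in/* ; do
    cat "$f" | grep -E "snow" > "$demo/out/$(basename "$f")-output.txt"
done

ls "$demo/out"
```

Writing the results into a separate folder is a deliberate choice here: if the output files land in the same folder that the `*` loops over, they will be picked up as input the next time the loop is run.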










## Several commands within a for-loop

* If you wish to execute several commands within the for loop, separate each one with a `;`
* E.g., print file names AND lines with "snow" in the files:

`for f in path/to/folder/* ; do echo $f ; cat $f | egrep "snow" ; done`
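When a loop contains several commands, it can be easier to read if it is spread over several lines; in bash a line break can be used in place of each `;`. A small sketch with made-up files in a temporary folder (`grep -E` is the modern spelling of `egrep`):

```bash
# Two demo files in a temporary folder (hypothetical names and contents).
demo=$(mktemp -d)
cd "$demo"
printf 'snow today\n' > mon.txt
printf 'snow again\nand wind\n' > tue.txt

# The same loop as the one-liner
#   for f in * ; do echo "$f" ; cat "$f" | grep -E "snow" ; done
# written with line breaks instead of semicolons.
report=$(
    for f in *
    do
        echo "$f"
        cat "$f" | grep -E "snow"
    done
)
echo "$report"
```

Each file name should be followed directly by the matching lines from that file.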


## Examples of some for-loops

*   print **file names only**

```
for f in path/to/folder/*txt ; do echo $f ; done

```

*   print **file names and word counts**

```
for f in path/to/folder/*txt ; do echo $f ; cat $f | wc -w ; done

```

*   print **word count only**

```
for f in path/to/folder/*txt ; do cat $f | wc -w ; done

```

*   print **number of txt files only**

```
for f in path/to/folder/*txt ; do echo $f ; done | wc -l

```

**NOTE**

*   Perl back references also use `$`, but they do not interfere with the `$f` and work as usual within a for-loop. This is because the Perl code is inside single quotes, so the shell leaves `$1` alone. E.g.:

```
for f in *.txt ; do echo $f ; cat $f | perl -pe 's/([[:punct:]])/\n$1/g' ; done
```

*   What the line above does:
    1. For all files ending in `.txt`
    2. print the file name
    3. print the file contents and replace every punctuation mark with a newline followed by the punctuation mark in question (each punctuation mark starts a new line).

* For-loops can also be used for looping over items other than files, and the variable after `for` (e.g., above: `for f`) doesn't have to be `f`; remember though that it has to match the reference next to `$` (so if you use `for n` make sure you use `$n`).

```
for n in cat dog mouse ; do echo $n ; done
```

* EXTRA: It is also possible to specify a number of times something is done with curly brackets `{ }`. E.g. print a string 10 times:
```
for i in {1..10} ; do echo 'I am the best!' ; done
```
* N.B.! With for-loops you can also make/rename/delete multiple files, but it is very easy to make mistakes this way, so be careful! It is better to work with one file at a time if there is a risk of losing data.
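One common way to reduce the risk is a "dry run": put `echo` in front of the risky command so the loop only prints what it would do. The sketch below assumes two made-up files in a temporary folder and uses adding a `.bak` ending as the example change.

```bash
# Two empty demo files in a temporary folder (hypothetical names).
demo=$(mktemp -d)
cd "$demo"
touch draft1.txt draft2.txt

# Dry run: echo prints the command that WOULD run, without touching any file.
for f in *.txt ; do echo mv "$f" "$f.bak" ; done

# Once the printed commands look right, remove `echo` to actually rename.
for f in *.txt ; do mv "$f" "$f.bak" ; done
ls
```

The dry run prints one `mv ...` line per file; only the second loop changes anything on disk.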


# Hands-on 10-1

Practise using for loops.

### 10-1.1   
Print all file names in a folder using a for-loop.
### 10-1.2
Print the file names and word counts of each file in a folder.
### 10-1.3
Grep and print all Perl commands in the .sh files in a folder. Also print the script file names.



# Hands-on 10-2

More for loops and their output.

### 10-2.1
Run one of your earlier scripts on at least two similar files (make sure that the script works equally well on both files) in your folder using a for-loop. Print the names and the output in one file.

### 10-2.2
Copy the files cricket.csv and icehockey.csv from /home/mavela/data-2022/ . Make a script that counts the words in a file. Exclude numbers and punctuation. Run the script on the cricket and ice hockey files using for loops and direct the output into separate files.



---
---
---

# EXTRA: For-loops with brace expansion

THIS WILL NOT BE ASKED IN THE EXAM! This is just for your information. We will only expect you to know how to for-loop over multiple files with an asterisk (*).  

* It is possible to loop over a range of letters or numbers as characters. I will soon explain what "as characters" means, but first I will show you the notation.
* A range defined with a brace expansion uses curly brackets { } and two dots between the beginning and ending number or letter, e.g.:
  * `for i in {1..5}` would loop over numbers from 1 to 5.
  * `for i in {a..f}` would loop over the alphabet from a to f.
* Referring to the item in the loop works just like before with `$i`, e.g. `for i in {1..5} ; do echo $i ; done` will just print
```
1
2
3
4
5
```
* So what does "letters or numbers as characters" mean? Brace expansion handles everything as characters. So even numbers are not numbers, they are characters, and **you cannot expect any math-like behaviour with brace expansion**.
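One way to see this is to hand a brace expansion to `echo` without any loop. The sketch below assumes you are running bash (brace expansion is not available in every shell); the variable names and the `file_` prefix are only for illustration.

```bash
# The shell replaces the braces with a list of words BEFORE the command runs,
# so echo (or a for-loop) just receives plain strings.
numbers=$(echo {1..5})
letters=$(echo {a..f})
names=$(echo file_{1..3}.txt)

echo "$numbers"   # 1 2 3 4 5
echo "$letters"   # a b c d e f
echo "$names"     # file_1.txt file_2.txt file_3.txt
```

The last line shows that the expanded items are simply glued to the surrounding text, which is what you would expect from characters rather than numbers.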

# EXTRA: Hands-on 10-3

### 10-3.1
* Print a list of numbers from 10 to 15. Use a for-loop and a brace expansion.


### 10-3.2
Creating a bunch of files at once, then renaming them.

* Make a new directory called `animal_corpus`.
* Move into the directory.
* Fill it with files with this one-liner:
```
for i in {1..20} ; do echo "Old McDonald had a farm with $i animals" > file_$i.txt ; done
```
NB: If the above line does not work, try this more explicit notation:

```
for i in {1..20}; do echo "Old McDonald had a farm with ${i} animals" > file_${i}.txt; done
```
* Write a for-loop that renames all the files so that "file" is replaced with "animals". E.g. `file_1.txt` >>>> `animals_1.txt`



### 10-3.3 (VERY ADVANCED!)

This task cannot be done with a brace expansion, but try it first anyway. Then try to understand why brace expansion does not do what we want and what the right command might be (use Google or ask an AI):

* Write a script that takes two numbers as arguments, then prints all the numbers between the first and the last number.


---
---
---

# How to continue from here?

A general tutorial to recap and learn some new stuff:
* W3Schools Bash Tutorial: https://www.w3schools.com/bash/index.php

If you are planning to use CSC's supercomputers, check this out:
* Linux basics tutorial for CSC: https://docs.csc.fi/support/tutorials/env-guide/

If you want to use what you have learned on your own Windows computer with minimum effort:
* You can emulate Linux on a Windows computer with Cygwin (https://cygwin.com/). It does not do everything like a real Linux would, but is close. Just download and install.

Feeling HC? If you have e.g. an old potato laptop with an outdated Windows that would otherwise go to a garbage heap, you could install e.g. Linux Lite on it and then have your own Linux environment where you can practice; no need for a server! (Linux is free! And Linux Lite is really a lot like Windows anyway so you could also use it to browse the internet or watch videos or whatever.)
* How to install Linux Lite on a new machine: https://www.linuxliteos.com/manual/install.html#installguide (NB! Do this at your own risk; we will not be responsible!)

Feeling like an absolute gigachad 1337 hacker? Make your new expensive high-end laptop dual boot Windows and Linux! (NB! Do this at your own risk and only if you absolutely know what you're doing!!! We will not buy you a new computer!) Some instructions e.g. here:
* https://www.linuxliteos.com/manual/install.html#prepareinst



---

# NEXT TIME: RECAP & Q&A
We will try to address any questions you may have during class, but if you want to ask more anonymously, please send your questions via email to Anna before the next class!