# Test of a loop over an array

In this notebook we will explore

1. writing loops in Bash  
    1.1. looping over an array  
    1.2. looping over a file  
2. using loops to collect data from an API  
    2.1. Testing the API query  
    2.2. Adding the query to the loop  

## 1. Writing loops in Bash
### 1.1. Reading from an array

The first step is to assign values to an array. To anticipate the structure of the Nominatim API query used later, we write each value as "City,Country".

In [6]:
# assigning the values of the array to the variable cities_array
cities_array=(Bergen,Norway Paris,France Turin,Italy Bordeaux,France Accra,Ghana)
# check the array has been assigned properly
echo "${cities_array[@]}"

Bergen,Norway Paris,France Turin,Italy Bordeaux,France Accra,Ghana
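
Before looping, it can help to inspect the array in a few other ways. The sketch below reuses the same cities_array; the variable names count and first are only illustrative.

```bash
cities_array=(Bergen,Norway Paris,France Turin,Italy Bordeaux,France Accra,Ghana)

# number of elements in the array
count="${#cities_array[@]}"
echo "$count"

# a single element (the index starts at 0)
first="${cities_array[0]}"
echo "$first"

# every element on its own line
printf '%s\n' "${cities_array[@]}"
```

The double quotes around the expansions keep each element intact, which matters as soon as a value contains a space.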


We can now test a loop that takes the array as its input and simply prints each value to the notebook.
A loop has three parts:
* An initialisation, which can start with for, while, or until depending on what we are iterating on.
    * for is typically used for iterating on the values of an array
    * while is commonly used to iterate over the lines of a file
    * until is commonly associated with a counter: the loop repeats until its condition becomes true (until [ $i -ge 5 ] for example)
* A do section, where the command(s) that we want to execute on every iteration live
* A done section, which marks the end of the loop, but can also include
    * ```< file``` to add a file as an input to the loop
    * ```<< EOF \n line1 \n line2 \n EOF``` to have a "while read line" parse a multi-line string.
    * ```<<< $result``` to have the content of a variable (for example the result of a command) passed to the loop.
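
The three kinds of loop described above can be sketched side by side. This is a minimal illustration, not part of the original exercise: the array letters and the variables for_out, while_out and i are made up for the example.

```bash
# for: iterate over the values of an array
letters=(a b c)
for_out=""
for x in "${letters[@]}"
do
    for_out+="$x"
done
echo "$for_out"

# while: read the lines of a here document one by one
while_out=""
while read -r line
do
    while_out+="[$line]"
done << EOF
line1
line2
EOF
echo "$while_out"

# until: repeat until the condition becomes true
i=0
until [ "$i" -ge 3 ]
do
    i=$((i + 1))
done
echo "$i"
```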

**Note1:** using ```<< $result``` will not work, even if we assigned a multi-line string to $result. That is because Bash expects a delimiter after the here document operator "<<". If the multi-line string is stored in a variable, the correct solution is to use "<<<".
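
As a small sketch of that case, the loop below reads a variable holding two lines through a here string. The content of result is only an example, and the double quotes around "$result" are a precaution so that the line breaks are preserved whichever Bash version is used.

```bash
# a variable holding two lines (example content)
result=$'Bergen,Norway\nParis,France'

count=0
while read -r line
do
    count=$((count + 1))
    echo "line $count: $line"
done <<< "$result"
```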

**Note2:** a loop can be written on a single line, with a semicolon between its parts (a common habit when typing in the terminal). Here we will instead write the different parts of the loop on different lines, which has the same effect as using semicolons, with the added benefit of making the code more readable.

In [7]:
# create a loop that iterates over cities_array and print it
for x in ${cities_array[@]}
do 
    echo "$x"
done 

Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana


To add the values of the array to a file, one basic way would be to simply echo each value and append them to the same file.  

As a reminder, the "greater than" and "less than" signs are used in various ways:  

**Single sign**  
* ```>``` is a file redirection. It redirects the output of a command to a file, overwriting the existing contents of the file if it exists.  
    * **Example:** ```echo "Hello" > file.txt``` writes "Hello" to file.txt, replacing its contents. If the file already exists, it is overwritten.  
* ```<``` Redirects the contents of a file to be used as the input to a command.  
    * **Example:** ```grep "text" < file.txt``` searches for "text" in file.txt.  

**Double sign**  
* ```>>``` is an append redirection. It redirects the output of a command to a file, but if the file already exists, it _appends_ the output of the command to the file instead of overwriting it.
    * **Example:** ```echo "Hello again" >> file.txt``` adds "Hello again" to the end of file.txt.
* ```<<``` is called a "here document". It is also used to input content to a command, but this time it is for a multi-line string instead of a file.
    * **Example:**  
    ```
    cat <<EOF
    Line 1
    Line 2
    EOF
    ```  
    will give the string made of "Line 1" and "Line 2" as an input to the command cat. Here, "EOF" is simply a marker (also called a delimiter) that tells Bash where the text starts and ends. We could have used any other unique value, such as "END_OF_FILE" or "@!".  
  
**Triple sign**.  
* ```<<<``` is called a "here string". It is used to redirect a string directly into the standard input of a command. Unlike ```<<``` it does not need a delimiter: the string (typically a single line, or the content of a variable) is written right after the operator.  
    * **Example:** ```grep "text" <<< "Here is some text to search through"``` tells the command grep to search for the word "text" in the string "Here is some text to search through".

* ```>>>``` does not exist in Bash!
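
The operators listed above can be tried together in one short sketch. It writes to a temporary file created with mktemp, so no existing file is touched; the variable names tmp_file, line_count and matches are illustrative.

```bash
tmp_file=$(mktemp)                 # temporary file, name chosen by the system

echo "Hello" > "$tmp_file"         # > creates or overwrites the file
echo "Hello again" >> "$tmp_file"  # >> appends to it

# << feeds several lines to cat, whose output is appended to the file
cat << EOF >> "$tmp_file"
Line 1
Line 2
EOF

# < uses the file as the input of wc
line_count=$(wc -l < "$tmp_file")

# <<< uses a string as the input of grep (-c counts the matching lines)
matches=$(grep -c "text" <<< "Here is some text to search through")

cat "$tmp_file"
echo "lines: $line_count, matching lines: $matches"
rm "$tmp_file"
```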

In [8]:
# the first echo command creates the file, so only one arrow is needed
echo "Bergen,Norway" > cities2.csv
# the following echo commands have to append instead of creating the file, so we use >>
echo "Paris,France" >> cities2.csv
echo "Turin,Italy" >> cities2.csv
echo "Bordeaux,France" >> cities2.csv
echo "Accra,Ghana" >> cities2.csv

#we verify the content of our cities2.csv file
cat cities2.csv

Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana


But writing one echo per value is not sustainable. We're going to use a loop instead.

In [10]:
# the loop will append the values of the array in a new file called cities3.csv
for i in ${cities_array[@]}
do 
    echo "$i" >> cities3.csv 
done

#we check that the file has all the values that we expect
cat cities3.csv

Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana


Note that when it comes to writing into files, we could have structured the loop a bit differently. The loop below gives the same result as the previous one:

In [11]:
# the loop will append the values of the array in a new file called cities3_2.csv
for i in ${cities_array[@]}
do 
    echo "$i" 
done >> cities3_2.csv

#we check that the file has all the values that we expect
cat cities3_2.csv

Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana
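
One practical difference between the two structures: a redirection placed after done applies to the loop as a whole, so a single ```>``` is enough there, and running the cell again does not duplicate the lines. The sketch below illustrates this with a temporary file standing in for cities3_2.csv; the function name write_cities is made up for the example.

```bash
cities_array=(Bergen,Norway Paris,France Turin,Italy Bordeaux,France Accra,Ghana)
out_file=$(mktemp)   # temporary file standing in for cities3_2.csv

write_cities() {
    for i in "${cities_array[@]}"
    do
        echo "$i"
    done > "$out_file"   # single >: the file is opened once for the whole loop
}

write_cities
write_cities             # running it a second time does not duplicate the lines

line_count=$(wc -l < "$out_file")
echo "$line_count"
rm "$out_file"
```

With ```>>``` after done, as in the cell above, each new run would add five more lines to the file.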


Of course, the commands within a loop can do more than simply print the values to a file. We can also apply some transformations to the values of our array. For example, we could add some text before each value.

In [13]:
for i in ${cities_array[@]}
do 
    echo "The student comes from $i"
done  >> cities3_3.csv 

#we check that the file has all the values that we expect
cat cities3_3.csv

The student comes from Bergen,Norway
The student comes from Paris,France
The student comes from Turin,Italy
The student comes from Bordeaux,France
The student comes from Accra,Ghana


Another option is to substitute one value with another (the equivalent of =SUBSTITUTE() in Google Sheets). 
For that purpose, we format our echo differently:

```echo "${variable//sign_to_be_replaced/replacement_sign}"```

where ```//``` is an operator indicating that every occurrence should be replaced (a single ```/``` would replace only the first one)

for example:

In [19]:
word="Arabic"
echo "${word//A/@}"

@rabic


Note that this is case sensitive, which is why only the uppercase A was replaced. If we want to replace the lowercase a instead:

In [20]:
echo "${word//a/@}"

Ar@bic
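
To make the difference between a single and a double slash visible, here is a short sketch with a word that contains the same letter several times. The word and the variable names are chosen only for the illustration.

```bash
word="banana"

# single slash: only the first match is replaced
first_only="${word/a/@}"
echo "$first_only"

# double slash: every match is replaced
all_matches="${word//a/@}"
echo "$all_matches"
```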


Used in a loop, the substitution looks like this:

In [26]:
for i in ${cities_array[@]}
do 
# we want to replace the commas found in each line by some text
    echo "${i//,/ is located in }"
done

Bergen is located in Norway
Paris is located in France
Turin is located in Italy
Bordeaux is located in France
Accra is located in Ghana


You will note that when using the substitution, we need to wrap the whole expression in curly brackets ```{}``` in order for it to work. 

An important note is that if the value you want to substitute includes a forward slash ```/```, Bash can be confused, because the slash is also the separator of the substitution syntax. In those cases, you have to "escape" the character that you want to be treated as text and not as part of the command. 

Escaping is done with a backslash placed before the character that confuses Bash, like so:

In [28]:
for i in ${cities_array[@]}
do 
#we replace the commas by a forward slash but we have to escape it to avoid confusing Bash
    echo "${i//,/\/}"
done

Bergen/Norway
Paris/France
Turin/Italy
Bordeaux/France
Accra/Ghana
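
An alternative that some find easier to read is to store the replacement in a variable, so that no slash needs escaping inside the expression. This is a sketch of that idea; the variable names separator and converted are illustrative.

```bash
cities_array=(Bergen,Norway Paris,France Turin,Italy Bordeaux,France Accra,Ghana)

# the replacement text lives in a variable instead of being typed in the expression
separator="/"

for i in "${cities_array[@]}"
do
    echo "${i//,/$separator}"
done

# the same substitution applied to a single element
converted="${cities_array[0]//,/$separator}"
```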


### 1.2. Reading from a file
Reading from a file is done by changing our loop in three ways:
* we initialise the loop with a ```while``` operator instead of a ```for```
* we use the ```read``` command, which is designed for reading an input line by line
* we add the input file with a redirection after the ```done```.

In [14]:
while read line
do
    echo "$line"
done < cities3.csv

Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana
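
The read command can also split each line into several variables. The sketch below assumes lines in the "City,Country" format and uses a here document instead of cities3.csv so that it runs on its own; the names city, country and summary are illustrative.

```bash
summary=""
# IFS=, tells read to split on commas; -r keeps backslashes as plain text
while IFS=, read -r city country
do
    echo "$city is located in $country"
    summary+="$city/"
done << EOF
Bergen,Norway
New York,USA
EOF
```

Because only the comma is used as a separator, a city name containing a space stays in one piece.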


It is possible to combine this loop format with a file output. It would then look like this:

In [15]:
while read line
do
    echo "$line"
done < cities3.csv >> cities3_4.csv

#we verify the content of our created file
cat cities3_4.csv

Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana


## 2. Using loops to collect data from an API
### 2.1. Testing the API query

We are using OpenStreetMap's [Nominatim API](https://nominatim.org/release-docs/develop/api/Search/), which allows us to find the coordinates of any "object" matching the query. The object can be a city, but also a bakery, for example. 

The simplest way to query an API using Bash is to use the command curl.

For a city, the request looks like this:

In [33]:
curl "https://nominatim.openstreetmap.org/search?q=Bordeaux,France&format=json" 

[{"place_id":252059258,"licence":"Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright","osm_type":"relation","osm_id":105270,"lat":"44.841225","lon":"-0.5800364","class":"boundary","type":"administrative","place_rank":16,"importance":0.6740050666982947,"addresstype":"city","name":"Bordeaux","display_name":"Bordeaux, Gironde, Nouvelle-Aquitaine, France métropolitaine, France","boundingbox":["44.8107826","44.9161806","-0.6386987","-0.5336838"]},{"place_id":250563408,"licence":"Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright","osm_type":"relation","osm_id":1667452,"lat":"44.79384015","lon":"-0.6063085906819762","class":"boundary","type":"administrative","place_rank":14,"importance":0.3362245296868723,"addresstype":"municipality","name":"Bordeaux","display_name":"Bordeaux, Gironde, Nouvelle-Aquitaine, France métropolitaine, France","boundingbox":["44.5463125","45.0413552","-0.9049215","-0.2415648"]}]
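
The request above returned two results for the same query. As a sketch of how the URL could be assembled for any city, the hypothetical helper build_url below replaces spaces with + and adds the Nominatim ```limit``` parameter to ask for a single result. It only builds the URL and does not send anything.

```bash
# hypothetical helper: builds the search URL for one "City,Country" value
build_url() {
    # spaces become +, as a raw space is not allowed in a URL
    local query="${1// /+}"
    echo "https://nominatim.openstreetmap.org/search?q=${query}&format=json&limit=1"
}

url=$(build_url "New York,USA")
echo "$url"
# the request itself would then be: curl -s "$url"
```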


We can combine curl with a program called jq, which is very useful for displaying JSON nicely, but also for filtering and transforming it. 

In [34]:
curl "https://nominatim.openstreetmap.org/search?q=Bordeaux,France&format=json"  | jq '.[]'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   953  100   953    0     0   4021      0 --:--:-- --:--:-- --:--:--  4021
{
  "place_id": 252059258,
  "licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright",
  "osm_type": "relation",
  "osm_id": 105270,
  "lat": "44.841225",
  "lon": "-0.5800364",
  "class": "boundary",
  "type": "administrative",
  "place_rank": 16,
  "importance": 0.6740050666982947,
  [...]

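The ```-s``` option silences curl's progress meter. In the jq filter, ```.[]``` goes through each result, ```[.lat, .lon, .addresstype]``` keeps three of its fields, and ```@csv``` formats them as a CSV row.

In [30]:
curl -s "https://nominatim.openstreetmap.org/search?q=Bordeaux,France&format=json" | jq '.[] | [.lat, .lon, .addresstype] | @csv'

"\"44.841225\",\"-0.5800364\",\"city\""
"\"44.79384015\",\"-0.6063085906819762\",\"municipality\""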


The query can also be stored in a variable, which is how the loop will pass each line to the API:

In [20]:
line="Bordeaux,France"
curl -s "https://nominatim.openstreetmap.org/search?q=${line}&format=json" | jq '.[] | [.lat, .lon, .addresstype] | @csv'

"\"44.841225\",\"-0.5800364\",\"city\""
"\"44.79384015\",\"-0.6063085906819762\",\"municipality\""


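### 2.2. Adding the query to the loop

We can now place the query inside a ```while read``` loop, so that each line of cities3.csv is sent to the API and the results are appended to cities4.csv.

In [5]:
# write the header row (this overwrites cities4.csv if it already exists)
echo "lat,lon,type,name" > cities4.csv
while read -r line 
do
    # spaces in the query are replaced by + so that the URL stays valid
    result=$(curl -s "https://nominatim.openstreetmap.org/search?q=${line// /+}&format=json" | jq '.[] | [.lat, .lon, .addresstype, .name] | @csv')
    # remove the double quotes added by jq
    result2=$(echo "${result//\"/}")
    # remove the leftover backslashes
    echo "${result2//\\/}"
done < cities3.csv >> cities4.csv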
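
The two clean-up substitutions are needed because jq prints each row as a quoted JSON string. One possible alternative, sketched below, is to ask jq for raw output with ```-r``` and to join the fields with ```join(",")```. The sample is a trimmed copy of the Bordeaux response shown earlier, used instead of a live request; the helper name to_csv_row is made up, and the sketch assumes that no field contains a comma.

```bash
# trimmed copy of the Bordeaux response shown earlier (no live request is made)
sample='[{"lat":"44.841225","lon":"-0.5800364","addresstype":"city","name":"Bordeaux"}]'

# hypothetical helper: turns a Nominatim JSON response read from standard input into plain rows
to_csv_row() {
    # -r prints raw text; join(",") glues the fields together without adding quotes
    jq -r '.[] | [.lat, .lon, .addresstype, .name] | join(",")'
}

# run the example only if jq is installed
if command -v jq > /dev/null
then
    row=$(echo "$sample" | to_csv_row)
    echo "$row"
fi
```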


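The list of cities can also be read from a remote file. We first download it into a variable with curl, then pass that variable to the loop with a here string. In the jq filter, ```select(.addresstype=="city")``` keeps only the results that are cities.

In [16]:
# download the remote file into a variable
remote_csv=$(curl -s "https://raw.githubusercontent.com/clombion/turin_crash_course/wip/cities3.csv")

echo "lat,lon,type,name" > cities4.csv
while read -r line 
do
    result=$(curl -s "https://nominatim.openstreetmap.org/search?q=${line// /+}&format=json" | jq '.[] | select(.addresstype=="city") | [.lat, .lon, .addresstype, .name] | @csv')
    result2=$(echo "${result//\"/}")
    echo "${result2//\\/}"
# the quotes around the variable keep its line breaks intact
done <<< "$remote_csv" >> cities4.csv

# remove the empty lines left by queries that returned nothing
sed -i '/^$/d' cities4.csv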

The same loop can keep only the coordinates, in which case the header row is shortened to match:

In [6]:
remote_csv=$(curl -s "https://raw.githubusercontent.com/clombion/turin_crash_course/wip/cities3.csv")

# the header only lists the two fields kept by the jq filter
echo "lat,lon" > cities4.csv
while read -r line 
do
    result=$(curl -s "https://nominatim.openstreetmap.org/search?q=${line// /+}&format=json" | jq '.[] | select(.addresstype=="city") | [.lat, .lon] | @csv')
    result2=$(echo "${result//\"/}")
    echo "${result2//\\/}"
done <<< "$remote_csv" >> cities4.csv

sed -i '/^$/d' cities4.csv

Accra,Ghana
Bergen,Norway
Paris,France
Turin,Italy
Bordeaux,France
Accra,Ghana
New York,USA
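
The loop can also be arranged so that it never writes empty lines and pauses between requests. The sketch below is one way to do that, not the method used above: the function names fetch_coords and collect_coords are hypothetical, and the one-second pause reflects our understanding that the public Nominatim server asks for at most one request per second.

```bash
# hypothetical helper: prints "lat,lon" rows for one "City,Country" value
fetch_coords() {
    curl -s "https://nominatim.openstreetmap.org/search?q=${1// /+}&format=json" |
        jq -r '.[] | select(.addresstype=="city") | [.lat, .lon] | join(",")'
}

# hypothetical helper: reads "City,Country" lines from standard input
collect_coords() {
    local line result
    while IFS= read -r line
    do
        # skip empty input lines
        [ -z "$line" ] && continue
        result=$(fetch_coords "$line")
        # print only when the API returned something, so no empty line is written
        if [ -n "$result" ]
        then
            printf '%s\n' "$result"
        fi
        # pause between requests (assumed limit: one request per second)
        sleep 1
    done
}

# usage (not run here): collect_coords < cities3.csv >> cities4.csv
```

Keeping the request in its own function makes it possible to try the loop with a stand-in function before sending real queries.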


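The same approach works with other APIs. Here we query the Open-Meteo API for the hourly precipitation of a single day and sum the values with jq.

In [10]: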
# API call to fetch precipitation data
json_response=$(curl -s "https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.4&hourly=precipitation&start_date=2024-04-10&end_date=2024-04-10&timezone=auto")

# Calculate the sum of the precipitation using jq
total_precipitation=$(echo "$json_response" | jq '[.hourly.precipitation[]] | add')

echo "Total precipitation on 2024-04-10: $total_precipitation mm"


Total precipitation on 2024-04-10: 0.4 mm
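
To see what the ```add``` filter does without calling the API, we can run it on a small hand-written sample. The values below are made up and only mimic the shape of the response; the filter ```.hourly.precipitation | add``` is a shorter way of writing the one used above.

```bash
# made-up sample in the shape of the Open-Meteo response (not real data)
sample='{"hourly":{"precipitation":[0,0.5,1.5,0]}}'

# run the example only if jq is installed
if command -v jq > /dev/null
then
    # add sums all the numbers of the array
    total=$(echo "$sample" | jq '.hourly.precipitation | add')
    echo "Total: $total mm"
fi
```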
