# Example: Fun with For Loops and the Iteration Pattern
One of the most conserved iteration constructs, found in many, if not all, programming languages, is the `for` loop.
> `for` loops repeatedly evaluate a block of statements while iterating over a sequence of values. Thus, `for` loops perform a fixed (known) number of iterations.

Let's look at some example `for` loops where we iterate over both ordered collections, like [arrays](https://docs.julialang.org/en/v1/base/arrays/#lib-arrays), [tuples](https://docs.julialang.org/en/v1/manual/functions/#Tuples) and [NamedTuples](https://docs.julialang.org/en/v1/base/base/#Core.NamedTuple), and unordered collections like [Dictionaries](https://docs.julialang.org/en/v1/base/collections/#Dictionaries) and [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set).

## Setup
We set up the computational environment by including the `Include.jl` file. The file loads external packages and various functions we will use in these examples. For additional information on functions and types used in this material, see the [Julia programming language documentation](https://docs.julialang.org/en/v1/).

In [1]:
include("Include.jl");

## Example 1: Basic `for` loop iteration example for an ordered collection
Let's start by looking at the basic structure of a `for` loop. A `for` loop has a header that specifies the values the loop iterates over, and therefore how many times it runs. The loop `index` (the `iteration variable`) is passed into the loop's body, where you put your logic. The `iteration variable` is always a new variable, even if a variable of the same name exists in the enclosing scope.
* In Julia, a `for` loop introduces its own scope: the loop body can read and update variables that already exist in the enclosing scope, but variables first created inside the loop are not visible after the loop. In a notebook or the REPL, this includes global variables such as `value` in the cell below. The scope of the `for` loop ends with the `end` keyword.
* We use the [println function](https://docs.julialang.org/en/v1/base/io-network/#Base.println) to show output from the loop. This function prints its arguments (here, a `String`), followed by a newline, to `stdout` (the default output destination) or to an output stream specified by the caller. The `$(...)` syntax is an example of [String interpolation](https://docs.julialang.org/en/v1/manual/strings/#string-interpolation).

In [13]:
number_of_elements = 5;
random_vector_array = rand(number_of_elements)
value = nothing
for i in 1:number_of_elements
    value = random_vector_array[i];
    println("The index i = $(i) and the value = $(random_vector_array[i])");
end # end of for loop scope
value

The index i = 1 and the value = 0.04728193285404991
The index i = 2 and the value = 0.796320889530637
The index i = 3 and the value = 0.5011203650758014
The index i = 4 and the value = 0.5168139655971882
The index i = 5 and the value = 0.5872797050471509


0.5872797050471509
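
The scoping rules described above can be illustrated with a minimal sketch. The function name `scope_demo` and the variables inside it are hypothetical and chosen only for illustration; wrapping the loop in a function keeps the behavior the same whether the code runs in a notebook or in a script.

```julia
# Minimal sketch of `for` loop scoping inside a function (all names are illustrative)
function scope_demo()
    i = 100        # an `i` already exists in the enclosing scope
    total = 0      # `total` exists before the loop, so the loop body can update it
    for i in 1:3   # this `i` is a new variable, separate from the outer `i`
        squared = i^2      # `squared` is created inside the loop ...
        total += squared   # ... while `total` refers to the outer variable
    end
    # After `end`, the outer `i` is unchanged and `squared` is no longer visible
    return (i, total)
end

scope_demo() # returns (100, 14)
```

The outer `i` keeps its value because the iteration variable is always new, while `total` accumulates `1 + 4 + 9` because it existed before the loop started.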

Another `for` loop pattern is the [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex). The `eachindex` function returns an iterable over the valid indices of an ordered collection, so we can visit every element in order without computing the number of elements ourselves.
* The [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex) is preferred over something like `for i in 1:length(random_vector_array)` because it does not assume that the indices of the `random_vector_array` collection start at `1` and end at its length.

In [15]:
for i ∈ eachindex(random_vector_array)
    value = random_vector_array[i];
    println("The index i = $(i) and the value = $(value)");
end

The index i = 1 and the value = 0.04728193285404991
The index i = 2 and the value = 0.796320889530637
The index i = 3 and the value = 0.5011203650758014
The index i = 4 and the value = 0.5168139655971882
The index i = 5 and the value = 0.5872797050471509
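
As a rough illustration of why `eachindex` is convenient, the sketch below defines a hypothetical helper, `sum_with_eachindex`, that works unchanged on a vector, a matrix and a tuple. The helper and the sample inputs are assumptions made for this example and are not part of `Include.jl`.

```julia
# Hypothetical helper: sums the elements of an indexable collection using eachindex
function sum_with_eachindex(collection)
    total = zero(eltype(collection))   # start from a zero of the element type
    for i in eachindex(collection)     # visit every valid index, whatever its form
        total += collection[i]
    end
    return total
end

vector_total = sum_with_eachindex([1.0, 2.0, 3.0])   # 6.0
matrix_total = sum_with_eachindex([1 2; 3 4])        # 10
tuple_total  = sum_with_eachindex((1, 2, 3))         # 6
```

Because the loop header never mentions `1` or `length`, the same function body applies to each of these ordered collections.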


If we don't care about the element index in a collection, we can iterate over the elements directly. For example, the code block below accesses the values of the `random_vector_array` directly, but `NOT` their indices:

In [16]:
for value ∈ random_vector_array
    println("The value = $(value)");
end

The value = 0.04728193285404991
The value = 0.796320889530637
The value = 0.5011203650758014
The value = 0.5168139655971882
The value = 0.5872797050471509


Finally, we can iterate over other types of ordered collections, e.g., [tuples](https://docs.julialang.org/en/v1/manual/functions/#Tuples), which are fixed-length ordered containers that can hold any values but cannot be modified once constructed, i.e., they are immutable. Let's build a `tuple` holding some `Int` values and iterate over them using the [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex).
* The previous examples used the [println function](https://docs.julialang.org/en/v1/base/io-network/#Base.println) to print output; here, we show another approach that does the same thing, namely the [@show macro](https://docs.julialang.org/en/v1/base/base/#Base.@show) which prints one or more expressions, and their results, to `stdout`, and returns the last result:

In [22]:
example_tuple = (1,2,3,6,5,4,1,1,1,1)
for i ∈ eachindex(example_tuple)
    @show (i, example_tuple[i])
end

(i, example_tuple[i]) = (1, 1)
(i, example_tuple[i]) = (2, 2)
(i, example_tuple[i]) = (3, 3)
(i, example_tuple[i]) = (4, 6)
(i, example_tuple[i]) = (5, 5)
(i, example_tuple[i]) = (6, 4)
(i, example_tuple[i]) = (7, 1)
(i, example_tuple[i]) = (8, 1)
(i, example_tuple[i]) = (9, 1)
(i, example_tuple[i]) = (10, 1)


## Example 2: Basic `for` loop iteration example for an unordered `Set`
In the case of unordered collections such as [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set), we can use a `for` loop to visit each element of the collection. However, because the collection is unordered, the order in which we visit the elements is arbitrary: it need not match the insertion order, and we should not rely on it.
* In the specific case of [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set), which model `bags of stuff`, there is no notion of an index (which can be a little confusing). Thus, we can only access the items directly. The other interesting thing about [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set) is that their elements are `unique`, i.e., there are no repeated elements. For example, `'A'` appears twice in the input below but only once in the `Set`.

In [21]:
example_set = Set{Char}(['A','B','C','D','R','U','S','T','A']);
for value ∈ example_set
    @show value
end

value = 'C'
value = 'U'
value = 'D'
value = 'A'
value = 'R'
value = 'S'
value = 'T'
value = 'B'
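
The uniqueness and lack of ordering can be checked with a small sketch. The variable names below are illustrative; sorting a collected copy is one possible way to get a reproducible visiting order when that matters.

```julia
# Sketch: a Set drops repeated elements and has no index, but supports membership tests
letters = Set{Char}(['A', 'B', 'C', 'A'])   # 'A' is given twice
number_of_unique_letters = length(letters)  # 3, the duplicate 'A' is stored once
contains_b = 'B' in letters                 # true

# If a reproducible order is needed, iterate over a sorted copy instead of the Set itself
sorted_letters = sort(collect(letters))
for value in sorted_letters
    println("value = $(value)")
end
```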


## Example 3: Basic `for` loop iteration example for an unordered `Dict`
We gathered a daily open-high-low-close `dataset` for each firm in the [S&P500](https://en.wikipedia.org/wiki/S%26P_500) from `01-03-2018` until `12-29-2023`, along with data for a few exchange-traded funds and volatility products during that time. 
* Let's load the `original_dataset` by calling the `MyMarketDataSet()` function and remove firms that do not have the maximum number of trading days. The cleaned dataset $\mathcal{D}$ is stored in the `dataset` variable, where the dataset $\mathcal{D}$ has data for $\mathcal{L}$ firms.
* When we use a `for` loop with a [Dictionary](https://docs.julialang.org/en/v1/base/collections/#Dictionaries), which is an unordered collection of `key=>value` pairs, the `iteration variable` is a `key=>value` pair (a `Pair`), which we can destructure in the loop header as a `(key, value)` tuple.
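
Before loading the market data, here is a minimal sketch of dictionary iteration on a toy `Dict`. The tickers `"AAA"` and `"BBB"` and their values are made up for illustration and are not part of the dataset.

```julia
# Toy dictionary with hypothetical tickers and prices
toy_prices = Dict("AAA" => 10.0, "BBB" => 20.0)

# Iterating directly yields key => value pairs
for pair in toy_prices
    println("key = $(pair.first), value = $(pair.second)")
end

# Destructuring the pair in the loop header gives the key and value separately
for (ticker, price) in toy_prices
    println("ticker = $(ticker), price = $(price)")
end

first_pair = first(toy_prices)   # some key => value pair; which one is not guaranteed
```

Either loop visits every entry exactly once, but the order of the printed lines may differ from the order in which the entries were written.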

In [7]:
original_dataset = MyMarketDataSet() |> x-> x["dataset"];

In [23]:
original_dataset

Dict{String, DataFrame} with 515 entries:
  "TPR"  => 1508×8 DataFrame…
  "EMR"  => 1508×8 DataFrame…
  "CTAS" => 1508×8 DataFrame…
  "HSIC" => 1508×8 DataFrame…
  "KIM"  => 1508×8 DataFrame…
  "PLD"  => 1508×8 DataFrame…
  "IEX"  => 1508×8 DataFrame…
  "KSU"  => 994×8 DataFrame…
  "BAC"  => 1508×8 DataFrame…
  "CBOE" => 1508×8 DataFrame…
  "EXR"  => 1508×8 DataFrame…
  "NCLH" => 1508×8 DataFrame…
  "CVS"  => 1508×8 DataFrame…
  "DRI"  => 1508×8 DataFrame…
  "DTE"  => 1508×8 DataFrame…
  "ZION" => 1508×8 DataFrame…
  "AVY"  => 1508×8 DataFrame…
  "EW"   => 1508×8 DataFrame…
  "EA"   => 1508×8 DataFrame…
  "NWSA" => 1508×8 DataFrame…
  "BBWI" => 607×8 DataFrame…
  "CAG"  => 1508×8 DataFrame…
  "GPC"  => 1508×8 DataFrame

In [24]:
original_dataset["AAPL"]

| Row | volume | volume_weighted_average_price | open | close | high | low | timestamp | number_of_transactions |
|---|---|---|---|---|---|---|---|---|
|  | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | DateTime | Int64 |
| 1 | 1.17982e8 | 43.2781 | 43.1325 | 43.0575 | 43.6375 | 42.99 | 2018-01-03T05:00:00 | 188333 |
| 2 | 8.97384e7 | 43.2473 | 43.135 | 43.2575 | 43.3675 | 43.02 | 2018-01-04T05:00:00 | 153150 |
| 3 | 9.46401e7 | 43.6732 | 43.36 | 43.75 | 43.8425 | 43.2625 | 2018-01-05T05:00:00 | 152173 |
| 4 | 8.22711e7 | 43.6581 | 43.5875 | 43.5875 | 43.9025 | 43.4825 | 2018-01-08T05:00:00 | 138842 |
| 5 | 8.6336e7 | 43.5803 | 43.6375 | 43.5825 | 43.765 | 43.3525 | 2018-01-09T05:00:00 | 154006 |
| 6 | 9.58396e7 | 43.4126 | 43.29 | 43.5725 | 43.575 | 43.25 | 2018-01-10T05:00:00 | 151201 |
| 7 | 7.46709e7 | 43.7894 | 43.6475 | 43.82 | 43.8722 | 43.6225 | 2018-01-11T05:00:00 | 117864 |
| 8 | 9.68123e7 | 44.1806 | 44.045 | 44.2725 | 44.34 | 43.9125 | 2018-01-12T05:00:00 | 151952 |
| 9 | 1.18128e8 | 44.3672 | 44.475 | 44.0475 | 44.8475 | 44.035 | 2018-01-16T05:00:00 | 195534 |
| 10 | 1.37547e8 | 44.3277 | 44.0375 | 44.775 | 44.8125 | 43.7675 | 2018-01-17T05:00:00 | 218162 |


### Clean the data
Not all tickers in our dataset have the maximum number of trading days for various reasons, e.g., acquisition or de-listing events. Let's collect only those tickers with the maximum number of trading days.

* First, let's compute the number of records for a company that we know has the maximum number of trading days, e.g., `AAPL`, and save that value in the `maximum_number_trading_days` variable:

In [25]:
maximum_number_trading_days = original_dataset["AAPL"] |> nrow

1508

Now, let's iterate through our data and collect only tickers with `maximum_number_trading_days` records. Save that data in the `dataset::Dict{String,DataFrame}` variable:

In [30]:
dataset = Dict{String, DataFrame}();
for (ticker,data) ∈ original_dataset    
    if (nrow(data) == maximum_number_trading_days)
        dataset[ticker] = data;
    end
end
dataset;

In [29]:
length(dataset)

460
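
The same cleaning step can also be sketched without an explicit loop. The example below is a self-contained illustration on a toy dictionary of vectors, where `length` stands in for `nrow`. The names `toy_dataset`, `maximum_length` and `cleaned_toy_dataset` are hypothetical, and computing the maximum directly (instead of reading it from a known ticker) is an alternative chosen for this sketch, not the approach used above.

```julia
# Toy stand-in for the market data: hypothetical tickers mapped to vectors of values
toy_dataset = Dict(
    "AAA" => [1.0, 2.0, 3.0],
    "BBB" => [1.0, 2.0],        # fewer records than the others
    "CCC" => [4.0, 5.0, 6.0],
)

# Largest number of records found across all entries
maximum_length = maximum(length, values(toy_dataset))

# filter on a Dict passes each key => value pair to the predicate and returns a new Dict
cleaned_toy_dataset = filter(pair -> length(pair.second) == maximum_length, toy_dataset)

for (ticker, data) in cleaned_toy_dataset
    println("ticker = $(ticker) has $(length(data)) records")
end
```

Whether the explicit `for` loop or `filter` reads better is a matter of taste; both build a new dictionary that keeps only the entries with the maximum number of records.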