# Example: Fun with Iteration Patterns for Arrays, Sets and Dictionaries
One of the most widely shared iteration constructs, found in most, if not all, programming languages, is [the `for-loop`](https://docs.julialang.org/en/v1/base/base/#for).
> [`for-loops`](https://docs.julialang.org/en/v1/base/base/#for) repeatedly evaluate a block of statements while iterating over a sequence of values. Thus, a [`for-loop`](https://docs.julialang.org/en/v1/base/base/#for) performs a fixed (known) number of iterations.

Let's look at some example [`for-loops`](https://docs.julialang.org/en/v1/base/base/#for) where we iterate over both ordered linear collections, like [arrays](https://docs.julialang.org/en/v1/base/arrays/#lib-arrays), [tuples](https://docs.julialang.org/en/v1/manual/functions/#Tuples) and [NamedTuples](https://docs.julialang.org/en/v1/base/base/#Core.NamedTuple), and unordered non-linear collections like [Dictionaries](https://docs.julialang.org/en/v1/base/collections/#Dictionaries) and [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set).

## Setup
We set up the computational environment by including the `Include.jl` file using [the `include(...)` method](https://docs.julialang.org/en/v1/base/base/#Base.include). The `Include.jl` file loads the external packages and functions we will use in these examples.
* For additional information on functions and types used in this example, see the [Julia programming language documentation](https://docs.julialang.org/en/v1/). 

In [3]:
include("Include.jl");

## Example 1: Basic `for-loop` iteration example for an `Array`
Let's start by looking at the basic structure of [a `for-loop`](https://docs.julialang.org/en/v1/base/base/#for). A [`for-loop`](https://docs.julialang.org/en/v1/base/base/#for) has a header that specifies the sequence of values to iterate over, and therefore how many times the loop will run. The loop `index` (the `iteration variable`) is passed into the loop's body, where you put your logic. The `iteration variable` is always a new variable, even if a variable of the same name exists in the enclosing scope.
* In Julia, the [`for-loop`](https://docs.julialang.org/en/v1/base/base/#for) has its own local scope that captures variables from the outside but doesn't pass new variables created inside the loop to the outside unless they already exist. The [local scope of the for loop](https://docs.julialang.org/en/v1/manual/variables-and-scoping/#local-scope) ends with the `end` keyword.
* We use the [println function](https://docs.julialang.org/en/v1/base/io-network/#Base.println) to show output from the loop. This function takes one or more values as arguments (here, a [`String`](https://docs.julialang.org/en/v1/manual/strings/)) and prints them, followed by a newline, to `stdout` (the default output destination) or to an output stream specified by the caller. The `$(...)` is an example of a [String interpolation](https://docs.julialang.org/en/v1/manual/strings/#string-interpolation) operation.

In [39]:
number_of_elements = 5;
random_vector_array = rand(number_of_elements); # create an array of random Floating point values
value = nothing
for i in 1:number_of_elements
    value = random_vector_array[i];
    println("The index i = $(i) and the value = $(random_vector_array[i])");
end # end of for loop scope
value

The index i = 1 and the value = 0.08130793486959287
The index i = 2 and the value = 0.08967492911050956
The index i = 3 and the value = 0.3110503043390046
The index i = 4 and the value = 0.7986525710464417
The index i = 5 and the value = 0.9923589405218479


0.9923589405218479
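
The scoping rules described above can be sketched in a small, self-contained example. This is only an illustration: the names `scope_demo`, `total`, and `loop_only_value` are hypothetical, and the loop is wrapped in a function so that it behaves the same way in a script, the REPL, or a notebook.

```julia
# A minimal sketch of for-loop scoping (all names here are illustrative).
function scope_demo()
    i = 100                      # an outer variable that happens to be named i
    total = 0                    # defined before the loop, so the loop can update it
    for i in 1:3                 # this i is a new variable, local to the loop
        total += i               # updates the enclosing `total`
        loop_only_value = 2 * i  # created inside the loop body
    end
    # After `end`, the outer i is untouched and loop_only_value is not visible
    return (i, total, @isdefined(loop_only_value))
end

outer_i, total, still_defined = scope_demo()
println("outer i = $(outer_i), total = $(total), loop variable visible = $(still_defined)")
```

The outer `i` keeps its original value because the iteration variable is always new, while `total` changes because it already existed before the loop started.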

Another [`for-loop`](https://docs.julialang.org/en/v1/base/base/#for) pattern is the [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex). We use the [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex) when we don't explicitly know how many elements we have in an ordered collection, but we want to visit all of them in order. 
* The [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex) is the preferred pattern compared with something like `for i in 1:length(random_vector_array)` when we don't know how many elements are in the `random_vector_array` collection. __Why is this true?__

In [7]:
# N = length(random_vector_array)
for i ∈ eachindex(random_vector_array)
    value = random_vector_array[i];
    println("The index i = $(i) and the value = $(value)");
end

The index i = 1 and the value = 0.9829459947092559
The index i = 2 and the value = 0.48592195544473493
The index i = 3 and the value = 0.6144689681480376
The index i = 4 and the value = 0.9635346621442915
The index i = 5 and the value = 0.10922263094498108
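
One possible answer to the question above: `eachindex` asks the collection for its valid indices, instead of assuming that they are `1, 2, …, length`. For a plain `Vector` the two are the same, but not every indexable collection works that way (arrays with custom index ranges are another case). The sketch below uses a `String` from Base as a stand-in, since its valid indices are byte positions; the variable names are illustrative.

```julia
# A String is indexed by byte position, so its valid indices need not be 1:length.
greek = "αβγ"                               # each of these letters takes two bytes in UTF-8
valid_indices = collect(eachindex(greek))   # the indices that are safe to use

for i in eachindex(greek)
    println("index $(i) holds $(greek[i])")
end

# 1:length(greek) is 1:3, but index 2 falls inside the first character
naive_index_fails = try
    greek[2]
    false
catch err
    err isa StringIndexError
end
println("length = $(length(greek)), valid indices = $(valid_indices), greek[2] fails = $(naive_index_fails)")
```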


If we don't care about the element index $i$ in a collection but instead want the values, we can iterate over the elements directly. For example, the code block below accesses the values of the `random_vector_array`  directly but `NOT` their indexes:

In [9]:
for value ∈ random_vector_array
    println("The value = $(value)");
end

The value = 0.9829459947092559
The value = 0.48592195544473493
The value = 0.6144689681480376
The value = 0.9635346621442915
The value = 0.10922263094498108
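
When both a counter and the value are needed, one option from Base is `enumerate`, which pairs each element with a running count that starts at 1. The sketch below uses a small fixed vector (hypothetical values, not the random array above) so the result is reproducible.

```julia
# Hypothetical fixed data so the output is the same on every run
fixed_values = [0.25, 0.5, 0.75]
visited = Tuple{Int, Float64}[]   # records what the loop saw

for (count, value) in enumerate(fixed_values)
    println("The count = $(count) and the value = $(value)")
    push!(visited, (count, value))
end
```

Note that `enumerate` produces a count, not necessarily an index; for collections whose indices do not start at 1, `pairs(collection)` yields the actual index alongside each value.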


Finally, we can iterate over other types of ordered collections, e.g., [tuples](https://docs.julialang.org/en/v1/manual/functions/#Tuples), which are fixed-length ordered containers that can hold any values, but cannot be modified once constructed, i.e., they are immutable. Let's build a `tuple` holding some `Int` types and iterate over these using the [eachindex pattern](https://docs.julialang.org/en/v1/base/arrays/#Base.eachindex).
* The previous examples used the [println function](https://docs.julialang.org/en/v1/base/io-network/#Base.println) to print output; here, we show another approach that does the same thing, namely the [@show macro](https://docs.julialang.org/en/v1/base/base/#Base.@show) which prints one or more expressions, and their results, to `stdout`, and returns the last result:

In [11]:
example_tuple = (1,2,3,6,5,4,1,1,1,1)
for i ∈ eachindex(example_tuple)
    @show (i, example_tuple[i])
end

(i, example_tuple[i]) = (1, 1)
(i, example_tuple[i]) = (2, 2)
(i, example_tuple[i]) = (3, 3)
(i, example_tuple[i]) = (4, 6)
(i, example_tuple[i]) = (5, 5)
(i, example_tuple[i]) = (6, 4)
(i, example_tuple[i]) = (7, 1)
(i, example_tuple[i]) = (8, 1)
(i, example_tuple[i]) = (9, 1)
(i, example_tuple[i]) = (10, 1)
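
The immutability of tuples mentioned above can be checked directly. In this minimal sketch (the names `small_tuple` and `outcome` are illustrative), reading by index works, while writing by index throws an error because no `setindex!` method is defined for tuples.

```julia
# Hypothetical tuple used only for this illustration
small_tuple = (1, 2, 3)

# Reading by index works just as it does for an array
first_item = small_tuple[1]

# Writing by index is not defined for tuples, so Julia throws a MethodError
outcome = try
    small_tuple[1] = 10
    "modified"
catch err
    err isa MethodError ? "immutable" : rethrow()
end
println("first item = $(first_item), outcome = $(outcome)")
```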


## Example 2: Basic `for-loop` iteration example for an unordered `Set`
One of the central differences between linear and nonlinear data structures is the organization of the data and how we traverse the items in the data structure. In the case of unordered non-linear collections such as [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set), we can use a [`for-loop`](https://docs.julialang.org/en/v1/base/base/#for) to visit each element of the collection. However, because the collection is unordered, the order in which we visit the elements is arbitrary: it depends on how the collection stores its elements internally, not on the order in which they were inserted, so we should not rely on it.
* In the specific case of [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set), which model `bags of stuff`, there is no notion of index (which can be a little confusing). Thus, we can only access the items directly. The other exciting thing about [Sets](https://docs.julialang.org/en/v1/base/collections/#Base.Set) is that their elements are `unique`, i.e., there are no repeated elements.

In [13]:
example_set = Set{Char}(['A','B','C','D','R','U','S','T','A']);
for value ∈ example_set
    @show value
end

value = 'C'
value = 'U'
value = 'D'
value = 'A'
value = 'R'
value = 'S'
value = 'T'
value = 'B'
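
The uniqueness property explains why the loop above prints eight values even though nine characters were passed in. A minimal sketch with a smaller, hypothetical set shows the same behavior, along with a membership test using `in`:

```julia
# The repeated 'A' is stored only once
letters = Set(['A', 'B', 'A'])
println("number of elements = $(length(letters))")
println("contains 'A'? $('A' in letters)")

# Adding an element that is already present changes nothing
push!(letters, 'B')
push!(letters, 'Z')

# Sets have no indices; collect and sort when a stable order is needed
sorted_letters = sort(collect(letters))
println("sorted elements = $(sorted_letters)")
```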


## Example 3: Basic `for-loop` iteration of an unordered `Dict`
Dictionaries are also examples of non-linear data structures; non-linear data structures do not sequentially arrange data. Instead, data can be connected in a hierarchical or network-based format, allowing for more complex relationships between the elements. However, we can still traverse a dictionary [using a `for-loop`](https://docs.julialang.org/en/v1/base/base/#for).

We gathered a daily open-high-low-close `dataset` for each firm in the [S&P500](https://en.wikipedia.org/wiki/S%26P_500) from `01-03-2018` until `12-29-2023`, along with data for a few exchange-traded funds and volatility products during that time. 
* Let's load the `original_dataset` by calling the `MyMarketDataSet()` function and remove firms that do not have the maximum number of trading days. The cleaned dataset $\mathcal{D}$ is stored in the `dataset` variable, where the dataset $\mathcal{D}$ has data for $\mathcal{L}$ firms.
* When we use a [`for-loop`](https://docs.julialang.org/en/v1/base/base/#for) with a [Dictionary](https://docs.julialang.org/en/v1/base/collections/#Dictionaries), which is an unordered collection of `key=>value` pairs, the `iteration variable` is a `key=>value` pair (a `Pair`), which we can destructure into `(key, value)` in the loop header, much like a `tuple`.

In [15]:
original_dataset = MyMarketDataSet() |> x-> x["dataset"];

In [43]:
original_dataset

Dict{String, DataFrame} with 515 entries:
  "TPR"  => 1508×8 DataFrame…
  "EMR"  => 1508×8 DataFrame…
  "CTAS" => 1508×8 DataFrame…
  "HSIC" => 1508×8 DataFrame…
  "KIM"  => 1508×8 DataFrame…
  "PLD"  => 1508×8 DataFrame…
  "IEX"  => 1508×8 DataFrame…
  "KSU"  => 994×8 DataFrame…
  "BAC"  => 1508×8 DataFrame…
  "CBOE" => 1508×8 DataFrame…
  "EXR"  => 1508×8 DataFrame…
  "NCLH" => 1508×8 DataFrame…
  "CVS"  => 1508×8 DataFrame…
  "DRI"  => 1508×8 DataFrame…
  "DTE"  => 1508×8 DataFrame…
  "ZION" => 1508×8 DataFrame…
  "AVY"  => 1508×8 DataFrame…
  "EW"   => 1508×8 DataFrame…
  "EA"   => 1508×8 DataFrame…
  "NWSA" => 1508×8 DataFrame…
  "BBWI" => 607×8 DataFrame…
  "CAG"  => 1508×8 DataFrame…
  "GPC"  => 1508×8 DataFrame

In [47]:
original_dataset["AAPL"]; # Hmmm, we access the elements of a dictionary much like an array, but with a key instead of an integer index?
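
The dictionary access and iteration patterns used in this example can be tried on a small toy dictionary. In the sketch below, the tickers `"AAA"` and `"BBB"` and their values are hypothetical stand-ins for the market data, and `haskey` and `get` are Base functions for checking and safely reading keys.

```julia
# Hypothetical toy dictionary standing in for the market dataset
toy_prices = Dict("AAA" => 10.0, "BBB" => 20.0)

# Each iteration yields a Pair, which we destructure into (key, value)
seen_tickers = String[]
for (ticker, price) in toy_prices
    println("$(ticker) => $(price)")
    push!(seen_tickers, ticker)
end

# Without destructuring, the iteration variable is the Pair itself
first_entry = first(toy_prices)
println("first entry is a Pair? $(first_entry isa Pair)")

# Indexing by a missing key throws a KeyError, so check or supply a default
has_aaa = haskey(toy_prices, "AAA")
fallback = get(toy_prices, "ZZZ", -1.0)   # -1.0 is an arbitrary default for this sketch
```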

### Clean the data
Not all tickers in our dataset have the maximum number of trading days for various reasons, e.g., acquisition or de-listing events. Let's collect only those tickers with the maximum number of trading days.

* First, let's compute the number of records for a company that we know has the maximum number of trading days, e.g., `AAPL`, and save that value in the `maximum_number_trading_days` variable:

In [19]:
maximum_number_trading_days = original_dataset["AAPL"] |> nrow; # nrow? (check out: DataFrames.jl)

1508

Now, let's iterate through our data and collect only tickers with `maximum_number_trading_days` records. Save that data in the `dataset::Dict{String,DataFrame}` variable:

In [21]:
dataset = Dict{String, DataFrame}();
for (ticker,data) ∈ original_dataset    
    if (nrow(data) == maximum_number_trading_days)
        dataset[ticker] = data;
    end
end
dataset;

In [22]:
length(dataset)

460
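
As an alternative to the explicit loop, the same cleaning step could be written with `filter`, which for a dictionary passes each `key=>value` pair to the predicate. The sketch below is an illustration under stated assumptions, not the approach used above: the toy tickers are hypothetical, plain vectors stand in for the `DataFrame` values, and `length` plays the role of `nrow`. It also computes the maximum record count from the data instead of relying on a known ticker.

```julia
# Hypothetical toy data: vectors stand in for DataFrames, length stands in for nrow
toy_dataset = Dict(
    "AAA" => [1.0, 2.0, 3.0],
    "BBB" => [1.0, 2.0],          # fewer records, so it should be dropped
    "CCC" => [4.0, 5.0, 6.0],
)

# Largest number of records across all entries
max_records = maximum(length, values(toy_dataset))

# Keep only the entries with the maximum number of records
cleaned = filter(pair -> length(pair.second) == max_records, toy_dataset)
kept_tickers = sort(collect(keys(cleaned)))
println("kept $(length(cleaned)) of $(length(toy_dataset)) tickers: $(kept_tickers)")
```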