# Get Programming with F# by [Isaac Abraham](https://github.com/isaacabraham)

## “Working with collections in F#”

Isaac Abraham proposes three types of C# developers working with collections:

1. _The C# 2 developer:_ the one who is doggedly imperative and is well within the OS-level shadow of C++, staying firmly with `for`, `while` and `do` loops.
2. _The LINQ developer:_ the one who is quite fond of using SQL set operations in C# with naïve disregard for expensive allocations and mutations.
3. _The wannabe FP developer:_ the one who understands the complexity-control benefits of the functional approach and might even dabble in the [immutable collections](https://docs.microsoft.com/en-us/archive/msdn-magazine/2017/march/net-framework-immutable-collections) of C#.

Two out of three of these developer types might find the following challenging:

>Given a set of football results …which teams won the most away games in the season.

In [None]:
#!fsharp

type FootballResult = { HomeTeam : string; AwayTeam : string; HomeGoals : int; AwayGoals : int }

let create (ht, hg) (at, ag) =
    { HomeTeam = ht; AwayTeam = at; HomeGoals = hg; AwayGoals = ag }


The `create` function is a brilliant mix: a curried function taking tuple arguments! We use this function to generate a set of data such that the answer to our football question is:

- Bale Town: 2 wins
- Ronaldo City: 1 win

In [None]:
#!fsharp

let results =
    [
        create ("Messiville", 1) ("Ronaldo City", 2)
        create ("Messiville", 1) ("Bale Town", 3)
        create ("Bale Town", 3) ("Ronaldo City", 1)
        create ("Bale Town", 2) ("Messiville", 1)
        create ("Ronaldo City", 4) ("Messiville", 2)
        create ("Ronaldo City", 1) ("Bale Town", 2)
    ]
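
Because `create` is curried, supplying only the first tuple gives back a function that still waits for the away side. The sketch below repeats the definitions above so it runs on its own; `messivilleAtHome` is a hypothetical name used only for illustration.

```fsharp
type FootballResult = { HomeTeam : string; AwayTeam : string; HomeGoals : int; AwayGoals : int }

let create (ht, hg) (at, ag) =
    { HomeTeam = ht; AwayTeam = at; HomeGoals = hg; AwayGoals = ag }

// Partial application: only the home tuple is supplied here.
// `messivilleAtHome` (hypothetical name) has type string * int -> FootballResult.
let messivilleAtHome = create ("Messiville", 1)

// Supplying the away tuple completes the record.
let first = messivilleAtHome ("Ronaldo City", 2)
let second = messivilleAtHome ("Bale Town", 3)

printfn "%s %d-%d %s" first.HomeTeam first.HomeGoals first.AwayGoals first.AwayTeam
// prints "Messiville 1-2 Ronaldo City"
```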

I will resort to LINQ to approach this challenge, reminding myself that [lesson 13](https://github.com/BryanWilhite/jupyter-central/blob/master/get-programming-with-f-sharp/13-achieving-code-reuse-in-fsharp.ipynb) introduces the use of inline functions (lambda syntax):

In [None]:
#!fsharp

open System.Linq

// 1 for an away win, 0 otherwise (a draw is not an away win)
let win result =
    if result.AwayGoals > result.HomeGoals then 1
    else 0

results
    .GroupBy(fun result -> result.AwayTeam)
    .Select(fun group -> (group.First().AwayTeam, group.Sum(fun result -> win result)))
    .OrderByDescending(fun tuple -> snd tuple)
    .Select(fun tuple -> $"{fst tuple}: {snd tuple} wins")

index,value
0,Bale Town: 2 wins
1,Ronaldo City: 1 wins
2,Messiville: 0 wins


In C#, I would have used [anonymous objects](https://docs.microsoft.com/en-us/dotnet/csharp/fundamentals/types/anonymous-types) instead of a tuple to get the results. The closest F# counterpart is the anonymous record (`{| … |}`, available since F# 4.6). _Object expressions_ [📖 [docs](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/object-expressions)] are a different feature (ad hoc implementations of interfaces or classes) and are not mentioned until lesson 25, so using either above would be skipping ahead too far in this study!
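
For comparison, here is a minimal sketch of the same LINQ query projecting into an F# anonymous record rather than a tuple. It assumes F# 4.6 or later; `sample`, `summary`, `Team` and `Wins` are hypothetical names, and the three results are a cut-down stand-in for `results`.

```fsharp
open System.Linq

type FootballResult = { HomeTeam : string; AwayTeam : string; HomeGoals : int; AwayGoals : int }

// A cut-down, hypothetical data set so the sketch runs on its own.
let sample =
    [ { HomeTeam = "Messiville"; AwayTeam = "Bale Town"; HomeGoals = 1; AwayGoals = 3 }
      { HomeTeam = "Ronaldo City"; AwayTeam = "Bale Town"; HomeGoals = 1; AwayGoals = 2 }
      { HomeTeam = "Messiville"; AwayTeam = "Ronaldo City"; HomeGoals = 1; AwayGoals = 2 } ]

// The anonymous record {| Team; Wins |} plays the role of a C# anonymous type.
let summary =
    sample
        .GroupBy(fun r -> r.AwayTeam)
        .Select(fun g -> {| Team = g.Key; Wins = g.Count(fun r -> r.AwayGoals > r.HomeGoals) |})
        .OrderByDescending(fun row -> row.Wins)
        .ToList()

for row in summary do
    printfn "%s: %d wins" row.Team row.Wins
```

The named fields (`row.Team`, `row.Wins`) read more clearly than `fst` and `snd`, at the cost of leaning on a feature the book does not cover.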

## the collection modules

There are five collection types [📖 [docs](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/fsharp-collection-types)] in F#:

1. `List`
2. `Array`
3. `Seq`
4. `Map`
5. `Set`

Isaac Abraham distinguishes the first three types because their respective modules contain “functions designed for querying (and generating) collections.” Most of the query functions in these three modules are higher-order functions that take the collection as their last argument, which suits partial application and pipelines (with `|>`).

Of all of the collection types, `Seq` has the least familiar name (once we see that `Map` is a kind of `Dictionary`). Here are the Microsoft remarks about `Seq`:

>Sequences are particularly useful when you have a large, ordered collection of data but don’t necessarily expect to use all the elements. Individual sequence elements are computed only as required, so a sequence can perform better than a list if not all the elements are used. Sequences are represented by the `seq<'T>` type, which is an alias for `IEnumerable<T>`. Therefore, any .NET Framework type that implements `System.Collections.Generic.IEnumerable<'T>` can be used as a sequence.
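
The two claims in these remarks (elements computed only as required, and any `IEnumerable<T>` usable as a sequence) can be sketched as follows. The names `squares`, `firstFive`, `dotnetList` and `total` are hypothetical.

```fsharp
// An infinite sequence: nothing is computed until elements are requested.
let squares = Seq.initInfinite (fun i -> i * i)

// Only the first five elements are ever evaluated.
let firstFive = squares |> Seq.take 5 |> Seq.toList

printfn "%A" firstFive   // [0; 1; 4; 9; 16]

// Any .NET type implementing IEnumerable<T> can be passed to Seq functions.
let dotnetList = System.Collections.Generic.List<int>([ 1; 2; 3 ])
let total = dotnetList |> Seq.sum

printfn "%d" total       // 6
```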



We now see that `Seq` is the _most_ familiar F# collection to a C# developer. Once we recall that extension methods over `IEnumerable<T>` are the essence of LINQ, we can [browse through the documentation](https://fsharp.github.io/fsharp-core-docs/reference/fsharp-collections-seqmodule.html) for the `Seq` module and associate its querying functions with [LINQ extension methods](https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/classification-of-standard-query-operators-by-manner-of-execution):


| LINQ query operator | `Seq` function |
|- |-
| `Aggregate` | `Seq.fold` |
| `All` | `Seq.forall` |
| `Any` | `Seq.exists` |
| `AsEnumerable` | `Array.toSeq`, `List.toSeq`, `Map.toSeq`, `Set.toSeq` |
| `Average` | `Seq.average` |
| `Cast` | `Seq.cast` |
| `Concat` | `Seq.append` (`Seq.concat` flattens a sequence of sequences) |
| `Contains` | `Seq.contains` |
| `Count` | `Seq.length` |
| `DefaultIfEmpty` | no equivalent `Seq` function |
| `Distinct` | `Seq.distinct` |
| `ElementAt` | `Seq.item` |
| `ElementAtOrDefault` | no equivalent `Seq` function but `Seq.tryItem` is close |
| `Empty` | `Seq.empty` |
| `Except` | `Seq.except` |
| `First` | `Seq.head`, or `Seq.find` when a predicate is given |
| `FirstOrDefault` | no equivalent `Seq` function but `Seq.tryHead` and `Seq.tryFind` are close |
| `GroupBy` | `Seq.groupBy` |
| `GroupJoin` | no equivalent `Seq` function |
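| `Intersect` | no equivalent `Seq` function but `Set.intersect` can do (`Seq.map2` pairs elements by position, so it is not a set operation) |
| `Join` | no equivalent `Seq` function but an F# query expression (`query { … join … }`) can do |
| `Last` | `Seq.last`, or `Seq.findBack` when a predicate is given |
| `LastOrDefault` | no equivalent `Seq` function but `Seq.tryLast` and `Seq.tryFindBack` are close |
| `LongCount` | `Seq.length` (which returns `int`, not `int64`) |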
| `Max` | `Seq.max` |
| `Min` | `Seq.min` |
| `OfType` | no equivalent `Seq` function |
| `OrderBy` | `Seq.sortBy` |
| `OrderByDescending` | `Seq.sortByDescending` |
| `Range` | no equivalent `Seq` function but a [sequence expression](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/sequences#sequence-expressions) like `seq { 1 .. 5 }` does more |
| `Repeat` | `Seq.replicate` |
| `Reverse` | `Seq.rev` |
| `Select` | `Seq.map` |
| `SelectMany` | `Seq.collect` (see <https://stackoverflow.com/a/4600047/22944>) |
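| `SequenceEqual` | no equivalent `Seq` function; the `=` operator compares lists and arrays structurally, while for `seq` values checking that `Seq.compareWith` returns `0` should do |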
| `Single` | no equivalent `Seq` function (see [nine-year-old StackOverflow question](https://stackoverflow.com/questions/9850769/f-equivalent-of-linq-single)) but piping `Seq.where` and `Seq.exactlyOne` should do |
| `SingleOrDefault` | no equivalent `Seq` function but piping `Seq.where` and `Seq.tryExactlyOne` should do |
| `Skip` | `Seq.skip` |
| `SkipWhile` | `Seq.skipWhile` |
| `Sum` | `Seq.sum` |
| `Take` | `Seq.take` |
| `TakeWhile` | `Seq.takeWhile` |
| `ThenBy` and `ThenByDescending` | no equivalent `Seq` function but sorting once by a tuple key with `Seq.sortBy` should do |
| `ToArray` | `Seq.toArray` |
| `ToDictionary` | no equivalent `Seq` function; use the `dict` function instead (see <https://stackoverflow.com/a/9720581/22944>) |
| `ToList` | `Seq.toList` |
| `ToLookup` | no equivalent `Seq` function; piping `Seq.groupBy` into `Map.ofSeq` is close |
| `Union` | `Seq.append` followed by `Seq.distinct` (or `Set.union`) |
| `Where` | `Seq.where` |
| `Zip` | `Seq.zip` |

When we see this rather lengthy table, it is important to remember a few points:

- the `Seq` functions listed above should also be available in the `List` and `Array` modules where applicable
- the `Seq` module contains more functions than what is listed above
- the [MoreLinq Nuget package](https://www.nuget.org/packages/morelinq) should be familiar to C# programmers and F# contains much of its functionality by default
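
A few of the less obvious rows can be sketched with small integer lists. This is an illustration under my own assumptions, not code from the book; every name below (`left`, `right`, `sequenceEqual`, `people` and so on) is hypothetical.

```fsharp
let left = [ 1; 2; 3; 4 ]
let right = [ 3; 4; 5 ]

// Concat: Seq.append joins two sequences and keeps duplicates.
let concatenated = Seq.append left right |> Seq.toList            // [1; 2; 3; 4; 3; 4; 5]

// Union: append, then drop duplicates.
let union = Seq.append left right |> Seq.distinct |> Seq.toList   // [1; 2; 3; 4; 5]

// Intersect: go through Set, which offers the set operation directly.
let intersection = Set.intersect (set left) (set right) |> Set.toList   // [3; 4]

// SequenceEqual: compare element by element; 0 means "same elements, same order".
let sequenceEqual a b = Seq.compareWith compare a b = 0
let same = sequenceEqual (Seq.ofList left) (Seq.map id left)      // true

// OrderBy + ThenBy: sort once by a tuple key (last name, then first name).
let people = [ ("Smith", "Zoe"); ("Jones", "Al"); ("Smith", "Amy") ]
let sorted = people |> Seq.sortBy (fun (last, first) -> last, first) |> Seq.toList
// [("Jones", "Al"); ("Smith", "Amy"); ("Smith", "Zoe")]
```

`Seq.compareWith` is used for `SequenceEqual` because two distinct lazy `seq` values are, as far as I can tell, compared by reference when `=` is applied to them.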

## using collection-module functions instead of LINQ

Let’s try to translate the LINQ operations on `results` above to collection-module functions with the forward pipe operator, `|>`:

In [None]:
#!fsharp

results
|> List.groupBy(fun result -> result.AwayTeam)
|> List.map(fun group -> (((snd group) |> List.head).AwayTeam, (snd group) |> List.sumBy(fun result -> win result)))
|> List.sortByDescending(fun tuple -> snd tuple)
|> List.map(fun tuple -> $"{fst tuple}: {snd tuple} wins")

index,value
0,Bale Town: 2 wins
1,Ronaldo City: 1 wins
2,Messiville: 0 wins


This exercise of mine makes it quite plain why Isaac Abraham went in a completely different direction for this lesson in the book. The problem I have created for myself requires translating this `IGrouping<string, FootballResult>` from LINQ:

```csharp
group.First().AwayTeam
```

To this native F# tuple of type `string * FootballResult list` (`List.groupBy` pairs each key with a list of its results):

```fsharp
((snd group) |> List.head).AwayTeam
```

This native F# expression is far more complicated than what we get from the `IGrouping` interface of .NET. Because of me, we have to go all the way back to [lesson 9](https://github.com/BryanWilhite/jupyter-central/blob/master/get-programming-with-f-sharp/09-shaping-data-with-tuples.ipynb) to remind ourselves that `group` is a tuple (the asterisk should give the game away).

In the simplified example below, `g` is of type `string * seq<int>`:

In [None]:
#!fsharp

let g = ("one", seq {1; 2; 3})

(snd g) |> Seq.head

In order to access the sequence in the tuple `g` we have to use that `snd` function with which we should be familiar!

## neither of the solutions above is efficient or correct

The clue that what I have done above is wrong is this:

```plaintext
Messiville: 0 wins
```

By making grouping my first move, I scan the entire set of results and carry along teams that never won away, which is why Messiville shows up with zero wins.

Isaac Abraham writes:

> Start by thinking about _what_ it is you want to do, rather than _how_…
>
> 1. Find all results that had an away win.
> 2. Group all the away wins by the away team.
> 3. Sort the results in descending order by the numbers of away wins per team.

You will notice from my brilliant work above that I am _not_ getting to the point, which is the first thing Isaac is doing:

> Find all results that had an away win.

Isaac Abraham does this with his `isAwayWin` function:

In [None]:
#!fsharp

let isAwayWin result = result.AwayGoals > result.HomeGoals

Now we can `filter` the `results` down to _what_ we are looking for:


In [None]:
#!fsharp

results
|> List.filter isAwayWin                                //1. Find all results that had an away win.
|> List.countBy(fun result -> result.AwayTeam)          //2. Group all the away wins by the away team.
|> List.sortByDescending(fun (_, awayWins) -> awayWins) //3. Sort the results in descending order…

index,Item1,Item2
0,Bale Town,2
1,Ronaldo City,1


Putting my poor critical thinking skills aside, the C#-based problem I have is the fact that LINQ had no `CountBy` method by default when I wrote this (one was only added in .NET 9). I would have to go to MoreLinq for that [📖 [docs](https://morelinq.github.io/3.3/ref/api/html/T_MoreLinq_Extensions_CountByExtension.htm)]! It might help to think of `List.countBy` as counting by a grouping key which, in this case, is `AwayTeam`.
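
One way to see `List.countBy` as "counting by a grouping key" is to compare it with the longer `groupBy`-then-`map` spelling. A minimal sketch, using a hypothetical list of team names rather than full results:

```fsharp
// Hypothetical input: the away team of each away win.
let awayTeams = [ "Bale Town"; "Ronaldo City"; "Bale Town" ]

// One call: key each item, count items per key.
let viaCountBy = awayTeams |> List.countBy id

// Two calls: group by key, then measure each group.
let viaGroupBy =
    awayTeams
    |> List.groupBy id
    |> List.map (fun (team, occurrences) -> team, List.length occurrences)

// Sorting first keeps the comparison independent of result ordering.
printfn "%b" (List.sort viaCountBy = List.sort viaGroupBy)   // true
```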

The second problem I am aware of is my unnecessary use of `snd` on the `group` values I wrote above. Isaac is showing me that I could use a tuple pattern with a discard (`_`) to decompose `group`, which is totally awesome!

In [None]:
#!fsharp

results
|> List.groupBy(fun result -> result.AwayTeam)
|> List.map(fun (_, groupedResults) -> ((groupedResults |> List.head).AwayTeam, groupedResults |> List.sumBy(fun result -> win result)))
|> List.sortByDescending(fun (_, sum) -> sum)
|> List.map(fun (awayTeam, sum) -> $"{awayTeam}: {sum} wins")

index,value
0,Bale Town: 2 wins
1,Ronaldo City: 1 wins
2,Messiville: 0 wins


Yes, this is totally awesome but still wrong! For the sake of my bruised ego I could resort to this:


In [None]:
#!fsharp

results
|> List.filter(fun result -> win result = 1)
|> List.groupBy(fun result -> result.AwayTeam)
|> List.map(fun (_, groupedResults) -> ((groupedResults |> List.head).AwayTeam, groupedResults |> List.sumBy(fun result -> win result)))
|> List.sortByDescending(fun (_, sum) -> sum)
|> List.map(fun (awayTeam, sum) -> $"{awayTeam}: {sum} wins")

index,value
0,Bale Town: 2 wins
1,Ronaldo City: 1 wins


Now we can clearly see how my ignorance of the power of `countBy` is making me express _two_ function calls, `groupBy` and `map` instead of _one_:


In [None]:
#!fsharp

results
|> List.filter(fun result -> win result = 1)
|> List.countBy(fun result -> result.AwayTeam)
|> List.sortByDescending(fun (_, sum) -> sum)
|> List.map(fun (awayTeam, sum) -> $"{awayTeam}: {sum} wins")

index,value
0,Bale Town: 2 wins
1,Ronaldo City: 1 wins


Since the self-criticism is rather thick here (for the 21<sup>st</sup> century), I would like to add one last ding: my name for the function `win` is rather poor. According to the answers of [a StackOverflow question](https://stackoverflow.com/questions/526930/f-naming-convention), there is [an F# style guide](https://docs.microsoft.com/en-us/dotnet/fsharp/style-guide/) (at Microsoft) but I am not seeing any explicit notes about _naming_ functions. One might have to dig into [OCaml lore](https://github.com/lindig/ocaml-style) or, perhaps, understand that, because of the DSL intentions behind F#, there is a reluctance to talk about style beyond formatting. The bottom line: I have yet to find any F# authority stating in writing that, say, all function names should be verbs. (On page 109, Isaac Abraham comes very close to _suggesting_ this but not explicitly _stating_ it.)


## what happens when `results` is `empty` or `null`?

Let us see what happens when `results` is empty:

In [None]:
#!fsharp

[]
|> List.filter(fun result -> win result = 1)
|> List.countBy(fun result -> result.AwayTeam)
|> List.sortByDescending(fun (_, sum) -> sum)
|> List.map(fun (awayTeam, sum) -> $"{awayTeam}: {sum} wins")

We can look at [the source code](https://github.com/dotnet/fsharp/blob/main/src/fsharp/FSharp.Core/list.fs) for the F# `List` module and see that it is handling empty lists for us.
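
The same behavior can be checked without reading the source. In this sketch the pipeline is wrapped in a function, `awayWinTable` (a hypothetical name), and given an empty list; the type and `isAwayWin` are repeated so the block runs on its own.

```fsharp
type FootballResult = { HomeTeam : string; AwayTeam : string; HomeGoals : int; AwayGoals : int }

let isAwayWin result = result.AwayGoals > result.HomeGoals

// Hypothetical wrapper around the pipeline above.
let awayWinTable (results : FootballResult list) =
    results
    |> List.filter isAwayWin
    |> List.countBy (fun result -> result.AwayTeam)
    |> List.sortByDescending snd

// Each stage maps an empty list to an empty list, so nothing is thrown.
let emptyTable = awayWinTable []

printfn "%b" (List.isEmpty emptyTable)   // true
```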

Getting `null` into F# is a bit more complicated because F# recognizes `null` as an ‘outsider’ to the F# world. Microsoft [documentation](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/values/null-values) states:

> In a library intended only for F#, you do not have to check for null values in every function. If you are writing a library for interoperation with other .NET languages, you might have to add checks for null input parameters and throw an `ArgumentNullException`, just as you do in C# or Visual Basic code.

We can see Microsoft following their own advice when I try to pass a null .NET `List<T>` (`nullList`) into our pipeline:

In [None]:
#!fsharp

let nullList: System.Collections.Generic.List<FootballResult> = null

List.ofSeq(nullList)
|> List.filter(fun result -> win result = 1)
|> List.countBy(fun result -> result.AwayTeam)
|> List.sortByDescending(fun (_, sum) -> sum)
|> List.map(fun (awayTeam, sum) -> $"{awayTeam}: {sum} wins")

Error: System.ArgumentNullException: Value cannot be null. (Parameter 'source')
   at Microsoft.FSharp.Collections.SeqModule.ToList[T](IEnumerable`1 source)
   at <StartupCode$FSI_0063>.$FSI_0063.main@()

On page 182, Isaac mentions that he does not use `Seq` that much. We can see above why: `List.ofSeq` is a way to convert `System.Collections.Generic.List<T>` into a genuine, immutable F# list. `Seq` can be thought of as only existing for such a conversion.
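
If a `null` collection can arrive from other .NET code, one option is to turn it into an `option` at the boundary before converting. This is only a sketch of that idea; it uses `int` elements to stay self-contained, and `nullList` and `safeList` are hypothetical names.

```fsharp
// A null reference of a .NET collection type, as might arrive from C# code.
let nullList : System.Collections.Generic.List<int> = null

let safeList =
    nullList
    |> Option.ofObj                                    // null becomes None
    |> Option.map (fun items -> List.ofSeq items)      // convert only when a value is present
    |> Option.defaultValue []                          // fall back to an empty F# list

printfn "%A" safeList   // []
```

Whether an empty list is the right fallback depends on the caller; throwing, as `List.ofSeq` does, may be the better signal in a library.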

## collection operators

Isaac Abraham calls out the following operators [📖 [docs](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/symbol-and-operator-reference/#symbols-used-in-tuple-list-array-unit-expressions-and-patterns)]:

- `..` generates a list (range operator)
- `::` prepends to a list
- `@` concatenates two lists

In [None]:
#!fsharp

let three = [ 1; 2; 3 ]

let four = 0 :: three

let numbers = four @ [ 4 .. 10 ]

printf "%A" numbers

[0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

## F# arrays

The list `three` bound above can be made into an array by adding `|` inside the square brackets:

In [None]:
#!fsharp

let threeA = [| 1; 2; 3 |]

// Array.append returns the first array followed by the second
let fourA = Array.append [| 0 |] threeA

let numbersA = Array.append fourA [| 4 .. 10 |]

printf "%A" numbersA

[|0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10|]

F# arrays [📖 [docs](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/arrays)] do not have the operators used on lists so things above get quite complicated. Based on commentary from Isaac Abraham, F# arrays seem to be in the language for high-performance interoperability:

>They’re high performance but ultimately mutable…
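
The "ultimately mutable" part can be sketched by contrasting in-place assignment on an array with building a new list. The names `scores`, `fixedScores` and `updated` are hypothetical.

```fsharp
let scores = [| 1; 2; 3 |]

// Arrays allow in-place assignment with <-.
scores.[0] <- 99
printfn "%A" scores        // [|99; 2; 3|]

let fixedScores = [ 1; 2; 3 ]

// fixedScores.[0] <- 99   // would not compile: list elements cannot be assigned

// With a list, a "change" means building a new list; the original is untouched.
let updated = 99 :: List.tail fixedScores
printfn "%A" updated       // [99; 2; 3]
printfn "%A" fixedScores   // [1; 2; 3]
```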


@[BryanWilhite](https://twitter.com/BryanWilhite)


In [None]:
#!about

.NET Interactive
© 2020 Microsoft Corporation
Version: 1.0.246201+da749355d416da20e634e5c80073b92356b57e0e
Build date: 2021-09-12T07:21:44.0000000Z
https://github.com/dotnet/interactive
