> Enterprise Web C#

# Chapter 3 - Big O(h) Oh collections

## Introduction

LINQ stands for **L**anguage **IN**tegrated **Q**uery. It's a query language integrated in C# providing an SQL-like interface for a variety of **data sources**. Because it's completely integrated, it features IntelliSense and compile-time checking, amongst other features provided by the compiler.

### Data source

LINQ is defined for a number of data sources, but what is a data source? A data source is the data that is being queried and which has a LINQ provider. There are a number of LINQ providers:

- LINQ to Objects
  - querying in-memory data sources like strings, arrays, collections...
  - mainly the focus in this chapter
- LINQ to SQL
  - querying data in a relational database from C#
- LINQ to Entities
  - similar to LINQ to SQL but using [Entity Framework](https://docs.microsoft.com/en-us/ef/)
  - will be covered in a later chapter

Once you master the LINQ syntax, you can query any data source which provides a LINQ interface:
- LINQ to
  - Google, Twitter, eBay, Amazon, Flickr...
  - XML, JSON...
  - MySQL, Oracle...
  - Excel, Word...
  - JavaScript...
  - ...

<br />

## LINQ syntax

There are two types of LINQ syntax: **query syntax** or **method syntax**.

### Query syntax

```cs
var query = from c in customerList
            where c.CustomerId == customerId
            select c;
```

The query syntax is a declarative, built-in syntax. The compiler translates the query syntax to method syntax at compile time. Fewer LINQ operators are available when using this syntax.

### Method syntax

```cs
var query = customerList.Where(c => c.CustomerId == customerId);
```

The method syntax uses methods to query a data source. These methods are part of .NET: they live in the `System.Linq` namespace (in the `System.Core` assembly on .NET Framework, in the `System.Linq` assembly on .NET Core and later).
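To make the comparison concrete, here is a minimal sketch that runs the same query in both syntaxes. The `numbers` array is made-up sample data. The last statement illustrates one operator (`Count`) that has no query keyword and therefore has to be called as a method.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 5, 1, 18, 11, 3 }; // made-up sample data

// Query syntax
IEnumerable<int> viaQuery = from n in numbers
                            where n > 4
                            orderby n
                            select n;

// Method syntax: the same query expressed with method calls
IEnumerable<int> viaMethods = numbers.Where(n => n > 4).OrderBy(n => n);

Console.WriteLine(string.Join(", ", viaQuery));   // 5, 11, 18
Console.WriteLine(string.Join(", ", viaMethods)); // 5, 11, 18

// Count has no query keyword, so it is appended as a method call
int howMany = (from n in numbers where n > 4 select n).Count();
Console.WriteLine(howMany); // 3
```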

<br />

## LINQ to Objects

### Extension methods

Extension methods are methods added to an existing class to extend its functionality. This functionality can be added without creating a subclass, modifying the original class or even recompiling it. When called, they look exactly like instance methods (methods declared in the class itself).

An extension method is declared `static` within a non-generic static class. The first parameter must be preceded by the keyword `this`; its type is the type that is being extended. All other parameters must be given when calling the method.

```cs
public static class IntExtension
{
  public static bool IsEven(this int i)
  {
    return i % 2 == 0;
  }
}
```

This extension method has no parameters and can be called on any int: `10.IsEven()`.

```cs
public static class IntExtension
{
  public static int Add(this int i, int number)
  {
    return i + number;
  }
}
```

This extension method has one parameter and can be called on any int: `10.Add(2)`.

An extension method has no access to the private members of the class it extends. An extension method cannot override an instance method (same name and arguments): the compiler will always choose the instance method.

Extension methods are best declared in another namespace named `Extensions`. This way you must explicitly write `using Extensions;` in order to use the extensions.

This feature cannot be tested in .NET Interactive. For demos and exercises, please use Visual Studio. An example can be found [here](https://github.com/HOGENT-Web/csharp-ch-4-example-1).

<br />

### Querying in-memory data sources

LINQ methods are extension methods defined on `IEnumerable<T>`. These methods can be used on any type that implements `IEnumerable<T>`, e.g. arrays or generic collections (`List<T>`, `Queue<T>`, `Stack<T>`...). The LINQ extensions belong to the `System.Linq` namespace, which should be imported when using LINQ: `using System.Linq;`.

`Enumerable` is a static class which contains all LINQ extension methods (`this` is of type `IEnumerable<T>`). The next sections discuss the most important methods; the others can be found in the [documentation](https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable?view=net-5.0#methods).

#### Sum

Returns the sum of a given collection of numbers.

```cs
using System.Linq;

int[] numbers = new int[] { 2, 8, 10 };
Console.WriteLine($"The sum of the array is {numbers.Sum()}");

List<int> numbersList = new List<int> { 2, 8, 10 };
Console.WriteLine($"The sum of the list is {numbersList.Sum()}");

HashSet<double> numbersSet = new HashSet<double> { 2.5, 8.4, 10.6 };
Console.WriteLine($"The sum of the set is {numbersSet.Sum()}");

Stack<float> numbersStack = new Stack<float>();
numbersStack.Push(1.5F);
numbersStack.Push(2.6F);
Console.WriteLine($"The sum of the stack is {numbersStack.Sum()}");
```

#### Average

Returns the average of a given collection of numbers.

```cs
List<int> numbersList = new List<int> { 2, 8, 10 };
Console.WriteLine($"The average of the list is {numbersList.Average():0.00}");
```

#### Count

Returns the number of items in a collection.

```cs
List<int> numbersList = new List<int> { 2, 8, 10 };
Console.WriteLine($"The number of items in the list is {numbersList.Count()}");
```

#### Min/Max

Returns the minimum/maximum of a given list.

```cs
List<int> numbersList = new List<int> { 2, 8, 10 };
Console.WriteLine($"The minimum of the list is {numbersList.Min()}");
Console.WriteLine($"The maximum of the list is {numbersList.Max()}");
```

<br />

## Lambda expressions

Lambda expressions are **anonymous**, **inline** functions: a short form for writing a function. They use the lambda operator: `=>`. Lambdas are used frequently with LINQ, where they almost always return a value.

You can store a lambda that returns a value in a variable of type `Func<Param, ReturnType>`, where `Param` is the type of the parameter and `ReturnType` is the return type. A `Func` accepts from zero up to 16 parameters, as seen in the [documentation](https://docs.microsoft.com/en-us/dotnet/api/system.func-17?view=net-5.0). A lambda that returns nothing (`void`) cannot be stored in a `Func`; it needs an `Action` instead. The `Func` type can also be used to store normal functions in a variable.

Converting a function to a lambda is pretty easy:

```cs
int NrOfChars(string s)
{
  return s.Length;
}

string hello = "Hello world!";
Console.WriteLine($"{hello} has {NrOfChars(hello)} characters");

// Simply store the function
Func<string, int> myFunction = NrOfChars;

Console.WriteLine($"{hello} has {myFunction(hello)} characters");

// Or as a lambda
Func<string, int> myLambda = (s) => s.Length;

Console.WriteLine($"{hello} has {myLambda(hello)} characters");
```

As you can see, the lambda is defined on-the-fly and has no name: it's an anonymous function. This lambda is equivalent to the original function `NrOfChars`.

An example of a lambda with more than one parameter:

```cs
// The lambda parameters get their own names so they don't clash with the variables below
Func<int, int, int, int> Sum = (a, b, c) => a + b + c;

int x = 1, y = 2, z = 3;
Console.WriteLine($"{x} + {y} + {z} = {Sum(x, y, z)}");
```

Or with no parameters:

```cs
Func<string> HelloWorld = () => "Hello world!";

Console.WriteLine(HelloWorld());
```

Lambdas are frequently used as a parameter of LINQ methods. They usually tell the method which information to work with.

For example, let's calculate the sum of the distances to some locations. We tell the `Sum` function which property to use when summing a list of objects, as objects cannot be added (without operator overloading).

Behind the scenes the LINQ method will use a `foreach` to loop over all locations. For each location it'll call the given lambda and add the resulting number to the sum. In the end the total will be returned.

```cs
class Location
{
  public string Country { get; set; }
  public string City { get; set; }
  public int Distance { get; set; }

  // The lambda operator also works in classes (expression-bodied member)
  public override string ToString() => $"{City} in {Country} at {Distance} miles distance";
}

List<Location> locations = new List<Location> {
  new Location { City = "London", Distance = 4789, Country = "UK" },
  new Location { City = "Amsterdam", Distance = 4869, Country = "NL" },
  new Location { City = "San Francisco", Distance = 684, Country = "USA" },
  new Location { City = "Las Vegas", Distance = 872, Country = "USA" },
  new Location { City = "Boston", Distance = 2488, Country = "USA" },
  new Location { City = "Raleigh", Distance = 2363, Country = "USA" },
  new Location { City = "Chicago", Distance = 1733, Country = "USA" },
  new Location { City = "Charleston", Distance = 2421, Country = "USA" },
  new Location { City = "Helsinki", Distance = 4771, Country = "Finland" },
  new Location { City = "Nice", Distance = 5428, Country = "France" },
  new Location { City = "Dublin", Distance = 4527, Country = "Ireland" }
};

Console.WriteLine($"The sum of all distances is {locations.Sum(loc => loc.Distance)}");
```
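The `foreach` loop described above could look roughly like the following sketch. `MySum` is a hypothetical helper written for illustration, not the actual `Sum` implementation (which, among other things, also validates its arguments). The `words` array is made-up sample data.

```cs
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT the real Sum implementation
int MySum<T>(IEnumerable<T> source, Func<T, int> selector)
{
  int total = 0;
  foreach (T item in source)
    total += selector(item); // the lambda picks the number to add
  return total;
}

string[] words = { "LINQ", "is", "fun" }; // made-up sample data
Console.WriteLine(MySum(words, w => w.Length)); // 4 + 2 + 3 = 9
```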

With lambdas you can do more complex things too, like counting the number of items that satisfy a given condition.

```cs
int[] numbers = new int[] { 5, 1, 18, 11, 3, 6, 19, 17, 4, 10 };

Console.WriteLine($"There are {numbers.Count(n => n > 8)} numbers bigger than 8");
```

<br />

## Deferred execution

Not all LINQ methods are executed immediately: some delay their work for as long as possible. Methods which use **deferred execution** only run when the resulting collection is iterated. As a rule, all methods which return an `IEnumerable<T>` (or an `IOrderedEnumerable<T>`) use deferred execution; methods which return anything else use **immediate execution**.

Examples of methods which use immediate execution are `Sum`, `Count`, `Average`, `ToList` and `ToArray`. Examples of methods which use deferred execution are `Where`, `Select` and `OrderBy`. These are only a few examples; many more exist.
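The difference can be made visible by counting how often a lambda is called. The sketch below uses a made-up `numbers` array and a counter variable (`lambdaCalls`) that exists only for this illustration.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 1, 2, 3, 4 }; // made-up sample data
int lambdaCalls = 0;            // counts how often the predicate runs

// Deferred: building the query does not run the lambda yet
IEnumerable<int> evenNumbers = numbers.Where(n =>
{
  lambdaCalls++;
  return n % 2 == 0;
});
Console.WriteLine($"Lambda calls after Where: {lambdaCalls}");  // 0

// Immediate: ToList iterates the query right away
List<int> evenList = evenNumbers.ToList();
Console.WriteLine($"Lambda calls after ToList: {lambdaCalls}"); // 4

// Immediate: Count iterates the query once more
int count = evenNumbers.Count();
Console.WriteLine($"Lambda calls after Count: {lambdaCalls}");  // 8
```

If a query is expensive and its result is needed more than once, storing it with `ToList` or `ToArray` avoids running it again.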

## Filtering

`Where` can be used to filter a collection based on a **predicate** (= condition). The predicate is given by the lambda parameter; the result is an `IEnumerable<T>` containing all elements satisfying the given predicate.

The predicate can be any boolean expression: just write a lambda which returns a boolean and `Where` is happy.

How does `Where` work? When you loop over the resulting `IEnumerable<T>`, the original collection is looped over. For each element the given lambda is executed and, if the result is `true`, the element is returned. The resulting `IEnumerable<T>` thus contains all matching elements.
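A minimal sketch of this behaviour, assuming an iterator written with `yield return`. `MyWhere` is a hypothetical stand-in written for illustration, not the real `Where` implementation, and the `names` array is made-up sample data.

```cs
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Where, NOT the real implementation
IEnumerable<T> MyWhere<T>(IEnumerable<T> source, Func<T, bool> predicate)
{
  foreach (T item in source)
  {
    if (predicate(item))
      yield return item; // hand out matching items one by one
  }
}

string[] names = { "Nice", "London", "Dublin" }; // made-up sample data
IEnumerable<string> longNames = MyWhere(names, n => n.Length > 5);

// The loop inside MyWhere only starts running here
foreach (string name in longNames)
  Console.WriteLine(name); // London, then Dublin
```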

The next example keeps only the cities whose names contain more than 5 characters. Try to guess the output when the collection is changed before looping over it.

```cs
string[] cities = {
  "London", "Amsterdam", "San Francisco", "Las Vegas", "Boston", "Raleigh", "Chicago",
  "Charlestown", "Helsinki", "Nice", "Dublin"
};

IEnumerable<string> citiesWithlongNames = cities.Where(c => c.Length > 5);

Console.WriteLine("Cities with more than 5 characters:");
foreach (var city in citiesWithlongNames)
  Console.WriteLine(city);
```


```cs
// Let's do this again
citiesWithlongNames = cities.Where(c => c.Length > 5);

// But change the first city before looping over the IEnumerable
// Guess the output
cities[0] = "Oostende";

Console.WriteLine("Cities with more than 5 characters:");
foreach (var city in citiesWithlongNames)
  Console.WriteLine(city);

// Did the output match your expectations?
```

```cs
// Let's change the first one another time
cities[0] = "Brussel";

Console.WriteLine("Cities with more than 5 characters:");
foreach (var city in citiesWithlongNames)
  Console.WriteLine(city);

// Did the output match your expectations?
```

```cs
// Let's add another condition
IEnumerable<string> citiesFiltered = cities.Where(c => c.Length > 5 && c.Contains("a"));

foreach (var city in citiesFiltered)
  Console.WriteLine(city);
```

As you can see, the query is executed again every time the collection is looped over. That's deferred execution.

**Caution!** When the source collection is `null`, an `ArgumentNullException` is thrown as soon as `Where` is called, even before the result is looped over.

```cs
string[] cities = null;

IEnumerable<string> citiesWithlongNames = cities.Where(c => c.Length > 5);

foreach (string city in citiesWithlongNames)
  Console.WriteLine(city);
```
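One possible way to guard against a `null` source is to fall back to an empty sequence before querying. This is only a sketch of one option (the variable name `safeCities` is made up for illustration); whether silently treating `null` as "no items" is appropriate depends on the application.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

string[] cities = null; // e.g. a lookup that found nothing

// Fall back to an empty sequence when the source is null
IEnumerable<string> safeCities = cities ?? Enumerable.Empty<string>();
IEnumerable<string> citiesWithLongNames = safeCities.Where(c => c.Length > 5);

Console.WriteLine($"Found {citiesWithLongNames.Count()} cities"); // Found 0 cities
```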

## Ordering

`OrderBy` and `OrderByDescending` can be used to sort a collection in ascending/descending order. To sort by multiple keys, chain the `ThenBy` and `ThenByDescending` methods. To reverse the result, use the `Reverse` method.

```cs
string[] cities = {
  "London", "Amsterdam", "San Francisco", "Las Vegas", "Boston", "Raleigh", "Chicago",
  "Charlestown", "Helsinki", "Nice", "Dublin"
};

// c => c means to sort by the string itself
IEnumerable<string> orderedPlaces = cities.OrderBy(c => c);

Console.WriteLine("Sorted city names:");
foreach (var city in orderedPlaces)
  Console.WriteLine(city);
```

```cs
orderedPlaces = cities.OrderByDescending(c => c.Length).ThenBy(c => c);

Console.WriteLine("Sorted city names (by length and name):");
foreach (var city in orderedPlaces)
  Console.WriteLine(city);
```
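`ThenByDescending` and `Reverse` work along the same lines. A short sketch with a smaller, made-up list of cities:

```cs
using System;
using System.Collections.Generic;
using System.Linq;

string[] cities = { "Nice", "Boston", "Dublin", "London" }; // made-up sample data

// Shortest names first; names of equal length from Z to A
IEnumerable<string> ordered = cities
  .OrderBy(c => c.Length)
  .ThenByDescending(c => c);
Console.WriteLine(string.Join(", ", ordered));  // Nice, London, Dublin, Boston

// Reverse flips whatever order the sequence currently has
IEnumerable<string> reversed = ordered.Reverse();
Console.WriteLine(string.Join(", ", reversed)); // Boston, Dublin, London, Nice
```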

## Chaining extension methods

As you can see in the example above, LINQ methods can be chained one after the other.

```cs
string[] cities = {
  "London", "Amsterdam", "San Francisco", "Las Vegas", "Boston", "Raleigh", "Chicago",
  "Charlestown", "Helsinki", "Nice", "Dublin"
};

IEnumerable<string> orderedPlaces = cities.Where(c => c.Length > 5).OrderBy(c => c);

Console.WriteLine("Sorted list of city names longer than 5 characters");
foreach (var city in orderedPlaces)
  Console.WriteLine(city);
```


```cs
string[] cities = {
  "London", "Amsterdam", "San Francisco", "Las Vegas", "Boston", "Raleigh", "Chicago",
  "Charlestown", "Helsinki", "Nice", "Dublin", "San Anselmo", "San Diego", "San Mateo"
};

IEnumerable<string> selectedCities = cities
  .Where(c => c.StartsWith("S") && c.Length > 5)
  .OrderByDescending(c => c.Length)
  .ThenBy(c => c);

Console.WriteLine("Cities starting with an S, longer than 5 characters and sorted by length and name");
foreach (var city in selectedCities)
  Console.WriteLine(city);
```

## Map collections into other collections

The `Select` method lets you transform each element of a collection into another type. This type can be equal to the original one, an existing type or even an anonymous type (no type defined).

How does `Select` work? When the result is looped over, `Select` loops through the source collection using a `foreach`. For every element, the lambda is called and its result becomes the next element of the resulting `IEnumerable<T>`.
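A rough sketch of such a loop, assuming an iterator written with `yield return` (which also makes it use deferred execution). `MySelect` is a hypothetical helper written for illustration, not the real `Select` implementation, and the `numbers` array is made-up sample data.

```cs
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Select, NOT the real implementation
IEnumerable<TResult> MySelect<TSource, TResult>(
  IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
  foreach (TSource item in source)
    yield return selector(item); // the lambda converts each element
}

int[] numbers = { 1, 2, 3 }; // made-up sample data
IEnumerable<string> labels = MySelect(numbers, n => $"#{n}");

Console.WriteLine(string.Join(" ", labels)); // #1 #2 #3
```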

Some examples:

```cs
int[] numbers = new int[] { 5, 1, 18, 11, 3, 6, 19, 17, 4, 10 };

// Increment all numbers by one
IEnumerable<int> newNumbers = numbers.Select(n => n + 1);

Console.WriteLine("List of numbers mapped:");
foreach (int i in newNumbers)
  Console.WriteLine(i);
```


```cs
string[] cities = {
  "London", "Amsterdam", "San Francisco", "Las Vegas", "Boston", "Raleigh", "Chicago",
  "Charlestown", "Helsinki", "Nice", "Dublin"
};

// Converting each city to its length
IEnumerable<int> lengths = cities.Select(c => c.Length);

Console.WriteLine("Lengths of cities:");
foreach (int i in lengths)
  Console.WriteLine(i);
```

```cs
// Convert the previously defined list of locations to strings
IEnumerable<string> places = locations.Select(l => l.City);

Console.WriteLine("Cities of locations:");
foreach (string i in places)
  Console.WriteLine(i);
```

Let's now convert our `Location` objects to a new type called `CityDistance`.

```cs
class CityDistance
{
  public string Country { get; set; }
  public string Name { get; set; }
  public int DistanceInKm { get; set; }

  public override string ToString()
  {
    return $"{Name} in {Country} at {DistanceInKm} km distance";
  }
}

IEnumerable<CityDistance> cityDistances = locations.Select(
  l => new CityDistance
  {
    Name = l.City,
    Country = l.Country,
    DistanceInKm = (int)(l.Distance * 1.61)
  }
);

foreach (var cd in cityDistances)
  Console.WriteLine(cd);
```

Sometimes it's easier to let the compiler determine the type of a LINQ query, so try to use `var` instead of the explicit type. The compiler infers the type of the variable and IntelliSense still works as expected.

```cs
var cities = cityDistances.Select(c => c.Name); // IEnumerable<string>

foreach (var c in cities)
  Console.WriteLine(c);
```

### Anonymous types

An anonymous type is a type without a class definition: you create it on-the-fly (like a lambda). To store an instance of an anonymous type in a variable, you must use `var`. The type is determined by the properties listed; these properties are **read-only**.

These types are commonly used when transforming collections to objects which contain a subset of the properties of the original elements. There is no need to define a class for each of these subtypes because you only need them once.

```cs
// No class definition, just the properties
var homeTown = new
{
  Name = "Oostende",
  NrOfInhabitants = 60000
};

Console.WriteLine(homeTown);
```

```cs
// You could also create a type like this, without a class CityDistance
var cityDistances = locations.Select(
  l => new
  {
    Name = l.City,
    Country = l.Country,
    DistanceInKm = (int)(l.Distance * 1.61)
  }
);

foreach (var cd in cityDistances)
  Console.WriteLine(cd);
```
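The properties of an anonymous type can be read like any other property, for example in a later `Where` or `Select`. A small sketch follows; the `towns` array and its numbers are made up for illustration only.

```cs
using System;
using System.Linq;

// Made-up sample data; the numbers are illustrative only
var towns = new[]
{
  new { Name = "Oostende", NrOfInhabitants = 60000 },
  new { Name = "Nice", NrOfInhabitants = 340000 }
};

// Properties of an anonymous type are read like any other property
var bigTowns = towns
  .Where(t => t.NrOfInhabitants > 100000)
  .Select(t => t.Name);
Console.WriteLine(string.Join(", ", bigTowns)); // Nice

// towns[0].Name = "Gent"; // would not compile: the properties are read-only
```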