## Collections 
![Collections Hierarchy](./images/collections.png)

## Iterable
The `Iterable` interface allows an object to be used as the target of a *for-each* loop. It is defined as:

In [None]:
public interface Iterable<T> {
    Iterator<T> iterator();

    // ...
}

In addition, there are two default methods, `forEach` and `spliterator`. The focus here, however, is the `iterator` method, which returns an `Iterator`: an interface that provides a uniform way of accessing the elements of a collection:

In [None]:
public interface Iterator<E> {
    boolean hasNext();
    E next();
    default void remove() {
        throw new UnsupportedOperationException("remove");
    }
}

The first two methods drive the iteration, whereas `remove` allows the collection to be structurally modified while iterating. Note that this interface does not expose a method to add elements during iteration. Subinterfaces of `Iterator` such as `ListIterator` do provide an `add` method. Example:

In [None]:
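class NRandoms implements Iterable<Integer> {
    private final int count;
    private final Random random = new Random();  // shared by all iterators of this instance

    public NRandoms(int n) {
        this.count = n;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int c;  // number of values handed out so far

            @Override
            public boolean hasNext() {
                return c < count;
            }

            @Override
            public Integer next() {
                // The Iterator contract requires this once the iteration is exhausted
                if (!hasNext())
                    throw new NoSuchElementException();
                c++;
                return random.nextInt();
            }
        };
    }
}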

for(Integer i : new NRandoms(5))
    System.out.println(i);

When a collection must be structurally changed (elements added or removed) while iterating over it, the iterator has to be used directly. Calling a method such as the collection's own `remove` inside a *for-each* loop leads to a `ConcurrentModificationException`:

In [None]:
List<String> cities = new ArrayList<>(List.of("Beijing","Shenzhen","Hangzhou","Shanghai","Guangzhou"));
for(String city : cities) {
    if(city.startsWith("S"))
        cities.remove(city);  // the loop's next step throws ConcurrentModificationException
}

This exception is thrown when the iterator detects that the collection was structurally modified by something other than the iterator's own methods. The motivation for this fail-fast behaviour is to flag that another thread might be changing the collection while this thread is iterating over it, although, as the loop above shows, a single thread can trigger it too.
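
The failing loop above can be rewritten so that the removal goes through the iterator. The following is a minimal sketch; the class and method names (`SafeRemoval`, `removeWithIterator`, `removeWithRemoveIf`) are made up for illustration. It also shows `Collection.removeIf`, which achieves the same result in one call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SafeRemoval {
    // Removes matching elements through the iterator, so the iterator
    // stays in sync with the list and no exception is raised.
    static List<String> removeWithIterator(List<String> source, String prefix) {
        List<String> cities = new ArrayList<>(source);
        Iterator<String> it = cities.iterator();
        while (it.hasNext()) {
            if (it.next().startsWith(prefix)) {
                it.remove();  // removes the element last returned by next()
            }
        }
        return cities;
    }

    // Same result using Collection.removeIf (Java 8+).
    static List<String> removeWithRemoveIf(List<String> source, String prefix) {
        List<String> cities = new ArrayList<>(source);
        cities.removeIf(city -> city.startsWith(prefix));
        return cities;
    }

    public static void main(String[] args) {
        List<String> cities = List.of("Beijing", "Shenzhen", "Hangzhou", "Shanghai", "Guangzhou");
        System.out.println(removeWithIterator(cities, "S"));  // [Beijing, Hangzhou, Guangzhou]
        System.out.println(removeWithRemoveIf(cities, "S"));  // [Beijing, Hangzhou, Guangzhou]
    }
}
```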

## Collection
The `Collection` interface contains the general-purpose methods that can be expected from any container of elements. To **add elements** to the collection, the following methods are available:

In [None]:
boolean add(E e);
boolean addAll(Collection<? extends E> c);

Depending on the implementation, `add` can throw `UnsupportedOperationException`, or `NullPointerException` when a null element is added. The return value indicates whether the contents of the collection changed. For example, adding a duplicate element to a `Set` returns false.
The `addAll` method copies the contents of one collection into another. Changing the original collection afterwards does not affect the collection its elements were copied into.
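
The three behaviours described above can be observed directly. This is only a sketch with made-up names (`AddDemo` and its methods), using `HashSet`, `ArrayList` and the unmodifiable list returned by `List.of` as sample implementations. Note that `addAll` copies references, so the independence applies to the structure of the two collections, not to the state of the shared element objects.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AddDemo {
    // Adds the same element twice to a fresh HashSet and reports both return values.
    static boolean[] addTwice(String element) {
        Set<String> set = new HashSet<>();
        boolean first = set.add(element);   // true: the set changed
        boolean second = set.add(element);  // false: duplicate, set unchanged
        return new boolean[] {first, second};
    }

    // Copies source into a new list, then mutates source; the copy is unaffected.
    static List<String> copyThenMutateSource(List<String> source) {
        List<String> copy = new ArrayList<>();
        copy.addAll(source);
        source.add("added later");
        return copy;
    }

    // Lists created by List.of are unmodifiable, so add is rejected.
    static boolean immutableRejectsAdd() {
        try {
            List.of("a").add("b");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        boolean[] results = addTwice("x");
        System.out.println(results[0] + " " + results[1]);  // true false
        System.out.println(copyThenMutateSource(new ArrayList<>(List.of("a", "b"))));  // [a, b]
        System.out.println(immutableRejectsAdd());  // true
    }
}
```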

Collection also provides facility to **remove elements** from a collection:

In [None]:
void clear();
boolean remove(Object o);
boolean removeAll(Collection<?> c);
boolean retainAll(Collection<?> c);

The remove methods use `equals` for comparison and, with the exception of `clear` (which returns nothing), return true if the collection changed as a result of the call. As with `add`, `UnsupportedOperationException` or `NullPointerException` are possible with these operations.
One notable difference from the `add` methods is that these accept `Object` instead of the generic type.

To **query contents**, the following methods are available:

In [None]:
boolean isEmpty();
int size();                           // Maximum value it can return is Integer.MAX_VALUE

boolean contains(Object o);           // Again note usage of Object instead of E
boolean containsAll(Collection<?> c);

Because `size` returns an `int`, standard implementations such as `ArrayList` are limited to `Integer.MAX_VALUE` elements. Custom implementations such as *FastUtil* overcome this by deprecating `size` and providing an alternative method that returns the size. `contains`, like `remove`, uses `equals`.

`Collection` exposes a few methods to convert a **collection to an array**:

In [None]:
Object[] toArray();
<T> T[] toArray(T[] t);

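The second variant copies the elements into the supplied array (overwriting its contents) if the array is large enough, and otherwise allocates and returns a new array of the same runtime type. If the supplied array has room to spare, the slot immediately after the last copied element is set to `null`: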

In [1]:
List<Integer> l1 = List.of(1,2,3,4);
Integer[] i1 = new Integer[]{-1,-2,-3,-4,-5};
l1.toArray(i1);

System.out.print(Arrays.toString(i1));

[1, 2, 3, 4, null]

In [3]:
Integer[] i2 = new Integer[]{-1,-2,-3};
l1.toArray(i2);  // i2 is too small: a new array is returned (and discarded here), i2 is untouched

System.out.print(Arrays.toString(i2));

[-1, -2, -3]

In [None]:
l1.toArray(new Integer[0]);
l1.toArray(Integer[]::new);  // same as previous

One peculiarity of the method signature is that the type `T` is completely unrelated to the type `E` of the collection's elements. This shifts potential errors to runtime:

In [4]:
List.of(1,2,3).toArray(String[]::new); // compiles fine

EvalException: arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object[] to the type of the destination array, java.lang.String

One way to fix this would be to declare `toArray` as `<T super E> T[] toArray(T[] a)`. But unlike `extends`, Java doesn't allow `super` as a bound on a type parameter. An alternative is to define a static method like:

In [None]:
<A, B extends A> A[] toArray(Collection<B> c, A[] a) {
    return c.toArray(a);
}

Since a type parameter can only stand for a reference type, primitive arrays cannot be passed to `toArray`:

In [None]:
int[] nums = List.of(1,2,3).toArray(new int[0]);  // compile error
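
One common workaround, sketched below, is to go through a stream and unbox each element explicitly. The class and method names (`ToPrimitiveArray`, `toIntArray`) are placeholders for illustration.

```java
import java.util.Arrays;
import java.util.List;

class ToPrimitiveArray {
    // Unboxes every Integer and collects the values into an int[].
    // A null element in the list would cause a NullPointerException here.
    static int[] toIntArray(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] nums = toIntArray(List.of(1, 2, 3));
        System.out.println(Arrays.toString(nums));  // [1, 2, 3]
    }
}
```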

## SequencedCollection
![Sequenced Collection](./images/sequenced_collections.png)  
Introduced in Java 21, `SequencedCollection` is the common interface for otherwise unrelated collection types that have a well-defined order. Thus `List`, `NavigableSet` and `Deque` have this interface as a common ancestor. The interface introduces the following methods:

In [None]:
SequencedCollection<E> reversed();
// Unlike add, the two methods below do not return boolean. Why? Because they are modeled on
// Deque, where addFirst and addLast already existed and always succeed.
// For LinkedHashSet, what would a boolean true represent? That the element was added, OR that
// it already existed but was moved to the end? It is somewhat ambiguous.
default void addFirst(E e) {
    throw new UnsupportedOperationException();
}
default void addLast(E e) {
    throw new UnsupportedOperationException();
}
// The two methods below throw NoSuchElementException if the collection is empty
default E getFirst() {
    return this.iterator().next();
}
default E getLast() {
    return this.reversed().iterator().next();
}
// also removeFirst and removeLast

Not all the methods defined here make sense for all subtypes. For example, `NavigableSet` does not support `addFirst` and `addLast`: both throw `UnsupportedOperationException`, since the position of an element in this set is determined by its sort order.

`reversed` provides a reverse-ordered view of the collection. Any modification made through the view is written through to the underlying collection; whether changes made to the underlying collection show up in the view depends on the implementation.

In [5]:
NavigableSet<String> cities = new TreeSet<>();
cities.add("Tokyo"); cities.add("Osaka"); cities.add("Hiroshima");

NavigableSet<String> citiesReversed = cities.reversed();
citiesReversed.removeFirst();

System.out.print(cities);

[Hiroshima, Osaka]

## Set
A collection that cannot contain duplicates. It has the same set of methods as `Collection`, but they are redeclared to emphasise `Set`-specific behaviour. It additionally specifies the contract of `equals` and `hashCode`: two sets are equal if they have the same size and contain the same elements. A set decides whether two elements are duplicates based on an *equivalence relation*, which could be:
- `equals` like in `HashSet`
- identity like in `EnumSet`
- Using custom `Comparator` or `Comparable` like in `NavigableSet`

Having three different mechanisms means that a set can contain two elements that `equals` considers equal but `compareTo` does not:

In [None]:
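class Color implements Comparable<Color> {
    public String name;
    public int r, g, b;
    public Color(String name, int r, int g, int b) {
        this.name = name;
        this.r = r;
        this.g = g;
        this.b = b;
    }
    // Equality looks only at the name...
    @Override
    public boolean equals(Object o) {
        if (o instanceof Color othr) {
            return this.name.equals(othr.name);
        }
        return false;
    }
    @Override
    public int hashCode() {
        return name.hashCode();  // kept consistent with equals
    }
    // ...whereas ordering looks only at the colour components
    @Override
    public int compareTo(Color othr) {
        return (othr.r + othr.g + othr.b) - (this.r + this.g + this.b);
    }
}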

Color c1 = new Color("red", 255,0,0);
Color c2 = new Color("red", 254,0,0);
c1.equals(c2); // true

TreeSet<Color> colors = new TreeSet<>();
colors.add(c1); // true, added to set
colors.add(c2); // true, also added: TreeSet asks compareTo, which does not return 0
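
To make the contrast concrete, the sketch below adds the same two objects to a `HashSet` and to a `TreeSet`. It uses a small stand-in type (`Shade`, a hypothetical name) rather than `Color`, so that it can run on its own: `equals` and `hashCode` look only at the name, while `compareTo` looks only at a brightness value.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class EquivalenceDemo {
    // Hypothetical type whose equals/hashCode and compareTo disagree on purpose.
    static final class Shade implements Comparable<Shade> {
        final String name;
        final int brightness;

        Shade(String name, int brightness) {
            this.name = name;
            this.brightness = brightness;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Shade && ((Shade) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }

        @Override
        public int compareTo(Shade other) {
            return Integer.compare(brightness, other.brightness);
        }
    }

    // Adds two shades that are equal by name but differ in brightness.
    static int sizeAfterAdding(Set<Shade> set) {
        set.add(new Shade("red", 255));
        set.add(new Shade("red", 254));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdding(new HashSet<>()));  // 1: equals/hashCode decide
        System.out.println(sizeAfterAdding(new TreeSet<>()));  // 2: compareTo decides
    }
}
```

This is why the `Comparable` documentation recommends keeping natural ordering consistent with `equals`.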

### HashSet
The most commonly used set implementation, which is backed internally by a hash map. Its performance characteristics are therefore the same as those of `HashMap`: for a lightly loaded `HashSet`, `add`, `remove` and `contains` take $O(1)$ time. `HashSet` is not thread safe.

### CopyOnWriteArraySet
The implementation uses a `CopyOnWriteArrayList` internally, so its performance characteristics match those of an `ArrayList`: `add`, `remove` and `contains` take $O(n)$ time. The advantage over `HashSet` is thread safety: read operations need no locking and are very fast, whereas write operations require a lock and are therefore expensive.

`CopyOnWriteArraySet` should be used when the set is relatively small, read operations are far more frequent than writes, and thread safety is required. Its iterator provides a snapshot of the set as it was when the `Iterator` was constructed, so no locking is required while traversing the set.
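
The snapshot behaviour can be seen even in a single thread. In the sketch below (class and method names are made up for illustration), an element is added after the iterator has been created; the iterator neither sees it nor throws `ConcurrentModificationException`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class SnapshotIteration {
    // Creates an iterator, modifies the set, then drains the iterator.
    // Returns the elements the iterator actually produced.
    static List<String> iterateWhileAdding(Set<String> set, String extra) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = set.iterator();  // snapshot is taken here
        set.add(extra);                        // not visible to the snapshot
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<String> set = new CopyOnWriteArraySet<>(List.of("a", "b"));
        System.out.println(iterateWhileAdding(set, "c"));  // [a, b]
        System.out.println(set);                           // [a, b, c]
    }
}
```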

### EnumSet
A set that stores a subset of the members of an enum. The maximum number of elements is therefore the number of constants the enum declares.

In [6]:
enum Planets { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, NEPTUNE, URANUS }

EnumSet.of(Planets.MERCURY, Planets.MARS);    // unlike List.of(), EnumSet.of() returns a modifiable set

[MERCURY, MARS]

In [7]:
EnumSet.range(Planets.JUPITER, Planets.NEPTUNE);

[JUPITER, SATURN, NEPTUNE]

In [8]:
EnumSet.allOf(Planets.class);

[MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, NEPTUNE, URANUS]

The operations `add`, `remove` and `contains` all take $O(1)$ time. A simplistic implementation is given below. Java's implementation is more sophisticated and does bitwise operations on a long.

In [None]:
class SimpleEnumSet<T extends Enum<T>> {
    private final boolean[] elements;
    
    public SimpleEnumSet(Class<T> enumClass) {
        elements = new boolean[enumClass.getEnumConstants().length];
    }

    public boolean add(T element) {
        int pos = element.ordinal();
        if (elements[pos]) {
            return false;
        } else {
            elements[pos] = true;
            return true;
        }
    }

    public boolean contains(T element) {
        return elements[element.ordinal()];
    }

    public boolean remove(T element) {
        int pos = element.ordinal();
        if (elements[pos]) {
            elements[pos] = false;
            return true;
        } else {
            return false;
        }
    }
}
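
The bitwise idea can be sketched as follows: each enum constant owns one bit of a `long`, indexed by its ordinal. This is an illustration only, not the JDK's code; the class name `BitEnumSet` is made up, and the sketch assumes the enum has at most 64 constants.

```java
import java.time.DayOfWeek;

class BitEnumSet<T extends Enum<T>> {
    private long bits;  // bit i is set when the constant with ordinal i is present

    BitEnumSet(Class<T> enumClass) {
        // Assumption of this sketch: one long must be enough to hold all constants.
        if (enumClass.getEnumConstants().length > Long.SIZE) {
            throw new IllegalArgumentException("more than 64 enum constants");
        }
    }

    boolean add(T element) {
        long old = bits;
        bits |= 1L << element.ordinal();
        return bits != old;  // true only if the bit was not already set
    }

    boolean contains(T element) {
        return (bits & (1L << element.ordinal())) != 0;
    }

    boolean remove(T element) {
        long old = bits;
        bits &= ~(1L << element.ordinal());
        return bits != old;  // true only if the bit was previously set
    }

    int size() {
        return Long.bitCount(bits);
    }

    public static void main(String[] args) {
        BitEnumSet<DayOfWeek> days = new BitEnumSet<>(DayOfWeek.class);
        System.out.println(days.add(DayOfWeek.MONDAY));       // true
        System.out.println(days.add(DayOfWeek.MONDAY));       // false
        System.out.println(days.contains(DayOfWeek.MONDAY));  // true
        System.out.println(days.size());                      // 1
    }
}
```

Compared with the `boolean[]` version, bulk operations such as union or intersection of two sets reduce to a single `|` or `&` on the two longs.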

### LinkedHashSet
A set that maintains insertion order on iteration:

In [9]:
LinkedHashSet<String> states = new LinkedHashSet<>();
states.add("California"); states.add("Nevada"); states.add("Texas"); states.add("Vermont");
states;

[California, Nevada, Texas, Vermont]

In [11]:
states.addLast("Vermont");
states;

[California, Nevada, Texas, Vermont]

Like `HashSet`, `LinkedHashSet` is backed by a hash map. Iteration is faster than with `HashSet`, though, since the elements are all linked together and the iterator does not have to walk over empty buckets.

### NavigableSet
A `NavigableSet` guarantees that its `Iterator` traverses the set in sorted order.

In [12]:
record Car(String make, float bhp) {}

NavigableSet<Car> cars = new TreeSet<>((c1, c2) -> Float.compare(c2.bhp(), c1.bhp()));
cars.add(new Car("Veyron", 1001.0f));
cars.add(new Car("911 Turbo", 641.0f));
cars.add(new Car("918 Spyder", 887.0f));
cars;

[Car[make=Veyron, bhp=1001.0], Car[make=918 Spyder, bhp=887.0], Car[make=911 Turbo, bhp=641.0]]

In [13]:
// Remove and return elements
System.out.println(cars.pollFirst());
System.out.println(cars.pollLast());
cars;

Car[make=Veyron, bhp=1001.0]
Car[make=911 Turbo, bhp=641.0]


[Car[make=918 Spyder, bhp=887.0]]

If a `Comparator` is not provided, the elements are assumed to be instances of `Comparable`.

`SortedSet`, the parent of `NavigableSet`, provides methods that return views into the collection:

In [16]:
SortedSet<String> trees = new TreeSet<>();
trees.add("Oak"); trees.add("Maple"); trees.add("Eucalyptus"); trees.add("Banyan");

System.out.println(trees.subSet("Banyan", "Maple"));
System.out.println(trees.headSet("Maple"));
System.out.println(trees.tailSet("Eucalyptus"));

[Banyan, Eucalyptus]
[Banyan, Eucalyptus]
[Eucalyptus, Maple, Oak]


Adding to a view succeeds if the element falls within the view's range, and otherwise fails with "IllegalArgumentException: key out of range". Changes to the original set are reflected in the view automatically.
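
Both directions can be checked with a short sketch. The names `ViewDemo`, `sampleTrees` and `rejects` are placeholders; the data mirrors the `trees` set used above.

```java
import java.util.SortedSet;
import java.util.TreeSet;

class ViewDemo {
    static SortedSet<String> sampleTrees() {
        SortedSet<String> trees = new TreeSet<>();
        trees.add("Oak");
        trees.add("Maple");
        trees.add("Eucalyptus");
        trees.add("Banyan");
        return trees;
    }

    // Returns true if the view refuses the element as out of range.
    static boolean rejects(SortedSet<String> view, String element) {
        try {
            view.add(element);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SortedSet<String> trees = sampleTrees();
        SortedSet<String> head = trees.headSet("Maple");  // elements strictly less than "Maple"

        head.add("Cedar");  // in range: written through to trees
        trees.add("Ash");   // change to the backing set shows up in the view

        System.out.println(head);                   // [Ash, Banyan, Cedar, Eucalyptus]
        System.out.println(trees);                  // [Ash, Banyan, Cedar, Eucalyptus, Maple, Oak]
        System.out.println(rejects(head, "Pine"));  // true: "Pine" sorts after "Maple"
    }
}
```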

**`TreeSet`** is the implementation of `NavigableSet` provided by Java. `TreeSet` internally uses a *red-black tree*. `add`, `contains` and `remove` execute in $O(\log{n})$ time.

### ConcurrentSkipListSet
Backed by a [skip list](https://en.wikipedia.org/wiki/Skip_list), this set is both thread safe and sorted. The operations `add`, `remove` and `contains` take $O(\log{n})$ time.
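
A small sketch of both properties: two threads insert disjoint values without external locking, and the resulting set is sorted. The class and method names (`SkipListDemo`, `fillConcurrently`) are made up for illustration.

```java
import java.util.concurrent.ConcurrentSkipListSet;

class SkipListDemo {
    // Two threads add even and odd numbers concurrently, with no explicit synchronisation.
    static ConcurrentSkipListSet<Integer> fillConcurrently(int perThread) {
        ConcurrentSkipListSet<Integer> set = new ConcurrentSkipListSet<>();
        Thread evens = new Thread(() -> {
            for (int i = 0; i < perThread; i++) set.add(2 * i);
        });
        Thread odds = new Thread(() -> {
            for (int i = 0; i < perThread; i++) set.add(2 * i + 1);
        });
        evens.start();
        odds.start();
        try {
            evens.join();
            odds.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return set;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> set = fillConcurrently(1000);
        System.out.println(set.size());   // 2000
        System.out.println(set.first());  // 0
        System.out.println(set.last());   // 1999
    }
}
```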