### Streams
The Java Streams API lets us manipulate collections of data declaratively. One benefit is noticeably shorter code. Consider first the imperative way of summing the three smallest values of an array:

In [5]:
int[] numbers = {5, 7, 2, 9, 8, 1, 7};

// Copy the above array since we don't want to mutate it
int[] result = Arrays.copyOf(numbers, numbers.length);

// Sort it
Arrays.sort(result);

// Get sum of lowest 3
int sum = 0;
for(int i=0; i<3; i++){
    sum += result[i];
}

System.out.println(sum);

8


The same operation can be written more concisely using streams:

In [4]:
int[] numbers = {5, 7, 2, 9, 8, 1, 7};

int sum = Arrays.stream(numbers)
                .sorted()
                .limit(3)
                .sum();

System.out.println(sum);

8


### Building Streams

Object streams are the general kind of stream. To create an object stream:

In [None]:
// From values
Stream<String> cityStream = Stream.of("Los Angeles", "Paris", "Tokyo", "Berlin");

// From arrays
String[] cities = new String[] { "Los Angeles", "Paris", "Tokyo", "Berlin" };
Stream<String> sameCityStream = Arrays.stream(cities);

// From collections
List<String> cityList = Arrays.asList(cities);
Stream<String> anotherCityStream = cityList.stream();

There are also specialised streams for primitive values: `IntStream`, `LongStream` and `DoubleStream`.  
![Stream Inheritance](https://i.stack.imgur.com/uI6XA.png)  

The stream method of Arrays has multiple overloaded versions as listed below (only a subset):
- `stream(int[] array)` returns `IntStream`
- `stream(double[] array)` returns `DoubleStream`
- `stream(T[] array)` returns `Stream<T>`

In [None]:
int[] primes = new int[] { 2, 3, 5, 7, 11 };
IntStream primeStream = Arrays.stream(primes);

// list's stream method returns object streams
// because list itself is a collection of objects

To create the specialized streams we can use:

In [None]:
DoubleStream doubleStream = DoubleStream.of(3.56, 2.91, 8.314);

// 1 to 100: the upper bound of range is exclusive
LongStream longStream = LongStream.range(1, 101);

We can also convert an object stream to a specialised stream:

In [None]:
Stream<Integer> numbers = Stream.of(1, 4, 7, 8, 0, -5);
IntStream integerNumbers = numbers.mapToInt(i -> i);

// mapToLong and mapToDouble are the other two methods
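
The conversion also works in the other direction. The following is a minimal sketch (the class and method names are made up for illustration) that unboxes an object stream with `mapToInt` and boxes a primitive stream again with `boxed`:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class StreamConversionSketch {
    // Object stream -> primitive stream: unbox each Integer, then sum the ints
    static int sumOf(Stream<Integer> numbers) {
        return numbers.mapToInt(Integer::intValue).sum();
    }

    // Primitive stream -> object stream: boxed() wraps each int in an Integer
    static List<Integer> toList(IntStream ints) {
        return ints.boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOf(Stream.of(1, 4, 7, 8, 0, -5)));  // 15
        System.out.println(toList(IntStream.rangeClosed(1, 3)));  // [1, 2, 3]
    }
}
```

`sum` is only available on the primitive streams, which is one common reason for converting.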

### Intermediate Operations
Intermediate operations on a stream return another stream, so they can be chained together. They are lazily evaluated: nothing runs until a terminal operation is called.

| Operation 	| Description 	|
|-	|-	|
| sorted() 	| Returns a stream consisting of the elements of this stream, sorted according to natural order. 	|
| skip(long n) 	| Returns a stream consisting of the remaining elements of this stream after discarding the first n elements of the stream. 	|
| peek(Consumer<? super T> action) 	| Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as elements are consumed from the resulting stream. 	|
| limit(long maxSize) 	| Returns a stream consisting of the elements of this stream, truncated to be no longer than maxSize in length. 	|
| distinct() 	| Returns a stream consisting of the distinct elements (according to Object.equals(Object)) of this stream. 	|
| filter(Predicate<? super T> predicate) 	| Returns a stream consisting of the elements of this stream that match the given predicate. 	|
| map(Function<? super T,? extends R> mapper) 	| Returns a stream consisting of the results of applying the given function to the elements of this stream. 	|
| flatMap(Function<? super T,? extends Stream<? extends R>> mapper) 	| Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. 	|
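
As a rough illustration of how the operations in the table compose, the sketch below chains most of them in one pipeline. The input data and the class name are assumptions chosen only for this example:

```java
import java.util.List;
import java.util.stream.Collectors;

public class IntermediateOpsSketch {
    static List<Integer> pipeline(List<List<Integer>> nested) {
        return nested.stream()
                .flatMap(List::stream)     // flatten the inner lists into one stream
                .filter(n -> n % 2 == 1)   // keep the odd (positive) numbers
                .distinct()                // drop duplicates
                .sorted()                  // natural order
                .skip(1)                   // discard the smallest element
                .limit(3)                  // keep at most three elements
                .map(n -> n * 10)          // transform each element
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> nested =
                List.of(List.of(5, 3, 8), List.of(3, 9, 1), List.of(7, 2, 11));
        System.out.println(pipeline(nested)); // [30, 50, 70]
    }
}
```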

The lazy evaluation characteristic can be seen in the example below:

In [None]:
import java.util.stream.Stream;

// Nothing printed
Stream.of("d2", "a2", "b1", "b3", "c")
    .filter(s -> {
        System.out.println("filter: " + s);
        return true;
    });

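When the pipeline is finally evaluated, stateless operations such as `filter` and `map` are fused: each element passes through the whole chain in a single pass, rather than each operation processing the entire stream in turn.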
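
A small sketch can make this visible by recording the order in which the lambdas run. The `log` list and the `startsWith("b")` condition are assumptions added for the demonstration; the input values are the ones used above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

public class LazyFusionSketch {
    // Returns the order in which filter and map were invoked, plus the result
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Optional<String> first = Stream.of("d2", "a2", "b1", "b3", "c")
                .filter(s -> { log.add("filter:" + s); return s.startsWith("b"); })
                .map(s -> { log.add("map:" + s); return s.toUpperCase(); })
                .findFirst(); // terminal operation: this is what triggers evaluation
        log.add("result:" + first.orElse("none"));
        return log;
    }

    public static void main(String[] args) {
        // [filter:d2, filter:a2, filter:b1, map:b1, result:B1]
        System.out.println(trace());
    }
}
```

Each element travels through `filter` and then `map` before the next element is touched, and `findFirst` stops the traversal as soon as it has a result, so `b3` and `c` are never inspected.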

### Terminal Operations
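Terminal operations trigger evaluation of the pipeline and typically produce a single non-stream result, such as a value, an `Optional` or a collection. Some examples: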

In [9]:
// Matching terminal operations, return true or false
List<String> movies = List.of("One flew over the cuckoo's nest", "To kill a mockingbird", "Gone with the wind");

// Does any element in stream match the condition?
System.out.println(movies.stream().anyMatch(s -> s.startsWith("To")));

// Do all elements in stream match the condition?
System.out.println(movies.stream().allMatch(s -> s.startsWith("One")));

// Do none of the elements in stream match the condition?
System.out.println(movies.stream().noneMatch(s -> s.startsWith("Once")));

true
false
true


In [15]:
// Minimum, maximum, count
List<Integer> integers = List.of(12, 22, 45, 65, 5, 87);

// max() : Provide a Comparator, returns an Optional, since stream could have
// been empty
// Integer::compare avoids the overflow that a subtraction-based comparator can hit
integers.stream().max(Integer::compare).ifPresent(System.out::println);

// Counting
System.out.println(integers.stream().count());

87
6


`reduce` is one of the most important terminal operations. Its two most commonly used overloads are:
- `reduce(BinaryOperator<T> accumulator)` returns `Optional<T>`
- `reduce(T identity, BinaryOperator<T> accumulator)` returns `T`

The first argument of the accumulator is the intermediate result, and the second argument is the stream element.

With the identity version, the operations are applied in the following sequence (IE is the identity element, op the accumulator):
```
      1    5   ...
      |    |
IE - op - op - ... 
```

If we use the first version, three cases are possible:
- No element in the stream: return `Optional.empty()`
- One element: just return the element without applying the accumulator at all.
- Two or more elements: apply the accumulator to all of them and return the result.
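
The three cases can be checked with a small sketch. It uses a product rather than a sum purely as an illustrative choice, and the class and method names are made up:

```java
import java.util.List;
import java.util.Optional;

public class ReduceCasesSketch {
    // No identity: the result is wrapped in an Optional
    static Optional<Integer> product(List<Integer> values) {
        return values.stream().reduce((a, b) -> a * b);
    }

    // With identity: an empty stream simply yields the identity
    static int productWithIdentity(List<Integer> values) {
        return values.stream().reduce(1, (a, b) -> a * b);
    }

    public static void main(String[] args) {
        System.out.println(product(List.of()));              // Optional.empty
        System.out.println(product(List.of(7)));             // Optional[7]
        System.out.println(product(List.of(2, 3, 4)));       // Optional[24]
        System.out.println(productWithIdentity(List.of()));  // 1
    }
}
```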

In [19]:
// Sum all the elements
integers.stream().reduce((a, b) -> a + b).ifPresent(System.out::println);

// Using below class for subsequent examples
class Person {
    String name;
    int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public String toString() {
        return name;
    }
}

List<Person> persons = Arrays.asList(new Person("Max", 18), new Person("Peter", 23), new Person("Pamela", 23), new Person("David", 12));

// Person with maximum age
persons.stream().reduce((p1, p2) -> p1.age > p2.age ? p1 : p2).ifPresent(System.out::println);

// Get combined name; build a new Person at each step instead of mutating the identity
System.out.println(persons.stream().reduce(new Person("", 0), (p1, p2) -> new Person(p1.name + p2.name, p1.age + p2.age)));

236
Pamela
MaxPeterPamelaDavid


`IntStream`, `LongStream` and `DoubleStream` provide additional terminal operations such as `sum`, `average` and `summaryStatistics`.
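
As a minimal sketch of these operations, the following uses the ages from the `Person` example as a plain `int` array (the class and method names are assumptions):

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

public class PrimitiveTerminalOpsSketch {
    // Computes count, sum, min, max and average in a single pass
    static IntSummaryStatistics stats(int... values) {
        return IntStream.of(values).summaryStatistics();
    }

    public static void main(String[] args) {
        int[] ages = {18, 23, 23, 12};
        System.out.println(IntStream.of(ages).sum());                     // 76
        // average() returns an OptionalDouble because the stream may be empty
        System.out.println(IntStream.of(ages).average().getAsDouble());   // 19.0

        IntSummaryStatistics s = stats(ages);
        System.out.println(s.getMin() + " " + s.getMax() + " " + s.getCount()); // 12 23 4
    }
}
```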

The `collect` terminal operation is another of the most important ones. With `collect`, we can gather the result into a list, set, map and so on, using the many static factory methods of the `Collectors` class.

In [None]:
// Collect to a list (an ArrayList in practice, although the concrete type is not guaranteed)
List<Integer> ageList = persons.stream().map(p -> p.age).sorted().collect(Collectors.toList());

// Collect to an unmodifiable list
// nothing can be added after the list has been formed
List<Integer> immutableAgeList = persons.stream().map(a -> a.age).collect(Collectors.toUnmodifiableList());

// Collect to a Set
Set<String> nameSet = persons.stream().map(p -> p.name).collect(Collectors.toSet());

// There is also an unmodifiable counterpart: Collectors.toUnmodifiableSet()

In order to get a map, use the `toMap` collector.

In [None]:
// Name -> Age map
Map<String, Integer> nameAgeMap = persons.stream().collect(Collectors.toMap(a -> a.name, a -> a.age));

// There is also an unmodifiable counterpart: Collectors.toUnmodifiableMap(...)
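
One thing to watch for with `toMap` is duplicate keys: the two-argument version throws an `IllegalStateException` when two elements map to the same key. The sketch below inverts the mapping (age to name) so that Peter and Pamela collide, and resolves the collision with a merge function. The `"/"` separator and the helper names are assumptions for illustration only:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapMergeSketch {
    static class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    static List<Person> samplePersons() {
        return Arrays.asList(new Person("Max", 18), new Person("Peter", 23),
                new Person("Pamela", 23), new Person("David", 12));
    }

    // Without a merge function, a duplicate key makes the collector throw
    static boolean failsWithoutMerge(List<Person> persons) {
        try {
            persons.stream().collect(Collectors.toMap(p -> p.age, p -> p.name));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    // The third argument decides how two values for the same key are combined
    static Map<Integer, String> namesByAge(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.toMap(p -> p.age, p -> p.name, (a, b) -> a + "/" + b));
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutMerge(samplePersons()));   // true
        System.out.println(namesByAge(samplePersons()).get(23));  // Peter/Pamela
    }
}
```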

To join values, use the `joining` collector. Its benefit is that the delimiter is inserted only between elements, never after the last one.

In [None]:
// Join together as string
String s = persons.stream().map(p -> p.name).collect(Collectors.joining(", ")); // No comma at the end!

// The same can be achieved using reduce, but is a bit clunky
String[] s_ = { "" };
persons.stream().map(p -> p.name).reduce((a, b) -> a + ", " + b).ifPresent(a -> s_[0] = a);

Using `partitioningBy` we can split a stream into two groups. The resulting map has the keys `true` and `false`, holding the elements that do and do not satisfy the predicate we specify.

In [None]:
Map<Boolean, List<Person>> agePartition = persons.stream().collect(Collectors.partitioningBy(p -> p.age > 30));
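
To show what the resulting map looks like, here is a minimal sketch that partitions plain ages instead of `Person` objects. The threshold of 18 and the method name are assumptions made for the example:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionSketch {
    static Map<Boolean, List<Integer>> byAdulthood(List<Integer> ages) {
        return ages.stream().collect(Collectors.partitioningBy(age -> age >= 18));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = byAdulthood(List.of(18, 23, 23, 12));
        System.out.println(parts.get(true));   // [18, 23, 23]
        System.out.println(parts.get(false));  // [12]

        // Both keys are present even when one side has no elements
        System.out.println(byAdulthood(List.of(30, 40)).get(false)); // []
    }
}
```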

For multiple partitions, we can use `groupingBy`. By grouping, we are essentially creating buckets.

In [None]:
// Age -> person list
Map<Integer, List<Person>> ageToPersonMap = persons.stream()
    .collect(Collectors.groupingBy((p) -> p.age));

// Name -> Age list
// We employ an overloaded version of groupingBy which accepts another Collector
Map<String, List<Integer>> nameAgeGroup = persons.stream()
    .collect(Collectors.groupingBy(p -> p.name, Collectors.mapping(p -> p.age, Collectors.toList()))); 
// Why do we need toList collector in above case? Because the value is a list, if we want a set, we
// can use toSet collector

// Name -> name count
Map<String, Long> nameFrequency = persons.stream()
    .collect(Collectors.groupingBy(p -> p.name, Collectors.counting()));
// What if we want Integer and not Long?
Map<String, Integer> nameFrequency_ = persons.stream()
    .collect(Collectors.groupingBy(p -> p.name, Collectors.collectingAndThen(Collectors.counting(), Long::intValue)));

### Optional
Consider the classes below:

In [None]:
public class Person {
    private Car car;
    public Car getCar() { return car; }
}

public class Car {
    private Insurance insurance;
    public Insurance getInsurance() { return insurance; }
}

public class Insurance {
    private String name;
    public String getName() { return name; }
}

Now if a person does not have a car, calling `p.getCar()` returns null, and dereferencing a null reference throws a `NullPointerException`. One solution is to introduce defensive null checks:

In [None]:
// Too much nesting
public String getCarInsuranceName(Person p){
    if(p != null){
        Car c = p.getCar();
        if(c != null){
            Insurance i = c.getInsurance();
            if(i != null){
                return i.getName();
            }
        }
    }

    return "Unknown";
}

Another way would be:

In [None]:
// Too many repetitions
public String getCarInsuranceName(Person p){
    if(p == null){
        return "Unknown";
    }

    Car c = p.getCar();
    if(c == null){
        return "Unknown";
    }

    Insurance i = c.getInsurance();
    if(i == null){
        return "Unknown";
    }

    return i.getName();
}

Using `Optional`, we can model our previous classes as follows:

In [None]:
public class Person {
    private Optional<Car> car;
    public Optional<Car> getCar() { return car; }
}

public class Car {
    private Optional<Insurance> insurance;
    public Optional<Insurance> getInsurance() { return insurance; }
}

public class Insurance {
    private String name;
    public String getName() { return name; }
}

The use of `Optional` enriches the semantics of your model. The fact that a person references an `Optional<Car>`, and a car an `Optional<Insurance>`, makes it explicit in the domain that a person might or might not own a car, and that a car might or might not be insured. The name of the insurance is not optional, which signals that it is a mandatory field.

In [None]:
// Absence of a car
Optional<Car> noCar = Optional.empty();

// Presence of a car
Optional<Car> car = Optional.of(new Car());

To get the car out of an optional:

In [None]:
Car c = car.get();

However, `get` throws a `NoSuchElementException` when the optional is empty, for which we would have to add a try-catch block or an `isPresent` check. So this brings us back to where we started!
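
One way to avoid calling `get` directly is to chain `flatMap`, `map` and `orElse`, so that an empty value at any step falls through to a default. The sketch below is one possible arrangement rather than the only one: the constructors and the insurer name "Acme" are hypothetical additions needed to build sample objects.

```java
import java.util.Optional;

public class OptionalChainSketch {
    // Constructors are hypothetical; the original classes do not define any
    static class Insurance {
        private final String name;
        Insurance(String name) { this.name = name; }
        String getName() { return name; }
    }

    static class Car {
        private final Optional<Insurance> insurance;
        Car(Optional<Insurance> insurance) { this.insurance = insurance; }
        Optional<Insurance> getInsurance() { return insurance; }
    }

    static class Person {
        private final Optional<Car> car;
        Person(Optional<Car> car) { this.car = car; }
        Optional<Car> getCar() { return car; }
    }

    static String getCarInsuranceName(Optional<Person> person) {
        return person.flatMap(Person::getCar)      // Optional<Car>, empty if no car
                .flatMap(Car::getInsurance)        // Optional<Insurance>, empty if uninsured
                .map(Insurance::getName)           // Optional<String>
                .orElse("Unknown");                // default when any step was empty
    }

    public static void main(String[] args) {
        Person insured = new Person(Optional.of(new Car(Optional.of(new Insurance("Acme")))));
        Person noCar = new Person(Optional.empty());
        System.out.println(getCarInsuranceName(Optional.of(insured))); // Acme
        System.out.println(getCarInsuranceName(Optional.of(noCar)));   // Unknown
        System.out.println(getCarInsuranceName(Optional.empty()));     // Unknown
    }
}
```

`flatMap` is used where the getter already returns an `Optional`, which avoids ending up with a nested `Optional<Optional<Car>>`.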

### Parallel Streams
We can get a parallel stream by:

In [2]:
import java.util.stream.Stream;

// Converting existing sequential stream
Stream<Integer> stream = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
stream.parallel().forEach(x -> System.out.print(x + " "));

7 4 5 1 2 10 9 3 8 6 

In [4]:
// Creating a parallel stream
Integer[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
List<Integer> integerList = Arrays.asList(numbers);
Stream<Integer> stream = integerList.parallelStream();
stream.forEach(x -> System.out.print(x + " ")); // parallelStream() is already parallel

3 1 2 9 4 8 6 7 5 10 

With streams, the structure of concurrent code is the same as that of sequential code. As the examples above show, a single switch turns a sequential stream into a parallel one. What about the code below: will it run as a parallel or a sequential stream?

In [5]:
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .parallel()
    .map(e -> (float)e )
    .sequential() // The last parallel or sequential operation decides
                  // since intermediate operations are lazily evaluated
    .forEach(x -> System.out.print(x + " "))   

1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 

Let's see which threads execute the intermediate operations:

In [7]:
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .parallel()
    .map(e -> {
        System.out.println("Element=" + e + ", Thread=" + Thread.currentThread().getName());
        return e*e;
    })
    .forEach(x -> {})

Element=7, Thread=IJava-executor-3
Element=9, Thread=ForkJoinPool.commonPool-worker-3
Element=5, Thread=ForkJoinPool.commonPool-worker-5
Element=8, Thread=ForkJoinPool.commonPool-worker-27
Element=1, Thread=ForkJoinPool.commonPool-worker-7
Element=4, Thread=ForkJoinPool.commonPool-worker-21
Element=10, Thread=ForkJoinPool.commonPool-worker-13
Element=3, Thread=ForkJoinPool.commonPool-worker-9
Element=6, Thread=ForkJoinPool.commonPool-worker-31
Element=2, Thread=ForkJoinPool.commonPool-worker-17


In case of sequential streams:

In [8]:
// Normally this would be the main thread; here it is IJava-executor-3 because
// the code is executed by the IJava Jupyter kernel
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .map(e -> {
        System.out.println("Element=" + e + ", Thread=" + Thread.currentThread().getName());
        return e*e;
    })
    .forEach(x -> {})

Element=1, Thread=IJava-executor-3
Element=2, Thread=IJava-executor-3
Element=3, Thread=IJava-executor-3
Element=4, Thread=IJava-executor-3
Element=5, Thread=IJava-executor-3
Element=6, Thread=IJava-executor-3
Element=7, Thread=IJava-executor-3
Element=8, Thread=IJava-executor-3
Element=9, Thread=IJava-executor-3
Element=10, Thread=IJava-executor-3


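Parallel streams internally use the common `ForkJoinPool`. By default its parallelism is the number of available cores minus one, since the calling thread also takes part in the work. This can be changed with the `java.util.concurrent.ForkJoinPool.common.parallelism` system property, passed as a VM argument.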
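
The values in effect on a given machine can be inspected with a short sketch such as the one below (the class and method names are made up). The printed numbers depend on the hardware and on any VM arguments, so no expected output is shown:

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolSketch {
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("Available cores: " + cores());
        System.out.println("Common pool parallelism: " + commonPoolParallelism());
        // To override, start the JVM with for example:
        //   -Djava.util.concurrent.ForkJoinPool.common.parallelism=4
    }
}
```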

From the `forEach` output we can see that with a parallel stream the order of the results differs from the source order. Some operations are inherently unordered, whereas others have an ordered counterpart. Consider the code below:

In [9]:
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .parallel()
    .map(e -> e * 2)
    .forEachOrdered(x -> System.out.print(x + " "))   

2 4 6 8 10 12 14 16 18 20 

Even though we used a parallel stream, the order of elements is maintained. This is still parallel execution. Let's see the threads involved:

In [11]:
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .parallel()
    .map(e -> {
        System.out.println("**Element=" + e + ", Thread=" + Thread.currentThread().getName());
        return e*e;
    })
    .forEachOrdered(e -> {
        System.out.println("--Element=" + e + ", Thread=" + Thread.currentThread().getName());
    })   

**Element=7, Thread=IJava-executor-5
**Element=10, Thread=ForkJoinPool.commonPool-worker-17
**Element=6, Thread=ForkJoinPool.commonPool-worker-9
**Element=8, Thread=ForkJoinPool.commonPool-worker-29
**Element=2, Thread=ForkJoinPool.commonPool-worker-15
**Element=5, Thread=ForkJoinPool.commonPool-worker-23
**Element=3, Thread=ForkJoinPool.commonPool-worker-13
**Element=1, Thread=ForkJoinPool.commonPool-worker-19
**Element=9, Thread=ForkJoinPool.commonPool-worker-25
**Element=4, Thread=ForkJoinPool.commonPool-worker-1
--Element=1, Thread=ForkJoinPool.commonPool-worker-19
--Element=4, Thread=ForkJoinPool.commonPool-worker-19
--Element=9, Thread=ForkJoinPool.commonPool-worker-19
--Element=16, Thread=ForkJoinPool.commonPool-worker-19
--Element=25, Thread=ForkJoinPool.commonPool-worker-19
--Element=36, Thread=ForkJoinPool.commonPool-worker-19
--Element=49, Thread=ForkJoinPool.commonPool-worker-19
--Element=64, Thread=ForkJoinPool.commonPool-worker-19
--Element=81, Thread=ForkJoinPool.commonP

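We see that `map` still runs in parallel across many threads, while `forEachOrdered` consumes the results one at a time in encounter order (in this run, all on the same worker thread). Compare this with: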

In [12]:
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .parallel()
    .map(e -> {
        System.out.println("**Element=" + e + ", Thread=" + Thread.currentThread().getName());
        return e*e;
    })
    .forEach(e -> {
        System.out.println("--Element=" + e + ", Thread=" + Thread.currentThread().getName());
    })   

**Element=9, Thread=ForkJoinPool.commonPool-worker-25
**Element=1, Thread=ForkJoinPool.commonPool-worker-9
**Element=4, Thread=ForkJoinPool.commonPool-worker-3
--Element=16, Thread=ForkJoinPool.commonPool-worker-3
**Element=5, Thread=ForkJoinPool.commonPool-worker-17
**Element=10, Thread=ForkJoinPool.commonPool-worker-23
**Element=6, Thread=ForkJoinPool.commonPool-worker-13
**Element=8, Thread=ForkJoinPool.commonPool-worker-29
--Element=64, Thread=ForkJoinPool.commonPool-worker-29
**Element=2, Thread=ForkJoinPool.commonPool-worker-15
**Element=3, Thread=ForkJoinPool.commonPool-worker-1
--Element=9, Thread=ForkJoinPool.commonPool-worker-1
--Element=4, Thread=ForkJoinPool.commonPool-worker-15
--Element=36, Thread=ForkJoinPool.commonPool-worker-13
--Element=100, Thread=ForkJoinPool.commonPool-worker-23
--Element=25, Thread=ForkJoinPool.commonPool-worker-17
--Element=1, Thread=ForkJoinPool.commonPool-worker-9
--Element=81, Thread=ForkJoinPool.commonPool-worker-25
**Element=7, Thread=IJava-

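`forEachOrdered` can guarantee the ordering because the source has a defined encounter order, as `Stream.of` and lists do. If the stream had been formed from an unordered source such as a `HashSet`, there would have been no encounter order to preserve. A similar observation applies to `findAny`, which may return any matching element, as opposed to `findFirst`.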
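
The difference between the two can be sketched as follows, using a parallel stream over a list and an even-number filter chosen only for illustration. The result of `findAny` may vary from run to run, so only its general property is noted:

```java
import java.util.List;
import java.util.Optional;

public class FindFirstVsFindAnySketch {
    // Respects encounter order even on a parallel stream
    static Optional<Integer> firstEven(List<Integer> values) {
        return values.parallelStream().filter(x -> x % 2 == 0).findFirst();
    }

    // Free to return whichever matching element a thread finds first
    static Optional<Integer> anyEven(List<Integer> values) {
        return values.parallelStream().filter(x -> x % 2 == 0).findAny();
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(firstEven(values)); // Optional[2]
        System.out.println(anyEven(values));   // some even element, not necessarily 2
    }
}
```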

The reduce operation is also performed in parallel:

In [13]:
Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .parallel()
    .reduce(0, (total, x) -> {
        System.out.println("Total=" + total + ", x=" + x + ", Thread=" + Thread.currentThread().getName());
        return total + x;
    })

Total=0, x=3, Thread=ForkJoinPool.commonPool-worker-21
Total=0, x=4, Thread=ForkJoinPool.commonPool-worker-19
Total=0, x=1, Thread=ForkJoinPool.commonPool-worker-15
Total=0, x=5, Thread=ForkJoinPool.commonPool-worker-1
Total=0, x=9, Thread=ForkJoinPool.commonPool-worker-7
Total=0, x=10, Thread=ForkJoinPool.commonPool-worker-11
Total=0, x=2, Thread=ForkJoinPool.commonPool-worker-25
Total=0, x=8, Thread=ForkJoinPool.commonPool-worker-29
Total=0, x=7, Thread=IJava-executor-7
Total=1, x=2, Thread=ForkJoinPool.commonPool-worker-25
Total=9, x=10, Thread=ForkJoinPool.commonPool-worker-11
Total=4, x=5, Thread=ForkJoinPool.commonPool-worker-1
Total=3, x=9, Thread=ForkJoinPool.commonPool-worker-1
Total=0, x=6, Thread=ForkJoinPool.commonPool-worker-5
Total=6, x=7, Thread=ForkJoinPool.commonPool-worker-5
Total=3, x=12, Thread=ForkJoinPool.commonPool-worker-1
Total=8, x=19, Thread=ForkJoinPool.commonPool-worker-11
Total=13, x=27, Thread=ForkJoinPool.commonPool-worker-11
Total=15, x=40, Thread=ForkJ

55

This is why an incorrect identity element amplifies the error in a parallel stream: as the `Total=0` lines show, the identity is applied once per chunk of work rather than once overall.  
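
A minimal sketch of the effect, assuming a deliberately wrong identity of 10 for a sum (the class and method names are made up). The exact parallel result depends on how the runtime splits the work, so it is described rather than stated:

```java
import java.util.List;

public class IdentitySketch {
    static int sequentialSum(List<Integer> values, int identity) {
        return values.stream().reduce(identity, Integer::sum);
    }

    static int parallelSum(List<Integer> values, int identity) {
        return values.parallelStream().reduce(identity, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        // Correct identity: both agree
        System.out.println(sequentialSum(values, 0)); // 55
        System.out.println(parallelSum(values, 0));   // 55

        // Wrong identity: sequential is off by 10 once...
        System.out.println(sequentialSum(values, 10)); // 65
        // ...parallel adds 10 once per chunk, so the result is 55 + 10 * (number of chunks)
        System.out.println(parallelSum(values, 10));
    }
}
```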