# Functional programming

Rust's design is heavily influenced by functional languages, and it has features that are unfamiliar to C/C++ developers. In this chapter, we will learn those features.

## Algebraic data types

Rust has C-like enums:

In [2]:
#[derive(Debug)]
enum IpType {
    V4,
    V6,
}

IpType::V4

V4

So we can define an IP address as a struct:

In [3]:
#[derive(Debug)]
struct Ip(IpType, &'static str);

Ip(IpType::V4, "127.0.0.1")

Ip(V4, "127.0.0.1")

But this is not ideal. We want to store 4 bytes for IPv4 and 16 bytes for IPv6. A Rust enum can handle that, because enum variants can have fields, like a struct.

In [4]:
#[derive(Debug)]
enum Ip {
    V4([u8; 4]),
    V6([u8; 16]),
}

let ip = Ip::V4([127, 0, 0, 1]);
ip

V4([127, 0, 0, 1])

Rust enums enable writing type-safe code. In this example, we can't make a version 6 IP with only 4 bytes.

But how can we use such enums? Accessing a field directly doesn't work (and wouldn't really make sense, since the available fields depend on the variant):

In [5]:
ip.0

Error: no field `0` on type `Ip`

## Pattern matching

We saw pattern matching in the memory safety chapter. It also works on enums:

In [6]:
fn show_ip(ip: &Ip) {
    match ip {
        Ip::V4(x) => {
            println!("{}.{}.{}.{}", x[0], x[1], x[2], x[3]);
        }
        Ip::V6(_) => todo!(),
    }
}

show_ip(&ip);

127.0.0.1


That is, an enum variant whose fields are patterns is itself a pattern. A more complex example:

In [8]:
fn show_ip(ip: &Ip) {
    match ip {
        Ip::V4([127, _, _, _]) => println!("localhost"),
        Ip::V4([192, 168, x, y]) => println!("local network 192.168.{x}.{y}"),
        Ip::V4([a, b, c, d]) => println!("ip {a}.{b}.{c}.{d}"),
        Ip::V6(_) => todo!(),
    }
}

show_ip(&Ip::V4([192, 168, 1, 1]));

local network 192.168.1.1
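
The `Ip::V6` arm above is still a `todo!()`. As a minimal sketch of how it could be filled in, here is a hypothetical `format_ip` helper (not part of the original example) that returns the text instead of printing it. It writes each pair of bytes as one hexadecimal group and does not attempt the `::` shortening that real IPv6 notation allows:

```rust
enum Ip {
    V4([u8; 4]),
    V6([u8; 16]),
}

// Hypothetical helper: builds the textual form of an address.
fn format_ip(ip: &Ip) -> String {
    match ip {
        // The array pattern binds all four bytes at once.
        Ip::V4([a, b, c, d]) => format!("{a}.{b}.{c}.{d}"),
        // A plain identifier is also a pattern; it binds the whole array.
        Ip::V6(bytes) => {
            let mut groups = Vec::new();
            for pair in bytes.chunks(2) {
                // Two bytes form one big-endian 16-bit group.
                let group = u16::from_be_bytes([pair[0], pair[1]]);
                groups.push(format!("{group:x}"));
            }
            groups.join(":")
        }
    }
}
```

With this, `format_ip(&Ip::V4([127, 0, 0, 1]))` gives `"127.0.0.1"`, and the IPv6 loopback address (fifteen zero bytes followed by `1`) gives `"0:0:0:0:0:0:0:1"`.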


`enum` variants can also have named fields:

In [10]:
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(&'static str),
    ChangeColor(i32, i32, i32),
}

let message = Message::Move { x: 12, y: -33 };
match message {
    Message::Move { x: _, y: 0.. } => "move up",
    Message::Move { x: _, y: _ } => "move down",
    _ => "unhandled",
}

"move down"

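The match above falls back to `_` for the other variants. A sketch of a match that handles every variant explicitly could look like the following, where `describe` is a hypothetical helper name. Without a `_` arm, the compiler checks that no variant is forgotten:

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(&'static str),
    ChangeColor(i32, i32, i32),
}

// Hypothetical helper: one arm per shape of `Message`.
fn describe(message: &Message) -> String {
    match message {
        Message::Quit => "quit".to_string(),
        // `x` is shorthand for `x: x`; `y` must be zero or positive here.
        Message::Move { x, y: 0.. } => format!("move up, x = {x}"),
        // `..` ignores the remaining fields.
        Message::Move { x, .. } => format!("move down, x = {x}"),
        Message::Write(text) => format!("write {text}"),
        Message::ChangeColor(r, g, b) => format!("color #{r:02x}{g:02x}{b:02x}"),
    }
}
```

For example, `describe(&Message::Move { x: 12, y: -33 })` returns `"move down, x = 12"`.
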
A `struct` is also a pattern:

In [13]:
struct Human {
    name: &'static str,
    height: u32,
}

let hamid = Human { name: "hamid", height: 190 };
match hamid {
    Human { height: 300.., name: _ } => println!("invalid"),
    Human { height: 180.., name: some_ident } => println!("A tall human {some_ident}"),
    // here `name` is shortcut for `name: name`
    Human { height: _, name } => println!("A normal human {name}"),
};

A tall human hamid


Patterns that match every possible value (irrefutable patterns), like `_`, `ident`, `(a, b)`, ... can be used in `let`:

In [14]:
let Human { height, name } = hamid;
(height, name)

(190, "hamid")

Or in function arguments:

In [15]:
fn sum_of_array([a, b, c, d]: [i32; 4]) -> i32 {
    a + b + c + d
}

sum_of_array([1, 2, 3, 4])

10

Or in a `for` loop:

In [17]:
let v = vec![1, 5, 10];
// `.enumerate` converts an iterator of X to an iterator of (index, X)
for (i, x) in v.iter().enumerate() {
    println!("{i}: {x}");
}

0: 1
1: 5
2: 10


()
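
The patterns in `let`, function arguments, and `for` above always match. A pattern that can fail, such as a single enum variant, is rejected by a plain `let`, but `if let` (or `match`) accepts it. A small sketch, where `first_octet` is a hypothetical helper and `Ip` is repeated so the example stands alone:

```rust
enum Ip {
    V4([u8; 4]),
    V6([u8; 16]),
}

// Hypothetical helper: returns the first byte of an IPv4 address, if it is one.
fn first_octet(ip: &Ip) -> Option<u8> {
    // `let Ip::V4([first, ..]) = ip;` would not compile here,
    // because the pattern does not cover `Ip::V6`.
    if let Ip::V4([first, ..]) = ip {
        Some(*first)
    } else {
        None
    }
}
```

Here `first_octet(&Ip::V4([10, 0, 0, 1]))` is `Some(10)`, while any `Ip::V6` value gives `None`.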

`mut` is part of the pattern, not part of `let`, so we can write this:

In [18]:
let (mut x, y) = (2, 5); // x is mutable, y is not
x += 4;
(y, x)

(5, 6)

Or even use it outside of `let`, for example in function arguments, in `for` loop variables, or even in `match`:

In [19]:
match Some(2) {
    Some(mut x) => {
        x += 2;
        x
    }
    None => 12,
}

4

Here we saw `Option` again. Did you know that `Option` itself is an `enum`, and that we can define our own equivalent?

In [21]:
#[derive(Debug)]
enum MyOption<T> {
    MySome(T),
    MyNone,
}

use MyOption::*;

let x: MyOption<i32> = MySome(2);
x

MySome(2)

`MyOption` is equivalent to the standard library's `Option`. It even gets the null pointer niche optimization:

In [22]:
use std::mem::size_of;

(size_of::<Option<i64>>(), size_of::<MyOption<i64>>(), size_of::<Option<&i64>>(), size_of::<MyOption<&i64>>())

(16, 16, 8, 8)

So there is no additional support for `Option` variants in pattern matching. It just follows the `enum` rules.
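
To illustrate, here is a sketch of two helper methods written for `MyOption` with nothing but `match`. The names mirror `Option::unwrap_or` and `Option::is_some`, but this is our own illustrative code, not the standard library's implementation:

```rust
enum MyOption<T> {
    MySome(T),
    MyNone,
}

impl<T> MyOption<T> {
    // Returns the contained value, or `default` if there is none.
    fn unwrap_or(self, default: T) -> T {
        match self {
            MyOption::MySome(value) => value,
            MyOption::MyNone => default,
        }
    }

    // Reports whether a value is present, without consuming `self`.
    fn is_some(&self) -> bool {
        matches!(self, MyOption::MySome(_))
    }
}
```

So `MyOption::MySome(2).unwrap_or(7)` is `2`, and `MyOption::<i32>::MyNone.unwrap_or(7)` is `7`.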

Some functions and types are annotated with `#[must_use]`, and the compiler will warn us if we don't use their results:

In [23]:
#[deny(warnings)]
fn foo() {
    let v = vec![1, 2, 3, 4];
    v.get(3); // this is suspicious
}

Error: unused return value of `core::slice::<impl [T]>::get` that must be used

We can silence the compiler with `let _ = `:

In [24]:
#[deny(warnings)]
fn foo() {
    let v = vec![1, 2, 3, 4];
    let _ = v.get(3); // We explicitly want to ignore the result
}

`_` in `let _ = ` isn't a variable name. It's just a pattern that matches anything without binding it.
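
One observable consequence: since `_` binds nothing, a freshly created value matched against it is dropped at the end of the statement, while a real name like `_kept` keeps it alive until the end of the scope. A small sketch to observe this, where `Noisy` and `drop_order` are hypothetical names made up for the demonstration:

```rust
use std::cell::RefCell;

// Hypothetical type that records its own destruction in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        // `_` is a pattern that binds nothing: the value dies right here.
        let _ = Noisy { name: "ignored", log: &log };
        // `_kept` is a real variable: the value lives until the block ends.
        let _kept = Noisy { name: "kept", log: &log };
        log.borrow_mut().push("end of block".to_string());
    }
    log.into_inner()
}
```

Calling `drop_order()` returns `["drop ignored", "end of block", "drop kept"]`.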

## Iterators

In C++, an iterator is a pointer-like object that we advance with `++` until it reaches some other iterator, like the `.end()` of a collection. In Rust, an iterator knows by itself where it ends (a slice iterator, for example, holds both start and end pointers) and provides a `.next` method which returns an `Option`: `None` if we have reached the end, and `Some(next_elem)` otherwise:

In [30]:
{
    let v = vec![1, 2, 5];
    let mut v_iter = v.iter();
    let mut v_iter2 = v.iter();
    println!("{:?}", v_iter.next());
    println!("{:?}", v_iter2.next());
    println!("{:?}", v_iter.next());
    println!("{:?}", v_iter.next());
    println!("{:?}", v_iter.next());
    println!("{:?}", v_iter.next());
};

Some(1)
Some(1)
Some(2)
Some(5)
None
None


We can implement `Iterator` for our own types. Let's make a type similar to ranges:

In [33]:
struct MyRange {
    start: u64,
    end: u64,
}

impl Iterator for MyRange {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        if self.start >= self.end {
            return None;
        }
        self.start += 1;
        Some(self.start - 1)
    }
}

let mut r = MyRange { start: 5, end: 8 };
println!("{:?}", r.next());
println!("{:?}", r.next());
println!("{:?}", r.next());
println!("{:?}", r.next());
println!("{:?}", r.next());
(r.start, r.end)

Some(5)
Some(6)
Some(7)
None
None


(8, 8)

This allows us to use it in a `for` loop:

In [35]:
let r = MyRange { start: 5, end: 8 };
for x in r {
    println!("{x}");
};

5
6
7
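
Roughly speaking, a `for` loop just keeps calling `.next()` until it gets `None`. As a sketch of that idea (not the exact desugaring the compiler performs), the loop above could be written by hand with `while let`; `collect_manually` is a hypothetical helper name:

```rust
struct MyRange {
    start: u64,
    end: u64,
}

impl Iterator for MyRange {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        if self.start >= self.end {
            return None;
        }
        self.start += 1;
        Some(self.start - 1)
    }
}

// Hypothetical helper: gathers the items by calling `.next()` explicitly.
fn collect_manually(mut r: MyRange) -> Vec<u64> {
    let mut items = Vec::new();
    // `Some(x)` is a refutable pattern; the loop stops at the first `None`.
    while let Some(x) = r.next() {
        items.push(x);
    }
    items
}
```

`collect_manually(MyRange { start: 5, end: 8 })` visits the same items as the `for` loop: 5, 6, and 7.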


In addition to `for`, there are some default methods on the `Iterator` trait which we can call. For example, `.count` calls `.next()` until it reaches the end and reports the number of elements:

In [36]:
MyRange { start: 5, end: 8 }.count()

3

Or `.collect`, which converts an iterator to a collection, like a vector, a hash map, ...

In [39]:
let mut x: Vec<u64> = MyRange { start: 5, end: 8 }.collect();
x[1] += 10;
x

[5, 16, 7]
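
The target collection is chosen by the type we ask for. A small sketch with two hypothetical helpers, collecting the same kind of iterator into a `HashSet` and (using `.enumerate` from earlier to get key-value pairs) into a `HashMap`:

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical helper: duplicates disappear because a set stores each value once.
fn collect_into_set(items: Vec<u64>) -> HashSet<u64> {
    items.into_iter().collect()
}

// Hypothetical helper: an iterator of (key, value) tuples collects into a map.
fn collect_into_map(items: Vec<u64>) -> HashMap<usize, u64> {
    items.into_iter().enumerate().collect()
}
```

For instance, `collect_into_set(vec![5, 5, 7])` has two elements, and `collect_into_map(vec![5, 16, 7])` maps index `1` to `16`.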

Or `.skip(n)`, which returns an iterator that discards the first `n` elements and then yields the rest:

In [41]:
let mut x = MyRange { start: 5, end: 10 }.skip(3);
x.next()

Some(8)

We can implement `Iterator` methods ourselves to make them faster:

In [47]:
struct MyRangeFast {
    start: u64,
    end: u64,
}

impl Iterator for MyRangeFast {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        if self.start >= self.end {
            return None;
        }
        self.start += 1;
        Some(self.start - 1)
    }

    fn count(self) -> usize {
        usize::try_from(self.end - self.start).unwrap()
    }
}

MyRangeFast {
    start: 0,
    end: 1_000_000_000_000_000_000,
}.count()

1000000000000000000

Calling `.count()` with the same bounds on `MyRange` could take years to execute. Since people are allowed to override these methods, there is no guarantee that the overrides behave correctly. In fact, our example is incorrect: the subtraction overflows if `start` is after `end`. Integer overflow in Rust causes a panic in debug builds and wraps around (two's complement) in release builds by default; unlike signed overflow in C/C++, it is not UB:

In [48]:
:preserve_vars_on_panic 1

Preserve vars on panic: true


In [49]:
MyRangeFast {
    start: 10,
    end: 5,
}.count()

thread '<unnamed>' panicked at 'attempt to subtract with overflow', src/lib.rs:83:25
stack backtrace:
   0: rust_begin_unwind
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:142:14
   2: core::panicking::panic
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:48:5
   3: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
   4: run_user_code_36
   5: evcxr::runtime::Runtime::run_loop
   6: evcxr::runtime::runtime_hook
   7: evcxr_jupyter::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


But the standard library does it right:

In [51]:
(0..1_000_000_000_000_000_000_u64).count()

1000000000000000000

In [52]:
(10..5).count()

0
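
One possible way to make our own override behave the same, as a sketch: replace the plain subtraction with `saturating_sub`, which yields `0` instead of overflowing when `start` is past `end`. The `unwrap` is kept from the original and still assumes the difference fits in `usize`:

```rust
struct MyRangeFast {
    start: u64,
    end: u64,
}

impl Iterator for MyRangeFast {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        if self.start >= self.end {
            return None;
        }
        self.start += 1;
        Some(self.start - 1)
    }

    fn count(self) -> usize {
        // An empty (reversed) range now counts as 0 instead of overflowing.
        usize::try_from(self.end.saturating_sub(self.start)).unwrap()
    }
}
```

With this version, `MyRangeFast { start: 10, end: 5 }.count()` returns `0` in both debug and release builds.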

There is another family of methods on the `Iterator` trait, called iterator adapters. Let's start with `.filter`:

In [53]:
(0..100).filter(|x| x % 2 == 0).count()

50

It accepts a function which returns a boolean, and returns a new iterator consisting only of the elements for which that function returns true. To see it better, we can collect it into a `Vec`:

In [54]:
(0..100).filter(|x| x % 2 == 0).collect::<Vec<_>>()

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]
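
To get a feeling for what such an adapter is, here is a hand-written sketch of one. `MyFilter` and `my_filter` are hypothetical names, and this is not the standard library's code, only the same idea: a struct that stores the inner iterator and the predicate, and does its work inside `next`:

```rust
struct MyFilter<I, P> {
    iter: I,
    predicate: P,
}

impl<I, P> Iterator for MyFilter<I, P>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        // Pull from the inner iterator until an item passes the predicate.
        while let Some(item) = self.iter.next() {
            if (self.predicate)(&item) {
                return Some(item);
            }
        }
        None
    }
}

// Hypothetical constructor: building the adapter touches no items at all.
fn my_filter<I, P>(iter: I, predicate: P) -> MyFilter<I, P>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    MyFilter { iter, predicate }
}
```

`my_filter(0..100, |x| x % 2 == 0).count()` gives `50`, the same as the built-in `.filter` above.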

Another adapter is `.map`, which transforms the iterator:

In [56]:
(0..20).map(|x| (x / 2, x % 2)).collect::<Vec<_>>()

[(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1), (4, 0), (4, 1), (5, 0), (5, 1), (6, 0), (6, 1), (7, 0), (7, 1), (8, 0), (8, 1), (9, 0), (9, 1)]

We can chain the iterator adapters:

In [57]:
(0..20)
    .filter(|x| x % 2 == 0)
    .map(|x| (x / 5, x % 5))
    .collect::<Vec<_>>()

[(0, 0), (0, 2), (0, 4), (1, 1), (1, 3), (2, 0), (2, 2), (2, 4), (3, 1), (3, 3)]

Iterator adapters are lazy and do nothing until someone calls `.next` on them (which here is `.collect` or `.count`). We can see that by printing inside the closures:

In [58]:
(0..5)
    .map(|x| {
        println!("{x}");
        x + 100
    })
    .collect::<Vec<_>>()

0
1
2
3
4


[100, 101, 102, 103, 104]

In [59]:
(0..5)
    .map(|x| {
        println!("{x}");
        x + 100
    })

Map { iter: 0..5 }

And nothing was printed! Lazy iterators are how Rust achieves ergonomics and safety at the same time. The performance of iterator adapters (after optimization) is typically on par with manual `for` loops. To show that, let's see the assembly of this function:

In [63]:
pub fn foo(x: Vec<i32>) -> i32 {
    x
        .iter()
        .filter(|x| **x < 10)
        .map(|x| x * x)
        .sum::<i32>()
}

foo(vec![1, 2, 12, 3])

14

```
example::foo:
        addi    sp, sp, -16
        sd      ra, 8(sp)
        sd      s0, 0(sp)
        mv      a1, a0
        ld      a2, 16(a0)
        ld      a0, 0(a0)
        li      s0, 0
        beqz    a2, .LBB0_5
        slli    a2, a2, 2
        li      a3, 10
        mv      a4, a0
        j       .LBB0_3
.LBB0_2:
        addi    a2, a2, -4
        addi    a4, a4, 4
        beqz    a2, .LBB0_5
.LBB0_3:
        lw      a5, 0(a4)
        bge     a5, a3, .LBB0_2
        mulw    a5, a5, a5
        addw    s0, s0, a5
        j       .LBB0_2
.LBB0_5:
        ld      a1, 8(a1)
        beqz    a1, .LBB0_8
        slli    a1, a1, 2
        beqz    a1, .LBB0_8
        li      a2, 4
        call    __rust_dealloc@plt
.LBB0_8:
        mv      a0, s0
        ld      ra, 8(sp)
        ld      s0, 0(sp)
        addi    sp, sp, 16
        ret
```

This is RISC-V assembly rather than x86, which IMO is easier to read. As you can see, there is no function call (except the one which deallocates the vector), and it is a really small and simple loop.
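
For comparison, this is the kind of manual loop the iterator chain competes with. It is only a sketch: `foo_iter` and `foo_loop` are hypothetical names, and the assembly above was produced for the iterator version, so treat any claim about the loop version's machine code as an expectation rather than a measurement:

```rust
// The iterator version from above, under a different name.
pub fn foo_iter(x: Vec<i32>) -> i32 {
    x.iter().filter(|x| **x < 10).map(|x| x * x).sum::<i32>()
}

// A hand-written equivalent: same filtering, same squaring, same sum.
pub fn foo_loop(x: Vec<i32>) -> i32 {
    let mut total = 0;
    for &item in &x {
        if item < 10 {
            total += item * item;
        }
    }
    total
}
```

Both functions return `14` for `vec![1, 2, 12, 3]`.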

If Rust iterators were not lazy (unlike, say, JS `.map` and friends, which are eager), every adapter would be forced to allocate its output items in something like a vector, which would be very slow. Because they are lazy, we can even have infinite iterators in constant memory! For example, here we find the first 100 primes:

In [69]:
fn is_prime(x: i64) -> bool {
    if x < 2 {
        return false;
    }
    !(2..x).take_while(|i| i * i <= x).any(|i| x % i == 0)
}

{
    // prime_iterator is almost infinite (it has at least 10^15 elements), but
    // since iterators are lazy, we can store it.
    let prime_iterator = (2..).filter(|x| is_prime(*x));
    // here `.collect` without `.take` would loop until OOM
    prime_iterator.take(100).collect::<Vec<_>>()
}

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541]
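
We can also write such an unbounded iterator ourselves. As a sketch, here is a hypothetical `Fib` type that yields Fibonacci numbers while storing only two integers. It is not truly infinite, since a `u64` eventually overflows, so it uses `checked_add` and simply ends at that point:

```rust
// Hypothetical iterator: its whole state is two numbers.
struct Fib {
    current: u64,
    next: u64,
}

impl Iterator for Fib {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        let result = self.current;
        // `checked_add` returns `None` on overflow, and `?` then ends the iteration.
        let after_next = self.current.checked_add(self.next)?;
        self.current = self.next;
        self.next = after_next;
        Some(result)
    }
}

// Hypothetical helper: the first `n` even Fibonacci numbers.
fn even_fibs(n: usize) -> Vec<u64> {
    Fib { current: 0, next: 1 }
        .filter(|x| x % 2 == 0)
        .take(n)
        .collect()
}
```

`even_fibs(5)` returns `[0, 2, 8, 34, 144]`, and only as many Fibonacci numbers as needed are ever computed.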