Skip to content
Permalink
v2019.02.10-mo…
Switch branches/tags
Go to file
 
 
Cannot retrieve contributors at this time

title: Using Monoids in C++ v1.3

author: Kristoffel Pirard

url: http://github.com/xtofl/articles/monoid/monoid.md

[(single-page version)](?print-pdf&pdfSeparateFragments=false)
This slide deck is intended for use with reveal.js;
``` articles> make monoid/monoid.reveal ```

Using Monoids in C++

pig throwing flares at itself

credit: Bartosz Milewski

Why am I doing this?

  • we, developers, need
    • proven methodologies (scaling the dev process)
    • reuse, over language boundaries
  • mathematicians: building theorems for ages
  • we don't understand Mathematese and FP-ish
    • cf. the iota discussions lately
  • mathematicians don't speak Developese (nobody does)

--

Why am I doing this

  • I'm enthousiastic (a bit nerdy)
  • I think we will benefit (FP matters)
  • Can you help me?
    • I want to help you taking the first hurdle
    • Let's take the rest together!

--

What I want to achieve

whetting some appetite for FP

no - fp's not going to solve all your problems

How I got here

  • I have to take care of the family groceries for some weeks
  • finding my way around a store???
  • always too much in my cart
  • always important stuff missing
  • stores are crowded from 10am.

Note: < 3 minutes

--

But... what groceries do I need?

  • Week menu
  • Recipes
  • Pantry

--

  • 5 chipolata's, 2kg potatoes, 5 apples, 2 packs of pasta, 400g of minced meat, grated cheese, chicken breast, basmati rice, curry sauce ...
  • O - we still got potatoes.
  • ... and 5 packs of curry sauce

--

Week

after week

after week

can't a computer do that?

--

I wish I could do...

> python3 growser.py
o currysaus <1 pak>
o chipolata <1 pak>
o basmati <1 kg>
...

--

shopping_list_menu = resulting_list(all_dishes, pantry)
shopping_list = join_ingredients(shopping_list_menu, extras)
print_ingredients(shopping_list, shop=the_shop)

--

But look at that code... :(

cart = {}
for dish in menu:
    for i in dish.ingredients:
        if not i.name in cart:
            cart[i.name] = i.amount
        else:
            assert
            cart[i.name].amount.unit == i.amount.unit
            cart[i.name].amount.n += i.amount.n

--

Meanwhile

Reading about Category Theory, bumping into terms like

  • Monoid
  • Functor (+ the great - debate)
  • Monad
  • Applicative
  • ...

wooow.... scary!

--

Discussions about 'vacuous truth/falsity'

* all({true, false, true}) == false
* all({true}) == true
* all({}) == ????

--

</How I got here>

--

And then something 'clicked'.

cart = {}
for dish in menu:
    for i in dish.ingredients:
        if not i.name in cart:
            cart[i.name] = i.amount
        else:
            assert
            cart[i.name].amount.unit == i.amount.unit
            cart[i.name].amount.n += i.amount.n

Imagine using a default-dict here.

And a way to 'add' dicts, key-wise...

--

cart = sum(dish.ingredients for dish in menu)

How come I didn't think of that sooner?

--

Why the Detour?

  • my imperative background
  • lack of concepts to express this

--

Why lack of Concepts?

... dreaming during math class?

The 'math' lingo does not map onto 'programming'.

Lack of _application_

--

Math Lingo

math lingo

credit: wikipedia

--

Misunderstanding

= lingo mismatch

prof and dev misunderstanding

credit: Jona

--

Anger

prof and dev angry

credit: Jona

Note: yoda may fit in here.

--

Knowledge for the win

  • Scientists tend to be clever
  • Common vocabulary => more collaboration
    • with scientists
    • amongst developers

--

This presentation is

An attempt to take away some fear and some dismay

=> small steps!, real-life *!

adapted from -

--

This may look like part of a Category Theory course

  • Monoid <-- we are here
  • Functor
  • Applicative
  • Monad

but it's not.

--

But we may need such courses

dev at math course

credit: Jona

What are Monoids

--

Monoid: back to school

  • Back to school: addition
    • 1 + 2 == 3
    • 234225 + 123415115 == 123649340
    • 1 + (2 + 3) == (1 + 2) + 3
    • 0 + x == x
    • x + 0 == x

--

Monoid: back to school

  • Back to school: multiplication
    • 3 · 2 == 6
    • 165 · 23 == 3795
    • 4 · ( 2 · 3 ) == ( 4 · 2 ) · 3
    • 1 · x == x
    • x · 1 == x

--

Generalizing

  • + and · are binary operations on ℕ
    • closed
    • associative
    • with an identity element (resp. 0 and 1)

--

Monoid Definition

A Monoid is a tuple <S, op, id> so that

  • op(s1, s2) element of S
  • op(s1, op(s2, s3)) == op(op(s1, s2), s3)
  • op(id, s) == op(s, id) == s

--

Generalizing: math notation

Monoid: a tuple <S, ⋄, id> so that

  • ∀ s1, s2, s3 ∈ S
    • (s1 ⋄ s2) ∈ S
    • s1 ⋄ (s2 ⋄ s3) == (s1 ⋄ s2) ⋄ s3
    • id ⋄ s == s == s ⋄ id

--

dev and math guy talking same lingo

credit: Jona

--

Examples: <bool, &&, true>

* a && b is a boolean
* a && (b && c) == (a && b) && c
* a && true == a, true && a == a

--

Examples: <bool, ||, false>

* a || b is a boolean
* a || (b || c) == (a || b) || c
* a || false == a, false || a == a

--

Examples: <string, +, "">

string string::append(string)
string operator+(string, string)
"ab"s + ("cd"s + "ef"s) == ("ab"s + "cd"s) + "ef"s
"ab"s + ""s == "ab"s == ""s + "ab"s

--

Examples: floating point

double operator+(double, double)
.5 + (1. + 2.) == (.5 + 1.) + 2.
.5 + 0. == 0. + .5 == .5

--

Examples: floating point

.1 + (.2 + .3) == (.1 + .2) + .3

NOT associative!

(unless you can live with a small error)

What are Monoids Good for?

--

So... what good is it?

Generic functions, of course!

Functions first defined on Ranges can often be generalized to any monoid.

  • Runtime
  • Compile time
  • Shopping time <!-- .element: class="fragment">

--

What else?

--

What's good about the Laws

  • Closure
  • Associativity
  • An identity element

--

Closure

  • Operation can be chained
    • "define pairwise => get range operation for free"
  • Only 1 type needed
    • less mental burden
    • less template arguments

--

Closure

// explicit knowledge
template<typename InputIt, typename T,
         typename BinaryOperation>
T accumulate(InputIt b, InputIt e, T init,
              BinaryOperation op );
// requires BinaryOperation to satisfy
// T t = op(t, *first)

vs

// implicit knowledge
template<typename InputIt, typename Monoid>
Monoid accumulate(InputIt b, InputIt e);

Note: conservation of complexity: it is moved into the Monoid type

--

Associativity

Divide and Conquer

acc(a, b, c, d) ==
    acc(
        acc(a, b),
        acc(c, d)
    )
  • => parallelization
  • Incrementalism

--

An identity element

  • operation with 'empty' lists (vacuus truth?)
  • allow 'restarting' computation
    • each part can start from "zero"
  • less mental burden

--

Treasure Trove

Theorems for Free!

  • m, n are monoid => algebraic types of m, n, too
    • tuple<int, bool> under (+, ||) with id=(0, false)
  • function composition is a monoid
    • great for pipes and filters pattern!
    • foo(bar(baz(m))) == mconcat({foo, bar, baz})(m)
  • O(log(N)) parser
  • ...

--

But we need translators


Applying it in C++

--

Creating a Monoid in C++

Goal:

Define and use mconcat(begin, end) -> M

auto result = mconcat(begin(xs), end(xs));

--

Different approaches

  • overload operator + and add a 0 constructor
    • very very implicit. monkey-patching, almost
  • template specialization
  • use semantic types (concepts may help!)

Let's take the wrong turn first.

--

Generic mappend and mempty

template<typename T> T mempty();
template<typename T> T mappend(T, T);

template<typename Monoid>
auto mconcat(It b, It e) {
    return accumulate(
        b, e,
        mempty<Monoid>(),
        mappend<Monoid>);
}

--

Specialization for int addition

template<>
int mempty<int>() { return 0; }

template<>
int mappend<int>(int a, int b) { return a + b; }
std::vector<int> ints{{1, 2, 3, 4}};
EXPECT_EQ(10, mconcat(begin(ints), end(ints)));

--

And for a custom type

struct Custom {
    std::string s;
    int n;
};
template<>
Custom mempty<Custom>() { return {}; }
template<>
Custom mappend<Custom>(Custom c, Custom d) {
    return {
        c.s.append(d.s),
        c.n + d.n
    };
}

--

Let's break it

For int product:

template<>
int mempty<int>() { return 1; }
template<>
int mappend<int>(int a, int b) { return a * b; }

Can we have 2 specializations for int?

Didn't think so
// error: redefinition of
// ‘T overloading::mempty() [with T = int]’

--

So some extra info is needed

Specify the monoid set, operator and identity.

auto intsum = monoid(0, std::plus<int>{});

All the info is there!

  • 0 is an int
  • plus<int>

--

    template<typename T_, typename Mappend_t>
    struct Monoid {
        using T = T_;
        T mempty;
        Mappend_t mappend;
    };

Usage:

Monoid<int, std::plus<int>> intsum{0, std::plus<int>{}};

... what a mess :(

--

Generic lambda's to the rescue!

    auto monoid = [](auto e, auto f) {
        return Monoid<decltype(e), decltype(f)>{e, f};
    };

Usage:

auto intsum = monoid(0, std::plus<int>);
auto intproduct = monoid(1, std::multiplies<int>);
EXPECT_EQ(10, intsum.mconcat(ints));
EXPECT_EQ(24, intproduct.mconcat(ints));

--

But isn't it slow?

thanks, quick-bench!

benchmark_result.png

Cf. also [Linear Types](https://meetingcpp.com/mcpp/slides/2018/lin.pdf)/[ligthning talk](https://www.youtube.com/watch?v=sN8tI-zleFI)

--

The Grocery List.

Define some simple data types:

using Name = std::string;
using Amount = int;
using GroceryList = std::map<Name, Amount>

--

Example dishes

name dish1 dish2 dish3
carrots 5 - 10
minced meat 300g 300g -
rice 200g 200g -
spaghetti - - 400g

--

Before

template<typename It>
GroceryList join_grocerylists(It b, It e) {
    static_assert(std::is_same_v<typename It::value_type,
                                 GroceryList>);
    GroceryList result{};
    for( ; b!=e ; ++b) {
        for(const auto &ingredient: b->items) {
            result.items[ingredient.first] += 
                ingredient.second;
        }
    }
    return result;
}

--

The Sum monoid

const auto grocery_monoid = monoid(
    GroceryList{},
    [](auto a, auto b){
        for (const auto &ib: b.items) {
            a.items[ib.first] += ib.second;
        }
        return a;
    });

--

After

template<typename It>
auto join_grocerylists(It b, It e) {
    return mconcat(grocerylist_monoid, b, e);
}

--

But Wait - There's More

From the treasure trove: Algebraic Data Types composed of Monoids are also Monoids.

A map<K, V> resembles an 'infinite struct' of values.

So...

Would map<K, Monoid> also form a Monoid?

--

Imagine we can 'declare' the monoid within a map

template<typename Map, typename Monoid>
auto fmonoid(Monoid m) {
    auto mappend = ...;
    return monoid(
        Map{},
        mappend
    };
}

--

We Can!

// remember, m is a monoid
auto mappend = [=](Map a, Map b) {
    for(const auto& kv: b) {
        auto &xa = find_or_create(a, kv.first, m.mempty);
        auto xb = kv.second;
        xa = m.mappend(xa, xb);
    }
    return a;
}

--

Don't mind the braces...

EXPECT_EQ(
    (IntMap{{{"one", 3}, {"two", 7}, {"three", 13}}}),
    mconcat(
        fmonoid<IntMap>(monoid(0, std::plus<int>{})),
        std::vector<IntMap>({
            {{ {"one", 1}, {"two", 4}, {"three", 9} }},
            {{ {"one", 2}, {"two", 3}, {"three", 4} }}
        })
    )
);

--

After

const auto intmapmonoid =
    lean::fmonoid<std::map<std::string, int>>(
        lean::monoid(0, std::plus<int>{})
    );

const auto grocery_monoid = lean::monoid(
    GroceryList{intmapmonoid.mempty},
    [](auto a, auto b) -> GroceryList {
        return {intmapmonoid.mappend(a.items, b.items)};
    }
);

(We could dig deeper and also extract the { ... .items})

--

C++20: Concepts

template<typename It, typename T>
    requires Monoid<T>
auto mconcat(It b, It e) {...}

--

What if: no Unit

Some times you have no Unit

  • Non-empty lists
  • Counting from 1
  • max

--

Add a Unit

You can create a 'Sum' type using std::variant or std::optional:

  • N = {1, 2, 3, 4, ...}
  • MN = {None} ∪ N

--

Add a Unit

In C++... use an optional<T>

// let m be a monoid:
auto tmonoid = monoid(
    optional<T>{},
    [](auto a, auto b) -> optional<T> {
        return {
            (a && b)
            ? {append(*a, *b)}
            : a
                ? a
                : b ? b : {} };
);

Conclusion

  • Math and Category theory: treasure island
    • Can be used in C++
    • Establish common vocabulary
    • Reduce complexity through known abstractions
  • I'm afraid of the math lingo

--

Translators needed!

--

We need you!

collaboration between math guy and dev

credit: Jona

--

References

--

Credits


Questions