# Monadic programming examples

Anton Antonov   
[MathematicaForPrediction at WordPress](https://mathematicaforprediction.wordpress.com)  
[RakuForPrediction at WordPress](https://rakuforprediction.wordpress.com)   
November, December 2025

----

## Introduction

This document ([notebook](https://github.com/antononcube/RakuForPrediction-blog/blob/main/Notebooks/Jupyter/Monadic-programming-examples.ipynb)) has examples of monadic pipelines for computational workflows in Raku. It extends the blog post ["Monad laws in Raku"](https://rakuforprediction.wordpress.com/2025/11/16/monad-laws-in-raku/), [AA2], ([notebook](https://github.com/antononcube/RakuForPrediction-blog/blob/main/Notebooks/Jupyter/Monad-laws-in-Raku.ipynb)), with "real-life" examples.

### Context

As mentioned in [AA2], here is a list of the applications of monadic programming we consider:

1. Graceful failure handling

2. Rapid specification of computational workflows

3. Algebraic structure of written code

**Remark:** Those applications are discussed in [AAv5] (and its future Raku version).
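
To make Point 1 concrete, here is a minimal sketch in core Raku. It relies only on the documented behavior of `andthen`: the right-hand side is evaluated only when the left-hand side is defined, with the left value available as `$_`. The helper `safe-sqrt` is a made-up name used for illustration.

```raku
# Hypothetical helper: returns Nil (an undefined value) instead of failing loudly.
sub safe-sqrt($x) { $x >= 0 ?? sqrt($x) !! Nil }

# Each input goes through the same two-step pipeline.
# When a step yields an undefined value, `andthen` skips the remaining steps
# and produces Empty, which simply vanishes from the resulting list.
my @results = (16, -16, 9).map({ $_ andthen safe-sqrt($_) andthen $_ + 1 });

say @results;   # [5 4]
```

The negative input does not raise an error and does not reach the `+ 1` step; it just drops out of the result.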

As [a tool maker for Data Science (DS) and Machine Learning (ML)](https://raku-advent.blog/2025/12/02/day-2-doing-data-science-with-raku/), [AA3], 
I am very interested in Point 1; but as a "simple data scientist" I am mostly interested in Point 2.

That said, a large part of my Raku programming has been dedicated to rapid and reliable code generation for DS and ML by leveraging the algebraic structure of corresponding software monads, i.e. Point 3. (See [AAv2, AAv3, AAv4].) For me, first and foremost, ***monadic programming pipelines are just convenient interfaces to computational workflows***. Often I make software packages that allow "easy", linear workflows that can have very involved computational steps and multiple tuning options.

### Dictionary

- **Monadic programming**   
  A method for organizing computations as a series of steps, where each step generates a value along with additional information about the computation, such as possible failures, non-determinism, or side effects. See [Wk1].

- **Monadic pipeline**   
  Chaining of operations with a certain syntax. Monad laws apply loosely (or strongly) to that chaining. 

- **Uniform Function Call Syntax (UFCS)**  
  A feature that allows both free functions and member functions to be called using the same `object.function()` method call syntax. 

- **Method-like call**   
  Same as UFCS. A Raku example: `[3, 4, 5].&f1.$f2`.
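
The method-like call from the last entry can be tried with a small sketch. The names `f1` and `$f2` come from the example above; their bodies are assumptions chosen only to make the snippet runnable.

```raku
sub f1(@xs) { @xs.map(* + 1).List }   # a free function (assumed body)
my $f2 = { .sum };                     # a code block stored in a variable (assumed body)

# Method-like calls: the value on the left becomes the first argument.
my $chained = [3, 4, 5].&f1.$f2;

# The same computation written as ordinary nested calls.
my $nested = $f2(f1([3, 4, 5]));

say $chained;              # 15
say $chained == $nested;   # True
```

The chained form reads left to right in the order the steps are executed, which is the property the pipelines in this document rely on.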

---

## Setup

Here we load the packages used in this notebook:

In [59]:
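use Data::Generators;   # provides random-pet-name (used in "Prefix trees")
use Data::Reshapers;
use Data::TypeSystem;
use Data::Translators;

use DSL::Translators;
use DSL::Examples;

use LLM::Functions;     # provides llm-example-function (used in "LLM generated")

use ML::SparseMatrixRecommender;
use ML::TriesWithFrequencies;

use Hilite::Simple;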

---

## Prefix trees

Here is a list of steps:

- Make a prefix tree (trie) with frequencies using word splitting over `@words1`
- Merge the trie with another trie made over `@words2`
- Convert the node frequencies into probabilities
- Shrink the trie (i.e. find the "prefixes")
- Show the tree-form of the trie

Let us make two small lists of pet names (used by Raku or Perl fans):

In [6]:
sink my @words1 = random-pet-name(*)».lc.grep(/ ^ perl /);
sink my @words2 = random-pet-name(*)».lc.grep(/ ^ [ra [k|c] | camel ] /);

Here we make a trie (prefix tree) for those pet names using the feed operator and the functions of ["ML::TriesWithFrequencies"](https://raku.land/zef:antononcube/ML::TriesWithFrequencies):

In [121]:
@words1 ==>
trie-create-by-split ==>
trie-merge(@words2.&trie-create-by-split) ==>
trie-node-probabilities ==>
trie-shrink ==>
trie-say

TRIEROOT => 1
├─camel => 0.10526315789473684
│ ├─ia => 0.5
│ └─o => 0.5
├─perl => 0.2631578947368421
│ ├─a => 0.2
│ ├─e => 0.2
│ └─ita => 0.2
└─ra => 0.631578947368421
  ├─c => 0.75
  │ ├─er => 0.2222222222222222
  │ ├─he => 0.5555555555555556
  │ │ ├─al => 0.2
  │ │ └─l => 0.8
  │ │   └─  => 0.5
  │ │     ├─(ray ray) => 0.5
  │ │     └─ray => 0.5
  │ ├─ie => 0.1111111111111111
  │ └─ket => 0.1111111111111111
  └─k => 0.25
    ├─i => 0.3333333333333333
    └─sha => 0.6666666666666666


Using `andthen` and the `Trie` class methods (but skipping node-probabilities calculation in order to see the counts):

In [120]:
@words1
andthen .&trie-create-by-split
andthen .merge( @words2.&trie-create-by-split )
# andthen .node-probabilities
andthen .shrink
andthen .form

TRIEROOT => 19
├─camel => 2
│ ├─ia => 1
│ └─o => 1
├─perl => 5
│ ├─a => 1
│ ├─e => 1
│ └─ita => 1
└─ra => 12
  ├─c => 9
  │ ├─er => 2
  │ ├─he => 5
  │ │ ├─al => 1
  │ │ └─l => 4
  │ │   └─  => 2
  │ │     ├─(ray ray) => 1
  │ │     └─ray => 1
  │ ├─ie => 1
  │ └─ket => 1
  └─k => 3
    ├─i => 1
    └─sha => 2

---- 

## Data wrangling

One appealing way to show that monadic pipelines result in clean and readable code is to demonstrate their use in Raku through data wrangling operations.
Here we get the Titanic dataset, show its structure, and show a sample of its rows:

In [100]:
#% html
my @dsTitanic = get-titanic-dataset();
my @field-names = <id passengerClass passengerSex passengerAge passengerSurvival>;

say deduce-type(@dsTitanic);

@dsTitanic.pick(6) 
==> to-html(:@field-names)

Vector(Assoc(Atom((Str)), Atom((Str)), 5), 1309)


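| id | passengerClass | passengerSex | passengerAge | passengerSurvival |
|-----|-----|--------|----|----------|
| 904 | 3rd | female | -1 | died |
| 285 | 1st | female | 60 | survived |
| 418 | 2nd | male | 30 | died |
| 201 | 1st | male | 50 | died |
| 30 | 1st | male | 30 | survived |
| 443 | 2nd | male | 20 | died |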


Here is an `andthen` data wrangling monadic pipeline, the lines of which have the following interpretations:

- Initial pipeline value (the dataset)
- Rename columns
- Filter rows (with age greater than or equal to 10)
- Group by the values of the columns "sex" and "survival"
- Show the structure of the pipeline value
- Give the sizes of each group as a result

In [104]:
@dsTitanic 
andthen rename-columns($_,  {passengerAge => 'age', passengerSex => 'sex', passengerSurvival => 'survival'})
andthen $_.grep(*<age> ≥ 10).List
andthen group-by($_, <sex survival>)
andthen {say "Dataset type: ", deduce-type($_); $_}($_)
andthen $_».elems

Dataset type: Struct([female.died, female.survived, male.died, male.survived], [Array, Array, Array, Array])


{female.died => 88, female.survived => 272, male.died => 512, male.survived => 118}

**Remark:** The `andthen` pipeline corresponds to the R pipeline in the second section.
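
For readers who do not have the data packages installed, the same filter, group, and count shape can be sketched with core Raku only. The records below are made up for illustration (they are not rows of the Titanic dataset), and `classify` stands in for `group-by`.

```raku
# Hypothetical stand-in records with already-renamed columns.
my @data =
    { sex => 'female', age => 30, survival => 'survived' },
    { sex => 'male',   age => 5,  survival => 'survived' },
    { sex => 'male',   age => 40, survival => 'died' },
    { sex => 'male',   age => 22, survival => 'died' };

my $counts = (
    @data
    andthen .grep(*<age> ≥ 10)                          # filter rows
    andthen .classify({ .<sex survival>.join('.') })    # group by two columns
    andthen $_».elems                                   # size of each group
);

say $counts;
```

The record with age 5 is filtered out, so the remaining groups are `female.survived` with one element and `male.died` with two.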

A similar result can be obtained via cross-tabulation and using a pipeline with the feed (`==>`) operator:

In [105]:
@dsTitanic
==> { .grep(*<passengerAge> ≥ 10) }()
==> { cross-tabulate($_, 'passengerSex', 'passengerSurvival') }()
==> to-pretty-table()

+--------+------+----------+
|        | died | survived |
+--------+------+----------+
| female |  88  |   272    |
| male   | 512  |   118    |
+--------+------+----------+

----

## Data wrangling code with multiple languages and packages

Let us demonstrate the *rapid specification of workflows* application by generating data wrangling code from natural language commands. Here is a natural language workflow spec (each row corresponds to a pipeline segment):

In [106]:
sink my $commands = q:to/END/;
use dataset dfTitanic;
rename columns passengerAge as age, passengerSex as sex, passengerClass as class;
filter by age ≥ 10;
group by 'class' and 'sex';
counts;
END

### Grammar based interpreters

Here is a table with the generated code for different programming languages according to the spec above (using ["DSL::English::DataQueryWorkflows"](https://raku.land/zef:antononcube/DSL::English::DataQueryWorkflows), [AAp3]): 

In [107]:
#% html
my @tbl = <Python R Raku WL>.map({ %( language => $_, code => ToDSLCode($commands, format=>'code', target => $_) ) });
to-html(@tbl, field-names => <language code>, align => 'left').subst("\n", '<br>', :g)

| language | code |
|----------|------|
| Python | `obj = dfTitanic.copy()`<br>`obj = obj.assign( age = obj["passengerAge"], sex = obj["passengerSex"], class = obj["passengerClass"] )`<br>`obj = obj[((obj["age"]>= 10))]`<br>`obj = obj.groupby(["class", "sex"])`<br>`obj = obj.size()` |
| R | `dfTitanic %>%`<br>`dplyr::rename(age = passengerAge, sex = passengerSex, class = passengerClass) %>%`<br>`dplyr::filter(age >= 10) %>%`<br>`dplyr::group_by(class, sex) %>%`<br>`dplyr::count()` |
| Raku | `$obj = dfTitanic ;`<br>`$obj = rename-columns( $obj, %("passengerAge" => "age", "passengerSex" => "sex", "passengerClass" => "class") ) ;`<br>`$obj = $obj.grep({ $_{"age"} >= 10 }).Array ;`<br>`$obj = group-by($obj, ("class", "sex")) ;`<br>`$obj = $obj>>.elems` |
| WL | `obj = dfTitanic;`<br>`obj = Map[ Join[ KeyDrop[ #, {"passengerAge", "passengerSex", "passengerClass"} ], <\|"age" -> #["passengerAge"], "sex" -> #["passengerSex"], "class" -> #["passengerClass"]\|> ]&, obj];`<br>`obj = Select[ obj, #["age"] >= 10 & ];`<br>`obj = GroupBy[ obj, {#["class"], #["sex"]}& ];`<br>`obj = Map[ Length, obj]` |


Executing the Raku pipeline (by replacing `dfTitanic` with `@dsTitanic` first):

In [12]:
my $obj = @dsTitanic;
$obj = rename-columns( $obj, %("passengerAge" => "age", "passengerSex" => "sex", "passengerClass" => "class") ) ;
$obj = $obj.grep({ $_{"age"} >= 10 }).Array ;
$obj = group-by($obj, ("class", "sex")) ;
$obj = $obj>>.elems

{1st.female => 132, 1st.male => 149, 2nd.female => 96, 2nd.male => 149, 3rd.female => 132, 3rd.male => 332}

That is not monadic, of course -- see the monadic version above.

----

## LLM generated (via DSL examples)

Here we define an LLM-examples function for translation of natural language commands into code using DSL examples (provided by "DSL::Examples"):

In [13]:
my sub llm-pipeline-segment($lang, $workflow-name = 'DataReshaping') { llm-example-function(dsl-examples(){$lang}{$workflow-name}) };

&llm-pipeline-segment

Here is the LLM translated code:

In [14]:
my $code = llm-pipeline-segment('Raku', 'DataReshaping')($commands)

```perl
my $obj = dfTitanic;
$obj = rename-columns($obj, %(passengerAge => 'age', passengerSex => 'sex', passengerClass => 'class'));
$obj = $obj.grep({ $_{'age'} >= 10 }).Array;
$obj = group-by($obj, ('class', 'sex'));
say $obj>>.elems;
```

Here the translated code *is turned into monadic code* by string manipulation:

In [30]:
my $code-mon =$code.subst(/ $<lhs>=('$' \w+) \h+ '=' \h+ (\S*)? $<lhs> (<-[;]>*) ';'/, {"==> \{{$0}\$_{$1} \}()"} ):g;
$code-mon .= subst(/ $<printer>=[note|say] \h* $<lhs>=('$' \w+) ['>>'|»] '.elems' /, {"==> \{$<printer> \$_>>.elems\}()"}):g;

```perl
my $obj = dfTitanic;
==> {rename-columns($_, %(passengerAge => 'age', passengerSex => 'sex', passengerClass => 'class')) }()
==> {$_.grep({ $_{'age'} >= 10 }).Array }()
==> {group-by($_, ('class', 'sex')) }()
==> {say $_>>.elems}();
```

**Remark:** The string manipulation above is meant to give insight into how and why monadic pipelines simplify imperative code.
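
The correspondence between the two styles can also be checked on a toy computation. The sketch below uses core Raku only and made-up steps (keep even numbers, square them, sum them); it is not produced by the LLM function or by the substitutions above.

```raku
my @xs = 1..10;

# Imperative style: one variable is re-assigned at every step.
my $obj = @xs;
$obj = $obj.grep(* %% 2).List;
$obj = $obj.map({ $_ * $_ }).List;
$obj = $obj.sum;

# Pipeline style: each step receives the previous value as $_,
# so the intermediate variable disappears.
my $res = (
    @xs
    andthen .grep(* %% 2).List
    andthen .map({ $_ * $_ }).List
    andthen .sum
);

say $obj == $res;   # True
```

Every `$obj = f($obj)` line maps to one pipeline segment, which is the same rewriting the regex substitutions perform on the generated code.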

----

## Recommendation pipeline

Here is a computational specification for creating a recommender and obtaining a profile recommendation:

In [108]:
sink my $spec = q:to/END/;
create from @dsTitanic; 
apply LSI functions IDF, None, Cosine; 
recommend by profile for passengerSex:male, and passengerClass:1st;
join across with @dsTitanic on "id";
echo the pipeline value;
END

Here is the Raku code for that spec given as an HTML code snippet with syntax highlighting:

In [112]:
#%html
ToDSLCode($spec, default-targets-spec => 'Raku', format => 'code')
andthen .subst('.', "\n.", :g)
andthen hilite($_)

Here we execute a slightly modified version of the pipeline:

In [None]:
sink my $obj = ML::SparseMatrixRecommender.new
.create-from-wide-form(@dsTitanic)
.apply-term-weight-functions("IDF", "None", "Cosine")
.recommend-by-profile(["passengerSex:male", "passengerClass:1st"])
.join-across(@dsTitanic, on => "id" )
.echo-value(as => {to-pretty-table($_)} )
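
The chaining style of that pipeline can be illustrated without the recommender package. The class below is a hypothetical, minimal sketch (it is not how "ML::SparseMatrixRecommender" is implemented): every method stores its result in the object and returns the object itself, so the next call can continue from the current pipeline value.

```raku
# Hypothetical pipeline object: all names here are made up for illustration.
class TinyPipeline {
    has $.value;

    method start-with(@data)   { $!value = @data.List;                    self }
    method keep-if(&predicate) { $!value = $!value.grep(&predicate).List; self }
    method total()             { $!value = $!value.sum;                   self }
    method echo-value()        { say 'value: ', $!value.gist;             self }
}

my $p = TinyPipeline.new
    .start-with([1, 2, 3, 4, 5])
    .keep-if(* > 2)
    .echo-value
    .total
    .echo-value;

say $p.value;   # 12
```

Because `echo-value` also returns the object, it can be placed anywhere in the chain to inspect intermediate values without breaking the pipeline.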

----

## Functional parsers (multi-operation pipelines)

It can be said that the package ["FunctionalParsers"](https://raku.land/zef:antononcube/FunctionalParsers), [AAp4], implements multi-operator monadic pipelines for the creation of parsers and interpreters. "FunctionalParsers" achieves that using special infix operators.

In [None]:
use FunctionalParsers :ALL;
my &p1 = {1} ⨀ symbol('one');
my &p2 = {2} ⨀ symbol('two');
my &p3 = {3} ⨀ symbol('three');
my &p4 = {4} ⨀ symbol('four');
my &pH = {10**2} ⨀ symbol('hundred');
my &pT = {10**3} ⨀ symbol('thousand');
my &pM = {10**6} ⨀ symbol('million');
sink my &pNoun = symbol('things') ⨁ symbol('objects');

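Here is a parser -- all three monad operations are used: `⨁` (alternatives), `⨂` (sequential composition), and `⨀` (applying a function to the parsed result):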

In [115]:
# Parse sentences that have (1) a digit part, (2) a multiplier part, and (3) a noun
my &p = (&p1 ⨁ &p2 ⨁ &p3 ⨁ &p4) ⨂ (&pT ⨁ &pH ⨁ &pM) ⨂ &pNoun;

# Interpreter:
# (1) flatten the parsed elements
# (2) multiply the first two elements and make a sentence with the third element
sink &p = { "{$_[0] * $_[1]} $_[2]"} ⨀ {.flat} ⨀ &p 

Here the parser is applied to different sentences:

In [116]:
['three million things', 'one hundred objects', 'five thousand things']
andthen .map({ &p($_.words.List).head.tail })
andthen (.say for |$_)

3000000 things
100 objects
Nil


The last sentence is not parsed because the parser `&p` knows only the digits from 1 to 4.
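
To show what such operators do underneath, here is a small sketch of the three operations as plain functions in core Raku. It is an illustration under simplifying assumptions, not the implementation of "FunctionalParsers": a parser here returns a single `(value, remaining tokens)` list or `Nil`, and the names `p-symbol`, `p-alt`, `p-seq`, and `p-apply` are hypothetical.

```raku
# A parser is a function: token list -> (value, remaining tokens), or Nil on failure.
sub p-symbol(Str $s) {
    -> @tokens {
        @tokens.elems > 0 && @tokens[0] eq $s ?? ($s, @tokens.skip.List) !! Nil
    }
}

# Alternatives: the first parser that succeeds wins.
sub p-alt(&a, &b) { -> @tokens { a(@tokens) // b(@tokens) } }

# Sequence: run &a, then run &b on the tokens that &a left over.
sub p-seq(&a, &b) {
    -> @tokens {
        my $r1 = a(@tokens);
        my $r2 = $r1.defined ?? b($r1[1]) !! Nil;
        $r2.defined ?? (($r1[0], $r2[0]), $r2[1]) !! Nil
    }
}

# Apply: transform the parsed value with &f, keep the remaining tokens.
sub p-apply(&f, &a) {
    -> @tokens {
        my $r = a(@tokens);
        $r.defined ?? (f($r[0]), $r[1]) !! Nil
    }
}

my &digit  = p-alt(p-apply({ 1 }, p-symbol('one')), p-apply({ 3 }, p-symbol('three')));
my &factor = p-apply({ 10 ** 6 }, p-symbol('million'));
my &noun   = p-alt(p-symbol('things'), p-symbol('objects'));

# Parsed value has the shape ((digit, factor), noun).
my &phrase = p-apply(
    -> $v { "{ $v[0][0] * $v[0][1] } $v[1]" },
    p-seq(p-seq(&digit, &factor), &noun)
);

for 'three million things', 'five million things' -> $sentence {
    my $r = phrase($sentence.words.List);
    say $r.defined ?? $r[0] !! "no parse: $sentence";
}
```

The first sentence yields `3000000 things`; the second one fails at the digit step, and the failure propagates through the sequence and apply steps as an undefined value instead of an exception.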

----

## References

### Articles, blog posts

[Wk1] Wikipedia entry: [Monad (functional programming)](https://en.wikipedia.org/wiki/Monad_(functional_programming)), URL: [https://en.wikipedia.org/wiki/Monad_(functional_programming)](https://en.wikipedia.org/wiki/Monad_(functional_programming)).

[Wk2] Wikipedia entry: [Monad transformer](https://en.wikipedia.org/wiki/Monad_transformer), URL: [https://en.wikipedia.org/wiki/Monad_transformer](https://en.wikipedia.org/wiki/Monad_transformer).

[H1] Haskell.org article: [Monad laws](https://wiki.haskell.org/Monad_laws), URL: [https://wiki.haskell.org/Monad_laws](https://wiki.haskell.org/Monad_laws).

[SH2] Sheng Liang, Paul Hudak, Mark Jones, ["Monad transformers and modular interpreters"](http://haskell.cs.yale.edu/wp-content/uploads/2011/02/POPL96-Modular-interpreters.pdf), (1995), Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages. New York, NY: ACM. pp. 333--343. doi:10.1145/199448.199528.

[PW1] Philip Wadler, ["The essence of functional programming"](https://page.mi.fu-berlin.de/scravy/realworldhaskell/materialien/the-essence-of-functional-programming.pdf), (1992), 19th Annual Symposium on Principles of Programming Languages, Albuquerque, New Mexico, January 1992.

[RW1] Hadley Wickham et al., [dplyr: A Grammar of Data Manipulation](https://github.com/tidyverse/dplyr), (2014), [tidyverse at GitHub](https://github.com/tidyverse), URL: [https://github.com/tidyverse/dplyr](https://github.com/tidyverse/dplyr).
(See also [http://dplyr.tidyverse.org](http://dplyr.tidyverse.org).)

[AA1] Anton Antonov, ["Monad code generation and extension"](https://mathematicaforprediction.wordpress.com/2017/06/23/monad-code-generation-and-extension/), (2017), [MathematicaForPrediction at WordPress](https://mathematicaforprediction.wordpress.com).

[AA2] Anton Antonov, ["Monad laws in Raku"](https://rakuforprediction.wordpress.com/2025/11/16/monad-laws-in-raku/), (2025), [RakuForPrediction at WordPress](https://rakuforprediction.wordpress.com).

[AA3] Anton Antonov, ["Day 2 – Doing Data Science with Raku"](https://raku-advent.blog/2025/12/02/day-2-doing-data-science-with-raku/), (2025), [Raku Advent Calendar at WordPress](https://raku-advent.blog/).

### Packages

[AAp1] Anton Antonov, [MonadMakers](https://resources.wolframcloud.com/PacletRepository/resources/AntonAntonov/MonadMakers/), Wolfram Language paclet, (2023), [Wolfram Language Paclet Repository](https://resources.wolframcloud.com/PacletRepository/).

[AAp2] Anton Antonov, [StateMonadCodeGenerator](https://github.com/antononcube/R-packages/tree/master/StateMonadCodeGenerator), R package, (2019-2024), 
[GitHub/@antononcube](https://github.com/antononcube/).

[AAp3] Anton Antonov, [DSL::English::DataQueryWorkflows](https://github.com/antononcube/Raku-DSL-English-DataQueryWorkflows), Raku package, (2020-2024), 
[GitHub/@antononcube](https://github.com/antononcube/).

[AAp4] Anton Antonov, [FunctionalParsers](https://github.com/antononcube/Raku-FunctionalParsers), Raku package, (2023-2024), 
[GitHub/@antononcube](https://github.com/antononcube/).

### Videos

[AAv1] Anton Antonov, [Monadic Programming: With Application to Data Analysis, Machine Learning and Language Processing](https://www.youtube.com/watch?v=_cIFA5GHF58), (2017), Wolfram Technology Conference 2017 presentation. [YouTube/WolframResearch](https://www.youtube.com/@WolframResearch).

[AAv2] Anton Antonov, [Raku for Prediction](https://www.youtube.com/watch?v=frpCBjbQtnA), (2021), [The Raku Conference 2021](https://www.youtube.com/@therakuconference6823).

[AAv3] Anton Antonov, [Simplified Machine Learning Workflows Overview](https://www.youtube.com/watch?v=Xy7eV8wRLbE), (2022), Wolfram Technology Conference 2022 presentation. [YouTube/WolframResearch](https://www.youtube.com/@WolframResearch).

[AAv4] Anton Antonov, [Simplified Machine Learning Workflows Overview (Raku-centric)](https://www.youtube.com/watch?v=p3iwPsc6e74), (2022), Wolfram Technology Conference 2022 presentation. [YouTube/@AAA4prediction](https://www.youtube.com/@AAA4prediction).

[AAv5] Anton Antonov, [Applications of Monadic Programming, Part 1, Questions & Answers](https://www.youtube.com/watch?v=Xz5B4B0kVco), (2025), [YouTube/@AAA4prediction](https://www.youtube.com/@AAA4prediction).
