## PDFA

This notebook shows how to use the `PDFA` class,
which represents a probabilistic deterministic finite automaton.

### Example

Utility functions to display SVGs.

In [6]:
import numpy as np
%matplotlib inline
import tempfile
from pathlib import Path
from IPython.display import display, HTML, SVG
from src.pdfa import PDFA
from src.pdfa.render import to_graphviz

_default_svg_style = "display: block; margin-left: auto; margin-right: auto; width: 50%;"
def display_svgs(*filenames, style=_default_svg_style):
    svgs = [SVG(filename=f).data for f in filenames]
    joined_svgs = "".join(svgs)
    no_wrap_div = f'<div style="{style}white-space: nowrap">{joined_svgs}</div>'
    display(HTML(no_wrap_div))

def render_automaton(pdfa: PDFA):
    digraph = to_graphviz(pdfa)
    tmp_dir = tempfile.mkdtemp()
    tmp_filepath = str(Path(tmp_dir, "output"))
    digraph.render(tmp_filepath)
    display_svgs(tmp_filepath + ".svg")

The following automaton generates every sequence made of
zero or more heads (symbol `0`) followed by exactly one tail (symbol `1`).

In [7]:
def make_automaton(p: float = 0.5) -> PDFA:
    """
    Make the PDFA for the heads and tail example.

    :param p: the probability of getting heads.
    :return: the PDFA.
    """
    return PDFA(
        nb_states=1,
        alphabet_size=2,
        transition_dict={
            0: {
                0: (0, p),
                1: (1, 1 - p),
            }
        }
    )

automaton = make_automaton(0.5)
render_automaton(automaton)
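
To make the distribution encoded by this automaton concrete, here is a minimal sketch that computes the probability of a given word directly from a transition dictionary with the same layout as the one above. It does not use the `PDFA` class. The helper `word_probability` is hypothetical, and it assumes that state `1`, the only state without outgoing transitions, is the final state.

```python
from typing import Dict, Sequence, Tuple

# Same layout as `transition_dict` above: state -> symbol -> (next state, probability).
TransitionDict = Dict[int, Dict[int, Tuple[int, float]]]


def word_probability(word: Sequence[int], transitions: TransitionDict, final_state: int) -> float:
    """Probability of generating exactly `word`, starting from state 0 (hypothetical helper)."""
    state, prob = 0, 1.0
    for symbol in word:
        # A word cannot continue past the final state or use a missing transition.
        if state == final_state or symbol not in transitions.get(state, {}):
            return 0.0
        state, step_prob = transitions[state][symbol]
        prob *= step_prob
    # Only words that end in the final state are generated by the automaton.
    return prob if state == final_state else 0.0


p = 0.5
transitions = {0: {0: (0, p), 1: (1, 1 - p)}}
# Assumption: state 1 is the final state.
probs = [word_probability([0] * k + [1], transitions, final_state=1) for k in range(4)]
print(probs)  # [0.5, 0.25, 0.125, 0.0625]
```

With $p = 0.5$, each additional head halves the probability of the word, and any word that does not end with exactly one tail has probability zero.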

Sample a word from the PDFA above.

In [8]:
automaton.sample()

[1]
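
As a rough illustration of what sampling from such an automaton involves, the sketch below walks a transition dictionary from state `0` and draws one symbol per step until it reaches the final state. This is not the implementation of `PDFA.sample`. The function `sample_word` is hypothetical, and it again assumes that state `1` is the final state.

```python
import random
from typing import Dict, List, Optional, Tuple

# Same layout as `transition_dict` above: state -> symbol -> (next state, probability).
TransitionDict = Dict[int, Dict[int, Tuple[int, float]]]


def sample_word(
    transitions: TransitionDict,
    final_state: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Draw one word by following random transitions from state 0 (hypothetical helper)."""
    rng = rng or random.Random()
    state, word = 0, []
    while state != final_state:
        symbols = list(transitions[state])
        weights = [transitions[state][s][1] for s in symbols]
        # Pick the next symbol according to the outgoing transition probabilities.
        symbol = rng.choices(symbols, weights=weights, k=1)[0]
        word.append(symbol)
        state = transitions[state][symbol][0]
    return word


p = 0.5
transitions = {0: {0: (0, p), 1: (1, 1 - p)}}
# Assumption: state 1 is the final state. The seed only makes the run repeatable.
word = sample_word(transitions, final_state=1, rng=random.Random(0))
print(word)
```

Every word drawn this way is a (possibly empty) run of `0`s followed by a single `1`.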

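A trace of length $n$ consists of $n-1$ heads followed by one tail, so it has probability $p^{n-1}(1-p)$. Using $\sum_{n=1}^{\infty} n\cdot p^{n-1} = \frac{1}{(1-p)^2}$, the expected length of a trace is:

$\sum\limits_{n=1}^{\infty} n\cdot p^{n-1}(1-p) = \frac{1-p}{(1-p)^2} = \frac{1}{1 - p}$

For $p=\frac{1}{2}$, this gives $2$.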
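
The closed form can be checked numerically without sampling. The sketch below truncates the series, weighting each length $n$ by $p^{n-1}(1-p)$, the probability of $n-1$ heads followed by one tail. The function name and the cut-off of 1000 terms are illustrative choices.

```python
def expected_length_partial_sum(p: float, n_terms: int = 1000) -> float:
    """Truncated series sum of n * p^(n-1) * (1 - p) for n = 1..n_terms."""
    return sum(n * p ** (n - 1) * (1 - p) for n in range(1, n_terms + 1))


for p in (0.1, 0.5, 0.9):
    approx = expected_length_partial_sum(p)
    closed_form = 1 / (1 - p)
    print(f"p={p}: partial sum = {approx:.4f}, closed form = {closed_form:.4f}")
```

For these values of $p$ the truncated tail is negligible, so the partial sum and the closed form should agree to many decimal places.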

In [16]:
ps = [0.1, 0.5, 0.9]
def expected_length(x):
    return 1 / (1 - x)

nb_samples = 10000
for p in ps:
    _automaton = make_automaton(p)
    samples = [_automaton.sample() for _ in range(nb_samples)]
    average_length = np.mean([len(sample) for sample in samples])
    print(f"The average length of the samples is: {average_length:5.2f}. Expected: {expected_length(p):5.2f}.")

The average length of the samples is:  1.11. Expected:  1.11.
The average length of the samples is:  1.99. Expected:  2.00.
The average length of the samples is:  9.69. Expected: 10.00.
