RPNI data labelling format #32

tomyaacov · 2022-05-30T10:16:53Z

Hi,
I started using RPNI algorithm in the package recently.
From what I understood in the literature, in the standard RPNI the entire words are labeled as accepting\rejecting and not every prefix of it.
I wonder if there is a way to format the data in the same way.
For example, the data record:

[('b', False), ('b', False), ('a', True)]

will be represented as something like:

[('b', None), ('b', None), ('a', True)]

Am I missing something? is this option available in some way?

emuskardin · 2022-05-30T10:45:47Z

Hi,

To best of my understanding, RPNI only works iff all words are "prefix-complete", meaning that prefixes of all words are also labeled. Otherwise the prefix tree acceptor would be too ambiguous.

@mtappler , can you chime in on this?

Other than that, None is reserved as an indicator of the initial output/acceptance, and can only be found at index 0.

mtappler · 2022-05-30T12:31:21Z

Hi,

We had some internal discussions. RPNI for DFAs can have "holes" in the dataset, meaning the sample data is not necessarily prefix-closed.
However, our current implementation assumes that all inputs have corresponding outputs, like in traces of Mealy/Moore machines. We are currently investigating how to adapt our RPNI implementation to work with non-prefix-closed samples, where some inputs have "None" as ouput as you suggested.
We will get back to you, when we know more.

tomyaacov · 2022-05-30T15:12:12Z

Thanks, will be happy to help.

emuskardin · 2022-05-31T10:59:15Z

Hi,

I just pushed updated version on RPNI, so if you update AALpy's version it should work with non-prefix closed sets. Only RPNI for Mealy machines requires that all prefixes are also labeled, we will look into that at some later point when we have more time.

I also changed the data format a bit, now it is bit simpler I would think :)
(If you want to learn Moore machines, simply replace True/False with outputs)

from aalpy.learning_algs import run_RPNI
data = [(('a', 'a', 'a'), True),
        (('a', 'a', 'b', 'a'), True),
        (('b', 'b', 'a'), True),
        (('b', 'b', 'a', 'b', 'a'), True),
        (('a',), False),
        (('b', 'b'), False),
        (('a', 'a', 'b'), False),
        (('a', 'b', 'a'), False)]


model = run_RPNI(data, automaton_type='dfa')

Let us know if you run into any more challenges :)

tomyaacov · 2022-05-31T11:30:58Z

Thank you for the fast response! Will check it out and update you if any issues will come up.

emuskardin closed this as completed May 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RPNI data labelling format #32

RPNI data labelling format #32

tomyaacov commented May 30, 2022

emuskardin commented May 30, 2022 •

edited

mtappler commented May 30, 2022

tomyaacov commented May 30, 2022

emuskardin commented May 31, 2022 •

edited

tomyaacov commented May 31, 2022

RPNI data labelling format #32

RPNI data labelling format #32

Comments

tomyaacov commented May 30, 2022

emuskardin commented May 30, 2022 • edited

mtappler commented May 30, 2022

tomyaacov commented May 30, 2022

emuskardin commented May 31, 2022 • edited

tomyaacov commented May 31, 2022

emuskardin commented May 30, 2022 •

edited

emuskardin commented May 31, 2022 •

edited