Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPNI data labelling format #32

Closed
tomyaacov opened this issue May 30, 2022 · 5 comments
Closed

RPNI data labelling format #32

tomyaacov opened this issue May 30, 2022 · 5 comments

Comments

@tomyaacov
Copy link

Hi,
I started using RPNI algorithm in the package recently.
From what I understood in the literature, in the standard RPNI the entire words are labeled as accepting\rejecting and not every prefix of it.
I wonder if there is a way to format the data in the same way.
For example, the data record:

[('b', False), ('b', False), ('a', True)]

will be represented as something like:

[('b', None), ('b', None), ('a', True)]

Am I missing something? is this option available in some way?

@emuskardin
Copy link
Member

emuskardin commented May 30, 2022

Hi,

To best of my understanding, RPNI only works iff all words are "prefix-complete", meaning that prefixes of all words are also labeled. Otherwise the prefix tree acceptor would be too ambiguous.

@mtappler , can you chime in on this?

Other than that, None is reserved as an indicator of the initial output/acceptance, and can only be found at index 0.

@mtappler
Copy link
Member

Hi,

We had some internal discussions. RPNI for DFAs can have "holes" in the dataset, meaning the sample data is not necessarily prefix-closed.
However, our current implementation assumes that all inputs have corresponding outputs, like in traces of Mealy/Moore machines. We are currently investigating how to adapt our RPNI implementation to work with non-prefix-closed samples, where some inputs have "None" as ouput as you suggested.
We will get back to you, when we know more.

@tomyaacov
Copy link
Author

Thanks, will be happy to help.

@emuskardin
Copy link
Member

emuskardin commented May 31, 2022

Hi,

I just pushed updated version on RPNI, so if you update AALpy's version it should work with non-prefix closed sets. Only RPNI for Mealy machines requires that all prefixes are also labeled, we will look into that at some later point when we have more time.

I also changed the data format a bit, now it is bit simpler I would think :)
(If you want to learn Moore machines, simply replace True/False with outputs)

from aalpy.learning_algs import run_RPNI
data = [(('a', 'a', 'a'), True),
        (('a', 'a', 'b', 'a'), True),
        (('b', 'b', 'a'), True),
        (('b', 'b', 'a', 'b', 'a'), True),
        (('a',), False),
        (('b', 'b'), False),
        (('a', 'a', 'b'), False),
        (('a', 'b', 'a'), False)]


model = run_RPNI(data, automaton_type='dfa')

Let us know if you run into any more challenges :)

@tomyaacov
Copy link
Author

Thank you for the fast response! Will check it out and update you if any issues will come up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants