Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Learn Finite State Automata that generalize over a set of training instances

branch: master

Fetching latest commit…

Octocat-spinner-32-eaf2f5

Cannot retrieve the latest commit at this time

Octocat-spinner-32 run_config
Octocat-spinner-32 src
Octocat-spinner-32 COPYING
Octocat-spinner-32 README
Octocat-spinner-32 best-instance.dot
Octocat-spinner-32 best-instance.png
Octocat-spinner-32 toy-training.txt
README
README
======

This is an (incomplete) implementation of GAL, the Genetic Automaton Learner as 
defined in Chapter 5 of Belz (2000).

The point of this code is to learn a FSA that *generalizes* from a set of positive
examples. (If you just want to cover exactly the input, you can use any existing
package for doing FSA minimization.)

The input file is in text format, one sequence per line, the alphabet will be induced
by tokens separated by white-spaces.

This project currently builds under Eclipse, feel free to contribute an ant script.


To run an example, use the LearnFSA run configuration in run_config/

To create a graph from the output, install the package graphviz and use

  dot -Tpng best-instance.dot > best-instance.png
  
Take a look at the PNG in the repository for an example of the learned automata.

Edit the default.properties to get better goodness of fit. Read chapter 5 of Belz (2000)
for more details and for further ideas about how to improve it. Patches are welcomed.

Pablo Duboue, November 2010




References

Belz, Anja (2000) "Computational Learning of Finite-State Models for 
Natural Language Processing", PhD Thesis, University of Sussex.




Something went wrong with that request. Please try again.