Alto, the Algebraic Language Toolkit

The Alto parser

Welcome to Alto, the Algebraic Language Toolkit.

Alto is a parser and decoder for Interpreted Regular Tree Grammars (IRTGs). It is being developed at Saarland University in the Computational Linguistics group, led by Alexander Koller. Its main features are:

  • Represents grammars from a wide variety of popular grammar formalisms as IRTGs, including:
    • Context-free grammars
    • Tree-adjoining grammars (TAG)
    • Tree automata and bottom-up tree transducers
    • Synchronous context-free grammars, TAG, etc.
    • Tree-to-string and string-to-tree transducers
    • Synchronous Hyperedge Replacement Grammars (HRG): Alto is the fastest published HRG parser in the world
    • and many more
  • Implements chart-based algorithms for
    • parsing
    • synchronous parsing (with inputs from multiple sides of a synchronous grammar)
    • decoding (to another side of a synchronous grammar)
    • computing 1-best (Viterbi) and k-best derivations
    • maximum likelihood and expectation maximization (EM) training
    • binarization
  • Supports PCFG-style and log-linear probability models for all of these grammar formalisms.
  • Built for easy extensibility: implement your own grammar formalism by adding an Algebra class, and use any of the Alto algorithms directly.
  • Comes with a GUI that provides access to most of these algorithms and visualizes parsing results.
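
The extensibility point in the list above rests on the core IRTG idea: a derivation tree is mapped by a homomorphism to a term over some algebra, and the algebra evaluates that term to an object such as a string. The following is a minimal, self-contained sketch of that idea only. It does not use Alto's API: the `Tree`, `Algebra`, and `StringAlgebra` types, the rule labels `r1` to `r3`, and the toy grammar are all hypothetical names chosen for illustration, and the `*` / `?1` notation is assumed here as a convenient convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; none of these types are part of Alto.
class IrtgSketch {
    /** A tree node: a label plus ordered children. */
    static final class Tree {
        final String label;
        final List<Tree> children;

        Tree(String label, Tree... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Hypothetical algebra: evaluates one symbol on already-evaluated arguments. */
    interface Algebra<V> {
        V evaluate(String symbol, List<V> args);
    }

    /** "*" joins its two arguments with a space; any other symbol is a constant word. */
    static final class StringAlgebra implements Algebra<String> {
        public String evaluate(String symbol, List<String> args) {
            if (symbol.equals("*")) {
                return args.get(0) + " " + args.get(1);
            }
            return symbol;
        }
    }

    /** Maps a derivation tree to a term: each rule label is replaced by its
     *  homomorphic image, with ?1, ?2, ... standing for the children's images. */
    static Tree applyHom(Tree derivation, Map<String, Tree> hom) {
        List<Tree> images = new ArrayList<>();
        for (Tree child : derivation.children) {
            images.add(applyHom(child, hom));
        }
        return substitute(hom.get(derivation.label), images);
    }

    /** Replaces each variable ?i in the term by the i-th image (1-based). */
    static Tree substitute(Tree term, List<Tree> images) {
        if (term.label.startsWith("?")) {
            return images.get(Integer.parseInt(term.label.substring(1)) - 1);
        }
        Tree[] kids = new Tree[term.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = substitute(term.children.get(i), images);
        }
        return new Tree(term.label, kids);
    }

    /** Evaluates a term bottom-up in the given algebra. */
    static <V> V evaluate(Tree term, Algebra<V> algebra) {
        List<V> args = new ArrayList<>();
        for (Tree child : term.children) {
            args.add(evaluate(child, algebra));
        }
        return algebra.evaluate(term.label, args);
    }

    /** Interprets the derivation r1(r2, r3) of a toy grammar as a string. */
    static String demo() {
        Map<String, Tree> hom = new HashMap<>();
        hom.put("r1", new Tree("*", new Tree("?1"), new Tree("?2")));   // S -> NP VP
        hom.put("r2", new Tree("*", new Tree("the"), new Tree("boy"))); // NP -> the boy
        hom.put("r3", new Tree("sleeps"));                              // VP -> sleeps
        Tree derivation = new Tree("r1", new Tree("r2"), new Tree("r3"));
        return evaluate(applyHom(derivation, hom), new StringAlgebra());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "the boy sleeps"
    }
}
```

In this sketch, supporting a different kind of object (for example graphs instead of strings) would only require another `Algebra` implementation and a second homomorphism over the same derivation trees; `applyHom` and `evaluate` stay unchanged.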

Alto is published under an Apache 2.0 license. More license information can be found in the file license-info.md.

The basic theory of IRTGs is explained in Koller & Kuhlmann, IWPT 2011. You can find more details on the Literature page.

Running and using Alto

Alto requires at least Java 8 and can be downloaded here. To build Alto from source, clone this repository and run ./gradlew build (or ./gradlew.bat build if you use Windows).

To use Alto as a library in your Gradle project, include it via JitPack:

repositories {
    [...]
    mavenCentral()
    maven {url 'http://akci.coli.uni-saarland.de/artifactory/external'}
    maven {url 'https://jitpack.io'}
}
dependencies {
    [...]
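    // "implementation" replaces the deprecated "compile" configuration, which Gradle 7 removed
    implementation group: "com.github.coli-saar", name: "alto", version: "2.3.0"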
}

If you want to build against the latest version, use master-SNAPSHOT or a specific git commit hash as the version.

See the Wiki for more details on how to use Alto. The tutorials are a good way to get started. For advanced usage, you can check out the JavaDoc (see below).

If you run into trouble, please feel free to ask for help on our Google group, or you can submit an issue.

JavaDoc

You can read the JavaDoc API documentation for the current stable version or the JavaDoc API documentation for the master branch.

Screenshots

The following screenshots show the Alto GUI. The first shows an IRTG with one string and one graph interpretation (equivalent to a synchronous HRG):

Screenshot of GUI showing an IRTG grammar

Here's the result of parsing "the boy wants to go" with this grammar:

Screenshot of GUI showing parse trees

Version History

Version 2.3.0, April 2019

  • Moved to GitHub and switched the build system to Gradle.
  • Fixed deprecation warnings in the build process.
  • Used more generics instead of raw classes.

Version 2.1, April 2017

  • Improved intersection and invhom algorithms (condensed, sibling-finder) for much faster PCFG, TAG, and HRG parsing (Groschwitz et al., ACL 2016).
  • Added pruning techniques, including beam search and coarse-to-fine parsing.
  • Added adaptive importance sampler for grammar induction (Teichmann et al., ACL 2016 Workshop on Statistical NLP and Weighted Automata).
  • Added "inside" binarization strategy (Klein & Manning 2003).
  • Added command-line scripts for parsing and grammar/corpus conversion.
  • Initial support for running reproducible experiments using Alto Lab.
  • Many small bugfixes and performance improvements.

Version 2.0, July 2015

  • Initial Bitbucket release.

Contributors
