DELPH-IN is an international consortium of researchers committed to producing precise, high-quality language processing tools and resources, primarily in the HPSG syntactic and MRS semantic frameworks. PyDelphin is a suite of Python libraries for processing data and interacting with tools in the DELPH-IN ecosystem. Its goal is to lower the barriers to using DELPH-IN resources so that users can quickly build applications or run experiments. It has been used in research on machine translation (e.g., Goodman, 2018), sentence chunking (Muszyńska, 2016), neural semantic parsing (Buys & Blunsom, 2017), natural language generation (Hajdik et al., 2019), and more.
Documentation, including guides and an API reference, is available here: http://pydelphin.readthedocs.io/
New to PyDelphin? Want to see examples? Try the walkthrough.
Get the latest release of PyDelphin from PyPI:
$ pip install pydelphin
For more information about requirements, installing from source, and running unit tests, please see the documentation.
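After installing, one quick way to confirm that the package is visible to your interpreter is to ask the standard library for its installed version. This is a minimal sketch: the helper name `installed_pydelphin_version` is hypothetical, and only the distribution name `pydelphin` is taken from the install command above.

```python
from importlib import metadata
from typing import Optional


def installed_pydelphin_version() -> Optional[str]:
    """Return the installed PyDelphin version, or None if it is not installed."""
    try:
        # "pydelphin" is the distribution name used with pip
        return metadata.version("pydelphin")
    except metadata.PackageNotFoundError:
        return None


version = installed_pydelphin_version()
if version is None:
    print("PyDelphin is not installed; try: pip install pydelphin")
else:
    print(f"PyDelphin {version} is installed")
```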
API changes in new versions are documented in the CHANGELOG, but for any unexpected changes please file an issue.
PyDelphin contains the following modules:
Semantic Representations:
- delphin.mrs: Minimal Recursion Semantics
- delphin.eds: Elementary Dependency Structures
- delphin.dmrs: Dependency Minimal Recursion Semantics
Semantic Components, Interfaces, and Metrics:
- delphin.semi: Semantic Interface
- delphin.vpm: Variable Property Mapping
- delphin.variable: MRS variables
- delphin.predicate: Semantic Predicates
- delphin.scope: Underspecified scope
- delphin.sembase: Basic semantic structures
- delphin.codecs: A wide variety of serialization codecs for MRS, EDS, and DMRS
- delphin.edm: Elementary Dependency Match
Grammar and Parse Inspection:
- delphin.derivation: Derivation trees
- delphin.tdl: Type-Description Language
- delphin.tfs: Feature structures and type hierarchies
Tokenization:
- delphin.repp: Regular-Expression PreProcessor
- delphin.tokens: YY Token lattices
- delphin.lnk: Lnk surface alignments
Corpus Management and Processing:
- delphin.itsdb: [incr tsdb()] profiles
- delphin.tsdb: Low-level interface to test suite databases
- delphin.tsql: TSQL test suite queries
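A typical use of these modules is to open a profile and query it. The sketch below assumes you already have an [incr tsdb()] profile directory on disk; the helper name `item_inputs` and the path in the usage line are hypothetical, and the function returns None when the directory does not look like a profile.

```python
import os


def item_inputs(profile_dir):
    """Return (i-id, i-input) rows from a profile, or None if it is missing."""
    # A profile directory contains a schema file named "relations"
    if not os.path.isfile(os.path.join(profile_dir, "relations")):
        return None
    # Imported here so the check above works even without PyDelphin installed
    from delphin import itsdb, tsql

    ts = itsdb.TestSuite(profile_dir)
    return [tuple(row) for row in tsql.select("i-id i-input", ts)]


rows = item_inputs("path/to/profile")  # hypothetical path
print("no profile found" if rows is None else rows[:5])
```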
Interfaces with External Processors:
- delphin.interface: Structures for interacting with external processors
- delphin.ace: Python wrapper for common tasks using ACE
- delphin.web: Client for the web API
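Parsing with ACE requires the `ace` binary on your PATH and a compiled grammar image, neither of which ships with PyDelphin. The following sketch wraps `delphin.ace.parse` with those checks; the helper name `first_mrs` and the grammar filename `erg.dat` are assumptions for illustration.

```python
import os
import shutil


def first_mrs(grammar, sentence):
    """Parse `sentence` with ACE and return the first MRS, or None."""
    # ACE and a compiled grammar image must both be available
    if shutil.which("ace") is None or not os.path.isfile(grammar):
        return None
    # Imported here so the checks above work even without PyDelphin installed
    from delphin import ace

    response = ace.parse(grammar, sentence)
    results = response.results()
    if not results:
        return None
    return results[0].mrs()


mrs = first_mrs("erg.dat", "Dogs bark.")  # "erg.dat" is a hypothetical path
print("no parse available" if mrs is None else mrs.top)
```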
Core Components and Command Line Interface:
- delphin.commands: Functional interface to common tasks
- delphin.cli: Command-line interface to functional commands
- delphin.hierarchy: Multiple-inheritance hierarchies
- delphin.exceptions: PyDelphin's basic exception classes
Miscellaneous:
- delphin.highlight: Pygments lexers and styles for highlighting MRS and TDL
Please use the issue tracker for bug reports, feature requests, and documentation requests. For more general questions and support, try one of the following channels maintained by the DELPH-IN community:
- DELPH-IN Discourse forums
- developers mailing list
Please use the following for academic citations (and see: https://ieeexplore.ieee.org/abstract/document/8939628):
@INPROCEEDINGS{Goodman:2019,
author={Goodman, Michael Wayne},
title={A Python Library for Deep Linguistic Resources},
booktitle={2019 Pacific Neighborhood Consortium Annual Conference and Joint Meetings (PNC)},
year={2019},
month=oct,
address={Singapore},
keywords={research software;linguistics;semantics;HPSG;computational linguistics;natural language processing;open source software}
}
Thanks to PyDelphin's contributors and all who've participated by filing issues and feature requests. Also thanks to the users of PyDelphin!
- Parser/Generators (chronological order)
- Grammar profiling, testing, and analysis
  - [incr tsdb()]: http://www.delph-in.net/itsdb/
  - gDelta: https://github.com/ned2/gdelta
  - Typediff: https://github.com/ned2/typediff
- Software libraries and repositories
- Also see (may have overlap with the above):
Earlier versions of PyDelphin were spelled "pyDelphin" with a lower-case "p", and this form appears in several publications. The current recommended spelling has an upper-case "P".