NLP framework in python for entity recognition and relationship extraction

nalaf - (Na)tural (La)nguage (F)ramework

nalaf is an NLP framework written in Python. The goal is to be a general-purpose, module-based, and easy-to-use framework for common text mining tasks. At the moment two tasks are covered: named-entity recognition (NER) and relationship extraction. Both modules support training and annotating. Alongside these, helper components are provided, such as cross-validation training and reading and converting different corpus formats. Currently, NER is implemented with Conditional Random Fields (CRFs) and relationship extraction with Support Vector Machines (SVMs) using either linear or tree kernels.
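
To make the NER task more concrete, here is a minimal sketch of the token-level labels that CRF-based taggers are commonly trained on. It converts character-offset entity annotations into BIO labels (Begin, Inside, Outside). It does not use nalaf's API, and nalaf's own labeling scheme may differ; the names `tokenize` and `bio_labels`, the whitespace tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Return (token, start, end) triples using a naive whitespace split (an assumption)."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def bio_labels(text, entities):
    """Label each token B-<type> or I-<type> if it overlaps an entity span, else O.

    entities: list of (start, end, type) character offsets, end exclusive.
    """
    labeled = []
    for token, start, end in tokenize(text):
        label = "O"
        for ent_start, ent_end, ent_type in entities:
            if start < ent_end and end > ent_start:  # token overlaps the entity span
                # The token that covers the entity's first character gets B, later ones get I.
                prefix = "B" if start <= ent_start else "I"
                label = prefix + "-" + ent_type
                break
        labeled.append((token, label))
    return labeled


# Hypothetical example: "p53 protein" annotated as one protein mention.
text = "The p53 protein binds DNA"
entities = [(4, 15, "protein")]
for token, label in bio_labels(text, entities):
    print(token, label)
```

A sequence model such as a CRF would then be trained to predict one such label per token from token-level features.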

Historically, the framework started from two joint theses at Rostlab at Technische Universität München with a focus on bioinformatics / BioNLP. Concretely, the first goal was to extract natural language (NL) mentions of mutations. Soon after, another master's thesis used and generalized the framework to do relationship extraction of transcription factors (TFs) interacting with genes or gene products. The nalaf framework is planned to be used in other BioNLP tasks at Rostlab.

As a result of the original BioNLP focus, some parts of the code are tailored to the biomedical domain. Work to generalize all parts is almost done. Note, however, that development is not currently active and code maintenance is not guaranteed.

Current maintainer: Juan Miguel Cejuela (@juanmirocks).

Pipeline diagram (an editable version is available on Lucidchart; requires login)

HOWTO Install

Requirements

  • Requires Python 3 (>= 3.5)
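
Before installing, it can help to confirm that the interpreter meets this requirement. The following is a minimal sketch using only the standard library; the helper name `meets_requirement` and the constant `MIN_VERSION` are made up for this example and are not part of nalaf.

```python
import sys

# Minimum version taken from the requirement above (Python >= 3.5).
MIN_VERSION = (3, 5)


def meets_requirement(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if meets_requirement():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python >= 3.5 is required, found:", sys.version.split()[0])
```

Save it to a file and run it with the same `python3` executable you plan to use for the installation.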

Install nalaf

From PyPI

pip3 install nalaf
python3 -m nalaf.download_data

From source

git clone https://github.com/Rostlab/nalaf.git
cd nalaf
python3 setup.py install
python3 -m nalaf.download_data

Test

python3 setup.py nosetests -a '!slow' # Exclude the slow ones

HOWTO Run, Examples

For a simple example of annotation with a pre-trained NER model for protein name extraction, run example_annotate.py.

Development
