Skip to content
Generic framework for information extraction tasks, including recognition of named entities, temporal expressions, spatial expressions and events.
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.


Build Status Coverage Status License: LGPL v3

Copyright (C) Wrocław University of Science and Technology (PWr), 2010-2018. All rights reserved.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see


  • Michał Marcińczuk (2010–present),
  • Jan Kocoń (2014–present),
  • Adam Kaczmarek (2014–2015),
  • Michał Krautforst (2013-2015),
  • Dominik Piasecki (2013),
  • Maciej Janicki (2011)


System architecture and KPWr NER models

Marcińczuk, Michał; Kocoń, Jan; Oleksy, Marcin. Liner2 — a Generic Framework for Named Entity Recognition In: Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 86–91, Valencia, Spain, 4 April 2017. Association for Computational Linguistics



  author = 	"Marci{\'{n}}czuk, Micha{\l}
		and Koco{\'{n}}, Jan
		and Oleksy, Marcin",
  title = 	"Liner2 --- a Generic Framework for Named Entity Recognition",
  booktitle = 	"Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing",
  year = 	"2017",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"86--91",
  location = 	"Valencia, Spain",
  doi = 	"10.18653/v1/W17-1413",
  url = 	""

NKJP NER model

Marcińczuk, Michał; Kocoń, Jan; Gawor, Michał. Recognition of Named Entities for Polish-Comparison of Deep Learning and Conditional Random Fields Approaches Ogrodniczuk, Maciej; Kobyliński, Łukasz (Eds.): Proceedings of the PolEval 2018 Workshop, pp. 63-73, Institute of Computer Science, Polish Academy of Science, Warszawa, 2018.



  title     = "Recognition of Named Entities for Polish-Comparison of Deep Learning and Conditional Random Fields Approaches",
  author    = "Marcińczuk, Michał and Kocoń, Jan and Gawor, Michał",
  year      = "2018",
  editor    = "Ogrodniczuk, Maciej and Kobyliński, Łukasz",
  booktitle = "Proceedings of the PolEval 2018 Workshop",
  location  = "Warsaw, Poland",
  pages     = "77--92",
  publisher = "Institute of Computer Science, Polish Academy of Science"



  • Java 8
  • C++ compiler (gcc 3.0 or higher) for CRF++


Optional libraries:



If you do not have CRF++ installed then do the following steps:

cd g419-external-dependencies
tar -xvf CRF++-0.57.tar.gz
cd CRF++-0.57
sudo make install
sudo ldconfig


./gradlew jar

Runtime test



* A framework for multitask sequence labeling, including: named entities, temporal expressions. *
*                                                                                               *
* Authors: Michał Marcińczuk (2010–2016), Jan Kocoń (2014–2016), Adam Kaczmarek (2014–2015)     *
*    Past: Michał Krautforst (2013-2015), Dominik Piasecki (2013), Maciej Janicki (2011)        *
* Contact:                                                        *
*                                                                                               *
*          G4.19 Research Group, Wrocław University of Technology                               *

Use one of the following tools:
 - agreement           -- checks agreement (of annotations) between suplied documents
 - agreement2          -- compare sets of annotations for each pair of corpora. One set is
                          treated as a reference set and the other as a set to evaluate. It is a
                          refactored version of the agreement action.
 - annotations         -- generates an arff file with a list of annotations and their features
 - constituents-eval   -- evaluates normalizer against a specific set of documents (-i
                          batch:FORMAT, -i FORMAT)
 - convert             -- converts documents from one format to another and applies defined
 - curve               -- brak opisu
 - eval                -- evaluates chunkers against a specific set of documents (-i
                          batch:FORMAT, -i FORMAT) #or perform cross validation (-i cv:{format})
 - eval-unique         -- evaluates chunkers against a specific set of documents (-i
                          batch:FORMAT, -i FORMAT) #or perform cross validation (-i
                          cv:{format}). The evaluation is performed on the sets#with unique
                          annotations, i.e. annotations with the same orth/base are treated as a
                          single annotation
 - inplace             -- process documents in place
 - interactive         -- processes text entered directly into the terminal
 - lemmatize           -- ToDo
 - normalizer-eval3    -- processes data with given model
 - normalizer-validate -- Read all annotation and their metadata and look for errors.
 - pipe                -- processes data with given model
 - search              -- earches for a phrases matching given pattern based on a set of token
 - selection           -- todo
 - stats               -- prints corpus statistics
 - train               -- trains chunkers

usage: ./liner2-cli [action] [options]

Pre-trained models

KPWr NER for Polish

The package contains three models for recognition named entities according to KPWr NE guidelines.

  • nam — named entity boundaries,
  • top9 — coarse-grained categories,
  • n82 — fine-grained categories.


Download the package:

cd Liner2
wget -O liner25_model_ner_rev1.7z 

Unpack the package:

7z x liner25_model_ner_rev1.7z

Process a sample CCL file:

./liner2-cli pipe -i ccl -o tuples -f stuff/resources/sample-sentence.xml -m liner25_model_ner_rev1/config-top9.ini

Expected output:

(4,11,nam_liv,"Ala Nowak")

PolEval 2018 Task 2: Named Entity Recognition


DSpace page: (temporarily off-line)

Direct link to the package: (temporarily off-line)

Liner2 participated in PolEval 2018 Task 2 on named entity recognition. It got a third place with the following scores:

Metric F1 score
Final 0.810
Exact 0.778
Overlap 0.818

Download the package with model:

cd Liner2
wget -O 

Unpack the model:


Process a sample CCL file:

./liner2-cli pipe -i ccl -o tuples -f stuff/resources/sample-sentence.xml -m liner26_model_ner_nkjp/config-nkjp-poleval2018.ini

Expected output:

(4,11,null,persName,"Ala Nowak","Ala Nowak")

Service mode (using RabbitMQ)


Liner2 can be run as a service which listen to a RabbitMQ queue for upcomming requests (liner2-input). and submit the results to another queue (liner2-output). The input message (send by the client) should have the following format:



  • ROUTE_KEY — name of a route used to post the results to the output queue. The routing key is used by the client to receive the response for their request ignoring others,
  • PATH — an absolute path to the file to process.

For example:

client-001 /tmp/document.txt

The message send by the service will contain path to a file which contains the output of processing.

Running the service

./liner2-daemon rabbitmq -m liner26_model_ner_nkjp/config-nkjp-poleval2018.ini -i plain:wcrft

Expected output:

 INFO [Thread-1] ( - Listing to RabbitMQ on channel liner2-input ...
Consumer amq.ctag-m6D9fIMI_Qsm61BH7HoxlA registered

It is possible to run more than one instance of ./liner2-daemon rabbitmq. However, all of them should use the same model and input format.


Folder stuff/python contains a Python script to test the communication with the service. The script takes a text to process, stores the texts in a temporal file, generates a routing key, send both to the liner2-input queue and listen to liner2-output. After receiving the response it reads the output file, removes both temporal files and prints the output.

python3 stuff/python/ "Pani Ala Nowak mieszkw w Zielonej Górze"

The output should be as follows:

[INFO] Temp route: route-1DVRP4
[INFO] Temp input file: /tmp/amu7_3at
[INFO] Sent msg 'route-1DVRP4 /tmp/amu7_3at' to liner2-input
[INFO] Temp output file: b'/tmp/amu7_3at-ner.xml'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chunkList SYSTEM "ccl.dtd">
 <chunk id="ch1">
  <sentence id="s1">
    <lex disamb="1"><base>pani</base><ctag>subst:sg:nom:f</ctag></lex>
    <ann chan="persname">0</ann>
    <ann chan="persname_forename">0</ann>
    <ann chan="persname_surname">0</ann>
    <ann chan="placename_settlement">0</ann>
    <lex disamb="1"><base>Ala</base><ctag>subst:sg:nom:f</ctag></lex>
    <ann chan="persname" head="1">1</ann>
    <ann chan="persname_forename" head="1">1</ann>
    <ann chan="persname_surname">0</ann>
    <ann chan="placename_settlement">0</ann>
    <lex disamb="1"><base>Nowak</base><ctag>subst:sg:nom:m1</ctag></lex>
    <lex disamb="1"><base>nowak</base><ctag>subst:sg:nom:m1</ctag></lex>
    <ann chan="persname">1</ann>
    <ann chan="persname_forename">0</ann>
    <ann chan="persname_surname" head="1">1</ann>
    <ann chan="placename_settlement">0</ann>
    <lex disamb="1"><base>mieszkać</base><ctag>fin:sg:ter:imperf</ctag></lex>
    <ann chan="persname">0</ann>
    <ann chan="persname_forename">0</ann>
    <ann chan="persname_surname">0</ann>
    <ann chan="placename_settlement">0</ann>
    <lex disamb="1"><base>w</base><ctag>prep:loc:nwok</ctag></lex>
    <ann chan="persname">0</ann>
    <ann chan="persname_forename">0</ann>
    <ann chan="persname_surname">0</ann>
    <ann chan="placename_settlement">0</ann>
    <lex disamb="1"><base>zielony</base><ctag>adj:sg:loc:f:pos</ctag></lex>
    <ann chan="persname">0</ann>
    <ann chan="persname_forename">0</ann>
    <ann chan="persname_surname">0</ann>
    <ann chan="placename_settlement" head="1">1</ann>
    <lex disamb="1"><base>góra</base><ctag>subst:sg:loc:f</ctag></lex>
    <ann chan="persname">0</ann>
    <ann chan="persname_forename">0</ann>
    <ann chan="persname_surname">0</ann>
    <ann chan="placename_settlement">1</ann>

Logs on the server side:

 INFO [pool-1-thread-5] ( - Received path: '/tmp/amu7_3at'
 INFO [pool-1-thread-5] ( - Output saved to /tmp/amu7_3at
 INFO [pool-1-thread-5] ( - Sent /tmp/amu7_3at-ner.xml to liner2-output:route-1DVRP4'
 INFO [pool-1-thread-5] ( - Request processing done
You can’t perform that action at this time.