As I no longer have time to maintain this project, I am looking for collaborators to help maintain it. You can sign up by sending a pull request that fixes a bug or adds a feature.
ITU Turkish NLP Pipeline Caller
A Python 3 wrapper tool for using the ITU Turkish NLP Pipeline API.
For details of the pipeline, please check the pipeline page and the sources below.
To use the pipeline, you need an authentication token (details are on the API web page).
If you experience any problems, please contact me via the Gitter chat room.
This repository has been tested with Python 3.4, 3.5 and 3.6, but using the most up-to-date version is always better.
To install from PyPI, run
pip3 install ITU-Turkish-NLP-Pipeline-Caller
Alternatively, download the latest release, extract the archive and, inside that directory, run
python3 ./setup.py install
As a Command Line Tool
By default, the tool reads the token from the pipeline.token file (in the same directory as the tool).
It reads the input file and writes the output under the output directory ("output" by default).
You can select the pipeline tool by using
pipeline_caller <filename> --tool <tool_name>
default is "pipelineNoisy"
You can force the encoding for I/O by using
pipeline_caller <filename> -e <encoding>
default is your system locale
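The encoding fallback described above can be sketched as follows. This is only an illustration, not the tool's actual code: the helper name read_input is hypothetical, and the fallback to locale.getpreferredencoding(False) is taken from the defaults listed further down in this README.

```python
import locale


def read_input(path, encoding=None):
    """Read a text file, using the OS default encoding unless one is given.

    Hypothetical helper: it mirrors what the -e option is described as doing,
    but it is not part of the pipeline_caller package.
    """
    # Fall back to the system locale when no encoding is forced.
    chosen = encoding or locale.getpreferredencoding(False)
    with open(path, encoding=chosen) as handle:
        return handle.read()
```

For example, read_input("input.txt", "iso-8859-9") would read a file saved in the Turkish Latin-5 encoding, while read_input("input.txt") relies on the system locale.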
You can switch the processing type using the -p option. Input text can be processed whole at once, sentence by sentence or word by word; the default is whole at once. For some tools in the pipeline (isturkish, for example), word-by-word processing is currently necessary. For example,
pipeline_caller <filename> --tool isturkish -p word
sends the input text to the isturkish tool word by word.
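To make the three processing types concrete, here is a minimal sketch of how input text could be cut into the chunks that are sent to the API one at a time. It is an assumption-laden illustration rather than the tool's own logic: the function name split_for_processing and the mode names "whole" and "sentence" are hypothetical (only "word" appears in this README), and sentences are assumed to end at whitespace that follows one of the characters in the delimiter class listed among the defaults.

```python
import re

# Delimiter class copied from the defaults listed in this README.
SENTENCE_DELIMITERS = r"[\.\?:;!]"


def split_for_processing(text, mode="whole"):
    """Return the chunks that would be sent to the pipeline separately."""
    if mode == "whole":
        return [text]
    # Split at whitespace that follows a delimiter; the delimiter itself
    # stays attached to the sentence it ends.
    pattern = r"(?<=" + SENTENCE_DELIMITERS + r")\s+"
    sentences = [s for s in re.split(pattern, text.strip()) if s]
    if mode == "sentence":
        return sentences
    if mode == "word":
        # Words are taken to be whitespace-separated tokens of each sentence.
        return [word for sentence in sentences for word in sentence.split()]
    raise ValueError("unknown processing mode: " + repr(mode))
```

With mode "word", each returned token would be sent in its own request, which is why word-by-word processing is slower but works for tools that accept a single word.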
And you can change the output directory by using
pipeline_caller <filename> -o <another_directory>
default is "output"
pipeline_caller --help shows the help menu.
As a Module
import pipeline_caller
caller = pipeline_caller.PipelineCaller()
result = caller.call(<tool_name>, <text>, <api_token>)
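When used as a module, the token has to be passed explicitly. The sketch below shows one way to reuse the pipeline.token file for that. The helper call_with_token_file is hypothetical and not part of the package; it only assumes that the caller object has the call(tool_name, text, api_token) method shown above.

```python
def call_with_token_file(caller, tool_name, text, token_path="pipeline.token"):
    """Read the API token from a file and forward the request to caller.call.

    Hypothetical convenience wrapper; `caller` is expected to be a
    pipeline_caller.PipelineCaller instance (or anything with the same
    call(tool_name, text, api_token) method).
    """
    with open(token_path, encoding="utf-8") as token_file:
        token = token_file.read().strip()
    if not token:
        raise ValueError("token file is empty: " + token_path)
    return caller.call(tool_name, text, token)


# Usage, assuming the package is installed and pipeline.token exists:
#   import pipeline_caller
#   caller = pipeline_caller.PipelineCaller()
#   print(call_with_token_file(caller, "pipelineNoisy", "Merhaba dünya."))
```

Passing the caller in as an argument keeps the helper easy to test without network access.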
Check the DEFAULTS block in the source code if you need to change any of these (generally, you don't):
api_url = "http://tools.nlp.itu.edu.tr/SimpleApi"
pipeline_encoding = 'UTF-8'
token_path = "pipeline.token" (for the command line tool)
default_output_dir = "output"
default_enconding = locale.getpreferredencoding(False) (the default encoding of your OS, used for I/O operations in the command line tool)
default_sentence_split_delimiter_class = "[\.\?:;!]" (for the command line tool, used to separate sentences when processing sentence by sentence)
Special thanks to Asst. Prof. Dr. Peter Schüller for his great suggestions!
Author, Copyright & License
This work was a part of a KnowLP research project.
Copyright 2015-2018 Maintainers:
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.