docker-compose script for NLP Building Blocks.
NLP Building Blocks

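This repository contains a docker-compose script for creating the NLP Building Blocks, a set of containerized services: Prose (sentence extraction), Renku (language detection), Sonnet (tokenization), Idyl E3 (entity extraction), and Verso (text preprocessing).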


Starting the Services in Containers

git clone
cd nlp-building-blocks
docker-compose up
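
Running the stack in the background and wrapping the usual lifecycle commands can be convenient. The following is a minimal sketch, assuming a POSIX shell (sh or bash) started in the nlp-building-blocks directory. The function names (nlp_up, nlp_status, nlp_logs, nlp_down) and the COMPOSE variable are illustrative and not part of this repository.

```shell
# Hypothetical helpers; COMPOSE defaults to the docker-compose binary.
# Set COMPOSE="docker compose" to use the Compose v2 plugin instead.

# Start all services detached so the terminal stays free.
nlp_up() { ${COMPOSE:-docker-compose} up -d; }

# List the containers and their current state.
nlp_status() { ${COMPOSE:-docker-compose} ps; }

# Follow the logs of all services, or only of the services named as arguments.
nlp_logs() { ${COMPOSE:-docker-compose} logs -f "$@"; }

# Stop and remove the containers.
nlp_down() { ${COMPOSE:-docker-compose} down; }
```

After sourcing these definitions, nlp_up followed by nlp_status shows whether every container came up, and nlp_down stops the stack again.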

Sending Requests to the Containers

Once the containers have started, each application is available on its own port:

  • Prose Sentence Extraction Engine - port 8060 - API Guide
  • Renku Language Detection Engine - port 7070 - API Guide
  • Sonnet Tokenization Engine - port 9040 - API Guide
  • Idyl E3 Entity Extraction Engine - port 9000 - API Guide
  • Verso Text Preprocessing Engine - port 7080 - API Guide
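
The containers may need some time before they accept requests, so a script may want to wait for a port first. The sketch below assumes curl is installed. The helper names service_url and wait_for_port are hypothetical, and the check only confirms that something answers HTTP on the port, not that the service is fully initialized.

```shell
# Build the base URL of a service from a host and a port.
service_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Poll a port once per second until it answers HTTP or the tries run out.
# Arguments: host, port, number of tries (default 30).
wait_for_port() {
  host=$1
  port=$2
  tries=${3:-30}
  while [ "$tries" -gt 0 ]; do
    # Without -f, curl exits 0 on any HTTP response, even an error status,
    # so success here only means the port is accepting connections.
    if curl -s -o /dev/null --connect-timeout 2 --max-time 5 \
        "$(service_url "$host" "$port")"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_port localhost 7070 returns successfully as soon as the language detection port responds, and returns a non-zero status after 30 failed attempts.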

Java SDK

The nlp-building-blocks-java-sdk can be used to programmatically create NLP pipelines from these services.

Using cURL

You can easily interact with each of the services using cURL as shown below:

Renku Language Detection Engine

To detect the language of text:

curl http://localhost:7070/api/language -d "Can you please tell me what language this text is in?" -H "Content-Type: text/plain"

Prose Sentence Extraction Engine

To extract sentences from text:

curl "http://localhost:8060/api/sentences?language=eng" -d "This is a sentence. This is another sentence." -H "Content-Type: text/plain"

Sonnet Tokenization Engine

To tokenize text:

curl "http://localhost:9040/api/tokenize?language=eng" -d "Tokenize this text please." -H "Content-Type: text/plain"

Idyl E3 Entity Extraction Engine

To extract named entities from tokenized text (the request body is a JSON array of tokens):

curl "http://localhost:9000/api/extract?language=eng" -d '["George", "Washington", "was", "president."]' -H "Content-Type: application/json"
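
Writing the JSON array by hand gets tedious for longer inputs. The helper below is a minimal sketch of building it from shell arguments. The name tokens_to_json is hypothetical, and the function escapes only backslashes and double quotes, so tokens containing control characters would need a real JSON encoder.

```shell
# Print the arguments as a JSON array of strings, one token per argument.
# The body runs in a subshell so its variables do not leak into the caller.
tokens_to_json() (
  out=""
  for tok in "$@"; do
    # Escape backslashes first, then double quotes.
    esc=$(printf '%s' "$tok" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    # Append the quoted token, separated by ", " after the first one.
    out="${out:+$out, }\"$esc\""
  done
  printf '[%s]\n' "$out"
)
```

The result can be passed straight to the request, for example with -d "$(tokens_to_json George Washington was president.)" in place of the literal array above.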

Verso Text Preprocessing Engine

To preprocess text for an NLP pipeline:

curl "http://localhost:7080/api/preprocess?lc=y" -d "Preprocess this text, please?" -H "Content-Type: text/plain"


This project is licensed under the Apache License, version 2.0.

