
# Txt Werk - Neofonie Text Analysis Tool


Txt Werk is the text analysis tool from Neofonie GmbH. It annotates German and English text with information about the named entities it recognizes.

For more information, visit the Txt Werk website at http://www.txtwerk.de/.


This notebook demonstrates some simple uses of the Txt Werk API.

No special prerequisites are necessary.

NOTE: The notebook expects a file txt_werk_apikey.py, containing a valid Txt Werk API key, in the same directory as the notebook.




## Querying the Txt Werk API with HTTP requests




### Calls to the Txt Werk API with a given text.



#### 1. Requesting recognized entities.


In [1]:
import requests
import json

try:
    from txt_werk_apikey import txt_werk
except ImportError:
    raise RuntimeError("Credentials must be supplied as dict in txt_werk_apikey.py. See example_txt_werk_apikey.py or use this as a template: txt_werk=dict(apikey='apikey')")

TXT_WERK_URL = "https://api.neofonie.de/rest/txt/analyzer"

# Set the Txt Werk API key in the request headers
headers = {'X-Api-Key': txt_werk['apikey']}

## Let's go
text = "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."

r = requests.post(TXT_WERK_URL, data={'text': text, 'services' : 'entities'}, headers=headers)
txt_werk_response = r.json()
      
print("Txt Werk Request:\n\n\"" + text + "\"\n\n")
print("Txt Werk Response:\n\n" + json.dumps(txt_werk_response, indent=4))


Txt Werk Request:

"Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."


Txt Werk Response:

{
    "timestamp": 1478783930870,
    "language": "de",
    "text": "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.",
    "entities": [
        {
            "label": "Angela Merkel",
            "type": "PERSON",
            "end": 13,
            "uri": "https://www.wikidata.org/wiki/Q567",
            "start": 0,
            "confidence": 47.60983657836914,
            "surface": "Angela Merkel"
        },
        {
            "label": "17. Juli",
            "type": "CONCEPT",
            "end": 31,
            "uri": "https://www.wikidata.org/wiki/Q2729",
            "start": 23,
            "confidence": 39.16166687011719,
            "surface": "17. Juli"
        },
        {
            "label": "Hamburg",
            "type": "PLACE",
            "end": 47,
            "uri": "https://www.wikidata.org/wiki/Q1055",
 


#### 2. Requesting recognized tags.


In [2]:
r = requests.post(TXT_WERK_URL, data={'text': text, 'services' : 'tags'}, headers=headers)
txt_werk_response = r.json()
      
print("Txt Werk Request:\n\n\"" + text + "\"\n\n")
print("Txt Werk Response:\n\n" + json.dumps(txt_werk_response, indent=4))


Txt Werk Request:

"Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."


Txt Werk Response:

{
    "timestamp": 1478783931027,
    "text": "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.",
    "tags": [
        {
            "confidence": 0.9988351678969338,
            "term": "17. Juli"
        },
        {
            "confidence": 0.998310833033523,
            "term": "Angela Dorothea Kasner"
        },
        {
            "confidence": 0.9971508741324149,
            "term": "Angela Merkel"
        },
        {
            "confidence": 0.9931194554831768,
            "term": "Hamburg"
        }
    ],
    "language": "de"
}
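The two calls above differ only in the `services` value, and neither checks the HTTP status before parsing the JSON body. A minimal sketch of a wrapper follows. The function name `analyze`, the `post` parameter (present only so the function can be exercised without network access), and the 10-second timeout are assumptions of this sketch, not part of the Txt Werk API; the URL, the `X-Api-Key` header, and the `text`, `title`, and `services` form fields are the ones used in this notebook.

```python
TXT_WERK_URL = "https://api.neofonie.de/rest/txt/analyzer"


def analyze(text, service, api_key, title=None, post=None, timeout=10):
    """POST `text` to Txt Werk for one service and return the decoded JSON.

    `post` defaults to requests.post; a stand-in callable can be passed for testing.
    """
    if post is None:
        import requests  # imported lazily so the sketch can be tried offline
        post = requests.post

    data = {"text": text, "services": service}
    if title is not None:
        data["title"] = title  # optional title, as used in example 3 of this notebook

    response = post(
        TXT_WERK_URL,
        data=data,
        headers={"X-Api-Key": api_key},
        timeout=timeout,
    )
    # Fail loudly on 4xx/5xx responses instead of trying to parse an error body.
    response.raise_for_status()
    return response.json()
```

With the credentials loaded as in the first cell, a call would look like `analyze(text, "tags", txt_werk["apikey"])`.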



#### 3. Requesting recognized tags as before, with an additional title as input, which should improve the results.


In [3]:
title = "Ein Lebenslauf von Frau Merkel"

r = requests.post(TXT_WERK_URL, data={'title': title, 'text': text, 'services' : 'tags'}, headers=headers)
txt_werk_response = r.json()
      
print("Txt Werk Request:\n\n\"" + text + "\"\n\n")
print("Txt Werk Response:\n\n" + json.dumps(txt_werk_response, indent=4))


Txt Werk Request:

"Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."


Txt Werk Response:

{
    "timestamp": 1478783931188,
    "text": "Ein Lebenslauf von Frau Merkel\n\nAngela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.",
    "tags": [
        {
            "confidence": 0.999646474945941,
            "term": "17. Juli"
        },
        {
            "confidence": 0.9995733902090875,
            "term": "Angela Dorothea Kasner"
        },
        {
            "confidence": 0.99944244934105,
            "term": "Lebenslauf"
        },
        {
            "confidence": 0.9978629324922726,
            "term": "Hamburg"
        }
    ],
    "language": "de"
}
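To see what the title changed, the tag terms of the two responses can be compared as sets. The sketch below copies the terms from the outputs of examples 2 and 3; the helper name `terms` is only illustrative.

```python
# Tag terms copied from the responses without and with a title.
tags_without_title = [
    {"term": "17. Juli"},
    {"term": "Angela Dorothea Kasner"},
    {"term": "Angela Merkel"},
    {"term": "Hamburg"},
]
tags_with_title = [
    {"term": "17. Juli"},
    {"term": "Angela Dorothea Kasner"},
    {"term": "Lebenslauf"},
    {"term": "Hamburg"},
]


def terms(tags):
    """Return the set of tag terms in a Txt Werk `tags` list."""
    return {tag["term"] for tag in tags}


added = terms(tags_with_title) - terms(tags_without_title)
dropped = terms(tags_without_title) - terms(tags_with_title)

print("added:  ", sorted(added))    # ['Lebenslauf']
print("dropped:", sorted(dropped))  # ['Angela Merkel']
```

For this input, the title contributes the tag "Lebenslauf", while "Angela Merkel" no longer appears among the returned tags.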



#### 4. Requesting recognized categories with their confidence values.


In [4]:
r = requests.post(TXT_WERK_URL, data={'text': text, 'services' : 'categories'}, headers=headers)
txt_werk_response = r.json()
      
print("Txt Werk Request:\n\n\"" + text + "\"\n\n")
print("Txt Werk Response:\n\n" + json.dumps(txt_werk_response, indent=4))


Txt Werk Request:

"Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."


Txt Werk Response:

{
    "language": "de",
    "timestamp": 1478783931336,
    "text": "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.",
    "categories": [
        {
            "label": "politik",
            "confidence": 0.9840945695370302
        },
        {
            "label": "wirtschaft",
            "confidence": 0.010815793425103136
        },
        {
            "label": "kultur",
            "confidence": 0.005075348628913111
        },
        {
            "label": "sport",
            "confidence": 1.0970299976779499e-05
        },
        {
            "label": "reisen",
            "confidence": 1.8793566199359706e-06
        },
        {
            "label": "wissenschaft",
            "confidence": 8.05313821392574e-07
        },
        {
            "label": "internet",
            "confidence": 6.269585510453141e-07
 
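A common next step is to pick the most likely category or to keep only categories above a threshold. The sketch below uses the categories visible in the output above (the list may contain further entries); the 0.01 threshold is an arbitrary assumption for illustration.

```python
# Categories copied from the visible part of the response above.
categories = [
    {"label": "politik", "confidence": 0.9840945695370302},
    {"label": "wirtschaft", "confidence": 0.010815793425103136},
    {"label": "kultur", "confidence": 0.005075348628913111},
    {"label": "sport", "confidence": 1.0970299976779499e-05},
    {"label": "reisen", "confidence": 1.8793566199359706e-06},
    {"label": "wissenschaft", "confidence": 8.05313821392574e-07},
    {"label": "internet", "confidence": 6.269585510453141e-07},
]

# Most likely category: the entry with the highest confidence.
best = max(categories, key=lambda c: c["confidence"])
print(best["label"])  # politik

# All categories at or above an (arbitrary) confidence threshold.
THRESHOLD = 0.01
likely = [c["label"] for c in categories if c["confidence"] >= THRESHOLD]
print(likely)  # ['politik', 'wirtschaft']
```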


#### 5. Requesting recognized dates.


In [5]:
r = requests.post(TXT_WERK_URL, data={'text': text, 'services' : 'dates'}, headers=headers)
txt_werk_response = r.json()
      
print("Txt Werk Request:\n\n\"" + text + "\"\n\n")
print("Txt Werk Response:\n\n" + json.dumps(txt_werk_response, indent=4))


Txt Werk Request:

"Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."


Txt Werk Response:

{
    "timestamp": 1478783931491,
    "text": "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.",
    "dates": [
        {
            "dateEnd": {
                "month": 7,
                "bc": false,
                "year": 1954,
                "day": 17
            },
            "dateStart": {
                "month": 7,
                "bc": false,
                "year": 1954,
                "day": 17
            },
            "start": 23,
            "surface": "17. Juli 1954",
            "end": 36
        }
    ],
    "language": "de"
}
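The `dateStart` and `dateEnd` objects can be turned into Python `datetime.date` values. The sketch below assumes that `year`, `month`, and `day` are all present, as in the response above; how Txt Werk represents partial dates is not shown here, so that case is not handled. The helper name `to_date` is illustrative.

```python
from datetime import date

text = "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."

# The single entry of the `dates` list from the response above.
entry = {
    "dateStart": {"month": 7, "bc": False, "year": 1954, "day": 17},
    "dateEnd": {"month": 7, "bc": False, "year": 1954, "day": 17},
    "start": 23,
    "surface": "17. Juli 1954",
    "end": 36,
}


def to_date(parts):
    """Convert a Txt Werk date object to datetime.date.

    Assumes year, month and day are present; BC dates are rejected
    because datetime.date cannot represent them.
    """
    if parts.get("bc"):
        raise ValueError("datetime.date cannot represent BC dates")
    return date(parts["year"], parts["month"], parts["day"])


start = to_date(entry["dateStart"])
end = to_date(entry["dateEnd"])

print(start.isoformat(), end.isoformat())     # 1954-07-17 1954-07-17
print(text[entry["start"]:entry["end"]])      # 17. Juli 1954
```

The `start` and `end` offsets index into the analyzed text, so slicing the text with them yields the `surface` string.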



### Using the simple TxtWerkClient for calls to the Txt Werk API with a given text.



#### 1. Using the simple TxtWerkClient for access to the Txt Werk API.


In [6]:
from txtwerk_client import TxtWerkClient
from IPython.display import display, HTML

txt_werk_client = TxtWerkClient()

## Let's go
txt_werk_response = txt_werk_client.check_text(text)

print("\nResponse from Txt Werk:\n\n" + json.dumps(txt_werk_response, indent=4) + "\n")



Response from Txt Werk:

{
    "entities": [
        {
            "label": "Angela Merkel",
            "type": "PERSON",
            "surface": "Angela Merkel",
            "uri": "https://www.wikidata.org/wiki/Q567",
            "start": 0,
            "confidence": 47.60983657836914,
            "end": 13
        },
        {
            "label": "17. Juli",
            "type": "CONCEPT",
            "surface": "17. Juli",
            "uri": "https://www.wikidata.org/wiki/Q2729",
            "start": 23,
            "confidence": 39.16166687011719,
            "end": 31
        },
        {
            "label": "Hamburg",
            "type": "PLACE",
            "surface": "Hamburg",
            "uri": "https://www.wikidata.org/wiki/Q1055",
            "start": 40,
            "confidence": 39.6832389831543,
            "end": 47
        },
        {
            "label": null,
            "type": "PERSON",
            "surface": "Angela Dorothea Kasner",
            "uri": null,
 


#### 2. Formatting the previous result as HTML.


In [7]:

# The recognized entities once again, this time formatted
print("\nEntities from Txt Werk:\n\n" + txt_werk_client.format_entities(txt_werk_response['entities']))

# Now get the text annotated as HTML for display in a web page

display(HTML(txt_werk_client.check_text_html_annotated(text)))
print("\n")



Entities from Txt Werk:

[ PERSON, "Angela Merkel", "Angela Merkel", https://www.wikidata.org/wiki/Q567, [0,13], 47.60983657836914 ]
[ CONCEPT, "17. Juli", "17. Juli", https://www.wikidata.org/wiki/Q2729, [23,31], 39.16166687011719 ]
[ PLACE, "Hamburg", "Hamburg", https://www.wikidata.org/wiki/Q1055, [40,47], 39.6832389831543 ]
[ PERSON, "None", "Angela Dorothea Kasner", None, [52,74], 75.0 ]
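The `[start,end]` pairs listed above are character offsets into the analyzed text, which is enough to build a simple HTML annotation by hand. The following is a rough sketch of that idea and not the implementation behind `check_text_html_annotated`, which is not shown in this notebook; the function name `annotate` and the choice of `<mark>` elements are assumptions.

```python
from html import escape

text = "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren."

# Type, surface and offsets copied from the formatted entity list above.
entities = [
    {"type": "PERSON", "surface": "Angela Merkel", "start": 0, "end": 13},
    {"type": "CONCEPT", "surface": "17. Juli", "start": 23, "end": 31},
    {"type": "PLACE", "surface": "Hamburg", "start": 40, "end": 47},
    {"type": "PERSON", "surface": "Angela Dorothea Kasner", "start": 52, "end": 74},
]


def annotate(text, entities):
    """Wrap each entity span in a <mark> element carrying the entity type."""
    parts = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e["start"]):
        if ent["start"] < cursor:
            continue  # skip spans that overlap an already emitted entity
        parts.append(escape(text[cursor:ent["start"]]))
        parts.append('<mark title="{}">{}</mark>'.format(
            escape(ent["type"]), escape(text[ent["start"]:ent["end"]])))
        cursor = ent["end"]
    parts.append(escape(text[cursor:]))
    return "".join(parts)


# Sanity check: every offset pair reproduces the reported surface string.
assert all(text[e["start"]:e["end"]] == e["surface"] for e in entities)

annotated = annotate(text, entities)
print(annotated)
```

In a notebook, the result could be rendered with `display(HTML(annotated))`.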





