A python wrapper for Datumbox



The Datumbox API provides a number of Remote Procedure Calls for Text Analysis and Natural Language Processing. This repo provides an easy way to use the API when writing Python.

You'll need an API key, which you can get from the Datumbox site.


###Twitter Sentiment Analysis

>>> from DatumBox import DatumBox
>>> datum_box = DatumBox(API_KEY)
>>> datum_box.twitter_sentiment_analysis("I love my cat")

Text given to the classification methods should not contain HTML tags. The text_extract method provides an easy way to remove them, but it involves a remote procedure call, which may be undesirable.
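If the extra remote call is a concern, the tags can also be stripped locally before the text is passed to a classification method. The following is a minimal sketch that uses the standard library's HTML parser instead of text_extract. The names `_TextCollector` and `strip_tags` are hypothetical helpers and are not part of this wrapper.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class _TextCollector(HTMLParser):
    """Hypothetical helper: collects the text found between HTML tags."""

    def __init__(self):
        # Explicit base call, since HTMLParser is an old-style class on Python 2.
        HTMLParser.__init__(self)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text outside of a tag.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of `html` with tags removed and whitespace collapsed."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join("".join(collector.chunks).split())
```

This only drops the markup itself. The contents of `<script>` and `<style>` elements are kept as text, so it is probably not a full substitute for text_extract on complete web pages.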

##Exceptions that can be raised##

The wrapper raises DatumBoxError if the API returns an error. Page 11 of the API Documentation lists the possible error codes and messages.

The wrapper uses urllib2 to make the remote procedure calls, so you can also handle any exceptions that urllib2 raises if you wish.
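One way to handle both kinds of failure in a single place is a small helper that calls a wrapper method and returns a fallback value when one of the expected exceptions is raised. This is only a sketch: `call_with_fallback` is a hypothetical name, and the exception classes are passed in as an argument because the exact import path of DatumBoxError is not shown here.

```python
def call_with_fallback(method, text, handled_errors, default=None):
    """Call method(text); return `default` if one of `handled_errors` is raised.

    `handled_errors` is a tuple of exception classes chosen by the caller.
    """
    try:
        return method(text)
    except handled_errors as exc:
        # Report the failure and fall back instead of crashing the caller.
        print("Datumbox call failed: %s" % exc)
        return default
```

With the wrapper, a call might look like `call_with_fallback(datum_box.twitter_sentiment_analysis, "I love my cat", (DatumBoxError, urllib2.URLError))`, assuming DatumBoxError has been imported from wherever the wrapper defines it.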

##Failing Tests##

Many of the classification tasks that Datumbox attempts to solve are AI-complete, which means that the results returned by the API are heuristic. Specifically, the Readability Assessment and Commercial Detection tests I wrote fail because the API returns the wrong result. This should not be taken as a weakness of the API but rather as a reflection of the state of NLP in general.