The Oslo-Bergen-Tagger (OBT) is a GPL-licensed tagger for Norwegian text. clj-obt is an interface for calling it from Clojure.

clj-obt supports Linux only. The tagger is not bundled with the library: you must obtain it separately, either by downloading it or by cloning it from GitHub.


Add the library as a Leiningen dependency, [clj-obt "0.5.1"], and require clj-obt.core.
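As a rough illustration, the dependency and the require might look like the sketch below. The project name `my-tagging-app` and the Clojure version are placeholders, not something clj-obt prescribes, and the sketch assumes that `set-obt-path!` and `obt-tag` are both exposed by clj-obt.core.

```clojure
;; project.clj
;; "my-tagging-app" and the Clojure version are hypothetical placeholders.
(defproject my-tagging-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [clj-obt "0.5.1"]])

;; src/my_tagging_app/core.clj
;; Assumes both functions live in clj-obt.core.
(ns my-tagging-app.core
  (:require [clj-obt.core :refer [set-obt-path! obt-tag]]))
```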


Before tagging any text, you must set the path to the Oslo-Bergen-Tagger by calling:

(set-obt-path! "path/to/obt")

Or you can supply it when using the tagger function:

(obt-tag "tag this" "path/to/obt")

You only need to set the path once per session. Currently, only disambiguated bokmål is supported.

The output from the tagger is parsed into simple maps. Helper functions are available to manipulate and filter the tagged words.

Startup time of OBT

There is a startup cost in calling the tagger. If you call obt-tag on a number of texts sequentially, each invocation may add about a second of overhead, because OBT has to start up every time. You can avoid this by passing all the texts to obt-tag in a single vector, like this:

(obt-tag ["many" "texts" "to" "be" "tagged"])

The tagger function will then concatenate all the texts, tag them in one OBT invocation, split the result up again, and return the tagged texts as separate entries in a vector. For n texts, this saves roughly n-1 times the OBT startup cost.
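The concatenate, tag, split idea can be sketched in plain Clojure as follows. This is only an illustration of the general technique, not clj-obt's actual implementation: `fake-tag` is a hypothetical stand-in for one expensive tagger call, the `:word` key and the `XXSEPXX` sentinel are invented for the example, and the sketch assumes that no input text is empty or contains the sentinel.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for a single (expensive) tagger invocation:
;; returns one map per whitespace-separated token.
(defn fake-tag [text]
  (mapv (fn [w] {:word w}) (str/split text #"\s+")))

;; Hypothetical sentinel token, assumed never to occur in the input texts.
(def separator "XXSEPXX")

(defn tag-batch
  "Joins texts with a sentinel, calls tag-fn once, and splits the
  tagged tokens back into one vector per original text."
  [tag-fn texts]
  (let [joined (str/join (str " " separator " ") texts)
        tagged (tag-fn joined)]
    (->> tagged
         ;; group runs of tokens, cutting at each sentinel
         (partition-by #(= separator (:word %)))
         ;; drop the groups that consist of the sentinel itself
         (remove #(= separator (:word (first %))))
         (mapv vec))))

(tag-batch fake-tag ["tag this" "and this too"])
;; => [[{:word "tag"} {:word "this"}]
;;     [{:word "and"} {:word "this"} {:word "too"}]]
```

Here the tagging function runs once regardless of how many texts are passed in, which is where the startup saving comes from.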


Copyright (C) 2011-2012 Aleksander Skjæveland Larsen

Distributed under the Eclipse Public License, the same as Clojure.