Clojure library for using the Oslo-Bergen-Tagger


The Oslo-Bergen-Tagger is a GPL-licensed tagger for Norwegian text. This library is an interface for accessing it from Clojure.

clj-obt supports Linux only, and you must acquire the tagger from or clone it here on GitHub:


Simply add the library with Leiningen: [clj-obt "0.2.0"] and require clj-obt.core


Before tagging any text, you must set the path to the Oslo-Bergen-Tagger by calling:

(set-obt-path! "path/to/obt")

Or you can supply it when using the tagger function:

(obt-tag "tag this" "path/to/obt")

You only need to set the path once per session. Currently, only disambiguated bokmål is supported.

The resulting output from the tagger is parsed to simple maps. There are helper functions in to manipulate and filter the tagged words.
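
As a rough illustration of working with such maps, the sketch below filters tagged words using only clojure.core. The map keys (`:word`, `:lemma`, `:tags`), the sample data, and the `has-tag?` helper are assumptions made for this example; they are not taken from clj-obt, so check the actual output of obt-tag before relying on them.

```clojure
;; Hypothetical tagged output: keys and values are assumed for illustration only.
(def tagged
  [{:word "Hunden" :lemma "hund" :tags ["subst"]}
   {:word "løper"  :lemma "løpe" :tags ["verb"]}
   {:word "fort"   :lemma "fort" :tags ["adv"]}])

;; Hypothetical helper: true if the word map carries the given tag.
(defn has-tag? [tag word-map]
  (boolean (some #{tag} (:tags word-map))))

;; Keep only the words tagged "subst", then pull out their lemmas.
(def noun-lemmas
  (->> tagged
       (filter (partial has-tag? "subst"))
       (mapv :lemma)))
```

Because the tagged words are ordinary maps in a vector, the usual sequence functions (`filter`, `map`, `group-by`) apply directly.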

Startup time of OBT

There is a startup cost in calling the tagger. If you call obt-tag on a number of texts sequentially, there may be about a second's worth of overhead for each invocation. This is due to the startup time of OBT. You can get around this by giving obt-tag all the texts in a vector, like this:

(obt-tag ["many" "texts" "to" "be" "tagged"])

The tagger function then concatenates all the texts, tags them in one OBT invocation, splits the result up again, and returns the tagged texts separately in a vector. For n texts, this saves about n-1 OBT startups.
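
The batching idea can be sketched in plain Clojure. This only illustrates the concatenate, tag once, split approach and is not clj-obt's actual implementation: `delimiter`, `fake-tag`, and `tag-batch` are hypothetical names, and `fake-tag` stands in for a single OBT invocation by turning whitespace-separated tokens into maps. The sketch assumes the delimiter never occurs in the input and that no input text is empty.

```clojure
(require '[clojure.string :as str])

;; Hypothetical delimiter token; assumed never to occur in the input texts.
(def delimiter "|||")

;; Stand-in for one expensive tagger invocation (NOT the real OBT call).
;; Returns one map per whitespace-separated token.
(defn fake-tag [text]
  (mapv (fn [w] {:word w}) (str/split (str/trim text) #"\s+")))

(defn tag-batch
  "Tags all texts with a single call to tag-fn, then splits the
  result back into one vector of tagged words per input text."
  [tag-fn texts]
  (let [joined (str/join (str " " delimiter " ") texts)
        tagged (tag-fn joined)]
    (->> tagged
         ;; Group runs of words, with each delimiter in its own group.
         (partition-by #(= delimiter (:word %)))
         ;; Drop the delimiter groups, keeping one group per input text.
         (remove #(= delimiter (:word (first %))))
         (mapv vec))))

;; One call to fake-tag, two result vectors.
(tag-batch fake-tag ["tag this" "and this too"])
```

The point of the design is that `tag-fn` runs once regardless of how many texts are passed in, so the fixed startup cost is paid a single time.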


Copyright (C) 2011 Aleksander Skjæveland Larsen

Distributed under the Eclipse Public License, the same as Clojure.
