Korp API for Python
A Python library for the Korp API. It provides an easy way to query Korp systems for language corpora.
Installation
sudo pip install korp
Usage
You can initialise Korp with either a service_name (språkbanken, kielipankki or GT) or the url of your Korp API, such as https://korp.csc.fi/cgi-bin/korp.cgi
The following example gets all concordances from the North Sami corpora in Giellatekno Korp for the query [pos="A"] "go" [pos="N"]:
from korp.korp import Korp
korppi = Korp(service_name="GT")  # uses Giellatekno; "kielipankki" and "språkbanken" are the other possible service_name values
corpora = korppi.list_corpora("SME")  # lists the corpora whose names start with the North Sami language code
number_of_results, concordances = korppi.all_concordances('[pos="A"] "go" [pos="N"]', corpora)
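The query in the example above is a plain string in CQP syntax, so it can be assembled programmatically when the pattern varies. The following is a minimal sketch of that idea. The helpers token and build_query are hypothetical names made up for this illustration and are not part of the korp library; the sketch also assumes that values contain no double quotes, since it does no escaping.

```python
# Hypothetical helpers for illustration only; they are NOT part of the korp library.

def token(word=None, **attrs):
    """Build a single CQP token expression.

    token("go")     -> "go"        (a literal word form)
    token(pos="A")  -> [pos="A"]   (an attribute constraint)
    token()         -> []          (matches any token)
    """
    if word is not None and attrs:
        raise ValueError("give either a word form or attributes, not both")
    if word is not None:
        return '"{}"'.format(word)
    if not attrs:
        return "[]"
    # Several constraints on one token are joined with "&" in CQP.
    conditions = ['{}="{}"'.format(key, value) for key, value in sorted(attrs.items())]
    return "[" + " & ".join(conditions) + "]"


def build_query(*tokens):
    """Join token expressions into one query: consecutive tokens are space-separated."""
    return " ".join(tokens)


query = build_query(token(pos="A"), token("go"), token(pos="N"))
print(query)  # [pos="A"] "go" [pos="N"]
```

The resulting string can be passed as the first argument to all_concordances in place of the hand-written query.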
More information
See the Wiki for a complete description, or my blog for a real-life Korp example.
Need NLP solutions for your business?
My company, Rootroo, offers consulting on multilingual NLP tasks. We have a strong academic background in state-of-the-art AI solutions for every NLP need. Just contact us, we won't bite.
Cite
If you use this in an academic publication, I would be ever so grateful if you cited it as follows:
Mika Hämäläinen. (2018, January 9). Python Korp Library (Version v1). Zenodo. http://doi.org/10.5281/zenodo.1143374
Licence
Apache License 2.0 (C) 2017-2019 Mika Hämäläinen