KorAP web service client package for Python

Project Status: Active – The project has reached a stable, usable state and is being actively developed.


Python client wrapper package to access the web service API of the KorAP Corpus Analysis Platform developed at IDS Mannheim. Currently, this is not a native Python package: internally, it uses KorAP's client package for R via rpy2. rpy2 also automatically translates between R data frames (or tibbles) and pandas DataFrames.


1. Install the latest R version from CRAN

or, alternatively, on some recent Linux distributions:

#### Debian / Ubuntu
sudo apt-get install -y r-base r-base-dev r-cran-tidyverse r-cran-r.utils r-cran-pixmap r-cran-webshot r-cran-ade4 r-cran-segmented r-cran-purrr r-cran-dygraphs r-cran-cvst r-cran-quantmod r-cran-graphlayouts r-cran-rappdirs r-cran-ggdendro r-cran-seqinr r-cran-heatmaply r-cran-igraph r-cran-plotly libcurl4-gnutls-dev libssl-dev libfontconfig1-dev libsecret-1-dev libxml2-dev libsodium-dev python3-pip python3-rpy2 python3-pandas

#### Fedora / CentOS / RHEL
sudo yum install -y R R-devel libcurl-devel openssl-devel libxml2-devel libsodium-devel python3-pandas

2. Install the RKorAPClient package

Start R and run:

install.packages('RKorAPClient', repos='')

or install RKorAPClient from the package installation menu entry.

3. Install the Python package

On Linux and macOS:

python3 -m pip install KorAPClient

On Windows:

py -m pip install KorAPClient
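
Before running the first query, it can help to confirm that the prerequisites from the steps above are visible to the Python interpreter. The following is a minimal sketch using only the standard library. The function name `check_prerequisites` is hypothetical and not part of KorAPClient, and the sketch assumes that the R executable is called `R` and is on the `PATH`.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report which prerequisites are discoverable (hypothetical helper)."""
    # Assumption: the R interpreter is installed as an executable named "R".
    status = {"R": shutil.which("R") is not None}
    # find_spec() only locates the modules; it does not import them.
    for module in ("rpy2", "pandas", "KorAPClient"):
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Note that this only checks the Python side and the R executable; whether the RKorAPClient package is installed inside R is not verified here.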


The core classes and methods to access the KorAP API are documented in the KorAPClient API documentation. For additional, mostly static helper functions, please refer to the Reference Manual of RKorAPClient for now. For translating R syntax to Python and vice versa, refer to the rpy2 Documentation.

Please note that some arguments of the original RKorAPClient functions use characters that are not allowed in Python keyword argument names. In these cases, you can use Python's **kwargs syntax instead. For example, to let frequencyQuery interpret the queries as alternative variants and return their proportions instead of relative frequencies, you can write:

from KorAPClient import KorAPConnection
KorAPConnection(verbose=True) \
    .frequencyQuery(['"Wissenschaftler.*"', '"Wissenschafter.*"'],
                    **{"as.alternatives": True})
query totalResults vc webUIRequestUrl total f conf.low conf.high
1 "Wissenschaftler.*" 942053 1080268 0.872055 0.871423 0.872684
2 "Wissenschafter.*" 138215 1080268 0.127945 0.127316 0.128577
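
The `**{...}` form works because Python only restricts keyword names when they are written literally in the call; a dictionary unpacked with `**` can carry arbitrary string keys into a function that accepts `**kwargs`. The sketch below illustrates this with a hypothetical stand-in function, `frequency_query_stub`, which is not part of KorAPClient. It takes hit counts as input instead of querying a server and computes each query's share among the alternatives, which is how the `f` column above relates to `totalResults` and `total`.

```python
def frequency_query_stub(hits, **kwargs):
    """Hypothetical stand-in for illustration; not the KorAPClient API.

    hits maps each query string to its number of hits.
    """
    # "as.alternatives" is not a valid Python identifier, so it can only
    # arrive here through **kwargs.
    as_alternatives = kwargs.get("as.alternatives", False)
    if not as_alternatives:
        # Without the flag, this stub simply returns the raw counts.
        return dict(hits)
    total = sum(hits.values())
    return {query: count / total for query, count in hits.items()}


# Hit counts taken from the totalResults column of the output above.
hits = {'"Wissenschaftler.*"': 942053, '"Wissenschafter.*"': 138215}
proportions = frequency_query_stub(hits, **{"as.alternatives": True})
for query, share in proportions.items():
    print(f"{query}: {share:.6f}")
```

Writing `frequency_query_stub(hits, as.alternatives=True)` would be a syntax error, which is why the dictionary form is needed.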


Frequencies of "Hello World" over years and countries

from KorAPClient import KorAPClient, KorAPConnection
import altair as alt
import pandas as pd

QUERY = "Hello World"
df = pd.DataFrame(range(2010, 2019), columns=["Year"], dtype=str) \
    .merge(pd.DataFrame(["DE", "CH"], columns=["Country"]), how="cross")
df["vc"] = "textType=/Zeit.*/ & pubPlaceKey = " + df.Country + " & pubDate in " + df.Year
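# Run one frequency query per virtual corpus and join the results back on "vc"
# (assumption: the virtual corpora are passed as the second argument).
df = KorAPClient.ipm(KorAPConnection().frequencyQuery(QUERY, df["vc"])).merge(df)

alt.Chart(df).mark_line(point=True).encode(y="ipm", x="Year:T", color="Country", href="webUIRequestUrl")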

Frequency per million words of "Hello World" in DE vs. CH from 2010 to 2018 in newspapers and magazines
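
The cross merge in the example above pairs every year with every country and derives one virtual corpus definition per pair. The same construction can be sketched with the standard library alone, which may make the generated `vc` strings easier to inspect. The helper names `build_vcs` and `ipm` are hypothetical. `ipm` assumes the usual definition of instances per million (hits divided by corpus size, times one million) and may differ in detail from what `KorAPClient.ipm` returns; for example, no confidence intervals are computed here.

```python
from itertools import product


def build_vcs(years, countries):
    """Build one virtual corpus definition per (year, country) pair."""
    return [
        {
            "Year": str(year),
            "Country": country,
            "vc": f"textType=/Zeit.*/ & pubPlaceKey = {country} & pubDate in {year}",
        }
        for year, country in product(years, countries)
    ]


def ipm(total_results, total):
    """Instances per million tokens (assumed definition, see lead-in)."""
    return total_results * 1_000_000 / total


rows = build_vcs(range(2010, 2019), ["DE", "CH"])
print(len(rows))        # number of virtual corpora
print(rows[0]["vc"])    # first generated definition
```

Each generated row corresponds to one point in the chart: one query is evaluated per virtual corpus, and the year and country are kept alongside so the results can be plotted.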

Identify in … setzen light verb constructions by the collocationAnalysis method


from KorAPClient import KorAPConnection

kcon = KorAPConnection(verbose=True)
results = kcon.collocationAnalysis("focus(in [tt/p=NN] {[tt/l=setzen]})")  # optional keyword arguments (e.g. context window sizes) are omitted here
results['collocate'] = "[" + results['collocate'] +"](" + results['webUIRequestUrl'] +")"
print(results[['collocate', 'logDice', 'pmi', 'll']].head(10).round(2).to_markdown(floatfmt=".2f"))
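|    | collocate         | logDice | pmi   | ll        |
|---:|:------------------|--------:|------:|----------:|
|  1 | Szene             |   10.37 | 11.54 | 824928.58 |
|  2 | Gang              |    9.65 | 10.99 | 366993.93 |
|  3 | Verbindung        |    9.20 | 10.34 | 347644.75 |
|  4 | Kenntnis          |    9.15 | 10.67 | 206902.89 |
|  5 | Bewegung          |    8.80 |  9.91 | 264577.07 |
|  6 | Brand             |    8.76 |  9.97 | 210654.43 |
|  7 | Anführungszeichen |    8.06 | 12.52 |  54148.31 |
|  8 | Kraft             |    7.94 |  8.91 | 189399.70 |
|  9 | Beziehung         |    6.92 |  8.29 |  37723.54 |
| 10 | Relation          |    6.64 | 10.24 |  17105.84 |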
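
The columns `logDice`, `pmi` and `ll` are association measures that rank collocates by how strongly they are attracted to the node. As a rough orientation, the sketch below computes common textbook variants of pointwise mutual information and logDice from raw frequencies. This is an assumption for illustration only: the exact formulas used by RKorAPClient (for example, how the context window size enters the expected frequency) may differ, and the function names and example counts are hypothetical.

```python
import math


def pmi(f_xy, f_x, f_y, n):
    """Pointwise mutual information: log2 of observed over expected frequency."""
    expected = f_x * f_y / n  # co-occurrences expected under independence
    return math.log2(f_xy / expected)


def log_dice(f_xy, f_x, f_y):
    """logDice: 14 plus log2 of the Dice coefficient; the maximum is 14."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Invented counts, not taken from any corpus:
# f_xy = co-occurrence frequency, f_x and f_y = frequencies of node and collocate,
# n = corpus size in tokens.
print(round(pmi(10, 100, 100, 10_000), 2))
print(round(log_dice(10, 100, 100), 2))
```

Unlike PMI, logDice does not depend on the corpus size, which makes its values easier to compare across virtual corpora of different sizes.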

Command Line Invocation

The Python KorAP client can also be called from the command line and shell scripts:

$ korapclient -h
usage: python -m KorAPClient [-h] [-v] [-l QUERY_LANGUAGE] [-u API_URL] [-c VC [VC ...]] [-q QUERY [QUERY ...]]

Send a query to the KorAP API and print results as tsv.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose
  -l QUERY_LANGUAGE, --query-language QUERY_LANGUAGE
  -u API_URL, --api-url API_URL
                        Specify this to access a corpus other that DeReKo.
  -c VC [VC ...], --vc VC [VC ...]
                        virtual corpus definition[s]
  -q QUERY [QUERY ...], --query QUERY [QUERY ...]
                        If not specified only the size of the virtual corpus will be queried.

  python -m KorAPClient -v --query "Hello World" "Hallo Welt" --vc "pubDate in 2017" "pubDate in 2018" "pubDate in 2019"
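
The usage text shows that `-q` and `-c` each accept one or more values, so a single call can combine several queries with several virtual corpora. The following `argparse` sketch mirrors the documented options to show how such a command line is split into lists. It is a reconstruction from the help text above, not the package's actual parser, and it does not contact the API; treating `-v` as a simple on/off flag is an assumption.

```python
import argparse


def build_parser():
    """Illustrative parser reconstructed from the usage text (not the real one)."""
    parser = argparse.ArgumentParser(
        prog="python -m KorAPClient",
        description="Send a query to the KorAP API and print results as tsv.",
    )
    # Assumption: -v is a boolean flag.
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-l", "--query-language")
    parser.add_argument("-u", "--api-url")
    # nargs="+" collects one or more values into a list.
    parser.add_argument("-c", "--vc", nargs="+",
                        help="virtual corpus definition[s]")
    parser.add_argument("-q", "--query", nargs="+")
    return parser


# Same arguments as in the example invocation above.
args = build_parser().parse_args([
    "-v",
    "--query", "Hello World", "Hallo Welt",
    "--vc", "pubDate in 2017", "pubDate in 2018", "pubDate in 2019",
])
print(args.query)
print(args.vc)
```

Because each quoted string becomes a separate list element, queries or virtual corpus definitions that contain spaces must be quoted in the shell, as in the example invocation.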

Accessed API Services

By using KorAPClient, you agree to the respective terms of use of the accessed KorAP API services, which are printed when a connection is opened.

Development and License

Author: Marc Kupietz

Copyright (c) 2021, Leibniz Institute for the German Language, Mannheim, Germany

This package is developed as part of the KorAP Corpus Analysis Platform at the Leibniz Institute for the German Language (IDS).

It is published under the BSD-2 License.

Contributors: Ines Pisetta, Nils Diewald

To cite this work, please refer to: Kupietz et al. (2020, 2022), below.


Contributions are very welcome!

Your contributions should ideally be committed via our Gerrit server to facilitate reviewing (see Gerrit Code Review - A Quick Introduction if you are not familiar with Gerrit). However, we are also happy to accept comments and pull requests via GitHub.

Please note that, unless you explicitly state otherwise, any contribution intentionally submitted for inclusion into this software shall, like the software itself, be under the BSD-2 License.