CLARIAH chaining search is a Python library and Jupyter web interface to easily combine exploration of linguistic resources published in the CLARIN/CLARIAH infrastructure, such as corpora, lexica and treebanks. CLARIAH chaining search is developed by the Dutch Language Institute (INT).
Linguistic resources, such as lexica and corpora, are usually published as web applications, where users issue a search term and a number of results are shown in the browser. However, connecting multiple web applications in a single process for research and analysis is a difficult task.
As a solution, CLARIAH chaining search supplies a platform in which search and analysis operations can be freely combined in a single interface. One can build customizable workflows with any number of steps, in which heterogeneous resources (specifically from the CLARIN ecosystem) can be sequentially searched and quantitatively analysed. Any step in such a workflow is to be built programmatically, by means of the Python programming language. Working examples of both simple and complex workflows are provided as a reference for the user.
- `Examples.ipynb` gives a number of case studies of accessing and chaining together lexica, corpora and treebanks.
- The `contrib` folder contains a number of more specific case studies. `Case_study_paper.ipynb` explores differences between social class and gender in 17th and 18th century sea letters, in the Letters as Loot corpus.
- Use `Sandbox.ipynb` to start chaining yourself.
- For a tutorial, refer to our Quickstart.
- Reference of our library chaininglib, described in the documentation (online or local (not for Azure cloud instance)).
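To give a feel for what "chaining" means before opening the notebooks, here is a minimal, self-contained sketch of a three-step workflow: look up word forms in a lexicon, search a corpus for those forms, and count the results. It deliberately does not use `chaininglib`. The data (`TOY_LEXICON`, `TOY_CORPUS`) and the functions (`lookup_wordforms`, `search_corpus`, `count_forms`) are hypothetical stand-ins invented for illustration. For the real API, see `Examples.ipynb` and the documentation.

```python
from collections import Counter

# Toy stand-ins for real resources (hypothetical data, NOT taken from
# any CLARIN/CLARIAH lexicon or corpus).
TOY_LEXICON = {
    "walk": ["walk", "walks", "walked", "walking"],
    "sail": ["sail", "sails", "sailed", "sailing"],
}
TOY_CORPUS = [
    "The ship sailed at dawn",
    "He walks to the harbour and sails home",
    "They walked along the quay",
]


def lookup_wordforms(lemma):
    """Step 1 (lexicon search): return all known word forms of a lemma."""
    return TOY_LEXICON.get(lemma, [])


def search_corpus(wordforms):
    """Step 2 (corpus search): return (wordform, sentence) pairs for each hit."""
    wanted = set(wordforms)
    hits = []
    for sentence in TOY_CORPUS:
        for token in sentence.lower().split():
            if token in wanted:
                hits.append((token, sentence))
    return hits


def count_forms(hits):
    """Step 3 (quantitative analysis): count hits per word form."""
    return Counter(form for form, _ in hits)


# Chain the steps: the output of each step is the input of the next.
forms = lookup_wordforms("sail")
hits = search_corpus(forms)
frequencies = count_forms(hits)
print(frequencies)
```

The point of the sketch is the shape of the workflow: each step is an ordinary Python function, so steps can be reordered, repeated or replaced freely.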
The notebook can be run online on Azure. Create an account, clone the notebook, and you can run it in the cloud!
- Chaining search is a Jupyter notebook, which depends on Python 3, pip (PyPI) and venv. Please first install Python 3 and pip via your package management system. E.g. for Ubuntu:
  `sudo apt install python3-pip python3-venv`
- Now, run our install script in a terminal, as a normal user (without `sudo`):
  `./install.sh`
  If permission is denied, issue the following command once:
  `sudo chmod +x install.sh`
  and then run the install script.
- Every time you want to run the notebook, run the `run.sh` script as a normal user (without `sudo`):
  `./run.sh`
  A browser window will open. Now, click `Sandbox.ipynb` or `Examples.ipynb`. The first time you use it, pick the kernel `env` from menu `Kernel > Change kernel > env`.
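If the install script fails, it can help to confirm that the prerequisites from the first step are actually present. The snippet below is a rough sketch of such a check and is not part of the Chaining search scripts. The function name `check_prerequisites` is made up for this example. It also checks `ensurepip`, on the assumption that (as on Debian and Ubuntu) this is the part supplied by the `python3-venv` package. Run it with the same `python3` you intend to use for the install.

```python
import importlib.util
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite name to True (found) or False."""
    return {
        "python3": sys.version_info.major >= 3,
        # find_spec returns None when a top-level module cannot be found.
        "pip": importlib.util.find_spec("pip") is not None,
        "venv": importlib.util.find_spec("venv") is not None,
        # Assumption: on Debian/Ubuntu, ensurepip ships with python3-venv.
        "ensurepip": importlib.util.find_spec("ensurepip") is not None,
    }


status = check_prerequisites()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything reported as `MISSING` should be installed through your package management system before running `./install.sh` again.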
Chaining search can be easily installed using our install script. This will install all prerequisites for Chaining search.
- Open a command prompt (Windows key + R, then issue "cmd").
- Change to the Chaining search directory (the directory where this README is located):
cd CHAINING\SEARCH\DIRECTORY
- If you don't have Python yet, install it now:
python_install.bat
- Close the command prompt after this (required!)
Now we're ready to install our notebook:
- Open a command prompt (again: Windows key + R, then type "cmd").
- Change to the Chaining search directory (the directory where this README is located):
cd CHAINING\SEARCH\DIRECTORY
- Invoke the install script:
install.bat
Every time you would like to run chaining search, invoke our run script:
- Open a command prompt (Windows key + R, then issue `cmd`).
- Change to the Chaining search directory (the directory where this README is located):
cd CHAINING\SEARCH\DIRECTORY
- Invoke the run script:
run.bat
- A browser window will open. Now, click `Sandbox.ipynb` or `Examples.ipynb`. The first time you use it, pick the kernel `env` from menu `Kernel > Change kernel > env`.
If you encounter any bugs or errors, please let us know via our GitHub issue tracker or send an e-mail to servicedesk@ivdnt.org.