CLARIAH offers humanities scholars a Common Lab that provides access to large collections of digital resources and innovative tools for research.

  • FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations is supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processi…

    Python 6 Updated Aug 17, 2018
  • grlc builds Web APIs using shared SPARQL queries

    CSS 49 9 MIT Updated Aug 17, 2018
  • A set of workflows for corpus building through OCR, post-correction and Natural Language Processing

    Groovy 1 2 GPL-3.0 Updated Aug 16, 2018
  • B&G LABS experimental space: React-based UI components, testing LABS APIs, etc.

    CSS 7 7 Unlicense Updated Aug 16, 2018
  • HTML MIT Updated Aug 15, 2018
  • Data for Frog, mandatory

    Lex 1 GPL-3.0 Updated Aug 14, 2018
  • LaMachine - A software distribution of our in-house NLP software and some third-party NLP software, available as a Virtual Machine, a Docker image, or a local compilation/installation script

    Shell 8 GPL-3.0 Updated Aug 13, 2018
  • Proposal for crosswalks between a number of video annotation tools, including the CLARIAH Web Annotation tool, ELAN, FrameTrail, VIAN and Waldorf.js.

    1 Updated Aug 13, 2018
  • Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

    C++ 1 8 GPL-3.0 Updated Aug 6, 2018
  • Command-line utilities for working with the Format for Linguistic Annotation (FoLiA), powered by libfolia (C++), written by Ko van der Sloot

    C++ 2 GPL-3.0 Updated Aug 6, 2018
  • Unicode tokeniser. Ucto tokenises text files: it separates words from punctuation and splits sentences. It also offers other basic preprocessing steps, such as changing case, all of which you can use to make your text suitable for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules …

    C++ 5 GPL-3.0 Updated Aug 6, 2018
  • FoLiA library for C++

    C++ 4 GPL-3.0 Updated Aug 6, 2018
  • Guidelines for software quality & sustainability (CLARIAH WP2 task 54.100)

    TeX 11 2 Updated Aug 2, 2018
  • WP4 SPARQL queries using hisco data

    Updated Jul 31, 2018
  • SERPENS: SEaRch PEst and Nuisance Species (A CLARIAH Research Pilot Project)

    Jupyter Notebook Apache-2.0 Updated Jul 17, 2018
  • Toad: Trainer Of All Data, the Frog training collection

    C++ 1 GPL-3.0 Updated Jul 13, 2018
  • This repository contains the converter files used to convert CSV files into RDF using COW.

    R MIT Updated Jul 4, 2018
  • 3 MIT Updated Jul 3, 2018
  • Amsterdam Time Machine

    Updated Jun 25, 2018
  • HTML Updated Jun 20, 2018
  • Service for converting CSV to the CSVW RDF format using COW

    Python 1 1 Updated Jun 18, 2018
  • This repository holds some schemas used by tool and service metadata specifications

    Updated Jun 14, 2018
  • humigec project

    Updated Jun 8, 2018
  • Safely convert IRI-like string to IRI.

    Python 2 1 MIT Updated Jun 6, 2018
  • Current historical studies of career mobility often focus on the linkage of personal records such as baptism records. More qualitative sources, such as biographies, contain vital information as well, but are labour-intensive to process. We propose a combination of Robust Semantic Parsing and Linked Data conversion tools to automatically derive career…

    TeX 2 MIT Updated Jun 5, 2018
  • Queries related to github.com/CLARIAH/BdVteaching

    Updated May 30, 2018
  • teaching materials for a replication study using Linked Data

    GPL-3.0 Updated May 28, 2018
  • wp3-clam

    Forked from proycon/clam

    Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command-line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.

    Python 12 GPL-3.0 Updated May 23, 2018
  • PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Mor…

    Python 55 GPL-3.0 Updated May 23, 2018
  • wp3-flat

    Forked from proycon/flat

    FoLiA Linguistic Annotation Tool - Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations, a wide variety of linguistic annotation ty…

    JavaScript 10 GPL-3.0 Updated May 11, 2018
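
The PyNLPl entry above mentions basic tasks such as extracting n-grams and building frequency lists. As a rough illustration of what those two tasks involve, here is a minimal sketch in plain Python using only the standard library. It does not use PyNLPl's own API; the function names `ngrams` and `frequency_list` are hypothetical and chosen only for this example, and whitespace splitting stands in for a real tokeniser such as Ucto.

```python
from collections import Counter
from typing import Hashable, Iterable, Iterator, List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every contiguous run of n tokens (hypothetical helper, not PyNLPl's API)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of length L contains L - n + 1 n-grams (none if L < n).
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def frequency_list(items: Iterable[Hashable]) -> List[Tuple[Hashable, int]]:
    """Return (item, count) pairs sorted from most to least frequent."""
    return Counter(items).most_common()


if __name__ == "__main__":
    # Naive whitespace tokenisation, assumed here only to keep the sketch short.
    tokens = "the cat sat on the mat".split()
    print(list(ngrams(tokens, 2)))
    print(frequency_list(tokens))
```

The same `frequency_list` helper works on the output of `ngrams`, since tuples are hashable, so a bigram frequency list is simply `frequency_list(ngrams(tokens, 2))`.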