Meresco Components are components for building search engines, repositories, and archives, based on Meresco Core
Toolbox for OCR post-correction
Meresco Lucene is a set of components and tools to integrate Lucene (based on PyLucene 4.3) into Meresco
Loader software for automated imaging of optical media with Nimbie disc robot
Various resources and documentation related to the nl-menu recovery efforts
Narralyzer is a narrative analyzer
Tool for extracting topics, keywords and their collocates from a Dutch corpus. Includes and extends the functionality of the Keyword Generator.
Create ingest-ready SIPs from batches of optical media images
Saves URLs of Leesplein.nl to the Internet Archive's Wayback Machine
Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia descriptions using either a binary SVM classifier or a neural net.
Predict news article topics and DBpedia description topics and type.
Meresco Html is a template engine based on generators, and a sequel to Slowfoot. It is also known as DynamicHtml or Seecr Html.
Web interface to manually annotate named entity mentions in newspaper articles with the correct DBpedia link(s), if any. Produces labeled data sets for training and evaluating the DAC Entity Linker.
Scripts for quality assessment of e-books
Collection of Python scripts to build a Solr index from selected Dutch and English DBpedia dumps.
Bash script that performs file format identification on all files in a directory tree using Apache Tika
Classified Historical Newspaper Images
Book back recognition
Verify size of ISO 9660 image against Volume Descriptor fields
Python API for KB data-services
Named Entities Recognition Annotator Tool for Europeana Newspapers
Automated JP2 profiling for digitisation batches
Bulk downloader of web resources via OAI-PMH
Advertisement search interface based on image similarity.