WHaverals

Wouter Haverals WHaverals

Stars

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 6,087 488 Updated Jul 11, 2024

malcolmosh / dailypi

An e-paper dashboard for a Raspberry Pi Zero W.

Python 53 3 Updated Oct 27, 2024

microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 23,185 2,308 Updated Mar 7, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 6,082 521 Updated Mar 7, 2025

X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 2,128 127 Updated Dec 24, 2024

inseq-team / inseq

Interpretability for sequence generation models 🐛 🔍

Python 406 36 Updated Nov 10, 2024

dasmiq / passim

Detect and align similar passages

Python 98 15 Updated Feb 3, 2025

UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

JavaScript 187 24 Updated Feb 5, 2025

ropensci / textreuse

Detect text reuse and document similarity

R 199 34 Updated Feb 14, 2025

Living-with-machines / DeezyMatch_tutorials

Collection of tutorials for DeezyMatch (https://github.com/Living-with-machines/DeezyMatch)

Jupyter Notebook 7 Updated Oct 16, 2024

codebox / homoglyph

A big list of homoglyphs and some code to detect them

JavaScript 572 69 Updated Aug 22, 2024

chardet / chardet

Python character encoding detector

Python 2,231 262 Updated Jan 13, 2025

kensho-technologies / qwikidata

Python tools for interacting with Wikidata

Python 152 18 Updated Oct 28, 2023

Princeton-CDH / viapy

VIAF via Python

Python 10 3 Updated Apr 24, 2024

ixc / python-edtf

Python 56 19 Updated Oct 15, 2024

rspeer / python-ftfy

Fixes mojibake and other glitches in Unicode text, after the fact.

Python 3,869 121 Updated Oct 30, 2024

ekzhu / datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Python 2,654 297 Updated Jun 4, 2024

martiansideofthemoon / style-transfer-paraphrase

Official code and data repository for our EMNLP 2020 long paper "Reformulating Unsupervised Style Transfer as Paraphrase Generation" (https://arxiv.org/abs/2010.05700).

HTML 235 46 Updated Jun 13, 2022

fuzhenxin / Style-Transfer-in-Text

Paper List for Style Transfer in Text

1,620 194 Updated Mar 16, 2023

oumi-ai / oumi

Everything you need to build state-of-the-art foundation models, end-to-end.

Python 7,653 543 Updated Mar 7, 2025

huggingface / segment-anything-2

Forked from facebookresearch/sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 61 6 Updated Oct 10, 2024

letta-ai / letta

mlabonne / llm-course

hassonlab / neurohack

StNamesLab / StreetNamesDatabase

EdAbel / setlist-variety

bojone / labse

neulab / awesome-align

bfsujason / bertalign

QwenLM / Qwen2.5-Coder