hacking 🎧 , UTC +2






@dbmdz @flairNLP

👋 Hi there

I now work at the Bavarian State Library 📚. Please visit, watch and star our pre-trained language models repo!

I'm currently working on the awesome Flair library and love contributing to 🤗 Transformers.

📰 Latest news

Latest news on new language models, PRs and much more!

  • 18.11.2021: Release of new multilingual and monolingual Historic Language Models - in preparation for the upcoming CLEF-HIPE 2022 - see repo here.

  • 23.09.2021: Release of ConvBERTurk (cased and uncased) and ELECTRA (uncased) trained on the Turkish part of the mC4 corpus - see repo here.

  • 07.09.2021: Release of a new, larger German GPT-2 model - see model hub card here.

  • 17.08.2021: Release of a new re-trained German GPT-2 model - see repo here.

  • 05.07.2021: Preprint of the ICDAR 2021 paper "Data Centric Domain Adaptation for Historical Text with OCR Errors" together with Luisa März, Nina Poerner, Benjamin Roth and Hinrich Schütze is out now!

  • 24.06.2021: Turkish Language Model Zoo repo got a new logo from Merve Noyan, please follow her! Additionally, a new Turkish ELECTRA model, trained on the Turkish part of the multilingual C4 dataset, was released. More details here.

  • 03.05.2021: GC4LM: A Colossal (Biased) language model for German was released. Repo with more details here.

  • 27.04.2021: Our paper "Data Centric Domain Adaptation for Historical Text with OCR Errors" was accepted at ICDAR 2021. More details soon!

  • 16.03.2021: Turkish model zoo is still growing! Public release of ConvBERTurk - see repo here.

  • 07.02.2021: Public release of German Europeana DistilBERT and ConvBERT models. Repo with more information is here.

  • 28.01.2021: Expect a new German Europeana ELECTRA Large model incl. a distilled German Europeana BERT model soon 🤗

  • 16.11.2020: Public release of French Europeana BERT and ELECTRA models - see repository here.

  • 16.11.2020: Public release of a German GPT-2 model (incl. fine-tuned model on Faust I and II). Repo with more information is available here.

  • 11.11.2020: Public release of Ukrainian ELECTRA model. Repo is now available here.

  • 11.11.2020: New workstation build (RTX 3090 and Ryzen 9 5900X) is complete! Expect a lot of new Flair/Transformers models in the near future!

  • 02.11.2020: Public release of Italian XXL ELECTRA model. New repo for Italian BERT and ELECTRA models is now available here 🎉

  • 22.10.2020: Preprint of "German's Next Language Model" is now available here. Models are also available on the Hugging Face model hub 🎉

  • 22.10.2020: Our shared task paper "Triple E - Effective Ensembling of Embeddings and Language Models for NER of Historical German", written together with Luisa März, has been released 🎉

  • 30.09.2020: "German's Next Language Model" together with Branden Chan and Timo Möller was accepted at COLING 2020! Expect new language models for German on the Hugging Face model hub soon 🤗

  • 23.09.2020: Flair in version 0.6.1 is out now!

  • 02.09.2020: Slow response time - I'm currently focussing on EACL 2021. Expect great new things 😎

  • 18.08.2020: French BERT model, trained on historic newspapers from Europeana: find the model here and the corresponding repository here.

📃 Publications

📃 Preprints

💬 Contact

Please open an issue in the corresponding repository or tag me (@stefan-it) in issues/prs/commits on GitHub :)

You can also find me on the 🤗 Discussion forum.

Pinned

  1. Turkish BERT/DistilBERT, ELECTRA and ConvBERT models

    Python 285 26

  2. DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models

    107 10

  3. Experiments with Zalando's flair library

    Python 35 5

  4. Language Models for Zalando's flair library

    50 4

  5. Repository for "Towards Robust Named Entity Recognition for Historic German"

    Python 14 3

  6. General-Purpose Neural Networks for Sentence Boundary Detection

    Python 57 6

553 contributions in the last year


Contribution activity

December 2021

Reviewed 1 pull request in 1 repository
dbmdz/flusswerk 1 pull request
2 contributions in private repositories Dec 1
