hacking 🎧 , UTC +2




  • Pro


@dbmdz @flairNLP

👋 Hi there

I'm now working at the Bavarian State Library 📚. Please visit, watch and star our pre-trained language models repo!

I'm currently working on the awesome Flair library and love contributing to 🤗 Transformers.

📰 Latest news

Latest news on new language models, PRs and much more!

  • 05.07.2021: Preprint of the ICDAR 2021 paper "Data Centric Domain Adaptation for Historical Text with OCR Errors" together with Luisa März, Nina Poerner, Benjamin Roth and Hinrich Schütze is out now!

  • 24.06.2021: The Turkish Language Model Zoo repo got a new logo from Merve Noyan, please follow her! Additionally, a new Turkish ELECTRA model, trained on the Turkish part of the multilingual C4 dataset, was released. More details here.

  • 03.05.2021: GC4LM: A Colossal (Biased) language model for German was released. Repo with more details here.

  • 27.04.2021: Our paper "Data Centric Domain Adaptation for Historical Text with OCR Errors" was accepted at ICDAR 2021. More details soon!

  • 16.03.2021: Turkish model zoo is still growing! Public release of ConvBERTurk - see repo here.

  • 07.02.2021: Public release of German Europeana DistilBERT and ConvBERT models. Repo with more information is here.

  • 28.01.2021: Expect a new German Europeana ELECTRA Large model incl. a distilled German Europeana BERT model soon 🤗

  • 16.11.2020: Public release of French Europeana BERT and ELECTRA models - see repository here.

  • 16.11.2020: Public release of a German GPT-2 model (incl. a model fine-tuned on Faust I and II). Repo with more information is available here.

  • 11.11.2020: Public release of Ukrainian ELECTRA model. Repo is now available here.

  • 11.11.2020: New workstation build (RTX 3090 and Ryzen 9 5900X) is complete! Expect a lot of new Flair/Transformers models in the near future!

  • 02.11.2020: Public release of Italian XXL ELECTRA model. New repo for Italian BERT and ELECTRA models is now available here 🎉

  • 22.10.2020: Preprint of "German's Next Language Model" is now available here. Models are also available on the Hugging Face model hub 🎉

  • 22.10.2020: Our shared task paper "Triple E - Effective Ensembling of Embeddings and Language Models for NER of Historical German", written together with Luisa März, has been released 🎉

  • 30.09.2020: "German's Next Language Model" together with Branden Chan and Timo Möller was accepted at COLING 2020! Expect new language models for German on the Hugging Face model hub soon 🤗

  • 23.09.2020: Flair in version 0.6.1 is out now!

  • 02.09.2020: Slow response time - I'm currently focussing on EACL 2021. Expect great new things 😎

  • 18.08.2020: French BERT model, trained on historic newspapers from Europeana: find the model here and the corresponding repository here.

📃 Publications

📃 Preprints

💬 Contact

Please open an issue in the corresponding repository or tag me (@stefan-it) in issues/PRs/commits on GitHub :)

You can also find me on the 🤗 Discussion forum.


  1. Turkish BERT/DistilBERT, ELECTRA and ConvBERT models

    Python, 266 stars, 22 forks

  2. DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models

    94 stars, 8 forks

  3. Experiments with Zalando's flair library

    Python, 35 stars, 5 forks

  4. Language Models for Zalando's flair library

    50 stars, 4 forks

  5. Repository for "Towards Robust Named Entity Recognition for Historic German"

    Python, 14 stars, 3 forks

  6. General-Purpose Neural Networks for Sentence Boundary Detection

    Python, 56 stars, 6 forks

790 contributions in the last year


Contribution activity

July 2021

Opened 4 pull requests in 3 repositories
2 merged
1 merged
1 merged
Reviewed 5 pull requests in 4 repositories
dbmdz/cudami 2 pull requests
dbmdz/solr-ocrhighlighting 1 pull request
dbmdz/imageio-jnr 1 pull request
huggingface/transformers 1 pull request

Created an issue in EMBEDDIA/stacked-ner that received 3 comments

NewsEye dataset question

Hi @EmanuelaBoros , thanks for releasing the code! We have tried to reproduce the results from the "Alleviating Digitization Errors in Named Entity…

Opened 1 other issue in 1 repository
1 open
8 contributions in private repositories Jul 9
