Historical Multilingual and Monolingual TEAMS Models. The following languages (and pretraining corpora) are covered:
- English (British Library Corpus - Books)
- German (Europeana Newspaper)
- French (Europeana Newspaper)
- Finnish (Europeana Newspaper, Digilib)
- Swedish (Europeana Newspaper, Digilib)
- Dutch (Delpher Corpus)
- Norwegian (NCC Corpus)
We pretrain a "Training ELECTRA Augmented with Multi-word Selection" (TEAMS) model:
We pretrain the hmTEAMS model on a v3-32 TPU Pod. All details can be found here.
We perform experiments on various historic NER datasets, such as HIPE-2022 and ICDAR-Europeana; a fine-tuning sketch is shown below. All results, including hyper-parameters, can be found here.
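A minimal fine-tuning sketch using the Flair library, which ships a HIPE-2022 dataset loader; the pretrained model identifier is an assumption and may differ from the actual repository name on the Model Hub:

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Load the AjMC English split of HIPE-2022 (shipped with Flair)
corpus = NER_HIPE_2022(dataset_name="ajmc", language="en")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Assumed model ID for the pretrained hmTEAMS checkpoint -- check the
# hmTEAMS organization on the Model Hub for the exact repository name.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,
)

# Plain fine-tuning head: no CRF, no RNN, no reprojection
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/taggers/hipe2022-ajmc-en",
    learning_rate=5e-5,
    mini_batch_size=16,
    max_epochs=10,
)
```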
Our pretrained hmTEAMS model can be obtained from the Hugging Face Model Hub:
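A minimal loading sketch with the 🤗 Transformers library, assuming a hypothetical model identifier (check the hmTEAMS organization on the Model Hub for the exact repository name):

```python
from transformers import AutoModel, AutoTokenizer

# Assumed model ID -- the exact repository name may differ.
model_id = "hmteams/teams-base-historic-multilingual-discriminator"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Encode a short historical snippet and inspect the contextual embeddings.
inputs = tokenizer("Die Kaiserin ist nach München gereist.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch size, sequence length, hidden size)
```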
We release the following models, fine-tuned on various historic NER datasets (HIPE-2020, HIPE-2022, ICDAR-Europeana):
| Language | Model(s) |
|----------|----------|
| English  | AjMC (HIPE-2022), TopRes19th (HIPE-2022) |
| German   | AjMC (HIPE-2022), NewsEye, HIPE-2020 |
| French   | AjMC (HIPE-2022), ICDAR-Europeana, LeTemps (HIPE-2022), NewsEye, HIPE-2020 |
| Finnish  | NewsEye (HIPE-2022) |
| Swedish  | NewsEye (HIPE-2022) |
| Dutch    | ICDAR-Europeana |
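As a usage sketch, a released tagger can be loaded directly with Flair; the model identifier below is a hypothetical placeholder, standing in for one of the models in the table above:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Hypothetical model ID -- substitute one of the released taggers from the table.
tagger = SequenceTagger.load("hmteams/flair-hipe-2022-ajmc-en")

sentence = Sentence("Uebersetzung der Bacchen des Euripides.")
tagger.predict(sentence)

# Print all recognized named entities with their labels and scores
for entity in sentence.get_spans("ner"):
    print(entity)
```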
- 25.09.2024: All hmTEAMS models are now released under permissive Apache 2.0 license.
- 08.09.2023: Evaluation on German and French HIPE-2020 datasets added here.
- 01.09.2023: Evaluation on German and French NewsEye datasets added here.
- 28.08.2023: Evaluation on TopRes19th dataset added here.
- 27.08.2023: Evaluation on LeTemps dataset added here.
- 06.08.2023: Evaluation on various historic NER datasets is completed. Results can be found here.
- 01.08.2023: hmTEAMS organization can be found on the Model Hub. More information on how to access trained hmTEAMS models is coming soon.
- 25.05.2023: Initial version of this repo.
We thank Luisa März, Katharina Schmid and Erion Çano for their fruitful discussions about Historical Language Models.
Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC). Many thanks for providing access to the TPUs ❤️