wietsedv/low-resource-adapt

Wietse de Vries • Martijn Bartelds • Malvina Nissim • Martijn Wieling

Adapting Monolingual Models: Data can be Scarce when Language Similarity is High

This repository contains everything that is needed to replicate the results in the paper:

Adapting Monolingual Models: Data can be Scarce when Language Similarity is High [Findings of ACL 2021]

Models

The best fine-tuned models for Gronings and West Frisian are available on the HuggingFace model hub:

Lexical layers

These models are identical to BERTje, but with different lexical layers (bert.embeddings.word_embeddings).
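The lexical-layer swap described above can be illustrated with a minimal sketch. It is not the repository's own code: it assumes model weights are available as name-to-tensor mappings (such as the result of PyTorch's `model.state_dict()`), and that the lexical layer is stored under the key `bert.embeddings.word_embeddings.weight`. The function name `swap_lexical_layer` is hypothetical, and the toy values below stand in for real tensors.

```python
from typing import Any, Dict, Mapping

# Assumed state-dict key for the lexical layer (bert.embeddings.word_embeddings).
LEXICAL_KEY = "bert.embeddings.word_embeddings.weight"


def swap_lexical_layer(
    base_state: Mapping[str, Any],
    lexical_state: Mapping[str, Any],
    key: str = LEXICAL_KEY,
) -> Dict[str, Any]:
    """Return a copy of base_state whose lexical layer comes from lexical_state.

    Every other parameter is taken from base_state unchanged.
    """
    if key not in base_state or key not in lexical_state:
        raise KeyError(f"both state dicts must contain {key!r}")
    merged = dict(base_state)  # shallow copy; base_state itself is not modified
    merged[key] = lexical_state[key]
    return merged


# Toy example: nested lists stand in for weight tensors.
base = {
    LEXICAL_KEY: [[0.1, 0.2]],
    "bert.encoder.layer.0.output.dense.weight": [[1.0]],
}
adapted = {
    LEXICAL_KEY: [[0.7, 0.8], [0.9, 1.0]],
    "bert.encoder.layer.0.output.dense.weight": [[5.0]],
}
merged = swap_lexical_layer(base, adapted)
```

In this sketch `merged` keeps the Transformer weights of `base` and takes only the word embeddings from `adapted`. Because the retrained embedding matrix may have a different vocabulary size, the matching tokenizer and model config would need to be used alongside it.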

POS tagging

These models share the same fine-tuned Transformer layers + classification head, but with the retrained lexical layers from the models above.

Development

Conda/mamba dependencies are listed in environment.yml. All scripts and configs needed to replicate the results in the paper are included in this repository. A more extensive usage guide will be provided later.

BibTeX entry

@inproceedings{de-vries-etal-2021-adapting,
    title = "Adapting Monolingual Models: Data can be Scarce when Language Similarity is High",
    author = "de Vries, Wietse  and
      Bartelds, Martijn  and
      Nissim, Malvina  and
      Wieling, Martijn",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.433",
    doi = "10.18653/v1/2021.findings-acl.433",
    pages = "4901--4907",
}
