GranCATs


Introduction

This is a release of the multilingual contrastive adapter set (covering three granularities of natural language text) described in GranCATs: Cross-Lingual Enhancement through Granularity-Specific Contrastive Adapters.

Our goal is to provide an effective and inexpensive method that enables MLLMs (multilingual language models) to align cross-lingual representations well and to boost performance on downstream tasks, without requiring an extensive parallel corpus or heavy computational resources.

You can download the parallel corpus covering 29 languages from here. The corpus contains phrase-level, sentence-level, and paragraph-level text.
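
For reference, here is a minimal loading sketch in Python. It assumes the downloaded corpus is laid out as one tab-separated file per granularity (e.g. phrase.tsv, sentence.tsv, paragraph.tsv) with a source text and its translation on each line; the file names and layout are illustrative assumptions, not the documented format of the release.

```python
# Loading sketch only: the directory name, file names, and TSV layout below
# are assumptions for illustration, not the release's documented format.
from pathlib import Path
from typing import Iterator, Tuple


def read_parallel_pairs(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) text pairs from a tab-separated parallel file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 2:
                yield parts[0], parts[1]


corpus_dir = Path("parallel_corpus")  # hypothetical download location
for granularity in ("phrase", "sentence", "paragraph"):
    pairs = list(read_parallel_pairs(corpus_dir / f"{granularity}.tsv"))
    print(granularity, len(pairs))
```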

What is GranCATs?

GranCATs provides a set of multilingual contrastive adapters: a phrase-level, a sentence-level, and a paragraph-level contrastive adapter. Each granularity-specific adapter can be adaptively plugged into a Transformer layer, strengthening cross-lingual alignment and boosting performance on multilingual understanding tasks (see the sketch below).
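
The sketch below illustrates the general idea rather than the repo's actual implementation: a bottleneck adapter that can be attached after a Transformer layer, trained with an InfoNCE-style contrastive loss that pulls the representations of parallel source/target texts together. All class names, dimensions, and hyperparameters here are assumptions for illustration.

```python
# Minimal sketch (not the repo's actual code): a bottleneck adapter plus an
# in-batch InfoNCE contrastive loss for aligning parallel representations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GranularityAdapter(nn.Module):
    """Down-project -> nonlinearity -> up-project, with a residual connection."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(F.gelu(self.down(hidden_states)))


def info_nce_loss(src: torch.Tensor, tgt: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """The i-th source and i-th target form a positive pair; every other
    in-batch target serves as a negative."""
    src = F.normalize(src, dim=-1)
    tgt = F.normalize(tgt, dim=-1)
    logits = src @ tgt.t() / temperature              # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)


# Usage sketch: pooled MLLM outputs for a batch of parallel phrase/sentence/
# paragraph pairs, passed through the matching granularity-specific adapter.
adapter = GranularityAdapter()
src_pooled = torch.randn(8, 768)   # stand-in for source-language [CLS] embeddings
tgt_pooled = torch.randn(8, 768)   # stand-in for target-language [CLS] embeddings
loss = info_nce_loss(adapter(src_pooled), adapter(tgt_pooled))
loss.backward()
```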

What is released in this repo?

We release the pre-training code, the granularity-specific contrastive adapter set, and the parallel corpus covering 29 languages.

Citation

If you use these models, please cite the following paper:

@inproceedings{10.1145/3583780.3614896,
author = {Liu, Meizhen and He, Jiakai and Guo, Xu and Chen, Jianye and Hui, Siu Cheung and Zhou, Fengyu},
title = {GranCATs: Cross-Lingual Enhancement through Granularity-Specific Contrastive Adapters},
year = {2023},
isbn = {9798400701245},
url = {https://doi.org/10.1145/3583780.3614896},
booktitle = {Proceedings of the 32nd ACM International Conference on Information and Knowledge Management},
pages = {1461–1471},
location = {Birmingham, United Kingdom},
series = {CIKM '23}}

About

This repo provides the pre-training code and the parallel data; you can download them to reproduce our results. We will add the technical details later.
