Spectral Normalization (SN): A very general language embedding normalization procedure that subsumes various previous approaches. SN removes the structural profiles across languages without destroying their intrinsic meaning.

poaboagye/SpecNorm

Normalization of Language Embeddings for Cross-Lingual Alignment

Official repository for the implementation of Spectral Normalization from our ICLR 2022 paper:

Normalization of Language Embeddings for Cross-Lingual Alignment. Prince Osei Aboagye, Yan Zheng, Junpeng Wang, Michael Yeh, Wei Zhang, Liang Wang, Hao Yang, Jeff M. Phillips.

This paper proposes a new, general approach to preprocessing static and contextual word embeddings, called Spectral Normalization. We demonstrate that preprocessing monolingual embeddings with this normalization allows alignment procedures to find better alignments, improving performance on the Bilingual Lexicon Induction (BLI) task as well as on cross-lingual document classification and cross-lingual natural language inference. Moreover, the improvement is broadly useful: it holds for contextual embeddings and for embeddings of non-language data such as genomic data.

Dependencies

  • python
  • numpy

How To Apply Spectral Normalization

Given monolingual word embeddings (static or contextual) or embeddings of non-language data such as genomic data, you can apply Spectral Normalization with the following command:

python IterSpecNorm.py --input_file INPUT_EMBED --output_file OUTPUT_PATH --niter 5 --max_vocab 200000 --beta 2
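
For intuition about what the `--niter` and `--beta` flags might control, here is a minimal NumPy sketch of an iterative normalization loop. It is not the code in `IterSpecNorm.py`. It assumes that each iteration centers the embeddings, clips any singular value above `beta * eta` (with `eta = sqrt(||X||_F^2 / d)`), and then rescales every row to unit length. The function names, the clipping rule, and the order of the steps are all assumptions made for illustration; consult the script and the paper for the actual procedure.

```python
import numpy as np


def spectral_normalize(X, beta=2.0):
    """Clip singular values of X above beta * eta (assumed rule, see lead-in)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = X.shape[1]
    # eta: root of the average squared singular value per dimension (assumption).
    eta = np.sqrt(np.sum(s ** 2) / d)
    s_clipped = np.minimum(s, beta * eta)
    # Rebuild the matrix with the clipped spectrum.
    return (U * s_clipped) @ Vt


def iterative_spectral_normalization(X, niter=5, beta=2.0):
    """Hypothetical loop: center, spectrally normalize, length-normalize."""
    X = np.asarray(X, dtype=np.float64)
    for _ in range(niter):
        X = X - X.mean(axis=0, keepdims=True)              # mean-center columns
        X = spectral_normalize(X, beta)                    # damp dominant directions
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        X = X / np.maximum(norms, 1e-12)                   # unit-length rows
    return X
```

In this sketch a smaller `beta` clips more of the leading singular values, and a larger `niter` repeats the three steps more times.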

Mapping

After preprocessing the embeddings with Spectral Normalization, you can then align the normalized embeddings using libraries such as MUSE, VecMap, etc.
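
As a rough illustration of the alignment step, the sketch below solves the orthogonal Procrustes problem in closed form with NumPy. This is a standard technique for mapping one embedding space onto another from a seed dictionary; it is not the API of MUSE or VecMap, and the function name `procrustes_align` is hypothetical. It assumes that row `i` of `X_src` and row `i` of `Y_tgt` are the normalized embeddings of a known translation pair.

```python
import numpy as np


def procrustes_align(X_src, Y_tgt):
    """Return the orthogonal W that minimizes ||X_src @ W - Y_tgt||_F."""
    # The SVD of the cross-covariance matrix gives the optimal rotation.
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt


# Usage: recover a known rotation from synthetic "translation pairs".
rng = np.random.default_rng(0)
X_src = rng.normal(size=(100, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))   # random orthogonal matrix
Y_tgt = X_src @ Q
W = procrustes_align(X_src, Y_tgt)
mapped = X_src @ W                              # source vectors in target space
```

Because `W` is constrained to be orthogonal, the mapping preserves distances and angles within the source space, so the effect of the normalization step is not undone by the alignment.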

Citation

If you find anything helpful in this work, please cite our paper:

@inproceedings{aboagye2021normalization,
  title={Normalization of Language Embeddings for Cross-Lingual Alignment},
  author={Aboagye, Prince Osei and Zheng, Yan and Yeh, Chin-Chia Michael and Wang, Junpeng and Zhang, Wei and Wang, Liang and Yang, Hao and Phillips, Jeff},
  booktitle={International Conference on Learning Representations},
  year={2021}
}

Contact

For questions, please email prince@cs.utah.edu
