setu4993/convert-labse-tf-pt

LaBSE

Project

This project converts Google's LaBSE model from TensorFlow to PyTorch. It also supports converting the smaller-LaBSE model and the LEALLA family of models from TensorFlow to PyTorch.

The converted models are uploaded to the HuggingFace Model Hub in PyTorch HF-compatible (original and safetensors), TensorFlow, and Flax formats, along with a compatible tokenizer.

Export

To convert and export the models:

poetry install
poetry run convert_labse --output_path /path/to/models

To update the models on the HuggingFace Model Hub:

# Clone the already uploaded models.
cd /path/to/models
git clone https://huggingface.co/setu4993/LaBSE.git

# Export models anew and update.
cd /path/to/repo
poetry install
poetry run convert_labse --output_path /path/to/models/LaBSE --huggingface_path

Export Commands by Model

  1. LaBSE: poetry run convert_labse --output_path /path/to/models/setu4993/LaBSE --huggingface_path
  2. smaller-LaBSE: poetry run convert_labse --output_path /path/to/models/setu4993/smaller-LaBSE --smaller --huggingface_path
  3. LEALLA-base: poetry run convert_lealla --size base --output_path /path/to/models/setu4993/LEALLA-base --huggingface_path
  4. LEALLA-small: poetry run convert_lealla --size small --output_path /path/to/models/setu4993/LEALLA-small --huggingface_path
  5. LEALLA-large: poetry run convert_lealla --size large --output_path /path/to/models/setu4993/LEALLA-large --huggingface_path
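
The five commands above differ only in the converter, the size flag, and the output directory, so they can be driven from one small script. The following is a minimal sketch, not part of this repository: `MODELS_ROOT`, `DRY_RUN`, `run`, and `export_all` are hypothetical names introduced here, and the script defaults to printing the commands rather than executing them.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the export command for each model listed above.
# MODELS_ROOT, DRY_RUN, run and export_all are hypothetical names for this example.
set -euo pipefail

MODELS_ROOT="${MODELS_ROOT:-/path/to/models/setu4993}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

export_all() {
  # LaBSE and smaller-LaBSE use the convert_labse entry point.
  run poetry run convert_labse --output_path "$MODELS_ROOT/LaBSE" --huggingface_path
  run poetry run convert_labse --output_path "$MODELS_ROOT/smaller-LaBSE" --smaller --huggingface_path
  # The LEALLA family uses convert_lealla with a --size flag.
  for size in base small large; do
    run poetry run convert_lealla --size "$size" --output_path "$MODELS_ROOT/LEALLA-$size" --huggingface_path
  done
}

export_all
```

Running it as-is only lists the five commands, which makes it easy to check the paths first. Setting `DRY_RUN=0` and pointing `MODELS_ROOT` at the directory that holds the cloned model repositories would execute them in sequence, stopping at the first failure because of `set -e`.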

Model Cards

See the model-cards directory for a copy of the model cards.

License

This repository and the conversion code are licensed under the MIT license, but the model is distributed under an Apache-2.0 license.