Falcon

Overview

Falcon is a family of causal decoder-only models built by TII. The largest Falcon checkpoints were trained on at least 1T tokens of text, with a particular emphasis on the RefinedWeb corpus. They are released under the Apache 2.0 license.

Falcon's architecture is modern and optimized for inference, with multi-query attention and support for efficient attention variants such as FlashAttention. Two kinds of checkpoint are available: 'base' models trained only as causal language models, and 'instruct' models that have received further fine-tuning.
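One reason multi-query attention helps at inference time is that all query heads share a single set of keys and values, so the key/value cache kept during generation is smaller. The following is a back-of-the-envelope sketch of that arithmetic. The function name and every dimension in it are illustrative assumptions, not values taken from any Falcon checkpoint, and it assumes pure multi-query attention with exactly one key/value head.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Bytes needed to cache keys and values during generation.

    bytes_per_value=2 corresponds to 16-bit floats.
    """
    # Factor 2: each layer stores one tensor of keys and one of values.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Illustrative dimensions only (hypothetical, not a real Falcon config).
NUM_LAYERS, NUM_HEADS, HEAD_DIM, SEQ_LEN = 32, 64, 64, 2048

# Standard multi-head attention: one key/value head per query head.
multi_head = kv_cache_bytes(NUM_LAYERS, NUM_HEADS, HEAD_DIM, SEQ_LEN)
# Multi-query attention: a single key/value head shared by all query heads.
multi_query = kv_cache_bytes(NUM_LAYERS, 1, HEAD_DIM, SEQ_LEN)

print(f"multi-head:  {multi_head / 2**20:.0f} MiB")   # 1024 MiB
print(f"multi-query: {multi_query / 2**20:.0f} MiB")  # 16 MiB
print(f"reduction:   {multi_head // multi_query}x")   # 64x
```

Under these assumptions the cache shrinks by a factor equal to the number of query heads, which leaves more memory for longer sequences or larger batches.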

As of 2023, Falcon models are among the largest and most powerful open-source language models, and they consistently rank highly on the Open LLM Leaderboard.

Converting custom checkpoints

Falcon models were initially added to the Hugging Face Hub as custom code checkpoints. Falcon is now fully supported in the Transformers library. If you fine-tuned a model from a custom code checkpoint, we recommend converting it to the new in-library format. This should significantly improve stability and performance, especially for generation, and it removes the need to pass trust_remote_code=True.
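As a minimal sketch of what loading an in-library checkpoint can look like, the helper below uses the generic Auto classes without trust_remote_code. The helper name load_falcon is hypothetical, and the default checkpoint "tiiuae/falcon-7b" is only an example. Substitute your own converted checkpoint directory or Hub repository.

```python
def load_falcon(checkpoint="tiiuae/falcon-7b"):
    """Load a Falcon tokenizer and causal LM using the in-library code path.

    `checkpoint` may be a Hub repository id or a local directory holding a
    checkpoint in the in-library format. The default is only an example.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # No trust_remote_code=True: the model class ships with Transformers.
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    return tokenizer, model
```

Calling the helper downloads the weights if they are not cached locally, so expect a large download for the bigger checkpoints.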

You can convert custom code checkpoints to full Transformers checkpoints using the convert_custom_code_checkpoint.py script located in the Falcon model directory of the Transformers library. To use the script, run python convert_custom_code_checkpoint.py --checkpoint_dir my_model. This converts your checkpoint in place, and you can load it from the directory immediately afterwards, e.g. with from_pretrained(). If your model hasn't been uploaded to the Hub, we recommend making a backup before attempting the conversion, just in case.
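Because the conversion happens in place, one cautious way to follow the advice above is to copy the checkpoint directory first and only then run the script. The sketch below is one possible approach. The helper names backup_checkpoint and conversion_command and the ".bak" suffix are hypothetical choices, and the script path is whatever location the script has in your Transformers checkout.

```python
import shutil
import sys
from pathlib import Path


def backup_checkpoint(checkpoint_dir):
    """Copy the checkpoint directory to a sibling '<name>.bak' directory."""
    src = Path(checkpoint_dir)
    dst = src.with_name(src.name + ".bak")
    # copytree raises FileExistsError if the backup directory already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(src, dst)
    return dst


def conversion_command(script_path, checkpoint_dir):
    """Build the argument list for the conversion script described above."""
    return [sys.executable, str(script_path),
            "--checkpoint_dir", str(checkpoint_dir)]


# Usage (shown as comments so nothing is converted by accident):
#   import subprocess
#   backup_checkpoint("my_model")
#   subprocess.run(
#       conversion_command("convert_custom_code_checkpoint.py", "my_model"),
#       check=True,
#   )
```

Using sys.executable runs the script with the same interpreter, and therefore the same installed Transformers version, as the calling process.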

FalconConfig

[[autodoc]] FalconConfig
    - all

FalconModel

[[autodoc]] FalconModel
    - forward

FalconForCausalLM

[[autodoc]] FalconForCausalLM
    - forward

FalconForSequenceClassification

[[autodoc]] FalconForSequenceClassification
    - forward

FalconForTokenClassification

[[autodoc]] FalconForTokenClassification
    - forward

FalconForQuestionAnswering

[[autodoc]] FalconForQuestionAnswering
    - forward