nealchen2003/LangFlow

LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling

arXiv | HuggingFace | Blog

By Yuxin Chen*, Chumeng Liang*, Hangke Sui, Ruihan Guo, Chaoran Cheng, Jiaxuan You, Ge Liu.

LangFlow is the first continuous diffusion language model to rival its discrete counterparts on standard language modeling benchmarks such as LM1B and OpenWebText.

Figures: LangFlow pipeline; evaluation results.

TODO

  • Inference code
  • OpenWebText checkpoint on HuggingFace
  • Training code (after paper acceptance)
  • All trainable checkpoints (after paper acceptance)

Quick Start

1. Install dependencies

conda create -n langflow python=3.12
conda activate langflow
# Install CUDA-enabled torch first (adjust cu124 to match your driver)
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
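
Before moving on, it can help to confirm that the installed torch build actually sees a GPU. The following is a minimal sketch and is not part of the repository; the helper name `cuda_status` is hypothetical, and it relies only on the standard `torch.cuda.is_available()`, `torch.version.cuda`, and `torch.cuda.get_device_name()` calls.

```python
def cuda_status() -> str:
    """Return a one-line summary of the local torch/CUDA setup."""
    try:
        import torch  # imported lazily so the check also works if torch is missing
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} installed, but no CUDA device is visible"
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__} (CUDA {torch.version.cuda}) on {name}"


print(cuda_status())
```

If the summary reports no visible CUDA device, the wheel index in the `pip install torch` line (`cu124` above) probably does not match the local driver.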

2. Download the checkpoint

Download only the safetensors weights file from HuggingFace; there is no need to clone the HF repo:

# Using huggingface-hub CLI
hf download Continuous-Rivals-Discrete/langflow-owt model.safetensors --local-dir ./checkpoints
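
To sanity-check the download without loading any weights, one option is to read only the file header. The sketch below assumes the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header) and uses only the Python standard library; the function names `read_safetensors_header` and `summarize` are hypothetical and not part of the repository.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with Path(path).open("rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be a safetensors file")
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", prefix)
        header_bytes = f.read(header_len)
        if len(header_bytes) != header_len:
            raise ValueError("truncated header; the download may be incomplete")
    return json.loads(header_bytes)


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }
```

For example, `summarize(read_safetensors_header("./checkpoints/model.safetensors"))` would list the tensor names, dtypes, and shapes stored in the downloaded checkpoint.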

3. Run inference

python inference.py \
    --checkpoint ./checkpoints/model.safetensors \
    --num_samples 5 \
    --batch_size 1 \
    --num_steps 1024 \
    --seq_length 1024 \
    --seed 42 \
    --output samples.txt
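
To generate samples for several seeds, the command above can be wrapped in a small driver script. This is a sketch only: it uses just the flags shown above, the helper names `build_inference_command` and `run_seed_sweep` and the `samples_seed{seed}.txt` naming are hypothetical, and it assumes the script is launched from the repository root, where `inference.py` lives.

```python
import subprocess
import sys


def build_inference_command(seed, output,
                            checkpoint="./checkpoints/model.safetensors",
                            num_samples=5, batch_size=1,
                            num_steps=1024, seq_length=1024):
    """Build the argv list for one inference.py run (same flags as above)."""
    return [
        sys.executable, "inference.py",
        "--checkpoint", checkpoint,
        "--num_samples", str(num_samples),
        "--batch_size", str(batch_size),
        "--num_steps", str(num_steps),
        "--seq_length", str(seq_length),
        "--seed", str(seed),
        "--output", output,
    ]


def run_seed_sweep(seeds):
    """Run inference once per seed, writing each run to its own file."""
    for seed in seeds:
        cmd = build_inference_command(seed, f"samples_seed{seed}.txt")
        # check=True stops the sweep as soon as one run fails.
        subprocess.run(cmd, check=True)
```

Calling `run_seed_sweep([0, 1, 42])` would then launch three runs in sequence. Passing the arguments as a list avoids shell quoting issues in paths.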
