Andromeda: Ultra-Fast and Ultra-Intelligent SOTA Language Model 🚀🌌

Welcome to Andromeda, the fastest, most creative, and most reliable language model ever built. Train your own version, run inference, and finetune it with simple plug-and-play scripts. Get started in 10 seconds:

Features

💼 Handle Ultra Long Sequences (32,000-200,000+ context lengths)
⚡ Ultra Fast Processing (32,000+ tokens in under 100ms)
🎓 Superior Reasoning Capabilities

🎯 Principles

Efficiency: Optimize with techniques like Flash Attention, rotary position encodings, and deep normalization.
Flexibility: Adapt to various tasks and domains for wide applications.
Scalability: Designed to scale with resources and data sizes.
Community-Driven: Thrives on contributions from the open-source community.
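
To make the rotary position encodings mentioned under Efficiency more concrete, here is a minimal pure-Python sketch of the general technique. It is not Andromeda's implementation: the function names and the base of 10000 are assumptions chosen for illustration. Each consecutive pair of features is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on the distance between their positions.

```python
import math


def rotary_encode(vec, position, base=10000.0):
    """Rotate consecutive feature pairs of `vec` by position-dependent angles.

    Illustrative helper (hypothetical name), not part of andromeda_torch.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        # Each pair has its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by the angle theta.
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


query = [1.0, 2.0, 3.0, 4.0]
key = [0.5, -1.0, 2.0, 0.0]

# Both position pairs are 2 apart, so the two scores agree up to rounding.
near = dot(rotary_encode(query, 5), rotary_encode(key, 3))
far = dot(rotary_encode(query, 12), rotary_encode(key, 10))
print(near, far)
```

Because the score depends on relative rather than absolute position, the same rotation rule applies at any offset in the sequence, which is one reason the technique is commonly paired with long context windows.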

💻 Install

python3.11 -m pip install --upgrade andromeda-torch

Usage

Forward pass with random inputs

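import torch

from andromeda_torch.configs import Andromeda1Billion

# Use the GPU when available; the model and its input must share a device.
# Assumes Andromeda1Billion() returns a torch.nn.Module.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = Andromeda1Billion().to(device)

# Random token IDs: batch of 1, sequence length 1024
x = torch.randint(0, 256, (1, 1024), device=device)

out = model(x)  # (1, 1024, 20000)
print(out)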

Tokenized inputs

from andromeda_torch import Tokenizer
from andromeda_torch.configs import Andromeda1Billion

model = Andromeda1Billion()
tokenizer = Tokenizer()

encoded_text = tokenizer.encode("Hello world!")
out = model(encoded_text)
print(out)

📚 Training

Set the environment variables:
- ENTITY_NAME: Your wandb project name
- OUTPUT_DIR: Directory to save the weights (e.g., ./weights)
- MASTER_ADDR: Address of the master node for distributed training
- MASTER_PORT: Port of the master node for distributed training
- RANK: Rank of this node in the distributed job
- WORLD_SIZE: Number of GPUs

Configure the training:
- Run accelerate config
- Enable DeepSpeed stage 3
- Run accelerate launch train_distributed_accelerate.py

For more information, refer to the Training SOP.
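
As a sketch of the environment setup listed above, the variables can be set and checked from Python before launching. All values below are hypothetical placeholders for a single-node, 8-GPU run, and the helper names are illustrative; they are not part of the Andromeda codebase.

```python
import os

# Hypothetical single-node, 8-GPU values; replace them with your own setup.
TRAINING_ENV = {
    "ENTITY_NAME": "my-wandb-project",
    "OUTPUT_DIR": "./weights",
    "MASTER_ADDR": "localhost",
    "MASTER_PORT": "29500",
    "RANK": "0",
    "WORLD_SIZE": "8",
}


def apply_training_env(values):
    """Set each variable unless the shell has already defined it."""
    for name, value in values.items():
        os.environ.setdefault(name, value)


def missing_training_vars(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in TRAINING_ENV if not env.get(name)]


apply_training_env(TRAINING_ENV)
print("missing variables:", missing_training_vars())
```

Using setdefault means values exported in the shell take precedence over the placeholders, so the same script works on a laptop and on a cluster where the scheduler sets MASTER_ADDR, RANK, and WORLD_SIZE.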

Todo

Add Yarn Embeddings from zeta

📈 Benchmarks

Speed

Andromeda uses the Triton implementation of Flash Attention 2.0, one of the most reliable attention mechanisms available. It is reported to consume 50x less memory than GPT-3 and 10x less than LLaMA.

We can speed this up even more with dynamic sparse flash attention 2.0.
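
To see why attention memory matters at the context lengths listed under Features, consider that standard attention materializes a seq_len x seq_len score matrix per head, while Flash Attention computes the same result in tiles without storing that matrix in full. The sketch below only estimates the size of that score matrix under stated assumptions (one head, batch size 1, 2 bytes per element as in fp16); it does not reproduce the 50x or 10x figures above, and the function name is illustrative.

```python
def attention_scores_bytes(seq_len, num_heads=1, batch_size=1, bytes_per_element=2):
    """Bytes needed to store the full attention score matrix.

    Standard attention keeps one seq_len x seq_len matrix per head and
    per batch element; this is the quadratic term that tiling avoids.
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_element


for seq_len in (32_000, 200_000):
    gigabytes = attention_scores_bytes(seq_len) / 1e9
    print(f"{seq_len} tokens: {gigabytes:.3f} GB per head")
```

The cost grows with the square of the sequence length, so going from 32,000 to 200,000 tokens multiplies this term by roughly 39.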

License

Apache License
