DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Updated May 14, 2024 - Python
Run Mixtral-8x7B models in Colab or on consumer desktops
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
Mixture-of-Experts for Large Vision-Language Models
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
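The core idea of the sparsely-gated layer in that paper is top-k gating: a learned gate scores every expert per token, keeps only the k highest scores, and renormalizes them with a softmax so each token is routed to just k experts. A minimal NumPy sketch of that gating step (function and variable names here are illustrative, not the repo's API):

```python
import numpy as np

def top_k_gating(x, w_gate, k=2):
    """Sketch of noise-free top-k gating: each token keeps its k
    highest-scoring experts; all other gate values are zeroed."""
    logits = x @ w_gate                                  # (tokens, num_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]           # indices of the k best experts
    masked = np.full_like(logits, -np.inf)               # drop all but the top-k logits
    np.put_along_axis(masked, topk,
                      np.take_along_axis(logits, topk, axis=-1), axis=-1)
    gates = np.exp(masked - masked.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)           # softmax over the kept logits
    return gates, topk

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))       # 4 tokens, hidden size 8
w_gate = rng.standard_normal((8, 6))  # gate for 6 experts
gates, topk = top_k_gating(x, w_gate, k=2)
# each row of `gates` has exactly k nonzero weights summing to 1
```

The paper additionally adds tunable Gaussian noise to the logits before the top-k selection and auxiliary load-balancing losses; those are omitted here for brevity.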
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
Tutel MoE: An Optimized Mixture-of-Experts Implementation
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
Chinese Mixtral mixture-of-experts large models (Chinese Mixtral MoE LLMs)
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
GMoE could be the next backbone model for many kinds of generalization tasks.
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Google Brain, in PyTorch
Implementation of Soft MoE, proposed by the Google Brain Vision team, in PyTorch
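Soft MoE replaces hard token routing with fully differentiable dispatch and combine weights: each expert slot processes a convex combination of all input tokens, and each token's output is a convex combination of all slot outputs. A minimal NumPy sketch with one slot per expert (all names are illustrative assumptions, not the repo's API):

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe(x, phi, experts):
    """x: (tokens, d); phi: (d, num_slots) learnable slot embeddings;
    experts: one callable per slot. Sketch only, one slot per expert."""
    logits = x @ phi                     # (tokens, slots) token-slot affinities
    dispatch = softmax(logits, axis=0)   # per slot: mixture weights over tokens
    combine = softmax(logits, axis=1)    # per token: mixture weights over slots
    slot_in = dispatch.T @ x             # (slots, d) soft inputs, one per slot
    slot_out = np.stack([f(s) for f, s in zip(experts, slot_in)])
    return combine @ slot_out            # (tokens, d) soft-combined outputs

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4))          # 5 tokens, hidden size 4
phi = rng.standard_normal((4, 3))        # 3 slots / experts
experts = [lambda s, w=rng.standard_normal((4, 4)): s @ w for _ in range(3)]
y = soft_moe(x, phi, experts)            # shape (5, 4)
```

Because no token is ever hard-dropped or hard-assigned, there is no load-balancing loss and the whole layer is trainable end to end with plain backprop.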
Fast Inference of MoE Models with CPU-GPU Orchestration
Some personal experiments around routing tokens to different autoregressive attention branches, akin to mixture-of-experts
RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models
[SIGIR'24] The official implementation of MOELoRA.
Large-scale 4D-parallel pre-training for 🤗 transformers with Mixture of Experts *(still work in progress)*
MoEL: Mixture of Empathetic Listeners