Mike Simpson edited this page Jul 18, 2023 · 29 revisions

This page documents the GPU configurations that are known to work (or not work) and what models they are known to work with. If you have used FauxPilot with a GPU and model configuration that's not listed in the table below, please add it!

| GPU Model | VRAM | Number of GPUs | Working? | Models | Notes | Checker |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA Tesla A100 | 80GB | 1 | Yes | 350M, 2B, 6B, 16B | | @leemgs |
| NVIDIA Tesla A100 | 40GB | 1 | Yes | 16B | | @Ghost-Assassin |
| NVIDIA Tesla T4 | 16GB | 1, 2, 4 | Yes | 350M, 2B, 6B | | @Jaeker0512 |
| NVIDIA Tesla V100 | 16GB | 1 | Yes | 350M, 6B | Out of memory when tried 16B-multi-2gpu on 2 such GPUs | @askoldilvento |
| NVIDIA RTX A6000 | 48GB | 1, 2, 4 | Yes | 350M, 2B, 6B, 16B | | @moyix |
| NVIDIA RTX A4000 | 16GB | 1 | Yes | 6B | | @grantharris33 |
| NVIDIA RTX 4090 | 24GB | 1 | Yes | 6B | | @TK009 |
| NVIDIA RTX 3090 | 24GB | 1 | Yes | 2B | | @152334H |
| NVIDIA RTX 3090 | 24GB | 1 | Yes | 6B | Docker-in-WSL2 & fauxpilot-windows | @Frederisk |
| NVIDIA RTX 3090 | 24GB | 1 | Yes | 6B | podman in Linux [tiny tweaks needed] | @mormegil-cz |
| NVIDIA RTX 3080Ti | 12GB | 1 | Yes | 350M, 2B | | @??? |
| NVIDIA RTX 3070Ti | 8GB | 1 | Yes | 350M, 2B | Tested in Docker-in-WSL2 | @m5kro |
| NVIDIA RTX 3060Ti | 8GB | 1 | Yes | 350M, 2B | Tested in Docker-in-WSL | @dewacandra4 |
| NVIDIA RTX 2080Ti | 12GB | 1 | Yes | 350M, 2B, 6B | | @leemgs |
| NVIDIA RTX 2080 | 8GB | 1 | Yes | 350M | Docker-in-WSL, Windows 10, 16GB, slow | @enoris75 |
| NVIDIA RTX 2070 SUPER | 8GB | 1 | Yes | 350M, 2B | Tested in Docker-in-WSL | @SoulRaven80 |
| NVIDIA RTX 2060 SUPER | 8GB | 1 | Yes | 2B | | @xjtu-blacksmith |
| NVIDIA RTX 2060 XC | 12GB | 1 | Yes | 2B | Tested in Docker-in-WSL | @azeemba |
| NVIDIA GTX 1080Ti | 11GB | 1 | Yes | 350M, 2B | | @??? |
| NVIDIA GTX 1060 | 6GB | 1 | Yes | 350M | Docker-in-WSL2 & fauxpilot-windows | @Frederisk |
| NVIDIA GTX 1060 | 6GB | 1 | Yes | 350M | Linux as is | @billyblackburn |
| NVIDIA Titan Xp | 12GB | 1 | Yes | 350M, 2B, 6B | | @leemgs |
| AMD RX6800XT | 16GB | 1 | Yes | 2B | Python Backend Only. Used hack as https://github.com/fauxpilot/fauxpilot/discussions/81#discussioncomment-5785300. triton transformer backend will crash. | @Ghost-Assassin |
| NVIDIA Quadro T2000 | 36GB (4GB dedicated, 32GB shared) | 1 | Yes | 350M, 2B, 6B | ~8s for 10 solution on smallest model. All others using shared memory are extremely slow, 16B loads but doesn't work | @MikeS159 |