DeepSpeed 0.14.2 versions for Linux
LINUX VERSION HERE (Not Windows)
Libaio Requirement
You need libaio-dev or libaio-devel (depending on your Linux flavour); otherwise DeepSpeed will fail with an error at your terminal.
- Debian-based systems
sudo apt install libaio-dev
- RPM-based systems
sudo yum install libaio-devel
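To check whether libaio is already present before installing, something like the following minimal sketch may help. The function name libaio_status and the header location /usr/include/libaio.h are assumptions for illustration; some distributions place the header elsewhere, so a negative result is a hint rather than proof.

```python
import ctypes.util
import os


def libaio_status(header_path="/usr/include/libaio.h"):
    """Report whether the libaio runtime library and header can be found.

    header_path is an assumed default location, not something DeepSpeed
    itself defines; adjust it for your distribution.
    """
    return {
        # Name of the shared library (e.g. "libaio.so.1"), or None if not found
        "runtime_library": ctypes.util.find_library("aio"),
        # The development package is what provides the header file
        "header_present": os.path.isfile(header_path),
    }


if __name__ == "__main__":
    for key, value in libaio_status().items():
        print(f"{key}: {value}")
```

If header_present is False, installing the package for your distribution as shown above is the likely fix.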
DeepSpeed setup/compilation
DeepSpeed is complicated at best. You have to install a DeepSpeed build that matches:
- Your Python version in your Python virtual environment, e.g. 3.10, 3.11, 3.12 etc.
- Your version of PyTorch within your Python virtual environment, e.g. 2.0.x, 2.1.x, 2.2.x, 2.3.x etc.
- The version of CUDA that PyTorch was installed with in your Python virtual environment e.g. 11.8, or 12.1
If you change your version of Python, PyTorch or the CUDA version PyTorch uses within that virtual environment, you will need to uninstall DeepSpeed with pip uninstall deepspeed
and then install the correct matching version.
To understand a filename such as deepspeed-0.14.2+cu121torch2.3-cp312-cp312-manylinux_2_24_x86_64.whl
- deepspeed-0.14.2+ the version of DeepSpeed
- cu121 the CUDA version of PyTorch, in this case cu121 means CUDA 12.1
- torch2.3 the version of PyTorch that it works with.
- cp312-cp312 the version of Python it works with.
- manylinux_2_24_x86_64 states it's a Linux version (for x86_64 systems).
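The breakdown above can be sketched as a small parser. This is only an illustration of the naming scheme used by these files; parse_wheel_name and its regular expression are hypothetical helpers, not part of DeepSpeed or pip, and they assume the CUDA tag ends in a single minor digit (as in cu118 and cu121).

```python
import re

# Pattern for the naming scheme described above (an illustrative assumption,
# not an official DeepSpeed or pip API).
WHEEL_RE = re.compile(
    r"^deepspeed-(?P<deepspeed>[\d.]+)\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)"
    r"-cp(?P<py>\d+)-cp\d+-(?P<platform>.+)\.whl$"
)


def parse_wheel_name(filename):
    """Split a DeepSpeed wheel filename into its version components."""
    match = WHEEL_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised wheel name: {filename}")
    cuda = match["cuda"]  # "121" means CUDA 12.1
    py = match["py"]      # "312" means Python 3.12
    return {
        "deepspeed": match["deepspeed"],
        "cuda": f"{cuda[:-1]}.{cuda[-1]}",
        "torch": match["torch"],
        "python": f"{py[0]}.{py[1:]}",
        "platform": match["platform"],
    }


if __name__ == "__main__":
    name = "deepspeed-0.14.2+cu121torch2.3-cp312-cp312-manylinux_2_24_x86_64.whl"
    print(parse_wheel_name(name))
```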
So you will start your Python virtual environment, then use something like AllTalk's diagnostics to find out what version of:
- Python is installed
- PyTorch is installed
- CUDA version that PyTorch was installed with.
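If you prefer to check these three values by hand, a minimal sketch like the one below can be run inside the activated virtual environment. The function name report_versions is made up for this example; torch.__version__ and torch.version.cuda are the standard PyTorch attributes, and the code falls back to None when PyTorch is not installed.

```python
import platform


def report_versions():
    """Return the Python, PyTorch and CUDA versions seen by this environment."""
    info = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch  # only importable if PyTorch is installed here
    except ImportError:
        return info
    info["torch"] = torch.__version__  # e.g. "2.2.1+cu121"
    info["cuda"] = torch.version.cuda  # e.g. "12.1", or None for CPU-only builds
    return info


if __name__ == "__main__":
    for name, value in report_versions().items():
        print(f"{name}: {value}")
```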
You will then look through the files below, download the correct file to your folder and, with your Python virtual environment still active, run pip install deepspeed-0.14.2+{version here}manylinux_2_24_x86_64.whl
where "version here" is the correct, matching version that you have downloaded.
To be clear, let's say your virtual Python environment is running Python 3.11.6 with PyTorch 2.2.1 and CUDA 12.1; you would want to download deepspeed-0.14.2+cu121torch2.2-cp311-cp311-manylinux_2_24_x86_64.whl
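The same mapping can be written out as a short helper that builds the filename to look for from the three versions. This is a sketch under the naming scheme described earlier; expected_wheel_name is a hypothetical function, and it does not check that a file with that name was actually published.

```python
def expected_wheel_name(python_version, torch_version, cuda_version,
                        deepspeed_version="0.14.2",
                        platform_tag="manylinux_2_24_x86_64"):
    """Build the wheel filename matching the given environment versions."""
    # Only major.minor matters for Python and PyTorch: 3.11.6 -> cp311, 2.2.1 -> 2.2
    py_major, py_minor = python_version.split(".")[:2]
    py_tag = f"cp{py_major}{py_minor}"
    # Drop any local suffix such as "+cu121" before trimming the PyTorch version
    torch_mm = ".".join(torch_version.split("+")[0].split(".")[:2])
    # CUDA 12.1 -> cu121
    cuda_tag = "cu" + "".join(cuda_version.split(".")[:2])
    return (f"deepspeed-{deepspeed_version}+{cuda_tag}torch{torch_mm}"
            f"-{py_tag}-{py_tag}-{platform_tag}.whl")


if __name__ == "__main__":
    print(expected_wheel_name("3.11.6", "2.2.1", "12.1"))
```

The three arguments are the values reported for your own virtual environment, passed as strings.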
Note: You will need to install the Nvidia CUDA Development Toolkit. Version 12.1.0 has been tested and confirmed to work with PyTorch 2.2.1; version 12.4 has been tried and found to be problematic. In Conda Python virtual environments, you can start the Python virtual environment and install this toolkit using the following command: conda install nvidia/label/cuda-12.1.0::cuda-toolkit=12.1
To be absolutely clear, the Nvidia CUDA Development Toolkit is separate from:
- Your graphics card driver version.
- The version of CUDA used by your graphics driver.
- The version of PyTorch or Python on your system and their associated CUDA versions.
Think of the CUDA Development Toolkit like the engine diagnostics tools used by mechanics. These tools are necessary for the development, compilation and testing of CUDA applications (or in this case DeepSpeed). Just as a mechanic's tools are separate from the engine, car model, or the type of fuel the car uses, the CUDA Development Toolkit is separate from your graphics driver, the CUDA version your driver uses, and the versions of PyTorch or Python installed on your system. In other words, the versions do not all have to match exactly.
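To see which toolkit, if any, is visible in the current environment, a rough sketch such as the following asks the toolkit's compiler, nvcc, for its version. The function name toolkit_version_text is made up for this example, and it assumes nvcc is on your PATH once the toolkit is installed; the result may legitimately differ from the CUDA version PyTorch reports.

```python
import shutil
import subprocess


def toolkit_version_text():
    """Return the output of `nvcc --version`, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # toolkit not installed, or not visible in this environment
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    text = toolkit_version_text()
    print(text if text is not None else "nvcc not found on PATH")
```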
Also note, you will see this warning message when AllTalk starts up and DeepSpeed for Linux is installed. It is safe to ignore, as far as AllTalk is concerned.