[REQUEST] - Installing DeepSpeed on Windows! (Correct instructions HERE. Please update the Front page of GitHub) #4729

Open
erew123 opened this issue Nov 26, 2023 · 20 comments

@erew123

erew123 commented Nov 26, 2023

I have detailed on a closed ticket, #3342 (comment), why the current instructions are unclear (with screenshots comparing what the DeepSpeed install routine does against the instructions you currently give, and why they fail). I will also detail some other issues in a separate comment on this ticket.

However, I wouldn't like the instructions below to get lost. These are the full and correct instructions for installing DeepSpeed on Windows. Please update the front page https://github.com/microsoft/DeepSpeed#windows as the instructions there are currently very unclear.

NB: DeepSpeed versions 0.9.x, 0.10.x, 0.11.x and 0.12.x (written as 9.x to 12.x in this thread) all fail to compile on Windows for a multitude of reasons. As such, DeepSpeed v0.8.3 (written as 8.3 in this thread) is the most recent version that builds on Windows. I have detailed this further in the next comment on this ticket.

DeepSpeed Version 8.3 & CUDA 11.8 or CUDA 12.1 - Installation Instructions

  1. Download the 8.3 release of DeepSpeed and extract it to a folder.
  2. Install Visual C++ build tools, such as VS2019 C++ x64/x86 build tools.
  3. Download and install the Nvidia Cuda Toolkit 11.8 or 12.1
  4. Edit your Windows environment variables to ensure that CUDA_HOME and CUDA_PATH are set to your Nvidia Cuda Toolkit path. (The folder above the bin folder that nvcc.exe is installed in). Examples are:

CUDA 11.8
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8

CUDA 12.1
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1

  5. OPTIONAL: Currently only Python 3.9.x is supported. If you do not already have a Python environment, you can install Miniconda, then at a command prompt create and activate your environment with:
    conda create -n pythonenv python=3.9.18
    activate pythonenv
  6. Launch the Command Prompt (cmd) with Administrator privileges, as the build needs admin rights to create symlinked folders.
  7. Install PyTorch 2.1.0 with CUDA 11.8 or CUDA 12.1 into your Python 3.9.x environment, e.g.

CUDA 11.8
activate pythonenv (activate your python environment)
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia

CUDA 12.1
activate pythonenv (activate your python environment)
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia

  8. In your Python environment, double-check that CUDA_HOME and CUDA_PATH still point to the correct location:
    set (lists the Windows environment variables so you can check them; refer to step 4 if they are wrong)
  9. Navigate to your DeepSpeed folder in the Command Prompt:
    cd c:\deepspeed (wherever you extracted it to)
  10. You can try running build_win.bat as-is, but you may need to set options so that certain features are not built, e.g. set DS_BUILD_AIO=0 (see https://www.deepspeed.ai/tutorials/advanced-install/#pre-install-deepspeed-ops). Type the following into the command prompt:
    set DS_BUILD_AIO=0
    set DS_BUILD_SPARSE_ATTN=0
    set DS_BUILD_EVOFORMER_ATTN=0
    build_win.bat (This will start building your wheel file and may take a while)
  11. Now cd dist to go into your dist folder, then pip install deepspeed-YOURFILENAME.whl (or whatever your WHL file is called).
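
For anyone who prefers to drive the build from a script, the flag-setting part of the build step could be sketched in Python roughly as follows. This is only a sketch and not part of DeepSpeed: `build_env_with_disabled_ops` and `run_build` are hypothetical helper names, and the only things taken from the steps above are the three DS_BUILD_* variables and build_win.bat.

```python
import os
import subprocess

# Ops switched off in the steps above; adjust the tuple if you need others.
DISABLED_OPS = ("AIO", "SPARSE_ATTN", "EVOFORMER_ATTN")


def build_env_with_disabled_ops(base_env=None, disabled_ops=DISABLED_OPS):
    """Return a copy of the environment with DS_BUILD_<OP>=0 for each op."""
    env = dict(os.environ if base_env is None else base_env)
    for op in disabled_ops:
        env[f"DS_BUILD_{op}"] = "0"
    return env


def run_build(deepspeed_dir, env=None):
    """Hypothetical wrapper: run build_win.bat inside the extracted folder."""
    env = build_env_with_disabled_ops() if env is None else env
    return subprocess.run(
        ["cmd", "/c", "build_win.bat"], cwd=deepspeed_dir, env=env, check=True
    )
```

Calling `run_build(r"c:\deepspeed")` from an Administrator prompt would then be the scripted equivalent of typing the set lines followed by build_win.bat.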

NB: DeepSpeed versions 9.x, 10.x, 11.x and 12.x will all fail to compile with the above settings.
NB: Python 3.10.x or greater is currently incompatible with the scripts.
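
The double-check of CUDA_HOME and CUDA_PATH described in the steps above can also be done with a few lines of Python instead of reading through the output of set. This is a minimal sketch under the assumption stated in the steps (nvcc.exe lives in the bin folder directly below the toolkit path); `check_cuda_env` is a hypothetical helper name, not a DeepSpeed function.

```python
import os
from pathlib import Path


def check_cuda_env(env=None, nvcc_name="nvcc.exe"):
    """Return a list of problems found with CUDA_HOME / CUDA_PATH (empty if none)."""
    env = os.environ if env is None else env
    problems = []
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
            continue
        # The toolkit path is the folder above the bin folder holding nvcc.exe.
        nvcc = Path(value) / "bin" / nvcc_name
        if not nvcc.is_file():
            problems.append(f"{var}={value} but {nvcc} does not exist")
    home, path = env.get("CUDA_HOME"), env.get("CUDA_PATH")
    if home and path:
        same = os.path.normcase(os.path.normpath(home)) == os.path.normcase(
            os.path.normpath(path)
        )
        if not same:
            problems.append("CUDA_HOME and CUDA_PATH point to different folders")
    return problems
```

Running `print(check_cuda_env() or "CUDA_HOME and CUDA_PATH look fine")` in the activated environment before starting the build gives a quick yes/no answer.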

@erew123 erew123 added the enhancement New feature or request label Nov 26, 2023
@erew123 erew123 changed the title [REQUEST] - Please update the front page on Github with these instructions for installing on Windows! [REQUEST] - Installing DeepSpeed on Windows! (Correct instructions HERE. Please update the Front page of GitHub) Nov 26, 2023
@erew123
Author

erew123 commented Nov 26, 2023

As mentioned, here are some other issues I noticed with other versions of Python, DeepSpeed 9+ etc.

Python 3.10 and later - The scripts almost complete; however, they cannot reach the compiler because they add extra backslashes to the path on Windows, e.g. c:\\Program files\\Nvidia etc. I suspect later builds of Python would work fine if the script that calls the compiler had its path routine updated.

[Screenshot: path issue]

DeepSpeed v9 and above - None of these will compile on Windows, for a mix of reasons. I have tried combinations of CUDA versions 11.1 through 12.3, Python versions 3.8 to 3.11 and, of course, various versions of PyTorch. None of the 9+ scripts work, and the DeepSpeed 12 script introduces a new issue for which I can find no workaround. Perhaps the earlier DeepSpeed 9 script issues are fixed by version 12, but it won't install on Windows because of "No module named 'dskernels'".

[Screenshot: DeepSpeed 0.12.0 dskernels error]

Other Errors/Documentation - I currently see no option listed on https://www.deepspeed.ai/tutorials/advanced-install/#pre-install-deepspeed-ops that stops it trying to install dskernels.

I also note there may be other settings that aren't listed there, such as DS_BUILD_CUTLASS_OPS=0, which is referenced here: #4669 (comment)

The reason lots of people were having issues installing DeepSpeed on Windows is that the current instructions are not only very unclear but also missing a lot of nuance. For example, most things that require CUDA on Windows, including most AI code, just use the graphics driver, so most people will assume that when the front-page instructions say "CUDA", the driver is enough for DeepSpeed to install. The instructions never clarified that you need the Nvidia CUDA Toolkit (the development toolkit) for Windows.

Regarding the "build_win.bat" file that now comes bundled with DeepSpeed, I would suggest adding this to the top of the file:

setlocal enabledelayedexpansion
:: Capture value of CUDA_PATH
set CURRENT_CUDA_PATH=!CUDA_PATH!
:: Create new CUDA_HOME variable with same value
set CUDA_HOME=!CURRENT_CUDA_PATH!
echo CUDA_HOME is set to %CUDA_HOME% 

This should help correct the issue that the scripts look for an environment variable named CUDA_HOME, which does not exist as standard on Windows; on Windows you get CUDA_PATH instead. The batch code above reads CUDA_PATH and creates CUDA_HOME temporarily, allowing the scripts to run instead of failing with the error below, where they can't find "nvcc.exe" because CUDA_HOME doesn't exist.

[Screenshot: error shown when nvcc.exe cannot be found]
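
The same fallback could also live on the Python side of the build rather than in the batch file. The following is only a sketch of the idea and not code from DeepSpeed: `ensure_cuda_home` is a hypothetical helper, and unlike the batch lines above (which always overwrite CUDA_HOME) it only fills CUDA_HOME in when it is missing.

```python
import os


def ensure_cuda_home(env=None):
    """Copy CUDA_PATH into CUDA_HOME when CUDA_HOME is missing.

    Returns the resulting CUDA_HOME value, or None if neither variable is set.
    """
    env = os.environ if env is None else env
    if not env.get("CUDA_HOME") and env.get("CUDA_PATH"):
        # Windows installs of the CUDA Toolkit set CUDA_PATH, not CUDA_HOME.
        env["CUDA_HOME"] = env["CUDA_PATH"]
    return env.get("CUDA_HOME")
```

Leaving an existing CUDA_HOME untouched is a deliberate choice here, so that someone who has already pointed it at a specific toolkit version is not silently overridden.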

Perhaps in current versions you could even put a check in that file that tells people not to try compiling 12.x etc. on Windows (until you have it working).

Thanks

@loadams
Contributor

loadams commented Nov 28, 2023

@erew123 - would you like to make the PR to update the README on GitHub with these steps or prefer we do it?

@erew123
Author

erew123 commented Nov 28, 2023

EDIT - Please see next comment below. I have submitted a pull request.

@loadams I actually have no idea how to make a Pull Request on GitHub, so unless it's a very simple step 1, 2, 3 kind of deal that you can point me to, I'm happy for you guys to do it!

Thanks for seeing this and taking note :) It took me a while to get it all figured out and I thought it better to help everyone with instructions, not just myself!

All the best!

@erew123
Author

erew123 commented Nov 28, 2023

@loadams I've amazingly had a second reason to figure out how to do a pull request today, so as I did that one, I have amended the readme.md file and I am submitting a pull request now. I've done my best to make it look clean.

@S95Sedan

S95Sedan commented Dec 7, 2023

@erew123 with some small changes in the code, 11.2 + CUDA 12.1 can be made to work as well (same instructions as here, but with a Python 3.11 env), see here:
oobabooga/text-generation-webui#4734 (comment)

12.4 needs deepspeed-kernels, which won't compile for Windows either.

@elkay

elkay commented Dec 14, 2023

I was able to get this built using Python 3.9 in a Conda environment, however I now cannot get it to install. Does this require Python 3.9 to run? Basically, I'm trying to get DeepSpeed working with the new All Talk TTS in Text-Generation-Webui on a Windows 11 x64 box but that's running on a newer version of Python than 3.9.

https://www.reddit.com/r/Oobabooga/comments/18ha3vs/alltalk_tts_voice_cloning_advanced_coqui_tts/

When trying to install via pip:

ERROR: deepspeed-0.8.3+unknown-cp39-cp39-win_amd64.whl is not a supported wheel on this platform.

Is this a lost cause right now?

EDIT:

The .whl in the link in the comment right above me installed just fine. sigh

@loadams
Contributor

loadams commented Dec 14, 2023

@elkay - I'm not sure I follow, are you able to use the much newer whl? If so, is there a reason that you are trying to install with the older (0.8.3) whl?

@elkay

elkay commented Dec 14, 2023

> @elkay - I'm not sure I follow, are you able to use the much newer whl? If so, is there a reason that you are trying to install with the older (0.8.3) whl?

I was following directions to build 8.3 because everything I read said that was the highest that could be built and run on native Windows. And yes it's a Python 3.11 environment.

@erew123
Author

erew123 commented Dec 14, 2023

@elkay Let me break this down for you.

  1. You've built the DeepSpeed v8.3 wheel file on Python version 3.9: deepspeed-0.8.3+unknown-cp39-cp39-win_amd64.whl
  • As mentioned at the top, versions of DeepSpeed later than 8.3 won't currently compile on Windows, so you have to use DeepSpeed v8.3.
  • The compile script within DeepSpeed v8.3 will only compile on Python 3.9.x at the latest. (3.9.18 is, I think, the last version of 3.9.x.)

So you have compiled the file - deepspeed-0.8.3+unknown-cp39-cp39-win_amd64.whl

The cp39-cp39 part of the file name means the wheel is targeted ONLY at Python 3.9 (and win_amd64 means 64-bit Windows). Hence, if you try to install that file into a later version of Python, say version 3.11, it will fail, because there are checks that tell the installation routine "this is only for version 3.9 of Python, so you cannot install it on other versions of Python".
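
To see this for a given wheel, the tags can be read straight out of the file name. This is a simplified sketch: `wheel_tags` and `current_cpython_tag` are hypothetical helper names, and pip's real compatibility check is more involved (it compares against a whole set of supported tags), so treat this only as a quick sanity check.

```python
import sys


def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel file name.

    Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl, so the
    last three dash-separated fields are the tags.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def current_cpython_tag():
    """Return the CPython tag of the running interpreter, e.g. 'cp39'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"
```

For the file above, `wheel_tags("deepspeed-0.8.3+unknown-cp39-cp39-win_amd64.whl")` gives `("cp39", "cp39", "win_amd64")`; if the first element differs from `current_cpython_tag()` in your environment, pip will report the "not a supported wheel on this platform" error.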

I don't expect Microsoft (and Logan above) will in any way support hacking the wheel file or installing DeepSpeed on a later version of Python than is supported, so you are probably better off talking with S95Sedan on the link he gave, oobabooga/text-generation-webui#4734 (comment), as he will be your best bet on this.

Personally, I have not tried hacking the file to make it work on a later version of Python than 3.9.x, so I cannot say whether there are other hurdles to making it work that way. I may try it some time, and if I do, I'll feed back on the link on the Oobabooga forum.

@erew123
Author

erew123 commented Dec 14, 2023

@elkay I note that S95Sedan's link says he has edited the installation/compilation files of DeepSpeed 11.2, so it looks like you would need to try with that and follow his instructions with a later version of DeepSpeed. oobabooga/text-generation-webui#4734 (comment)

@ZJDATY

ZJDATY commented Feb 5, 2024

@erew123 Hi, the other libraries I have installed require at least Python 3.10, but based on your issue it seems that building on Windows with Python 3.10 is more complex, and I am somewhat confused. Can you explain in more detail each step that needs to be done when the Python version is 3.10?

@erew123
Author

erew123 commented Feb 5, 2024

@ZJDATY The problem is with the installation routines that exist beyond 8.3 of DeepSpeed. Obviously this is for MS to resolve. Though take a look here and you should find some options https://github.com/erew123/alltalk_tts?#-deepspeed-installation-options

@S95Sedan

S95Sedan commented Feb 6, 2024

All versions up to 13.1 are built here for Python 3.11 (some with certain modules disabled).

I assume you can follow the same steps for Python 3.10 and get similar results, though I haven't tested it myself.

@eastchun

eastchun commented Feb 24, 2024

@erew123
I followed your instructions and successfully built and installed deepspeed 8.3 wheel on python 3.9 / windows 11.

But, when I ran it, I got the following error message:

  File "C:\dreambyte\DisCo\pyworks\agent.py", line 201, in prepare_dist_model
    self.model, self.optimizer, _, _ = deepspeed.initialize(
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\__init__.py", line 125, in initialize
    engine = DeepSpeedEngine(args=args,
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\runtime\engine.py", line 297, in __init__
    self._configure_distributed_model(model)
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\runtime\engine.py", line 1182, in _configure_distributed_model
    self._broadcast_model()
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\runtime\engine.py", line 1105, in _broadcast_model
    dist.broadcast(p,
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\comm\comm.py", line 123, in log_wrapper
    return func(*args, **kwargs)
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\comm\comm.py", line 228, in broadcast
    return cdb.broadcast(tensor=tensor, src=src, group=group, async_op=async_op)
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\deepspeed\comm\torch.py", line 78, in broadcast
    return torch.distributed.broadcast(tensor=tensor,
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\torch\distributed\c10d_logger.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\eastc\anaconda3\envs\disco_p39\lib\site-packages\torch\distributed\distributed_c10d.py", line 1914, in broadcast
    work.wait()
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

For this error, I found [https://github.com//issues/2818#issue-1581325652]:

deepspeed.init_distributed(dist_backend="gloo") might cause this problem, because the people reporting it there say there is no such error with the "nccl" backend option. Also, if I don't activate DeepSpeed, it runs without this error.

Any comments on this 'gloo' issue?

@erew123
Author

erew123 commented Feb 24, 2024

Hi @eastchun

I'm not a DeepSpeed expert in the sense of knowing its inner workings. It was more a case of "the install/setup routine was broken, the instructions from MS were wrong/rubbish, and I spent 12+ hours figuring out how to actually get the install working correctly, so I'll share this with everyone".

That said, the specific error you are getting with "DisCo" (I'm guessing; it is software I don't know) is, as best I can tell, down to how it is handling DeepSpeed. https://medium.com/@mrityu.jha/understanding-the-grad-of-autograd-fc8d266fd6cf

This could be as a result of a call for the features that cannot be installed on Windows e.g.

set DS_BUILD_AIO=0
set DS_BUILD_SPARSE_ATTN=0
set DS_BUILD_EVOFORMER_ATTN=0

Or it could be, as per the article I linked above, that some bits of code within DisCo (again, I assume this is the software) need updating in how they import and handle DeepSpeed. I would point the developer of the software you are using towards that article, and hopefully they can make sense of the issue on the Windows platform (as it may be OK on Linux).

Thanks

@eastchun

eastchun commented Feb 24, 2024

Thanks @erew123 for the quick reply.
It appears that other people have also mentioned DeepSpeed issues with 'gloo' support on Windows.
Please refer to the following related links:

#1030 (comment)
oobabooga/text-generation-webui#1225 (comment)
pytorch/pytorch#71049 (comment)

Anyway, I modified one line (Line 196) in the following deepspeed source file:

C:\Users\eastc\anaconda3\envs\disco_p310\Lib\site-packages\deepspeed\comm\torch.py

# Excerpt of the broadcast method in deepspeed/comm/torch.py; os, torch, utils,
# Noop and DS_COMM_BROADCAST_OFF are already defined in that module.
def broadcast(self, tensor, src, group=None, async_op=False):
    if DS_COMM_BROADCAST_OFF:
        if int(os.getenv('RANK', '0')) == 0:
            utils.logger.warning("BROADCAST  is OFF")
        return Noop()
    else:
        tensor_copy = tensor.detach()  # Added line: detach from autograd before broadcasting
        return torch.distributed.broadcast(tensor=tensor_copy, src=src, group=group, async_op=async_op)
        # return torch.distributed.broadcast(tensor=tensor, src=src, group=group, async_op=async_op)  # Original line, commented out

The purpose of this modification is to detach tensors before broadcasting. DeepSpeed now appears to work with the gloo backend after this simple change (I have only tested inference so far and will try training soon).

@eastchun

eastchun commented Feb 24, 2024

@erew123

In order to run DeepSpeed in a Windows environment with the 'gloo' backend (Windows doesn't support 'nccl'),
we may need to modify the following DeepSpeed source file before running 'build_win.bat' to generate the wheel:

File to modify: DeepSpeed-0.8.3\deepspeed\comm\torch.py
(Note: DeepSpeed-0.8.3 is the folder name I used for the downloaded DeepSpeed 0.8.3 release package.)

Line 77 ~ :

    def broadcast(self, tensor, src, group=None, async_op=False):
        tensor_copy = tensor.detach()  # Added
        return torch.distributed.broadcast(tensor=tensor_copy, src=src, group=group, async_op=async_op)  # Added
        # return torch.distributed.broadcast(tensor=tensor, src=src, group=group, async_op=async_op)  # Commented

If you install DeepSpeed from one of the prebuilt Windows wheels available on the internet, you may need to apply a similar modification to the 'broadcast' method in the installed "...\lib\site-packages\deepspeed\comm\torch.py" file. Otherwise, it will raise "RuntimeError: a leaf Variable that requires grad is being used in an in-place operation" (after changing the 'nccl' backend to the 'gloo' backend for Windows).
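
An alternative to editing the installed file by hand might be to apply the same "detach first" idea at runtime. The sketch below is an untested assumption on my part, not something confirmed against DeepSpeed: `detach_first` is a hypothetical helper that wraps any broadcast-style function whose first parameter is named tensor, so that the tensor is detached before the call.

```python
import functools


def detach_first(broadcast_fn):
    """Wrap a broadcast-style function so its tensor argument is detached first."""

    @functools.wraps(broadcast_fn)
    def wrapper(tensor, *args, **kwargs):
        # Works whether the caller passes tensor positionally or as tensor=...
        return broadcast_fn(tensor.detach(), *args, **kwargs)

    return wrapper


# Possible use (assumption, not verified with DeepSpeed), before calling
# deepspeed.initialize:
#   import torch.distributed as dist
#   dist.broadcast = detach_first(dist.broadcast)
```

The appeal of this approach is that it survives reinstalling the wheel, but whether it covers every code path that the file edit covers would need checking.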

@erew123
Author

erew123 commented Feb 24, 2024

@eastchun Glad you have it working! There are later versions of DeepSpeed you can now compile, but I'm not sure if they will work on Python 3.9. You may wish to look at @S95Sedan's repository, as they have done some fantastic work getting later versions of DeepSpeed to compile: https://github.com/S95Sedan/Deepspeed-Windows

I'm not sure whether the change you mentioned is needed in later versions (for your requirements). But if you do want to try later versions of DeepSpeed at any time, that would be a good place to look.

Thanks

@eastchun

@erew123
I tried another DeepSpeed wheel available on the internet (https://www.linkedin.com/pulse/deepspeed-wheels-windows-furkan-g%C3%B6z%C3%BCkara-vb7vf/?trk=article-ssr-frontend-pulse_more-articles_related-content-card).

It is DeepSpeed 0.11.2 for py310, and it gave the same error with the gloo distributed backend.

I'll try the one you recommended and let you know the result. Thanks.

@erew123
Author

erew123 commented Feb 24, 2024

@eastchun All the files that need to be changed are at https://github.com/S95Sedan/Deepspeed-Windows/tree/main/modifications

You may also have to add your own additional modification.

Thanks
