feat: docker improvements #264
base: main
Conversation
If you want to implement a similar setup on Windows, follow:
@wl-zhao, @yuxumin, @Zengyi-Qin, could you take a look, please? I wasn't able to assign a reviewer; that option seems to be disabled in this repo. Thank you!
I approved this too soon and I don't know how to (or if I even can) retract it. This does not fix "kernel died" for me and has numerous problems (more than I care to enumerate right now). Needs to go back in the oven. I'll probably just move on. I've wasted far too much time trying to get this project to work. Good luck.
2024-06-11 08:25:45 (17.9 MB/s) - ‘checkpoints_v2_0417.zip’ saved [122086901/122086901]
Archive: checkpoints_v2_0417.zip
creating: /tmp/extract_temp/checkpoints_v2/
creating: /tmp/extract_temp/checkpoints_v2/base_speakers/
creating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/fr.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/en-us.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/en-india.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/en-br.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/es.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/en-newest.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/jp.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/en-default.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/kr.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/zh.pth
inflating: /tmp/extract_temp/checkpoints_v2/base_speakers/ses/en-au.pth
creating: /tmp/extract_temp/checkpoints_v2/converter/
inflating: /tmp/extract_temp/checkpoints_v2/converter/config.json
inflating: /tmp/extract_temp/checkpoints_v2/converter/checkpoint.pth
mv: cannot move '/tmp/extract_temp/checkpoints_v2/base_speakers' to '/workspace/checkpoints_v2/base_speakers': Directory not empty
mv: cannot move '/tmp/extract_temp/checkpoints_v2/converter' to '/workspace/checkpoints_v2/converter': Directory not empty
Starting Jupyter Notebook...
[I 2024-06-11 08:25:46.049 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2024-06-11 08:25:46.051 ServerApp] jupyter_server_terminals | extension was successfully linked.
[W 2024-06-11 08:25:46.052 LabApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2024-06-11 08:25:46.053 ServerApp] notebook_dir is deprecated, use root_dir
[I 2024-06-11 08:25:46.053 ServerApp] jupyterlab | extension was successfully linked.
[I 2024-06-11 08:25:46.055 ServerApp] notebook | extension was successfully linked.
[I 2024-06-11 08:25:46.055 ServerApp] Writing Jupyter server cookie secret to /root/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2024-06-11 08:25:46.175 ServerApp] notebook_shim | extension was successfully linked.
[I 2024-06-11 08:25:46.181 ServerApp] notebook_shim | extension was successfully loaded.
[I 2024-06-11 08:25:46.182 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2024-06-11 08:25:46.182 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2024-06-11 08:25:46.183 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.10/site-packages/jupyterlab
[I 2024-06-11 08:25:46.183 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 2024-06-11 08:25:46.183 LabApp] Extension Manager is 'pypi'.
[I 2024-06-11 08:25:46.197 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-06-11 08:25:46.198 ServerApp] notebook | extension was successfully loaded.
[I 2024-06-11 08:25:46.198 ServerApp] Serving notebooks from local directory: /workspace
[I 2024-06-11 08:25:46.198 ServerApp] Jupyter Server 2.14.1 is running at:
[I 2024-06-11 08:25:46.198 ServerApp] http://3b1a70be49a4:8888/tree?token=24d0165bed2a4d5aefc4b79f960fc5f53557b15e02dd5b69
[I 2024-06-11 08:25:46.198 ServerApp] http://127.0.0.1:8888/tree?token=24d0165bed2a4d5aefc4b79f960fc5f53557b15e02dd5b69
[I 2024-06-11 08:25:46.198 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2024-06-11 08:25:46.199 ServerApp]
To access the server, open this file in a browser:
file:///root/.local/share/jupyter/runtime/jpserver-15-open.html
Or copy and paste one of these URLs:
http://3b1a70be49a4:8888/tree?token=24d0165bed2a4d5aefc4b79f960fc5f53557b15e02dd5b69
http://127.0.0.1:8888/tree?token=24d0165bed2a4d5aefc4b79f960fc5f53557b15e02dd5b69
[W 2024-06-11 08:25:46.207 ServerApp] Failed to fetch commands from language server spec finder `pyright`:
The 'nodejs' trait of a LanguageServerManager instance expected a unicode string, not the NoneType None.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 632, in get
value = obj._trait_values[self.name]
KeyError: 'nodejs'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/jupyter_lsp/manager.py", line 279, in _autodetect_language_servers
specs = spec_finder(self) or {}
File "/usr/local/lib/python3.10/site-packages/jupyter_lsp/specs/utils.py", line 148, in __call__
"argv": ([mgr.nodejs, node_module, *self.args] if is_installed else []),
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 687, in __get__
return t.cast(G, self.get(obj, cls)) # the G should encode the Optional
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 649, in get
value = self._validate(obj, default)
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 722, in _validate
value = self.validate(obj, value)
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 2945, in validate
self.error(obj, value)
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 831, in error
raise TraitError(e)
traitlets.traitlets.TraitError: The 'nodejs' trait of a LanguageServerManager instance expected a unicode string, not the NoneType None.
[I 2024-06-11 08:25:46.208 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
[W 2024-06-11 08:25:46.214 ServerApp] Failed to fetch commands from language server spec finder `pyright`:
The 'nodejs' trait of a LanguageServerManager instance expected a unicode string, not the NoneType None.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 632, in get
value = obj._trait_values[self.name]
KeyError: 'nodejs'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/jupyter_lsp/manager.py", line 279, in _autodetect_language_servers
specs = spec_finder(self) or {}
File "/usr/local/lib/python3.10/site-packages/jupyter_lsp/specs/utils.py", line 148, in __call__
"argv": ([mgr.nodejs, node_module, *self.args] if is_installed else []),
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 687, in __get__
return t.cast(G, self.get(obj, cls)) # the G should encode the Optional
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 649, in get
value = self._validate(obj, default)
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 722, in _validate
value = self.validate(obj, value)
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 2945, in validate
self.error(obj, value)
File "/usr/local/lib/python3.10/site-packages/traitlets/traitlets.py", line 831, in error
raise TraitError(e)
traitlets.traitlets.TraitError: The 'nodejs' trait of a LanguageServerManager instance expected a unicode string, not the NoneType None.
@oldmanjk Thank you for reviewing the pull request and bringing this up, but "kernel died" can have various causes; that message can come from anything from a lack of memory to missing libraries.
First issue
Based on your logs, the first issue you are facing is related to moving the extracted checkpoint files: the directories /workspace/checkpoints_v2/base_speakers and /workspace/checkpoints_v2/converter are not empty, which prevents the extracted files from being moved into place. Since this is a Docker container, those folders should not exist or be populated before the extraction runs. In my testing environment I built the image from scratch and it works without any issues. I don't know what your build context is, but these are some things that I can think of:
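For reference, a minimal sketch of the kind of guard the entrypoint could use to avoid the `mv: ... Directory not empty` failure; the download URL is a placeholder and this is not necessarily what the PR ships:

```bash
#!/bin/bash
# Sketch only: skip the download if the checkpoints are already in place, and merge
# with cp -a instead of mv so a partially populated target does not abort the script.
set -e
CKPT_DIR=/workspace/checkpoints_v2                   # target seen in the log above
if [ ! -f "$CKPT_DIR/converter/checkpoint.pth" ]; then
    wget -q "$CHECKPOINT_URL" -O /tmp/checkpoints_v2_0417.zip   # CHECKPOINT_URL is a placeholder
    unzip -q /tmp/checkpoints_v2_0417.zip -d /tmp/extract_temp
    mkdir -p "$CKPT_DIR"
    cp -a /tmp/extract_temp/checkpoints_v2/. "$CKPT_DIR"/       # merge into existing directories
    rm -rf /tmp/extract_temp /tmp/checkpoints_v2_0417.zip
fi
```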
Second issue
For the second one: it looks like jupyter-lsp tries to autodetect and start language servers, and it probes for the nodejs executable path, which it cannot find. I have the container running right now, here:
@oldmanjk, could you please check whether you have any user-specific Jupyter configuration, additional Jupyter extensions, or dev-environment settings that might be enabling or interacting with the LSP extension? If so, try disabling or removing them and rebuilding the container to see whether the errors persist. If the issue remains after that, I'd be happy to work with you to investigate further and find a solution.
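Separately, if it is just the warning that needs silencing (it does not affect the kernel), two possible approaches, assuming the Debian-based Python image the logs suggest:

```bash
# Option 1: give jupyter-lsp the Node.js binary it probes for.
apt-get update && apt-get install -y --no-install-recommends nodejs \
    && rm -rf /var/lib/apt/lists/*

# Option 2: remove the LSP server extension if language servers are not needed.
pip uninstall -y jupyter-lsp jupyterlab-lsp
```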
Thanks for the fast and thorough response. Unfortunately, I have deleted everything and moved on. Good luck though!
fix: dont hard code the tar.xz
@vladlearns I have been working in parallel on a fix for the Dockerfile that suits a CPU setup, particularly for Mac M-series and similar systems, and I have finally arrived at a solution. Considering your work on this matter, perhaps we can combine our efforts: we could develop specialized Dockerfiles, one for CUDA and another for CPU, and generate corresponding docker-compose files (docker-compose.cuda.yml and docker-compose.cpu.yml). What do you think? My work: npjonath#1. Note: this PR also includes the fix from @Afnanksalal, as it is a requirement to run this project on a CPU-based architecture (#262). OpenVoice V1 works correctly on my setup. V2 is still not working because of this issue from MeloTTS: myshell-ai/MeloTTS#167
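If that split lands, usage could look something like the following; the compose file names follow the comment above and are not part of this PR:

```bash
docker compose -f docker-compose.cuda.yml up --build   # NVIDIA GPU hosts
docker compose -f docker-compose.cpu.yml up --build    # CPU-only hosts, e.g. Apple silicon
```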
I don't use this project anymore, so I probably shouldn't be a requested reviewer.
@oldgithubman You added yourself by approving the PR and then dismissing the review because of your environment. Later, you decided to leave without providing any details. Now, when I ask for a review, you are automatically added, and there is no way to remove you.
@npjonath Hey! |
@vladlearns No, it was just to discuss this with you. You can leave the naming as is; I guess GPU usage is the default one. I will add the docker-compose file and Dockerfile.cpu separately to extend your implementation.
Ok. I don't really know what I'm doing. I'll just approve it so you can move on |
@npjonath Sure. So far, I've tested my setup in multiple environments, and it works for multiple people as well, but it seems the maintainers don't merge pull requests into the main branch. Instead, they ask contributors to fork the repository and point to the fork in the documentation.
Features:
This setup has been thoroughly tested to ensure stability and performance.
Prerequisites:
Join the NVIDIA Developer Program:
Download cuDNN:
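The downloaded archive presumably needs to be in the build context before building; a hedged example (the exact file name depends on the cuDNN build you download from NVIDIA):

```bash
# Copy the cuDNN archive you downloaded into the repository root, next to the Dockerfile.
cp ~/Downloads/cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz ./
```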
Run:
docker build -t openvoice .
then
docker run --gpus all -p 8888:8888 openvoice v2
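Once the container is up, Jupyter prints tokenised URLs (as in the log excerpt earlier in this thread). If you prefer running detached, something like this should work; the container name here is just an example:

```bash
docker run -d --gpus all -p 8888:8888 --name openvoice openvoice v2
docker logs openvoice 2>&1 | grep "127.0.0.1:8888"   # copy the ?token=... URL into a browser
```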
tl;dr
Hey everyone,
I've been working on improving the Docker setup for OpenVoice, and I think these changes will make it much easier to run in a containerized environment.
The main issue I've seen is with CUDA and cuDNN versions not matching up, causing errors. In this Dockerfile, I've included CUDA 12.1 and cuDNN 8.9.7, which work well with the latest PyTorch that supports CUDA 12. This should help eliminate those errors.
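As an illustration of that pairing (not necessarily the exact base image or pins used in this Dockerfile):

```dockerfile
# Illustrative only: CUDA 12.1 + cuDNN 8 runtime, with PyTorch wheels built for cu121.
FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
RUN pip3 install --no-cache-dir torch torchaudio --index-url https://download.pytorch.org/whl/cu121
```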
Another improvement is the entrypoint shell script: you can now download only the checkpoints you need, since it fetches just the ones for the specified version, saving time and bandwidth.
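A rough sketch of that idea; the argument handling and URL variables here are placeholders, and the actual entrypoint.sh in the PR may differ:

```bash
#!/bin/bash
# The first container argument (e.g. "v2" in the run command above) selects which
# checkpoint bundle is downloaded before Jupyter starts.
set -e
case "${1:-v2}" in
  v1) CHECKPOINT_URL="$CHECKPOINTS_V1_URL" ;;   # placeholder variables, not the PR's real URLs
  v2) CHECKPOINT_URL="$CHECKPOINTS_V2_URL" ;;
  *)  echo "Unknown version: $1" >&2; exit 1 ;;
esac
wget -q "$CHECKPOINT_URL" -P /workspace          # fetch only what was asked for
exec jupyter notebook --ip=0.0.0.0 --allow-root --notebook-dir=/workspace
```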
I've also optimized the Docker layer cache. I rearranged some commands so that if only the local files change, Docker can reuse the base layers that have all the lengthy installations. This should speed up your builds when you're making changes to your local setup.
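The ordering idea, sketched; the paths are assumptions (e.g. a requirements.txt at the repo root), not necessarily what the PR uses:

```dockerfile
# Dependency layers change rarely, so they stay cached; the source copy comes last.
COPY requirements.txt /workspace/requirements.txt
RUN pip install --no-cache-dir -r /workspace/requirements.txt    # slow layer, cached across code edits
COPY . /workspace                                                # fast layer, invalidated by local changes
```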
In summary, smoother, faster, and less prone to errors. It's now easier to spin up different versions and notebooks without CUDA issues or long installations.
This setup has been thoroughly tested to ensure stability and performance.
Give it a try and let me know how it goes! I'm always happy to hear feedback and suggestions. I think this will be a big improvement for the OpenVoice experience.
Happy Dockerizing! 🐳
Vlad
Screenshots:
Running:
Results: