Building the template for x86_64 or VM #29

Closed
skanduru opened this issue Feb 28, 2022 · 5 comments

Comments

@skanduru

Hi

Thanks for the detailed build steps and documentation. Can I build the image for an Intel desktop or a VM? If I build/train on a VM, which TARGET_ARCH options need to be given?

Thank you so much
Srini

@veritas9872
Collaborator

veritas9872 commented Mar 1, 2022

The default settings are for Intel x86_64, which is what most people use. Using a VM will not affect the build, assuming that it has enough memory. To build everything, please do the following.

  1. Run make env to create a .env file.
  2. Edit the .env file to add CCA and other necessary variables for the full service in docker-compose.yaml.
  3. Edit the docker-compose.yaml file's full service to add volumes, ports, etc. as necessary for your application.
  4. Edit the reqs/apt-train.requirements.txt and pip-train.requirements.txt files to include the libraries necessary for your application. Please do not remove packages marked as essential packages.
  5. Run make up to start the build.
  6. Take a break. The first build will take some time.
  7. After the build has finished, and assuming that there are no bugs (usually driver mismatch issues), run make exec to enter the container.
  8. If there has been a CUDA version mismatch or driver mismatch bug, change the .env file or docker-compose.yaml file to use the appropriate version of CUDA, cuDNN, etc.
  9. Run make rebuild to rebuild the image.
  10. Take a short break. The second build is much faster than the first build due to caching.
  11. Run make exec and start coding.
  12. To leave the container without stopping it, press Ctrl+p followed by Ctrl+q. Ctrl+d works too, but this may stop the container (this needs checking).
  13. To restart a stopped container without deleting it, use make start. This is not recommended because the entire project is about making everything reproducible. Only use it if there is essential data saved in a non-volume directory on the container or if the container has an unreproducible configuration.
  14. To delete the temporary state of the previous container and create a new container from the same image, use make up (recommended method). This is necessary if adding a new volume, adding a new port, etc.
  15. Run make exec to enter an up and running container. make exec will never delete an existing image or restart it.
  16. After the project is complete and the container must be deleted, use make down to delete the container and all networks created by it.
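
The steps above can be sketched as a small shell script. This is only an illustration, assuming it is run from the root of a checkout of the project; the make targets (env, up, rebuild, exec, down) are the ones named in the steps, while the helper names run, first_build, rebuild_after_fix, and teardown and the DRY_RUN variable are hypothetical and not part of the project. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the workflow described in the steps above.
# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 inside a project checkout to actually run them.
set -eu
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command, and execute it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Steps 1-7: first installation.
first_build() {
    run make env      # step 1: create the .env file
    # steps 2-4 are manual: edit .env, docker-compose.yaml, and the reqs/ files
    run make up       # step 5: start the first (slow) build
    run make exec     # step 7: enter the running container
}

# Steps 8-11: after fixing a CUDA or driver mismatch in .env or docker-compose.yaml.
rebuild_after_fix() {
    run make rebuild  # step 9: rebuild the image (faster thanks to caching)
    run make exec     # step 11: enter the container
}

# Step 16: remove the container and its networks when the project is done.
teardown() {
    run make down
}

first_build
```

Keeping the manual editing steps as comments is deliberate: which variables, volumes, ports, and packages to set depends on the application, so they are not guessed at here.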

@veritas9872
Collaborator

@skanduru I hope that the information helps. It covers the most commonly encountered situations, so it may seem a bit long. However, most users only need about 8 of these steps on the first installation and about 3 afterwards.

@skanduru
Author

skanduru commented Mar 2, 2022

Thank you very much, I appreciate it. I am trying to learn PyTorch, and I think this well-templated package will be a good entry point.

CCA = 11.6.0-1, I suppose, the latest stable version. I need to play around a bit to get acquainted with the CUDA apps.

@veritas9872
Collaborator

veritas9872 commented Mar 2, 2022

@skanduru I believe that there has been a misunderstanding. CCA is an attribute of the hardware, not the software. Also, the CUDA version should be 11.3.1 for PyTorch v1.10.2, as CUDA 11.5+ is not supported by PyTorch 1.10.x. However, the current master branch supports CUDA 11.5.x.
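
The version constraint above can be turned into a quick sanity check before editing the .env file. The sketch below is a minimal illustration that encodes only the rule stated in this comment (CUDA 11.5+ is not supported for PyTorch 1.10.x). The function name cuda_below_11_5 is hypothetical and not part of the project, and passing the check does not by itself mean that a given CUDA version is supported.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the given CUDA version
# string is below 11.5, the limit stated above for PyTorch 1.10.x.
cuda_below_11_5() {
    major="${1%%.*}"     # text before the first dot, e.g. "11"
    rest="${1#*.}"       # text after the first dot, e.g. "3.1"
    minor="${rest%%.*}"  # text before the next dot, e.g. "3"
    [ "$major" -lt 11 ] || { [ "$major" -eq 11 ] && [ "$minor" -lt 5 ]; }
}

for version in 11.3.1 11.6.0-1; do
    if cuda_below_11_5 "$version"; then
        echo "CUDA $version: not excluded for PyTorch 1.10.x"
    else
        echo "CUDA $version: 11.5 or newer, not supported for PyTorch 1.10.x"
    fi
done
```

The check compares only the major and minor components, so a packaging suffix such as the "-1" in 11.6.0-1 does not affect the result.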

@skanduru
Author

skanduru commented Mar 2, 2022

Thanks for the clarification!
