Enable CUDA 11 compatibility mode #2356

Merged: 6 commits, Oct 16, 2020

@JanuszL (Contributor) commented Oct 14, 2020

  • Names CUDA 11.1 builds as CUDA 11.0 because, thanks to CUDA enhanced compatibility, a build made with the latest toolkit version will run on the latest CUDA 11.0 capable driver

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>

Why do we need this PR?

  • It enables CUDA 11 compatibility mode

What happened in this PR?

  • What solution was applied:
    CUDA 11.1 builds are named as CUDA 11.0 because, thanks to CUDA enhanced compatibility, a build made with the latest toolkit version will run on the latest CUDA 11.0 capable driver
  • Affected modules and functionalities:
    plain and conda builds
  • Key points relevant for the review:
    Is this the right approach?
  • Validation and testing:
    CI
  • Documentation (including examples):
    compilation/installation instructions are updated

JIRA TASK: [NA]
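
The naming rule described above can be sketched in Python. This is only an illustration, not the CMake logic from this PR: `CudaVersion` and `compatible_version` are hypothetical names, and applying the rule to every CUDA 11.x minor release (rather than only 11.1) is an assumption.

```python
from typing import NamedTuple


class CudaVersion(NamedTuple):
    """Hypothetical holder for a toolkit version (not part of DALI)."""
    major: int
    minor: int

    @property
    def short(self) -> str:
        # e.g. "11.0", analogous to the "short" value in the CMake status message
        return f"{self.major}.{self.minor}"

    @property
    def digits_only(self) -> str:
        # e.g. "110", analogous to the "digit-only" value
        return f"{self.major}{self.minor}"


def compatible_version(toolkit: CudaVersion) -> CudaVersion:
    """Return the version a build would be named after.

    Assumption: any CUDA 11.x toolkit build is named as CUDA 11.0,
    while other major versions keep their own name.
    """
    if toolkit.major == 11:
        return CudaVersion(11, 0)
    return toolkit


if __name__ == "__main__":
    built_with = CudaVersion(11, 1)
    named_as = compatible_version(built_with)
    print(f"toolkit {built_with.short} -> named as {named_as.short}")
```

Keeping the real toolkit version and the compatible version as separate values mirrors the two status messages in the CMake diff, which report them independently.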

@dali-automaton
Collaborator

CI MESSAGE: [1700565]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1700749]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1700756]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1700756]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1700756]: BUILD PASSED

@awolant awolant self-requested a review October 15, 2020 09:39
set(${CUDA_VERSION_SHORT_VAR} "${${CUDA_VERSION_MAJOR_VAR}}.${${CUDA_VERSION_MINOR_VAR}}" PARENT_SCOPE)
set(${CUDA_VERSION_SHORT_DIGIT_ONLY_VAR} "${${CUDA_VERSION_MAJOR_VAR}}${${CUDA_VERSION_MINOR_VAR}}" PARENT_SCOPE)

message(STATUS "CUDA version: ${CUDA_VERSION}, major: ${${CUDA_VERSION_MAJOR_VAR}}, minor: ${${CUDA_VERSION_MINOR_VAR}}, patch: ${${CUDA_VERSION_PATCH_VAR}}, short: ${${CUDA_VERSION_SHORT_VAR}}, digit-only: ${${CUDA_VERSION_SHORT_DIGIT_ONLY_VAR}}")
message(STATUS "CUDA version compatibile: major: ${${CUDA_VERSION_MAJOR_VAR}}, minor: ${${CUDA_VERSION_MINOR_VAR}}, patch: ${${CUDA_VERSION_PATCH_VAR}}, short: ${${CUDA_VERSION_SHORT_VAR}}, digit-only: ${${CUDA_VERSION_SHORT_DIGIT_ONLY_VAR}}")
Contributor

Suggested change
- message(STATUS "CUDA version compatibile: major: ${${CUDA_VERSION_MAJOR_VAR}}, minor: ${${CUDA_VERSION_MINOR_VAR}}, patch: ${${CUDA_VERSION_PATCH_VAR}}, short: ${${CUDA_VERSION_SHORT_VAR}}, digit-only: ${${CUDA_VERSION_SHORT_DIGIT_ONLY_VAR}}")
+ message(STATUS "Compatible CUDA version: major: ${${CUDA_VERSION_MAJOR_VAR}}, minor: ${${CUDA_VERSION_MINOR_VAR}}, patch: ${${CUDA_VERSION_PATCH_VAR}}, short: ${${CUDA_VERSION_SHORT_VAR}}, digit-only: ${${CUDA_VERSION_SHORT_DIGIT_ONLY_VAR}}")

Contributor Author

Done

``.XX.Y`` can be passed, script check for the supported version is bypassed and the user needs
to make sure that Dockerfile.cudaXXY.deps is present in `docker/` directory.
| The default is ``11.1``. Thanks to CUDA extended compatibility mode CUDA 11.1 wheel is named as
CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family (please update to the
Contributor

Suggested change
- CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family (please update to the
+ CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family. Please update to the

Contributor Author

Done

to make sure that Dockerfile.cudaXXY.deps is present in `docker/` directory.
| The default is ``11.1``. Thanks to CUDA extended compatibility mode CUDA 11.1 wheel is named as
CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family (please update to the
latest recommended driver version in that family). If the value of the version is prefixed with
Contributor

Suggested change
- latest recommended driver version in that family). If the value of the version is prefixed with
+ latest recommended driver version in that family. If the value of the version is prefixed with

Contributor Author

Done

| The default is ``11.1``. Thanks to CUDA extended compatibility mode CUDA 11.1 wheel is named as
CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family (please update to the
latest recommended driver version in that family). If the value of the version is prefixed with
`.` then any value ``.XX.Y`` can be passed, script check for the supported version is bypassed
Contributor

Suggested change
- `.` then any value ``.XX.Y`` can be passed, script check for the supported version is bypassed
+ `.` then any value ``.XX.Y`` can be passed, and in that case the supported version check is bypassed

Contributor Author

Done

CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family (please update to the
latest recommended driver version in that family). If the value of the version is prefixed with
`.` then any value ``.XX.Y`` can be passed, script check for the supported version is bypassed
and the user needs to make sure that Dockerfile.cudaXXY.deps is present in `docker/` directory.
Contributor

Suggested change
- and the user needs to make sure that Dockerfile.cudaXXY.deps is present in `docker/` directory.
+ and the user needs to make sure that Dockerfile.cudaXXY.deps is present in the `docker/` directory.

Contributor Author

Done

additional functionalities.
CUDA 11.0 build uses CUDA toolkit enhanced compatibility. It is built with the latest CUDA 11.x
toolkit while it can run on the latest, stable CUDA 11.0 capable drivers (450.80 or later).
Using the newest driver (455.x for example) may enable additional functionalities.
Contributor

Suggested change
- Using the newest driver (455.x for example) may enable additional functionalities.
+ Using the newest driver (455.x, for example) may enable additional functionalities.

Contributor Author

Done

dali_tf_plugin/build_in_custom_op_docker.sh (outdated, resolved)
@JanuszL JanuszL force-pushed the cuda111_compatibility branch 3 times, most recently from f071a05 to ae2ce9c October 16, 2020 08:29
| The default is ``11.1``. Thanks to CUDA extended compatibility mode CUDA 11.1 wheel is named as
CUDA 11.0 because it can work with the CUDA 11.0 R450.x driver family. Please update to the
latest recommended driver version in that family. If the value of the version is prefixed with
`.` then any value ``.XX.Y`` can be passed, and in that case the supported version check is
Contributor

There is some issue with wording here, I think: "the supported version check is and user ..."

Contributor Author

Done... I think
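
The version-argument behavior discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the build script itself: `parse_version_arg`, `deps_dockerfile_name`, and `SUPPORTED_VERSIONS` are hypothetical names, and the supported list is a placeholder in which only ``11.1`` (the documented default) comes from the text.

```python
# Placeholder list: only "11.1" (the documented default) is confirmed by the docs text.
SUPPORTED_VERSIONS = ("11.1",)


def parse_version_arg(value: str = "11.1"):
    """Return (version, bypassed) for a CUDA version argument.

    A leading "." means "accept any .XX.Y value and skip the supported
    version check", as described in the documentation under review.
    """
    bypassed = value.startswith(".")
    version = value[1:] if bypassed else value
    if not bypassed and version not in SUPPORTED_VERSIONS:
        raise ValueError(f"Unsupported CUDA version: {version}")
    return version, bypassed


def deps_dockerfile_name(version: str) -> str:
    """Build the Dockerfile.cudaXXY.deps name from an "XX.Y" version string."""
    return f"Dockerfile.cuda{version.replace('.', '')}.deps"


if __name__ == "__main__":
    version, bypassed = parse_version_arg(".11.2")
    # When the check is bypassed, the user must make sure this file
    # exists in the docker/ directory.
    print(version, bypassed, deps_dockerfile_name(version))
```

The file-existence check is deliberately left to the caller here, since the documentation places that responsibility on the user.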

- names CUDA 11.1 builds as CUDA 11.0 because, thanks to CUDA enhanced
  compatibility, a build made with the latest toolkit version
  will run on the latest CUDA 11.0 capable driver

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1707617]: BUILD STARTED

additional functionalities.
CUDA 11.0 build uses CUDA toolkit enhanced compatibility. It is built with the latest CUDA 11.x
toolkit while it can run on the latest, stable CUDA 11.0 capable drivers (450.80 or later).
Using the newest driver (455.x, for example) may enable additional functionalities.
Contributor

I don't think that we need to put an "example" of latest driver - that's bound to go stale.

Suggested change
- Using the newest driver (455.x, for example) may enable additional functionalities.
+ Using the latest driver may enable additional functionality.

Contributor Author

Done
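
The driver requirement stated in the documentation text (450.80 or later) can be checked with a short comparison. The sketch below is illustrative only: `parse_driver_version` and `driver_is_sufficient` are hypothetical helpers, the driver string format ("major.minor[.patch]") is an assumption, and obtaining the string from the system is out of scope.

```python
# Minimum driver taken from the documentation text: "450.80 or later".
MIN_DRIVER = (450, 80)


def parse_driver_version(text: str) -> tuple:
    """Turn a driver string such as "450.80.02" into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(text: str) -> bool:
    """Compare only the major and minor components against the minimum."""
    return parse_driver_version(text)[:2] >= MIN_DRIVER


if __name__ == "__main__":
    for driver in ("440.33", "450.51.06", "450.80.02", "455.23.05"):
        print(driver, driver_is_sufficient(driver))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when component widths differ.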

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1707770]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1707770]: BUILD FAILED

@JanuszL JanuszL force-pushed the cuda111_compatibility branch 3 times, most recently from edbc238 to 385b512 October 16, 2020 12:32
@dali-automaton
Collaborator

CI MESSAGE: [1707896]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1707896]: BUILD FAILED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1708117]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1708117]: BUILD PASSED

@JanuszL JanuszL merged commit 2214987 into NVIDIA:master Oct 16, 2020
@JanuszL JanuszL deleted the cuda111_compatibility branch October 16, 2020 19:15
5 participants