Document generation CI is broken #13308
…or 'test' to be specified, but we build using the sln file and skip tests in the stage that was meant to generate the documentation.
…ocumentationGeneration
Couple of cleanups.
buildArch: x64
additionalBuildFlags: --gen_doc validate --skip_tests --enable_pybind --use_dml --use_cuda --cuda_version=11.6 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6" --enable_cuda_profiling --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=52 --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF
# note: need to specify `--gen_doc` when creating the build config so it has to be in additionalBuildFlags
additionalBuildFlags: --gen_doc --skip_tests --enable_pybind --use_dml --use_cuda --cuda_version=11.6 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6" --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=52 --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF
nit: In the past, "--cmake_extra_defines" could only be specified once on a command line. We have since added a hack that makes repeated use work, but I hope we don't need to rely on it. Here you can put the two defines together.
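To illustrate the nit above, here is a minimal standalone argparse sketch. It is not build.py's actual parser: the `parse_defines` helper and its `allow_repeats` switch are hypothetical, and the "older" behaviour is assumed to be argparse's default, where the last occurrence of a flag wins. It shows why passing both defines after a single flag is the form that works either way.

```python
import argparse


def parse_defines(argv, allow_repeats):
    """Return the list of KEY=VALUE defines parsed from argv.

    allow_repeats=False models a plain `nargs="+"` flag (last occurrence wins).
    allow_repeats=True models `nargs="+"` combined with `action="append"`.
    """
    parser = argparse.ArgumentParser()
    if allow_repeats:
        parser.add_argument("--cmake_extra_defines", nargs="+",
                            action="append", default=[])
    else:
        parser.add_argument("--cmake_extra_defines", nargs="+", default=[])
    values = parser.parse_args(argv).cmake_extra_defines
    if allow_repeats:
        # action="append" yields one list per occurrence; flatten them.
        return [define for group in values for define in group]
    return list(values)


combined = ["--cmake_extra_defines",
            "CMAKE_CUDA_ARCHITECTURES=52",
            "onnxruntime_BUILD_UNIT_TESTS=OFF"]
repeated = ["--cmake_extra_defines", "CMAKE_CUDA_ARCHITECTURES=52",
            "--cmake_extra_defines", "onnxruntime_BUILD_UNIT_TESTS=OFF"]

# The combined form keeps both defines under either parser.
print(parse_defines(combined, allow_repeats=False))
print(parse_defines(combined, allow_repeats=True))
# The repeated form silently drops the first define without "append".
print(parse_defines(repeated, allow_repeats=False))
```

Under these assumptions the combined form does not depend on how the flag happens to be declared, which is the point of the suggestion.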
I created a PR, #13170, to move the ORT GPU pipelines to T4. If that PR were merged, the "Windows GPU CI Pipeline" of this PR would not pass, because DML needs WDDM and T4 doesn't support WDDM. So I created a new machine pool and a new VM image just for DML. You can't assume a single machine can run both DML and CUDA, especially since someone just suggested that I move the DML pipeline to AMD GPUs. It wouldn't make sense to install CUDA libraries on all AMD GPU machines.
Well, we shouldn't need to run any GPU-specific DML or CUDA code beyond basic initialization just for document generation anyway; we just need the kernels registered. So how about an internal overload of …
@fdwr, so what's the plan?
Can we check this in to fix the operator documentation at least? Either way, we need to figure out how this works going forward. One option: with this PR we will be able to set a config value in the session options, which the DML EP could read so that it doesn't error out when there's no GPU while we're generating docs. Updating the DML EP factory creator is fine as well.
@skottmckay, please check this in ASAP to unblock the operator documentation update.
The "onnxruntime-Win2019-GPU-dml" machine pool should not need to have CUDA installed. I think both @fdwr's and @skottmckay's suggestions are good, but can we make it happen?
Will PR #13318 be merged soon, or would you like me to create an internal overload of DMLProviderFactoryCreator::Create, which addGlobalSchemaFunctions() in onnxruntime_pybind_schema.cc can call to avoid the IsSoftwareAdapter check?
Anything I can help with?
@snnn Sorry, I wasn't sure who was doing what, whether Scott was adding it or I should. I can add it in a few hours, after today's meetings. Was thinking …
@snnn or @tianleiwu, could you please sign off so it can be checked in?
… DML EP to initialize on software-only devices (#13428)
### Description
The documentation pipeline does not require an actual GPU, and running on GPU-capable agents costs more. So to enable running on CPU-only devices and to potentially consolidate future pipelines, and since the tests are not actually executed on this device anyway (it just needs to initialize the EP for the sake of operator kernel enumeration), add an initialization flag to skip the software device check - this is only an internal overload not exposed in the public API. See #13308.
### Motivation and Context
- *If it fixes an open issue, please link to the issue here.* NA
Description
Fix document generation CI. It is not currently updating the docs because we skip the tests, and the test step is the invocation of build.py that would have generated the documentation.
Set up a specific task to generate the documentation, for greater clarity.
Motivation and Context
Operator kernel documentation is not getting updated and is now out of date.
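The failure mode described above can be modeled in a few lines. This is a hypothetical sketch, not the real build.py or the pipeline definition: the `Stage` type, the stage names, and the `emits_docs` flag are all invented for illustration. It shows why attaching documentation generation to a stage that CI skips silently produces no docs, and why a dedicated stage avoids that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One CI step (hypothetical model, not an actual pipeline type)."""
    name: str
    skipped: bool = False      # True when the pipeline never runs this step
    emits_docs: bool = False   # True when this step would write the docs


def docs_generated(stages):
    """Docs are produced only if some non-skipped stage emits them."""
    return any(stage.emits_docs and not stage.skipped for stage in stages)


# Before: doc generation rides along with the test step, which CI skips.
before = [
    Stage("configure"),
    Stage("build"),
    Stage("test", skipped=True, emits_docs=True),
]

# After: a dedicated step generates the docs regardless of the test step.
after = [
    Stage("configure"),
    Stage("build"),
    Stage("test", skipped=True),
    Stage("generate_docs", emits_docs=True),
]

print(docs_generated(before))  # False
print(docs_generated(after))   # True
```

Keeping doc generation in its own step also makes a failure visible in the pipeline, rather than hiding it behind a skipped stage.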