Document generation CI is broken by skottmckay · Pull Request #13308 · microsoft/onnxruntime

skottmckay · 2022-10-13T11:17:10Z

Description

Fix document generation CI. The docs are not currently being updated because we skip the tests, and the test step was the invocation of build.py that would have generated the documentation.

Set up a specific task to generate documentation for greater clarity.

Motivation and Context

Operator kernel documentation is not getting updated and is now out of date.
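
The failure mode can be modeled in a few lines of Python. This is a simplified sketch and not the actual build.py logic: the function names and the three boolean inputs are hypothetical, chosen only to show why a stage that builds from the sln file and skips tests never reached the documentation step when generation required either 'build' or 'test'.

```python
def docs_generated_before(gen_doc: bool, build: bool, test: bool) -> bool:
    """Hypothetical model of the old gating: docs only ran inside a build or test phase."""
    return gen_doc and (build or test)


def docs_generated_after(gen_doc: bool, build: bool, test: bool) -> bool:
    """Hypothetical model of the fix: a dedicated step runs whenever docs are requested."""
    return gen_doc


# The CI stage asked for docs, but neither built via build.py nor ran tests.
ci_stage = dict(gen_doc=True, build=False, test=False)

print(docs_generated_before(**ci_stage))  # False
print(docs_generated_after(**ci_stage))   # True
```

Under these assumptions, requesting documentation alone was not enough in the old flow, which matches the symptom of the docs silently going stale.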

snnn · 2022-10-14T04:31:27Z

        buildArch: x64
-        additionalBuildFlags: --gen_doc validate --skip_tests --enable_pybind --use_dml --use_cuda --cuda_version=11.6 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6" --enable_cuda_profiling --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=52 --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF
+        # note: need to specify `--gen_doc` when creating the build config so it has to be in additionalBuildFlags
+        additionalBuildFlags: --gen_doc --skip_tests --enable_pybind --use_dml --use_cuda --cuda_version=11.6 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6" --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=52 --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF


nit: In the past, "--cmake_extra_defines" could only be specified once on a command line. We added a hack so that repeating it now works, but I hope we don't need to rely on it. Here you can put the two defines together.
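
To illustrate the suggestion, here is a minimal argparse sketch of how a flag declared with `nargs="+"` and `action="append"` accepts several defines after a single occurrence. It is an assumption that build.py declares the option in a similar way; the `parse_defines` helper is hypothetical and exists only for this example.

```python
import argparse


def parse_defines(argv):
    """Return a flat list of defines from a hypothetical --cmake_extra_defines flag."""
    parser = argparse.ArgumentParser()
    # nargs="+" lets one occurrence take several values;
    # action="append" collects repeated occurrences as separate groups.
    parser.add_argument("--cmake_extra_defines", nargs="+", action="append")
    args = parser.parse_args(argv)
    groups = args.cmake_extra_defines or []
    return [define for group in groups for define in group]


# One flag carrying both defines, as suggested in the review.
combined = parse_defines([
    "--cmake_extra_defines",
    "CMAKE_CUDA_ARCHITECTURES=52",
    "onnxruntime_BUILD_UNIT_TESTS=OFF",
])

# The flag repeated once per define, as in the diff.
repeated = parse_defines([
    "--cmake_extra_defines", "CMAKE_CUDA_ARCHITECTURES=52",
    "--cmake_extra_defines", "onnxruntime_BUILD_UNIT_TESTS=OFF",
])

print(combined == repeated)  # True
```

With a parser shaped like this, both spellings produce the same list, but the combined form does not depend on repeated flags being merged.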

snnn · 2022-10-14T19:14:54Z

I created a PR to move the ORT GPU pipelines to T4, #13170. If that PR were merged, the "Windows GPU CI Pipeline" of this PR would not pass, because DML needs WDDM but T4 doesn't support WDDM. So I created a new machine pool and a new VM image just for DML. You can't assume a single machine that can run both DML and CUDA, especially since someone just suggested that I move the DML pipeline to AMD GPUs. It wouldn't make sense to install CUDA libraries on all AMD GPU machines.

fdwr · 2022-10-14T19:32:39Z

You can't assume a single machine that can run both DML and CUDA

Well we shouldn't need to actually run any GPU-specific code of DML or CUDA beyond basic initialization just for document generation anyway - we just need the kernels registered. So how about an internal overload of DMLProviderFactoryCreator::Create which addGlobalSchemaFunctions() can call in onnxruntime_pybind_schema.cc to avoid the IsSoftwareAdapter check? (maybe an optional bool flag)

snnn · 2022-10-18T04:22:09Z

@fdwr , so what's the plan?

skottmckay · 2022-10-18T23:59:05Z

Can we check this in to fix the operator documentation at least? Either way, we need to figure out how this works going forward.

One option is that, with this PR, we will be able to set a config value in the session options that the DML EP could read so it does not error out if there's no GPU when we're generating docs. Updating the DML EP factory creator is fine as well.

tianleiwu · 2022-10-21T17:20:14Z

@skottmckay, please check in ASAP to unblock the operator documentation update.

snnn · 2022-10-21T18:08:16Z

The "onnxruntime-Win2019-GPU-dml" machine pool should not need to have CUDA installed. I think both @fdwr and @skottmckay's suggestions are good, but can we make it happen?

snnn · 2022-10-21T21:53:02Z

Will PR #13318 be merged soon, or would you like me to create an internal overload of DMLProviderFactoryCreator::Create which addGlobalSchemaFunctions() can call in onnxruntime_pybind_schema.cc to avoid the IsSoftwareAdapter check?

snnn · 2022-10-24T15:07:18Z

Anything I can help with?

fdwr · 2022-10-24T20:31:20Z

@snnn Sorry, I wasn't sure who was doing what, whether Scott was adding it or I should. I can add it in a few hours after today's meetings. I was thinking of DMLProviderFactoryCreator::Create(int device_id, bool skip_software_adapter_check), where the flag would be false by default.
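
The shape of that proposal can be modeled in Python. This is only an illustrative sketch of the opt-in flag, not the C++ implementation: `create_dml_provider`, `SoftwareAdapterError`, and the `is_software_adapter` argument are hypothetical stand-ins, and the real factory queries the adapter itself rather than taking it as a parameter.

```python
class SoftwareAdapterError(RuntimeError):
    """Raised when the modeled factory refuses a software-only adapter."""


def create_dml_provider(device_id: int,
                        is_software_adapter: bool,
                        skip_software_adapter_check: bool = False) -> dict:
    """Hypothetical stand-in for the factory; returns a dict instead of a provider."""
    # Default behavior: reject software adapters, as normal sessions expect a GPU.
    if is_software_adapter and not skip_software_adapter_check:
        raise SoftwareAdapterError(f"device {device_id} is a software adapter")
    return {"device_id": device_id, "software_adapter": is_software_adapter}


# Regular callers keep the default and still fail on a machine without a GPU.
try:
    create_dml_provider(0, is_software_adapter=True)
    default_rejected = False
except SoftwareAdapterError:
    default_rejected = True

# Doc generation only needs kernels registered, so it opts out of the check.
doc_gen_provider = create_dml_provider(
    0, is_software_adapter=True, skip_software_adapter_check=True
)
```

Keeping the flag false by default means existing callers see no behavior change; only the schema/documentation path would pass true.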

fdwr · 2022-10-25T02:26:49Z

#13428

skottmckay · 2022-10-25T08:09:43Z

@snnn or @tianleiwu could you please sign off so it can be checked in?

… DML EP to initialize on software-only devices (#13428)

Description

The documentation pipeline does not require an actual GPU, and running on GPU-capable agents costs more. To enable running on CPU-only devices and to potentially consolidate future pipelines, add an initialization flag to skip the software device check. The tests are not actually executed on this device anyway; it just needs to initialize the EP for the sake of operator kernel enumeration. This is only an internal overload, not exposed in the public API. See #13308.

Update pool used for doc generation. - add temporary diff to validate it works

snnn

Thank you!

skottmckay added 5 commits October 13, 2022 18:49

Fix the document generation CI. It currently requires either 'build' or 'test' to be specified, but we build using the sln file and skip tests in the stage that was meant to generate the documentation.

517e494

Merge remote-tracking branch 'origin/main' into skottmckay/FixKernelDocumentationGeneration

5a8d0fe

Install ONNX when generating docs

feafd0d

Specify --gen_doc in additional flags to the python package has the op schemas.

275be5d

Add extra condition to args.gen_doc handling

55d35a8

skottmckay requested a review from fdwr October 13, 2022 11:17

skottmckay added 2 commits October 14, 2022 09:56

Use DML pool for doc gen as it needs a GPU for the DML EP to load successfully.

fa2599d

Re-instated commented out stages.

6b2b764

Couple of cleanups.

fdwr previously approved these changes Oct 14, 2022

skottmckay added 2 commits October 14, 2022 10:44

Merge remote-tracking branch 'origin/main' into skottmckay/FixKernelDocumentationGeneration

1087d11

Update docs using CI output.

cbe6e82

skottmckay dismissed fdwr’s stale review via cbe6e82 October 14, 2022 01:18

skottmckay marked this pull request as ready for review October 14, 2022 03:58

skottmckay requested a review from a team October 14, 2022 03:58

Fix default for document generation

cb238b4

snnn reviewed Oct 14, 2022

fdwr previously approved these changes Oct 17, 2022

Merge

f87c0bb

skottmckay dismissed fdwr’s stale review via f87c0bb October 23, 2022 23:56

Add doco from pipeline build

115eb58

fdwr mentioned this pull request Oct 25, 2022

Document generation for operator kernels, enable internal overload of DML EP to initialize on software-only devices #13428

Merged

snnn reviewed Oct 25, 2022

Comment thread tools/ci_build/github/azure-pipelines/win-gpu-ci-pipeline.yml Outdated

skottmckay added 4 commits October 27, 2022 15:10

Merge

801eefa

Update pool used for doc generation. - add temporary diff to validate it works

Use GPU pool

bb5b1e2

Try CUDA pool

95ae49c

Update with build artifact

b7b3439

snnn approved these changes Oct 27, 2022

skottmckay merged commit ab71c4b into main Oct 27, 2022

skottmckay deleted the skottmckay/FixKernelDocumentationGeneration branch October 27, 2022 21:20
