[Relay] Support 'external codegen targets'. #11173
I think it makes sense to me, though I'll need to chew it over a bit more. I have some questions above, mostly about how the external target definitions would be used, and which sets of targets are allowed.
(Part of Collage, https://github.com/apache/tvm-rfcs/blob/main/rfcs/0062-collage.md)

This change prepares the VM and Relay target handling machinery to support external codegen targets in addition to 'regular' targets. This allows us to configure the build with Collage as follows:

```
import tvm
import tvm.relay

host_target = tvm.target.Target("llvm")
targets = [tvm.target.Target("cuda", host_target),
           tvm.target.Target("cutlass", host_target),
           tvm.target.Target("cudnn", host_target)]
with tvm.transform.PassContext(...):
    exe = tvm.relay.vm.compile(module, target=targets)
```

Four changes are required:

1. I introduce four new target kinds for the external codegens currently supported by Collage. Others can be added as they are vetted for use by Collage. These are given a device type matching the external codegen's assumption (ie just CUDA currently), and given a target kind attribute "is_external_codegen" of True. The latter is needed by Collage to signal that the target kind name represents an external codegen 'compiler' name. See the RFC for specifics.
2. I introduce the binary relation Target::IsExternalCodegenFor so that external codegen targets can be related back to the 'underlying' targets they are implicitly using in their codegen.
3. I rework the VMCompiler and BuildModule interfaces to accept an Array<Target> of 'raw targets' instead of a Map<Integer, Target>. This more general representation is needed because we may now have multiple targets of the same device type active simultaneously. I add new static methods on the Python Target to convert to this form in a way that mimics check_and_update_host_consist.
4. I rework CompilationConfig to work from Array<Target> directly, to not depend on the host_target argument (since that is dealt with on the Python side), and to understand that if we have two targets for the same device type the non-external codegen target takes precedence.

The change to CompilationConfig seems neutral with respect to the recent discussions on compilation configuration representation and tvmc.

I made a few attempts to remove Target.check_and_update_host_consist entirely in favor of using CompilationConfig as the definitive target handling choke point but backed out once they became too large.
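To make changes 2 and 4 concrete, here is a minimal pure-Python sketch of the relation and the precedence rule. It does not use TVM: `ToyTarget`, `is_external_codegen_for` and `default_target_per_device` are hypothetical names invented for illustration, and the exact conditions checked by the real Target::IsExternalCodegenFor are an assumption here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToyTarget:
    """Hypothetical stand-in for a target; NOT a TVM class."""
    kind: str
    device_type: str
    is_external_codegen: bool = False


def is_external_codegen_for(external: ToyTarget, regular: ToyTarget) -> bool:
    # Assumed relation: an external codegen target relates to a regular
    # target when both are for the same device type.
    return (
        external.is_external_codegen
        and not regular.is_external_codegen
        and external.device_type == regular.device_type
    )


def default_target_per_device(targets: List[ToyTarget]) -> Dict[str, ToyTarget]:
    # For each device type keep one target; a regular (non-external codegen)
    # target replaces an external codegen target seen earlier.
    chosen: Dict[str, ToyTarget] = {}
    for target in targets:
        current = chosen.get(target.device_type)
        if current is None or (
            current.is_external_codegen and not target.is_external_codegen
        ):
            chosen[target.device_type] = target
    return chosen
```

With `cuda` and `cutlass` both on the CUDA device type, the sketch picks `cuda` as the per-device default regardless of the order in which the two are listed.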
… instead of CompilationConfig (don't want to bake it into any official APIs). - Started unit tests.
CI likely to fail due to stricter FindPrimitiveTargetOrFail but let's see.
Hey thanks Eric. PTAL. Note the Hexagon special case in CompilationConfigNode::Init, which I transliterated from the AOT build. Would be good to check that's the Right Thing to Do.
- Unit test for new Target members.
Last call?
Very nicely done and a pretty slick API for external targets. Thanks @mbs-octoml!
Thanks @mbs-octoml, LGTM regarding the handling for Hexagon as a host target in CompilationConfigNode::Init -- reads to me as equivalent and the AOT tests pass on hardware.
Thanks @mbs-octoml, @Lunderberg, @jwfromm, this is merged!
* [Relay] Support 'external codegen targets'.
* - Working on unit tests
* - Fix two Debug-only failures
* - Use Array<Target> in GraphExecutorCodegen/AOTExecutorCodegen ifaces instead of CompilationConfig (don't want to bake it into any official APIs).
  - Started unit tests.
* - Lints
* - Moar Lints
* - Fix some unit tests
* - Fix last unit test failures
* - whitespace
* - Address Eric's comments. CI likely to fail due to stricter FindPrimitiveTargetOrFail but let's see.
* - Comment adjustments.
  - Unit test for new Target members.
This finishes the work started in apache#11173 to support 'external codegen' targets in the N build-like API surfaces.

- It turns out it's ok if a build is given only a single 'external codegen' target, so remove that check in CompilationConfig::Init. When Collage builds a 'candidate partition' it does so for a single target. Collage does not care whether that target is regular (eg Target("cuda")) or for a specific external codegen (eg Target("cutlass")); it just passes the target into the build.
- Add CompilationConfig::FindPrimitiveTargetForKind, which I'll later need to retrieve the external codegen Target instance corresponding to a "Compiler" attribute value.
- Target.update_target_host_consist was supporting three API styles:
  - single target
  - map from device type to target
  - map from target to IRModule (for the ir_to_runtime API)

  I replaced all those calls with a more specialized 'canonicalize' call:
  - Target.canonicalize_target_and_host
  - Target.canonicalize_multi_targets_and_host
  - Target.canonicalize_target_map_and_host

  In particular, the tuning interfaces (task extraction, tuning, tuning records) explicitly *do not* support multiple targets since the underlying code just doesn't support that.
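The idea of normalizing several input styles into one list of targets, and then looking a target up by kind name, can be sketched in plain Python. This is a toy model only: the `toy_*` functions and the `(kind, host)` tuple representation are hypothetical and are not the signatures of the real Target.canonicalize_* methods or of CompilationConfig::FindPrimitiveTargetForKind.

```python
from typing import Dict, List, Optional, Tuple, Union

# Toy target: (kind name, host kind name or None). NOT a TVM type.
ToyTarget = Tuple[str, Optional[str]]


def toy_canonicalize_target_and_host(
    target: str, host: Optional[str] = None
) -> ToyTarget:
    # Single-target style: attach the host to the one target.
    return (target, host)


def toy_canonicalize_multi_targets_and_host(
    targets: Union[str, List[str], Dict[int, str]], host: Optional[str] = None
) -> List[ToyTarget]:
    # Accept a single kind, a list of kinds, or a device-type-to-kind map,
    # and always return a list of targets sharing the same host.
    if isinstance(targets, str):
        kinds = [targets]
    elif isinstance(targets, dict):
        kinds = list(targets.values())
    else:
        kinds = list(targets)
    return [(kind, host) for kind in kinds]


def toy_find_primitive_target_for_kind(
    targets: List[ToyTarget], kind: str
) -> Optional[ToyTarget]:
    # Return the first target whose kind name matches, e.g. the value of a
    # "Compiler" attribute such as "cutlass"; None if there is no match.
    for target in targets:
        if target[0] == kind:
            return target
    return None
```

A list holding only an external codegen kind, such as `["cutlass"]`, passes through unchanged in this model, mirroring the point that a build may be given a single 'external codegen' target.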
* Finish support for list-of-targets
* - Lints
  - Revert unintended changes
* - more lints
* - Fix model_library_format handling of target.
  - Improve comments in compilation_config.h
* - Lints
  - Update target/target_host params documentation
* - Fix micro library format tests
  - Rev micro library format from 5 to 6
  - Use Target.current() in a few places
* - eta contract comprehension
* - Woops, one more device: target map left
  - Handle host already being in Target
* - lint
* - lint
* - Bug with append
  - Take device type from target
* - Fix hexagon
(Part of Collage, https://github.com/apache/tvm-rfcs/blob/main/rfcs/0062-collage.md)

This change prepares the target handling machinery to support 'external codegen' targets in addition to 'regular' targets. This allows us to configure the build with Collage by passing a list of targets (for example "cuda", "cutlass" and "cudnn", all sharing an "llvm" host target) as the target argument of tvm.relay.vm.compile.

Four changes are required:

1. I introduce four new target kinds for the external codegens currently supported by Collage ("tensorrt", "cutlass", "cudnn" and "cublas"). Others can be added as they are vetted for use by Collage. These are given a device type matching the external codegen's assumption (ie just CUDA currently), and given a target kind attribute "is_external_codegen" of True. The latter is needed by Collage to signal that the target kind name represents an external codegen 'compiler' name.
2. I introduce the binary relation Target::IsExternalCodegenFor so that 'external codegen' targets can be related back to the 'regular' targets they are implicitly using in their codegen.
3. I rework the VMCompiler and BuildModule interfaces to accept an Array<Target> of 'raw targets' instead of a Map<Integer, Target>. This more general representation is needed because we may now have multiple targets of the same device type active simultaneously. I add new static methods on the Python Target to convert to this form in a way that mimics check_and_update_host_consist.
4. I rework CompilationConfig to work from Array<Target> directly, to not depend on the host_target argument (since dealt with adequately on the Python side), and to understand that if we have two targets for the same device type the non-external codegen target takes precedence.
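The reason a map keyed by device type no longer suffices can be shown with a few lines of plain Python. This is an illustration only: the `(kind, device type)` pairs and both helper functions are hypothetical, and 2 is used as the device type because that is the DLPack code for CUDA.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Illustrative (kind name, device type) pairs; all three share device type 2.
RAW_TARGETS: List[Tuple[str, int]] = [("cuda", 2), ("cutlass", 2), ("cudnn", 2)]


def as_device_map(raw_targets: List[Tuple[str, int]]) -> Dict[int, str]:
    # A map from device type to target keeps one entry per device type,
    # so later targets silently overwrite earlier ones.
    return {device_type: kind for kind, device_type in raw_targets}


def group_by_device(raw_targets: List[Tuple[str, int]]) -> Dict[int, List[str]]:
    # A plain list keeps every target; grouping is done only when needed.
    grouped: Dict[int, List[str]] = defaultdict(list)
    for kind, device_type in raw_targets:
        grouped[device_type].append(kind)
    return dict(grouped)
```

Here `as_device_map(RAW_TARGETS)` retains a single kind for device type 2, while `group_by_device(RAW_TARGETS)` retains all three, which is the property the list-of-targets representation is meant to preserve.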
The change to CompilationConfig seems neutral with respect to the recent discussions on compilation configuration representation and tvmc. I've made sure to not expose CompilationConfig in any core APIs, preferring the more neutral Array<Target> instead.

I made a few attempts to remove Target.check_and_update_host_consist entirely in favor of using CompilationConfig as the definitive target handling choke point but backed out each time due to difficulties supporting the existing Python code.