
[Relay] Support 'external codegen targets'. #11173

Merged
merged 11 commits into apache:main from mbs-collage-targets on May 4, 2022

Conversation

Contributor

@mbs-octoml commented Apr 28, 2022

(Part of Collage, https://github.com/apache/tvm-rfcs/blob/main/rfcs/0062-collage.md)

This change prepares the target handling machinery to support
'external codegen' targets in addition to 'regular' targets. This allows us
to configure the build with Collage as follows:

    host_target = tvm.target.Target("llvm")
    targets = [tvm.target.Target("cuda", host_target),
               tvm.target.Target("cutlass", host_target),
               tvm.target.Target("cudnn", host_target)]
    with tvm.transform.PassContext(...):
        exe = tvm.relay.vm.compile(module, target=targets)

Four changes are required:

  1. I introduce four new target kinds for the external codegens currently supported
    by Collage ("tensorrt", "cutlass", "cudnn" and "cublas"). Others can be added as
    they are vetted for use by Collage. These are given a device type matching the
    external codegen's assumption (i.e. just CUDA currently), and a target kind
    attribute "is_external_codegen" of True. The latter is needed by Collage to signal
    that the target kind name represents an external codegen 'compiler' name.
  2. I introduce the binary relation Target::IsExternalCodegenFor so that
    'external codegen' targets can be related back to the 'regular' targets
    they are implicitly using in their codegen (see the sketch after this list).
  3. I rework the VMCompiler and BuildModule interfaces to accept an Array of
    'raw targets' instead of a Map<Integer, Target>. This more general representation
    is needed because we may now have multiple targets of the same device type
    active simultaneously. I add new static methods on the Python Target to
    convert to this form in a way that mimics check_and_update_host_consist.
  4. I rework CompilationConfig to work from the Array directly, to not depend
    on the host_target argument (since that is dealt with adequately on the Python
    side), and to understand that if we have two targets for the same device type
    the non-external codegen target takes precedence.
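
To make (1) and (2) concrete, here is a minimal sketch. It is not part of the TVM API:
the is_external_codegen_target helper below is hypothetical shorthand for "this target's
kind was registered with the is_external_codegen=True attribute", approximated here by
checking against the four kind names listed above.

```
import tvm

# The four external codegen target kinds added by this change.
EXTERNAL_CODEGEN_KINDS = {"tensorrt", "cutlass", "cudnn", "cublas"}

def is_external_codegen_target(target):
    """Hypothetical helper: does this target name an external codegen 'compiler'?"""
    return target.kind.name in EXTERNAL_CODEGEN_KINDS

host = tvm.target.Target("llvm")
raw_targets = [
    tvm.target.Target("cuda", host),     # regular target (CUDA device type)
    tvm.target.Target("cutlass", host),  # external codegen target (also CUDA)
]

# Mirrors the intent of Target::IsExternalCodegenFor: the "cutlass" target is
# external codegen for the "cuda" target because they share a device type and
# only the former names an external codegen 'compiler'.
regular = [t for t in raw_targets if not is_external_codegen_target(t)]
external = [t for t in raw_targets if is_external_codegen_target(t)]
```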

The change to CompilationConfig seems neutral with respect to the recent
discussions on compilation configuration representation and tvmc.

I've made sure to not expose CompilationConfig in any core APIs, preferring
the more neutral Array<Target> instead.

I made a few attempts to remove Target.check_and_update_host_consist entirely in
favor of using CompilationConfig as the definitive target-handling choke point, but
backed out each time due to difficulties supporting the existing Python code.

@mbs-octoml force-pushed the mbs-collage-targets branch 2 times, most recently from 669999a to 62b9d75 on May 4, 2022 01:03
@mbs-octoml marked this pull request as ready for review May 4, 2022 01:23
Contributor

@Lunderberg left a comment


I think it makes sense to me, though I'll need to chew it over a bit more. I have some questions above, mostly about how the external target definitions would be used, and which sets of targets are allowed.

Review threads: src/target/compilation_config.cc, include/tvm/target/compilation_config.h, src/relay/backend/te_compiler.cc
@mbs-octoml
Contributor Author

Hey thanks Eric. PTAL. Note the Hexagon special case in CompilationConfigNode::Init, which I transliterated from the AOT build. Would be good to check that's the Right Thing to Do.

@mbs-octoml
Contributor Author

Last call?

Contributor

@jwfromm left a comment


Very nicely done and a pretty slick API for external targets. Thanks @mbs-octoml!

Contributor

@csullivan left a comment


Thanks @mbs-octoml, LGTM regarding the handling for Hexagon as a host target in CompilationConfigNode::Init -- reads to me as equivalent and the AOT tests pass on hardware.

@csullivan merged commit 521b80a into apache:main May 4, 2022
@csullivan
Contributor

Thanks @mbs-octoml, @Lunderberg, @jwfromm, this is merged!

@mbs-octoml deleted the mbs-collage-targets branch May 5, 2022 00:58
shtinsa pushed a commit to Deelvin/tvm that referenced this pull request May 17, 2022
* [Relay] Support 'external codegen targets'.

mbs-octoml added a commit to mbs-octoml/mbs-tvm that referenced this pull request May 19, 2022
This finishes the work started in apache#11173 to support
'external codegen' targets in the N build-like API surfaces.

 - It turns out it's ok if a build is given only a single 'external codegen' target, so remove that check
   in CompilationConfig::Init. When Collage builds a 'candidate partition' it does so for a single target.
   Collage does not care whether that target is regular (e.g. Target("cuda")) or a specific external
   codegen (e.g. Target("cutlass")); it just passes the target into the build.

 - Add CompilationConfig::FindPrimitiveTargetForKind which I'll later need to retrieve
   the external codegen Target instance corresponding to a "Compiler" attribute value.

 - Target.check_and_update_host_consist was supporting three API styles:
    - single target
    - map from device type to target
    - map from target to IRModule (for the ir_to_runtime API)
   I replaced all those calls with more specialized 'canonicalize' calls (see the sketch after this message):
    - Target.canonicalize_target_and_host
    - Target.canonicalize_multi_targets_and_host
    - Target.canonicalize_target_map_and_host
   In particular, all the tuning interfaces (task extraction, tuning, tuning records) explicitly
   *do not* support multiple targets since the underlying code just doesn't support that.
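
   As a rough illustration only (a minimal sketch using nothing beyond the public Target
   constructor; the canonicalize helpers named above may differ in name and signature),
   "canonicalizing" a list of raw targets amounts to folding the common host into each one:

```
import tvm

def canonicalize_multi_targets_and_host_sketch(raw_targets, host):
    """Hypothetical stand-in for the canonicalize helpers named above:
    attach the common host to every raw target spec (string or Target),
    yielding the plain list of targets the new build interfaces accept."""
    host = tvm.target.Target(host) if isinstance(host, str) else host
    return [tvm.target.Target(t, host) for t in raw_targets]

# e.g. turn a list of target kind names into host-attached Target objects.
targets = canonicalize_multi_targets_and_host_sketch(["cuda", "cutlass"], "llvm")
```
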
mbs-octoml added a commit to mbs-octoml/mbs-tvm that referenced this pull request May 20, 2022
mbs-octoml added a commit to mbs-octoml/mbs-tvm that referenced this pull request May 22, 2022
jwfromm pushed a commit that referenced this pull request May 23, 2022
* Finish support for list-of-targets


* - Lints
- Revert unintended changes

* - more lints

* - Fix model_library_format handling of target.
- Improve comments in compilation_config.h

* - Lints
- Update target/target_host params documentation

* - Fix micro library format tests
- Rev micro library format from 5 to 6
- Use Target.current() in a few places

* - eta contract comprehension

* - Woops, one more device: target map left
- Handle host already being in Target

* - lint

* - lint

* - Bug with append
- Take device type from target

* - Fix hexagon
SebastianBoblest pushed a commit to SebastianBoblest/tvm that referenced this pull request May 27, 2022
* [Relay] Support 'external codegen targets'.

juda pushed a commit to juda/tvm that referenced this pull request Jun 21, 2022
* [Relay] Support 'external codegen targets'.

juda pushed a commit to juda/tvm that referenced this pull request Jun 21, 2022
* Finish support for list-of-targets
