[Hexagon][Runtime] Better support for 2-tier memory #12574

Merged
tmoreau89 merged 1 commit into apache:main from cconvey:alloc-2tier
Sep 6, 2022
Conversation

Contributor

@cconvey commented Aug 24, 2022

[hexagon][runtime] better support for 2-tier memory

  • Introduce 'global.ddr' memory scope:

    • Like 'global', this allocates memory from the Hexagon SoC's
      DDR memory.
    • Like 'global.vtcm', the specified tensor shape must be 1d
      or 2d, where 2d indicates Hexagon's "indirect tensor"
      (i.e., discontiguous) allocation scheme.
  • Change memory-alignment strategy to always be 2048-byte aligned
    on Hexagon. (This can be refined in the future, but for now it
    ensures all allocations meet the strictest alignment requirements
    for any Hexagon operations.)

cc @areusch @mehrdadh
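The two changes in the description can be illustrated with a small, self-contained Python sketch. This is not TVM code and the helper names (`align_up`, `IndirectTensor2D`) are hypothetical; it only models the ideas: rounding every allocation up to the 2048-byte boundary, and treating a 2-d "indirect tensor" as a table of separately allocated (hence possibly discontiguous) rows.

```python
# Illustrative sketch only -- not TVM's Hexagon runtime implementation.
# Helper names here are hypothetical.

HEXAGON_ALLOC_ALIGNMENT = 2048  # the strict alignment the PR adopts for all allocations


def align_up(nbytes: int, alignment: int = HEXAGON_ALLOC_ALIGNMENT) -> int:
    """Round an allocation size up to the next multiple of `alignment`."""
    return (nbytes + alignment - 1) // alignment * alignment


class IndirectTensor2D:
    """Model of the discontiguous 2-d allocation scheme: a pointer table
    whose entries each refer to a separately allocated row buffer, so
    the rows need not be contiguous in memory."""

    def __init__(self, nrows: int, row_nbytes: int):
        # Each row is its own allocation, padded up to the alignment boundary.
        self.rows = [bytearray(align_up(row_nbytes)) for _ in range(nrows)]

    def row(self, i: int) -> bytearray:
        return self.rows[i]
```

For example, `align_up(2049)` pads up to 4096, while `align_up(2048)` stays at 2048; a 1-d allocation would simply be a single `align_up`-sized buffer.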

@cconvey force-pushed the alloc-2tier branch 6 times, most recently from 3e8a668 to aa3355b, on August 29, 2022 17:33
@cconvey changed the title from [WIP] Alloc 2tier to Alloc 2tier on Aug 29, 2022
Contributor Author

cconvey commented Aug 29, 2022

@Lunderberg @csullivan @kparzysz-quic : Heads up, this should be ready for review once CI passes.

@cconvey force-pushed the alloc-2tier branch 2 times, most recently from a01b450 to eba609d, on August 29, 2022 18:35
Contributor Author

cconvey commented Aug 29, 2022

@csullivan @Lunderberg @adstraw @kparzysz-quic: FYI if anyone wants to review

Looking at the CI results, I suspect there's a non-trivial issue with some existing code that uses "global" scope. The PR won't be ready for review until I look into that.

@cconvey changed the title from Alloc 2tier to [WIP] alloc 2tier on Aug 30, 2022
Contributor Author

cconvey commented Aug 31, 2022

This might need fixing:

tests/python/contrib/test_hexagon/test_2d_physical_buffers.py::TestElementWise::test_execute[nchw-8h8w32c-1d-nchw-8h8w32c-1d-int8-2-32-8x8-nhwc-global.vtcm-hexagon] [09:37:00] /home/cconvey/r/cconvey/tvm/alloc-2tier/src/target/target_info.cc:45: Warning: MemoryInfo for scope = global.vtcm is undefined
[09:37:00] /home/cconvey/r/cconvey/tvm/alloc-2tier/src/target/target_info.cc:45: Warning: MemoryInfo for scope = global.vtcm is undefined

@cconvey force-pushed the alloc-2tier branch 4 times, most recently from e3d059d to 99346e4, on September 1, 2022 19:17
Contributor Author

cconvey commented Sep 1, 2022

This might need fixing:

tests/python/contrib/test_hexagon/test_2d_physical_buffers.py::TestElementWise::test_execute[nchw-8h8w32c-1d-nchw-8h8w32c-1d-int8-2-32-8x8-nhwc-global.vtcm-hexagon] [09:37:00] /home/cconvey/r/cconvey/tvm/alloc-2tier/src/target/target_info.cc:45: Warning: MemoryInfo for scope = global.vtcm is undefined
[09:37:00] /home/cconvey/r/cconvey/tvm/alloc-2tier/src/target/target_info.cc:45: Warning: MemoryInfo for scope = global.vtcm is undefined

I discussed this via DM with @csullivan. His impression is that having supporting MemoryInfo for a given memory-scope can be useful, but isn't required. So I'm not planning to deal with this warning in this PR.
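The behavior described, where a missing MemoryInfo entry produces a warning rather than an error, can be sketched as follows. This is a hypothetical illustration of the lookup semantics, not TVM's API; the registry contents and function name are invented for the example.

```python
# Hypothetical sketch of an optional MemoryInfo lookup: a scope without
# a registered entry produces a warning (mirroring the message from
# src/target/target_info.cc) instead of a hard failure.
import warnings

MEMORY_INFO_REGISTRY = {
    # "global.vtcm" deliberately absent, matching the warning above.
    "global": {"unit_bits": 8, "max_simd_bits": 1024},
}


def get_memory_info(scope: str):
    """Return MemoryInfo for `scope`, warning (not failing) if undefined."""
    info = MEMORY_INFO_REGISTRY.get(scope)
    if info is None:
        warnings.warn(f"MemoryInfo for scope = {scope} is undefined")
    return info
```

Under this model, callers that merely want hints (e.g. capacity limits) keep working when a scope like `global.vtcm` has no entry, which is consistent with treating MemoryInfo as useful but not required.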

@cconvey changed the title from [WIP] alloc 2tier to alloc 2tier on Sep 1, 2022
Contributor Author

cconvey commented Sep 1, 2022

@csullivan @jverma-quic : request for review

Contributor

@nverke left a comment


I like this change! It might have saved me from a bug that was pushed to mainline a few days ago.

Ship it!

Contributor

@csullivan left a comment


LGTM!

shape = tvm.testing.parameter((128, 128))

(scope, axis_separators,) = tvm.testing.parameters(
    ("global", []),
Contributor


nit: Could add an xfail test for ("global", [1]) to indicate this expectation explicitly.
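The suggested xfail case might look something like the sketch below, written with plain `pytest` to show the intent; whether `tvm.testing.parameters` accepts `pytest.param` entries in exactly this form is an assumption, and the reason string is invented.

```python
# Sketch of the reviewer's suggested explicit xfail for ("global", [1]).
# Shown with plain pytest; integration with tvm.testing.parameters is
# an assumption.
import pytest

scope_axis_params = [
    ("global", []),
    pytest.param(
        "global",
        [1],
        marks=pytest.mark.xfail(
            reason="axis separators are not expected to work with 'global' scope"
        ),
    ),
]
```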

@cconvey changed the title from alloc 2tier to [Hexagon][Runtime] Better support for 2-tier memory on Sep 6, 2022
@tmoreau89 merged commit 832cffa into apache:main on Sep 6, 2022
Contributor

@adstraw left a comment


LGTM, thanks for the clarifying comments.

@cconvey deleted the alloc-2tier branch on September 6, 2022 15:06
@tmoreau89
Contributor

Thank you @cconvey @csullivan @adstraw @nverke, the PR has been merged.

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
