memory planning: add offset to planning output and respect it in graph executor #8134

Closed
rafzi wants to merge 2 commits

rafzi
Contributor

@rafzi rafzi commented May 25, 2021

This adds the option to specify an offset in the memory planning output, so that buffers can be placed at positions other than the base address. Buffers can then also partially overlap, which allows more efficient buffer placement.
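
To illustrate the idea, here is a minimal sketch of how an executor could turn a planned (storage id, offset, size) triple into a view of a shared storage block. The names `PlannedBuffer` and `resolve`, and the field layout, are hypothetical; they are not taken from this PR or from TVM's graph executor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedBuffer:
    """Hypothetical planner output for one buffer (not TVM's actual format)."""
    storage_id: int  # which backing storage block the buffer lives in
    offset: int      # byte offset from the block's base address
    size: int        # buffer size in bytes


def resolve(storages, buf):
    """Return a writable view of `buf` inside its backing storage block."""
    block = storages[buf.storage_id]
    end = buf.offset + buf.size
    if buf.offset < 0 or end > len(block):
        raise ValueError(f"{buf} does not fit in a {len(block)}-byte storage")
    return memoryview(block)[buf.offset:end]


# One 64-byte storage block shared by two buffers.
storages = [bytearray(64)]
a = PlannedBuffer(storage_id=0, offset=0, size=32)
b = PlannedBuffer(storage_id=0, offset=16, size=32)  # partially overlaps `a`

view_a = resolve(storages, a)
view_b = resolve(storages, b)

# Bytes 16..31 of the block are visible through both views.
view_a[16:32] = b"\x01" * 16
```

Because both views alias bytes 16 to 31 of the same block, a placement like this is only safe when the two buffers are never live at the same time, which is what the planner would have to guarantee.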

More details in this discussion: https://discuss.tvm.apache.org/t/discussion-alignment-memory-planning/9730

@areusch areusch self-assigned this Jun 3, 2021
@rafzi
Contributor Author

rafzi commented Jun 12, 2021

It seems that TVM's long-term plans conflict with this approach, in that memory planning is supposed to happen in TIR.

Is this something that is useful to TVM right now? Should I continue work on this or drop it in favor of the upcoming approach?

@areusch
Contributor

areusch commented Jun 14, 2021

@rafzi apologies for the delay in reviewing this one. I'm not sure there is broad alignment yet on the way we intend to do full-graph memory planning in TVM. And even when we do come to agreement on a model for memory (which I think may look similar to the one you're working towards here), we still need to implement support for it in both the Graph and AOT executors. Also, the Graph executor invokes TIR PrimFuncs, so it's likely that something similar to this PR will be useful. My thinking is that what you have here is fairly close, and we'll just need to rename fields or add additional ones, e.g. pool_id, to give more context to the offset.

So I'm not convinced we should drop this PR; however, before proceeding, I'd like to get everyone aligned around a single memory planning proposal. There are a few theoretically orthogonal pieces to such a proposal: a) the interface between the TVM graph and the memory planner; b) the algorithm(s) used in planning; c) the interface between TVM and the executors. At present there are two suggestions for (a): a TIR-level interface and a Relay-level planner. I think the TIR-based planner offers more flexibility, but the Relay one is easier to implement (e.g. it's nearly complete in the tree today).
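
As one way to picture the split between (a), (b), and (c), the sketch below keeps the planner input (buffer sizes and live ranges) and the planner output (byte offsets into one pool) as plain data, so the algorithm in between can be swapped out. The greedy first-fit strategy and all names (`BufferInfo`, `plan_offsets`) are illustrative assumptions; this is neither the algorithm from this PR nor the interface from the USMP RFC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferInfo:
    """Hypothetical planner input: one buffer and its live range."""
    name: str
    size: int        # bytes
    first_use: int   # index of the first op that touches the buffer
    last_use: int    # index of the last op that touches it (inclusive)


def lifetimes_overlap(a, b):
    return a.first_use <= b.last_use and b.first_use <= a.last_use


def plan_offsets(buffers, alignment=1):
    """Greedy first-fit: largest buffers first, lowest conflict-free offset."""
    placed = {}  # name -> (offset, BufferInfo)
    for buf in sorted(buffers, key=lambda b: -b.size):
        # Address ranges occupied by buffers that are live alongside `buf`.
        busy = sorted(
            (off, off + other.size)
            for off, other in placed.values()
            if lifetimes_overlap(buf, other)
        )
        offset = 0
        for start, end in busy:
            if offset + buf.size <= start:
                break  # the gap before this range is large enough
            offset = max(offset, end)
            offset = -(-offset // alignment) * alignment  # round up
        placed[buf.name] = (offset, buf)
    offsets = {name: off for name, (off, _) in placed.items()}
    pool_size = max((off + b.size for off, b in placed.values()), default=0)
    return offsets, pool_size


buffers = [
    BufferInfo("a", size=32, first_use=0, last_use=1),
    BufferInfo("b", size=16, first_use=0, last_use=1),
    BufferInfo("c", size=40, first_use=2, last_use=3),
]
offsets, pool_size = plan_offsets(buffers)
print(offsets, pool_size)
```

In this example `c` is placed over the region that `a` and part of `b` used earlier, which is the kind of partial overlap that an offset field in the planning output makes expressible.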

Would you be interested in reviewing the TIR-level interface proposed in the USMP RFC? It would be great to get your thoughts on whether it's possible to implement the algorithms you've proposed using that interface as well.

Given there is some interest from the community in doing whole-program TIR optimization, plus the AOT top-level function is in TIR, it may be slightly more impactful to adopt that interface. However, I'd like to understand whether that precludes including the algorithms you've proposed here. Finally, this PR could serve as a basis to implement the Graph executor changes required to support (c).

Let me know your thoughts!

@jroesch
Member

jroesch commented Jan 19, 2022

This PR appears to be out of date; please feel free to reopen it if this is not the case.

As part of the new year we are attempting to triage the project's open pull requests to ensure that code which is ready for review and/or merging receives adequate attention.

Thanks again for your contribution, and feel free to reach out to discuss these changes.

@jroesch jroesch closed this Jan 19, 2022