[MCA][MCSchedModel] Add an optional DelayCycles vector for SchedWriteRes. #45218

@adibiagio

Bugzilla Link: 45873
Version: trunk
OS: Windows NT
CC: @adibiagio, @legrosbuffle, @jrmuizel, @LebedevRI, @RKSimon

Extended Description

This was suggested by Andy in https://lists.llvm.org/pipermail/llvm-dev/2020-May/141487.html

The idea is to add a DelayCycles vector to SchedWriteRes to indicate the relative start cycle for each reserved resource. That would effectively model dependent uOps.

At the moment, it is not possible to delay the consumption of specific hardware resources. The expectation is that resource consumption always starts at relative cycle #0 (i.e. relative to the instruction issue cycle).

A vector of DelayCycles (if present) would contain unsigned integer values (ideally one for each processor resource consumed by a write), and those values would be offsets in cycles relative to the issue cycle.
The absence of a DelayCycles vector would be semantically equivalent to an all-zeroes DelayCycles vector.
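
To make the proposed semantics concrete, here is a minimal Python sketch of how a missing DelayCycles vector could be normalized to all zeroes. The `WriteRes` class and the `effective_delays` helper are hypothetical names used only for illustration; they are not part of TableGen or llvm-mca. The length check is an assumption based on the "ideally one for each processor resource" wording above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class WriteRes:
    """Hypothetical stand-in for a SchedWriteRes record (not the TableGen class)."""
    resources: Tuple[str, ...]
    resource_cycles: Tuple[int, ...]
    delay_cycles: Optional[Tuple[int, ...]] = None  # the proposed, optional field


def effective_delays(write: WriteRes) -> Tuple[int, ...]:
    """Return one start offset per consumed resource, relative to the issue cycle."""
    if len(write.resource_cycles) != len(write.resources):
        raise ValueError("expected one cycle count per resource")
    if write.delay_cycles is None:
        # No vector given: same meaning as an all-zeroes vector.
        return (0,) * len(write.resources)
    # Assumption: a vector that is present must cover every resource.
    if len(write.delay_cycles) != len(write.resources):
        raise ValueError("expected one delay per resource")
    if any(delay < 0 for delay in write.delay_cycles):
        raise ValueError("delays are unsigned offsets")
    return tuple(write.delay_cycles)
```

With this sketch, `WriteRes(("A", "B"), (1, 1))` and `WriteRes(("A", "B"), (1, 1), (0, 0))` yield the same delays, which is the equivalence described above.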

This would require a mostly mechanical change in tablegen to teach it how to parse and semantically analyze this new concept. The subtarget-emitter would eventually generate information about those delay-cycles in a table.
A more complicated change would be needed for the bookkeeping logic in mca (HardwareUnits/ResourceManager.cpp).
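
As a rough illustration of why the bookkeeping gets harder, the Python sketch below tracks busy cycles per resource once start offsets are allowed. It is a toy model, not the llvm-mca ResourceManager: the `ResourceTracker` class, its methods, and the resource names "A" and "B" are all hypothetical, and it assumes each resource appears at most once per write. The point is that a single "busy until" counter per resource is no longer enough, because a delayed reservation leaves a usable gap before it.

```python
from collections import defaultdict


class ResourceTracker:
    """Toy per-cycle bookkeeping (hypothetical; not llvm-mca's ResourceManager)."""

    def __init__(self):
        # resource name -> set of absolute cycles in which it is reserved
        self._busy = defaultdict(set)

    @staticmethod
    def _windows(issue_cycle, uses):
        # Each use is (resource, cycles, delay); delay is relative to issue.
        for resource, cycles, delay in uses:
            start = issue_cycle + delay
            yield resource, range(start, start + cycles)

    def can_issue(self, issue_cycle, uses):
        """True if every requested window is free of earlier reservations."""
        return all(
            self._busy[resource].isdisjoint(window)
            for resource, window in self._windows(issue_cycle, uses)
        )

    def issue(self, issue_cycle, uses):
        """Reserve all windows, or reserve nothing and return False."""
        if not self.can_issue(issue_cycle, uses):
            return False
        for resource, window in self._windows(issue_cycle, uses):
            self._busy[resource].update(window)
        return True


tracker = ResourceTracker()
delayed = [("A", 1, 0), ("B", 1, 2)]   # B is only needed two cycles after issue
plain = [("B", 1, 0)]                  # consumes B immediately

first = tracker.issue(0, delayed)      # reserves A at cycle 0 and B at cycle 2
second = tracker.issue(0, plain)       # B is still free at cycle 0
third = tracker.issue(2, plain)        # B is reserved at cycle 2
```

In this sketch the second write fits into the gap on B before the delayed reservation begins, which a model that always starts consumption at the issue cycle cannot express.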

Most x86 processor models would probably benefit from this change. SchedWrite definitions which might benefit from this change are writes for horizontal operations. On most x86 processors, horizontal add/sub is decoded into a pair of shuffle uOPs followed by a single (data-dependent) vector ADD uOP.
The ADD uOP doesn't execute immediately because it needs to wait for the other two shuffle uOPs. So the ALU pipe is still available at relative cycle #0, and it is only consumed by the horizontal operation starting from relative cycle #1.
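
The horizontal add case can be worked through with a small Python sketch that computes the absolute busy window of each resource. The `busy_windows` helper and the resource names "Shuffle" and "ALU" are hypothetical, and the one-cycle durations are illustrative rather than taken from any real scheduling model; only the ALU start offset of 1 comes from the description above.

```python
def busy_windows(issue_cycle, resources, cycles, delays=None):
    """Map each resource to its half-open busy window [start, end)."""
    if delays is None:
        delays = [0] * len(resources)  # current behavior: everything starts at issue
    return {
        resource: (issue_cycle + delay, issue_cycle + delay + count)
        for resource, count, delay in zip(resources, cycles, delays)
    }


# Horizontal add issued at cycle 0, modeled without and with start offsets.
current = busy_windows(0, ["Shuffle", "ALU"], [1, 1])
proposed = busy_windows(0, ["Shuffle", "ALU"], [1, 1], delays=[0, 1])

print(current)   # {'Shuffle': (0, 1), 'ALU': (0, 1)}
print(proposed)  # {'Shuffle': (0, 1), 'ALU': (1, 2)}
```

Without offsets the ALU is marked busy during cycle 0 even though the ADD uOP cannot start yet; with an offset of 1 the ALU window begins at cycle 1, so another instruction could use the ALU at cycle 0.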

This was just an example. There are probably various write descriptors (not just writes for microcoded instructions) which would benefit from this change.

This will also solve a number of known problems with the descriptors in Haswell/Broadwell. Last but not least, it would allow us to simplify the bookkeeping logic in llvm-mca and get rid of the not-so-nice "reserved" bit for processor resource groups. More details about those two issues can be found in the above-mentioned llvm-dev thread.
