Avoid generating store of uninitialized auto when reducing TRT2
In the case of a single non-constant delimiter, it's possible for the pattern to match when the original load and comparison look as follows:

    n56n  istore elem
    n55n    b2i
    n54n      bloadi <array-shadow>
                ...
    n60n  istore tmpDelim
    n59n    b2i
    n58n      i2b
    n57n        iload delim
    n65n  ificmpeq --> block_EXIT
    n55n    ==>b2i
    n59n    ==>b2i

With such a match, the booltable node of the pattern graph corresponds to the ificmpeq node of the target graph. The transformer function for TRT2 (CISCTransform2FindBytes) finds this target node and from there it uses the second child as the delimiter, but instead of the b2i that one might expect, that second child is a variable (tmpDelim) that has been matched up with the store node (n60n in this example).

This situation results in the transformation generating a load of tmpDelim while at the same time removing the definition (n60n) that provides the value that is expected at that load.

This problem should be a pretty rare occurrence. If the tmpDelim store is dead, it should usually have been eliminated. If OTOH it isn't dead, then the pattern doesn't match. As such, it would probably be reasonable to detect this case and simply refuse to transform.

However, it's not necessarily straightforward to detect the problem. I believe that tableCISCNode->getHeadOfTrNodeInfo()->_node being a store indicates that the problem is occurring, but it's not obvious that we couldn't see the same fundamental problem with a load node, or some other node that has a load as a descendant. With this uncertainty, reliably detecting the problem case requires walking a subtree and looking for auto loads that aren't loop-invariant.

Since detecting the problem is already that complex, and since even in the presence of a store the delimiter might be loop-invariant anyway, this commit goes slightly further and chases down single definitions to construct an expression that does not load from autos that are defined in the loop (if possible).
In the example above, that means that we still remove the tmpDelim store, but now the arraytranslateAndTest node uses b2i (i2b (iload delim)) instead of iload tmpDelim.
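
The definition-chasing idea can be illustrated with a small standalone sketch. This is not the code from the commit and it does not use the real compiler IL classes: Node, LoopDefs, mk, chase and show are hypothetical names over a toy tree, and the sketch assumes that a single in-loop definition reaches every use and that indirect loads are never loop-invariant.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy IR node (hypothetical, not the compiler's node class):
// an opcode, an optional auto name, and child nodes.
struct Node {
    std::string op;                           // e.g. "iload", "istore", "b2i", "i2b"
    std::string sym;                          // auto name for loads/stores, else empty
    std::vector<std::shared_ptr<Node>> kids;
};
using NodeP = std::shared_ptr<Node>;

NodeP mk(std::string op, std::string sym = "", std::vector<NodeP> kids = {}) {
    return std::make_shared<Node>(Node{std::move(op), std::move(sym), std::move(kids)});
}

// For each auto, the stores to it that occur inside the loop being transformed.
using LoopDefs = std::map<std::string, std::vector<NodeP>>;

// Build an expression equivalent to `n` that loads no auto defined in the loop.
// Returns nullptr when that is not possible, in which case a caller would
// presumably refuse to transform.
NodeP chase(const NodeP &n, const LoopDefs &defs, int depth = 0) {
    assert(n != nullptr);
    if (depth > 16) return nullptr;               // guard against cyclic definitions
    if (n->op == "bloadi") return nullptr;        // assumption: indirect loads vary in the loop
    if (n->op == "istore")                        // matched a store: use the stored value
        return chase(n->kids.at(0), defs, depth + 1);
    if (n->op == "iload") {
        auto it = defs.find(n->sym);
        if (it == defs.end()) return n;           // no in-loop store: loop-invariant load
        if (it->second.size() != 1) return nullptr;   // more than one definition: give up
        return chase(it->second[0]->kids.at(0), defs, depth + 1);
    }
    NodeP copy = mk(n->op, n->sym);               // rebuild other nodes from chased children
    for (const auto &k : n->kids) {
        NodeP c = chase(k, defs, depth + 1);
        if (!c) return nullptr;
        copy->kids.push_back(c);
    }
    return copy;
}

// Render a tree in the "b2i (i2b (iload delim))" notation used in the message.
std::string show(const NodeP &n) {
    std::string s = n->op;
    if (!n->sym.empty()) s += " " + n->sym;
    for (const auto &k : n->kids) s += " (" + show(k) + ")";
    return s;
}
```

With the tmpDelim store from the example registered as the only in-loop definition, chasing either iload tmpDelim or the store node itself yields b2i (i2b (iload delim)), while an auto such as elem, whose value comes from the array load, yields nullptr.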
Showing 1 changed file with 210 additions and 1 deletion.