Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix shuffling thunk for Unix AMD64 #16904

Merged
merged 1 commit into from Mar 13, 2018
Merged

Conversation

@janvorli
Copy link
Member

@janvorli janvorli commented Mar 13, 2018

The shufflign thunk was generated incorrectly for some edge cases when
a struct was passed in a register or a pair of registers in the
destination, but on stack in the source.
That resulted in corruption of delegate argument in those cases.

This change implements a new algorithm that ensures that argument slots
are never overwritten before their current value is moved out.

It also adds an extensive regression test that checks various
interesting combinations of arguments that were causing issues before.

Close #16833

@janvorli janvorli added this to the 2.1.0 milestone Mar 13, 2018
@janvorli janvorli self-assigned this Mar 13, 2018
@janvorli janvorli requested a review from jkotas Mar 13, 2018
@janvorli
Copy link
Member Author

@janvorli janvorli commented Mar 13, 2018

@janvorli janvorli force-pushed the fix-shuffling-thunk branch from d002620 to 0ec5e3a Mar 13, 2018
@jkotas
Copy link
Member

@jkotas jkotas commented Mar 13, 2018

I am having hard time convincing myself that this works for all corner cases. It probably does, but it is hard to see. Also, iterating the signature twice is not particularly efficient.

Would it be better and simpler to build the pShuffleEntryArray in regular way, and then apply topological sort on it once we are done?

It does not have to be particularly smart because of the array is going to be mostly sorted already. E.g. I think it can look like this:

    //
    // Topological sort of shuffle entries to ensure that the slots are not overwritten before they are moved.
    //
    int argSlots = NUM_FLOAT_ARGUMENT_REGISTERS + NUM_ARGUMENT_REGISTERS + sArgPlacerSrc.SizeOfArgStack() / sizeof(size_t);
    NewArrayHolder<bool> filledSlots = new bool[argSlots];
    bool reordered;
    do
    {
        reordered = false;

        for (int i = 0; i < argSlots; i++)
        {
            filledSlots[i] = false;
        }

        for (int i = 0; i < pShuffleEntryArray->GetCount(); i++)
        {
            entry = (*pShuffleEntryArray)[i];

            // If the slot that we are moving the argument to was filled in already, we need to move this entry in front 
            // of the entry that filled it in.
            if (filledSlots[GetNormalizedArgumentSlotIndex(entry.srcofs)])
            {
                for (int j = i; (*pShuffleEntryArray)[j].dstofs != entry.srcofs; j--)
                    (*pShuffleEntryArray)[j] = (*pShuffleEntryArray)[j - 1];
                (*pShuffleEntryArray)[j] = entry;
                reordered = true;
            }

            filledSlots[GetNormalizedArgumentSlotIndex(entry.dstofs)] = true;
        }
    }
    while (reordered);

@janvorli
Copy link
Member Author

@janvorli janvorli commented Mar 13, 2018

The algorithm I've implemented is very simple in what it does. Let me try to explain in a few words. When it wants to move a register or stack slot, it checks whether the target register or stack slot was already moved away. If yes, it just performs the move, since the slot is guaranteed to be free. If not, it postpones the move until the target slot is freed. That's all. I don't see any corner cases that this could handle in a wrong way.

The only part that may look more complex is the fact that this can naturally create a chain of postponed moves, hence the loop that adds the postponed moves once it finds that the move it has added enables a postponed move to happen.

Regarding the double iteration over the arguments, that's a bit unfortunate, but do you think that this function is a performance bottleneck?

@jkotas
Copy link
Member

@jkotas jkotas commented Mar 13, 2018

I understand that it is simple in principle, but it is not easy to read and understand because of it is interleaved with the other code and ifdefs, the special cases for return buffer, ... . I think it would be a lot easier to understand if all the code related to the topological sort is together and fits onto a single screen.

I do not think that this function is performance bottleneck. I just mentioned it as a side-benefit.

The shufflign thunk was generated incorrectly for some edge cases when
a struct was passed in a register or a pair of registers in the
destination, but on stack in the source.
This change implements a new algorithm that ensures that argument slots
are never overwritten before their current value is moved out.

It also adds an extensive regression test that checks various
interesting combinations of arguments that were causing issues before.
@janvorli janvorli force-pushed the fix-shuffling-thunk branch from 0ec5e3a to bf33bec Mar 13, 2018
@janvorli
Copy link
Member Author

@janvorli janvorli commented Mar 13, 2018

@jkotas ok, I've changed it to the way you've suggested and amended the commit. While I could cleanup the stuff I had to the point where all the interleaving of ifdefs is gone and all relevant pieces of code are single screen size, the fact that my way needs to iterate over the arguments even in cases when it is basically useless, which will happen in most of the cases made me change my mind.

@@ -352,6 +378,10 @@ VOID GenerateShuffleArray(MethodDesc* pInvoke, MethodDesc *pTargetMeth, SArray<S
ArgLocDesc sArgSrc;
ArgLocDesc sArgDst;

#if defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING)
int argSlots = NUM_FLOAT_ARGUMENT_REGISTERS + NUM_ARGUMENT_REGISTERS + sArgPlacerSrc.SizeOfArgStack() / sizeof(size_t);
Copy link
Member

@jkotas jkotas Mar 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can this be inside the big ifdef below?

Copy link
Member Author

@janvorli janvorli Mar 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it cannot. The sArgPlacerSrc.SizeOfArgStack() cannot be called after the iteration has finished. It asserts due to that if I try to call it at that place, that's why I had to move it here.

Copy link
Member

@jkotas jkotas Mar 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

jkotas
jkotas approved these changes Mar 13, 2018
Copy link
Member

@jkotas jkotas left a comment

LGTM otherwise

@janvorli janvorli merged commit 41c0295 into dotnet:master Mar 13, 2018
15 of 18 checks passed
@janvorli janvorli deleted the fix-shuffling-thunk branch Mar 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants