Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize UpdateTimers #984

Merged
merged 1 commit into from Feb 6, 2016
Merged

Conversation

LegendOfDragoon
Copy link
Contributor

No description provided.

@@ -161,13 +161,27 @@ void CSystemTimer::UpdateTimers()
int TimeTaken = m_LastUpdate - m_NextTimer;
if (TimeTaken != 0)
{
uint32_t random, wired;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably should be int32_t

project64 added a commit that referenced this pull request Feb 6, 2016
@project64 project64 merged commit 1c7a422 into project64:master Feb 6, 2016
@cxd4
Copy link
Contributor

cxd4 commented Feb 6, 2016

Since the subject came up, I briefly wrote something like this last night:

void CSystemTimer::UpdateTimers()
{
    int32_t difference;
    const int TimeTaken = m_LastUpdate - m_NextTimer;

    if (TimeTaken == 0)
    {
        return;
    }
    m_LastUpdate = m_NextTimer;
    g_Reg->COUNT_REGISTER += TimeTaken;
    g_Reg->RANDOM_REGISTER -= TimeTaken / g_System->CountPerOp();

    if ((int32_t)g_Reg->RANDOM_REGISTER < (int32_t)g_Reg->WIRED_REGISTER)
    {
        difference = 32 - (g_Reg -> WIRED_REGISTER);
        g_Reg->RANDOM_REGISTER = g_Reg->WIRED_REGISTER + (g_Reg->RANDOM_REGISTER % difference);
    }
}

Something like that is what I originally came up with for abolishing the unnecessary while loop. However after having since written that, I've conjured some sets of inputs where (random) is a significantly smaller number compared to (wired), that my solution is a breaking change. I know I could guarantee safely removing the loop by using a floating-point ceiling operation and using that as the coefficient for replacing the loop, though.

Either way, in the interest of not investing beyond an hour into something I have no way of testing without running Windows here anyway (Linux builds don't run), I'll leave this research to you and wish you luck.

@LegendOfDragoon
Copy link
Contributor Author

I'll try and check more thoroughly from now on, before submitting PRs. The reason I used uint32_t increment = 32 - wired; was to try and get the compiler to do that equation outside of the loop.

Actually now I'm really confused. I must have made a mistake somewhere with my testing, because now I do see a difference.
This is what I get with random += increment;
00474ECB push 20h
00474ECD pop eax
00474ECE sub eax,ecx
00474ED0 add edx,eax
00474ED2 cmp edx,ecx
00474ED4 jl CSystemTimer::UpdateTimers+50h (0474ED0h)

This is what I get with random += 32 - wired;
00474ECB push 20h
00474ECD pop eax
00474ECE sub eax,ecx
00474ED0 add edx,eax
00474ED2 cmp edx,ecx
00474ED4 jl CSystemTimer::UpdateTimers+4Bh (0474ECBh)

Maybe compiler settings may have something to do with this problem. I know that zilmar seems to prefer minimize size. It doesn't seem to do well for this function. Or perhaps maybe I shouldn't use MSVC 2013. I'll have to ask someone to try MSVC 2015, since I do not want to install it yet.

I guess I'll try and come up with an algorithm to replace the loop. It really was just a temporary thing until I test more games where wired_register != 0 and come up with the best algorithm I can implement. I think wired = 0 in many games. I'll try doing more checking.

I didn't think to use int32_t since the original code just used int for that part. Good idea though.

If I'm not mistaken, g_Reg is a pointer, so that may have something to do with the optimizations being hindered when not using local variables.

@cxd4
Copy link
Contributor

cxd4 commented Feb 6, 2016

It's not that I thought of that aspect of the PR premature insofar as you checking more thoroughly if it was actually warranted first, so much as just in general compilers are flexible past that and there are stronger ways to optimize it while preserving code brevity and cleanliness.

I didn't think to use int32_t since the original code just used int for that part. Good idea though.

Depends. I don't know what zilmar knows about RANDOM and WIRED, so I don't know whether the (int32_t) treatment is really correct or not. I know the R4300i manual states that only the low 6 bits of both registers should be non-zero...if that is the case then the usage of int32_t is not necessitated and should really just be (int) like you had originally.

On the other hand, if the doc is wrong and PJ64's method is right, then it must be int32_t because the comparison is signed 32 bits where the MSB changes everything, even if the low 6 bits between both registers say otherwise in the inequality check. So in the absence of certainty I chose int32_t over int.

Another thing you have to consider is that the original code, even before your optimizations or mine, seems a bit sketchy. We have this:

        while ((int)g_Reg->RANDOM_REGISTER < (int)g_Reg->WIRED_REGISTER)
        {
            g_Reg->RANDOM_REGISTER += 32 - g_Reg->WIRED_REGISTER;
        }

How do you know WIRED < 32?

If it's == 32, this is a permanent loop and freezes the process indefinitely.

If it's > 32, random is really subtracting, not adding, until it's greater than wired, which is expensive, easily time-consuming, and probably a bug.

So it should be considered that we may even be talking about how to optimize a function that isn't even written for accuracy yet. Maybe zilmar needs to run hardware tests to see how sensitive the upper 26 bits of these regs honestly are.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants