Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up element data #912

Closed
wants to merge 34 commits into from
Closed

Conversation

Pirulax
Copy link
Contributor

@Pirulax Pirulax commented May 4, 2019

This PR improves element data perfomrnace by 22%(at least for now server-side), I'll implement it client-side as well tomorrow, hopefully.

Renamed the branch so i need a new PR, the old one was #907

Benchmarks can be found @ #907

Test resource

@patrikjuvonen patrikjuvonen added the enhancement New feature or request label May 4, 2019
assert(szName);

std::map<std::string, SCustomData>::iterator it = m_Data.find(szName);
CFastHashMap<SString, SCustomData>::iterator it = m_Data.find(szName);
if (it != m_Data.end())
return &it->second;

return NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just replace the content of Get with a call with MapFind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could but that'd be just a pointless function call, and we can to squeeze out as much performance as we can, and I think this isn't premature optimization(yet)

Copy link
Contributor

@qaisjp qaisjp May 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there will be a real difference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it converting from SString to const char* and then back?
Or it's optimized by the compiler already?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

99% chance optimised away. If in doubt, measure the difference

SCustomData* pData = m_pCustomData->Get(szName);
if (pData)
CLuaArgument OldVariable;
if (m_pCustomData->Set(szName, Variable, bSynchronized, &OldVariable))
Copy link
Contributor

@qaisjp qaisjp May 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work correctly if there is no previous element data?

Copy link
Contributor Author

@Pirulax Pirulax May 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, since the original one worked like this as well, it only set a value to CLuaArgument if there was something, otherwise an empty CLuaArgument is == nil

Copy link
Contributor

@qaisjp qaisjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR

@Lpsd Lpsd mentioned this pull request Jul 9, 2019
@Pirulax
Copy link
Contributor Author

Pirulax commented Aug 12, 2019

I just feel like m_SyncedData is so premature optimization...
Because think about it.. Most of the time elementdata is synced, so you have 2x the memory usage, and you dont gain that much(if at all), but you lose when updating the element data, because you need to update it in the syncedData map as well. Maybe an std::referENCe_wrapper will help, but even then you have another SString as the key..
Maybe we can just remove the m_SyncedData entirely?

*Edit: yea, its referENCe wrapper, not refernece

@qaisjp
Copy link
Contributor

qaisjp commented Aug 12, 2019

Maybe we can just remove the m_SyncedData entirely?

Yeah a few months ago (when we were discussing this in #mta.dev) I thought this looked unnecessary too. I think I looked into removing it, and then realised it was actually necessary. I think? I don't really remember.

@prnxdev
Copy link

prnxdev commented Aug 12, 2019

@qaisjp what is #mta.dev?

@patrikjuvonen
Copy link
Contributor

@qaisjp what is #mta.dev?

#mta.dev is MTA's old IRC channel for development discussions, nowadays it is bridged with our #development Discord channel.

@qaisjp
Copy link
Contributor

qaisjp commented May 15, 2020

Whatever you did broke the PR, so I've undone your changes.

@Pirulax
Copy link
Contributor Author

Pirulax commented May 16, 2020

Whatever you did broke the PR, so I've undone your changes.

Nah, its okay, i purposefully git reset --head HARD-ed it, because the old one didnt suit this anyways, and with that knowledge i improved it... "Improved it".. More like i had some fun, because tbh, the differences are quite.. small, to say the least. But try it yourself, I dont think its really worth it merging this PR.

Try the test resource from #907.
Dont look at the old results, they're now invalid, every set function doubled, probably due to that little branching(argStream.NextIsBoolean()) that was added. But it doubling the call time seems very bad to me.. I even tried the 1.5.7 nightly x64 server, it was a little slower(~10%), but was still twice slower than the older version with just reading a bool value. Is it possible thats causing such a huge difference?

@qaisjp
Copy link
Contributor

qaisjp commented May 16, 2020

I haven't looked at your recent code yet. You can try converting the function to the new lua arg parser, as that apparently speeds stuff up.

@Pirulax
Copy link
Contributor Author

Pirulax commented May 16, 2020

Oh, btw dont belive these failing..

Actually, do, because the client really is, because im a lazy fuck, but the server works fine, just pull the branch, and test it out, I'd be glad, so we can see if there really isnt much of a difference, other than way smaller memory consumption(i made a m_broadcasted, and a m_localOrSubbed maps, so in theory now it should use like 3/4 less memory, because most of the data that is set is synced, and if its synced used to be in both maps, with 2 SCustomData's)

@Pirulax
Copy link
Contributor Author

Pirulax commented May 16, 2020

I haven't looked at your recent code yet. You can try converting the function to the new lua arg parser, as that apparently speeds stuff up.

Uh oh, where can I find that magic?

@qaisjp
Copy link
Contributor

qaisjp commented May 16, 2020

Take a look at CLuaBcryptDefs - you'll need to use ArgumentParserWarn<MethodName, false>.

I think you can use std::optional<std::variant<bool, ESyncType>> to do this:

if (argStream.NextIsBool())
{
bool bSynchronize;
argStream.ReadBool(bSynchronize, true);
syncType = bSynchronize ? ESyncType::BROADCAST : ESyncType::LOCAL;
}
else
argStream.ReadEnumString(syncType, ESyncType::BROADCAST);

@Pirulax
Copy link
Contributor Author

Pirulax commented May 18, 2020

Alright. As it turned out, calling the onElementDataChange even takes the most time.
Calling the CCustomData::Set function barely takes time(6.8 us*), also all other function calls are pretty negligible, except this one(the whole process of calling onElementDataChange), this is why the getData is so much faster: it doesnt call events.

so, time list(on x64 release server):
CStaticFuncDefs::SetElementData() 106 580 ns -> 106 us -> 0.1 ms, and 0.08 ms from this comes form calling the event, with only the default resources running. So it may take more, depending on the amount of resources, elements, due to the element tree size, as of my understanding.

This PR will help the client way more than the server, because the client is way busier than the server.

*10 us on my CPU, take these measurements with a bag of salt, since im pretty sure they arent the most accurate due to the fact CTimeMarkerQHD is written: it needs to class CLOCK::now(), which translates to steady_clock::now(), which is already a small overhead + pushing the data back in the vector along with a stack allocated cstr, and everything, which makes every marker set after the first one inaccurate.

@Pirulax
Copy link
Contributor Author

Pirulax commented May 20, 2020

Take a look at CLuaBcryptDefs - you'll need to use ArgumentParserWarn<MethodName, false>.

I think you can use std::optional<std::variant<bool, ESyncType>> to do this:

if (argStream.NextIsBool())
{
bool bSynchronize;
argStream.ReadBool(bSynchronize, true);
syncType = bSynchronize ? ESyncType::BROADCAST : ESyncType::LOCAL;
}
else
argStream.ReadEnumString(syncType, ESyncType::BROADCAST);

Made a PR regarding that refactor, so I dont clutter this PR with non-necessary stuff anymore, cuz i did a few things not related to this PR at all.

@Pirulax
Copy link
Contributor Author

Pirulax commented Jun 18, 2020

Alright, this PR adds no value at all(well, on client-side it does, but I haven't done that yet. I'll leave the server as it is)

@Pirulax Pirulax closed this Jun 18, 2020
@Dutchman101
Copy link
Member

I don't agree, i feel it does add value and it would be a shame to let this effort go to waste.

In the case of elementdata, as long we're using it, there aren't many ways to optimize the implementation. I feel like you squeezed the most out of it with these changes, and even if the benefit is just a 20% performance gain, we should take it. It will already make a significant difference in today's climate of scripts (mis)using elementdata massively..

@Pirulax
Copy link
Contributor Author

Pirulax commented Jun 18, 2020

The problem with this PR is that is incorporates a lot of changes, that aren't related to it at all(see refactors to other functions I think in CStaticFDs). I'll make an other PR that'll actually be organized, and only contain changes related to it, because rn, reviewing it(let alone merging) is a PITA, IMHO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants