Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectors are incredibly slow #321

Open
Pirulax opened this issue Aug 10, 2018 · 18 comments
Open

Vectors are incredibly slow #321

Pirulax opened this issue Aug 10, 2018 · 18 comments
Labels
bug Something isn't working
Milestone

Comments

@Pirulax
Copy link
Contributor

Pirulax commented Aug 10, 2018

Describe the bug
So, using Vectors is a really good way to make your script more maintainable, but...
They are incredibly slow.
Here are some results:

------------[Vector2]------------
Create: 12348 ms.
sub: 16146 ms.
mul: 12616 ms.
eq: 456 ms.
pow: 12112 ms.
div: 13508 ms.
add: 11024 ms.
------------[Vector3]------------
Create: 13820 ms.
sub: 12819 ms.
mul: 11932 ms.
eq: 635 ms.
pow: 11516 ms.
div: 11353 ms.
add: 12621 ms.
------------[Vector4]------------
Create: 14891 ms.
sub: 11640 ms.
mul: 12306 ms.
eq: 385 ms.
pow: 13031 ms.
div: 11626 ms.
add: 10854 ms.
------------------------
Sub: 95
Function calls: 283 ms.

Sub: 95 (Time it takes to do a simple i = i-i expression)
Function calls: 283 ms. (This is the time it takes to call a function.)

Expected behavior
I'd expect Vectors to be far more faster.
The main reason they are slow is they use lua_classmetamethod, you may ask why:
Because I made a little Vector class in Lua and ran a performance test, got nearly the same results
the difference was 300ms/1m runs(with integers, with floats it was a little bit less)
Looks like comparation is pretty fast, maybe modification of the value is the slow part.

Additional context
Here's the test resource:
vectest.zip

(It's redundant a little and such, but really I just made it to ran these tests.)

@CrosRoad95
Copy link
Contributor

I agree! vectors are 10 times slower then normal list of coords

@Pirulax
Copy link
Contributor Author

Pirulax commented Aug 10, 2018

On the other side, internal Lua functions are pretty fast, so, maybe we need a custom version of lua, which has vectors by default, or something, because this way only MTA's source code benefits from Vectors.
They are literally 100x times slower(not considering the time it takes to create them.)

@patrikjuvonen patrikjuvonen added the enhancement New feature or request label Aug 10, 2018
@patrikjuvonen patrikjuvonen changed the title Vectors are incredibly slow. Vectors are incredibly slow Aug 10, 2018
@Pirulax
Copy link
Contributor Author

Pirulax commented Aug 12, 2018

Actually, OOP in MTA is 10x slower than PP..
Ridiculous

@qaisjp
Copy link
Contributor

qaisjp commented Sep 5, 2018

There is also a memory "problem" caused by infrequent garbage collection. See mantis 9840.

It would be useful to document this online.

Suggestions:

  • Implement vectors as Lua tables, maybe??? Not sure if this would help at all, but as a side effect it fixes Vectors and matrices cannot be passed via events #374
  • Modify vectors to do more in-place operations. If necessary, make <oop>true</oop> now take a version <oop>2</oop> for v2

@gatno
Copy link

gatno commented Oct 12, 2018

There is also a memory "problem" caused by infrequent garbage collection. See mantis 9840.

Wow that's heavy. This could be a hugh part of our performance issues. Our gamemode ist full written in oop and we used Vectors neary everywhere.

@qaisjp
Copy link
Contributor

qaisjp commented Oct 12, 2018

I don't think it causes any performance issues, it would only affect memory usage.

@ccw808
Copy link
Member

ccw808 commented Oct 12, 2018

Memory issue was 'fixed' in 6f3fe6d (but Mantis was not updated)

@Pirulax
Copy link
Contributor Author

Pirulax commented Nov 10, 2018

What do you mean under "in-place" operations?
Actually, garbage collection is pretty slow, if you create 200 vectors every frame, and you measure the time it takes to create them, you'll notice it's between 0 and 1(depends on CPU), but sometimes it jumps up to 28ms, that's I suppose the gc kicking in, even do maybe it's just the frame limiter.

@botder botder added this to the Backlog milestone Mar 3, 2019
@Pirulax
Copy link
Contributor Author

Pirulax commented Apr 30, 2019

But even creating a Vector takes a lot of time for some reason.

@Pirulax
Copy link
Contributor Author

Pirulax commented May 1, 2019

Tried in-place add, it nearly takes the same time to complete, so that didnt really help.
Implementing vectors as Lua tables wouldn't help too much either, I've tried that and it gave me the same results.
With that in mind in really have no clue why it's so slow.
Reading from the stack can't be that slow, since __eq is fast.
Pushing to the stack can't be that slow since then every function that is vector based should be slow(because they push vectors onto the stack).

operator+ can't be so slow since adding values up with simple dereferencing(so in-place) would be way faster.
Maybe what causes the issue are metatables, and the way they work?
But then why is __eq waay faster than the other methods? Maybe it's some kind of Lua optimization?

Edit:

Actually....
download

@qaisjp
Copy link
Contributor

qaisjp commented May 1, 2019

Summarisation of discussion in #mta.dev. Proposed next steps are:

  1. context switches wastes time. lets move the existing vector API to a Lua implementation (retaining the exact same API)
  2. explore in-place operations as a separate pull request
    • do a "draft" implementation for sake of profiling
    • measure the difference result = result + x is vs. result.add(x) (but for complex calculations, not something this simple. write a bitcoin miner in lua or something.)
    • ensure that the in-place operations does not affect existing scripts.
    • document in-place operations somewhere
    • make it clear that this feature does not break or improve existing scripts. scripts will need to be modified to take advantage of this feature. also make it clear that infix operators are generally better for readability, and in-place ops are for squeezing out that last bit of performance when doing complex calculations

@Pirulax
Copy link
Contributor Author

Pirulax commented May 1, 2019

Even if we implement vectors in Lua we won't be able to pass them trough events as the metatable will get lost in the process, so we don't fix #374

@qaisjp
Copy link
Contributor

qaisjp commented May 1, 2019

Even if we implement vectors in Lua we won't be able to pass them trough events as the metatable will get lost in the process, so we don't fix #374

The coordinates will still be stored in the dictionary section vec.x,vec.y,vec.z (or array section, vec[1],vec[2],vec[3], if we wanted .x to go via __index), so half the problem will be fixed.

Further work will need to be done to magically re-apply the metatable on the "other end", Potential solution for magically reapplying the metatable:

  • set some __mta_type field if that doesn't already exist
  • on the "other end", check if metatable is already set. if not, check if table passed around has __mta_type set to Vector3. if so, set the table's metatable to Vector3

This same magically reapplying the metatable thing will need to be done when passing vectors between resources. Probably needs to be implemented as part of CLuaArgument(s).

(Could actually be a generic setting like __mta_metatable that will set the table's metatable to the whatever table living at _G[__mta_metatable]. This will benefit general OOP implementations. But I like the idea of the typename being accessible as well.)

@Einheit-101
Copy link

Does the "bad" OOP performance only apply to OOP functions or will the whole script performance suffer when i simply enable OOP in the meta while using only 1 OOP function in a big script? Because i need OOP to get the camera forward matrix and i dont want to break CPU performance even more.

@sbx320
Copy link
Member

sbx320 commented Apr 26, 2020

@Einheit-101 the performance issue is really only with Vectors and Matrices.

Ultimately the issue is mostly due to garbage collection. If you do something simple like
(a + b + c)*(a + b + c) with the Vectors a, b and c, MTA will need to:

  • create vectors a, b, c
  • create intermediate vector for a + b (left side)
  • create intermediate vector for a + b + c (left side)
  • garbage collect a + b vector (left side)
  • create intermediate vector for a + b (right side)
  • create intermediate vector for a + b + c (right side)
  • garbage collect a + b vector (right side)
  • create intermediate vector for the result
  • garbage collect a + b + c vector (left side)
  • garbage collect a + b + c vector (right side)
  • return result vector

If you did the same thing with three floats, each you'd not get a single object construction or destruction.

@Pirulax
Copy link
Contributor Author

Pirulax commented Oct 14, 2020

Yeah, so I've looked around, and found this.
Now, I've looked at the code, and it seems like to be mostly compatible with the current Lua bytecode.. mostly.. The only thing being incompatible is the typeids have increased, so now it looks like this:

// ltm.c:23

const char *const luaT_typenames[] = {
  "nil", "boolean", "userdata", "number",
  "vec",  /* LUA-VEC */
  "string", "table", "function", "userdata", "thread",
  "proto", "upval"
};

(note the `/LUA-VEC/ comment)

// lua.h:69

#define LUA_TNONE		(-1)

#define LUA_TNIL		0
#define LUA_TBOOLEAN		1
#define LUA_TLIGHTUSERDATA	2
#define LUA_TNUMBER		3
#define LUA_TSTRING		4
#define LUA_TTABLE		5
#define LUA_TFUNCTION		6
#define LUA_TUSERDATA		7
#define LUA_TTHREAD		8
#define LUA_TVEC		9  /* LUA-VEC */
// lobject.h:19

/* tags for values visible from Lua */
#define LAST_TAG	LUA_TVEC  /* LUA_TTHREAD   -- LUA-VEC */

#define NUM_TAGS	(LAST_TAG+1)


/*
** Extra tags for non-values
*/
#define LUA_TPROTO	(LAST_TAG+1)
#define LUA_TUPVAL	(LAST_TAG+2)
#define LUA_TDEADKEY	(LAST_TAG+3)

@qaisjp
Copy link
Contributor

qaisjp commented Oct 14, 2020

  • If it's possible to make this backwards compatible with previously compiled scripts, great!
  • If it's not possible to make it backwards compatible, I don't see a problem with forcing people to recompile their code
  • I'm not against this either, as this wouldn't change any syntax afaik (Lua or the existing MTA API, I assume?)

But I am not excited about modifying our copy of Lua further. This isn't a no -- I'd just be a lot more excited if we had a proper Lua branch with our custom changes somewhere.

@Pirulax
Copy link
Contributor Author

Pirulax commented Oct 15, 2020

so, I have successfully implemented Lua-Vec into MTA.
Nothing blew up, nothing has crashed(other than me creating 160 million userdatas and running out of memory).
Here are some benchmark numbers:

  • 10 000 000 iterations
  • GC:OFF means the GC was stopped with collectgarbage "stop"
  • GC:ON means the garbage collector was running eg collectgarbage "restart"
  • mem usage delta is the delta between memusg(total) before running the functions, and at the end
  • in the userdata cases (V2-3-4) feel free to multiply the memory value by 2
  • Test resource: vectest.zip
============[native-vector - GC: OFF]============
memusg(total): 305.59 MiB

=> create: 3190 ms
=> sub: 1941 ms
=> mul: 1964 ms
=> eg: 703 ms
=> add: 1865 ms

memory usage delta: 1.192 GiB
============[native-vector - GC: ON]============
memusg(total): 1.490 GiB

=> create: 2819 ms
=> sub: 1587 ms
=> mul: 1582 ms
=> eg: 618 ms
=> add: 1575 ms

memory usage delta: 1.192 GiB
==================================================

============[Vector2 - GC: OFF]============
memusg(total): 1.490 GiB

=> create: 12694 ms
=> sub: 12399 ms
=> mul: 12751 ms
=> eg: 2787 ms
=> add: 12590 ms

memory usage delta: 2.906 GiB
============[Vector2 - GC: ON]============
memusg(total): 3.279 GiB

=> create: 50838 ms
=> sub: 27321 ms
=> mul: 30752 ms
=> eg: 7328 ms
=> add: 27570 ms

memory usage delta: 457.76 MiB
==================================================

============[Vector3 - GC: OFF]============
memusg(total): 1.938 GiB

=> create: 25011 ms
=> sub: 27654 ms
=> mul: 12385 ms
=> eg: 2625 ms
=> add: 13643 ms

memory usage delta: 2.906 GiB
============[Vector3 - GC: ON]============
memusg(total): 3.279 GiB

=> create: 87044 ms
=> sub: 28738 ms
=> mul: 27916 ms
=> eg: 7129 ms
=> add: 23017 ms

memory usage delta: -489 B
==================================================

============[Vector4 - GC: OFF]============
memusg(total): 1.938 GiB

=> create: 23437 ms
=> sub: 30718 ms
=> mul: 12657 ms
=> eg: 2584 ms
=> add: 13442 ms

memory usage delta: 2.906 GiB
============[Vector4 - GC: ON]============
memusg(total): 3.279 GiB

=> create: 84181 ms
=> sub: 25660 ms
=> mul: 24275 ms
=> eg: 6246 ms
=> add: 19446 ms

memory usage delta: -481 B
==================================================

@sbx320 sbx320 added bug Something isn't working and removed enhancement New feature or request labels Mar 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

9 participants