[Bug]: setTableEntry extremely slow #4769

Open
Azhrei opened this issue Apr 30, 2024 · 12 comments · May be fixed by #4770

@Azhrei (Member) commented Apr 30, 2024

Describe the Bug

When adding 3200 elements to a new table using setTableEntry(), MT can consume significant CPU time and even more memory (upwards of 48 GB!).

To Reproduce

Execute the following macro and note the execution time. Then comment out or delete the setTableEntry() call and run it again.

[h: pattern = "Dex"]
[h: replaceWith = "DEX"]
[h: tableName = "Beasts"]
[h: stringIndex = "40"]

[h, COUNT(3200), CODE:
    {
        [h: stringList = table(tableName, roll.count)]
        [h: value = listGet(stringList, stringIndex, "~")]
        [h: value = replace(value, pattern, replaceWith)]
        [h: stringList = listReplace(stringList, stringIndex, value, "~")]
        [h: setTableEntry(tableName, roll.count, stringList)]            
    }
]

Expected Behaviour

The setTableEntry() function should overwrite an existing entry if it's there, or add a new one if it isn't. (The latter is new functionality that doesn't exist in the current version, so the wiki will need to be updated.) Unfortunately, I don't see a way to change the return code so that it can indicate whether an entry was added or replaced and still maintain backward compatibility.

Screenshots

No response

MapTool Info

All

Desktop

Any

Additional Context

See this Discord discussion for the initial report from Full Bleed.

@Azhrei added the bug, feature, claimed, documentation needed, macro changes, refactor, and performance labels on Apr 30, 2024
@Azhrei self-assigned this on Apr 30, 2024
@Azhrei (Member, Author) commented Apr 30, 2024

I think setTableEntry() should be allowed to create a new entry if it doesn't exist.

I can see a couple issues -- such as gaps in the table if one were to use setTableEntry to create entry 100 when the table only has two elements at position 0 and 1, for example. Should this be disallowed? Existing macros can't do this, so there shouldn't be any broken code as a result of this change.

For now, I'm going to focus on the performance issue while the above question is discussed.

@Azhrei linked pull request #4770 on Apr 30, 2024 that will close this issue
@kwvanderlinde (Collaborator)

I think setTableEntry() should be allowed to create a new entry if it doesn't exist.

I see a lot of things going against this. For one, we don't need it, since we can already use addTableEntry() as a fallback, e.g.:

[updated = setTableEntry(table, roll, result)]
[if (!updated): addTableEntry(table, roll, roll, result)]

Not hard to package that up as a UDF if doing this a lot.
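As a rough sketch of such a UDF (the macro name setOrAddTableEntry and the library token Lib:MyLib are placeholders, not anything that ships with MapTool), it could be registered once with:

[h: defineFunction("setOrAddTableEntry", "setOrAddTableEntry@Lib:MyLib")]

and the macro body would be along these lines:

[h: tableName = arg(0)]
[h: roll      = arg(1)]
[h: result    = arg(2)]
[h: updated   = setTableEntry(tableName, roll, result)]
[h, if (!updated): addTableEntry(tableName, roll, roll, result)]
[h: macro.return = updated]

Callers would then get back the same found/not-found indication that setTableEntry() reports today.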

As a follow-up to the first point, the proposed change would be a breaking change to setTableEntry(). Currently it won't do anything if it doesn't find a match, and it will report that in the return value. If we changed setTableEntry() as suggested, there would be no more failure cases.

Third, setTableEntry() would not support full table functionality when adding entries, because it does not understand roll ranges, just single numbers. So it would be a limited convenience, and I hope we never feel the need to add range support to setTableEntry() just to address that.

Finally, it would give the impression that tables are mappings from numbers to results, but that isn't true. Of course we can use tables that way, but it's not reality.

I can see a couple issues -- such as gaps in the table if one were to use setTableEntry to create entry 100 when the table only has two elements at position 0 and 1, for example. Should this be disallowed? Existing macros can't do this, so there shouldn't be any broken code as a result of this change.

Gaps are already possible with addTableEntry() and via the UI. I wouldn't advise putting gaps in tables, since rolling on such tables is a bit broken, and it isn't the most intuitive thing even when it works. But the problem is already there regardless of changes to setTableEntry().

For now, I'm going to focus on the performance issue while the above question is discussed.

Small tangent there: I took a look at LookupTable and the entire thing is just not designed for large tables. Even rolling has to do a linear scan to find an entry, so doing that repeatedly on large tables will be noticeably slow. If we could add some structure to LookupTable (e.g., sorted entries + binary search to find entries), that could improve table performance in more scenarios than just this one.
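To make that concrete, here is a minimal Java sketch of the lookup side of the idea; it is not the actual LookupTable code, just an illustration assuming entries are kept sorted by their minimum roll and their ranges don't overlap:

import java.util.List;

class SortedLookupSketch {
    // Illustrative only; each entry covers an inclusive [min, max] roll range.
    record Entry(int min, int max, String value) {}

    // Binary search over entries sorted by min; returns null when the roll
    // falls into a gap that no entry covers.
    static Entry find(List<Entry> sorted, int roll) {
        int lo = 0, hi = sorted.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            Entry e = sorted.get(mid);
            if (roll < e.min()) {
                hi = mid - 1;
            } else if (roll > e.max()) {
                lo = mid + 1;
            } else {
                return e;
            }
        }
        return null;
    }
}

Lookups drop from O(n) to O(log n), and the same sorted structure would make inserts and bulk loads easier to reason about.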

@Azhrei (Member, Author) commented May 1, 2024

All good points; thanks.

I had to review the wiki pages (I don't build tables very often) to learn about some of the details, like having gaps in the entries.

@FullBleed commented May 1, 2024

Thanks for opening the issue.

I'm not sure how much of the current behaviour can fairly be called a "bug" unless it's doing something truly unintended, as opposed to just not being designed with large tables in mind... I just know that something really bad is going on when addTableEntry and setTableEntry are used at a large scale (i.e. when hundreds or thousands of records are edited or added), and I suspect that the big-brain collective can improve the performance significantly.

Of additional note, the massive memory usage when doing a lot of edits to a large table also seems to permanently lock up a large amount of memory. That may, in fact, be a real bug.

I'm attaching a nearly 3200 record table of monsters based on Pathfinder 1e monsters for anyone who might need a large table to test with. Remove the .zip extension to use in MT.

PF 1e Beasts.mttable.zip

Edit: I haven't finished indexing the table... The (0) record has all of the listings, (1) alpha, (2) type, and I think (3) subtype.

@Azhrei (Member, Author) commented May 1, 2024

Some design ideas...

I like the idea of a sorted ArrayList to hold table entries and a binary search to find a specific entry when given a roll number. That should perform well, and it doesn't change the structure stored in the campaign file, so there shouldn't be any work needed when reading older campaigns (other than, potentially, a sort).

Adding entries in bulk, as the OP was attempting, is a common case during development, but rare during a game. If we add/set a single entry, it can be inserted where it belongs in the table and we avoid a sort, but performing that operation in a loop can become O(n) as all following entries must be "pushed down" by one. Then, there's the issue of every change being pushed out to clients if it does happen during a game. I see three options.

  1. Add a separate function, loadTable(), that is given a data set to be loaded into the table. That data is loaded and sorted as a single operation, with a single call to push the changes out to clients. This reduces the server/client traffic significantly, which is important given that the server must push it to individual clients (although, bulk table updates during a game are unlikely). Users like aliasmask have macros that convert CSV into tables, but this function could incorporate options for JSON Arrays, delimited files, and perhaps others, greatly simplifying data import.

  2. Add a special variable that essentially informs the implementation that a bulk change is being performed, something along the lines of table.loading = flag where flag is true/false (similar to macro.catch). This would turn off the sort and server/client push until the add/set loop is complete. If a loop failed partway through, the partial table would still be sorted and pushed to clients. This solution would be implemented as a special VariableResolver variable so that it is scoped to a particular macro; when the macro ends (whether successfully or not), the pending sort/push operations are performed automatically. (A usage sketch appears after this list.)

  3. Add a new roll option, something like tx, that can be included to indicate that all server/client push operations should be queued up and not sent to clients until the code block has completed. Once the code block completes, redundant entries in the queue would be removed. This has the advantage of solving problems in other areas where an update loop causes the Zone to change (I'm thinking of token updates, primarily) and thus be pushed out to clients immediately. This new roll option would be used thusly:

[h, COUNT(3200), TX, CODE:
    {
        [h: ...]
        [h: addTableEntry(tableName, roll.count, roll.count, stringList)]            
    }
]

Regarding option 3: if the tx roll option is coded to allow Java callback hooks, the table operations could register to be notified when the code block ends and could do a single sort at that time, allowing the add operation to just append an entry rather than inserting.
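For symmetry with the option 3 example above, here is a hypothetical sketch of how the option 2 special variable might be used (table.loading does not exist today, and the table name and entry text are placeholders):

[h: tableName = "Beasts"]
[h: table.loading = 1]
[h, COUNT(3200), CODE:
    {
        [h: addTableEntry(tableName, roll.count, roll.count, "placeholder entry")]
    }
]
[h: table.loading = 0]

Per the proposal, the final assignment could even be omitted, since the pending sort/push would happen automatically when the macro ends.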

Any other ideas?

@kwvanderlinde (Collaborator)

If we add/set a single entry, it can be inserted where it belongs in the table and we avoid a sort, but performing that operation in a loop can become O(n) as all following entries must be "pushed down" by one.

Technically true, but I don't think this will be noticeable. Even inserting into a large ArrayList is pretty quick despite how much work it feels like. It would only be a problem if repeatedly doing large bulk loads, but that doesn't seem like a practical case. Though your loadTable() idea could sidestep even that concern, depending on how the data is sourced.

Then, there's the issue of every change being pushed out to clients if it does happen during a game.

And right now this is even worse than it seems, since setTableEntry() pushes the entire campaign properties, not just the specific table or modifications of the table. So that would be another area of improvement.

As for the rest: I like the consideration of cutting down on network traffic, and I'm definitely in favour of an approach that is not specific to tables. Similar to your idea (3), I've had in mind that it should be possible to treat all macros as running in exactly that sort of "transaction". There would be no need for special syntax to opt into it; we could just apply the idea generally.

@cwisniew (Member) commented May 2, 2024

I think loadTable() is the way to go.
I am not a fan of the large-scale code changes that would be required to support conditional transactions for macros.
Nor do I think we could make macros transactional by default, as it's guaranteed to break people's macros.

@kwvanderlinde (Collaborator)

True transactions would certainly be too much. But merely queueing up server commands during macro execution should not be an issue, since there's no guarantee how quickly they get actioned anyway.

@cwisniew (Member) commented May 2, 2024

True transactions would certainly be too much. But merely queueing up server commands during macro execution should not be an issue, since there's no guarantee how quickly they get actioned anyway.

There are already issues with changes from multiple clients overwriting each other, without extending the window on long-running macros. Worse, macros that use input (including the input functionality built into the parser) or REST calls are also problematic, as they could run for a very long time. Without a specific use case that can't be solved with a bulk operation like loadTable(), it definitely falls under "nice to have but more pain than it's worth".

Also, some commands rely on coming back from the server to the local client to behave properly, which could be a mess to untangle.

@Azhrei (Member, Author) commented May 2, 2024

I'm fine with the loadTable() idea — it's certainly the quick-fix approach.

Perhaps the TX idea gets put into the bin for MTscript 2.0? I could see allowing it to be extended to multiple different categories, something along the lines of TX(updates) where updates identifies that it's the campaign updates that are paused until the code block completes. It would allow for new extensions without breaking backward compatibility with existing code blocks.

It might also allow for some built-in parallelism if the parser can determine that the contents of the code block wouldn't interfere with itself if run on separate threads.

@Azhrei (Member, Author) commented May 8, 2024

Just an update for @FullBleed ... There were no existing tests for LookupTable, LookupTable.LookupEntry, or LookupFunctionsTest, so I'm starting from scratch. I'm mostly done with the first, but few of the methods are documented, so I've been stuck learning first what they're for and then adding that documentation — it just takes time.

@Azhrei (Member, Author) commented May 17, 2024

Another update... I'm done with the tests, and I have the loadTable() function implemented using a JSON array of arrays. I'm going to add support for an array of objects and see which is faster. If there's a huge difference, then I will likely only keep the faster one.
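For anyone wanting to experiment once the PR lands, usage might look roughly like this. It is only an illustration based on the description above: the argument order, the shape of each inner array (assumed here to be a roll number followed by the entry text), and the sample rows are all assumptions, not the final API.

[h: tableName = "Beasts"]
[h: rows = '[[1, "Aasimar"], [2, "Aboleth"], [3, "Adherer"]]']
[h: loadTable(tableName, rows)]

The point is that the entire data set is handed over in a single call, so the table can be sorted once and pushed to clients once.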
