loading tmx files is unnecessarily slow for big worlds #2156

oncer · 2019-07-12T21:03:03Z

In the project I'm working on right now we have a world file with ~200 TMX maps loaded. When creating a new MapScene, i.e. by clicking on a map that is not currently opened in a tab, we experienced loading times of several seconds.
Any time a map is opened in a new tab, MapScene::refreshScene() is called, which calls DocumentManager::loadDocument for each map in the world.

DocumentManager::loadDocument in turn loops through the instances in Document::sDocumentInstances to look up an already loaded map. As I understand it, sDocumentInstances is a lookup index for tmx files that have been loaded previously, in this case at least all the maps in the current world. Since we are looping through all previously loaded maps inside a loop of all maps of the world, this turns into a classic O(n²) scenario.

The loading time was reduced greatly when we replaced QList with QHash for Document::sDocumentInstances. See this commit for reference.

This change alone made a big difference for us, decreasing the loading time of several seconds to less then a second. Maybe it's also possible to re-use the existing world structure when a map file from the same world is loaded, eliminating the need for the loop inside refreshScene in many cases.

bjorn · 2019-07-16T14:21:45Z

That's a great performance analysis and speed improvement!

I think normally the linear search would not have been much of a problem even with 200 maps, but what made it slow is that for each map the path was converted to the canonical file path, which takes time mostly because it requires file system access. Btw, there is still a linear search done at the start of DocumentManager::loadDocument over the open files, which I think can just be removed.

Since we should never have the same file loaded multiple times, turning the list into a map should be totally fine. The only worry I have is what happens if you overwrite a loaded file with another open file, since this is a sure way to produce two Document instances with the same canonical file path. Of course, that's a rather rare case and in the end it may not really matter because the map is only used to find existing loaded documents for a given file name, and in this case we probably want to use the last one that was given this file name, which is exactly the one that will be in the map. That said, when the destructor is called it should only remove the entry when it points to the same instance.

Would you like to open a pull request with that change?

This change alone made a big difference for us, decreasing the loading time of several seconds to less then a second. Maybe it's also possible to re-use the existing world structure when a map file from the same world is loaded, eliminating the need for the loop inside refreshScene in many cases.

I think it's not possible to avoid the loop in refreshScene. Sadly we can't share the MapItem instances between scenes, so all that we're sharing are the loaded maps and tilesets. The loop is creating the MapItem instances, reusing previous ones to speed up future refreshes (like when the world file is edited).

We could in theory share MapItem instances when we would share the MapScene between views on the same world. That could become a rather tricky patch, but it would be pretty neat. :-)

oncer · 2019-07-16T15:24:51Z

I think normally the linear search would not have been much of a problem even with 200 maps, but what made it slow is that for each map the path was converted to the canonical file path, which takes time mostly because it requires file system access.

Do you think it's possible to come up with a different system that does not rely on canonical file path? In any case, my change is in pull request #2159

bjorn · 2019-07-16T18:41:34Z

Do you think it's possible to come up with a different system that does not rely on canonical file path?

Well, in most cases we could rely on a clean absolute path (which is much faster to construct because it does not require file system access), but when symlinks are present this can lead to the same file being loaded multiple times (with or without using the symlink). However, we could cache the canonical file path in the document, so that we would not need to reconstruct it every time we want to compare by canonical path.

bjorn · 2019-07-17T09:40:55Z

@oncer Btw, I see you also did a change to automate the running of a Python script upon save (3b25da3). In the latest development snapshot, JavaScript extensions can trigger custom commands. They can also connect to the tiled.assetSaved event. Would you be interested in trying to hook up your Python script that way?

I'd also be interested in looking at enabling a JavaScript extension to set the type and name for tile objects like you're doing in commit a5c0fe2. I think the main thing missing here is a signal when an object gets added.

oncer · 2019-07-19T18:57:34Z

It would be very nice to do both of these via an extension, it's probably less work than maintaining and building a custom branch! So yes, if there is a way to hook up a Python script and rename objects as they get added, I would definitely look into it. Thorbjørn Lindeijer <notifications@github.com> schrieb am Mi., 17. Juli 2019, 11:41:

…

@oncer <https://github.com/oncer> Btw, I see you also did a change to automate the running of a Python script upon save (3b25da3 <3b25da3>). In the latest development snapshot, JavaScript extensions can trigger custom commands. They can also connect to the tiled.assetSaved event. Would you be interested in trying to hook up your Python script that way? I'd also be interested in looking at enabling a JavaScript extension to set the type and name for tile objects like you're doing in commit a5c0fe2 <a5c0fe2>. I think the main thing missing here is a signal when an object gets added. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#2156>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAA5EXSDPK7MTFBSTEMNYMLP73SL7ANCNFSM4ICRISFA> .

oncer mentioned this issue Jul 16, 2019

use QHash instead of QList for Document::sDocumentInstances #2159

Merged

bjorn closed this as completed in b0c6ac3 Jul 17, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

loading tmx files is unnecessarily slow for big worlds #2156

loading tmx files is unnecessarily slow for big worlds #2156

oncer commented Jul 12, 2019

bjorn commented Jul 16, 2019

oncer commented Jul 16, 2019

bjorn commented Jul 16, 2019 •

edited

bjorn commented Jul 17, 2019

oncer commented Jul 19, 2019 via email

loading tmx files is unnecessarily slow for big worlds #2156

loading tmx files is unnecessarily slow for big worlds #2156

Comments

oncer commented Jul 12, 2019

bjorn commented Jul 16, 2019

oncer commented Jul 16, 2019

bjorn commented Jul 16, 2019 • edited

bjorn commented Jul 17, 2019

oncer commented Jul 19, 2019 via email

bjorn commented Jul 16, 2019 •

edited