Optimize the layer move processing and GFX render #33
Comments
Thank you so much for your work on this project! Your source code is ported very clearly and has been wonderful to work with. I am working on a 3DS port of the game and have fixed this issue by adding
to |
@ds-sloth, yeah, I was thinking about making a similar thing, but it seems you can contribute this while I focus on other things. Note that you'll need to refresh those lists every time an individual block or NPC gets added, removed, re-sorted, etc. Also, the rendering needs a major optimization: its weak point is that it scans through all element arrays to find the visible ones, and when layers with blocks are moving, the block sorting check gets disabled and the full scan runs on every GFX update. That's the reason my game lagged on some bigger levels while running on my Archos 70c tablet. I'll review all your changes once you show them 😉 |
It will take me a little while to upload. I am actually re-implementing the editor GUI for 3DS so there is a lot of work I did and still have to do. I agree that using a quad-tree will be very good but that is beyond my programming skill. :) |
As I told you, the quad-tree implementation already exists in PGE Engine, and it needs to be backported. It should register all known objects inside itself, and when they change their state, they should be updated in the quad-tree by the
I still have some sort of "Nostalgic Editor" in mind, and the first thing I would use is the ImGUI library over SDL2. You won't need any heavy GUI frameworks like Qt or wxWidgets, and you won't need any platform-dependent solutions like WinAPI, GTK, or Cocoa. ImGUI is also used in various embedded solutions, one of them was a... robot! P.S. The library itself is here: https://github.com/ocornut/imgui Keep in mind that TheXTech uses the SDL Render API to stay simple and fully platform-independent, so it should work even on a smart kettle that has no OpenGL on board, etc. |
What if we made one quad tree per layer? For bricks, I don't think they ever move except when their layer moves so this could save a lot of processing power. (This would also save processing time for the move layer function.) The game uses absolute positions but I think it should be pretty possible to switch it to positions relative to layer..... Can you link your quad tree code? I don't have much time these days but would like to take a look and learn what I can so I could include it someday. I made a silly GUI toolset but it works. Very minimal and hard to maintain. It just uses frm main render rect to simulate buttons. But it works well on the 3DS which has a 240p screen (!). One could definitely replace it on PC but imgui seems to depend on a 3D rendering toolkit and I don't think it supports the 3DS's. |
That will require a recursive search and the conversion of coordinates from local to global. I already did this in PGE Engine, where every layer has its own quad-tree subtree. That makes the physics more tricky. Here, at least, the optimization should be applied with as few changes as possible to guarantee full compatibility of the physics workflow. In the case of TheXTech, adding per-layer trees will require a lot of changes in the physics computation, including a two-way recursive coordinate conversion.
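To illustrate the trade-off described here, below is a minimal sketch of blocks stored in layer-local coordinates. All names (`LayerSpace`, `toLocal`, `toGlobal`) are hypothetical and are not taken from TheXTech or PGE Engine; the linear scan only stands in for a per-layer quad-tree.

```cpp
#include <cassert>
#include <vector>

struct Rect { double x, y, w, h; };

// Hypothetical layer that keeps its blocks in layer-local coordinates.
struct LayerSpace {
    double offsetX = 0.0; // layer origin in global (level) coordinates
    double offsetY = 0.0;
    std::vector<Rect> localBlocks;

    Rect toLocal(const Rect &g) const { return Rect{g.x - offsetX, g.y - offsetY, g.w, g.h}; }
    Rect toGlobal(const Rect &l) const { return Rect{l.x + offsetX, l.y + offsetY, l.w, l.h}; }

    // Moving the layer is O(1): only the origin changes, no block is touched.
    void move(double dx, double dy) { offsetX += dx; offsetY += dy; }

    // Every query pays for the two-way conversion: global -> local on the way in,
    // local -> global for each hit on the way out.
    std::vector<Rect> query(const Rect &globalArea) const {
        assert(globalArea.w >= 0 && globalArea.h >= 0);
        const Rect a = toLocal(globalArea);
        std::vector<Rect> out;
        for(const Rect &b : localBlocks) // a per-layer quad-tree would replace this scan
        {
            if(b.x < a.x + a.w && a.x < b.x + b.w && b.y < a.y + a.h && a.y < b.y + b.h)
                out.push_back(toGlobal(b));
        }
        return out;
    }
};
```

The layer move becomes free, but every physics query now has to go through `toLocal` and `toGlobal`, which is the kind of change to the physics workflow that this comment warns about.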
|
@ds-sloth, hello!
That should resolve the problem of the Dr.Pepper Pyramid (in the "2k15 Summer Takeover" episode), which lags on my Orange Pi PC with the Mali400 GPU because of this damn querying of 12k elements in one frame, repeated over the same 12k elements for every playable character and every active NPC. That will be tricky because some queries depend on the in-array order. One example: when the player stomps the SMW Koopa, it spawns the beach Koopa at the shell. I accidentally introduced a bug where the just-spawned beach Koopa got immediately stomped by the player. I fixed that when I found the mistake in the array loop. The cases I should pay attention to:
|
The world map code looks great. I haven't merged it into my 3DS build yet because I am trying to rebase my 3DS build on your current code and integrate the two branches. But it seems like it will speed things up a lot. Without your optimizations, the world map runs at 60fps on the New 3DS but only at 30fps on the Old 3DS. As you make these optimizations, please keep in mind to separate the different classes of NPCs / BGOs in your quadtree so that we do not need to loop over the full list three/four times and then check whether the ID is in the right set. (For BGOs this is solved in the original code by sorting the BGOs by class, but that doesn't work in the Level Editor and we could make it smarter with the quadtree representation.) |
You can do your merge gradually. Some good news for you: I don't plan any major changes in the near future except the quad-tree optimization for levels, as I really wish to solve the DrPeeper problem on my damned ARM board.
On the world map, I made a separate quad-tree for every type of object. I guess I'll do the same for level objects to simplify the process, mainly because the game does per-type searches. The only case where I will need to mix all objects together is the rendering system; otherwise, I'll keep the same Z-groups as vanilla does, per type, and the resulting Z-value. |
Great, thank you. |
The source of slow levels is the "2k12 - Summer Takeover" episode. The following levels are confirmed as slow:
I don't know about the others as I haven't reached them yet on my ARM board. All these levels have a lot of blocks and layers moving along X. In fact, there is another reason for the lag:
|
By the way, "Dungeon of Pain" in "The Invasion" also lags on my Archos 70c tablet. "The Night" in "Another Princess Cliche" lags on my tablet too. |
Redigit made a one-dimensional tile map that allows quickly finding a few blocks along the X-axis. However, if you have a lot of blocks along the Y-axis, you will get lag even when there are no moving layers in the section, he-he 😼 Mainly because EVERY NPC will query for these blocks. |
So, that's another argument for making the quad-tree instead of this mess: then, no matter the dimension, everything should work quickly, even if the section contains a million objects! |
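A rough sketch of why an X-only index degrades on tall sections. This is a simplified illustration, not the original game's code: the `ColumnIndex` layout, the 32-pixel column width, and the function names are assumptions, and block X coordinates are assumed to be non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Block { double x, y; };

constexpr double kColumnWidth = 32.0; // assumed column width in pixels

// Hypothetical X-only index over an array of blocks sorted by X.
struct ColumnIndex {
    std::vector<std::size_t> first; // first[c]: index of the first block in column c
    std::vector<std::size_t> count; // count[c]: number of blocks in column c
};

ColumnIndex buildColumnIndex(const std::vector<Block> &blocks, std::size_t columns)
{
    // The index only works if the array is kept sorted by X.
    assert(std::is_sorted(blocks.begin(), blocks.end(),
                          [](const Block &a, const Block &b) { return a.x < b.x; }));

    ColumnIndex idx{std::vector<std::size_t>(columns, 0), std::vector<std::size_t>(columns, 0)};
    for(std::size_t i = 0; i < blocks.size(); ++i)
    {
        const std::size_t c = static_cast<std::size_t>(blocks[i].x / kColumnWidth);
        assert(c < columns);
        if(idx.count[c] == 0)
            idx.first[c] = i;
        ++idx.count[c];
    }
    return idx;
}

// Number of blocks an object standing in column c has to test,
// no matter where the object is along Y.
std::size_t candidatesForColumn(const ColumnIndex &idx, std::size_t c)
{
    return idx.count[c];
}
```

With 1000 blocks stacked vertically in one column, every NPC in that column tests all 1000 of them, even if only one or two are actually near it. A two-dimensional structure such as a quad-tree narrows the candidates along both axes.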
By the way, there are two features I plan to make in the near future:
|
Those both sound great. |
I would add raw FreeType (would that hurt you or not?). I use it directly, I don't even use fontconfig. In PGE Engine I use only fonts placed in the config pack's directory (and there will be an additional "fonts" directory in the game assets that will add support for new fonts). I will keep all fallback to:
I keep full backward compatibility with old resource packs, especially because several users make a lot of mods on top of them, and it would be a pain for them to synchronize everything. But, to simplify some cases, it's a good idea to provide some sort of resource pack patches that add new resources and allow new features to be used with them. |
FreeType could be difficult because I am using citro3d/citro2d (3DS specific rendering library) due to compatibility issues with SDL on 3DS. The 3DS stores textures in a non-linear memory layout so it might be non-trivial to render a FreeType font to it but I am sure I will be able to. |
Don't worry, I manually convert every used glyph into a raster fragment that I pass to the renderer as an ordinary texture 😉 |
You can see how they work at PGE Engine already: The |
So, what do you think about this? |
Yes, I think I would just have to make a wrapper to convert the glyph into the 3DS texture format instead of the GL texture format. Not sure how much work that would be but it should definitely be possible. Thanks for the link. Also, just spent some time making Background2s work at 1280x720 and I can see why you want to rewrite a lot of that code. |
@ds-sloth, I finally tested your branch on my ARM board, and the DrPeeper level REALLY works much better. Look at the stats: However, if I toggle the SNES resolution, it renders fine at 65 FPS as fewer textures appear. It's the debug build without optimizations, yeah. Later I'll build the release build and try this again. Anyway, the GPU (Mali400) that I have on the board is weak by itself. |
Status update for anyone following this: block quadtree has been merged into It makes block logic and rendering MUCH faster in large levels, vertical levels, and levels with layer movement. Makes block logic somewhat slower in "normal" horizontal levels. Wohlstand and I will probably want to continue thinking together about optimization in order to recover this performance before closing this issue. |
Okay, so I'm going to manually backport this update into the stable 1.3.5.x branch, as this is a very important optimization that should appear in the next release. |
Adding this to 1.3.5.x will not be trivial because some of the changes are tied to previous optimizations I had made. I believe you will want to cherry-pick commits 606fa4c, d030913, 8a75b78, 02f927f, 18a2af7, f6b2874, and 207f4e3. Maybe more. Let me know if you have any questions and I'll be happy to help. |
Okay, the rest of this work was done and successfully merged into the mainstream. So, we can close this. Anything for the future (making quadtree for BGOs, warps, physical environment zones, etc.) should be done separately. |
Since the original game, the layer movement has used a VERY INEFFICIENT algorithm:
https://github.com/Wohlstand/TheXTech/blob/795b5dc6db3e977622b57844ca5b439bfada9e92/src/layers.cpp#L733-L754
To move one layer (which may contain only a few blocks), it loops through the entire array of thousands of blocks every cycle!
On modern devices, that may not affect gameplay. On weak devices, it causes strong lag and FPS drops.
Instead, any per-layer processing must be done through layer member lists, which avoids walking the entire block array every cycle. That may require checks wherever the block/BGO/NPC arrays get modified, which may happen for various reasons.
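A minimal sketch of the member-list idea, assuming simplified stand-in types. The `Block` and `Layer` structures and both function names are hypothetical and much smaller than the real TheXTech ones.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified types; the real structures carry many more fields.
struct Block {
    double x = 0.0;
    double y = 0.0;
    int layer = 0; // index of the owning layer
};

struct Layer {
    double speedX = 0.0;
    double speedY = 0.0;
    std::vector<std::size_t> blocks; // indices into the global block array
};

// Rebuild every layer's member list from scratch. This must be re-run (or the
// lists patched incrementally) whenever blocks are added, removed, or re-sorted,
// because the stored indices would otherwise go stale.
void rebuildLayerMembers(std::vector<Layer> &layers, const std::vector<Block> &blocks)
{
    for(Layer &l : layers)
        l.blocks.clear();

    for(std::size_t i = 0; i < blocks.size(); ++i)
    {
        assert(blocks[i].layer >= 0 &&
               static_cast<std::size_t>(blocks[i].layer) < layers.size());
        layers[static_cast<std::size_t>(blocks[i].layer)].blocks.push_back(i);
    }
}

// Move one layer by touching only its own members instead of the whole array.
// Returns the number of blocks visited.
std::size_t moveLayer(const Layer &layer, std::vector<Block> &blocks)
{
    for(std::size_t idx : layer.blocks)
    {
        blocks[idx].x += layer.speedX;
        blocks[idx].y += layer.speedY;
    }
    return layer.blocks.size();
}
```

With this layout, moving a layer of two blocks in a level of a thousand visits two blocks per cycle. The cost moves to keeping the lists in sync with every array modification.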
EDIT: Also, the GFX rendering needs to be optimized to avoid the full scan through all element arrays. That would call for some sort of quad-tree class (there is one used in PGE Engine that can be reused). There is a challenge: it's necessary to go through all cases where objects change their scene position or size, and then update them in the quad-tree.
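For illustration only, here is a small self-contained quad-tree sketch with the `update` step that this challenge is about. It is not the PGE Engine class; every name, the depth limit, and the placement policy (an item lives in the deepest node that fully contains it) are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Rect {
    double x, y, w, h;
    bool intersects(const Rect &o) const {
        return x < o.x + o.w && o.x < x + w && y < o.y + o.h && o.y < y + h;
    }
    bool contains(const Rect &o) const {
        return o.x >= x && o.y >= y && o.x + o.w <= x + w && o.y + o.h <= y + h;
    }
};

struct Item { int id; Rect box; };

class QuadTree {
public:
    explicit QuadTree(const Rect &bounds, int depth = 0) : m_bounds(bounds), m_depth(depth) {
        assert(bounds.w > 0 && bounds.h > 0);
    }

    // Store the item in the deepest node whose bounds fully contain its box.
    void insert(const Item &it) {
        const int q = childFor(it.box);
        if(q < 0) { m_items.push_back(it); return; }
        if(!m_child[q])
            m_child[q] = std::make_unique<QuadTree>(childRect(q), m_depth + 1);
        m_child[q]->insert(it);
    }

    // Follows the same path as insert(), so it needs the box the item was stored with.
    bool remove(int id, const Rect &oldBox) {
        const int q = childFor(oldBox);
        if(q >= 0)
            return m_child[q] && m_child[q]->remove(id, oldBox);
        for(std::size_t i = 0; i < m_items.size(); ++i) {
            if(m_items[i].id != id) continue;
            m_items[i] = m_items.back();
            m_items.pop_back();
            return true;
        }
        return false;
    }

    // Must be called at every place where an object changes position or size.
    void update(int id, const Rect &oldBox, const Rect &newBox) {
        remove(id, oldBox);
        insert(Item{id, newBox});
    }

    // Collect ids of items intersecting the area, skipping subtrees outside it.
    void query(const Rect &area, std::vector<int> &out) const {
        for(const Item &it : m_items)
            if(it.box.intersects(area)) out.push_back(it.id);
        for(const auto &c : m_child)
            if(c && c->m_bounds.intersects(area)) c->query(area, out);
    }

private:
    static constexpr int MaxDepth = 8;

    Rect childRect(int q) const {
        const double hw = m_bounds.w / 2, hh = m_bounds.h / 2;
        return Rect{m_bounds.x + (q % 2) * hw, m_bounds.y + (q / 2) * hh, hw, hh};
    }
    // Index of the quadrant that fully contains the box, or -1 if none does.
    int childFor(const Rect &box) const {
        if(m_depth >= MaxDepth) return -1;
        for(int q = 0; q < 4; ++q)
            if(childRect(q).contains(box)) return q;
        return -1;
    }

    Rect m_bounds;
    int m_depth;
    std::vector<Item> m_items;
    std::array<std::unique_ptr<QuadTree>, 4> m_child;
};
```

The renderer would call `query` with the camera rectangle instead of scanning every array. Note that the hits come back in tree order, not array order, so any code that relies on the original array order would have to sort the result by index first.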