The Great Terminal Rewrite #409
We did a little more research on the OpenGL version results from the survey, and figured out that all of the 2.1 users are macOS users. This is because Minecraft uses the compatibility profile, which, on the macOS driver, forces the version to 2.1 or lower. So, we are basically implementing the VBO renderer for the macOS users (and whoever else comes along with a 2.1 card, though that's very unlikely).
This is a backport of 1.15's terminal rendering code with some further improvements. This duplicates a fair bit of code, and is much more efficient. I expect the work done in #409 will supersede this, but that's unlikely to make its way into the next release so it's worth getting this in for now.
- Refactor a lot of common terminal code into `FixedWidthFontRenderer`. This shouldn't change any behaviour, but makes a lot of our terminal renderers (printed pages, terminals, monitors) a lot cleaner.
- Terminal rendering is done using a single mode/vertex format. Rather than drawing an untextured quad for the background colours, we use an entirely white piece of the terminal font. This allows us to batch draws together more elegantly.
- Some minor optimisations:
  - Skip rendering `"\0"` and `" "` characters. These characters occur pretty often, especially on blank monitors and, as the font is empty here, it is safe to skip them.
  - Batch together adjacent background cells of the same colour. Again, most terminals will have large runs of the same colour, so this is a worthwhile optimisation.
  These optimisations do mean that terminal performance is no longer consistent, as "noisy" terminals will have worse performance. This is annoying, but still worthwhile.
- Switch monitor rendering over to use VBOs.
  We also add a config option to switch between rendering backends. By default we'll choose the best one compatible with your GPU, but there is a config option to switch between VBOs (reasonable performance) and display lists (bad).
When benchmarking 30 full-sized monitors rendering a static image, this improves my FPS[^1] from 7 to 95. This is obviously an extreme case - monitor updates are still slow, and so more frequently updating screens will still be less than stellar.
[^1]: My graphics card is an Intel HD Graphics 520. Obviously numbers will vary.
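The two minor optimisations described in this commit message (skipping empty glyphs and batching same-coloured background cells) could be sketched roughly as below. This is only an illustration, not the code from the commit: `BackgroundRuns`, `Run`, `batch` and `isEmptyGlyph` are hypothetical names, and one row of background colours is assumed to be stored as a `byte[]`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the two optimisations; not the mod's real API. */
final class BackgroundRuns {
    /** A run of adjacent cells in one row that all share a background colour. */
    static final class Run {
        final int start;   // column of the first cell in the run
        final int length;  // number of cells in the run
        final byte colour; // shared background colour (0-15)

        Run(int start, int length, byte colour) {
            this.start = start;
            this.length = length;
            this.colour = colour;
        }
    }

    /** Glyphs with nothing to draw, so no quad needs to be emitted for them. */
    static boolean isEmptyGlyph(char c) {
        return c == '\0' || c == ' ';
    }

    /** Groups a row of background colours into runs, so each run can be drawn as one quad. */
    static List<Run> batch(byte[] row) {
        List<Run> runs = new ArrayList<>();
        int start = 0;
        for (int x = 1; x <= row.length; x++) {
            // Close the current run at the end of the row or when the colour changes.
            if (x == row.length || row[x] != row[start]) {
                runs.add(new Run(start, x - start, row[start]));
                start = x;
            }
        }
        return runs;
    }
}
```

A row that is entirely one colour collapses to a single run, while a "noisy" row degrades to one run per cell, which matches the inconsistent performance noted above.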
- Write to a PacketBuffer instead of generating an NBT tag. This is then converted to an NBT byte array when we send across the network.
- Pack background/foreground colours into a single byte.
This derives from some work I did back in 2017, and some of the changes made/planned in #409. However, this patch does not change how terminals are represented, it simply makes the transfer more compact.
This makes the patch incredibly small (100 lines!), but also limited in what improvements it can make compared with #409.
We send 26626 bytes for a full-sized monitor. While a 2x improvement over the previous 58558 bytes, there's a lot of room for improvement.
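Since both colours are in the range 0-15, packing them into one byte amounts to giving each a 4-bit nibble. A minimal sketch follows. `ColourPacking` is a hypothetical name, and putting the background in the high nibble is an assumption; the patch may well order them differently.

```java
/** Hypothetical helper; the nibble order (background high, foreground low) is assumed. */
final class ColourPacking {
    /** Packs two 4-bit colour indices (0-15) into a single byte. */
    static byte pack(int background, int foreground) {
        if ((background & ~0xF) != 0 || (foreground & ~0xF) != 0) {
            throw new IllegalArgumentException("colours must be in the range 0-15");
        }
        return (byte) ((background << 4) | foreground);
    }

    /** Extracts the background colour; the mask undoes sign extension of the byte. */
    static int background(byte packed) {
        return (packed >> 4) & 0xF;
    }

    /** Extracts the foreground colour from the low nibble. */
    static int foreground(byte packed) {
        return packed & 0xF;
    }
}
```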
This uses the system described in #409 (or at least, how I understand it) to render monitors in a more efficient manner. Each monitor is backed by a texture buffer object (TBO) which contains a relatively compact encoding of the terminal state. This is then rendered using a shader, which consumes the TBO and uses it to index into the main font texture. My OpenGL skills are pretty much nonexistent, so the implementation of this is no doubt terrible. However, the performance so far is outstanding compared with the current VBO renderer, as it transmits significantly less data to the GPU.
This uses the system described in #409 to render monitors in a more efficient manner. Each monitor is backed by a texture buffer object (TBO) which contains a relatively compact encoding of the terminal state. This is then rendered using a shader, which consumes the TBO and uses it to index into the main font texture.
As we're transmitting significantly less data to the GPU (only 3 bytes per character), this effectively reduces any update lag to 0. FPS appears to be up by a small fraction (10-15fps on my machine, to ~110), possibly as we're now only drawing a single quad (though doing much more work in the shader).
On my laptop, with its Intel integrated graphics card, I'm able to draw 120 full-sized monitors (with an effective resolution of 3972 x 2330) at a consistent 60fps. Updates still cause a slight spike, but we always remain above 30fps - a significant improvement over VBOs, where updates would go off the chart.
Many thanks to @Lignum and @Lemmmy for devising this scheme, and helping test and review it! ♥
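To make the "3 bytes per character" figure concrete, here is a rough sketch of how a terminal's state might be flattened into a buffer ready for upload to a TBO. The commit message does not spell out the layout, so the byte order (character, text colour, background colour) and the name `MonitorBufferEncoder` are assumptions, and the actual OpenGL upload is left out.

```java
import java.nio.ByteBuffer;

/** Hypothetical encoder; the per-cell layout (char, text colour, background) is assumed. */
final class MonitorBufferEncoder {
    static final int BYTES_PER_CELL = 3;

    /** Interleaves the three per-cell arrays into one direct buffer, 3 bytes per cell. */
    static ByteBuffer encode(byte[] text, byte[] textColour, byte[] backgroundColour) {
        if (text.length != textColour.length || text.length != backgroundColour.length) {
            throw new IllegalArgumentException("all arrays must have the same length");
        }
        // Direct buffers are what native graphics APIs generally expect.
        ByteBuffer buffer = ByteBuffer.allocateDirect(text.length * BYTES_PER_CELL);
        for (int i = 0; i < text.length; i++) {
            buffer.put(text[i]).put(textColour[i]).put(backgroundColour[i]);
        }
        buffer.flip(); // switch from writing to reading, ready to hand to the GPU
        return buffer;
    }
}
```

Under this layout the shader would read cell `i` at byte offset `3 * i`, then use the character byte to pick a glyph from the font texture.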
Given that most of the rendering changes have been merged, it's probably worth beginning to look into what changes can be made to our network code now. I guess I'm thinking the following steps:
Having talked with @Lemmmy, I'm going to close this for now. I really want to add incremental updates in the future, quite possibly using this design. However, we've made several pretty major optimisations to the network code (reducing traffic by at least 50%, often 75-80%), so this is less of a priority.
The Great Terminal Rewrite
This is a series of PRs which aims to improve terminal objects all round, with particular focus on:
For the majority of CC's lifetime, the current terminal implementation has worked fine. But recently, especially with the improvements brought by Cobalt, people have been pushing CC further and further to its absolute limits. We've seen this a lot on SwitchCraft, with a huge number of computers doing a lot of work at once. The Juroku cinema especially (rendering 480p video at 20FPS on 16x9 monitors) has prompted some urgent improvements to the system - players frequently time out due to the bulk of serialising monitors inefficiently with NBT. This series of PRs aims to fix all of this.
The Plan
Terminal internals rewrite
Terminals currently use TextBuffers to store their data. This isn't an awful data structure, because it makes it very convenient to access lines, but as terminals get bigger and more of them exist, this can become a 'death by a thousand paper cuts' problem, particularly with the class instance overhead.
This class will be removed entirely, and terminals will store just three 1-dimensional byte arrays: one for text (chars), one for the background colour (0-15), and one for the text colour (0-15). This structure has some great performance benefits in that it can be trivially copied from/to (using the native `System.arraycopy`), and this data can be passed directly to OpenGL in the form of a Uniform Buffer Object or Buffer Texture (more on this later). `term.blit` and `term.write` are now instantaneous operations with no loops involved (besides converting colour strings to byte arrays).
Networking rewrite
Terminal serialisation is currently terrifying. If you've ever seen the printed pages NBT, you probably know what to expect. The entire terminal state (for monitors and computers) is serialised into NBT and sent to all clients in a structure that looks something like this:
This entire payload is recalculated every time anything on a terminal changes in a tick (I call this 'marked dirty'). As the terminal gets bigger and/or updates more frequently, the overhead of the nested blit tags is not negligible (and gzip can only do so much). The key takeaway here, though, is that the entire terminal state is re-sent to every player anytime anything changes (even just the cursor position!).
The new design is ultimately going to look something like this:
- `term.scroll` will be a separate packet or flag in the payload, so as to not require re-sending the entire state (as scrolling will mark every chunk dirty otherwise).
- `term.setCursorPos` and `term.setCursorBlink` will send a minimal payload without invalidating any chunks if it is necessary.
- A `term.clear` payload will be sent.
- `term.setPaletteColour` will also be a separate packet/payload.
I believe that 8x8 chunks are a fair compromise in signal to noise ratio - there is only one (or two) additional byte per 64 characters in a terminal (thus, only 20/40 extra bytes in a regular computer terminal).
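Putting the flat-array storage and the 8x8 chunking together, a write into the terminal could copy the data with `System.arraycopy` and flag only the chunks it touches. The following is a sketch under several assumptions rather than the planned implementation: `FlatTerminal` and its members are hypothetical names, chunks are numbered row by row, and a `BitSet` is just one possible way to track dirty chunks.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical terminal backed by three flat byte arrays, with per-chunk dirty flags. */
final class FlatTerminal {
    static final int CHUNK_SIZE = 8;

    final int width, height;
    final byte[] text, textColour, backgroundColour;
    final BitSet dirtyChunks; // one bit per 8x8 chunk, numbered row by row (assumed)
    private final int chunksAcross;

    FlatTerminal(int width, int height) {
        this.width = width;
        this.height = height;
        text = new byte[width * height];
        textColour = new byte[width * height];
        backgroundColour = new byte[width * height];
        Arrays.fill(text, (byte) ' ');
        chunksAcross = (width + CHUNK_SIZE - 1) / CHUNK_SIZE;
        int chunksDown = (height + CHUNK_SIZE - 1) / CHUNK_SIZE;
        dirtyChunks = new BitSet(chunksAcross * chunksDown);
    }

    /** Copies cells into row y from column x, clipped at the right edge, and marks chunks dirty. */
    void blit(int x, int y, byte[] chars, byte[] fg, byte[] bg) {
        if (fg.length != chars.length || bg.length != chars.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        // Simplification: writes starting off-screen are dropped rather than clipped.
        if (x < 0 || x >= width || y < 0 || y >= height) return;
        int length = Math.min(chars.length, width - x);
        if (length == 0) return;

        int offset = y * width + x;
        System.arraycopy(chars, 0, text, offset, length);
        System.arraycopy(fg, 0, textColour, offset, length);
        System.arraycopy(bg, 0, backgroundColour, offset, length);

        // Only the chunks overlapped by this horizontal span need re-sending.
        int rowStart = (y / CHUNK_SIZE) * chunksAcross;
        int first = rowStart + x / CHUNK_SIZE;
        int last = rowStart + (x + length - 1) / CHUNK_SIZE;
        dirtyChunks.set(first, last + 1);
    }
}
```

On a 51x19 terminal, writing four characters at column 6 of row 9 would mark just two of the chunks dirty, so only those would need to be included in the next update.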
Implementation details
With this payload design, it will be easy to perform the calculations necessary to reproduce the terminal. For example, a 51x19 terminal would be chunked like this:
All you need is the terminal's width and height, and you can calculate the position and dimensions of any given chunk from there:
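One way this calculation could look, as a sketch rather than the PR's actual code. It assumes 8x8 chunks numbered row by row from the top-left, with chunks on the right and bottom edges clipped to the terminal; `ChunkLayout` and its methods are hypothetical names.

```java
/** Hypothetical helper for locating 8x8 chunks inside a terminal. */
final class ChunkLayout {
    static final int CHUNK_SIZE = 8;

    /** Number of chunk columns needed to cover the given width (rounded up). */
    static int chunksAcross(int width) {
        return (width + CHUNK_SIZE - 1) / CHUNK_SIZE;
    }

    /** Number of chunk rows needed to cover the given height (rounded up). */
    static int chunksDown(int height) {
        return (height + CHUNK_SIZE - 1) / CHUNK_SIZE;
    }

    /** Returns {x, y, width, height} of the chunk with the given row-major index. */
    static int[] bounds(int index, int termWidth, int termHeight) {
        int across = chunksAcross(termWidth);
        if (index < 0 || index >= across * chunksDown(termHeight)) {
            throw new IndexOutOfBoundsException("chunk " + index);
        }
        int x = (index % across) * CHUNK_SIZE;
        int y = (index / across) * CHUNK_SIZE;
        // Edge chunks are smaller when the terminal size is not a multiple of 8.
        int w = Math.min(CHUNK_SIZE, termWidth - x);
        int h = Math.min(CHUNK_SIZE, termHeight - y);
        return new int[] { x, y, w, h };
    }
}
```

Under these assumptions a 51x19 terminal has 7 x 3 = 21 chunks, in line with the estimate of 20 or 40 extra bytes; the last chunk in each row is only 3 cells wide and the bottom row of chunks is only 3 cells tall.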
The payload for a single chunk will look something like this:

The implementation for other packets, such as `term.scroll`, is yet to be decided.
The focus of these changes is:
Rendering rewrite (separate PR)
The final big change planned is a complete rewrite of the terminal renderer. The current renderer uses a messy mix of Tessellator quads and raw display lists. This is fine for GUI terminals and map-like held terminals, where there is only ever one on the screen at a time, and they aren't that large. However, monitors can get quite large, many of them can be visible at once, and they can update as fast as the server allows. As such, the main focus for the rendering change (and most of the changes in the terminal rewrite in general) is monitors.
Hardware survey
After discussing a few ideas with Lignum, we decided to perform a hardware survey amongst the SwitchCraft userbase. This has provided us with a good sample size for understanding the hardware support of 1.12 players. As of writing, we have received 62 responses. Within those responses, 56 users (90.3%) support OpenGL 3.1 or greater:
Thus, 56 users (90.3%) support Texture Buffer Objects and Uniform Buffer Objects directly, as these features are core in OpenGL 3.1. But, they are also available through their respective ARB and EXT extensions:
It doesn't seem that there are any users who have TBOs provided to them solely through an extension. Everybody who has access to TBOs has access to OpenGL 3.1. On the other hand, there are 3 users who only have access to UBOs via the ARB extension. As such, it's not impossible that there is a user who only has access to TBOs via the ARB or EXT extensions.
The new renderer idea
So, why do we care about those features? The new plan for terminal rendering is kind of wacky, but very possible. We're hoping to move the entire terminal renderer into a single shader. This shader will take in the following inputs:
This will unify all of the terminal renderers into a single class that handles all of the rendering, and all of the work will be done by the GPU. As such, we won't have thousands of quads on the screen just for each monitor in the world. The buffer textures can be cached (similar to frame buffers), and discarded when the chunk is unloaded. We should also be able to use some Buffer Object Streaming techniques to update the TBOs/UBOs efficiently.
Fallback renderer
Within the hardware survey's sample, there are 6 users who don't have access to TBOs, and 3 who don't have access to UBOs. These are the OpenGL 2.1 users. Despite being released in 2006, OpenGL 2.1 is still relatively common, particularly with users who are using open source graphics drivers, or Intel integrated graphics. Mojang raised the minimum OpenGL version to 2.1 in 1.8, so we don't have to worry about providing a fallback renderer for anything less than this. Regardless, UBOs and TBOs are still unavailable in a small number of cards, so we need to support something.
We decided that we still plan to scrap the current renderer and use VBOs as the fallback renderer. The Tessellator actually uses VBOs internally, even when the 'Use VBOs' option is turned off. As such, the current font renderer also uses VBOs, with the exception of monitors, where it's a half-VBO half-display list Frankenstein's monster. That said, there is a much more efficient way for these to be used, so we're going to be scrapping most of the existing code.
As such, our plan for rendering support looks something like this:
There doesn't seem to be much point in supporting the extensions for TBOs at current usage, but if there seems to be appropriate demand, then it will look something like this:
With an appropriate level of abstraction, all of this would be easy to manage, and all the terminal renderers would use a common class. It would be a matter of swapping out the function calls and changing a few arguments to support the TBO extensions.
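As a rough illustration of that abstraction, backend selection could be reduced to a single decision based on the reported OpenGL version, with extension support as an optional switch. Everything here is hypothetical (`RendererSelection`, `Backend`, `choose` and its parameters), and it only considers TBOs; a real implementation would query the capabilities from the GL context and would also need to account for UBOs.

```java
/** Hypothetical backend picker; capability flags would come from the GL context. */
final class RendererSelection {
    enum Backend { TBO, VBO }

    /**
     * Picks the shader-based TBO renderer where available, otherwise the VBO fallback.
     *
     * @param allowTboExtension whether TBOs provided only by an extension should be used
     * @param hasTboExtension   whether the driver reports a TBO extension
     */
    static Backend choose(int glMajor, int glMinor, boolean allowTboExtension, boolean hasTboExtension) {
        // Texture buffer objects are core from OpenGL 3.1 onwards.
        boolean coreTbo = glMajor > 3 || (glMajor == 3 && glMinor >= 1);
        if (coreTbo) return Backend.TBO;
        if (allowTboExtension && hasTboExtension) return Backend.TBO;
        // OpenGL 2.1 is Minecraft's minimum, so VBOs are always available as a fallback.
        return Backend.VBO;
    }
}
```

Keeping the choice in one place means the terminal renderers only ever see a `Backend` value, so enabling the extension path later would not touch them.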
The Roadmap
This PR is still very much a work in progress, and is currently not ready for review. It is mainly here as a marker for progress.
- `term.clear`
- `term.scroll`
- `term.setCursorPos`
- `term.setCursorBlink`
- `term.setPaletteColour`
This PR will definitely need to be rebased later on.