Feature: store table headers for each chunk in savegame #9322
Motivation / Problem
So there is a bit of a story here.
First off, the main reason I started this change, is because I wanted to move around some settings. But the order of settings that are stored in the savegame cannot be changed. That is really annoying, as there isn't really way around that. Sure, we can change the order, do a savegame version bump .. but that also leaves us with a huge list of the original order to make sure old games load. I wanted this fixed.
Next, we quickly figured out this was an universal issue with our savegame format. It allows for no modification of order of anything .. often leaving the code a bit weird and odd, because of what is stored on disk. This tightly-coupled information has made many mistakes in the past.
Lastly, for a while now we have somewhat the ambition to move to SQLite as savegame format. But getting there from where we were, is far from easy.
Those three things combined gave me the motivation (hihi, same word, different context) to write this PR.
To decouple what fields are on disk and what we are expected, I basically made the savegames self-descriptive. With that I mean there is enough information in the savegame to know how to read the savegame, without having to know the savegame version or any internals about how OpenTTD loads the game .. well, other than the format, ofc.
The idea is simple: all chunks (with a few exceptions, see below) are now
The header is very easy: a list of (type, key) pairs, telling in what order the keys are stored, and what type the field is. The list ends with a type=0 entry.
The type-field is a hard-coded list, and can have 1 flag: is this is a list or not. If it is a list, when reading the entry the field will be start with a gamma telling you how many items to expect.
So how does OpenTTD react on headers that are different than what he expects?
These last two are rather important: if the adding or removing of a field influences the gameplay, you should also bump the savegame version. Often this will mean: only removing doesn't need a bump, adding most likely does. But this is up to the developer.
To keep backwards compatibility, in
Finally, to top off this PR, I created https://github.com/TrueBrain/OpenTTD-savegame-reader . This is a very simply Python application that only knows the savegame format. It doesn't know anything about any chunks. And still it can read savegames made with this PR fine. The reader is meant as verification that the client indeed creates savegames that can easily be read by external tools.
I did some extensive testing with this:
I have a test-suite of ~500 savegames, ranging from TTDp savegame up to 1.11.2. A total of over 100 different savegame versions.
And except some minor totally unrelated bugs, this is in fact the case. That shows that at the very least everything that was loaded is saved and loaded correctly again.
Lastly, I did a lot of testing by saving a game with this PR, and then removing fields from the code. That all seems to work out fine .. I can even remove complete structs, and it will do the right thing.
Checklist for review
Some things are not automated, and forgotten often. This list is a reminder for the reviewers.
The text was updated successfully, but these errors were encountered:
Where do you envision this glue code going? There doesn't seem to be an obvious way to handle something like a particular field being made wider at a particular savegame version with this scheme.
For now, I suggest we keep the current system in place. So if you want to change the type, you will have to bump the savegame version and add a line for the old and new version, with the correct conversion attached to it. And there is always AfterLoad to fix up any real weirdness we might run into :D
Possibly the last commit should go in another PR, but I have been experimenting with adding a header for more chunks, mostly the
Basically what I did, is add two new chunk-types:
Next, I made sure
I converted just a few, and not all yet, as it is very boring job, and I first like to get some opinions if this is something we would want.
Putting this in a header is not really possible; I solved this by making the header
I am sure I will find more of these places as I convert more, but first, let's talk if this is something we want :)
To me this looks okay. There's only one thing that we should be aware of. Merging this means that the "variable" names of objecs that are saved in the savegame do become fixed, i.e. part of the savegame format. Changing the variable name means having a savegame bump, or at least noting that the variable is named differently from the name in the savegame. So that gets uglier in the future.
A very valid question. Two things that I think are worth mentioning in this context:
I considered adding both in this PR, but this would add unused code to our code-base, in the hope it fits a future use-case. That I always find a bit tricky, so I opted to not do this. Mainly as we haven't renamed fields like this in years .. and I don't see a reason this would change any time soon :D
Another thing I considered here, and this is what you mention, is that although the name becomes part of the savegame, it is still an "internal" name not meant for the player. It does make tooling much easier, but I think they can live with weird names (they have to assign meaning to it anyway). This is why I opted not to change any names or fields for this PR, as that would make this already insane deep rabbit hole even deeper :P And also a lot larger .. as we have a lot of weird names :D
So in my opinion, we should keep it for future-work to rename fields, and add the "alias" list in some form. But in all honesty I am not sure if/when this will be picked up by us .. so I fully accept if we think that is too much of a risk, and that we already should do it for this PR. But this is I think exactly what @rubidium42 is also mentioning here :) So this extra text purely for context :)
I thought about it for a while but I still fail to see a good reason for changing the format other than "it just happened".
Because when I think about tooling having an external schema/metadata would make much more sense imo as with minimal modifications that would allow parsing any savegame version and help other applications that do not work with savegames directly like server configuration system.
TBH, to me this sounds like a very good reason not to merge this PR. As a PoC for something that's still untested and undecided yet will leave a technical debt for as long as OpenTTD supports old savegames, which is seemingly an eternity.
This reason is very hard to judge without specifics as, in my mind, mistakes are more about the code and documentation than the format itself and I fail to see how adding additional conversion layer and extra flexibility along with voiding all the previous changes as possible examples for further modifications will help to reduce mistakes.
Even if settings ordering is such a huge problem why not solve it in some other way? Like, making it explicit, for example. Or, even the same table-like format can be made with integer unique field id's instead of strings eliminating the whole aliasing issue entirely.
I mean, I can see some advantages in this table/json-like self-describing format but none of them seem to be even mentioned in this PR. For example:
But, question is, how much of that is really a concern for the savegame format?
You seem to be confusing some things and taking other things out of context. This in term seems to confuse some other people again :D I already gave this a spin on Discord, but let me try here too.
First of all, you seem to be hang up on SQLite. This is one of 3 (!) motivations, and the lesser of the three (which is obvious by the way I wrote the motivations). This PR moves towards that, but this is not a PoC, this is not "untested" nor is this adding to technical debt. If anything, pretty much the reverse (it removes technical debt). I am not sure why you are zooming in on this one part about SQLite, seemly ignoring the rest. Using SQLite as goal just shaped the way of the header, in the midst of the other choices available. Means to an end, not any end on its own. And most important of all, not part of this PR at all. Just because someone says: we are going north, doesn't mean we are going to the north-pole. It just means we take a direction to walk to, and we will see how the scenery looks when we get closer. Clearly this doesn't help for you, but many other people find it useful to know we didn't take a direction blindly and "just did what-ever", but had an idea in our mind when shaping this PR.
Next, you seem to be a bit mistaken by what this PR does. Primary, as is apparent from the history of this PR, its motivation, and its comments, this allows settings to be in any order, instead of the current hard-coded fixed order. From this grew its superset that applies this identical solution to all chunks.
Lastly, you finish with advantages of this PR, leaving me puzzled. From where I am standing, you are complaining for the shake of complaining, and you are not presenting any argument why this would be a bad idea. So I also don't really have a way to help you here, as "I don't like it" is a valid opinion, but hardly an argument one can have a conversation about.
So just to reiterate what the PR does / PR description says: this PR decouples code-order from savegame-order, so we can freely change order of fields in the code without breaking savegames or having the write compatibility code for this. This finally makes some things possible, most of all reworking the huge
And could this have been done an other way? Yes, of course. But this is always the case, no matter what. But this is not an argument not to do this. Just merely an observation.
If there are arguments against this, I would love to hear them. But so far all I read is: "I don't like it". Which is a perfectly fine opinion; just not one I share :D
Isn't the disadvantage blatantly obvious? You get two formats to maintain instead of just one. So now every tool that needs old save games needs to parse both. Every tool that doesn't needs a rewrite anyway. Every change to the savegame needs to consider both formats. Every old patch is broken, everyone who wants to make a new one needs to learn how and there a no examples. And none of it can ever be undone because it's needed for backward compatibility. How's that not a HUGE technical debt?
Sure that will make some changes easier in the future but it also doubles the logic in some other places making a lot of other changes more complex. So now the question is, is it really worth it? And this is where it gets weird because you mention a lot of stuff that it's supposed to be good for but when I look at anything specific it seems that it doesn't really solve it well and there are much better and simpler solutions. Which makes me feel like this PR is a solution looking for a problem. And that's not the right approach as it mixes cause and effect and makes any constructive conversation impossible.
Why am I hang up on SQlite? Because this PR is a huge leap in SQLite direction. In my eyes SQlite is the only reason that explains every decision here perfectly. Yet you keep pretending that it's not really relevant and just some mythical lesser "direction". So I come to the conclusion that what's really happening here is that you're preparing the code for SQLite and everything else is just an excuse to sneakily justify this step.
You say it's good for tooling? Well, what tools did you consider? All I see that it's somewhat helpful for diffing and very simple viewers/renderers.
You say settings need to be reordered in ini file, why so? For all I see there is not much problem with existing system, if anything the fact that settings are already split in multiple files makes it only harder to deal with. And how important reordering is anyway, why are you willing to sacrifice the convenience of other changes like renaming for it? Also you removed the reason of separating private settings from the motivation so I didn't comment on that but from what I understand you could've done that anyway as they are not stored in the save file. And even if you want to reorder and split everything so badly why not just add an explicit order? Or some calculated order? It's a minor change and much easier to maintain than name and type and old order that you need anyway. Why does the savegame format itself has to be orderless to make code better? It could always use the same mechanism it uses for old savegames and get rid of all the logic doubling for even cleaner code. Also, it would be much better if instead of abstract words you showed some actual code that could benefit from having an extra orderless format.
And I finish with advantages of the format because that's what I could see as possible reasons for this change yet they're left completely out of scope of this pr. I could offer some suggestions on how to make it more useful for my patchpack, for example, but what's the point if that's not an official goal of this pr?
P.S. Basically, I do a lot of stuff that's affected by this pr: tooling, pathpack, prs and from what I see the marginal benefits it brings are grossly overwhelmed by the extra complexity it adds. That's why "I do not like it".
Holy crap dude ... you have some nerve. I spend a lot of time on you, explaining this in several ways. And STILL you insist I lie to you and try to deceive you? This is a problem on your side, not mine. Feel free to assume everyone is out to get you, but in reality, what I said earlier is true, and nobody is pretending. But .. when someone says things like you said, in my experience, it is pointless to argue, as you already made up your mind. I can only repeat earlier statements: this is not true. Or to put it in modern words: you created an alternative fact. Feel free to bite in it, but in reality, it does not exist.
I think I said all I can say about this.
Just to help fellow developers understand that this is a complete alternative reality that is being created here, lets digest:
I cannot state enough that you do not seem to understand what this PR does. And I run out of energy to explain .. all you currently do is confuse other developers by making them think things are happening that are not really happening. I really did my best trying to explain you things, but as you keep skipping the "ask questions" part, and keep jumping to the "I made up my own story here", it is REALLY difficult to explain this, otherwise pretty simple, PR.
Let me paraphrase what you write: "I am not saying you are lying, but I am thinking it". Yes. Great job dude. you made this a whole lot better.
Just because something doesn't compute in your worldview, doesn't mean we are trying to hide stuff and sneak things in here. If I would want that, I would have gone about this completely differently, and never even mention it. I mean .. wtf dude.
We are back to what we talked about before ... you saying weird stuff, to which we have to defend. Again I advise you to ask questions. Be curious. "how do you see this happening?", "have you considered this?", etc. That is a conversation worth having. This constant bashing isn't gaining anything to anyone .. and all you get in return is this shitty post of mine as I run out of energy to make it polite. And I am sorry about that, as I hate to be like this .. but I am also out of ideas here.
This is such a shitty thing to say. You seem to forget this is not a company building a game. We are Open Source. Most of us do shit because we enjoy doing it. So how does this remark contribute to that? It is just shitting over this PR for no good reason.
First, you bitch you want a motivation. I write a lengthy motivation and description; next you bitch you have an idea but don't want to share because it is not an "official" goal of this PR? Wth? How did you drift so far away? Please get back to shore, where we can all walk in the field of dandies and enjoy the sunset. There really is no need for this .. if you have a good idea, share it. We do not have "official goals". I wrote this PR because I had an issue related to this in another PR .. that happens with rabbit holes. And others suggested to broaden the scope a bit. I don't mind. I enjoyed writing this PR. So now this is here. And if you have a good idea that would make it even better, share that idea.
I really think you somewhere took a turn, and got in a bad vibe when it comes to this PR, and completely filter everything you read in relation to it in a very dark black colour. That is really not needed, and really not as constructive as you might think. Please, come back to shore .. please, let's walk in dandies ... please don't insult me and call me a liar. I would really really appreciate that. Thinking we have alternative motives is .. really not true, as we are very very open about our aims and goals.
You realise that when I tried to talk about that lengthy motivation you bitched that I'm doing it wrong and now you're suggesting I should instead talk about things that weren't even mentioned anywhere before?
If pathes/patchpacks are a part of this PR, add that to motivation. If tooling/forward compatibility is, add it. And if SQlite isn't why is it there?
This "is a lie". You change the core principle of how the data in the format is identified. Instead of it being done by order it's now done by string id/name. You constantly try to downplay the significance of it but to me that's the very definition of format change, adding sqlite after that is just a technicality. Everything else I talk about is just a consequence of this change.
This "is a lie". If that was the case why did you change the way the game parses it? You could just use the same old way simplifying the codebase. Where are any guarantees that field order can still be used for identifying the field? For old format that was it's very nature, for the new one it's the side effect of the implementation if anything.
This "is a lie". That is a very strong forward compatibility guarantee. But not only there is not a word about it in this PR but I don't even see how you could possibly keep it for any tool that actually works with data rather than just dumping it mindlessly like your viewer. For example, how is it supposed to be forever compatible if fields it works with are removed?
Here are some basic changes I could think of that could be done to the data:
For most of them, there are some examples in the changelog for the old system. Where are examples or explanations for the new one? You seem to be trying to say at the same time that any change is done exactly as before and that the new system doesn't require a savegame version bump now. Those two statements contradict each other, so where's the truth? For all I see renaming is pretty much impossible now as aliasing or whatever is required for it isn't even there and if it's added it's likely going to affect other operations as well due to the necessity of avoiding name conflict between active fields and old aliases.
If tooling is the goal, how are tools supposed to be made to "forever be compatible" with these changes?
You assign way too much value to "the goal of this PR". That is not how OpenTTD works, or I think Open Source in general. We do stuff because we had an itch, we scratch that itch, that results in some code changes. You keep trying to force me to make up motivations why I did this PR, but I am not going to do that. I am not going to retroactively make up stuff .. that is not how I roll.
Why I wrote this PR, why it exists, all has been said and done. You clearly don't believe it to be true, that is your right. But there really is nothing more to say about this. Please stop trying to corner me in the hope I say something so you can go: "HA! Now I got you!". That is weird.
If you see other possibilities for this PR we hadn't considered, speak up. If you think other doors are closed, speak up. But don't keep saying: "it is not part of the motivation so I am not going to talk about it". That is not what the word "motivation" means.
I am really unsure where you got this from, as this really is not what is going on. At the risk of repeating myself: this PR adds a header to each chunk to indicate the order of fields. That is all. It is an extra step in the save/load process to store the order in the savegame, instead of only in the code. This decouples code-field-order from savegame-field-order.
Again I think you are misunderstanding this PR. I am pretty sure I know the code I wrote .. and I am very sure (said sarcastically) you can skip the header and still be able to read the savegame like you used to. This is a rather fundamental part of this PR. See the markdown document about the savegame format for more details. But to repeat myself: the only thing that changes on disk is that
This are two totally different things you throw together. So lets split them apart. No, this is not a lie, as that is what this PR actually does: add information to the savegame so the order of fields can be understood. This means that if we reorder fields, no tooling has to suffer.
The other thing you ask: but what if tooling tries to understand the records, more than just read. Interpretation of chunk-data is always up to the tool of course, as there is nothing we can otherwise hint to what its function is. What I am a bit puzzled about is what you see as an issue here. This was already the case, and will remain the case. If a field is removed, and the tool was putting a value to it, nothing is changing: before this PR they would have to adjust, and after this PR they would too. I must be missing something here, as this seems a partial question of a use-case you have in your mind. But as you talk a bit in riddles, I am left to guess what that use-case is.
So lets take an example: we remove the
The only edge-case, which I pointed out a few posts back, is when we start renaming fields. That is the bit awkward part in general, and we have to find a decent solution if/when we do this. But as it is rather unusual / unlikely for fields to be renamed, this is not a concern we currently carry.
Maybe you misread what I write about "compatible". I am talking about how data is stored in the savegame, how you can read the information out of it. Not about interpretation of the data itself. That is, and most likely always will be, a problem of the tooling. But this PR doesn't change that in any way shape or form. That was an issue before this PR, and that will remain an issue after. Most likely even till the end of time.
This is a lot of things mixed together, and you pick a single one out of your lists and say: this is not handled by this PR so this is wrong. But that is heavily misrepresenting this. And I think this is why you see a contradiction where there isn't one.
The description also mentions all of these cases, but lets go through it again, worded slightly different:
As for "there are no examples" argument, I hope you agree that for 1, 2, and 3, examples are .. not useful. I mean, 1) is changing swapping two lines of code. 2) is adding a new line of code. 3) is removing a line of code. Hardly worth an example. This is why the description puts it in words.
You seem to misunderstand what I wrote: tools can read the savegame in full with this PR for ever to come. This in contrast to before this PR, where it is absolutely needed to know each chunk in order to digest the savegame safely. What the fields and values represent, that is of course another story. But that is not what I was referring to. Being able to always safely read the savegame is a result of this PR.
I have no clue what you try to ask here. This is very vague and could mean a lot of things. You would have to be a bit more specific.
Despite the tone of these questions, I am happy that you are asking questions. I can answer questions. I cannot answer assumptions .. as I would first need to understand what the assumption was you made. So keep asking questions, that is a good thing. Tnx for that!
I'll skip most of the things here, but if reordering should be allowed without any savegame bump, any tool is required to use and parse the chunk header, somewhat contradictory to saying that tools can just skip the header and read it as before.
Not what I meant, but fair point. What I meant to say: you can skip the header and do it the old-fashioned way, where you of course need to know the order of the fields to decypher the bytes. That was the case before this PR, and is the case after. So if you don't want to use the header, you still can.
Basically what I try to say: this PR really only adds a header to assist reading. By no way it is mandatory. But of course not using it does require you to know the order (and type) of each record another way.
Does that clear it up a bit?
Edit: on third read, you might mean that without a savegame bump you cannot know the order anymore. That is a fair point. So yeah, that was a silly statement of mine that you can skip the header .. after the first reorder, you can't practically do that anymore. Fair point :)
We won't be able to make it fully self-descriptive (looking at you MAP-chunks), but anything else can. With this framework, we can add headers for each chunk explaining how each chunk looks like in detail. They also will all be tables, making it a lot easier to read in external tooling, and opening the way to consider a database (like SQLite) to use as savegame format. Lastly, with the headers in the savegame, you can freely add fields without needing a savegame version bump; older versions of OpenTTD will simply ignore the new field. This also means we can remove all the SLE_CONDNULL, as they are irrelevant. The next few commits will start using this framework.
We no longer need them. If you want to remove a field .. just remove it! Because of the headers in the savegame, on loading, it will do the right thing and skip the field. Do remember to bump the savegame version, as otherwise older clients can still load the game, but will reset the field you have removed .. that might be unintentially.
IsSavegameVersionUntil() did a [0, N] check, not [0, N) as the name suggests. Until can be a confusing word, where people consider it to be including the upperbound. Dictionary states it means "before", excluding the upperbound. There are long debates about who is right. So, simply remove away from this ambiguity, and call it "before" and "before or at". This makes the world easier for everyone.
…eries::Load num_liveries indirectly contained the same information, but this makes reading these things pretty difficult. So use IsSavegameVersionBefore() like everywhere else instead.