WHY NOT: serialVersionUID

Reinier Zwitserloot edited this page Dec 23, 2018 · 3 revisions

What is the problem?

Many java checking tools, including IDEs and various linters, as well as tutorials, templates, and code generators, strongly suggest that:

  • Your classes, if Serializable, should have an explicit serialVersionUID field.
  • Tools can generate this field for you.

They usually do this by generating a warning if the field is missing.

Note that many classes are serializable; any time you extend a class that is serializable, your class is serializable too. For example, any time you extend java.lang.Exception, you're writing a serializable class.

The first claim is dubious, even though the Serializable documentation does mention this. The second claim is flat out wrong. As a consequence, lombok will never generate this field for you, and any request that lombok do so will be denied. There is no point in doing that; if you think there is, you probably misunderstand the serialVersionUID mechanism.

What does serialVersionUID do?

Java's built in serialization mechanism turns instances into bytes, letting you save them to disk or send them to another JVM via a network connection. However, the entire mechanism is designed around the notion that the 'receiving' JVM has the exact same .class files available to it, as the 'sending' JVM. When saving to disk, that implies the application that's using java's serialization mechanism to save objects to disk never changes, and for network setups, that both sending and receiving JVM have the exact same binary version.

That's because the JVM has no (good) recourse for mismatches; it cannot handle them automatically. If the version you serialize with has a field named 'foo' and the class file you're deserializing with does not, the JVM has no idea what to do. Worse, imagine the definition of what the 'foo' field holds has changed between versions: if java then tries to deserialize, the code silently does the wrong thing.

Thus, there is a mechanism: the serialized form of any given instance records both:

  • The fully qualified classname of the object being serialized, for example, java.util.ArrayList
  • The serialVersionUID of that class (that is, java/util/ArrayList.class)

When deserializing, there has to be some class named java.util.ArrayList available to load, and its serialVersionUID must match exactly.
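The check can be observed without keeping two versions of a class around. The following is a minimal sketch, not how a real version mismatch arises: it serializes a hypothetical Point class, then flips one bit of the serialVersionUID stored in the byte stream to imitate a receiver whose class has a different value. The class and method names are made up for illustration, and the byte offset assumes the serialized object is a plain class instance at the very start of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class UidMismatchDemo {
    // Hypothetical class; no serialVersionUID field, so the JVM computes one.
    static class Point implements Serializable {
        int x = 3;
        int y = 4;
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Stream start: 2-byte magic, 2-byte version, TC_OBJECT, TC_CLASSDESC,
    // 2-byte class name length, the class name, then the 8-byte serialVersionUID.
    static byte[] withTamperedUid(byte[] data) {
        byte[] copy = data.clone();
        int nameLength = ((copy[6] & 0xFF) << 8) | (copy[7] & 0xFF);
        int uidOffset = 8 + nameLength;
        copy[uidOffset + 7] ^= 1; // flip the lowest bit of the stored UID
        return copy;
    }

    // Returns true if deserialization fails the class name + serialVersionUID check.
    static boolean rejected(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Point());
        System.out.println("untouched stream rejected: " + rejected(data));
        System.out.println("tampered stream rejected:  " + rejected(withTamperedUid(data)));
    }
}
```

The untouched stream deserializes normally; the tampered one is refused with an InvalidClassException before any field data is read.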

ALL serializable classes have a serialVersionUID. If you have a static final long serialVersionUID field, it's the value of that field. If your class does not have that field, it's a hash computed from the structure of the class (its name, interfaces, fields, and methods), which is the value that the serialver tool (still part of the binaries of a standard JDK distribution!) prints.
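The value that applies to a class can also be read at runtime. A minimal sketch, assuming two hypothetical classes named Explicit and Implicit: java.io.ObjectStreamClass is part of the JDK, and the number it reports should match what serialver prints for the same class file.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidLookup {
    // Hypothetical class with an explicit field: the UID is simply the field's value.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // Hypothetical class without the field: the JVM derives the UID from the class structure.
    static class Implicit implements Serializable {
        String name;
    }

    static long uidOf(Class<?> type) {
        // lookup returns null for classes that are not serializable; both classes here are.
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Explicit: " + uidOf(Explicit.class));
        System.out.println("Implicit: " + uidOf(Implicit.class));
    }
}
```

The first line prints 42; the second prints the computed hash, which stays the same across runs for as long as the class itself is unchanged.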

This messing about with serial versions occurs even if you use the serialization 'escape' mechanism of writeReplace/readResolve, or writeObject/readObject; both of those mechanisms kick in only after fully qualified classname+serialVersionUID checks have run.

Why does that mean everybody's wrong about it?

The right way to use serialVersionUID is therefore clear: When you write your first version of some to-be-serialized class, the default serial version, as printed by the serialver tool, is just fine. So long as this is the only version that exists anywhere, serialization will never fail due to serialVersionUID issues.

The day you update your source file such that a new, second version will be out there, there are 4 things you can do:

  1. Forget all about two-way binary compatibility. Let's be honest, this is how it usually goes down.
  2. Review the need and possibility of two-way binary compatibility, and decide that you do NOT want two-way binary compatibility.
  3. Review the need and possibility of two-way binary compatibility, and decide that you DO want it. You write tests to ensure this, and you probably end up having to use the escape mechanisms of readObject/writeObject or writeReplace/readResolve to ensure it.
  4. Realize java's serialization mechanism is really bad and try your best to ignore its existence.

In case of option 1, it would be very bad if failing to spend any time thinking about this resulted in security leaks. There have been many security leaks in java apps due to serialization issues, so this is not a theoretical worry!

Having the serialVersionUID change any time you change a signature in the source file is a sane default: as long as your class does not have an explicit serialVersionUID field, a new version that you forgot to review for binary compatibility ends up not being binary compatible.

This is why (almost) all tools are wrong: They ship with code quality plugins that flag serializable classes without a serialVersionUID field as wrong. This is incorrect; the vast majority of your serializable classes should not have a serialVersionUID field, because this gets you the desired default behaviour of no binary compatibility.

In case of option 2, where you did review things and explicitly decided that you do not want binary compatibility, not having that field is again the right choice. Assuming you changed a signature that feeds into the computation (you added or removed a non-private method or constructor, or changed its name, modifiers, return type, or any parameter type, or you added or removed a field, or changed the type, name, or modifiers of a field, other than a private static or private transient one), the implicit serialVersionUID changed, and thus your two versions are automatically marked as incompatible. Private methods and throws clauses are not part of the computation, so a change limited to those leaves the implicit value as it was.

In case of option 3, you should put in some effort, and write some tests to ensure that you have two-way binary compatibility (serializing with v2 of your class results in data that correctly deserializes into v1, and vice versa). In this case, let v1 continue with its default implicit serialVersionUID (you have to; that class is already out in the wild and you can't change its behavior anymore), and mark v2 with an explicit serialVersionUID that matches v1. Use the serialver tool on v1 of your class, which you clearly still have access to, to find it.
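Option 3 can be sketched as follows. This is an illustration under stated assumptions, not a recipe: User is a hypothetical class whose v1 had only a name field, the serialVersionUID value is a placeholder for whatever serialver prints for your real v1, and the "unknown" default is an arbitrary choice. When a v1 stream is read into this v2 class, the missing email field arrives as null, which readObject then patches up; going the other way, v1 skips the field it does not know about.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class PinnedUidDemo {
    // Hypothetical v2 of a class whose v1 had no serialVersionUID field.
    static class User implements Serializable {
        // Placeholder: paste the value that serialver reports for v1 here.
        private static final long serialVersionUID = 1234567890123456789L;

        String name;
        String email; // new in v2; streams written by v1 do not contain it

        User(String name, String email) {
            this.name = name;
            this.email = email;
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (email == null) {
                email = "unknown"; // arbitrary default for data that lacks the field
            }
        }
    }

    // Serializes and deserializes in memory; real compatibility tests would
    // read bytes that were actually produced by the v1 class file.
    static User roundTrip(User user) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(user);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (User) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new User("alice", null)).email);
        System.out.println(roundTrip(new User("bob", "bob@example.com")).email);
    }
}
```

A user serialized with a null email stands in for v1 data here, because that is the state a v1 stream would leave the new field in.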

Unfortunately, once you have marked v2 as explicitly compatible, you lose your nice defaults: if you touch this source file a third time and forget to think about binary compatibility, your v3 is assumed by default to be binary compatible with v1 and v2, because the serialVersionUID field is still in there. There are no practical solutions to this problem. You are now committed to reviewing binary compatibility on every change, and if you fail to do so, security leaks and bugs will ensue.

In case of option 4, it mostly doesn't matter, but you definitely want your linting tools NOT to complain, and, hey, if something attempts to serialize things anyway, the best default is for that to fail as fast as possible. Leaving each serializable class with its implicit serialVersionUID is the best choice then.

In none of these scenarios can lombok generate anything useful for you.

Actually, don't use java's serialization mechanism

There are many, many security leaks in java apps that are due to serialization issues, and binary compatibility is only a tiny part of why serialization is such a security mess.

The format that the java serialization mechanism uses is technically open, but it's highly convoluted. If you ever want to replace the mechanism with something of your own making, the code to read/write this binary format is very complicated, and as far as I know, there are no nice libraries.

You will never be able to write a tool in not-java that can read or write this data.

Trying to set up a system where you correct past mistakes by reading in serialized data and writing it back out in a more sane format is very difficult and requires class loader shenanigans, as you need to keep a version of your old class around.

You're limited to either two-way binary compatibility (which means older versions will also attempt to load data written by newer versions, not usually what you want) or no compatibility at all, and these aren't good choices.

Don't use java's serialization mechanism.

Write your own explicit binary format, for example using protobuf to make it easy to write the code that reads and writes it, or use Jackson to serialize/deserialize your java objects into well known formats (for example, JSON); it offers a considerably better API that lets you configure exactly how your instance ends up as bytes on disk or on the network.
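To give an idea of what an explicit format looks like, here is a minimal sketch that uses only java.io.DataOutputStream and DataInputStream from the JDK instead of protobuf or Jackson. The User class, the FORMAT_VERSION constant, and the "unknown" default are all hypothetical. The point is that the version number and every field are written out by code you control, so reading old data is an ordinary branch rather than a serialVersionUID puzzle.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ExplicitFormatDemo {
    // Hypothetical format history: version 1 stored only the name, version 2 adds the email.
    static final int FORMAT_VERSION = 2;

    static final class User {
        final String name;
        final String email;

        User(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    static byte[] write(User user) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(FORMAT_VERSION); // the version is part of the data, in plain sight
            out.writeUTF(user.name);
            out.writeUTF(user.email);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static User read(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int version = in.readInt();
            String name = in.readUTF();
            switch (version) {
                case 1:
                    return new User(name, "unknown"); // old data has no email
                case 2:
                    return new User(name, in.readUTF());
                default:
                    throw new IOException("Unsupported format version: " + version);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        User user = read(write(new User("alice", "alice@example.com")));
        System.out.println(user.name + " " + user.email);
    }
}
```

Newer code reads older data by branching on the version, and older code rejects data it does not understand with a clear error, which is the one-way compatibility that java's mechanism does not offer.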

Configure your IDEs, commit hooks, and linting tools to disable any warning about missing serialVersionUID fields.