Force-discard cache if cache format changed #20152

ilevkivskyi · 2025-10-31T15:21:25Z

If either the low-level (i.e. librt) or the high-level cache format changes, discard the cache. Note that I intentionally don't use librt to read/write the first two bytes of the cache meta, so we are 100% sure we can always read them.

JukkaL

Looks good, I only have a few potential improvements and a question.

JukkaL · 2025-10-31T16:23:27Z

mypy/build.py

+        if meta[0] != cache_version() or meta[1] != CACHE_VERSION:
+            manager.log(f"Metadata abandoned for {id}: incompatible cache format")
+            return None
+        data_io = Buffer(meta[2:])


The slice operation makes an almost full copy of the meta buffer. It's probably not a big deal, but it would be nice if we could avoid it. What about adding a read_byte operation to the internal API (for now it could just be an alias for read_tag, I guess) that could be used to read the initial two bytes? (No need to do this in this PR.)

hauntsaninja · 2025-10-31

Could also maybe slice the memoryview?

ilevkivskyi · 2025-11-01

FWIW I just measured this slice in a micro-benchmark: it takes around 0.1 microseconds per file (interpreted). So even with 10K files in the build this adds about one millisecond. It is probably even less when compiled. I will add a TODO here and below.
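A minimal sketch of the memoryview idea, for illustration only. The version constants and `split_header` are hypothetical names, not mypy's; whether librt's `Buffer` accepts a memoryview is not established in this thread, so the sketch stops at producing the view.

```python
from __future__ import annotations

# Hypothetical stand-ins for cache_version() and CACHE_VERSION.
LOW_LEVEL_VERSION = 1
HIGH_LEVEL_VERSION = 2


def split_header(meta: bytes) -> tuple[int, int, memoryview]:
    """Return the two version bytes and a zero-copy view of the rest."""
    if len(meta) < 2:
        raise ValueError("cache meta is too short to contain a version header")
    # Slicing a memoryview creates a new view object over the same
    # underlying bytes; slicing the bytes object itself would copy them.
    return meta[0], meta[1], memoryview(meta)[2:]


meta = bytes([LOW_LEVEL_VERSION, HIGH_LEVEL_VERSION]) + b"payload"
low, high, payload = split_header(meta)
```

Indexing `meta[0]` and `meta[1]` yields plain integers, so the version check itself never needs a slice.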

JukkaL · 2025-10-31T16:25:43Z

mypy/build.py

        meta.write(data_io)
-        meta_bytes = data_io.getvalue()
+        # Prefix with both low- and high-level cache format versions for future validation.
+        meta_bytes = bytes([cache_version(), CACHE_VERSION]) + data_io.getvalue()


Similar to the above, the concatenation makes a full copy of the cache data. We could add a write_byte operation to write the version bytes to a Buffer object in a future-proof way.
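A rough sketch of the "write the header first" idea, using `io.BytesIO` as a stand-in for librt's `Buffer` (the proposed write_byte operation does not exist yet, so nothing here is the real API). `serialize_with_header` and `write_payload` are hypothetical names.

```python
import io
from typing import Callable


def serialize_with_header(
    low: int, high: int, write_payload: Callable[[io.BytesIO], object]
) -> bytes:
    """Write two version bytes, then let the caller append the payload."""
    out = io.BytesIO()
    # The header goes into the buffer before the payload, so no
    # bytes + bytes concatenation is needed afterwards.
    out.write(bytes([low, high]))
    write_payload(out)
    # getvalue() still copies once, as it did before the prefix was added.
    return out.getvalue()


result = serialize_with_header(1, 2, lambda out: out.write(b"abc"))
```

Compared with `bytes([...]) + data_io.getvalue()`, this saves the second full copy that the concatenation performs.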

JukkaL · 2025-10-31T16:28:55Z

mypy/cache.py

 conventionally *does not* read the start tag (to simplify logic for unions). Known exceptions
 are MypyFile.read() and SymbolTableNode.read(), since those two never appear in a union.
+
+If any of these details change, please bump CACHE_VERSION below.


Do we need to bump CACHE_VERSION also if the structure of the meta cache serialization format changes? We might not be able to read the mypy version info field from the meta file.

ilevkivskyi · 2025-11-01

Yes, we do need to bump it. It will most likely fail with ValueError, which is caught, so the cache will be discarded, but the log line will be wrong. I will update this.
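To illustrate why the ordering matters, here is a sketch under stated assumptions: JSON stands in for the real meta serialization, and `load_meta` and its log messages are hypothetical. Checking the version bytes before deserializing lets a format change be reported as such, instead of surfacing as a generic ValueError.

```python
import json
from typing import Callable, Optional


def load_meta(
    raw: bytes, low: int, high: int, log: Callable[[str], object]
) -> Optional[object]:
    """Validate the two-byte version header, then deserialize the rest."""
    if len(raw) < 2 or raw[0] != low or raw[1] != high:
        log("incompatible cache format")
        return None
    try:
        # JSON is only a placeholder for the real meta format.
        return json.loads(raw[2:].decode("utf-8"))
    except ValueError:
        # Covers both UnicodeDecodeError and json.JSONDecodeError.
        log("failed to deserialize")
        return None


messages: list = []
ok = load_meta(bytes([1, 2]) + b'{"a": 1}', 1, 2, messages.append)
stale = load_meta(bytes([9, 2]) + b'{"a": 1}', 1, 2, messages.append)
broken = load_meta(bytes([1, 2]) + b"\xff", 1, 2, messages.append)
```

If the meta layout changed without a version bump, the last branch would fire and the log would blame deserialization rather than the format change, which is the mismatch discussed above.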

github-actions · 2025-11-01T00:46:40Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Force-discard cache if cache format changed

dc38a13

ilevkivskyi requested a review from JukkaL October 31, 2025 15:21

JukkaL approved these changes Oct 31, 2025

Update docstring, add TODOs

a480323

ilevkivskyi merged commit 92101f3 into python:master Nov 1, 2025
21 checks passed

ilevkivskyi deleted the use-cache-version branch November 1, 2025 00:57
