varint decoding alignment for overlong encodings by jmestwa-coder · Pull Request #2045 · abseil/abseil-cpp

jmestwa-coder · 2026-05-04T17:08:12Z

Summary

Fix varint decoding where overlong encodings (>10 bytes) leave extra continuation bytes unconsumed, causing cursor misalignment.

Problem

When a varint exceeds the maximum size, the decoder stops early without consuming remaining bytes.
This misaligns the cursor and corrupts parsing of subsequent fields.

Fix

Consume remaining continuation bytes for overlong varints to keep the decoding cursor aligned.

Impact

Corrects parsing of fields following malformed varints
No change for valid inputs

for the benefit people using of AI to scan the code This will hopefully prevent pull-requests like #2045 that see a local problem with DecodeVarint() without understanding the larger picture. See #2045 PiperOrigin-RevId: 910971345 Change-Id: I76e1524e4577799ae391716928f10b7fff82e112

derekmauro · 2026-05-06T13:55:25Z

I'm 99% certain this pull request was generated by an AI. That is fine; a pull request should be judged on its merits, not its author.

The reason I suspect an AI is that it identified a local issue in a varint decoder, a niche topic that most human contributors would not even be familiar with. However, by failing to see the context, specifically that encoding and decoding occur in a closed loop, the AI missed that all inputs are trusted. Since we assume valid varints, we don't need a robust protocol buffer encoder/decoder implementation. Our simple version is sufficient, and anything more adds unnecessary complexity.

Even if the inputs were untrusted, this would represent a massive security vulnerability that this PR fails to address properly. The discrepancy between the sophisticated identification of the problem and the poor quality of the fix is a hallmark of AI. A human developer would likely have reported the issue rather than submitting a shallow fix.

A proper solution to untrusted inputs would require a fail-fast mechanism to signal errors to the caller. This would necessitate a significant refactoring of the DecodeVarint() function signature, which currently cannot signal failure. I have yet to see an AI initiate that kind of structural change. Instead, this PR applies an arbitrary rule for cursor movement upon encountering invalid data, which does nothing to mitigate the security and data corruption problem.

97aea89 adds a comment to this file to provide context for both human and AI contributors, hopefully preventing similar misunderstandings in the future.

varint decoding alignment for overlong encodings

bbb99dd

derekmauro closed this May 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

varint decoding alignment for overlong encodings#2045

varint decoding alignment for overlong encodings#2045
jmestwa-coder wants to merge 1 commit into
abseil:masterfrom
jmestwa-coder:varint-decoder-overlong-alignment

jmestwa-coder commented May 4, 2026

Uh oh!

derekmauro commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jmestwa-coder commented May 4, 2026

Summary

Problem

Fix

Impact

Uh oh!

derekmauro commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants