Skip to content

Conversation

@Legoclones
Copy link
Contributor

@Legoclones Legoclones commented Nov 20, 2024

Removed the check to see if there's more than one item on the stack when the STOP opcode is reached, as this isn't enforced by pickle.


print("highest protocol among opcodes =", maxproto, file=out)
if stack:
raise ValueError("stack not empty after STOP: %r" % stack)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if removing the exception makes sense or not. But if we decide to change the code, I would prefer to keep the information: maybe replace raise ValueError() with print().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that exception is raised after printing all output. You do not lose any information. You can catch and ignore this exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so we're not worried about this then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can catch and ignore this exception.

It's not easy when using pickletools CLI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using CLI you get some output. Whether it a Python traceback or a single line, -- it does not matter much, since such case is exceptional. The CLI output is not formalized, anything is fine.

We could add a mechanism for outputting non-critical errors, but since this particular error is raised after finishing all output, it would not make much difference.

@vstinner vstinner changed the title gh-127079: Remove check for items on stack gh-127079, pickletools: Remove check for items on stack Nov 25, 2024
@tim-one
Copy link
Member

tim-one commented Nov 25, 2024

This check in pickletools was intentional (I wrote the code). While it might "be nice" if all unpicklers complained about exactly the same things, pickletools intended to enforce what Guido and I and Jim Fulton all agreed was the original intent. So if there's a bug here, it's that other unpicklers don't complain about this.

In effect, pickletools is a kind of "executable documentation" as much as it is "a tool". It's intended to be normative: if a pickler produced a pickle that failed this test, that would be considered to be a bug (in the pickler). If anything is left on the stack at STOP time, the pickler lost track of what it intended to do, or the pickle stream got corrupted.

@serhiy-storchaka
Copy link
Member

pickletools does a poor work as a sanity checker. It accepts many cases rejected by unpicklers, and unpicklers today make better error reports. So I think that both tools complement one other -- you get an error in unpickler and then look at the code that led to this.

@tim-one
Copy link
Member

tim-one commented Nov 25, 2024

Yup, I agree that no code here is 100% reliable. At one time there was some faint hope that the pickle format could be made 100% "secure", but pickletools was written around the time it was accepted that it would never happen. Hence the

Warning The pickle module is not secure. Only unpickle data you trust.

in the docs now. Which isn't really the same thing, but related.

One way in which pickletools can fail to detect a "bad pickle" is that pickletools doesn't actually construct any objects beyond basic scalar types (int, float, string, ...). It's not really "an unpickler" at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants