Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::shared_ptr serialization asymmetry (depends on memory layout) #636

Closed
guidovranken opened this issue Mar 27, 2020 · 4 comments · Fixed by #667
Closed

std::shared_ptr serialization asymmetry (depends on memory layout) #636

guidovranken opened this issue Mar 27, 2020 · 4 comments · Fixed by #667

Comments

@guidovranken
Copy link

Cereal employs caching of std::shared_ptr values, using the raw pointer as a unique identifier. This becomes problematic if an std::shared_ptr variable goes out of scope and is freed, and a new std::shared_ptr is allocated at the same address. Serialization fidelity thereby becomes dependent upon memory layout.

#include <cereal/archives/binary.hpp>
#include <cereal/types/memory.hpp>

int main(void)
{
    std::stringstream ss;

    {
        cereal::BinaryOutputArchive archiveOut(ss);
        {
            std::shared_ptr<bool> v = std::make_shared<bool>(true);
            archiveOut(v);
            printf("v is %p\n", v.get());
            printf("serialized: %d\n", *v);
        }
        {
            std::shared_ptr<bool> v = std::make_shared<bool>(false);
            archiveOut(v);
            printf("v is %p\n", v.get());
            printf("serialized: %d\n", *v);
        }
    }

    {
        cereal::BinaryInputArchive archiveIn(ss);
        std::shared_ptr<bool> v1, v2;
        archiveIn(v1);
        printf("deserialized: %d\n", *v1);
        archiveIn(v2);
        printf("deserialized: %d\n", *v2);
    }
    return 0;
}

Output is:

v is 0x5578c0144ec0
serialized: 1
v is 0x5578c0144ec0
serialized: 0
deserialized: 1
deserialized: 1

The input is (true, false) but the output is (true, true).

@ffontaine
Copy link

This issue has been assigned the following CVE number: CVE-2020-11105

@InBetweenNames
Copy link
Contributor

Yeah unfortunately, the control block of std::shared_ptr doesn't encode a unique identifier that can be retrieved to prevent this from happening. Allocators also aren't required to always return unique addresses either, nor is there a way to enforce that. It seems like a hotfix should be released that specifically disables std::shared_ptr overloads provided by Cereal in favour of having users define their own, as I'm not sure this is something that can be safely handled at the library level. Serializing without caching would invalidate invariants from many-to-one relationships that could surprise users of std::shared_ptr upon deserialization, too.

Alternatively, one could document that correct usage of std::shared_ptr in Cereal requires all serialization to be done after all std::shared_ptrs are created and before any of them are destroyed. E.g., all serialization happens at exactly one point in time in the program. For programs that uses serialization to store and load global state upon startup and shutdown, this would probably be okay to use.

Cereal already provides a mechanism to disable or override it's own handling of STL types, so the current implementation could be made opt-in rather than opt-out for safety.

@serpedon
Copy link
Contributor

Either I am overlooking something or the fix of this problem is quite straight forward, see patch proposal in linked pr #667.

My line of though was the following:
As written already above, correct usage of std::shared_ptr in Cereal requires that the shared_ptr is still valid at the point when all serialization occurs, usually at the end of the lifetime of the archive. It was suggested to document this constraint to the user, but since we are already dealing with smart pointers, I though, hey, let's implement this constraint by storing our own copy of the std::shared_ptr.

Am I right and it is this easy, or am I overlooking something?

@InBetweenNames
Copy link
Contributor

I think this makes sense, it should be documented so that users understand how it will affect the lifetime of their smart pointers, but the approach should fix the CVE. I'm kicking myself for not thinking of it sooner!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants