Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow for large chains of shared_ptr (or smart pointers in general) #185

Closed
AzothAmmo opened this issue Apr 22, 2015 · 7 comments
Closed

Comments

@AzothAmmo
Copy link
Contributor

It is possible to encounter a stack overflow when serializing data structures that utilize a large amount of pointers. Our implementation essentially utilizes a recursive depth first search to serialize pointers which can easily overflow the stack if the number of elements is large.

This was encountered in a neural network library where shared pointers are used to track connections between neurons.

To fix this: pointer serialization needs to be transformed into an iterative algorithm probably with a local data structure holding lambdas to perform the serialization.

@AzothAmmo AzothAmmo added the bug label Apr 22, 2015
@AzothAmmo
Copy link
Contributor Author

@bloodcarter
Copy link

Any news on this bug? Looks like I've run into it too :(

@AzothAmmo
Copy link
Contributor Author

I've never taken the time to engineer a solution and have been using a workaround of setting my stack size to be larger (using ulimit on linux). I unfortunately do not have a lot of time to devote to cereal at the moment but hopefully will be able to make a solid push on various issues in the future. Solving this, as I mentioned in the original posting, involves a fairly significant re-write of how we perform serialization for recursive data structures.

@AzothAmmo
Copy link
Contributor Author

So I've run into this again myself and it is essentially blocking my own work. I did some searching and it's been noted as a problem for boost as well.

Unfortunately there really isn't a good way to resolve this - I don't think it is possible to create an iterative serialization pipeline within the framework cereal provides. For now I suggest working around this as best as possible by restructuring your data or deferring parts of your serialization to avoid deep recursion.

@AzothAmmo
Copy link
Contributor Author

I have an idea for a solution for this. It's not the ideal turn serialization into an iteration which would completely solve the problem, but it's a nice compromise.

I'm thinking of adding something like CEREAL_DEFER, which will cause the serialization to be skipped and done at a later time by calling some other function, or possibly passing something to the archive (like a fake serialization item that does all deferments).

This would work well for things like graphs where you have a lot of nodes and then non-owning connections between the nodes. Normally, with shared pointers, the first item serialized would do a DFS to repopulate every node reachable. With this method, you would defer your connection serialization and first serialize the nodes themselves. When it came time to do the connections, the nodes would already exist so there would be no recursion.

AzothAmmo added a commit that referenced this issue Aug 25, 2017
@AzothAmmo AzothAmmo added this to the v1.2.3 milestone Aug 25, 2017
@AzothAmmo
Copy link
Contributor Author

So cereal::defer effectively solves this if you can structure your serialization or code appropriately. In my use case, for example, I have a large neural network defined by neurons and connections between them - basically a graph structure. I defer the serialization of the connections until the nodes themselves have been serialized, thus preventing deep recursion on traversing the graph.

I still need to clean up and document this feature, but I'm really liking it so far.

Note to self: breaks if an NVP is nested in a defer.

@AzothAmmo
Copy link
Contributor Author

Turns out there likely is a decent solution to go iterative but it would turn serialization into a two-stage process. It would be a lot of work overall and come with various pros and cons, so it's something I'll consider in a future release. For now I think the defer concept is a good substitute.

WSoptics pushed a commit to WSoptics/cereal that referenced this issue Sep 1, 2017
WSoptics pushed a commit to WSoptics/cereal that referenced this issue Mar 15, 2018
jrmadsen pushed a commit to jrmadsen/cereal that referenced this issue Jul 7, 2019
jrmadsen pushed a commit to jrmadsen/cereal that referenced this issue Jul 7, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 17, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 17, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 20, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 21, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 23, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 23, 2019
AzothAmmo added a commit to AzothAmmo/cereal that referenced this issue Oct 24, 2019
JerryRyan pushed a commit to JerryRyan/cereal that referenced this issue Jan 6, 2020
fryguy503 added a commit to wayfarershaven/server that referenced this issue Jan 11, 2023
Cereal updated for work being done in another branch. This bump fixes some issues with the library. Tested

v1.3.1

Bug fixes and minor enhancements:
Fix typo in docs by @tankorsmash in Fix typo in docs USCiLab/cereal#597
Add MSVC 2019 to build, default ctor for static object by @AzothAmmo in Add MSVC 2019 to build, default ctor for static object USCiLab/cereal#593
Fix json.hpp compilation issue when int32_t is a long by @bblackham in Fix json.hpp compilation issue when int32_t is a long USCiLab/cereal#621
[cpp20] explicitly capture 'this' as copy by @lukaszgemborowski in [cpp20] explicitly capture 'this' as copy USCiLab/cereal#640
Fix rapidjson for Clang 10 by @groscoe2 in Fix rapidjson for Clang 10 USCiLab/cereal#645
Fixes to prevent clang-diagnostic errors by @johngladp in Fixes to prevent clang-diagnostic errors USCiLab/cereal#643
cleanup cmake files to be a little more moderen by @ClausKlein in cleanup cmake files to be a little more moderen USCiLab/cereal#659
GHSA-wgww-fh2f-c855: Store a copy of each serialized shared_ptr within the archive to prevent the shared_ptr to be freed to early. by @serpedon in CVE-2020-11105: Store a copy of each serialized shared_ptr within the archive to prevent the shared_ptr to be freed to early. USCiLab/cereal#667
add license files for components of cereal by @miartad in add license files for components of cereal USCiLab/cereal#676
Catch short documents in JSON input by @johnkeeping in Catch short documents in JSON input USCiLab/cereal#677
C++17: use inline globals for StaticObjects by @InBetweenNames in C++17: use inline globals for StaticObjects USCiLab/cereal#657
Use std::variant::emplace when loading by @kepler-5 in Use std::variant::emplace when loading USCiLab/cereal#699
Use std::optional::emplace() when loading non-empty optional by @kepler-5 in Use std::optional::emplace() when loading non-empty optional USCiLab/cereal#698
Fix itsNextName not clearing when not found + style change by @AzothAmmo in Fix itsNextName not clearing when not found + style change USCiLab/cereal#715
Update doctest to 2.4.6 + local fixes slated for upstream by @AzothAmmo in Update doctest to 2.4.6 + local fixes slated for upstream USCiLab/cereal#716
Fixed loading of std::vector by @Darred in Fixed loading of std::vector<bool> USCiLab/cereal#732
Update license to match BSD template by @AzothAmmo in Update license to match BSD template USCiLab/cereal#735
Update doctest to 2.4.7 by @AzothAmmo in Update doctest to 2.4.7 USCiLab/cereal#736
Use GNUInstallDirs instead of hard wiring install directories by @antonblanchard in Use GNUInstallDirs instead of hard wiring install directories USCiLab/cereal#710
This is not an exhaustive list of changes or individual contributions. See the closed issues or complete changelog for more information.
v1.3.0

New features include:

Deferred serialization for smart pointers (Stack overflow for large chains of shared_ptr (or smart pointers in general) USCiLab/cereal#185)
Initial support for C++17 standard library variant and optional (thanks to @arximboldi, Add serialization support for C++17 std::optional and std::variant USCiLab/cereal#448)
Support for std::atomic (thanks to @bluescarni, Implementation and testing of std::atomic serialization. USCiLab/cereal#277)
Fixes and enhancements include:

Vastly improved continuous integration testing (Appveyor updates + boost testing fixes USCiLab/cereal#568, Update Travis CI USCiLab/cereal#569)
Fixed several issues related to compilation on newer compilers (Fixing various compilation warnings USCiLab/cereal#579, Add fall through comments to json.hpp USCiLab/cereal#587, Fix warning unused private member itsValueItEnd USCiLab/cereal#515)
Fixed warnings with -Wconversion and -Wdocumentation (thanks to @WSoptics, Develop USCiLab/cereal#423)
Performance improvements for polymorphic serialization (PolymorphicVirtualCaster StaticObject instantiation takes a very long time at app startup USCiLab/cereal#354)
Minor fixes and enhancements include:

Fixed a bug related to CEREAL_REGISTER_DYNAMIC_INIT with shared libraries (thanks to @m2tm, Issue correctly using CEREAL_REGISTER_DYNAMIC_INIT USCiLab/cereal#523)
Avoid unnecessary undefined behavior with StaticObject (thanks to @erichkeane, Change StaticObject instance management to hopefully avoid UBSAN USCiLab/cereal#470)
New version.hpp file describes cereal version (detect cereal version at compile time / version.hpp USCiLab/cereal#444)
Ability to disable size=dynamic attribute in the XML archive (thanks to @hoensr, Add option to turn off the size=dynamic attributes of lists USCiLab/cereal#401)
Other Notes
The static checking for minimal serialization has been relaxed as there were many legitimate cases it was interfering with (thanks to @h-2, remove const check from load_minimal USCiLab/cereal#565)
The vs2013 directory has been removed in favor of generating solutions with CMake (Remove vs2013 directory USCiLab/cereal#574)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants