Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make SpEnoteStore serializable #25

Draft
wants to merge 21 commits into
base: seraphis_lib
Choose a base branch
from

Conversation

DangerousFreedom1984
Copy link

@DangerousFreedom1984 DangerousFreedom1984 commented Nov 22, 2023

The idea is to make the SpEnoteStore class serializable so we can store/load it into/from files, as it is really important for the wallet to recover the owned/spent enotes.
I tried to keep the serialization pattern used in seraphis (using binary_archive) and I did the following modifications:

  • Added new ser_* structs into serialization_demo_types.h
  • Added the corresponding functions at serialization_demo_utils
  • Modified the checkpoint_cache.* and removed the const of the variables there
  • Modified the enote_store:
    • added setter/getter functions to modify the private variables
  • Added the comparison operator for all the used serializable structs/classes
  • Added one unit_test to test the enote_store serialization

Depends upon:
monero-project#9069
monero-project#9077

Copy link

@rbrunner7 rbrunner7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took quite some work to grind through all those classes, nice that you took that on you, so we can just profit and have it comfortable :)

Nice code, but I had a number of question marks popping up.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question to further my knowledge: What's the connection between serialization and equality comparison operators? Are they needed for serialization in one way or another?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they are not needed for the serialization. The only purpose is testing. Maybe they could be useful too later.

/// setters
void set_from_serializable(const serialization::ser_CheckpointCache &serializable);
// void set_min_checkpoint_index(std::uint64_t min_checkpoint_index) {m_min_checkpoint_index = min_checkpoint_index;}
bool operator==(const CheckpointCache&other);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little nitpick: CheckpointCache&other looks strange to me without any blank. Saw it alread in one or two other places.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will fix it.


/// stored checkpoints
std::map<std::uint64_t, rct::key> m_checkpoints;
serializable_map<std::uint64_t, rct::key> m_checkpoints;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure my understanding is correct, but isn't turning this into serializable_map a "creeping in" of serialization into the library class that we want to avoid, with great effort, by using special serialization classes like here ser_CheckpointCache?

Not sure how it's already done as a pattern in other similar cases, and not sure whether having to copy maps one element at a time because of different type between this class and ser_CheckpointCache is a good trade-off for absolute "purity" here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to avoid the classes from serialization/containers.h ? We are still using binary_archive with them. What do we want avoid actually?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I am not the best person to ask, because I myself as the "master" over the Seraphis library probably wouldn't go through all the complications with those ser_ classes and add serialization instrumentation directly to the Seraphis library struct and classes ...

But now, as we do separate, I think we could well go the extra mile and keep any serialization stuff out of the Seraphis library, as a matter of principle, and for not mixing things.

Maybe one day somebody will want to serialize Seraphis library structures with a completely different technology than binary_archive where, at least in theory, such a serializable_map would be not helpful in a good case and a problem with some bad luck.

@@ -105,19 +106,22 @@ class SpEnoteStore final
bool has_enote_with_key_image(const crypto::key_image &key_image) const;
/// get the legacy [ legacy identifier : legacy intermediate record ] map
/// - note: useful for collecting onetime addresses and viewkey extensions for key image recovery
const std::unordered_map<rct::key, LegacyContextualIntermediateEnoteRecordV1>& legacy_intermediate_records() const
const serializable_unordered_map<rct::key, LegacyContextualIntermediateEnoteRecordV1>& legacy_intermediate_records() const

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question, of course, of using serializable things in Seraphis library objects as in the checkpoint cache.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same answer :p

std::vector<unsigned char> tx_extra;

BEGIN_SERIALIZE_OBJECT();
FIELD(block_index);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we, quite in general, version all these classes with a version field? I can imagine smaller changes in the Seraphis enote origin context class that alone will not yet lead to SpEnoteOriginContextV2 but will lead to small changes in ser_SpEnoteOriginContextV1 and its serialization that we absolutely want to handle cleanly with the help of a version field?

Consider that the very simple serialization format used here has no meta info stored together with the data, thus without version fields you would have to resort to ugly heuristics in such cases.

I wonder whether we should treat any new such SERIALIZE_OBJECT block without VERSION_FIELD as deffective and have version fields strictly mandatory ...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point. If we change the type of one variable for example then we wouldn't know that in the serialization. On the other hand, if you dont break the serialization you dont need to know if changes were made. Since it would be deserializing correctly again. I can't think of an example of 'small changes' now that would really require adding a version field. If the changes are big (for example adding another field element) then we would need to have another ser_struct. Can you think of an example to better support the use of a version field?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, imagine you need one more uint64 in that struct. Or origin_status changes from char to int. Or something similar. The issue is: How would you read "old" files? Files that do not yet have have that extra int? How would you know, without seeing "Ah, this is still version 0, not yet version 1 with the new int", that you have to read the old data differently?

See here for a nice example in the existing codebase.

END_SERIALIZE();
};

using ser_LegacyEnoteVariant = tools::variant<ser_LegacyEnoteV1,ser_LegacyEnoteV2, ser_LegacyEnoteV3, ser_LegacyEnoteV4, ser_LegacyEnoteV5>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may misremember, but I have something at the back of my mind that either @jeffro256 or @j-berman once explained that before variants correctly serialize with binary_archive you have to add some additional declarations somewhere for them. Stuff like this that is now in the file cryptonote_basic.h:

VARIANT_TAG(binary_archive, cryptonote::txin_to_key, 0x2);

As I do not see you adding something like that: Am I mistaken? Is that not needed in your cases of variants?


Just now saw your own question regarding this: "(why) do I need to use the VARIANT_TAG when serializing variants?" A guess: It works without such tags, but only up to the first change, when you have to support more variants?

ser_enote.view_tag = enote.view_tag.data;
serializable_enote_variant_out = ser_enote;
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all my own programming I use to add asserts at the end of such constructs, here with a final else, so if we ever have LegacyEnoteV6 and I forget to adjust this code to handle it, in addition to maybe 20 other places that have to deal with that 6th type, I will immediately notice.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I agree. I will add the final conditional.

enote_record_out.amount = enote_record.amount;
enote_record_out.amount_blinding_factor = enote_record.amount_blinding_factor;
enote_record_out.key_image = enote_record.key_image;
memcpy(enote_record_out.address_index.bytes, enote_record.address_index.bytes, sizeof(enote_record.address_index));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That memcpy looks suspicious to me, and I wonder why such a blunt "hammer" should be necessary to just copy something, but I don't know the codebase enough to really judge. Maybe just have a good look again?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sure. I didnt think much about it and just followed the way koe did it. I do also think that there is a better way to do it.

make_serializable_sp_enote_spent_context_v1(enote_record.spent_context, enote_record_out.spent_context);
}
//-------------------------------------------------------------------------------------------------------------------
void make_serializable_checkpointcacheconfig(const CheckpointCacheConfig &cache_config, ser_CheckpointCacheConfig &cache_config_out)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't that be make_serializable_checkpoint_cache_config?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think so as the struct has just one big name. But yeah, looks better with underlines.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, you test converting between the "real" classes and the ser_ classes, which is certainly nice, but you do not test the actual serialization to bytes itself. Wouldn't it be important to test that as well, with a binary_archive and a memory stream for example? I mean, just one forgotten FIELD line, and all goes wrong, right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, definitely. Actually my code is not working as it is now. I just spotted many mistakes which I will correct soon. Definitely I need more unit_tests. I will work on them.

@DangerousFreedom1984 DangerousFreedom1984 marked this pull request as draft November 26, 2023 19:12
jeffro256 added a commit to jeffro256/monero that referenced this pull request Nov 28, 2023
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design.

Depends upon monero-project#9069.
@jeffro256
Copy link

jeffro256 commented Nov 28, 2023

Hey @DangerousFreedom1984! I wrote a PR here monero-project#9077 that would allow you to serialize/deserialize the STL containers directly, without needing the serializable_* wrappers. This will let you address @rbrunner7's comment about making the SpEnoteStore too tightly coupled with the specific type of serialization.

You can simply use FIELD() etc with std::map, std::set, std::vector, etc, no wrapper needed.

jeffro256 added a commit to jeffro256/monero that referenced this pull request Nov 28, 2023
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design. Also small bonus side effect of this change is that STL containers with custom Comparators, Allocators,
and Hashers are serializable.

Depends upon monero-project#9069.
jeffro256 added a commit to jeffro256/monero that referenced this pull request Nov 28, 2023
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design. Also small bonus side effect of this change is that STL containers with custom Comparators, Allocators,
and Hashers are serializable.

Depends upon monero-project#9069.
@DangerousFreedom1984
Copy link
Author

Hey @DangerousFreedom1984! I wrote a PR here monero-project#9077 that would allow you to serialize/deserialize the STL containers directly, without needing the serializable_* wrappers. This will let you address @rbrunner7's comment about making the SpEnoteStore too tightly coupled with the specific type of serialization.

You can simply use FIELD() etc with std::map, std::set, std::vector, etc, no wrapper needed.

Thank you @jeffro256 ! I think it will look much better with the changes that you propose. I will rebuild this considering that your changes will be approved.
One question from my side: do you think that we need a file that defines the serializable types (like the seraphis_impl/serialization_demo_types.h) or should we define the serializable structs directly where they are defined? What would be the advantages/disadvantages of them?

It looks clean to me as it is now because if I want to serialize something, I just need to search for the serializable definition and then apply the corresponding functions to do it. On the other hand, all structs (or almost all) should be serializable and it should not be a big deal to call the serialization functions on them. It also seems to me that it is more efficient to do that directly. From my understanding now, I would vote to not have these ser_* structs, what do you think?

@rbrunner7
Copy link

I would vote to not have these ser_* structs, what do you think?

It was the decision of @UkoeHB to not directly equip the Seraphis library classes with serialization capability, under the impression of existing serialization code that pretty much degenerated into a mess over time, so it may indeed be a good idea to make sure that this never happens to parts of the Seraphis library.

I for one have the tendency to grant him as the architect a strong vote in such matters that we shouldn't overrule lightly.

And as I argued already elsewhere, there are a number of serialization technologies around, that also may get used for the Seraphis classes, which would be another argument for being "agnostic" in the library itself.

jeffro256 added a commit to jeffro256/monero that referenced this pull request Nov 28, 2023
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design. Also small bonus side effect of this change is that STL containers with custom Comparators, Allocators,
and Hashers are serializable. `std::multimap` is added to the list of serializable containers.

Depends upon monero-project#9069.
jeffro256 added a commit to jeffro256/monero that referenced this pull request Nov 28, 2023
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design. Also small bonus side effect of this change is that STL containers with custom Comparators, Allocators,
and Hashers are serializable. `std::multimap` is added to the list of serializable containers.

Depends upon monero-project#9069.
UkoeHB and others added 13 commits February 8, 2024 14:53
…is_lib_hist_05_15_23 branch for commit history
make JamtisDestinationV1 serializable 

---------

Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>
* add operator== to JamtisPaymentProposals

---------

Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>\
* make JamtisPaymentProposal serializable

---------

Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>
…tProposals (UkoeHB#24)

* modify construct_tx_for_mock_ledger_v1 so it outputs the JamtisPaymentProposals

---------

Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>
DangerousFreedom1984 added 2 commits February 17, 2024 08:50
@DangerousFreedom1984 DangerousFreedom1984 marked this pull request as ready for review February 17, 2024 08:48
// EnoteStore serialization

// LegacyEnote types
struct ser_LegacyEnoteV1 final

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to point out that after PR monero-project#9069, you can actually write free functions for serialization, which means that you can serialize directly from that function and don't have to create new types.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! So I can use something like that for (almost) all structs in seraphis?

BLOB_SERIALIZER(struct);

serialization::dump_binary(struct, blob_struct));
serialization::parse_binary(blob_struct, struct_recovered));

I will try to do so and simplify this PR. Thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can do that. And for non-blob types, you can write a do_serialize() function. I might make changes to the serialization header to make this an easier process.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it might be better to leave as-is for now so it matches with the rest of the file's serialization. I will probably refactor all these classes out later

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been working on that all day. Now I can serialize the enote_store and I dont use the serialization_demo_*. I'm trying to make a tx serializable now without using the serialization_demo_*. I have to fix some bugs. I will post my changes as soon as I fix it.

@DangerousFreedom1984 DangerousFreedom1984 marked this pull request as draft February 21, 2024 09:18
SNeedlewoods pushed a commit to SNeedlewoods/seraphis_wallet that referenced this pull request Mar 8, 2024
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design. Also small bonus side effect of this change is that STL containers with custom Comparators, Allocators,
and Hashers are serializable. `std::multimap` is added to the list of serializable containers.

Depends upon monero-project#9069.
SNeedlewoods pushed a commit to SNeedlewoods/seraphis_wallet that referenced this pull request Mar 8, 2024
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib
for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use
the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled
with the src/serialization code since one cannot swap out their type of storage format without major refactoring of class
field types. By serializing STL containers directly, we can abstract the serialization details away, making for much
cleaner design. Also small bonus side effect of this change is that STL containers with custom Comparators, Allocators,
and Hashers are serializable. `std::multimap` is added to the list of serializable containers.

Depends upon monero-project#9069.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants