Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LNS: Work-around lns::generic_owner hashing bug in rpc #1095

Merged
merged 2 commits into from Mar 26, 2020

Conversation

Doy-lee
Copy link
Collaborator

@Doy-lee Doy-lee commented Mar 24, 2020

      // TODO(loki): Appears to be some bug overloading std::hash and using
      // std::map<lns::generic_owner, size_t> causing hash collisions
      // and overwriting the same entry every time, so manually create and store
      // the hash gives us the correct result.

      // But, we now also just serialize both owner and backup_owner, since if
      // we specify an owner that is backup owner, we don't show the (other)
      // owner. This makes the fix redundant but for RPC compatibility we keep
      // it around until the next hard fork (16)

@jagerman
Copy link
Member

A std::map does insertion sorting, no hashing at all. (But for it to work, a < b must work).

@jagerman
Copy link
Member

jagerman commented Mar 24, 2020

Ouch, I see why the existing code even compiles -- because generic_owner has this:

operator bool() const { ... }

which is non-explicit so that when std::map needs to compare elements when it does a a < b comparison the only workable solution is doing implicit conversion to bool on both sides.

@Doy-lee
Copy link
Collaborator Author

Doy-lee commented Mar 24, 2020

That explains everything :(

@jagerman
Copy link
Member

generic_signature also has an operator bool that should be made explicit while you're there.

@jagerman
Copy link
Member

Also a side note in passing: with unordered_map/unordered_set, even if you get hash collisions from a poor hash you still don't overwrite values, you just get degraded performance because you end up with one bucket containing lots of values rather than values spread across the available buckets. Worst case: your hash functions returns a constant for all values and your unordered map/set degrades into a linear search.

@Doy-lee Doy-lee force-pushed the LNSFixHashingCollisionBug branch 2 times, most recently from 26c3817 to 25017e2 Compare March 24, 2020 03:10
@Doy-lee
Copy link
Collaborator Author

Doy-lee commented Mar 24, 2020

Added explicit to operator bool(). Decided to keep the std::map and manually hash as its temporary, then deprecate in hard fork 16.

@jagerman
Copy link
Member

This looks fine, except for this part:

Decided to keep the std::map and manually hash

With the std::hash specialization already here, an unordered_map here seems an obviously better choice: hash collisions won't be destructive (they are with std::map + hash as keys), the code is slightly simpler (you can eliminate the hash call), and performance is better (map is an insertion sorted container, thus O(NlogN) on N insertions).

Also yesterday I came across this variable and comment in daemon/core.h, committed more than 4 years ago:

  // TEMPORARY HACK - Yes, this creates a copy, but otherwise the original
  // variable map could go out of scope before the run method is called
  boost::program_options::variables_map const m_vm_HACK;

These sorts of things have a way of unintentionally surviving.

@@ -3416,7 +3427,7 @@ namespace cryptonote
if (exceeds_quantity_limit(ctx, error_resp, m_restricted, req.entries.size(), COMMAND_RPC_LNS_OWNERS_TO_NAMES::MAX_REQUEST_ENTRIES))
return false;

std::map<lns::generic_owner, size_t> owner_to_request_index;
std::map<size_t, size_t> owner_to_request_index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an unordered_map (but keeping the std::hash specialization). This isn't just an optimal data structure thing: using a map here has a potential (if rare) data omission on collision. (And switching to an unordered_map is a nearly trivial change).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, updated

Quoting from Jagerman's reason: With the std::hash specialization
already here, an unordered_map here seems an obviously better choice:
hash collisions won't be destructive (they are with std::map + hash as
keys), the code is slightly simpler (you can eliminate the hash call),
and performance is better (map is an insertion sorted container, thus
O(NlogN) on N insertions).

This isn't just an optimal data structure thing: using a map here has
a potential (if rare) data omission on collision.
@Doy-lee Doy-lee merged commit 96b7f68 into oxen-io:dev Mar 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants