New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QCop autonomously sends deregisters #105
Conversation
src/cryptonote_core/quorum_cop.cpp
Outdated
return false; | ||
|
||
if (!m_service_node_list.is_service_node(pubkey)) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only dependency on service_node_lists. Once we get db storage of service node lists we can remove this and check if the pubkey belongs to some range of heights.
I can imagine we'll want to store uptime proof over some range of heights so we have the tools to provide statistics on the network.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted
src/cryptonote_core/quorum_cop.cpp
Outdated
", which is greater than the recommended REORG_SAFETY_BUFFER_IN_BLOCKS: " << REORG_SAFETY_BUFFER_IN_BLOCKS); | ||
} | ||
|
||
m_last_height = (height < REORG_SAFETY_BUFFER_IN_BLOCKS) ? height : height - REORG_SAFETY_BUFFER_IN_BLOCKS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_last_height should not be updated in the event of a blockchain detach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale? You need to start voting from the height onwards again. This probably should be m_last_height = height;
because you may not have the vote data anymore depending on the height you popped back to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_last_height may be from well before the blockchain detach height.
Those blocks still need to be checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if m_last_height >= blockchain detach height then we have problems with our assumption that large reorgs will not happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/cryptonote_core/quorum_cop.cpp
Outdated
void quorum_cop::blockchain_detached(uint64_t height) | ||
{ | ||
uint64_t delta_height = std::abs(static_cast<int64_t>(m_core.get_current_blockchain_height() - height)); | ||
if (delta_height > REORG_SAFETY_BUFFER_IN_BLOCKS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_core.get_current_blockchain_height() should always be >= height, I think...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to use m_last_height. Height will be the height that the chain split at and blockchain has already been popped back to the split height.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be
if (m_last_height >= height) {
Warn user; adjust m_last_height; }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/cryptonote_core/quorum_cop.cpp
Outdated
m_last_height = (height < REORG_SAFETY_BUFFER_IN_BLOCKS) ? height : height - REORG_SAFETY_BUFFER_IN_BLOCKS; | ||
} | ||
|
||
static bool participates_in_quorum(const std::vector<crypto::public_key>& quorum, crypto::public_key const &my_key, size_t *index_in_quorum = nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The written function returns true if the pubkey does not exist in the quorum. I don't think this is intended behaviour.
Implement or replace entirely with a call to std::find(). You can get the index by subtracting vec.begin() from the returned iterator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the bool return.
src/cryptonote_core/quorum_cop.cpp
Outdated
if (height < execute_justice_from_height) | ||
return; | ||
|
||
if (height == latest_height) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need this branch. The uptime proofs are submitted every hour by the daemon when it is idle, not just when it is in the list of nodes to test.
src/cryptonote_core/quorum_cop.cpp
Outdated
bool vote_off_node = false; | ||
|
||
CRITICAL_REGION_LOCAL(m_lock); | ||
auto const it = m_uptime_proof_seen.find(node_key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vote_off_node = (this daemon has been running for at least two hours && m_uptime_proof_seen.find(node_key) == m_uptime_proof_seen.end());
Somewhere else in the code we must remove old entries in the m_uptime_proof_seen table.
src/cryptonote_core/quorum_cop.cpp
Outdated
} | ||
} | ||
|
||
prune_uptime_proof_below(execute_justice_from_height); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would triggered this during idle in m_core, say once every 30 seconds. Works fine here too, just doesn't feel right because the uptime proof thing is not happening related to the blockchain. This also means the block height stuff in this function is unnecessary - you just have to remove them as long as their timestamp < now - 2 hours and 5 minutes
src/cryptonote_core/quorum_cop.cpp
Outdated
return false; | ||
|
||
if (!m_service_node_list.is_service_node(pubkey)) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted
src/cryptonote_core/quorum_cop.cpp
Outdated
return; | ||
} | ||
|
||
const uint64_t prune_from_timestamp = blocks.front().timestamp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prune_from_timestamp = now() - 2 hours 5 minutes is fine
src/cryptonote_core/quorum_cop.cpp
Outdated
CRITICAL_REGION_LOCAL(m_lock); | ||
|
||
for (auto it = m_uptime_proof_seen.begin(); it != m_uptime_proof_seen.end();) | ||
it = (it->second < prune_from_timestamp) ? m_uptime_proof_seen.erase(it) : it; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an infinite loop.
change to:
if (it->second < prune_from_timestamp)
it = m_uptime_proof_seen.erase(it);
else
it++;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively just add ++it to the second branch in your ternary operator. Not sure if this is defined behaviour.
src/cryptonote_core/quorum_cop.cpp
Outdated
m_last_height = (height < REORG_SAFETY_BUFFER_IN_BLOCKS) ? height : height - REORG_SAFETY_BUFFER_IN_BLOCKS; | ||
} | ||
|
||
static bool participates_in_quorum(const std::vector<crypto::public_key>& quorum, crypto::public_key const &my_key, size_t *index_in_quorum = nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the bool return.
src/cryptonote_core/quorum_cop.cpp
Outdated
void quorum_cop::blockchain_detached(uint64_t height) | ||
{ | ||
uint64_t delta_height = std::abs(static_cast<int64_t>(m_core.get_current_blockchain_height() - height)); | ||
if (delta_height > REORG_SAFETY_BUFFER_IN_BLOCKS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to use m_last_height. Height will be the height that the chain split at and blockchain has already been popped back to the split height.
src/cryptonote_core/quorum_cop.cpp
Outdated
", which is greater than the recommended REORG_SAFETY_BUFFER_IN_BLOCKS: " << REORG_SAFETY_BUFFER_IN_BLOCKS); | ||
} | ||
|
||
m_last_height = (height < REORG_SAFETY_BUFFER_IN_BLOCKS) ? height : height - REORG_SAFETY_BUFFER_IN_BLOCKS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale? You need to start voting from the height onwards again. This probably should be m_last_height = height;
because you may not have the vote data anymore depending on the height you popped back to.
const crypto::public_key &node_key = state->nodes_to_test[node_index]; | ||
|
||
CRITICAL_REGION_LOCAL(m_lock); | ||
bool vote_off_node = (m_uptime_proof_seen.find(node_key) != m_uptime_proof_seen.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to just find the key, since the entries in the list should be under UPTIME_PROOF_MAX_IN_SECONDS +- 30seconds (depending on the pruner being called every 30s on idle).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a buffer. Right now there is a chance that we will drop the ping from 1 hour ago, and then this evaluation happens before the next ping arrives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If worked this out in my head correctly, this change would just require L155 to change from
if (m_uptime_proof_seen[pubkey] > now - UPTIME_PROOF_MAX_TIME_IN_SECONDS)
return false; // already received one uptime proof for this node recently.
to
if (m_uptime_proof_seen[pubkey] >= now - (UPTIME_PROOF_FREQUENCY_IN_SECONDS / 2))
return false; // already received one uptime proof for this node recently.
So we accept proofs every 30mins, everything else is the same and prune proofs older than 1hr 10mins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah sounds good
src/cryptonote_core/quorum_cop.cpp
Outdated
if (it->second < prune_from_timestamp) | ||
m_uptime_proof_seen.erase(it); | ||
else | ||
it++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed iterator inf loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be:
it = map.erase(it);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/cryptonote_core/quorum_cop.cpp
Outdated
} | ||
|
||
m_last_height = (height < REORG_SAFETY_BUFFER_IN_BLOCKS) ? height : height - REORG_SAFETY_BUFFER_IN_BLOCKS; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment became outdated. So just rewriting it here.
What's the rationale? You need to start voting from the height onwards again. This probably should be m_last_height = height; because you may not have the vote data anymore depending on the height you popped back to.
src/cryptonote_core/quorum_cop.cpp
Outdated
" blocks which is greater than the recommended REORG_SAFETY_BUFFER_IN_BLOCKS: " << REORG_SAFETY_BUFFER_IN_BLOCKS); | ||
} | ||
|
||
m_last_height = height; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment became outdated, so just rewriting it here.
What's the rationale? You need to start voting from the height you popped back to and onwards again. We could recover from this by making a mandatory ping here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_last_height may be from well before the blockchain detach height.
Those blocks still need to be checked.
if m_last_height >= blockchain detach height then we have problems with our assumption that large reorgs will not happen
@@ -1004,6 +1033,8 @@ namespace cryptonote | |||
epee::math_helper::once_a_time_seconds<60*2, false> m_deregisters_auto_relayer; //!< interval for checking re-relaying deregister votes | |||
epee::math_helper::once_a_time_seconds<60*60*12, true> m_check_updates_interval; //!< interval for checking for new versions | |||
epee::math_helper::once_a_time_seconds<60*10, true> m_check_disk_space_interval; //!< interval for checking for disk space | |||
epee::math_helper::once_a_time_seconds<UPTIME_PROOF_FREQUENCY_IN_SECONDS, true> m_submit_uptime_proof_interval; //!< interval for submitting uptime proof | |||
epee::math_helper::once_a_time_seconds<30, true> m_uptime_proof_pruner; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prune triggered every 30s if you're a service node.
src/cryptonote_core/quorum_cop.cpp
Outdated
|
||
void quorum_cop::blockchain_detached(uint64_t height) | ||
{ | ||
uint64_t delta_height = std::abs(static_cast<int64_t>(m_last_height - height)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this abs() ?
This check should simply be if (m_last_height > height)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/cryptonote_core/quorum_cop.cpp
Outdated
return; | ||
|
||
time_t const now = time(nullptr); | ||
bool been_alive_for_2hrs = (now - m_core.get_start_time()) >= (60 * 60 * 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define please, or const time_t, whatever
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
void quorum_cop::blockchain_detached(uint64_t height) | ||
{ | ||
if (m_last_height >= height) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always an error, when a very large reorg occurs that it is before m_last_height.
This is sufficient:
if (m_last_height >= height)
{
LOG_ERROR("The blockchain was detached to height: " << etc << " This should never happen, please report");
m_last_height = height;
}
src/cryptonote_core/quorum_cop.cpp
Outdated
continue; | ||
} | ||
|
||
auto it = std::find_if(state->quorum_nodes.begin(), state->quorum_nodes.end(), [&my_pubkey](crypto::public_key const &key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this not work? std::find(nodes->quorum_nodes.begin(), state->quorum_nodes.end(), my_pubkey);
Quorum cop takes the core and service node list to query the quorum and begins voting off if it's participating in the quorum for nodes up to the (latest height - some safety buffer) if no ping has been seen since some threshold.
Nodes ping every 1 hour and also send out a ping on demand if they detect that they are included in the quorum for the latest block height.
There's an implicit dependency in Quorum Cop that it relies on Service Node Lists to have finished calculating the quorum state before Quorum Cop starts voting otherwise it receives an empty quorum.
I've chosen to enforce this by making it that the hooks for service node lists have to be registered explicitly and requires you to pass in the quorum cop.