Skip to content

fix: network fork on restart#263

Merged
strophy merged 3 commits into
v0.21-devfrom
fix/network-fork-on-restart
Oct 29, 2021
Merged

fix: network fork on restart#263
strophy merged 3 commits into
v0.21-devfrom
fix/network-fork-on-restart

Conversation

@strophy
Copy link
Copy Markdown
Collaborator

@strophy strophy commented Oct 29, 2021

Issue being fixed or feature implemented

Testnet platform chain halted due to a disagreement about how to replay when syncing from scratch or after restarting in some cases. This was investigated and found to be due to the use of two seed nodes that do not have knowledge of each other.

What was done?

Add seed node addresses to config for seed nodes.

How Has This Been Tested?

Deployed on schnapps, stopped all nodes, restarted seed nodes one at a time, then restarted masternodes, network came back without any issues.

Breaking Changes

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

@strophy strophy added this to the v0.21.0 milestone Oct 29, 2021
@strophy strophy requested a review from shumkov October 29, 2021 10:56
@strophy
Copy link
Copy Markdown
Collaborator Author

strophy commented Oct 29, 2021

I've unrolled and indented the code here so it is easier to read:

# Comma separated list of seed nodes to connect to
seeds = "
{% if inventory_hostname not in groups.seed_nodes %}
  {% for seed in groups.seed_nodes %}
    {{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}
    {% if not loop.last %},{% endif %}
  {% endfor %}
{%- elif inventory_hostname in groups.seed_nodes -%}
  {% for seed in groups.seed_nodes %}
    {% if seed != inventory_hostname %}
      {{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}
    {% endif %}
  {% endfor %}
{%- endif -%}"


# Comma separated list of seed nodes to connect to
seeds = "{% if inventory_hostname not in groups.seed_nodes %}{% for seed in groups.seed_nodes %}{{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}{% if not loop.last %},{% endif %}{% endfor %}{% endif %}"
seeds = "{% if inventory_hostname not in groups.seed_nodes %}{% for seed in groups.seed_nodes %}{{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}{% if not loop.last %},{% endif %}{% endfor %}{%- elif inventory_hostname in groups.seed_nodes -%}{% for seed in groups.seed_nodes %}{% if seed != inventory_hostname %}{{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}{% endif %}{% endfor %}{%- endif -%}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we simplify it to:

Suggested change
seeds = "{% if inventory_hostname not in groups.seed_nodes %}{% for seed in groups.seed_nodes %}{{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}{% if not loop.last %},{% endif %}{% endfor %}{%- elif inventory_hostname in groups.seed_nodes -%}{% for seed in groups.seed_nodes %}{% if seed != inventory_hostname %}{{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}{% endif %}{% endfor %}{%- endif -%}"
seeds = "{% for seed in groups.seed_nodes %}{% if seed != inventory_hostname %}{{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}{% endif %}{% endfor %}"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would result in a syntax error on masternodes which should list both seeds, the seed nodes must be separated by a comma. We could probably simplify it like this, but then the seed nodes would also contain their own address:

{% for seed in groups.seed_nodes %}
  {{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}
  {% if not loop.last %},{% endif %}
{% endfor %}

Is that acceptable?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, we have protection against "dialing myself", but we should not add the current node as this is NOT a correct solution. The correct solution is to put all seeds except ourselves, comma-separated.

And, if I understand correctly, your original solution will fail for 3 seeds - because there will be no comma between them?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. We need to find a way to remove the "self" seed from the group of possible seeds, which might exceed 2. Or we can use my initial code suggestion, and limit the number of seed nodes to 2 per network.

Copy link
Copy Markdown
Collaborator Author

@strophy strophy Oct 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've found a solution. There are no loop controls like break or continue in standard Jinja2 so we need two loops, and set temp = is necessary to avoid outputting the name of the node being removed from the list. So we end up with something like this:

# Comma separated list of seed nodes to connect to
seeds = "
{% for seed in groups.seed_nodes %}
  {% if seed == inventory_hostname %}
    {% set temp = groups.seed_nodes.pop(loop.index0) %}
  {% endif %}
{% endfor %}
{% for seed in groups.seed_nodes %}
  {{ seed_nodes[seed].node_key.id }}@{{ hostvars[seed].public_ip }}:{{ tendermint_p2p_port }}
  {% if not loop.last %},{% endif %}
{% endfor %}"

shumkov
shumkov previously approved these changes Oct 29, 2021
Copy link
Copy Markdown
Collaborator

@shumkov shumkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@shumkov shumkov dismissed their stale review October 29, 2021 13:59

agree with Lukasz, the variable with list must be prepared in ansible

@strophy strophy requested review from lklimek and shumkov October 29, 2021 14:30
Copy link
Copy Markdown
Collaborator

@shumkov shumkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

@strophy strophy merged commit 8deef11 into v0.21-dev Oct 29, 2021
@strophy strophy deleted the fix/network-fork-on-restart branch October 29, 2021 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants