Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweak VRID calculation to avoid 0 #23

Merged
merged 2 commits into from
Oct 11, 2019

Conversation

cybertron
Copy link
Member

This change modifies how we calculate the fletcher8 checksum to
match the online calculators (for example, [0]). This has two
important effects:

  1. The maximum value it can return is now 0xEE rather than 0xFF.
    This is a documented limitation of the fletcher algorithm because
    it uses modulo operations and 0xF % 0xF is 0, not 0xF, so it's
    impossible to get a 0xF byte. Unfortunately, this does reduce the
    id space available.

  2. Because the algorithm will not return 255, we can safely add one
    to the resulting value to ensure we don't end up with a VRID of
    0. 0 is not a valid VRID for keepalived, as reported in FletcherChecksum8() may generate invalid VRID #21.

0: https://gchq.github.io/CyberChef/#recipe=Fletcher-8_Checksum()&input=bDN2dnFxbmktZjIxY2EtZG5z

This change modifies how we calculate the fletcher8 checksum to
match the online calculators (for example, [0]). This has two
important effects:

1) The maximum value it can return is now 0xEE rather than 0xFF.
   This is a documented limitation of the fletcher algorithm because
   it uses modulo operations and 0xF % 0xF is 0, not 0xF, so it's
   impossible to get a 0xF byte. Unfortunately, this does reduce the
   id space available.

2) Because the algorithm will not return 255, we can safely add one
   to the resulting value to ensure we don't end up with a VRID of
   0. 0 is not a valid VRID for keepalived, as reported in openshift#21.

0: https://gchq.github.io/CyberChef/#recipe=Fletcher-8_Checksum()&input=bDN2dnFxbmktZjIxY2EtZG5z
@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 23, 2019
@cybertron
Copy link
Member Author

This is one possible approach. Another is to just clamp the id to > 0, but that way I think we'd double the chance of collisions at 1 because two hashes would be resolving to the same id. This approach slightly increases the chance of collisions across the board.

@cybertron
Copy link
Member Author

/assign @celebdor

ckA = (ckA + inp[i]) & 0xf
ckB = (ckB + ckA) & 0xf
ckA = (ckA + inp[i]) % 0xf
ckB = (ckB + ckA) % 0xf
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other option would be to switch just this line back to &, which would allow us to get to a range of 0 to 254, which is exactly what we want. It wouldn't technically be fletcher 8 then, but I'm not sure how much we care about that.

@mandre
Copy link
Member

mandre commented Oct 11, 2019

Thanks @cybertron for the patch, this is a nice way to take care of both issues reported in #21 and #22.

@celebdor any chance we can get this patch in? We're seeing the issue in CI occasionally.

@celebdor
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 11, 2019
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

8 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

Because the fletcher8 algorithm only has a total of 255 hashes
available, it is highly likely that at some point we will have a
collision between our different VRIDs. This change adds checks for
that situation and increments the ids until they no longer collide.
This should be consistent across all nodes, so they will all end up
using the same ids for the same VIPs.
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 11, 2019
@celebdor
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 11, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: celebdor, cybertron

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 10b8c7f into openshift:master Oct 11, 2019
@mandre
Copy link
Member

mandre commented Nov 21, 2019

We should backport this to release-4.2 branch.

We've seen it happen again in https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-serial-4.2/228

@tomassedovic
Copy link

/backport release-4.2

(wonder if this works)

@iamemilio
Copy link

/cherry-pick release-4.2

@openshift-cherrypick-robot

@iamemilio: new pull request created: #35

In response to this:

/cherry-pick release-4.2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants