Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1837953: fixup server_id regex #694

Merged

Conversation

rbbratta
Copy link
Contributor

@rbbratta rbbratta commented Jul 7, 2020

We need to search the whole line for the server_id.

Current regex is too aggressive.

The original example indicates that the prev_server line is a list of servers

$ ovsdb-tool show-log /etc/ovn/ovnnb_db.db | grep "server_id\|prev_servers"
 server_id: d5db
  prev_servers: 33b4("ssl:10.19.138.37:9643"), 720d("ssl:10.19.138.33:9643"), 7e4f("ssl:10.19.138.32:9643"

The above DB is inconsistent because d5db is not part of prev_servers.

A consistent DB looks like this:

$ ovsdb-tool show-log /etc/ovn/ovnnb_db.db | grep "server_id\|prev_servers"
 server_id: 33b4
  prev_servers: 33b4("ssl:10.19.138.37:9643"), 720d("ssl:10.19.138.33:9643")

Copy link
Contributor

@danwinship danwinship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. I have some minor nitpicks

serverid=$(ovsdb-tool show-log "${db}" | sed -ne '/server_id:/s/server_id: *\([[:xdigit:]]\+\)/\1/p')
match=$(ovsdb-tool show-log "${db}" | grep "prev_servers: *${serverid}(" || true)
# remove leading and trailing space around server_id
serverid=$(ovsdb-tool show-log "${db}" | sed -ne '/server_id:/s/ *server_id: *\([[:xdigit:]]\+\).*/\1/p')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sed can be simplified to just

    sed -ne 's/ *server_id: *\([[:xdigit:]]\+\).*/\1/p'

The s/// rule already won't match if there's no "server_id:" in the string, so the extra /server_id:/ at the front wasn't doing anything

Copy link
Contributor Author

@rbbratta rbbratta Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally matching the original grep plus sed formulation and the simpler regex /server_id:/ would be faster to match than the substitution regex but it won't matter much.

I finally checked the ovsdb-tools source and there is a fixed amount of space after the server_id: so we can use a positive lookbehind with grep -o and that should simplify things. Also threw in a -m 1 to exit after the first match for good measure, assuming that there should always be only one match. The -m 1 causes ovsdb-tool to complain about the broken pipe but that is okay since it's on stderr.

grep -m 1 -oP '(?<=server_id: )[[:xdigit:]]+'

match=$(ovsdb-tool show-log "${db}" | grep "prev_servers: *${serverid}(" || true)
# remove leading and trailing space around server_id
serverid=$(ovsdb-tool show-log "${db}" | sed -ne '/server_id:/s/ *server_id: *\([[:xdigit:]]\+\).*/\1/p')
# prev_servers is a list we are searching for " $server_id(" in the list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either that needs to be two sentences (period before "we") or the "the list" at the end is superfluous.

# remove leading and trailing space around server_id
serverid=$(ovsdb-tool show-log "${db}" | sed -ne '/server_id:/s/ *server_id: *\([[:xdigit:]]\+\).*/\1/p')
# prev_servers is a list we are searching for " $server_id(" in the list
# prev_servers: 33b4("ssl:10.19.138.37:9643"), 720d("ssl:10.19.138.33:9643"), 7e4f("ssl:10.19.138.32:9643"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, now I'm totally just nitpicking but presumably there's a ")" missing at the end of that line...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was copied from the original bugzilla. I have yet to see multiple prev_servers in testing, but I checked the source and there should always be a closing paren.

@rbbratta rbbratta force-pushed the fix-server_id_regex branch 3 times, most recently from af4fbe4 to 72590b6 Compare July 8, 2020 00:25
@rbbratta
Copy link
Contributor Author

rbbratta commented Jul 8, 2020

/cc @fedepaol

@fedepaol
Copy link
Member

fedepaol commented Jul 8, 2020

Looks good, thanks for fixing! I think in my original (despictable) implementation I included this case, but it got eventually got lost while improving it.
I wonder if we should point the bz from this PR too. In that case, @rbbratta would you mind changing the title?

@fedepaol
Copy link
Member

fedepaol commented Jul 8, 2020

/retest

@fedepaol
Copy link
Member

fedepaol commented Jul 8, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 8, 2020
@@ -123,8 +123,10 @@ spec:
return
fi
echo "Checking ${db} health"
serverid=$(ovsdb-tool show-log "${db}" | sed -ne '/server_id:/s/server_id: *\([[:xdigit:]]\+\)/\1/p')
match=$(ovsdb-tool show-log "${db}" | grep "prev_servers: *${serverid}(" || true)
serverid=$(ovsdb-tool show-log "${db}" | grep -m 1 -oP '(?<=server_id: )[[:xdigit:]]+')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂️ I like the original version better... I feel like it's reasonable to expect future people reading this to understand sed syntax but not reasonable to expect them to know GNU-specific grep flags... Which may be a totally unreasonable expectation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grep -o seems cleaner instead of trying to remove all the text surrounding the match. It feels kinda sad that GNU maintainers add all these nice features (-o and PCRE, for example, ~20 years ago) and no one uses them.

@fedepaol
Copy link
Member

/retest

@fedepaol
Copy link
Member

@rbbratta do you mind referring the bug in the title?
The bug is half baked as of today and I am afraid we loose track.

@dcbw
Copy link
Member

dcbw commented Jul 14, 2020

/retest

@rbbratta yes, we should have the bug in the title as well.

@rcarrillocruz
Copy link
Contributor

i.e.
Bug X:

@rcarrillocruz
Copy link
Contributor

the description should follow the colon and a space

We need to search the whole line for the server_id.

Current regex is too aggressive.

The original example indicates that the prev_server line is a list of servers

```bash
$ ovsdb-tool show-log /etc/ovn/ovnnb_db.db | grep "server_id\|prev_servers"
 server_id: d5db
  prev_servers: 33b4("ssl:10.19.138.37:9643"), 720d("ssl:10.19.138.33:9643"), 7e4f("ssl:10.19.138.32:9643")
```
  The above DB is inconsistent because d5db is not part of prev_servers.

A consistent DB looks like this:
```bash
$ ovsdb-tool show-log /etc/ovn/ovnnb_db.db | grep "server_id\|prev_servers"
 server_id: 33b4
  prev_servers: 33b4("ssl:10.19.138.37:9643"), 720d("ssl:10.19.138.33:9643")
```
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 14, 2020
@rbbratta rbbratta changed the title fixup server_id regex Bug 1837953: fixup server_id regex Jul 14, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Jul 14, 2020
@openshift-ci-robot
Copy link
Contributor

@rbbratta: This pull request references Bugzilla bug 1837953, which is invalid:

  • expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is VERIFIED instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1837953: fixup server_id regex

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jul 14, 2020
@dcbw
Copy link
Member

dcbw commented Jul 14, 2020

If this is a follow-on to the already fixed bug, then we need a new bug cloned from the original, target release 4.6, so that QE can verify this additional fix separate from the original one.

@rbbratta rbbratta changed the title Bug 1837953: fixup server_id regex Bug 1857455: fixup server_id regex Jul 15, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Jul 15, 2020
@openshift-ci-robot
Copy link
Contributor

@rbbratta: This pull request references Bugzilla bug 1857455, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1857455: fixup server_id regex

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Jul 15, 2020
@fedepaol
Copy link
Member

/retest

@sdodson
Copy link
Member

sdodson commented Jul 20, 2020

/retitle Bug 1837953: fixup server_id regex
This is a followup to fixes in #688

@openshift-ci-robot openshift-ci-robot changed the title Bug 1857455: fixup server_id regex Bug 1837953: fixup server_id regex Jul 20, 2020
@openshift-ci-robot
Copy link
Contributor

@rbbratta: This pull request references Bugzilla bug 1837953, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1837953: fixup server_id regex

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -123,8 +123,10 @@ spec:
return
fi
echo "Checking ${db} health"
serverid=$(ovsdb-tool show-log "${db}" | sed -ne '/server_id:/s/server_id: *\([[:xdigit:]]\+\)/\1/p')
match=$(ovsdb-tool show-log "${db}" | grep "prev_servers: *${serverid}(" || true)
serverid=$(ovsdb-tool show-log "${db}" | grep -m 1 -oP '(?<=server_id: )[[:xdigit:]]+')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serverid=$(ovsdb-tool show-log "${db}" | grep -m 1 -oP '(?<=server_id: )[[:xdigit:]]+')
# prev_servers is a list. We are searching for " $server_id(" in the list
# prev_servers: 33b4("ssl:10.19.138.37:9643"), 720d("ssl:10.19.138.33:9643"), 7e4f("ssl:10.19.138.32:9643")
match=$(ovsdb-tool show-log "${db}" | grep "prev_servers:.* ${serverid}(" || true)
Copy link
Contributor Author

@rbbratta rbbratta Jul 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/openvswitch/ovs/blob/master/ovsdb/ovsdb-tool.c#L897

        print_servers("prev_servers", h->snap.servers);

https://github.com/openvswitch/ovs/blob/master/ovsdb/ovsdb-tool.c#L828

    printf(" %s: ", name);

    const struct shash_node **nodes = shash_sort(json_object(servers));
    size_t n = shash_count(json_object(servers));
    for (size_t i = 0; i < n; i++) {
        if (i > 0) {
            printf(", ");
        }

        const struct shash_node *node = nodes[i];
        printf("%.4s(", node->name);

        const struct json *address = node->data;
        char *s = json_to_string(address, JSSF_SORT);
        fputs(s, stdout);
        free(s);

        putchar(')');
    }

@dcbw
Copy link
Member

dcbw commented Jul 20, 2020

/override ci/prow/e2e-gcp-upgrade
that job is for openshift-sdn and this PR only touches ovn-kubernetes

@openshift-ci-robot
Copy link
Contributor

@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-gcp-upgrade

In response to this:

/override ci/prow/e2e-gcp-upgrade
that job is for openshift-sdn and this PR only touches ovn-kubernetes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dcbw
Copy link
Member

dcbw commented Jul 20, 2020

/approve
/lgtm

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, fedepaol, rbbratta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 20, 2020
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

5 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jul 20, 2020

@rbbratta: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-gcp-ovn-upgrade 6c02777 link /test e2e-gcp-ovn-upgrade
ci/prow/e2e-openstack 6c02777 link /test e2e-openstack
ci/prow/e2e-vsphere 6c02777 link /test e2e-vsphere

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 3ff3b90 into openshift:master Jul 20, 2020
@openshift-ci-robot
Copy link
Contributor

@rbbratta: All pull requests linked via external trackers have merged: openshift/cluster-network-operator#694, openshift/cluster-network-operator#688. Bugzilla bug 1837953 has been moved to the MODIFIED state.

In response to this:

Bug 1837953: fixup server_id regex

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

fedepaol added a commit to fedepaol/cluster-network-operator that referenced this pull request Jul 23, 2020
It turned out the prev_servers list contains only the leader, so it can't be
used to understand if the db is consistent or not. This leads in always deleting
the db file when restarting a ovn-master pod that is not the leader.
More details in https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c36

Revert "Merge pull request openshift#694 from rbbratta/fix-server_id_regex"

This reverts commit 3ff3b90, reversing
changes made to a8ababe.

Revert "Merge pull request openshift#688 from fedepaol/dbhealthy"

This reverts commit a0aaca7, reversing
changes made to 2f04083.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants