New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement CIDR blocklisting #44151
Implement CIDR blocklisting #44151
Conversation
|
ceph test api |
|
@batrick @neha-ojha @vshankar Obviously later than I'd planned, but I haven't done this bit twiddling in a while and my first working solution was too ugly to share. o_0 Please take a look! There are a few teuthology tests that invoke blocklisting so I'm gonna go over those and see if any are possible to extend to test the ranges, but it's gonna be rough in the lab with the limited machines. And documentation incoming shortly. |
1795432
to
d8fa04d
Compare
|
https://pulpito.ceph.com/gregf-2021-12-02_21:15:39-rados-wip-cidr-blocklist-distro-basic-smithi for a few runs of existing rados blocklist tests |
Passed
One failed.
This has popped up at least once before; I think it's just a network blip. But I scheduled another run: http://pulpito.front.sepia.ceph.com:80/gregf-2021-12-06_17:17:45-fs-wip-cidr-blocklist-distro-basic-smithi/ |
|
Ah, this is actually busting mds client eviction as it stands. The MDS has client addresses that look like It's probably simplest to make the UI a "blocklist add range" instead of overloading "blocklist add"; I'll try that out. |
6eecf75
to
36b3ea2
Compare
|
Finally got shaman to build without weird system failures, so scheduled tests again: These passed! There is a failure in one of the fs tests due to ""2021-12-14T20:24:53.732478+0000 mon.a (mon.0) 1960 : cluster [WRN] Health check failed: Reduced data availability: 2 pgs peering (PG_AVAILABILITY)" in cluster log", but that is a persistent issue in master as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few minor comments, looks good overall though! has it been through upgrade testing?
| @@ -584,8 +584,31 @@ class OSDMap { | |||
| std::shared_ptr< mempool::osdmap::vector<uuid_d> > osd_uuid; | |||
| mempool::osdmap::vector<osd_xinfo_t> osd_xinfo; | |||
|
|
|||
| class range_bits { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like boost has a network range type we could use here instead - address_v[4|6]_range. This implementation looks correct to me, though
|
jenkins test make check arm64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my review so far; will finish tomorrow
The commit organization makes this a little more difficult. There is a is_cidr method which is not in the final diff. Some commits look like they could have been squashed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
36b3ea2
to
4bba847
Compare
I'm not sure if the blocklist events tracking in Client.cc was ever the simplest way to track that state, but it definitely isn't now. We can just hand our addr_vec to the OSDMap and ask it -- it handles version compatibility issues and, happily, means the Client doesn't need to learn to deal with ranges directly. Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
…ss format I discovered in testing with CephFS that this tends to interpret client IPs (which don't have ports, but do have nonces) as invalid ranges. So give it a separate input keyword that has to be applied first. Signed-off-by: Greg Farnum <gfarnum@redhat.com>
…ocklists Providing a non-range-aware blocklist accessor would just be asking for trouble, so don't. The ugly part of this is how the Objecter is currently just throwing the range blocklist on the end of its own list. The in-tree callers are okay with this, and I'd like to look at removing the blocklist events API from librados entirely -- it exposes "OSD-only" state to clients and, as evidenced by this patch series, is not particularly stable. Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
… ranges Carry a parallel map from cidr addresses to a new range_bits class (stored entirely as ephemeral state) so that we don't need to re-compute masks and bit mappings too often, and to separate out the unpleasant ipv6 bit mapping logic. Then check against those with range_bits::matches() the same way we check for equality on specific-entity matches. Nice and simple loops! Fixes: https://tracker.ceph.com/issues/53050 Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
These tests are supposed to be validating we don't accept invalid IPs, but they left out the "add" subcommand so they're all failing on that! Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
4bba847
to
a649703
Compare
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
...and make the OSDMap handle it. Signed-off-by: Greg Farnum <gfarnum@redhat.com>
a649703
to
c454d0d
Compare
Addressed changes but he's on long-term leave
|
Jenkins test api |
|
@gregsfortytwo looks like there is a regression related to this PR in the rados suite. I created a Tracker issue for it; can you take a look? https://tracker.ceph.com/issues/55419 |
|
Also @gregsfortytwo was there a final teuthology run done for this PR? |
|
...my apologies, I'm looking at the test history and realizing I ran it through less of the suite than I thought I had. Since the only recent changes were in the unit test code I didn't spin up another test, but I definitely should have given what it's actually run through. :/ I'm looking into this. |
|
@gregsfortytwo I am marking https://tracker.ceph.com/issues/55419 as resolved for now. Is there a tracker this to track the backport of this feature? we should not forget to include the fix. |
https://tracker.ceph.com/issues/53050 is Pending Backport; I was waiting on the test fix to make sure there weren't any other issues before I generate backport PRs. Will get to them in the next day or two. |
This PR lets users specify a CIDR range when blocklisting, and propagates
that range throughout the OSDMap blocklisting functionality.
To facilitate this, it also cleans up some of the blocklist handling in CephFS, and
implements blocklisting unit tests.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume tox