New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: add support public_bind_addr option #16189

Merged
merged 3 commits into from Jul 17, 2017

Conversation

Projects
None yet
5 participants
@bassam
Member

bassam commented Jul 6, 2017

To support running in dynamic enviornments (like Kubernetes), its desirable for a mon to have two ip addresses. The first it can bind to locally and the other is what goes in the monmap for others to reach the mon.

Added a new config option "public_bind_addr" which if set will be used by the mon to listen locally. The existing "public_addr" is unchanged and can be different from "public_bind_addr".

Also added a new function on Messenger to set_addr which is called by ceph-mon to set
the advertised address after doing the bind.

What's left:

  • add new unit tests
  • end to end testing with Rook
  • add docs

@liewegas would good to get early feedback on the approach here.

@@ -727,12 +779,6 @@ int main(int argc, const char **argv)
prefork.exit(1);
}
dout(0) << "starting " << g_conf->name << " rank " << rank

This comment has been minimized.

@bassam

bassam Jul 6, 2017

Member

I removed this since it was duplicative with the log message on line 757 above. not sure I understand why we need two log messages.

@@ -100,6 +100,8 @@ class XioMessenger : public SimplePolicyMessenger, XioInit
/* Messenger interface */
virtual void set_addr_unknowns(const entity_addr_t &addr) override
{ } /* XXX applicable? */
virtual void set_addr(const entity_addr_t &addr) override
{ } /* XXX applicable? */

This comment has been minimized.

@bassam

bassam Jul 6, 2017

Member

I'm unable to build XioMessenger. The code looks like its gone stale.

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 7, 2017

This looks totally reasonable to me. Let me know how testing goes! Would be great to include this in luminous.

@liewegas liewegas added this to the luminous milestone Jul 7, 2017

@bassam bassam changed the title from [DNM] mon: add support mon_advertise_addr option to [DNM] mon: add support public_bind_addr option Jul 7, 2017

@bassam

This comment has been minimized.

Member

bassam commented Jul 7, 2017

@liewegas I reverted back to using the original "bind_addr" due to a number of setbacks (tests make lots of assumptions about mon hosts, and a number of corner cases that came up while testing). The end result looks a lot better and much less churn. Please take a look.

Also added new unit tests that simulate the behavior of "routing" traffic from the public_addr to the public_bind_addr. it did require adding socat as a build dependency.

bassam added a commit to bassam/rook that referenced this pull request Jul 7, 2017

@@ -1763,6 +1773,46 @@ if test "$1" = TESTS ; then
run_tests "$@"
fi
# NOTE:

This comment has been minimized.

@bassam

bassam Jul 7, 2017

Member

this moved here from src/test/mon/misc.sh

@@ -16,6 +16,7 @@
OPTION(host, OPT_STR, "") // "" means that ceph will use short hostname
OPTION(fsid, OPT_UUID, uuid_d())
OPTION(public_addr, OPT_ADDR, entity_addr_t())
OPTION(public_bind_addr, OPT_ADDR, entity_addr_t())

This comment has been minimized.

@bassam

bassam Jul 7, 2017

Member

we could make this "mon_public_bind_addr" to be more specific. the same behavior could be carried over to OSD, MDS etc. if we wanted to so I left a bit more general. Note that we could also add "cluster_bind_addr" etc. if we wanted to too.

ldout(async_msgr->cct, 0) << __func__ << " connect claims to be "
<< paddr << " not " << peer_addr << " - wrong node!" << dendl;
goto fail;
ldout(async_msgr->cct, 20) << __func__ << " connect claims to be "

This comment has been minimized.

@travisn

travisn Jul 11, 2017

Member

should this check still be in place and look for either the public_addr or public_bind_addr? At a minimum I wonder if the previous behavior of failing should remain when not falling under the new scenario. And in the new scenario I don't think we would want to write the issue to the log since it's expected.

This comment has been minimized.

@bassam

bassam Jul 11, 2017

Member

I think its a little noisy too -- happy to remove it. Maybe @liewegas can weigh in here.

travisn added a commit to travisn/rook that referenced this pull request Jul 12, 2017

@bassam

This comment has been minimized.

Member

bassam commented Jul 12, 2017

@liewegas this is ready for review and testing. we ran this in Rook/Kuberentes scenarios and its working well.

@bassam bassam changed the title from [DNM] mon: add support public_bind_addr option to mon: add support public_bind_addr option Jul 12, 2017

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 12, 2017

retest this please

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 12, 2017

it looks like a submodule was inadvertantly updated?

@bassam

This comment has been minimized.

Member

bassam commented Jul 13, 2017

@liewegas oops. fixed.

travisn added a commit to travisn/rook that referenced this pull request Jul 14, 2017

@liewegas

Looks good! Agree on naming and relaxing that warning.

@liewegas liewegas added the needs-qa label Jul 14, 2017

@bassam

This comment has been minimized.

Member

bassam commented Jul 14, 2017

@liewegas just to confirm, @travisn is recommending we remove the warning completely since it now could potentially be normal operation and is likely to fill up the mon logs. what do you think?

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 14, 2017

bassam added some commits Jul 7, 2017

test,qa/workunits: add tests for public_bind_addr
Add a set of new tests for the case when public_addr and public_bind_addr
are different for a mon. In order to test this properly I had to employ
port forwarding with socat. This helps simulate what would happen in a
environment like Kubernetes. socat is now a build dependency.

Also, moved jq_success to ceph-helpers.sh and refactored run_mon to enable
creating the mons without creating the rbd pool immediately.

Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
docs: add docs for public_bind_addr
Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
mon: add support public_bind_addr option
To support running in dynamic enviornments (like Kubernetes) the mon needs
to be able to advertise and ip address that is different from the ip address
that it listens on locally.

Added a new config option "public_bind_addr" which if set becomes the address
that the mon will bind to locally. If empty (the default) the public_addr
will be used to bind locally.

added a new function on Messenger to set_addr which is called by ceph-mon to set
the advertised address after doing the bind.

also relaxed the "wrong node!" errors in AsyncMessenger and SimpleMessenger as
its now valid to talk to a peer whose peer_addr_of_me is different from what
we expect.

Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>

travisn added a commit to travisn/rook that referenced this pull request Jul 14, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 14, 2017

@liewegas liewegas merged commit dd61a7f into ceph:master Jul 17, 2017

3 of 4 checks passed

make check (arm64) make check failed
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
@@ -131,6 +131,7 @@ BuildRequires: python-pecan
BuildRequires: python-requests
BuildRequires: python-virtualenv
BuildRequires: python-werkzeug
BuildRequires: socat

This comment has been minimized.

@smithfarm

smithfarm Jul 17, 2017

Contributor

@bassam This (socat) is only needed for the unit test, correct?

This comment has been minimized.

@tchaikov

tchaikov Jul 17, 2017

Contributor

@smithfarm i am not Bassam. but i agree that socat is only used for tests. is there anything we can do different? if it's only for test?

This comment has been minimized.

@smithfarm

smithfarm Jul 17, 2017

Contributor

@tchaikov: yes, will handle in #15940

This comment has been minimized.

@bassam

bassam Jul 17, 2017

Member

@smithfarm yes just for make check. Feel free to make it part of a conditional. Note it's also could be a conditional for debs and FreeBSD too.

This comment has been minimized.

@smithfarm

smithfarm Jul 17, 2017

Contributor

Conditionalizing build dependencies is far beyond my poor-to-nonexistent Debian and FreeBSD packaging skills.

This comment has been minimized.

@tchaikov

tchaikov Jul 17, 2017

Contributor

maybe we should just move the build-depends for readying "make check" into install-deps.sh.

This comment has been minimized.

@smithfarm

smithfarm Jul 17, 2017

Contributor

@tchaikov Theoretically, "make check" can be run by rpmbuild, though?

This comment has been minimized.

This comment has been minimized.

@tchaikov

tchaikov Jul 17, 2017

Contributor

@smithfarm sorry for the confusion, i was talking about the debian packaging. because i failed find a way to conditionalize the "make check" only dependency. as it seems debian does not have such a flexibility to do the same thing as we can on rpmbuild.

This comment has been minimized.

@smithfarm

smithfarm Jul 17, 2017

Contributor

@tchaikov I see what you mean, now, and I think it's a good idea. One possible danger, though - people might forget that new unit test dependencies need to be added to ceph.spec.in for RPM, but to install-deps.sh for DEB. It's somewhat non-intuitive.

travisn added a commit to travisn/rook that referenced this pull request Jul 17, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 19, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 19, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 20, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 24, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 25, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 28, 2017

travisn added a commit to travisn/rook that referenced this pull request Jul 28, 2017

travisn added a commit to travisn/rook that referenced this pull request Aug 2, 2017

bassam added a commit to bassam/rook that referenced this pull request Aug 2, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment